Information Characterization & Exploitation


Intelligent Systems: From Theory to Practice

Learning to See: An Adaptive Model for Tagging Complex Scenes

Poster


Modern computer-vision neural networks are highly accurate and efficient at detecting and classifying objects in standard videos. Video scenes with a static camera, a static background, and only a few objects of interest can be tagged easily. However, these methods suffer degraded accuracy in more dynamic scenes, where the camera, background, and multiple subjects of interest are all in motion. In this project, we explore how intensive pre-processing affects the ability of deep learning and computer vision methods to detect and accurately identify objects in video. We develop an all-in-one model for object classification that augments standard classification techniques with noise removal, background subtraction, and object detection. These steps ensure that the greatest possible number of objects is extracted from a given video frame. For real-time functionality, we use GPU parallelization with PyCUDA, which gives access to CUDA's parallel-computation API from Python. Building the model in Python keeps the system modular and reusable: since most neural network APIs are Python-based, each step of our system can easily be switched out for newer and better versions.
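The background-subtraction step of such a pipeline can be sketched with a simple running-average background model. This is a minimal illustration, not the project's actual implementation; the learning rate, threshold, and toy scene are all assumptions for the example.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Exponential running-average background model.
    The learning rate `alpha` is an illustrative choice."""
    return (1 - alpha) * background + alpha * frame

def foreground_mask(background, frame, threshold=25.0):
    """Flag pixels that differ strongly from the background model."""
    return np.abs(frame - background) > threshold

# Toy demonstration: a static scene, then a bright object enters.
rng = np.random.default_rng(0)
scene = 100.0 + rng.normal(0, 2, size=(48, 64))  # static scene with sensor noise
background = scene.copy()

frame = scene.copy()
frame[10:20, 10:20] += 80.0                      # the moving object

mask = foreground_mask(background, frame)        # True only on the object
background = update_background(background, frame)
```

Pixels flagged by the mask would then be handed to the detector/classifier stage, while the background model slowly absorbs scene changes.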

CATCH-20: Taking Analytics to the Edge

Poster


This project targeted developing machine learning techniques and edge analytics capable of exploiting low-fidelity, cyber-physical signatures. The concept involved the use of data from the sensors of mobile devices, collected as time series and analyzed on an Android device. The research team developed a mobile application, Catch-20, that integrates high-resolution data visualization, event detection, feature extraction, and deep learning classification. Initially, the team explored developing the application in Python due to its extensive machine learning and data analysis libraries. However, the team faced interoperability issues between Python, Java, and Android Studio and concluded that Python tooling for building native-looking Android applications is not yet mature. Catch-20 was therefore programmed in Java using Android Studio. To demonstrate the capability of the application, infrasound data were collected from the microphones of mobile devices to classify space-rocket launches. The team used an open-source library for visualization, a sliding-window algorithm with the Bhattacharyya distance for detection, and a multi-layer perceptron (MLP) architecture for classification. Standard mobile devices collecting information-rich signals, joined with new machine-learning-based signal processing, form an innovative framework for characterizing a range of phenomena, with applications including environmental-hazard monitoring, traffic control, financial analysis, and medical analysis.
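The sliding-window detection step can be sketched as follows: compare the amplitude histogram of each window against a reference (background-noise) histogram using the Bhattacharyya distance, and flag windows that depart from it. This is a hedged illustration; the window size, bin count, threshold, and synthetic "launch" signal are assumptions, not values from Catch-20.

```python
import numpy as np

def bhattacharyya_distance(p, q, eps=1e-12):
    """Bhattacharyya distance between two discrete distributions."""
    return -np.log(np.sum(np.sqrt(p * q)) + eps)

def amplitude_histogram(window, bins, lo, hi):
    h, _ = np.histogram(window, bins=bins, range=(lo, hi))
    return h / h.sum()

def detect_events(signal, reference, win=128, step=64, bins=16, thresh=0.4):
    """Slide a window along the signal and flag windows whose amplitude
    histogram departs from the reference (background) histogram.
    Window size, bin count, and threshold are illustrative choices."""
    lo, hi = signal.min(), signal.max()
    p_ref = amplitude_histogram(reference, bins, lo, hi)
    hits = []
    for start in range(0, len(signal) - win + 1, step):
        q = amplitude_histogram(signal[start:start + win], bins, lo, hi)
        if bhattacharyya_distance(p_ref, q) > thresh:
            hits.append(start)
    return hits

rng = np.random.default_rng(1)
quiet = rng.normal(0, 1, 2048)                 # background noise
event = 8 * np.sin(np.linspace(0, 40, 512))    # loud low-frequency transient
signal = np.concatenate([quiet, event, rng.normal(0, 1, 1024)])
hits = detect_events(signal, reference=quiet)  # window starts that fired
```

Windows that fire would then be passed on to the MLP for classification.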

Near-field Infrasound Classification of Rocket Launch Signatures

Poster


A Deep Learning Approach for Enhanced Classification of Global Seismic Waveforms

Poster

Over the past 50 years, global seismic detection has matured considerably, driven by the importance of differentiating man-made seismic events (e.g., mining explosions) from natural phenomena (earthquakes, volcanoes). Where there is still prevalent concern with these seismological advancements is in separating these classes. This concern derives from the fact that a seismic signature lacks easily defined features, which complicates the discrimination process for a variety of detection methods. This paper aims to address the issue by evaluating state-of-the-art machine learning algorithms and comparing several deep classification techniques. We target deep learning methods (deep neural networks, convolutional neural networks, and long short-term memory networks) due to their ability to mimic human-like neural connections and support data-specific feature extraction for classifying waveform signatures. Our proposed algorithms show noteworthy accuracy, opening up the realization of automated network seismic discrimination. Our results provide evidence that deep learning should be considered the leading candidate for classifying seismic waveforms and will further bridge the gap between machine learning and the geological sciences.
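At its core, each of the compared architectures ends in the same discrimination step: a learned feature vector is mapped to class probabilities. A minimal sketch of that step, with random placeholder weights standing in for a trained model (the layer sizes and two-class setup are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, params):
    """Forward pass of a small fully connected classifier.
    Weights here are random placeholders; a real model would be
    trained on labelled waveform windows."""
    (W1, b1), (W2, b2) = params
    h = np.maximum(0.0, x @ W1 + b1)          # ReLU hidden layer
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)  # softmax over classes

n_features, n_hidden, n_classes = 64, 32, 2   # e.g. earthquake vs. explosion
params = [(rng.normal(0, 0.1, (n_features, n_hidden)), np.zeros(n_hidden)),
          (rng.normal(0, 0.1, (n_hidden, n_classes)), np.zeros(n_classes))]

waveform_features = rng.normal(size=(5, n_features))  # 5 toy examples
probs = mlp_forward(waveform_features, params)        # rows sum to one
```

The CNN and LSTM variants differ only in how the feature vector is produced from the raw waveform before this final layer.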

Deep Wavelet Scattering Features for Infrasonic Threat Identification

Poster

Infrasonic waves continue to be a staple of threat identification due to their presence in a variety of natural and man-made events, along with their low-frequency characteristics supporting detection over great distances. Considering the large set of phenomena that produce infrasound, it is critical to develop methodologies that exploit the unique signatures generated by such events to aid in threat identification. In this work, we propose a new infrasonic time-series classification technique based on the recently introduced Wavelet Scattering Transform (WST). Leveraging concepts from wavelet theory and signal processing, the WST induces a deep feature mapping on time series that is locally time invariant and stable to time-warping deformations through cascades of signal filtering and modulus operators. We demonstrate that the WST features can be utilized with a variety of classification methods to gain better discrimination. Experimental validation on the Library of Typical Infrasonic Signals (containing infrasound events from mountain associated waves, microbaroms, auroral infrasonic waves, and volcanic eruptions) illustrates the effectiveness of our approach and demonstrates it to be competitive with other state-of-the-art classification techniques.
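The filter-modulus-average cascade can be illustrated with a toy first-order scattering layer: band-pass each signal, take the modulus, and smooth with a low-pass averaging filter. This sketch uses simple Gaussian filters in the frequency domain as stand-ins for a proper Morlet filter bank, so it shows the structure of the transform, not a faithful WST implementation.

```python
import numpy as np

def gaussian_lowpass(n, sigma):
    """Frequency response of a Gaussian averaging filter (phi)."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * (freqs / sigma) ** 2)

def bandpass(n, center, width):
    """A Gaussian band-pass standing in for a Morlet wavelet (psi)."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * ((np.abs(freqs) - center) / width) ** 2)

def scattering_first_order(x, centers, width=0.02, sigma=0.01):
    """First-order scattering: smooth |x * psi_lambda| with phi.
    A toy sketch of the WST cascade (real transforms use proper
    Morlet filter banks and subsampling)."""
    n = len(x)
    X = np.fft.fft(x)
    phi = gaussian_lowpass(n, sigma)
    coeffs = []
    for c in centers:
        band = np.real(np.fft.ifft(X * bandpass(n, c, width)))
        env = np.abs(band)                                   # modulus
        smooth = np.real(np.fft.ifft(np.fft.fft(env) * phi))  # average
        coeffs.append(smooth)
    return np.array(coeffs)

t = np.arange(2048)
x = np.sin(2 * np.pi * 0.05 * t)              # toy "infrasound" tone
S = scattering_first_order(x, centers=[0.02, 0.05, 0.1])
```

The channel whose filter matches the tone's frequency carries the most energy, while the local averaging is what yields the transform's stability to small time shifts and warps.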

Time Series Classification Using the Wavelet Scattering Transform

Poster

The recently introduced Wavelet Scattering Transform (WST) induces a feature mapping on time series that is locally time invariant and stable to time-warping deformations. These multiscale features capture a rich set of local time and frequency characteristics that more familiar methods like Fourier and mel-cepstrum transforms miss. This project explores the effectiveness of these hierarchical features for time series classification tasks. The WST framework is leveraged to produce a simple extension incorporating the Discrete Wavelet Transform (DWT). In addition, combinations of multiple or independent scattering-coefficient layers are empirically evaluated for their discrimination capabilities. Experimental evaluations on several benchmark datasets demonstrate the utility of the WST, with these highly discriminative features strengthening the performance of even simple classifiers like k-nearest neighbors.
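The DWT building block used in the extension can be sketched with one level of the Haar transform, the simplest orthogonal wavelet. This is an illustrative choice; the project does not specify which wavelet family it uses.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar discrete wavelet transform.
    Returns (approximation, detail) coefficients; len(x) must be even."""
    s = 1 / np.sqrt(2)
    approx = s * (x[0::2] + x[1::2])   # local averages
    detail = s * (x[0::2] - x[1::2])   # local differences
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of one Haar level (perfect reconstruction)."""
    s = 1 / np.sqrt(2)
    x = np.empty(2 * len(approx))
    x[0::2] = s * (approx + detail)
    x[1::2] = s * (approx - detail)
    return x

rng = np.random.default_rng(0)
x = rng.normal(size=64)
a, d = haar_dwt(x)          # halved-resolution features
x_rec = haar_idwt(a, d)     # recovers x exactly
```

Because the transform is orthogonal, it preserves signal energy, so the DWT coefficients can be concatenated with scattering coefficients without distorting relative feature scales.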

Infrasound Threat Classification: A Statistical Comparison of Deep Learning Architectures

 

Infrasound propagation through various atmospheric conditions and interaction with environmental factors induce highly non-linear and non-stationary effects that make it difficult to extract reliable attributes for classification. We present featureless classification results on the Library of Typical Infrasonic Signals using several deep learning techniques, including long short-term memory, self-normalizing, and fully convolutional neural networks, with statistical analysis to establish significantly superior models. In general, the deep classifiers achieve near-perfect classification accuracies on the four classes of infrasonic events: mountain associated waves, microbaroms, auroral infrasonic waves, and volcanic eruptions. Our results provide evidence that deep neural network architectures should be considered the leading candidate for classifying infrasound waveforms, which can directly benefit applications that seek to identify infrasonic events, such as severe weather forecasting, natural disaster early-warning systems, and nuclear weapons monitoring.

Infrasound classification using long short-term memory recurrent neural networks

Poster

Infrasound waves are sub-audible waves that can be caused by natural as well as man-made events. Many of these events, such as nuclear explosions and volcanic eruptions, can needlessly cost lives if no warning is given. Since the United Nations adopted the Comprehensive Nuclear-Test-Ban Treaty (CTBT) in 1996, infrasound monitoring arrays have been constructed around the world. Ultimately, one would wish to develop a system that can continuously monitor raw data from these arrays and warn the appropriate authorities about dangerous events. Any machine learning approach to infrasound classification must be able to learn long time-dependencies. Neural networks have previously been used in infrasound classification, but they rely on expensive feature-extraction methods. This project develops a featureless long short-term memory (LSTM) network that classifies infrasound events using waveform data. The model achieves varied results on standard benchmark time series classification tasks but outperforms previous benchmarks on two- and three-class infrasound classification problems. Using the Alaska dataset of four classes of natural events, the baseline model consistently achieves near-perfect accuracies using very few training epochs. The baseline model also performs extremely well on raw rail-degradation data, which suggests that it may be possible to implement a continuous monitoring system.
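The long time-dependencies mentioned above are what the LSTM's gated cell state is designed to carry. A minimal sketch of one LSTM time step, with random placeholder parameters rather than a trained network (the layer size and input shape are assumptions for the example):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, params):
    """One LSTM time step (forget/input/output gates + candidate cell).
    Parameters are random placeholders; a trained network would learn
    them from labelled waveforms."""
    Wf, Wi, Wo, Wc, bf, bi, bo, bc = params
    z = np.concatenate([x, h])
    f = sigmoid(Wf @ z + bf)              # forget gate: keep old memory?
    i = sigmoid(Wi @ z + bi)              # input gate: admit new evidence?
    o = sigmoid(Wo @ z + bo)              # output gate: expose the state?
    c_new = f * c + i * np.tanh(Wc @ z + bc)
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
n_in, n_hid = 1, 8                        # one raw sample per time step
params = [rng.normal(0, 0.1, (n_hid, n_in + n_hid)) for _ in range(4)] + \
         [np.zeros(n_hid) for _ in range(4)]

h, c = np.zeros(n_hid), np.zeros(n_hid)
waveform = np.sin(np.linspace(0, 10, 100))
for sample in waveform:                   # run the cell across the series
    h, c = lstm_step(np.array([sample]), h, c, params)
```

Feeding raw samples directly into the recurrence is what makes the approach "featureless": the gates decide what to remember instead of a hand-crafted feature extractor.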

Optimizing the Classification by Discriminative Interpolation Algorithm

Poster

This project works with the recent, state-of-the-art machine learning method Classification by Discriminative Interpolation (CDI). CDI is a supervised learning framework for functional data that uses k-nearest neighbors and discriminative interpolation to classify incoming curves. Intuitively, CDI learns a model by warping the training curves so that curves in the same class look more like each other while curves from different classes are pushed apart. The testing phase follows the same methodology: every incoming curve is discriminatively interpolated to be closer to the curves of each class, and the label of the class that renders the lowest error is assigned to the input curve. CDI showed promising results but was originally implemented in MATLAB, which resulted in slow execution times and limited the algorithm to smaller datasets. The aim of this research is to optimize CDI for application to larger, more complex, real-world problems. With a faster execution time, the algorithm becomes more practical to use. This is done through a re-designed implementation in C++, exploring a variety of programming optimization techniques and parallelization. The resulting C++ version of CDI has significantly faster execution times across a variety of datasets.
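The neighbour rule that CDI builds on can be sketched with a plain k-nearest-neighbour vote over sampled curves. The discriminative warping step is deliberately omitted here; this shows only the baseline classifier, on hypothetical toy data (sines vs. ramps).

```python
import numpy as np

def knn_classify(train_curves, train_labels, query, k=3):
    """k-nearest-neighbour vote with L2 distance between sampled curves.
    CDI's discriminative warping step is omitted; this is only the
    baseline neighbour rule the method builds on."""
    dists = np.linalg.norm(train_curves - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = train_labels[nearest]
    return np.bincount(votes).argmax()

t = np.linspace(0, 1, 50)
rng = np.random.default_rng(0)
# Two functional classes: noisy sines (label 0) vs. noisy ramps (label 1).
sines = np.array([np.sin(2 * np.pi * t) + rng.normal(0, .1, 50) for _ in range(10)])
ramps = np.array([t + rng.normal(0, .1, 50) for _ in range(10)])
X = np.vstack([sines, ramps])
y = np.array([0] * 10 + [1] * 10)

pred = knn_classify(X, y, np.sin(2 * np.pi * t))   # classifies as a sine
```

CDI's contribution is to warp the curves before this vote so that within-class distances shrink and between-class distances grow, which is also the computationally expensive part the C++ re-implementation targets.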

Micro-Copter UAV

SEP18 - Unmanned Aerial Vehicles (UAVs) are now a proven commodity of the modern-day battlefield, providing real-time imagery data feeds, targeting, prosecution, and round-the-clock monitoring of high-value targets. However, the command, control, and tasking decisions for UAV assets are still executed at locations remote from the frontlines. The buzz around the next generation of UAV systems is to equip frontline soldiers with scaled-down UAVs, giving them an invaluable resource to aid the timely and well-informed execution of the observe, orient, decide, and act (OODA) loop.

A micro-copter is a vertical takeoff and landing (VTOL) platform that provides many advantages over conventional aircraft platforms. These advantages include ease of launch and recovery, hover capability, and a more compact form factor. The ICE Lab has developed hardware and software solutions for several micro-copter applications, including direct georeferencing from smart phones, image analysis, sensor system integration, and custom gimbal design.


Posted By Dr. Adrian Peter

Large-scale Clustering for Big Data Analytics: A Map Reduce Implementation of Hierarchical Affinity Propagation

JUL22 - This project allows users to effectively perform hierarchical clustering over extremely large datasets. The research team developed a distributed software system that reads in data from multiple input sources using a common interface, clusters the data according to a user-defined similarity metric, and presents the extracted clusters to the user in an interactive, web-based visualization. To deal with large “Big Data” datasets, the team derived and implemented a distributed version of the Hierarchical Affinity Propagation (HAP) clustering algorithm using the MapReduce framework. This parallelization allows the algorithm to run in best-case linear time on datasets of any cardinality. It also allows execution of the software within a scalable cloud-computing framework such as Amazon’s Elastic Compute Cloud (EC2).
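The message-passing core that the MapReduce version distributes can be sketched on a single machine. This is standard (flat) affinity propagation rather than the hierarchical, distributed variant, and the toy data, damping, and preference choice are illustrative assumptions.

```python
import numpy as np

def affinity_propagation(S, damping=0.5, iters=200):
    """Single-machine affinity propagation message passing.  The
    MapReduce version distributes these same responsibility and
    availability updates across workers; this sketch shows only the math."""
    n = S.shape[0]
    R = np.zeros((n, n))
    A = np.zeros((n, n))
    for _ in range(iters):
        # Responsibility r(i,k): how well suited k is as exemplar for i.
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[np.arange(n), idx]
        AS[np.arange(n), idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[np.arange(n), idx] = S[np.arange(n), idx] - second
        R = damping * R + (1 - damping) * R_new
        # Availability a(i,k): evidence that k should be an exemplar.
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = np.minimum(0, Rp.sum(axis=0)[None, :] - Rp)
        np.fill_diagonal(A_new, Rp.sum(axis=0) - Rp.diagonal())
        A = damping * A + (1 - damping) * A_new
    exemplars = np.where((R + A).diagonal() > 0)[0]
    if len(exemplars) == 0:                      # degenerate fallback
        exemplars = np.array([int(np.argmax((R + A).diagonal()))])
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels

# Two well-separated 1-D groups; similarity = negative squared distance.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
S = -(x[:, None] - x[None, :]) ** 2
np.fill_diagonal(S, np.median(S[~np.eye(len(x), dtype=bool)]))  # preferences
exemplars, labels = affinity_propagation(S)
```

Both update rules are row/column reductions over the similarity matrix, which is what makes them natural map (per-row messages) and reduce (per-column sums) steps.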

 

 

Posted By Dr. Adrian Peter

Density Estimation for Streaming Data Analytics

JUN12 - Stream computing is rapidly gaining momentum in markets ranging from healthcare to commercial businesses to defense—an accelerated adoption driven by the promise of delivering actionable decision support in a timely manner. By processing high-volume data on the wire, we can greatly mitigate the need to rely on traditional paradigms of data warehousing and batch mining of information sources. To deliver on the promise of on-the-wire actionable intelligence, the backend data ingest and routing infrastructure must be supported by advanced analytic algorithms that can extract the value-added information from the stream and enable application-specific analysis and discovery. Amplified by the demand function for stream computing, there exists an immediate need to accelerate the development of analytics for data on the move. We propose to directly address this deficiency by investigating and delivering solutions that significantly advance the state of the art in statistical methodologies at the heart of advanced analytics. In particular, this project seeks to develop novel density estimation techniques that will enable robust data characterization, in an incremental and computationally efficient manner suitable for the streaming paradigm.
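The incremental, single-pass requirement can be illustrated with the simplest possible streaming density estimator: Welford's online mean/variance update with a Gaussian readout. This is a deliberately minimal stand-in for the richer density estimators the project targets, and all names and data here are illustrative.

```python
import numpy as np

class StreamingGaussianEstimator:
    """Single-pass (Welford) mean/variance tracking with a Gaussian
    density readout.  A simple stand-in for the project's richer
    streaming density estimators: O(1) memory, one update per sample."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0                   # running sum of squared deviations

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")

    def density(self, x):
        v = self.variance
        return np.exp(-0.5 * (x - self.mean) ** 2 / v) / np.sqrt(2 * np.pi * v)

rng = np.random.default_rng(0)
stream = rng.normal(loc=3.0, scale=2.0, size=10_000)
est = StreamingGaussianEstimator()
for x in stream:                        # one sample at a time, no buffering
    est.update(x)
```

The estimator never stores the stream, yet its statistics agree with a batch computation to machine precision, which is exactly the property an on-the-wire analytic needs.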


Posted By Michal Frystacky

Sliding Wavelets for Indexing and Retrieval

APR06 - Shape representation and retrieval of stored shape models are becoming increasingly prominent in fields such as medical imaging, molecular biology, and remote sensing. We present a novel framework that directly addresses the necessity for a rich and compressible shape representation, while simultaneously providing an accurate method to index stored shapes. The core idea is to represent point-set shapes as the square root of probability densities expanded in a wavelet basis. We then use this representation to develop a natural similarity metric that respects the geometry of these distributions, i.e., under the wavelet expansion, distributions are points on a unit hypersphere and the distance between distributions is given by the separating arc length. The process uses a linear assignment solver for non-rigid alignment between densities prior to matching; this has the connotation of "sliding" wavelet coefficients, akin to the sliding-block puzzle L'Âne Rouge. (We acknowledge support from the National Science Foundation, NSF IIS-0307712.)
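The hypersphere geometry can be sketched numerically: square-rooting an orthonormally expanded density yields a unit-norm coefficient vector, and the arc length between two such vectors is a similarity metric. This sketch uses a Haar-scaling (histogram) expansion and 1-D Gaussian "shapes" as illustrative stand-ins for the paper's wavelet expansion of point-set densities.

```python
import numpy as np

def sqrt_density_coeffs(samples, bins, lo, hi):
    """Unit-norm coefficient vector for the square root of a histogram
    density; under an orthonormal basis such vectors live on the
    unit hypersphere."""
    h, _ = np.histogram(samples, bins=bins, range=(lo, hi))
    return np.sqrt(h / h.sum())            # ||c|| = 1 by construction

def sphere_distance(c1, c2):
    """Arc length separating two densities on the unit hypersphere."""
    return np.arccos(np.clip(np.dot(c1, c2), -1.0, 1.0))

rng = np.random.default_rng(0)
shape_a = rng.normal(0.0, 1.0, 5000)   # stand-ins for point-set shape samples
shape_b = rng.normal(0.5, 1.0, 5000)   # similar to shape_a
shape_c = rng.normal(4.0, 1.0, 5000)   # very different

ca = sqrt_density_coeffs(shape_a, 32, -8, 8)
cb = sqrt_density_coeffs(shape_b, 32, -8, 8)
cc = sqrt_density_coeffs(shape_c, 32, -8, 8)
```

Similar shapes subtend a small arc and dissimilar ones approach the maximum of pi/2 (the coefficients are non-negative), which is what makes the arc length usable as an index key for retrieval.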

Posted By Michal Frystacky

 

Wavelet Density Estimation

 

APR06 - Wavelet-based density estimators have gained in popularity due to their ability to approximate a large class of functions, adapting well to difficult situations such as densities that exhibit abrupt changes. The decision to work with wavelet density estimators brings with it theoretical considerations (e.g., non-negativity, integrability) and empirical issues (e.g., computation of basis coefficients) that must be addressed in order to obtain a bona fide density. We present a new method to accurately estimate a non-negative density which directly addresses many of the problems in practical wavelet density estimation. We cast the estimation procedure in a maximum likelihood framework that estimates the square root of the density, √p, allowing us to obtain the natural non-negative density representation (√p)². Analysis of this method brings to light a remarkable theoretical connection with the Fisher information of the density and consequently leads to an efficient constrained optimization procedure to estimate the wavelet coefficients. (We acknowledge support from the National Science Foundation, NSF IIS-0307712.)
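Why estimating √p yields a bona fide density can be shown with a toy version of the construction: expand √p in a Haar scaling-function basis, where unit-norm coefficients guarantee unit integral and squaring guarantees non-negativity. The paper's maximum-likelihood fitting of the coefficients is replaced here by a simple histogram plug-in, so this is a sketch of the representation only, not of the estimator.

```python
import numpy as np

def sqrt_density_estimate(samples, level, lo, hi):
    """Estimate sqrt(p) in a Haar scaling-function basis at resolution
    `level` and return the implied density on the bin grid.  Squaring
    the expansion guarantees non-negativity; unit-norm coefficients
    guarantee the density integrates to one.  (A histogram plug-in
    stands in for the paper's maximum-likelihood coefficient fit.)"""
    n_bins = 2 ** level
    width = (hi - lo) / n_bins
    h, edges = np.histogram(samples, bins=n_bins, range=(lo, hi))
    coeffs = np.sqrt(h / h.sum())        # unit norm by construction
    sqrt_p = coeffs / np.sqrt(width)     # value of sqrt(p) on each bin
    density = sqrt_p ** 2                # bona fide density: (sqrt p)^2
    return coeffs, density, edges

rng = np.random.default_rng(0)
samples = rng.normal(0, 1, 20_000)
coeffs, density, edges = sqrt_density_estimate(samples, level=5, lo=-5, hi=5)
widths = np.diff(edges)
```

Both defining constraints (non-negativity, unit integral) hold automatically here; the paper's contribution is obtaining the coefficients by constrained maximum likelihood, where the unit-norm constraint connects to the Fisher information of the density.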

Posted By Michal Frystacky

Shape Analysis with Parametric Mixtures

 

MAR23 - Shape matching plays a prominent role in the analysis of medical and biological structures. We present a unifying framework for shape matching that uses mixture models to couple both the shape representation and deformation. The theoretical foundation is drawn from information geometry, where information matrices are used to establish intrinsic distances between parametric densities. When a parameterized probability density function is used to represent a landmark-based shape, the modes of deformation are automatically established through the information matrix of the density. We first show that, given two shapes parameterized by Gaussian mixture models, the well-known Fisher information matrix of the mixture model is a natural, intrinsic tool for computing shape geodesics. We have also developed a new Riemannian metric based on generalized φ-entropy measures. In sharp contrast to the Fisher-Rao metric, our new metric is available in closed form. Geodesic computations using the new metric are considerably more efficient.
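The information matrix at the heart of the framework can be illustrated for the simplest case, a single univariate Gaussian: the Fisher information matrix is the expected outer product of the score, and can be checked numerically against the known closed form diag(1/σ², 2/σ²). A mixture-model version follows the same recipe with the mixture's score function; this single-component case is only a hedged illustration.

```python
import numpy as np

def gaussian_fisher_information(mu, sigma, n_grid=20001, span=10.0):
    """Fisher information matrix of N(mu, sigma^2), computed numerically
    as E[(d/dtheta log p)(d/dtheta log p)^T] over theta = (mu, sigma)."""
    x = np.linspace(mu - span * sigma, mu + span * sigma, n_grid)
    p = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    # Score functions: derivatives of log p with respect to mu and sigma.
    d_mu = (x - mu) / sigma ** 2
    d_sigma = ((x - mu) ** 2 - sigma ** 2) / sigma ** 3
    scores = np.stack([d_mu, d_sigma])
    dx = x[1] - x[0]
    # Expectation of the score outer product under p (Riemann sum).
    return (scores[:, None, :] * scores[None, :, :] * p).sum(axis=-1) * dx

fim = gaussian_fisher_information(mu=1.0, sigma=2.0)
# Closed form for comparison: diag(1 / sigma^2, 2 / sigma^2).
```

Such a matrix defines the Riemannian metric in which shape geodesics are computed; for general mixtures it has no closed form (hence the appeal of the paper's closed-form φ-entropy metric), but it can always be obtained numerically as above.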

Posted By Michal Frystacky