Date on Master's Thesis/Doctoral Dissertation
Computer Engineering and Computer Science
Computer Science and Engineering, PhD
Park, Juw Won
Data augmentation; semi-supervised deep learning; generative models; object recognition
Deep learning models have achieved state-of-the-art performance, especially in computer vision applications. Much of this recent success can be attributed to the existence of large, high-quality, labeled datasets. However, in many real-world applications, collecting similar datasets is often cumbersome and time-consuming. For instance, developing robust automatic target recognition models from infrared images still faces major challenges. This is mainly due to the difficulty of acquiring high-resolution inputs, sensitivity to thermal sensor calibration and meteorological conditions, and the need for invariance to target scale and viewpoint. Ideally, a good training set should contain enough variation within each class for the model to learn optimal decision boundaries. However, when there are under-represented regions in the training feature space, especially in a low data regime or in the presence of low-quality inputs, the model risks learning sub-optimal decision boundaries, resulting in sub-optimal predictions. This dissertation presents novel data augmentation (DA) strategies aimed at improving the performance of machine learning models in low data regimes. The proposed techniques are designed to augment limited labeled datasets, providing the models with additional information to learn from.
The first contribution of this work is the development of Confidence-Guided Generative Augmentation (CGG-DA), a technique that trains a generative model, such as a Variational Autoencoder (VAE) or a Deep Convolutional Generative Adversarial Network (DCGAN), to generate synthetic augmentations. These generative models can generate labeled and/or unlabeled data by drawing from the same distribution as the samples on which a baseline reference model under-performs. By augmenting the training dataset with these synthetic images, CGG-DA aims to bridge the performance gap across different regions of the training feature space.
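As an illustration of the confidence-guided selection step described above, the following minimal sketch picks the training samples on which a baseline model is least confident in the true class. The function name `select_underperforming`, the 0.6 threshold, and the toy probabilities are assumptions made for this example only; the dissertation's actual selection criterion may differ. The selected samples would then serve as training data for the generative model.

```python
from typing import List, Sequence


def select_underperforming(probs: Sequence[Sequence[float]],
                           labels: Sequence[int],
                           threshold: float = 0.6) -> List[int]:
    """Return indices of samples whose true-class probability under the
    baseline model is strictly below `threshold` (a hypothetical cutoff)."""
    return [i for i, (p, y) in enumerate(zip(probs, labels)) if p[y] < threshold]


# Toy baseline softmax outputs for four samples and three classes.
baseline_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.30, 0.60],
    [0.20, 0.70, 0.10],
]
true_labels = [0, 0, 2, 0]

# Samples 1 and 3 have low confidence in their true class.
hard_indices = select_underperforming(baseline_probs, true_labels)
print(hard_indices)
```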
We also introduce a Tool-Supported Contextual Augmentation (TSC-DA) technique that leverages existing ML models, such as classifiers or object detectors, to label available unlabeled data. Samples with consistent, high-confidence predictions are used as labeled augmentations. Samples with low-confidence predictions, on the other hand, are more likely to be noisy and inconsistent, but they might still contain useful information. Hence, we keep them and use them as unlabeled samples during training. Our third proposed DA technique explores the use of existing ML tools and external image repositories for data augmentation. This approach, called Guided External Data Augmentation (EG-DA), leverages external image repositories to augment the available dataset. External repositories are typically noisy and might include many out-of-distribution (OOD) samples. If included in the training process without proper handling, OOD samples can confuse the model and degrade its performance. To tackle this issue, we design and train a VAE-based anomaly detection component and use it to filter out OOD samples. Since our augmentations include both labeled data and a larger set of unlabeled data, we use semi-supervised training to exploit the information contained in the generated augmentations. This can guide the network to learn complex representations and to generalize to new data. The proposed data augmentation techniques are evaluated on two computer vision applications under multiple scenarios. Using benchmark datasets, we also compare our approach to baseline models trained on the initial labeled data only and to existing data augmentation techniques. We show that each proposed augmentation consistently improves the results. We also perform an in-depth analysis to justify the observed improvements.
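The filtering and confidence-based splitting described above can be sketched as follows. This is a simplified illustration, not the dissertation's implementation: the anomaly scores stand in for the output of a VAE-based detector, "consistent" is assumed to mean that several predictions for the same sample agree on the class, and the names `filter_in_distribution`, `split_by_confidence`, `max_score`, and `min_conf` are hypothetical.

```python
from typing import List, Sequence, Tuple

Prediction = Tuple[int, float]  # (predicted class, confidence)


def filter_in_distribution(scores: Sequence[float], max_score: float) -> List[int]:
    """Keep indices whose anomaly score does not exceed `max_score`.
    Higher scores are assumed to indicate out-of-distribution samples."""
    return [i for i, s in enumerate(scores) if s <= max_score]


def split_by_confidence(preds_per_sample: Sequence[Sequence[Prediction]],
                        min_conf: float = 0.9):
    """Split samples into pseudo-labeled and unlabeled sets.
    A sample is pseudo-labeled only if all of its predictions agree on
    one class and every confidence is at least `min_conf`."""
    labeled: List[Tuple[int, int]] = []
    unlabeled: List[int] = []
    for i, preds in enumerate(preds_per_sample):
        classes = {c for c, _ in preds}
        if len(classes) == 1 and min(conf for _, conf in preds) >= min_conf:
            labeled.append((i, preds[0][0]))
        else:
            unlabeled.append(i)
    return labeled, unlabeled


# Toy data: one anomaly score and two predictions per external image.
anomaly_scores = [0.12, 0.95, 0.20, 0.18]
predictions = [
    [(1, 0.97), (1, 0.93)],
    [(0, 0.99), (2, 0.91)],
    [(2, 0.95), (2, 0.60)],
    [(0, 0.92), (0, 0.94)],
]

# Step 1: drop likely OOD samples (sample 1 here).
kept = filter_in_distribution(anomaly_scores, max_score=0.5)

# Step 2: split the remaining samples, then map back to original indices.
lab, unlab = split_by_confidence([predictions[i] for i in kept])
labeled = [(kept[i], c) for i, c in lab]
unlabeled = [kept[i] for i in unlab]
print(kept, labeled, unlabeled)
```

In this sketch the pseudo-labeled samples would join the labeled training set, while the remaining in-distribution samples would be passed to the semi-supervised training step as unlabeled data.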
Khmaissia, Fadoua, "Guided data augmentation for improved semi-supervised image classification in low data regime." (2023). Electronic Theses and Dissertations. Paper 4088.
Retrieved from https://ir.library.louisville.edu/etd/4088