Date on Master's Thesis/Doctoral Dissertation
5-2018
Document Type
Master's Thesis
Degree Name
M.S.
Department
Computer Engineering and Computer Science
Degree Program
Computer Science, MS
Committee Chair
Frigui, Hichem
Committee Co-Chair
Xuwen, Zhu
Committee Member
Xuwen, Zhu
Committee Member
Amini, Amir
Committee Member
Nasraoui, Olfa
Author's Keywords
MIR; multiple instance regression; bags; labels; robust fuzzy clustering
Abstract
Multiple instance regression (MIR) operates on a collection of bags, where each bag contains multiple instances that share a single real-valued label. Only a few instances, called primary instances, contribute to the bag labels. The remaining instances are noise and outlier observations. The goal in MIR is to identify the primary instances within each bag and learn a regression model that can predict the label of a previously unseen bag. In this thesis, we introduce an algorithm that uses robust fuzzy clustering with an appropriate distance to learn multiple linear models simultaneously from a noisy feature space. We show that fuzzy memberships are useful in allowing instances to belong to multiple models, while possibilistic memberships allow identification of the primary instances of each bag with respect to each model. We also use possibilistic memberships to identify and ignore noisy instances and to determine the optimal number of regression models. We evaluate our approach on a series of synthetic data sets, on remote sensing data used to predict the yearly average yield of a crop, and on a drug activity prediction application. We show that our approach achieves higher accuracy than existing methods.
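The contrast the abstract draws between fuzzy and possibilistic memberships can be illustrated with a small sketch. This is not the algorithm from the thesis: it assumes two already-fitted linear models, uses the squared prediction error as the distance, and borrows the textbook fuzzy c-means and possibilistic c-means membership formulas. The function names, the toy bag, the fuzzifier `m`, and the scale `eta` are all hypothetical choices made for illustration.

```python
import numpy as np

def squared_residuals(X, y, W, b):
    """Squared error of every instance under every linear model.

    X: (n_instances, n_features) instances of one bag
    y: scalar label shared by all instances of the bag
    W: (n_models, n_features) weights, b: (n_models,) intercepts
    Returns an array of shape (n_models, n_instances).
    """
    preds = W @ X.T + b[:, None]
    return (preds - y) ** 2

def fuzzy_memberships(d2, m=2.0, eps=1e-12):
    """Fuzzy c-means style memberships.

    For each instance the memberships sum to 1 across models,
    so even an outlier is shared out among the models.
    """
    inv = (d2 + eps) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def possibilistic_memberships(d2, eta=1.0, m=2.0):
    """Possibilistic c-means style memberships (typicalities).

    Each value depends only on the instance's own residual under that
    model, so an instance far from every model gets low values everywhere.
    """
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

# Toy bag (assumed data): three 1-D instances sharing the label 3.0.
bag_X = np.array([[0.0], [1.0], [5.0]])
bag_y = 3.0

# Two hypothetical linear models: y = 2x + 1 and y = -x.
W = np.array([[2.0], [-1.0]])
b = np.array([1.0, 0.0])

d2 = squared_residuals(bag_X, bag_y, W, b)
fuzzy = fuzzy_memberships(d2)
poss = possibilistic_memberships(d2)

# Candidate primary instance of the bag with respect to each model:
# the instance with the highest possibilistic membership.
primary = poss.argmax(axis=1)

# Instances whose best possibilistic membership is tiny are treated as noise
# (the 0.05 threshold is an arbitrary illustrative choice).
is_noise = poss.max(axis=0) < 0.05
```

In this toy bag, the second instance fits the first model exactly, so its possibilistic membership in that model is 1 and it is selected as the primary instance for that model. The third instance is equally far from both models: its fuzzy memberships are 0.5 and 0.5 because they must sum to 1, while its possibilistic memberships are both about 0.015, which is what lets it be flagged as noise.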
Recommended Citation
Trabelsi, Mohamed, "Robust fuzzy clustering for multiple instance regression." (2018). Electronic Theses and Dissertations. Paper 2975.
https://doi.org/10.18297/etd/2975