Date on Master's Thesis/Doctoral Dissertation

5-2018

Document Type

Master's Thesis

Degree Name

M.S.

Department

Computer Engineering and Computer Science

Degree Program

Computer Science, MS

Committee Chair

Frigui, Hichem

Committee Co-Chair (if applicable)

Xuwen, Zhu

Committee Member

Xuwen, Zhu

Committee Member

Amini, Amir

Committee Member

Nasraoui, Olfa

Abstract

Multiple instance regression (MIR) operates on a collection of bags, where each bag contains multiple instances sharing an identical real-valued label. Only few instances, called primary instances, contribute to the bag labels. The remaining instances are noise and outliers observations. The goal in MIR is to identify the primary instances within each bag and learn a regression model that can predict the label of a previously unseen bag. In this thesis, we introduce an algorithm that uses robust fuzzy clustering with an appropriate distance to learn multiple linear models from a noisy feature space simultaneously. We show that fuzzy memberships are useful in allowing instances to belong to multiple models, while possibilistic memberships allow identification of the primary instances of each bag with respect to each model. We also use possibilistic memberships to identify and ignore noisy instances and determine the optimal number of regression models. We evaluate our approach on a series of synthetic data sets, remote sensing data to predict the yearly average yield of a crop and application to drug activity prediction. We show that our approach achieves higher accuracy than existing methods.

Included in

Robotics Commons

Share

COinS