Date on Master's Thesis/Doctoral Dissertation

5-2019

Document Type

Doctoral Dissertation

Degree Name

Ph.D.

Department

Computer Engineering and Computer Science

Degree Program

Computer Science and Engineering, PhD

Committee Chair

Nasraoui, Olfa

Committee Co-Chair (if applicable)

Altiparmak, Nihat

Committee Member

Altiparmak, Nihat

Committee Member

Frigui, Hichem

Committee Member

Yampolskiy, Roman

Committee Member

Sanders, Scott

Author's Keywords

algorithmic bias; machine learning; interaction; iterated

Abstract

Algorithmic bias consists of biased predictions born from ingesting unchecked information, such as biased samples and biased labels. Furthermore, the interaction between people and algorithms can exacerbate bias, so that neither the humans nor the algorithms receive unbiased data. Thus, algorithmic bias can be introduced not only before and after the machine learning process, but sometimes also in the middle of the learning process. Only a few categories of bias have been studied in machine learning, and there are few, if any, studies of the impact of bias on both human behavior and algorithm performance. Although most research treats algorithmic bias as a static factor, we argue that algorithmic bias interacts with humans in an iterative manner, producing a long-term effect on algorithms' performance. Recommender systems involve a natural interaction between humans and machine learning algorithms that may introduce bias over time through a continuous feedback loop, leading to increasingly biased recommendations. Therefore, in this work, we view a recommender system environment as generating a continuous chain of events that result from the interactions between users and the recommender system's outputs over time.

In the first part of this dissertation, we employ an iterated-learning framework, inspired by human language evolution, to study the impact of the interaction between machine learning algorithms and humans. Our goal is to study the impact of the interaction between two sources of bias: the process by which people select information to label (human action) and the process by which an algorithm selects the subset of information to present to people (iterated algorithmic bias mode). Specifically, we investigate three forms of iterated algorithmic bias (a personalization filter, active learning, and a random baseline) and how they affect the behavior of machine learning algorithms. Our controlled experiments, which simulate content-based filters, demonstrate that the three iterated bias modes, the initial class imbalance of the training data, and human action all affect the models learned by machine learning algorithms. We also found that iterated filter bias, which is prominent in personalized user interfaces, can lead to increased inequality in estimated relevance and to a limited human ability to discover relevant data.

In the second part of this dissertation, we focus on collaborative filtering recommender systems, which suffer from additional biases due to the popularity of certain items; when coupled with the iterated bias emerging from the feedback loop between humans and algorithms, this popularity bias leads to an increasing divide between the popular items (the haves) and the unpopular items (the have-nots). We thus propose several debiasing algorithms, including a novel blind spot aware Matrix Factorization algorithm, and evaluate how our proposed algorithms impact both prediction accuracy and the trend of increase or decrease in the inequality of the popularity distribution of items over time. Our findings indicate that the relevance blind spot (items from the testing set whose predicted relevance probability is less than 0.5) amounted to 4% of all relevant items when using a content-based filter that predicts relevant items. A similar simulation using a real-life rating data set found that the same filter resulted in a blind spot size of 75% of the relevant testing set.
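As a concrete illustration of the relevance blind spot measured above, the short Python sketch below counts the relevant test items that a content-based filter scores below the 0.5 relevance threshold. This is a minimal sketch rather than the dissertation's actual code: it assumes a fitted probabilistic classifier exposing a scikit-learn-style predict_proba method, and the function and variable names are illustrative only.

    import numpy as np

    def relevance_blind_spot(model, X_test, y_test, threshold=0.5):
        # Predicted probability that each test item is relevant (class 1),
        # assuming a classifier with a scikit-learn-style predict_proba.
        p_relevant = model.predict_proba(X_test)[:, 1]
        relevant = np.asarray(y_test) == 1
        # Relevant items scored below the threshold form the blind spot:
        # items a user relying on the filter is unlikely to ever discover.
        blind = relevant & (p_relevant < threshold)
        return blind.sum() / max(relevant.sum(), 1)

Under these assumptions, the 4% and 75% figures quoted above would correspond to evaluating such a measure on the synthetic and the real-life rating data, respectively.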
In the case of collaborative filtering on synthetic rating data with 20 latent factors, conventional Matrix Factorization resulted in a ranking-based blind spot (items whose predicted ratings are below 90% of the maximum predicted rating) covering between 95% and 99% of all items on average. Both propensity-based Matrix Factorization methods resulted in blind spots covering between 94% and 96% of all items, while the blind spot aware Matrix Factorization resulted in a ranking-based blind spot of around 90% to 94% of all items. For semi-synthetic data (a real rating data set completed with Matrix Factorization), Matrix Factorization with 20 latent factors resulted in a ranking-based blind spot containing between 95% and 99% of all items. The popularity-based and Poisson-based propensity Matrix Factorization methods resulted in a ranking-based blind spot of between 96% and 97% of all items, while the blind spot aware Matrix Factorization resulted in a ranking-based blind spot of between 92% and 96% of all items. Considering that recommender systems are typically used as gateways that filter massive amounts of information (in the millions) for relevance, these differences in blind spot percentages (every 1% amounts to tens of thousands of items or options) show that debiasing these systems can have significant repercussions on the amount of information and the space of options that humans who interact with algorithmic filters can discover.
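The ranking-based blind spot figures reported above for the Matrix Factorization variants can be sketched in a similarly hedged way. The snippet below is an illustration under stated assumptions, not the dissertation's implementation: it builds the predicted rating matrix from hypothetical latent factor matrices U and V (for example, with 20 latent factors) and reads the 90% cutoff relative to each user's maximum predicted rating, which is one plausible interpretation of the definition given above.

    import numpy as np

    def ranking_blind_spot(U, V, cutoff=0.9):
        # Predicted rating matrix from a Matrix Factorization model with
        # user factors U (n_users x k) and item factors V (n_items x k).
        predicted = U @ V.T
        # An item is in a user's ranking-based blind spot when its predicted
        # rating falls below `cutoff` times that user's maximum predicted
        # rating (the per-user reading of the 90% cutoff is an assumption).
        per_user_max = predicted.max(axis=1, keepdims=True)
        below = predicted < cutoff * per_user_max
        # Average blind spot size across users, as a fraction of all items.
        return below.mean()

Comparing this fraction across conventional, propensity-based, and blind spot aware Matrix Factorization is what the percentages above summarize.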
