Date on Master's Thesis/Doctoral Dissertation
5-2017
Document Type
Master's Thesis
Degree Name
M.S.
Department
Computer Engineering and Computer Science
Degree Program
Computer Science, MS
Committee Chair
Frigui, Hichem
Committee Co-Chair (if applicable)
Sunkara, Mahendra
Committee Member
Sunkara, Mahendra
Committee Member
Nasraoui, Olfa
Author's Keywords
machine learning; materials science; bandgap; chalcopyrites; defect-induced magnetism; materials genome initiative
Abstract
The high pace of nowadays industrial evolution is creating an urgent need to design new cost efficient materials that can satisfy both current and future demands. However, with the increase of structural and functional complexity of materials, the ability to rationally design new materials with a precise set of properties has become increasingly challenging. This basic observation has triggered the idea of applying machine learning techniques in the field, which was further encouraged by the launch of the Materials Genome Initiative (MGI) by the US government since 2011. In this work, we present a novel approach to apply machine learning techniques for materials science applications. Guided by knowledge from domain experts, our approach focuses on machine learning to accelerate data-driven discovery of materials properties. Our objectives are two folds: (i) Identify the optimal set of features that best describes a given predicted variable. (ii) Boost prediction accuracy via applying various regression algorithms. Ordinary Least Square, Partial Least Square and Lasso regressions, combined with well adjusted feature selection techniques are applied and tested to predict key properties of semiconductors for two types of applications. First, we propose to build a more robust prediction model for band-gap energy (BG-E) of chalcopyrites, commonly used for solar cells industry. Compared to the results reported in [1-3] , our approach shows that learning and using only a subset of relevant features can improve the prediction accuracy by about 40%. For the second application, we propose to determine the underlying factors responsible for Defect-Induced Magnetism (DIM) in Dilute Magnetic Semiconductors (DMS) through the analysis of a set of 30 features for different DMS systems. We show that 8 of these features are more likely to contribute to this property. Using only these features to predict the total magnetic moment of new candidate DMSs has reduced the mean square error by about 90% compared to the models trained using the whole set of features. Given the scarcity of the available data sets for similar applications, this work aims not only to build robust models but also to establish a collaborative platform for future research.
Recommended Citation
Khmaissia, Fadoua, "Data driven discovery of materials properties." (2017). Electronic Theses and Dissertations. Paper 2700.
https://doi.org/10.18297/etd/2700
Included in
Other Chemical Engineering Commons, Other Computer Engineering Commons, Other Engineering Science and Materials Commons, Other Materials Science and Engineering Commons