Date on Master's Thesis/Doctoral Dissertation


Document Type

Doctoral Dissertation

Degree Name

Ph. D.


Computer Engineering and Computer Science

Degree Program

Computer Science and Engineering, PhD

Committee Chair

Elmaghraby, Adel

Committee Co-Chair (if applicable)

El-Baz, Ayman

Committee Member

Zhang, Hi

Committee Member

Imam, Ibrahim

Committee Member

Dunlap, Neal

Author's Keywords

CT; lung cancer; medical imaging; machine learning


The enormity of changes and development in the field of medical imaging technology is hard to fathom, as it does not just represent the technique and process of constructing visual representations of the body from inside for medical analysis and to reveal the internal structure of different organs under the skin, but also it provides a noninvasive way for diagnosis of various disease and suggest an efficient ways to treat them. While data surrounding all of our lives are stored and collected to be ready for analysis by data scientists, medical images are considered a rich source that could provide us with a huge amount of data, that could not be read easily by physicians and radiologists, with valuable information that could be used in smart ways to discover new knowledge from these vast quantities of data. Therefore, the design of computer-aided diagnostic (CAD) system, that can be approved for use in clinical practice that aid radiologists in diagnosis and detecting potential abnormalities, is of a great importance. This dissertation deals with the development of a CAD system for lung cancer diagnosis, which is the second most common cancer in men after prostate cancer and in women after breast cancer. Moreover, lung cancer is considered the leading cause of cancer death among both genders in USA. Recently, the number of lung cancer patients has increased dramatically worldwide and its early detection doubles a patient’s chance of survival. Histological examination through biopsies is considered the gold standard for final diagnosis of pulmonary nodules. Even though resection of pulmonary nodules is the ideal and most reliable way for diagnosis, there is still a lot of different methods often used just to eliminate the risks associated with the surgical procedure. Lung nodules are approximately spherical regions of primarily high density tissue that are visible in computed tomography (CT) images of the lung. A pulmonary nodule is the first indication to start diagnosing lung cancer. Lung nodules can be benign (normal subjects) or malignant (cancerous subjects). Large (generally defined as greater than 2 cm in diameter) malignant nodules can be easily detected with traditional CT scanning techniques. However, the diagnostic options for small indeterminate nodules are limited due to problems associated with accessing small tumors. Therefore, additional diagnostic and imaging techniques which depends on the nodules’ shape and appearance are needed. The ultimate goal of this dissertation is to develop a fast noninvasive diagnostic system that can enhance the accuracy measures of early lung cancer diagnosis based on the well-known hypotheses that malignant nodules have different shape and appearance than benign nodules, because of the high growth rate of the malignant nodules. The proposed methodologies introduces new shape and appearance features which can distinguish between benign and malignant nodules. To achieve this goal a CAD system is implemented and validated using different datasets. This CAD system uses two different types of features integrated together to be able to give a full description to the pulmonary nodule. These two types are appearance features and shape features. For the appearance features different texture appearance descriptors are developed, namely the 3D histogram of oriented gradient, 3D spherical sector isosurface histogram of oriented gradient, 3D adjusted local binary pattern, 3D resolved ambiguity local binary pattern, multi-view analytical local binary pattern, and Markov Gibbs random field. Each one of these descriptors gives a good description for the nodule texture and the level of its signal homogeneity which is a distinguishable feature between benign and malignant nodules. For the shape features multi-view peripheral sum curvature scale space, spherical harmonics expansions, and different group of fundamental geometric features are utilized to describe the nodule shape complexity. Finally, the fusion of different combinations of these features, which is based on two stages is introduced. The first stage generates a primary estimation for every descriptor. Followed by the second stage that consists of an autoencoder with a single layer augmented with a softmax classifier to provide us with the ultimate classification of the nodule. These different combinations of descriptors are combined into different frameworks that are evaluated using different datasets. The first dataset is the Lung Image Database Consortium which is a benchmark publicly available dataset for lung nodule detection and diagnosis. The second dataset is our local acquired computed tomography imaging data that has been collected from the University of Louisville hospital and the research protocol was approved by the Institutional Review Board at the University of Louisville (IRB number 10.0642). These frameworks accuracy was about 94%, which make the proposed frameworks demonstrate promise to be valuable tool for the detection of lung cancer.