Date on Master's Thesis/Doctoral Dissertation


Document Type

Doctoral Dissertation

Degree Name

Ph. D.


Department

Bioinformatics and Biostatistics

Degree Program

Biostatistics, PhD

Committee Chair

Wu, Dongfeng

Committee Co-Chair (if applicable)

Rai, Shesh

Committee Member

Rai, Shesh

Committee Member

Kloecker, Goetz

Committee Member

Zheng, Qi

Committee Member

Gaskins, Jeremy

Committee Member

Mitra, Ritendranath

Author's Keywords

cancer screening; lung cancer; sensitivity; sojourn time; lead time; probability

Abstract


This dissertation contains three research projects on probability modeling of cancer screening. Screening is the primary technique for early detection; its goal is to catch the disease before clinical symptoms appear. In these projects, the three key parameters (screening sensitivity, the transition probability from the disease-free state to the preclinical state, and the sojourn time in the preclinical state) and the lead time distribution were estimated to provide a statistical view of the effectiveness of cancer screening programs. In the first project, a cancer screening probability model was applied to the computed tomography (CT) arm of the National Lung Screening Trial (NLST). The three key parameters were estimated using a Bayesian approach with Markov chain Monte Carlo simulation. The sensitivity of lung cancer screening by CT scan is much higher than that of screening by X-ray. The transition probability from the disease-free state to the preclinical state peaks around age 70 for both genders. The posterior mean sojourn time is around 1.5 years for all groups. The second project deals with estimation of the lead time distribution. Because the lead time is unobservable, the effectiveness of screening exams in terms of survival benefit is a major concern. In this study, estimates of the projected lead time were obtained using the NLST CT arm data. Simulation results show that, for both genders, the probability of no early detection increases monotonically as the screening interval increases. The mean lead time appears longer for women than for men. Previous work assumed that a person has no screening history before entering the study. However, participants in screening programs are usually older adults who may already have had at least one prior screening exam and appear healthy. In the third project, we extended the previously developed lead time distribution to account for an individual's screening history and examined how much this history affects the lead time. We ran simulations for each combination of initial screening age, sensitivity, mean sojourn time, current age, and past and future screening schedule. We also applied the newly developed lead time distribution to the NLST data.

Included in

Biostatistics Commons