Date on Master's Thesis/Doctoral Dissertation


Document Type

Master's Thesis

Degree Name


Department (Legacy)

Department of Biostatistics

Committee Chair

Thompson, Caryn M.


Subject

Heart--Diseases--Patients--Rehabilitation; Data mining

Abstract


The purpose of this paper is to examine the process of text mining and to use its results to show the possible benefits of cardiopulmonary rehabilitation. The 555 patients enrolled in the study were receiving inpatient cardiopulmonary rehabilitation. Each patient had associated comorbidity codes, which are secondary diagnoses accompanying the cardiac or pulmonary event that resulted in hospitalization. The number of secondary conditions per patient ranged from 1 to 10. Patients were assessed for functional independence at admission and at discharge. Because each patient has numerous comorbidity codes, analyzing every code individually would be difficult. We therefore text mine these codes to create meaningful clusters that aid the analysis. This paper explains the process and theory of text mining and clustering, and then uses the resulting clusters in a statistical analysis of the benefits of cardiopulmonary rehabilitation.
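
The idea of text mining comorbidity codes into clusters can be illustrated with a minimal sketch. The abstract does not say which weighting scheme or clustering algorithm the thesis uses, so TF-IDF weighting and k-means are assumptions chosen only for illustration. The patient identifiers, the codes, and the function names `tfidf_vectors` and `kmeans` are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

# Hypothetical data: one "document" per patient, made of comorbidity codes.
# These codes are illustrative placeholders, not data from the study.
patients = {
    "p1": ["250.00", "401.9", "272.4"],
    "p2": ["250.00", "401.9"],
    "p3": ["496", "518.81"],
    "p4": ["496", "518.81", "486"],
}

def tfidf_vectors(docs):
    """Return the sorted vocabulary and one L2-normalised TF-IDF vector per patient."""
    vocab = sorted({c for codes in docs.values() for c in codes})
    n_docs = len(docs)
    # Number of patients in which each code appears at least once.
    doc_freq = Counter(c for codes in docs.values() for c in set(codes))
    vectors = {}
    for pid, codes in docs.items():
        counts = Counter(codes)
        # Term count times a smoothed inverse document frequency (assumed form).
        vec = [counts[c] * (math.log(n_docs / doc_freq[c]) + 1.0) for c in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors[pid] = [x / norm for x in vec]
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, max_iter=100):
    """Plain k-means with a deterministic farthest-point initialisation."""
    ids = list(vectors)
    centroids = [vectors[ids[0]]]
    while len(centroids) < k:
        # Next centroid: the patient farthest from all centroids chosen so far.
        far = max(ids, key=lambda p: min(sq_dist(vectors[p], c) for c in centroids))
        centroids.append(vectors[far])
    labels = {}
    for _ in range(max_iter):
        # Assignment step: each patient goes to the nearest centroid.
        new_labels = {
            p: min(range(k), key=lambda j: sq_dist(vectors[p], centroids[j]))
            for p in ids
        }
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [vectors[p] for p in ids if labels[p] == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

vocab, vectors = tfidf_vectors(patients)
labels = kmeans(vectors, k=2)
```

In this toy example, patients who share codes end up with the same cluster label. A cluster label of this kind could then serve as a single categorical variable in a later statistical analysis, in place of many individual codes.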