Date of Award


Degree Name

Master of Science


Computational Science


Ming-Ying Leung


The purpose of this study is to integrate multiple sources of information from patients with acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) to construct organized datasets that enable downstream bioinformatics and statistical analyses of the patients' survival status and overall survival times in relation to their demographic, clinical, and genomic mutation profiles. With NIH Genomic Data Commons as the primary data resource and cBioPortal as the access portal, datasets on 149 and 603 unique patients with AML and ALL, respectively, were obtained. Python scripts were written to compile individual patients' single nucleotide variant (SNV) data files into one dataset for each patient group. In both groups, over 95% of the SNVs occurred only in tumor samples, while less than 0.02% occurred only in normal samples. Compared to normal variants, tumor SNV change types favored mutations that reduced the GC content of genes in both patient groups. Additional results showed shifts of variant densities on all chromosomes, most noticeably on chromosome 11 in patients with AML and chromosome 2 in patients with ALL. One important task accomplished in this work was merging the individual patients' SNV data with their corresponding demographic and clinical information, which includes ethnicity and race, disease classification or staging, and survival outcomes, among other variables. With the merged data, we propose several bioinformatics studies to investigate the functional effects of SNVs and to select likely leukemia-associated genes not yet reported in the published literature. SNV occurrence frequencies in the selected genes will augment the patients' demographic and clinical information to form the final set of variables to be analyzed. Our goal is to establish a predictive model for patients' overall survival times to facilitate discoveries of potential gene therapy targets for acute leukemia.
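
The compile-and-merge workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's actual code: the column names (`chrom`, `pos`, `ref`, `alt`), the clinical fields (`diagnosis`, `os_months`), the helper functions, and the toy records are all hypothetical and invented for illustration, and real Genomic Data Commons or cBioPortal exports use their own headers. The sketch combines per-patient SNV tables into one dataset, attaches clinical fields by patient ID, and tallies whether each single-base change would reduce, increase, or leave unchanged the GC content.

```python
import csv
import io
from collections import Counter


def compile_snvs(patient_files):
    """Combine per-patient SNV tables into one list of rows tagged with patient_id."""
    combined = []
    for patient_id, handle in patient_files.items():
        for row in csv.DictReader(handle, delimiter="\t"):
            row["patient_id"] = patient_id
            combined.append(row)
    return combined


def merge_clinical(snv_rows, clinical_by_patient):
    """Attach each patient's clinical fields to that patient's SNV rows (left join)."""
    merged = []
    for row in snv_rows:
        clinical = clinical_by_patient.get(row["patient_id"], {})
        merged.append({**row, **clinical})
    return merged


def gc_effect(ref, alt):
    """Classify a single-base substitution by its effect on GC content."""
    gc = {"G", "C"}
    if ref in gc and alt not in gc:
        return "GC-reducing"
    if ref not in gc and alt in gc:
        return "GC-increasing"
    return "GC-neutral"


# Toy in-memory inputs standing in for per-patient files (invented values).
patient_files = {
    "P1": io.StringIO("chrom\tpos\tref\talt\n11\t100\tG\tA\n2\t200\tA\tT\n"),
    "P2": io.StringIO("chrom\tpos\tref\talt\n11\t150\tC\tT\n"),
}
# Hypothetical clinical records keyed by patient ID.
clinical = {
    "P1": {"diagnosis": "AML", "os_months": "14"},
    "P2": {"diagnosis": "AML", "os_months": "30"},
}

snvs = compile_snvs(patient_files)
merged = merge_clinical(snvs, clinical)
gc_counts = Counter(gc_effect(r["ref"], r["alt"]) for r in merged)
```

Keying the merge on a patient identifier keeps every SNV row even when a patient has no clinical record, so missing clinical data shows up as absent fields instead of silently dropped variants.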




Received from ProQuest

File Size

91 p.

File Format


Rights Holder

Amanda Bataycan