Identification of Prostate Cancer-Associated Genomic Alterations by Analyzing Variant Frequencies, Functional Effects, and Protein Interactions
With the ever-increasing variety of sequencing techniques, the volume and scope of genomic data are expanding explosively, offering unparalleled opportunities for researchers to study gene-disease associations, identify biomarkers, and thus develop more effective diagnostic and therapeutic strategies. In this project, I have developed a computational workflow and a new scoring scheme that combine statistical frequency-based analyses with two well-established functional effect prediction tools, FATHMM and PROVEAN, to evaluate nonsynonymous GSVs and identify potential cancer-related protein-coding genes for downstream enrichment and protein-protein interaction (PPI) studies. This method has been applied to a collection of 503 whole exome sequencing datasets from patients with prostate cancer (PrCa). The datasets were downloaded from The Cancer Genome Atlas as variant call format (VCF) files containing GSV information for paired tumor and normal samples. Exploratory statistics revealed an unusually high level of G→A and C→T transitions among cancer samples. Furthermore, 5 GSVs were found to be significantly associated with the disease. Among 61 high-scoring genes identified by the scoring scheme, 27 were found by PPI analysis to have degrees of connection ≥ 4 with well-known PrCa-related genes. While 18 of them are reportedly associated with PrCa, 9 genes (TRRAP, EPHB1, HERC2, MCM3, SPTA1, SALL1, HERC1, TTN, and MYH6) have not been previously documented in relation to PrCa. Their potential roles in PrCa could be investigated by further bioinformatics and wet-lab studies.
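The exploratory step that tallies base substitutions such as G→A and C→T can be illustrated with a minimal sketch. This is not the workflow used in the thesis. It assumes plain-text, tab-delimited VCF data lines with REF in column 4 and ALT in column 5, and it counts only single-nucleotide changes. The function names `count_substitutions` and `transition_fraction` and the example records are hypothetical and serve only as illustration.

```python
from collections import Counter

# The four transition classes (purine<->purine, pyrimidine<->pyrimidine).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def count_substitutions(vcf_lines):
    """Count single-nucleotide REF->ALT changes in VCF data lines.

    Header lines (starting with '#'), blank lines, indels, and
    non-ACGT alleles are skipped. Multi-allelic ALT fields are split
    on commas and each allele is counted separately.
    """
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed record, ignore
        ref = fields[3].upper()
        for alt in fields[4].upper().split(","):
            is_snv = len(ref) == 1 and len(alt) == 1
            if is_snv and ref in "ACGT" and alt in "ACGT" and ref != alt:
                counts[(ref, alt)] += 1
    return counts


def transition_fraction(counts):
    """Return the share of counted substitutions that are transitions."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    ts = sum(n for pair, n in counts.items() if pair in TRANSITIONS)
    return ts / total


# Synthetic records for illustration only (not TCGA data).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tG\tA\t.\tPASS\t.",
    "1\t200\t.\tC\tT\t.\tPASS\t.",
    "2\t300\t.\tA\tC\t.\tPASS\t.",
    "2\t400\t.\tAT\tA\t.\tPASS\t.",  # deletion, not counted
]
counts = count_substitutions(example)
```

Comparing such counts between tumor and normal samples is one way an excess of particular transitions might be noticed. Any significance testing would be a separate step.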
Wang, Bofei, "Identification of Prostate Cancer-Associated Genomic Alterations by Analyzing Variant Frequencies, Functional Effects, and Protein Interactions" (2021). ETD Collection for University of Texas, El Paso. AAI13886011.