Date of Award

2013-01-01

Degree Name

Master of Science

Department

Computational Science

Advisor(s)

Ming-Ying Leung

Abstract

The rapidly growing amounts of available biomolecular sequence data, which may represent anything from small gene fragments to large complete genomes, have led to a great need for powerful computational resources for data analysis and storage. With the decoding of the human and other genomes, RNA secondary structure prediction has become an important area of interest in biology and medicine because it helps in understanding the mechanisms of many biological processes, such as gene regulation and viral replication, and in designing RNA-based therapies to treat various diseases. Due to the complexity of their algorithms, many existing and upcoming computational tools for the prediction of RNA secondary structures require large amounts of memory and processing time, and therefore can only handle RNA sequences of limited length. For example, the pknotsRG program, which can predict RNA secondary structures with pseudoknots, can handle no more than 800 nucleotides at one time. However, many RNAs, such as RNA viral genomes, contain thousands of nucleotides, making secondary structure prediction for them impractical if not impossible. I will present an alternative approach in which a cutting method is first applied to generate chunks of the RNA sequence, the pknotsRG program is then used to predict the structure of each chunk, and finally a high-throughput distributed batch computing system called HTCondor is used to reduce the waiting time for the RNA secondary structure prediction.
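
As a rough illustration of the cutting step, the following Python sketch splits a long sequence into pieces that each stay within the 800-nucleotide limit mentioned above. It is not the cutting method developed in the thesis, which the abstract does not specify: the fixed-window strategy, the function name `chunk_sequence`, and the `overlap` parameter are all assumptions made for this example.

```python
def chunk_sequence(seq, max_len=800, overlap=100):
    """Split seq into windows of at most max_len characters.

    Hypothetical helper: a plain sliding window, not the thesis's method.
    Consecutive windows share `overlap` characters. Returns a list of
    (start_index, chunk) pairs so each chunk can be mapped back to its
    position in the full sequence.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap
    chunks = []
    start = 0
    while start < len(seq):
        chunks.append((start, seq[start:start + max_len]))
        # Stop once the current window reaches the end of the sequence.
        if start + max_len >= len(seq):
            break
        start += step
    return chunks


if __name__ == "__main__":
    genome = "ACGU" * 500  # placeholder 2000-nucleotide sequence
    for start, chunk in chunk_sequence(genome):
        print(start, len(chunk))
```

Because the chunks are independent of one another, each one could in principle be handed to a separate prediction job, which is what makes a batch system such as HTCondor a natural fit for reducing the overall waiting time.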

Language

en

Provenance

Received from ProQuest

File Size

42 pages

File Format

application/pdf

Rights Holder

Gerardo A. Cardenas
