Title: Computational studies on structure and functionality of biomolecular compounds
Abstract: This thesis applies standard combinatorial optimization methods, namely dynamic programming and linear programming, to two important problems in computational molecular biology: (1) predicting the secondary structure of RNA molecules and (2) predicting the functionality of small biological compounds.
After 25 years of effort, RNA secondary structure prediction remains elusive. Many of the available algorithms are based on total free energy minimization. Yet, despite numerous attempts to perfect this thermodynamic approach, the results are still far from practical.
We demonstrate that delocalizing the thermodynamic cost of forming an RNA substructure through the notion of energy density can significantly improve available secondary structure prediction methods. Because energy density is nonlinear, the standard dynamic programming approach had to be updated. The updated algorithm can capture the secondary structure of many non-coding RNAs that have been difficult to approximate with alternative methods.
One key application of RNA structure prediction is in understanding how two or more RNAs interact (e.g. an mRNA and a regulatory RNA). In this thesis we formulate RNA-RNA interaction prediction as a combinatorial optimization problem and show how to solve it, again via dynamic programming. Because the algorithm for the most involved formulation of the problem has very high complexity, we also describe heuristic shortcuts, which are highly accurate in practice.
The second set of problems we tackle concerns small chemical molecules, which have key cellular functions. In particular, we focus on structural similarity search among small chemical molecules, a standard approach in in-silico drug discovery. Structural similarity can be used to deduce the bioactivities of new compounds, provided that the notion of similarity reflects the bioactivity in question and that efficient data structures are available for performing the search.
This thesis shows how to computationally design the optimal weighted Minkowski distance wLp for maximizing the discrimination between active and inactive compounds with respect to a bioactivity. It also demonstrates how to construct an iterative pruning based data structure for performing nearest neighbor search under the resulting weighted Lp distance.
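As a rough illustration of the weighted Minkowski distance and the nearest neighbor classification named in the keywords, here is a minimal Python sketch. The abstract does not give the exact form of wLp, so the commonly used definition (sum_i w_i |x_i - y_i|^p)^(1/p) is assumed here. The function names `weighted_lp_distance` and `knn_predict`, the toy descriptor vectors and the uniform weights are all hypothetical. The sketch uses a brute-force scan, not the iterative pruning based data structure, and it does not show how the optimal weights are designed.

```python
from collections import Counter
from typing import Sequence


def weighted_lp_distance(
    x: Sequence[float],
    y: Sequence[float],
    weights: Sequence[float],
    p: float = 2.0,
) -> float:
    """Weighted Minkowski distance, assumed form: (sum_i w_i * |x_i - y_i|**p) ** (1/p)."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    if p < 1:
        raise ValueError("p must be >= 1")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(w * abs(a - b) ** p for a, b, w in zip(x, y, weights))
    return total ** (1.0 / p)


def knn_predict(
    query: Sequence[float],
    compounds: Sequence[Sequence[float]],
    labels: Sequence[str],
    weights: Sequence[float],
    k: int = 3,
    p: float = 2.0,
) -> str:
    """Brute-force k-nearest-neighbor majority vote under the weighted Lp distance."""
    if len(compounds) != len(labels):
        raise ValueError("compounds and labels must have the same length")
    if not 1 <= k <= len(compounds):
        raise ValueError("k must be between 1 and the number of compounds")
    # Rank all known compounds by their distance to the query (linear scan).
    ranked = sorted(
        range(len(compounds)),
        key=lambda i: weighted_lp_distance(query, compounds[i], weights, p),
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two-dimensional descriptor vectors with known activity.
compounds = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
labels = ["inactive", "inactive", "active", "active"]
prediction = knn_predict([0.8, 0.9], compounds, labels, weights=[1.0, 1.0], k=3)
```

Setting a weight to zero removes that descriptor from the comparison entirely, which is one way to see how the choice of weights can sharpen or blur the separation between active and inactive compounds.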
Keywords: RNA secondary structure prediction, energy density, RNA-RNA joint secondary structure prediction, small chemical compounds, k-nearest neighbor classification.
Publication Year: 2007
Publication Date: 2007-01-01
Language: en
Type: dissertation