Mona Singh

Associated Faculty, Computer Science and the Lewis-Sigler Institute for Integrative Genomics
Computer Science Building, 420

Research Area

Genetics & Genomics

Research Focus

Computational molecular biology

My group focuses on developing and applying computational techniques to problems in molecular biology. We are particularly interested in developing algorithms for genome-level analysis of protein structure and protein-protein interactions.

Since a genome contains a complete 'parts list' of an organism, whole-genome data allows one to begin to address exhaustively the problem of determining and predicting which proteins can interact with each other. Traditionally, knowledge of protein-protein interactions has been accumulated from biochemical and genetic experiments; however, as whole-genome data accumulates, it becomes increasingly necessary to develop computational methods for predicting these interactions. Computational methods have already proven to be a useful first step for rapid genome-wide identification of putative protein function and structure, but research on the problem of computationally determining biologically relevant partners for given protein sequences is just beginning.

The difficulty of the general protein structure prediction problem precludes prediction at a detailed structural level (e.g., at the atomic level). Additionally, the constraint of genomic-level analysis favors a focus on fast, informatics-based methods. Thus, we simplify the problem of predicting protein-protein interactions in two complementary ways, one structural and the other genomic. Our structural approach has been to focus on particular structural motifs that mediate protein-protein interactions, and to develop fast computational methods both for recognizing these motifs within protein sequences and for predicting which of these sequences interact with each other. Our genomic approach has been to exploit and integrate information gleaned from whole- and cross-genome analysis. Instead of explicitly using information about protein structure, these methods exploit the following ideas: (1) if two proteins interact in one genome, their homologues in other genomes are likely to interact as well, and (2) regulatory information present in whole-genome sequence data or genome-wide expression data can be used to make predictions about protein function and protein-protein interactions.
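
As a rough illustration of idea (1), the Python sketch below transfers known interactions from one genome to another through a homologue table. It is only a minimal sketch under stated assumptions, not the group's actual method: the function name `transfer_interactions`, the protein identifiers, and the homologue mapping are all hypothetical, and a real analysis would also need to score homology and weigh the confidence of each transferred pair.

```python
from itertools import product


def transfer_interactions(known_interactions, homologs):
    """Predict interactions in a target genome from a source genome.

    known_interactions: iterable of (protein_a, protein_b) pairs observed
        in the source genome (hypothetical identifiers).
    homologs: dict mapping a source protein to the set of its homologues
        in the target genome (assumed to be given).
    Returns a set of unordered pairs, each stored as a frozenset.
    """
    predicted = set()
    for a, b in known_interactions:
        # Pair every homologue of a with every homologue of b.
        # Proteins with no homologue in the target contribute nothing.
        for x, y in product(homologs.get(a, ()), homologs.get(b, ())):
            predicted.add(frozenset((x, y)))
    return predicted


# Toy data: srcC has no homologue in the target genome.
known = [("srcA", "srcB"), ("srcB", "srcC")]
homologs = {"srcA": {"tgt1"}, "srcB": {"tgt2", "tgt3"}}
predicted = transfer_interactions(known, homologs)
```

In this toy example the interaction between srcA and srcB yields two candidate pairs in the target genome (tgt1 with tgt2, and tgt1 with tgt3), while the interaction involving srcC cannot be transferred because srcC has no listed homologue.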

Thus far, much of our work on predicting protein structure and protein-protein interactions has focused on the coiled coil motif. The coiled coil is a common and important structural motif that mediates protein-protein interactions, and is found in proteins involved in transcription, in cell-cell and viral-cell fusion events, and in maintaining the structural identity of cells. We have developed highly effective sequence-based methods for identifying whether a given protein sequence can take part in a coiled coil structure, and are currently developing novel computational techniques to predict whether two coiled coil proteins interact with each other, and if so, what the nature of this interaction is.