Mona Singh

Contact
msingh@cs.princeton.edu

Research Area
Genetics & Genomics

Research Focus
Computational molecular biology

My group focuses on developing and applying computational techniques to problems in molecular biology. We are particularly interested in developing algorithms for genome-level analysis of protein structure and protein-protein interactions.
Since a genome contains a complete 'parts list' of an organism, whole-genome data allows one to begin to address exhaustively the problem of determining and predicting which proteins can interact with each other. Traditionally, knowledge of protein-protein interactions has been accumulated from biochemical and genetic experiments; however, as whole-genome data accumulates, it becomes increasingly necessary to develop computational methods for predicting these interactions. Computational methods have already proven to be a useful first step for rapid genome-wide identification of putative protein function and structure, but research on the problem of computationally determining biologically relevant partners for given protein sequences is just beginning.
The difficulty of the general protein structure prediction problem precludes prediction at a detailed structural level (e.g., at the atomic level). Additionally, the constraint of genomic-level analysis favors a focus on fast, informatics-based methods. Thus, we simplify the problem of predicting protein-protein interactions in two complementary ways, one structural and the other genomic. Our structural approach has been to focus on particular structural motifs that mediate protein-protein interactions, and to develop fast computational methods both for recognizing these motifs within protein sequences and for predicting which of these sequences interact with each other. Our genomic approach has been to exploit and integrate information gleaned from whole- and cross-genome analysis. Instead of explicitly using information about protein structure, these methods exploit the following ideas: (1) if two proteins interact in one genome, their homologues in other genomes are likely to interact as well, and (2) regulatory information present in whole-genome sequence data or genome-wide expression data can be used to make predictions about protein function and protein-protein interactions.
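As a rough illustration of idea (1), the sketch below transfers known interactions from one genome to another through a homologue table. It is a minimal sketch, not the group's actual method: the function name `transfer_interactions`, the protein identifiers, and the homologue map are all hypothetical, and a real pipeline would also need to score how confident each homology assignment is.

```python
from itertools import product

def transfer_interactions(known_pairs, homologs):
    """Predict target-genome interactions from source-genome interactions.

    known_pairs: iterable of (a, b) protein IDs known to interact in the source genome.
    homologs: dict mapping a source protein ID to a list of target-genome homologue IDs
              (hypothetical input; how homologues are found is outside this sketch).
    """
    predicted = set()
    for a, b in known_pairs:
        # Every homologue of a is paired with every homologue of b.
        for x, y in product(homologs.get(a, []), homologs.get(b, [])):
            # Store each pair in sorted order so (x, y) and (y, x) count once.
            predicted.add(tuple(sorted((x, y))))
    return predicted

# Made-up example data: srcD has no homologue, so the srcC-srcD pair transfers nothing.
source_pairs = [("srcA", "srcB"), ("srcC", "srcD")]
homolog_map = {"srcA": ["tgtA1", "tgtA2"], "srcB": ["tgtB"], "srcC": ["tgtC"]}
predictions = transfer_interactions(source_pairs, homolog_map)
print(sorted(predictions))
```

Treating every homologue pair as a candidate keeps the sketch simple but will over-predict when a protein family is large; filtering by homology strength is the obvious refinement.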
Thus far, much of our work on predicting protein structure and protein-protein interactions has focused on the coiled coil motif. The coiled coil is a common and important structural motif that mediates protein-protein interactions, and is found in proteins involved in transcription, in cell-cell and viral-cell fusion events, and in maintaining the structural identity of cells. We have developed highly effective sequence-based methods for identifying whether a given protein sequence can take part in a coiled coil structure, and are currently developing novel computational techniques to predict whether two coiled coil proteins interact with each other, and if so, what the nature of this interaction is.
- Learning probabilistic protein-DNA recognition codes from DNA-binding specificities using structural mappings. Genome Res. 2022;32(9):1776-86.
- Neuronal identities derived by misexpression of the POU IV sensory determinant in a protovertebrate. Proc Natl Acad Sci U S A. 2022;119(4).
- Metabolite discovery through global annotation of untargeted metabolomics data. Nat Methods. 2021;18(11):1377-1385.
- Comparative genomic analysis reveals varying levels of mammalian adaptation to coronavirus infections. PLoS Comput Biol. 2021;17(11):e1009560.
- dSPRINT: predicting DNA, RNA, ion, peptide and small molecule interaction sites within protein domains. Nucleic Acids Res. 2021;49(13):e78.
- Improved inference of tandem domain duplications. Bioinformatics. 2021;37(Suppl_1):i133-i141.
- uKIN Combines New and Prior Information with Guided Network Propagation to Accurately Identify Disease Genes. Cell Syst. 2020;10(6):470-479.e3.
- PertInInt: An Integrative, Analytical Approach to Rapidly Uncover Cancer Driver Genes with Perturbed Interactions and Functionalities. Cell Syst. 2020;11(1):63-74.e7.
- Sharing DNA-binding information across structurally similar proteins enables accurate specificity determination. Nucleic Acids Res. 2020;48(2):e9.
- Differential Allele-Specific Expression Uncovers Breast Cancer Genes Dysregulated by Cis Noncoding Mutations. Cell Syst. 2020;10(2):193-203.e4.
- DeMaSk: a deep mutational scanning substitution matrix and its use for variant impact prediction. Bioinformatics. 2020;36(22-23):5322-9.
- Universality and diversity in human song. Science. 2019;366(6468).
- Systematic domain-based aggregation of protein structures highlights DNA-, RNA- and other ligand-binding positions. Nucleic Acids Res. 2019;47(2):582-593.
- Two critical positions in zinc finger domains are heavily mutated in three human cancer types. PLoS Comput Biol. 2018;14(6):e1006290.
- Network-Based Coverage of Mutational Profiles Reveals Cancer Genes. Cell Syst. 2017;5(3):221-229.e4.
- Integrative analysis unveils new functions for the Cutoff protein in noncoding RNA biogenesis and gene regulation. RNA. 2017;23(7):1097-1109.
- Differential analysis between somatic mutation and germline variation profiles reveals cancer-related genes. Genome Med. 2017;9(1):79.
- Domain prediction with probabilistic directional context. Bioinformatics. 2017;33(16):2471-2478.
- TCT-230 Rotational Atherectomy vs Orbital Atherectomy in Calcified Coronary Artery Disease: A Contemporary Retrospective Comparative Analysis (ROCC study). J Am Coll Cardiol. 2016;68(18S):B93-B94.
- Genome-Wide Detection and Analysis of Multifunctional Genes. PLoS Comput Biol. 2015;11(10):e1004467.
- Beyond the E-Value: Stratified Statistics for Protein Domain Prediction. PLoS Comput Biol. 2015;11(11):e1004509.
- Pervasive variation of transcription factor orthologs contributes to regulatory network evolution. PLoS Genet. 2015;11(3):e1005011.
- A systematic survey of the Cys2His2 zinc finger DNA-binding landscape. Nucleic Acids Res. 2015;43(3):1965-84.
- Stratification of coronary artery disease patients for revascularization procedure based on estimating adverse effects. BMC Med Inform Decis Mak. 2015;15:9.
- molBLOCKS: decomposing small molecule sets and uncovering enriched fragments. Bioinformatics. 2014;30(14):2081-3.