Inferring interaction partners from protein sequences.

Title: Inferring interaction partners from protein sequences.
Publication Type: Journal Article
Year of Publication: 2016
Authors: Bitbol, A-F; Dwyer, RS; Colwell, LJ; Wingreen, NS
Journal: Proc Natl Acad Sci U S A
Volume: 113
Issue: 43
Pagination: 12180-12185
Date Published: 2016 Oct 25
ISSN: 1091-6490
Keywords: Algorithms; ATP-Binding Cassette Transporters; Bacteria; Entropy; Histidine Kinase; Protein Binding; Protein Interaction Maps; Sequence Analysis, Protein; Signal Transduction
Abstract

Specific protein-protein interactions are crucial in the cell, both to ensure the formation and stability of multiprotein complexes and to enable signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interaction partners, causing their sequences to be correlated. Here we exploit these correlations to accurately identify, from sequence data alone, which proteins are specific interaction partners. Our general approach, which employs a pairwise maximum entropy model to infer couplings between residues, has been successfully used to predict the 3D structures of proteins from sequences. Thus inspired, we introduce an iterative algorithm to predict specific interaction partners from two protein families whose members are known to interact. We first assess the algorithm's performance on histidine kinases and response regulators from bacterial two-component signaling systems. We obtain a striking 0.93 true positive fraction on our complete dataset without any a priori knowledge of interaction partners, and we uncover the origin of this success. We then apply the algorithm to proteins from ATP-binding cassette (ABC) transporter complexes, and obtain accurate predictions in these systems as well. Finally, we present two metrics that accurately distinguish interacting protein families from noninteracting ones, using only sequence data.

DOI: 10.1073/pnas.1606762113
Alternate Journal: Proc Natl Acad Sci U S A
PubMed ID: 27663738
PubMed Central ID: PMC5087060
Grant List: R01 GM082938 / GM / NIGMS NIH HHS / United States