Sharing DNA-binding information across structurally similar proteins enables accurate specificity determination.

Title: Sharing DNA-binding information across structurally similar proteins enables accurate specificity determination.
Publication Type: Journal Article
Year of Publication: 2020
Authors: Wetzel JL, Singh M
Journal: Nucleic Acids Res
Volume: 48
Issue: 2
Pagination: e9
Date Published: 2020 01 24
ISSN: 1362-4962
Keywords: Binding Sites; Biochemical Phenomena; Biophysical Phenomena; CYS2-HIS2 Zinc Fingers; DNA-Binding Proteins; Protein Binding; Structural Homology, Protein
Abstract

We are now in an era where protein-DNA interactions have been experimentally assayed for thousands of DNA-binding proteins. In order to infer DNA-binding specificities from these data, numerous sophisticated computational methods have been developed. These approaches typically infer DNA-binding specificities by considering interactions for each protein independently, ignoring related and potentially valuable interaction information across other proteins that bind DNA via the same structural domain. Here we introduce a framework for inferring DNA-binding specificities by considering protein-DNA interactions for entire groups of structurally similar proteins simultaneously. We devise both constrained optimization and label propagation algorithms for this task, each balancing observations at the individual protein level against dataset-wide consistency of interaction preferences. We test our approaches on two large, independent Cys2His2 zinc finger protein-DNA interaction datasets. We demonstrate that jointly inferring specificities within each dataset individually dramatically improves accuracy, leading to increased agreement both between these two datasets and with a fixed external standard. Overall, our results suggest that sharing protein-DNA interaction information across structurally similar proteins is a powerful means to enable accurate inference of DNA-binding specificities.
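
The balance the abstract describes, between each protein's own observations and consistency across similar proteins, can be illustrated with a generic label propagation update. The sketch below is not the authors' algorithm: it is the textbook neighbour-averaging scheme, and the function name `propagate`, the parameter `alpha`, the similarity matrix `weights`, and the toy numbers are all assumptions made up for illustration.

```python
def propagate(observed, weights, alpha=0.5, n_iter=50):
    """Smooth per-protein base preferences over a similarity graph.

    observed: one probability distribution over (A, C, G, T) per protein,
              standing in for a noisy per-protein estimate (toy data).
    weights:  non-negative pairwise similarity weights (hypothetical).
    alpha:    0 keeps each protein's own observation unchanged;
              values near 1 rely mostly on the neighbours.
    """
    n = len(observed)
    current = [row[:] for row in observed]
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total = sum(weights[i][j] for j in range(n) if j != i)
            row = []
            for k in range(len(observed[i])):
                if total > 0:
                    # Weighted average of the neighbours' current estimates.
                    neigh = sum(
                        weights[i][j] * current[j][k]
                        for j in range(n) if j != i
                    ) / total
                else:
                    # An isolated protein falls back on its own observation.
                    neigh = observed[i][k]
                row.append((1 - alpha) * observed[i][k] + alpha * neigh)
            updated.append(row)
        current = updated
    return current


# Toy input: two proteins agree on a strong preference for A,
# a third has a weaker, noisier observation.
observed = [
    [0.90, 0.03, 0.04, 0.03],
    [0.85, 0.05, 0.05, 0.05],
    [0.40, 0.20, 0.20, 0.20],
]
weights = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
smoothed = propagate(observed, weights, alpha=0.5)
```

Because every update is a convex combination of probability distributions, each smoothed row still sums to 1, and the third protein's preference for A is pulled toward that of its neighbours.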

DOI: 10.1093/nar/gkz1087
Alternate Journal: Nucleic Acids Res
PubMed ID: 31777934
PubMed Central ID: PMC7028011