A sequence-based global map of regulatory activity for deciphering human genetics.

Title: A sequence-based global map of regulatory activity for deciphering human genetics.
Publication Type: Journal Article
Year of Publication: 2022
Authors: Chen, KM, Wong, AK, Troyanskaya, OG, Zhou, J
Journal: Nat Genet
Volume: 54
Issue: 7
Pagination: 940-949
Date Published: 2022 Jul
ISSN: 1546-1718
Keywords: Chromatin, Epigenomics, Human Genetics, Humans, Quantitative Trait Loci, Regulatory Sequences, Nucleic Acid
Abstract

Epigenomic profiling has enabled large-scale identification of regulatory elements, yet we still lack a systematic mapping from any sequence or variant to regulatory activities. We address this challenge with Sei, a framework for integrating human genetics data with sequence information to discover the regulatory basis of traits and diseases. Sei learns a vocabulary of regulatory activities, called sequence classes, using a deep learning model that predicts 21,907 chromatin profiles across >1,300 cell lines and tissues. Sequence classes provide a global classification and quantification of sequence and variant effects based on diverse regulatory activities, such as cell type-specific enhancer functions. These predictions are supported by tissue-specific expression, expression quantitative trait loci and evolutionary constraint data. Furthermore, sequence classes enable characterization of the tissue-specific, regulatory architecture of complex traits and generate mechanistic hypotheses for individual regulatory pathogenic mutations. We provide Sei as a resource to elucidate the regulatory basis of human health and disease.

DOI: 10.1038/s41588-022-01102-2
Alternate Journal: Nat Genet
PubMed ID: 35817977
PubMed Central ID: PMC9279145
Grant List:
HHSN272201000054C / AI / NIAID NIH HHS / United States
R01 HG005998 / HG / NHGRI NIH HHS / United States
R01 GM071966 / GM / NIGMS NIH HHS / United States
U54 HL117798 / HL / NHLBI NIH HHS / United States
DP2 GM146336 / GM / NIGMS NIH HHS / United States