  • Analysis

GREAT improves functional interpretation of cis-regulatory regions

Abstract

We developed the Genomic Regions Enrichment of Annotations Tool (GREAT) to analyze the functional significance of cis-regulatory regions identified by localized measurements of DNA binding events across an entire genome. Whereas previous methods took into account only binding proximal to genes, GREAT is able to properly incorporate distal binding sites and control for false positives using a binomial test over the input genomic regions. GREAT incorporates annotations from 20 ontologies and is available as a web application. Applying GREAT to data sets from chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) of multiple transcription-associated factors, including SRF, NRSF, GABP, Stat3 and p300 in different developmental contexts, we recover many functions of these factors that are missed by existing gene-based tools, and we generate testable hypotheses. The utility of GREAT is not limited to ChIP-seq, as it could also be applied to open chromatin, localized epigenomic markers and similar functional data sets, as well as comparative genomics sets.
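The binomial test over input genomic regions mentioned above can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so the details here are assumptions made for illustration: each input region is treated as one trial, the success probability is the fraction of the genome covered by the regulatory domains of genes carrying a given annotation, and a region counts as a hit when its midpoint falls inside such a domain. All function names, the single toy chromosome and the coordinates are hypothetical and are not taken from the GREAT software.

```python
from math import comb


def binomial_upper_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def annotated_fraction(domains, genome_size):
    """Fraction of a toy genome covered by the union of half-open domains."""
    covered = 0
    covered_until = 0  # right edge of everything counted so far
    for start, end in sorted(domains):
        start = max(start, covered_until)  # skip any overlap already counted
        if end > start:
            covered += end - start
            covered_until = end
    return covered / genome_size


def count_hits(regions, domains):
    """Count input regions whose midpoint lies inside any domain."""
    hits = 0
    for start, end in regions:
        mid = (start + end) // 2
        if any(d_start <= mid < d_end for d_start, d_end in domains):
            hits += 1
    return hits


# Toy data (hypothetical): one 1,000 bp chromosome, two overlapping
# regulatory domains for genes sharing an annotation, five input regions.
genome_size = 1000
domains = [(100, 200), (150, 300)]
regions = [(110, 120), (250, 260), (500, 510), (290, 298), (900, 910)]

p = annotated_fraction(domains, genome_size)  # chance a random region hits
k = count_hits(regions, domains)              # observed hits
n = len(regions)                              # number of trials
p_value = binomial_upper_tail(n, k, p)
print(f"p={p:.2f}, hits={k}/{n}, binomial P(X >= k)={p_value:.5f}")
```

Because the success probability is measured as genomic coverage rather than as a count of genes, an annotation whose genes have large regulatory domains is expected to collect more hits by chance, which is how a test of this shape can account for distal binding sites.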

Figure 1: Enrichment analysis of a set of cis-regulatory regions.
Figure 2: Binding profiles and their effects on statistical tests.
Figure 3: Distal binding events contribute substantially to accurate functional enrichments of p300 limb peaks.

Acknowledgements

We thank M. Sirota for an early survey of ontologies, F. Sathira for developing an intermediary core calculation engine, T. Capellini for critical reading of the manuscript, M. Davis and S. Gutierrez for system administration and the communities of ontology developers and curators for providing invaluable data sources. C.Y.M. is supported by a Bio-X graduate fellowship. M.H. is supported by a German Research Foundation Fellowship (Hi 1423/2-1) and the Human Frontier Science Program (fellowship LT000896/2009-l). S.L.C. is a Howard Hughes Medical Institute Gilliam Fellow. A.M.W. is supported by a Stanford Graduate Fellowship. G.B. is a Packard Fellow, Searle Scholar, Microsoft Research Faculty Fellow and an Alfred P. Sloan Fellow. Research was also supported by an Edward Mallinckrodt, Jr. Foundation junior faculty grant and US National Institutes of Health grant 1R01HD059862 to G.B.

Author information

Contributions

C.Y.M. developed the core calculation engine, processed ontologies, analyzed data sets and co-wrote the manuscript. D.B. designed and developed the web application. M.H. added key ontologies and calculated ontology statistics. S.L.C. performed and wrote the SRF analysis. B.T.S. contributed to data set analysis and manuscript writing. A.M.W. guided website design and wrote user documentation. G.B. and C.B.L. devised the different enrichment tests and developed early core calculation engines. G.B. supervised the project and co-wrote the manuscript. All authors edited the manuscript.

Corresponding author

Correspondence to Gill Bejerano.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Note, Supplementary Figures 1–4 and Supplementary Tables 1–46 (PDF 5181 kb)

About this article

Cite this article

McLean, C., Bristor, D., Hiller, M. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol 28, 495–501 (2010). https://doi.org/10.1038/nbt.1630

  • DOI: https://doi.org/10.1038/nbt.1630
