Our group develops bioinformatics methods for comparative genomics. We are interested in how genomes and their encoded functions change over time scales ranging from developmental transitions in human cells to evolution across the tree of life. The models and tools we create are unified by statistical rigor, a phylogenetic perspective, and massive integration of public data. Collaboration, creating open source software, and promoting open science are priorities. Many of our projects involve machine learning, stochastic process models of genome evolution, longitudinal modeling, sequence analysis, and multiple hypothesis testing.
Projects:
Human Accelerated Regions (HARs): We pioneered a statistical phylogenetic approach to identify the fastest-evolving regions of the human genome (see the lists below) and showed that many of these sequences are developmental enhancers.
- Methodology: Continuous-time Markov models, machine learning
- Ongoing work: Massively parallel reporter assays and CRISPR screens of HARs; fast-evolving DNA in other lineages and in cancer
Lists of accelerated regions:
- 721 HARs identified by us
- Merged list of 2649 HARs from us and other labs (Capra et al. PTRSB, 2013)
- Human acceleration in mammal conserved regions
- Human acceleration in primate conserved regions
- Primate acceleration in mammal conserved regions
Metagenomics: We are designing methods to study the human microbiome and other microbial communities at the resolution of individual genes and genetic mutations.
- Methodology: Population genetics, phylogenetic regression, ecological statistics, bootstrap
- Ongoing work: Rapid metagenotyping, longitudinal microbiome dynamics, phylogenetically aware tests for association with host traits
Regulatory genomics: The lab has several projects related to predicting and validating regulatory enhancers and investigating the role of fine-scale chromatin organization in gene regulation across evolution and disease.
- Methodology: Machine learning, polymer simulations, motif models, clustering
- Ongoing work: Evolution of chromatin boundaries, functional characterization of enhancers and enhancer mutations
Collaborations:
Chan Zuckerberg Biohub Microbiome Initiative – engineering the human microbiome to reveal its role in nutrition, immune function, and drug metabolism
PsychENCODE – massively parallel characterization of psychiatric disease-associated regulatory elements in defined cell types
B2B – Bench to Bassinet: The epigenetic landscape of heart development
BioFulcrum – team science projects to overcome disease
Gladstone Bioinformatics Core – provides expertise in experimental design and the analysis of complex data sets, specializing in large-scale data sets acquired with cutting-edge technologies