Research

Our group develops bioinformatics methods for comparative genomics. We are interested in how genomes and their encoded functions change over time scales ranging from developmental transitions in human cells to evolution across the tree of life. The models and tools we create are unified by statistical rigor, a phylogenetic perspective, and massive integration of public data. Collaboration, creating open-source software, and promoting open science are priorities. Many of our projects involve machine learning, stochastic process models of genome evolution, longitudinal modeling, sequence analysis, and multiple hypothesis testing.

Projects:

Human Accelerated Regions (HARs): We pioneered a statistical phylogenetic approach to identify the fastest-evolving regions of the human genome and showed that many of these sequences are developmental enhancers.

Metagenomics: We are designing methods to study the human microbiome and other microbial communities at the resolution of individual genes and genetic mutations.

  • Methodology: Population genetics, phylogenetic regression, ecological statistics, bootstrap
  • Ongoing work: Rapid metagenotyping, longitudinal microbiome dynamics, phylogenetically aware tests for association with host traits

Regulatory genomics: The lab has several projects related to predicting and validating regulatory enhancers and investigating the role of fine-scale chromatin organization in gene regulation across evolution and disease.

  • Methodology: Machine learning, polymer simulations, motif models, clustering
  • Ongoing work: Evolution of chromatin boundaries, functional characterization of enhancers and enhancer mutations

Collaborations:

Chan Zuckerberg Biohub Microbiome Initiative – engineering the human microbiome to reveal its role in nutrition, immune function, and drug metabolism

PsychENCODE – Massively parallel characterization of psychiatric disease-associated regulatory elements in defined cell types

B2B – Bench to Bassinet: The epigenetic landscape of heart development

BioFulcrum – team science projects to overcome disease

Gladstone Bioinformatics Core – provides expertise in experimental design and analysis of complex data sets, specializing in large-scale data sets acquired with cutting-edge technologies