Resources

These additional datasets are in /wynton/home/database:

Wynton Bioinformatics Open Data Resource

Bioinformatics Data Repository on Wynton

The Pollard lab established a shared data repository on Wynton for the UCSF Bioinformatics community. The repository includes publicly available bioinformatics datasets from individual manuscripts, consortia, and databases. The files are readable by anyone with a Wynton account and can be used in computations directly on Wynton HPC cluster nodes.

The repository is organized into two directories found at:

  • /wynton/group/datasets (data released with a specific publication)
  • /wynton/group/databases (all other data, including consortium publications)
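
As a quick way to see what is currently available, the two directories above can be listed from any Wynton node. The following is a minimal sketch, assuming Python 3 and only the standard library; the helper name `list_resources` is hypothetical, and the code makes no assumption about the names of the entries inside each directory.

```python
from pathlib import Path

# Repository roots named in the text above.
REPOSITORY_ROOTS = ("/wynton/group/datasets", "/wynton/group/databases")


def list_resources(root):
    """Return the sorted names of the entries directly under `root`.

    Returns an empty list when `root` is missing or is not a directory,
    for example when this is run on a machine outside Wynton.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(entry.name for entry in root_path.iterdir())


if __name__ == "__main__":
    for root in REPOSITORY_ROOTS:
        names = list_resources(root)
        print(f"{root}: {len(names)} entries")
        for name in names:
            print(f"  {name}")
```

Returning an empty list for a missing root keeps the script usable off-cluster; on Wynton it should print one line per entry under each root.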

The current collection of commonly used resources is based on input gathered from Bioinformatics faculty, students, and postdocs. There is sufficient storage to add other datasets, provided they are used across multiple labs and are freely available for download, which makes automated maintenance possible.

Questions and requests can be directed to Katie Pollard.

This table lists the datasets in /wynton/group/databases:

Dataset                          Source
1000 Genomes                     ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/
1000 Genomes high coverage       http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/working/20190425_NYGC_GATK
AlphaFold2                       https://github.com/deepmind/alphafold
Ancient DNA                      https://reich.hms.harvard.edu/downloadable-genotypes-present-day-and-ancient-dna-data-compiled-published-papers
ANNOVAR-humandb                  https://annovar.openbioinformatics.org/en/latest/user-guide/download/
Blast                            https://ftp.ncbi.nlm.nih.gov/blast/
BrainSpan                        https://www.brainspan.org/static/download.html
ChEMBL                           https://www.ebi.ac.uk/chembl/
Clinvar                          https://www.ncbi.nlm.nih.gov/clinvar/
dbSNP                            https://www.ncbi.nlm.nih.gov/snp/
Ensembl                          https://useast.ensembl.org/info/data/index.html
ENCODE human                     https://www.encodeproject.org/
ENCODE mouse                     https://www.encodeproject.org/
FANTOM                           https://fantom.gsc.riken.jp/
GENCODE                          https://www.gencodegenes.org/
Gene Ontology                    http://geneontology.org/
gnomAD                           https://gnomad.broadinstitute.org/downloads
GTEX                             https://gtexportal.org/home/
GWAS Catalog                     https://www.ebi.ac.uk/gwas/downloads
HMP1_I-II                        https://portal.hmpdacc.org/
Human Protein Atlas              https://www.proteinatlas.org/about/download
InterPro                         https://www.ebi.ac.uk/interpro/
Korea 1K genomes                 http://1000genomes.kr/
MetaHIT                          https://www.ncbi.nlm.nih.gov/sra/?term=ERA000116
NCBI GEO                         https://www.ncbi.nlm.nih.gov/geo
NCBI PubChem                     https://pubchem.ncbi.nlm.nih.gov/
NCI TCGA                         https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga
PDB                              https://www.rcsb.org/docs/programmatic-access/file-download-services
PDB redo                         https://pdb-redo.eu/download
Refseq microbial/viral genomes   https://www.ncbi.nlm.nih.gov/refseq/
RoadMap Epigenomics              https://www.ncbi.nlm.nih.gov/geo/roadmap/epigenomics/
SIFTS                            https://www.ebi.ac.uk/pdbe/docs/sifts/
Simons Genome Diversity Project  https://reichdata.hms.harvard.edu/pub/datasets/sgdp/
Swiss-Model Repository           https://swissmodel.expasy.org/repository
UCSC Genome Browser              http://hgdownload.soe.ucsc.edu/downloads.html
UHGG_v1                          http://ftp.ebi.ac.uk/pub/databases/metagenomics/mgnify_genomes/
UK Biobank Summary Statistics    https://pan.ukbb.broadinstitute.org/downloads/index.html
UniProt                          ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/
ZINC-15                          via symlink to wynton/group/bks/zinc15
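
The ZINC-15 entry above is provided through a symlink rather than a downloaded copy. The sketch below shows one way to check where such an entry resolves, assuming Python 3 and only the standard library. The helper name `describe_entry` is hypothetical, and the path to inspect is supplied by the caller, because the entry names inside the repository are not stated here.

```python
import os
import sys


def describe_entry(path):
    """Report whether `path` is a symbolic link and where it resolves to."""
    return {
        "path": path,
        "is_symlink": os.path.islink(path),
        # realpath follows every link in the chain; for a path that is not
        # a link it returns the canonical form of the path itself.
        "resolved": os.path.realpath(path),
        # exists() follows links, so a dangling symlink reports False.
        "exists": os.path.exists(path),
    }


if __name__ == "__main__":
    # Usage: python describe_entry.py <path> [<path> ...]
    for arg in sys.argv[1:]:
        info = describe_entry(arg)
        kind = "symlink" if info["is_symlink"] else "regular entry"
        status = "" if info["exists"] else " (target missing)"
        print(f"{info['path']}: {kind} -> {info['resolved']}{status}")
```

Checking `exists` separately is useful because a symlink can outlive its target; a dangling link still reports `is_symlink` as true.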