Bioinformatics

The Bioinformatics programme develops and applies methods to process, store and analyse data generated by high-throughput projects.

Its principal aims are to infer genomic knowledge through computational analysis and integration of data, and to generate resources of lasting value to biomedical research.

  • Projects - a list of Faculty-led research projects in Bioinformatics
  • Collaborations and resources - a list of collaborations and resources in Bioinformatics in which the Sanger Institute plays a leading role


Projects

*Computational genome biology
Chris Ponting's Associate Faculty group analyses next-generation sequencing data to better understand basic biological and disease processes
*Genome informatics
Richard Durbin's team explores various types of sequence and variation informatics, often involving evolutionary analysis
*Genomics of gene regulation
Daniel Gaffney's team explores how genetic variation, both within and between species, affects the regulation of gene expression
*Population genomics of adaptation
Ville Mustonen's team develops computational methods to discover and understand functionally relevant genetic and phenotypic variation
*Using outbred genetic variation to understand basic biology
Ewan Birney's group uses genomic data to study basic biological processes and changes in shape and form in whole organisms and individual cells
*Vertebrate genome analysis
Tim Hubbard's team generates vertebrate genome annotations and maintains the reference human, mouse and zebrafish sequences

Collaborations and resources

*1000 Genomes project
Sequences the genomes of a large number of people to provide a comprehensive resource on human genetic variation
*Distributed Annotation System (DAS)
A data exchange protocol for open sharing of biological information
*ENCODE and GENCODE
Aims to identify all functional elements across the entire human genome sequence and to annotate evidence-based gene features with high accuracy
*Ensembl genome browser
Produces genome databases for vertebrates and other eukaryotic species and makes this information freely available online
*Genome Reference Consortium (GRC)
Aims to ensure that the human, mouse and zebrafish reference assemblies are biologically relevant by closing gaps, fixing errors and representing complex variation
*GeneDB
Provides the latest sequence data and annotation for pathogens sequenced at the Institute and tools to aid the community in accessing and obtaining the maximum value from these data
*HAVANA
The HAVANA group provides the manual annotation of human, mouse, zebrafish and other vertebrate genomes that appears in the Vega browser
*iPfam
Describes Pfam domain interactions that are observed in Protein Data Bank entries
*MEROPS
Provides the internationally recognised classification of peptidases and their inhibitors
*miRBase
A searchable database of published microRNA sequences and annotation with downloadable data
*MitoCheck
An integrated research project which systematically studies the regulation of mitosis in human cells
*Pfam
Provides a classification of proteins into families and domains using hidden Markov models
*Rfam
Provides a classification of RNA families using covariance models
*Sanger Institute-EBI Single-Cell Genomics Centre
Explores the DNA, RNA and epigenetic features of single cells in order to better understand normal biology and disease
*Tiffin
A database of predicted regulatory motifs, a subset of which have predicted functional annotation
*TreeFam
Provides curated phylogenetic trees of animal genes that give reliable ortholog and paralog assignments
*UK10K
Aims to understand the link between low-frequency and rare genetic changes and human disease by studying the genetic code of 10,000 people
*Vertebrate Genome Annotation database (VEGA)
A central repository for high-quality manual annotation of finished vertebrate genome sequence