Bioinformatics

The Bioinformatics programme develops and applies methods to process, store and analyse data generated by high-throughput projects.

Its principal aims are to infer genomic knowledge through computational analysis and integration of data and to generate resources of lasting value to biomedical research.

  • Projects - a list of Faculty-led research projects in Bioinformatics
  • Collaborations and resources - a list of collaborations and resources in Bioinformatics in which the Sanger Institute plays a leading role


Projects

*Classification of proteins and RNAs
Alex Bateman's team classifies proteins and RNAs into families to allow researchers to understand their properties and functions
*Genome informatics
Richard Durbin's team explores various types of sequence and variation informatics, often involving evolutionary analysis
*Population genomics of molecular phenotypes
Ville Mustonen's team develops population genetic methods for integrated sequencing and functional data to help explain natural variation
*Vertebrate genome analysis
Tim Hubbard's team generates vertebrate genome annotations and maintains the reference human, mouse and zebrafish sequences

Collaborations and resources

*1000 Genomes project
Sequences the genomes of a large number of people to provide a comprehensive resource on human genetic variation
*Distributed Annotation System (DAS)
A data exchange protocol for open sharing of biological information
*ENCODE and GENCODE
Aims to identify all functional elements across the entire human genome sequence and to annotate evidence-based gene features with high accuracy
*Ensembl genome browser
Produces genome databases for vertebrates and other eukaryotic species and makes this information freely available online
*Genome Reference Consortium (GRC)
Aims to ensure that the human, mouse and zebrafish reference assemblies are biologically relevant by closing gaps, fixing errors and representing complex variation
*GeneDB
Provides the latest sequence data and annotation for pathogens sequenced at the Institute and tools to aid the community in accessing and obtaining the maximum value from these data
*HAVANA
The HAVANA group provides the manual annotation of human, mouse, zebrafish and other vertebrate genomes that appears in the Vega browser
*iPfam
Describes Pfam domain interactions that are observed in Protein Data Bank entries
*MEROPS
Provides the internationally recognised classification of peptidases and their inhibitors
*miRBase
A searchable database of published microRNA sequences and annotation with downloadable data
*MitoCheck
An integrated research project which systematically studies the regulation of mitosis in human cells
*Pfam
Provides a classification of proteins into families and domains using hidden Markov models
*Rfam
Provides a classification of RNA families using covariance models
*Tiffin
A database of predicted regulatory motifs, a subset of which have predicted functional annotation
*TreeFam
Provides curated phylogenetic trees of animal genes that give reliable ortholog and paralog assignments
*UK10K
Aims to understand the link between low-frequency and rare genetic changes and human disease by studying the genetic code of 10,000 people
*Vertebrate Genome Annotation database (VEGA)
A central repository for high-quality manual annotation of finished vertebrate genome sequence