Sequence Analysis and Management (SAM) | Scientific Operations

Sequence Analysis and Management (SAM)

Sanger Institute, Genome Research Limited

Our Research and Approach

SAM contributes to various software packages for processing DNA sequence data, including samtools, htslib, biobambam and the Staden package. We also submit raw sequencing data to the EBI on behalf of the research groups.


Robert Davies
Group Leader

Robert joined the Sanger Centre (as it was then) in 1994. Since then he has worked on robotics, pathogen assemblies, a variety of database systems, archiving tools and sequence processing pipelines.



  • Comparison of high-throughput sequencing data compression tools.

    Numanagić I, Bonfield JK, Hach F, Voges J, Ostermann J et al.

    Nature methods 2016;13;12;1005-1008

  • Gap5--editing the billion fragment sequence assembly.

    Bonfield JK and Whitwham A

    Bioinformatics (Oxford, England) 2010;26;14;1699-703

  • Intra- and interhost evolutionary dynamics of equine influenza virus.

    Murcia PR, Baillie GJ, Daly J, Elton D, Jervis C et al.

    Journal of virology 2010;84;14;6943-54

  • Genome-wide end-sequenced BAC resources for the NOD/MrkTac and NOD/ShiLtJ mouse genomes.

    Steward CA, Humphray S, Plumb B, Jones MC, Quail MA et al.

    Genomics 2010;95;2;105-10

  • The Universal Protein Resource (UniProt) in 2010.

    UniProt Consortium

    Nucleic acids research 2010;38;Database issue;D142-8

  • Improvements to services at the European Nucleotide Archive.

    Leinonen R, Akhtar R, Birney E, Bonfield J, Bower L et al.

    Nucleic acids research 2010;38;Database issue;D39-45

  • The Universal Protein Resource (UniProt) 2009.

    UniProt Consortium

    Nucleic acids research 2009;37;Database issue;D169-74

  • Petabyte-scale innovations at the European Nucleotide Archive.

    Cochrane G, Akhtar R, Bonfield J, Bower L, Demiralp F et al.

    Nucleic acids research 2009;37;Database issue;D19-25

  • Priorities for nucleotide trace, sequence and annotation data capture at the Ensembl Trace Archive and the EMBL Nucleotide Sequence Database.

    Cochrane G, Akhtar R, Aldebert P, Althorpe N, Baldwin A et al.

    Nucleic acids research 2008;36;Database issue;D5-12

  • EMBL Nucleotide Sequence Database in 2006.

    Kulikova T, Akhtar R, Aldebert P, Althorpe N, Andersson M et al.

    Nucleic acids research 2007;35;Database issue;D16-20

  • EMBL Nucleotide Sequence Database: developments in 2005.

    Cochrane G, Aldebert P, Althorpe N, Andersson M, Baker W et al.

    Nucleic acids research 2006;34;Database issue;D10-5

  • A genome-wide, end-sequenced 129Sv BAC library resource for targeting vector construction.

    Adams DJ, Quail MA, Cox T, van der Weyden L, Gorick BD et al.

    Genomics 2005;86;6;753-8

  • Shotgun haplotyping: a novel method for surveying allelic sequence variation.

    Lindsay SJ, Bonfield JK and Hurles ME

    Nucleic acids research 2005;33;18;e152

  • The genome of the African trypanosome Trypanosoma brucei.

    Berriman M, Ghedin E, Hertz-Fowler C, Blandin G, Renauld H et al.

    Science (New York, N.Y.) 2005;309;5733;416-22

  • The genome of the kinetoplastid parasite, Leishmania major.

    Ivens AC, Peacock CS, Worthey EA, Murphy L, Aggarwal G et al.

    Science (New York, N.Y.) 2005;309;5733;436-42

  • Complex haplotypes, copy number polymorphisms and coding variation in two recently divergent mouse strains.

    Adams DJ, Dermitzakis ET, Cox T, Smith J, Davies R et al.

    Nature genetics 2005;37;5;532-6

  • The EMBL Nucleotide Sequence Database.

    Kanz C, Aldebert P, Althorpe N, Baker W, Baldwin A et al.

    Nucleic acids research 2005;33;Database issue;D29-33

  • Mutagenic insertion and chromosome engineering resource (MICER).

    Adams DJ, Biggs PJ, Cox T, Davies R, van der Weyden L et al.

    Nature genetics 2004;36;8;867-71

  • The EMBL Nucleotide Sequence Database.

    Kulikova T, Aldebert P, Althorpe N, Baker W, Bates K et al.

    Nucleic acids research 2004;32;Database issue;D27-30

  • The DNA sequence of chromosome I of an African trypanosome: gene content, chromosome organisation, recombination and polymorphism.

    Hall N, Berriman M, Lennard NJ, Harris BR, Hertz-Fowler C et al.

    Nucleic acids research 2003;31;16;4864-73

  • The EMBL Nucleotide Sequence Database: major new developments.

    Stoesser G, Baker W, van den Broek A, Garcia-Pastor M, Kanz C et al.

    Nucleic acids research 2003;31;1;17-22

  • Sequence of Plasmodium falciparum chromosomes 1, 3-9 and 13.

    Hall N, Pain A, Berriman M, Churcher C, Harris B et al.

    Nature 2002;419;6906;527-31

  • Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18.

    Parkhill J, Dougan G, James KD, Thomson NR, Pickard D et al.

    Nature 2001;413;6858;848-52

  • Genome sequence of Yersinia pestis, the causative agent of plague.

    Parkhill J, Wren BW, Thomson NR, Titball RW, Holden MT et al.

    Nature 2001;413;6855;523-7

  • An SNP map of human chromosome 22.

    Mullikin JC, Hunt SE, Cole CG, Mortimore BJ, Rice CM et al.

    Nature 2000;407;6803;516-20

  • Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491.

    Parkhill J, Achtman M, James KD, Bentley SD, Churcher C et al.

    Nature 2000;404;6777;502-6

  • The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences.

    Parkhill J, Wren BW, Mungall K, Ketley JM, Churcher C et al.

    Nature 2000;403;6770;665-8