The Sanger Institute has been funded by Beowulf Genomics to perform
comparative sequencing of five Escherichia coli and
Shigella strains in collaboration with
Dr. Christoph Tang of the Centre for Molecular Microbiology and Infection,
Imperial College, London;
Prof. Mark Pallen and Dr. Ian Henderson of the Division of Immunity and Infection,
University of Birmingham, UK;
Dr. Claude Parsot and Dr. Philippe Sansonetti of the Unite de Pathologie Microbienne,
Institut Pasteur, France;
Dr. Gadi Frankel of the Department of Biochemistry,
Imperial College, London;
and Dr. Stuart Knutton of the Institute of Child Health, Birmingham.
The following have been chosen for sequencing:
Shigella dysenteriae M131649 (M131)
Shigella sonnei 53G
Enteroaggregative E. coli 042
Enteropathogenic E. coli E2348/69
E. coli non-K1 invasive clinical isolate (a recent isolate from a location distinct from previous isolates, and where the burden of disease is high)
Each of these strains will be sequenced to completion to allow full
comparison of the genomes.
Sequencing Progress:
S. dysenteriae
Status: Finishing/gap closure
Assembly: 206 contigs >1kb (79 contigs >2kb); total size 4.901 Mb
Shotgun: 61,084 reads totalling 36.525 Mb, theoretical genome coverage of 99.97%
S. sonnei
Status: Finished
Assembly: 5 contigs >1kb (5 contigs >2kb); total size 5.220 Mb
Shotgun: 61,084 reads totalling 39.036 Mb, theoretical genome coverage of 99.95%
E. coli 042
Status: Finished
Assembly: 2 contigs >1kb (2 contigs >2kb); total size 5.355 Mb
Shotgun: 70,203 reads totalling 40.910 Mb, theoretical genome coverage of 99.97%
Preliminary gene predictions: Chromosome, Plasmid
E. coli E2348/69
Status: Published in Iguchi A, Thomson NR, Ogura Y, Saunders D, Ooka T, Henderson IR, Harris D, Asadulghani M, Kurokawa K, Dean P, Kenny B, Quail MA, Thurston S, Dougan G, Hayashi T, Parkhill J, Frankel G. Complete genome sequence and comparative genome analysis of enteropathogenic Escherichia coli O127:H6 strain E2348/69. J Bacteriol. 2009 Jan;191(1):347-54. Epub 2008 Oct 24.
Assembly: 3 contigs >1kb (3 contigs >2kb); total size 5.069 Mb
Shotgun: 75,606 reads totalling 43.896 Mb, theoretical genome coverage of 99.98%
Annotation and sequence in the EMBL database: Chromosome (acc: FM180568), Plasmid pMAR2 (acc: FM180569), Plasmid pE2348-2 (acc: FM180570)
E. coli non-K1
Status: Funded
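This page does not say how the "theoretical genome coverage" figures above were calculated. One common way to estimate such a figure is the Lander-Waterman model, in which the expected fraction of a genome covered by at least one read is 1 - exp(-c), where c is the total read length divided by the genome size. The sketch below assumes that model and uses the S. dysenteriae assembly size as a stand-in for the true genome size; both are assumptions, so the result is close to, but not identical to, the percentage quoted above. The function name is illustrative only.

```python
import math


def expected_coverage_fraction(total_read_mb: float, genome_mb: float) -> float:
    """Lander-Waterman estimate of the fraction of a genome covered by >= 1 read.

    Assumes reads are placed uniformly at random; genome_mb is an assumed
    genome size (here the assembly size is used as a stand-in).
    """
    if genome_mb <= 0:
        raise ValueError("genome size must be positive")
    depth = total_read_mb / genome_mb  # mean sequencing depth (fold coverage)
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    # S. dysenteriae figures from the progress list: 36.525 Mb of shotgun
    # reads against a 4.901 Mb assembly (assembly size used as an assumption).
    reads_mb, genome_mb = 36.525, 4.901
    fraction = expected_coverage_fraction(reads_mb, genome_mb)
    print(f"depth: {reads_mb / genome_mb:.2f}x")
    print(f"expected coverage: {fraction:.2%}")
```

Under this model a smaller assumed genome size gives a higher expected coverage, which may account for the difference from the quoted figure if an estimated genome size rather than the assembly size was used.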
The databases are available for searching on our BLAST server, and for download from our FTP site.
Data Release Statement
This sequencing centre plans to publish the
completed and annotated sequences in a peer-reviewed journal as soon as possible.
Permission of the principal investigator should be obtained before publishing
analyses of the sequence/open reading frames/genes on a chromosome or genome scale.
For genomes in finishing:
Finishing warning
Please note that although finishing has begun, the
assembly database may still contain errors and misassemblies, and
E. coli and vector contamination. The contig numbers are not
stable, and will vary with each data release. The sequences in the
shotgun database are single reads from ABI sequencers; they will
contain errors, along with both E. coli and vector
contamination.
For finished genomes:
Statement on annotation
Annotation of the sequence is ongoing, and the full annotation will
be released upon publication. Please note that, although the
sequence is finished, and we believe it to be accurate, it is
possible that errors and misassemblies may remain. The sequence
should be considered as preliminary until final publication. We
would ask users of the data to read our Data release policy and
Guidelines on use of data in publications.
Sequencing enquiries
Please address all sequencing enquiries to: parkhill@sanger.ac.uk