Overview of Data for Sequence Similarity Searches
Data types available for searching
  • Sequence reads
    These are the individual sequence reads, generally 500-600 bp in length. Each read name consists of the chromosome of origin, followed by the shotgun clone id and either "q1c" or "p1c" to indicate whether the forward or reverse primer was used.
  • Contig sequences
    • Contig sequences represent secondary sequence data, in that they are the condensation of a number of shotgun reads. Contigs reflect the finished sequence more reliably than individual reads because the depth of coverage of the assembled shotgun reads ensures that the majority of ambiguities are identified and at least partially resolved. This is not to say that contigs are free of insertion and/or deletion events, which usually arise as a consequence of the algorithm used to create the consensus. Please note that the Sanger Institute is currently unable to track contigs through assembly, and therefore contig ids will change.
    • Individual contig sequences which are highlighted by Blast analysis can be retrieved by following the 'Sequence' link in the returned HTML page.
    • If contigs have already been submitted to the HTG section of the public databases, a link will take you there.
    • Only contigs greater than 2 kb are present in the Blast searchable dataset.
  • Annotated genes and proteins
    Annotated and curated tRNA, snRNA, rRNA and protein-coding genes and pseudogenes on manually annotated chromosomes/contigs, available through GeneDB.
  • Automatically predicted genes and peptides
    Automatic predictions and analyses of open reading frames and putative protein products, available through GeneDB.
  • EMBL
    All available data submitted to the public databases with T. brucei listed as the organism.
  • GSS
    See here for further detail
  • GSS/EST clusters
    See here for further detail
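The 2 kb cut-off on contigs described above can be illustrated with a short sketch. It assumes the contigs are held in a FASTA file and that "2 kb" means 2000 bp; the function names are hypothetical and are not part of any Sanger Institute tool.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Assumption: "greater than 2 kb" is taken to mean more than 2000 bp.
MIN_CONTIG_LENGTH = 2000


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def searchable_contigs(lines: Iterable[str]) -> Dict[str, str]:
    """Keep only contigs strictly longer than MIN_CONTIG_LENGTH."""
    return {
        name: seq
        for name, seq in parse_fasta(lines)
        if len(seq) > MIN_CONTIG_LENGTH
    }
```

Passing an open file handle (or any iterable of lines) to `searchable_contigs` returns a dictionary of the contigs that would pass such a length filter.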
Searching data
  • Searching the contig database
    is the most direct method of searching for a gene. This should be the first dataset you search: use BlastN with a homologous DNA query to identify an exact match, or TBlastX with paralogous DNA or DNA from a related species to identify weaker matches. Finally, use TBlastN with a peptide query to match proteins back to the genomic DNA.
  • Searching data available through GeneDB
    will allow you to see whether your sequence of interest has already been annotated and if so, how.
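The choice of Blast program described above can be summarised as a small decision function. This is only an illustrative sketch: the function name and its parameters are hypothetical, and it returns the conventional lowercase program names rather than calling any server.

```python
def choose_blast_program(query_type: str, expect_exact_match: bool = True) -> str:
    """Suggest a Blast program for searching the contig database.

    query_type: "dna" or "peptide".
    expect_exact_match: True for a homologous DNA query where an exact
        match is expected; False for paralogous DNA or DNA from a
        related species, where only weaker matches are expected.
        Ignored for peptide queries.
    """
    query_type = query_type.lower()
    if query_type == "peptide":
        # Peptide query against genomic DNA translated in all six frames.
        return "tblastn"
    if query_type == "dna":
        # Exact nucleotide match, or comparison at the translated level.
        return "blastn" if expect_exact_match else "tblastx"
    raise ValueError(f"unknown query type: {query_type!r}")
```

For example, `choose_blast_program("dna", expect_exact_match=False)` suggests TBlastX for a query taken from a related species.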
Tips when searching data
  • When interpreting the Blast output, remember which query and Blast type you have been running. BlastN results should essentially be clear-cut (you are searching for an exact match, with a %identity in the high 90s). Search algorithms such as TBlastX and TBlastN match similarities at the peptide level, and hence matches are likely to be confined to conserved regions of coding exons. Look for co-linear matches along the contig sequence.
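As a rough illustration of the two checks in this tip, the sketch below filters BlastN hits by %identity and tests whether a set of hits on one contig is co-linear. The `Hit` record, the function names, and the 97% default threshold are assumptions made for the example; they do not come from the Blast output format itself.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One match between the query and a contig (hypothetical record)."""
    query_start: int       # start position in the query
    contig_start: int      # start position on the contig
    percent_identity: float


def near_exact(hits: List[Hit], min_identity: float = 97.0) -> List[Hit]:
    """Keep hits whose %identity meets the threshold.

    The default of 97.0 is an assumed reading of "high 90s".
    """
    return [h for h in hits if h.percent_identity >= min_identity]


def is_colinear(hits: List[Hit]) -> bool:
    """Return True if the hits appear in a consistent order on the contig.

    Hits are sorted by query position; the contig positions must then
    be strictly increasing (same strand) or strictly decreasing
    (opposite strand). Zero or one hit is trivially co-linear.
    """
    ordered = sorted(hits, key=lambda h: h.query_start)
    starts = [h.contig_start for h in ordered]
    pairs = list(zip(starts, starts[1:]))
    ascending = all(a < b for a, b in pairs)
    descending = all(a > b for a, b in pairs)
    return ascending or descending
```

A TBlastN result whose matches to successive parts of a peptide fall at steadily increasing contig positions would pass `is_colinear`, which is the pattern expected for exons of a single gene.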
webmaster@sanger.ac.uk

Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK  Tel:+44 (0)1223 834244

Last Modified Tue Oct 21 16:37:36 2003

Genome Research Limited is a charity registered in England with number 1021457
