BLAST

Welcome to the sequencing projects' BLAST Search Services.

The Wellcome Trust Sanger Institute makes its own computing resources and database storage available to the scientific community.

The BLAST and BLAT services we provide are listed below.


Sanger Genome Sequencing Projects BLAST

Human Genome Project
Non Human Vertebrates
Worm & Parasitic Helminths
Fungi
Microbial Pathogens
Plasmids
Viruses & bacteriophage
Protozoan Pathogens
Fly & Others
All Sequencing Projects

FASTA Format

FASTA format looks like this:

>Your_Sequence_Name_here 
TGAATTTTTGTTATGTTTGAATACCAACTACCTTCGAAGCAAAAAGAAAGACAGAAAAAC
AACATTTTCCGTGTTAAGGTAAAAGGGCACAAGTTTGTATCTTTGTAGTCCGCAAATTGT
ACTATGAAATGCCGAAAACGTGTTCAATTTTTAAAGCAGGAAGCAATCACTCTTCAAAGT
TTATCTAGGTTCCCACGACCAAAAAGATCTGGTAAAAAATTATTGGCATCTGGTTTATTT
TATAGTTTGAACTTTTGTTTCAATCAGAAAAAAACGGGGAAATGCTGTAACTCTATGTAT
AATTTCAAAGTACACAGTGGACTTCCAAATTTTTTTAATTATATATTGCAGGTTGCAAAA
AGGTGTTTTTTCTTTACAAAACTCACACGAGATAAAGAGTATTTTTTAACAAAAAGCTTT
TTCAACCTTAAAAAAATATTAGAAAAAAAACAATAAATTTCCACGAACCCCCACATTAAA
CAAAAAATTCAAAAAAAAAGTAAAATTGCACACGGGAATTGACATCAATCTCTGGTGTAT
TGTGTTGCTAGAGTTGCATCGCCATGGGGTAGAAAACGCAAAAAATGGAGGAGCGAATGG
AAAGTAAGAGAGGAAGAGAGAAAGCACGCGCTTTCTATCTCACTCTCTGCGTTTCTCTTA
TTCCCCGTCACTATTTCATTTTTGAGAGAAAAGACAGTGTACTTGGAGTTGTAGGGCACT
TGGTGACACATACGCAGACGACTAAAGGAAATAAACACATTGATATCACGTGGGGTTGAA
GATCAGTCAACGTTTTTTATGAGGACGAATATTTACAACCTTACATGGATTTTGGCGGTT
TTCGACCGCTTGTAAATTACGAATTTCACAATTTTTGGGAACCGTTTTAGAGGGTTTTTA
TAGAAAATAGCATTTGAAAACAAAGTTGAAACTGAAGTTATATATGTTTTGTTTGTACCA
ATTTTGTATAAATTACAATTTTTGTAAAGTCTCGGAGCGCTTTCCAATATTTTTGTAGGA
ACCAAAACTGATGGAAATAGTCTAGTTTGTTCCGAATCACATTTCACTGCAAGTGGTTTG
AATATCAAGAAATCAACGCAAACGAGGCTCAAAAATCTGCAAACCACCTCTATCCTTGTA
GAATTTTATTTGTCGCTACACCGTTCTCCAAAAAAATTTGAACGAGGTATTTACGGTATT
GAAATTTGTAGTTTCGCTGGAATTGAATTTTACAATTTATCAATTTTAATTTTCCCCCGG
TTTCACATTTTATCGCCTCATCACTTTATCATCCTAATAACCAAAGAACCGCCCATCTTC
ACTCTCAACTATTCGTGAGCTTTGGTTCTTCCTTGTGTCCCACCTCCCTCCTCACCAGCC
AGCAACTTTAAATACACAACACAAAATATGACAGAGATCACTATGTATTACCATACGAAA
CCAAAAAACGTGTTCTTCGCATTGGTGTTTGGGCAGAAAATGGAGATTATGGCTTGATGC
GGTCTTGTCCCCGATTCTCAAGAATTCTCCTTACAATCGGAATACTAACTAGACTATGGT
TCGGTAAAAAAAGAATCTGGTAATTTTCAAATATCTCTATCCGCGATCTTGAGAAAAACC
ATAAATAGAATATTTCCACAGAAGACCAGTCACACTTTTTTCATGAACACCAGGTTTTCT
GAAAACATTAAAAATCATATAAAATGATAGAATGCCATGATAGAACATTACGAAAATTGC
TGGAGTTTTGAAAACATGTCTGCCAAAATTTTGGCAAGTTGTCAAATTCTTGAATTCTGA
ATGAATCTACTTACATTAATTCAGAGCACAAAATTTAAAATATTACTGACTCAAAAGTCA
TCAAGCTTTCACTATTATCAGTTACAAGAAAAGGGAAAAGTGTGCTTGAGCACATGGTTT
TTTTGGTTTTGATGATAAATAACTTGGGTTTGAAGCGGGAAAAATGGAAAGTTTGAATGT
GTTTGTAGTTAGATTG
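
As a rough illustration of the layout shown above (a ">" header line followed by one or more lines of sequence), the following Python sketch reads FASTA text into name/sequence pairs. The function name parse_fasta is hypothetical and is not part of the BLAST service; it only assumes that header lines start with ">" and that sequence lines are joined in order.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text.

    Hypothetical helper for illustration only; not part of the BLAST server.
    """
    records = []
    name = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            chunks.append(line)
        else:
            raise ValueError("sequence data found before a '>' header line")
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


example = """>Your_Sequence_Name_here
TGAATTTTTGTTATGTTTGAATACCAACTACCTTCGAAGCAAAAAGAAAGACAGAAAAAC
AACATTTTCCGTGTTAAGG
"""

for name, sequence in parse_fasta(example):
    print(name, len(sequence))
```

Running a check like this before submitting a query can catch a missing header line or stray characters.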
  

Low complexity filtering

The server filters your query sequence for regions of low compositional complexity by default. Low-complexity regions commonly give spuriously high scores that reflect compositional bias rather than significant position-by-position alignment. Filtering can eliminate these potentially confounding matches (e.g., hits against proline-rich regions or poly-A tails) from the BLAST reports, leaving regions whose BLAST statistics reflect the specificity of their pairwise alignment. Queries searched with the blastn program are filtered with DUST; other programs use SEG.

Low-complexity sequence found by a filter program is replaced with the letter "N" in nucleotide sequences (e.g., "NNNNNNNNNNNNN") and the letter "X" in protein sequences (e.g., "XXXXXXXXX"). Users may turn off filtering by using the "Filter" checkbox provided.
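
The substitution step described above can be sketched in Python as follows. This is only an illustration of the masking, not of DUST or SEG themselves: the low-complexity regions are assumed to have been found already, and the function name mask_regions and its (start, end) interval convention are hypothetical.

```python
def mask_regions(sequence, regions, is_protein=False):
    """Replace the given regions with "X" (protein) or "N" (nucleotide).

    `regions` is a list of (start, end) pairs, 0-based and end-exclusive.
    The regions are assumed to come from a filter program; this sketch
    does not detect low-complexity sequence itself.
    """
    mask_char = "X" if is_protein else "N"
    residues = list(sequence)
    for start, end in regions:
        # Clamp each interval to the bounds of the sequence.
        for i in range(max(start, 0), min(end, len(residues))):
            residues[i] = mask_char
    return "".join(residues)


# A proline run at positions 8-13 of a protein query.
masked = mask_regions("GPLHIGIGPPPPPPGHILCCHPN", [(8, 14)], is_protein=True)
print(masked)  # GPLHIGIGXXXXXXGHILCCHPN
```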

One potential source of confusion: even if a query matches a database sequence at every unfiltered position, the reported percent match may be lower than 100%, because filtered residues inside the aligned region are counted as mismatches.

For example, for the match

           GPLHIGIGXXXXXXGHILCCHPN
           ||||||||      |||||||||
           GPLHIGIGPPPPPPGHILCCHPN
  

Even though every unfiltered residue matches, the reported percent match will be 73% (17 of the 23 aligned positions).
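
The arithmetic behind that figure can be checked with a short Python sketch. The function name percent_identity is hypothetical, and truncating the percentage to a whole number is an assumption made here to reproduce the 73% quoted above; the exact rounding used in the reports is not stated on this page.

```python
def percent_identity(query, subject):
    """Count identical positions in two aligned, equal-length sequences.

    Returns (matches, alignment_length, percent). Masked residues ("X" or
    "N") count as mismatches unless the other sequence has the same letter.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(1 for q, s in zip(query, subject) if q == s)
    return matches, len(query), 100.0 * matches / len(query)


matches, length, percent = percent_identity(
    "GPLHIGIGXXXXXXGHILCCHPN",
    "GPLHIGIGPPPPPPGHILCCHPN",
)
# Truncation to an integer is an assumption, chosen to match the text.
print(matches, length, int(percent))  # 17 23 73
```

With filtering turned off, the same two sequences would match at all 23 positions.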

BLAST and the low-complexity filters are provided by WU-BLAST.
