Doublescan is a program for comparative ab initio prediction of protein-coding genes in mouse and human DNA.
Doublescan takes two input DNA sequences (one from mouse, one from human) that are known or suspected to be similar to each other, and simultaneously predicts the genes in both sequences and the alignment between them. Doublescan can model partial, complete and multiple genes (as well as no genes at all), and can also align pairs of genes that are related by exon-fusion or exon-splitting events. The mathematical method underlying Doublescan is a pair hidden Markov model.
Bioinformatics (Oxford, England) 2002;18;10;1309-18
PUBMED: 12376375
Input: two DNA sequences, submitted through the submission web site.
Output: two output files, which can be retrieved from the retrieval web site.
Email me at irmtraud.meyer@cantab.net.
Doublescan reports the positions of predicted genes and features in its output files as absolute coordinates. To do this, the input files must be in a variant of the default FASTA format that requires a header line of the format below.
>name start_position-end_position orientation
For example:
>Mm.X13235.5 100-737 forward
gggaatgaagtttttctgcaggatttaaatgtggtctttaagagacaccgcatgcaaaga
atagctggggcttgctagccaatgaaaacattcagattccaatgacgcatccttttttct
ccacccccttccaagacccggattcggaaaccccgcctaacgctctagttttcaaccagg
tccgcagaaggcctatttaaagggacgattgctgtctccctgctgtcataaccatgtctg
gacgtggcaagggtggtaaaggccttgggaaaggcggcgctaagcgccaccgtaaggttc
tccgcgataacatccagggcatcaccaagcctgccatccgccgcctggcccggcgcgggg
gagtgaagcgcatctccggcctcatctacgaggagacccgcggtgtgctgaaggtgttcc
tggagaacgtgatccgcgacgccgtcacctacacggagcacgccaagcgcaagaccgtca
ccgccatggacgtggtctacgcgctcaagcgccagggccgcactctctacggattcggcg
gttaatcgactaacaaacgattttccactgtcaacaaaaggcccttttcagggccaccca
caaattcctagaaggagttgttcacttaccgaagctt
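The header format above can be checked before submission with a short script. The following Python sketch is not part of Doublescan; the names `HEADER_RE`, `DoublescanHeader`, `parse_header` and `format_header` are hypothetical helpers introduced here for illustration. It assumes that the fields are separated by whitespace and that the start position does not exceed the end position. The orientation field is passed through unchecked, because only the value `forward` appears in the example.

```python
import re
from typing import NamedTuple

# Assumed layout: ">name start_position-end_position orientation",
# with fields separated by whitespace.
HEADER_RE = re.compile(r"^>(\S+)\s+(\d+)-(\d+)\s+(\S+)\s*$")


class DoublescanHeader(NamedTuple):
    name: str
    start: int
    end: int
    orientation: str


def parse_header(line: str) -> DoublescanHeader:
    """Split a header line into its four fields, or raise ValueError."""
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError(f"header does not match the expected format: {line!r}")
    name, start, end, orientation = match.groups()
    start_pos, end_pos = int(start), int(end)
    # Assumption: coordinates are given with start <= end.
    if start_pos > end_pos:
        raise ValueError(f"start position exceeds end position: {line!r}")
    return DoublescanHeader(name, start_pos, end_pos, orientation)


def format_header(name: str, start: int, end: int, orientation: str) -> str:
    """Build a header line in the same layout."""
    return f">{name} {start}-{end} {orientation}"


if __name__ == "__main__":
    print(parse_header(">Mm.X13235.5 100-737 forward"))
```

A line that lacks the coordinate range or the orientation field, such as a plain `>name` FASTA header, is rejected with a `ValueError` rather than silently accepted.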