NestedMICA is a method for discovering over-represented short motifs in large sets of strings. Typical applications include finding candidate transcription factor binding sites in DNA sequences.
NestedMICA works by optimizing a probabilistic model that treats the input data as a mixture of interesting motifs and background sequence. It uses a new and robust inference technique called nested sampling, together with a novel mosaic background model, to achieve extremely high sensitivity. More information about the algorithm, and some quantitative performance comparisons, are given in the papers:
BMC bioinformatics 2008;9;19
PUBMED: 18194537; PMC: 2267705; DOI: 10.1186/1471-2105-9-19
Nucleic acids research 2005;33;5;1445-53
PUBMED: 15760844; PMC: 1064142; DOI: 10.1093/nar/gki282
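To make the "mixture of motifs and background" idea concrete, here is a deliberately simplified Python sketch. It is not NestedMICA's actual model or code: it assumes a single hypothetical weight matrix (`MOTIF`), a uniform zero-order background (`BACKGROUND`) in place of the mosaic background, and at most one motif occurrence per sequence. All names and numbers are invented for illustration, and no nested sampling is performed; the sketch only evaluates the likelihood that an inference procedure would go on to optimize.

```python
# Toy zero-order background: every base equally likely (an assumption,
# much simpler than NestedMICA's mosaic background model).
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

# Hypothetical 3-column weight matrix favouring the word "ACG".
MOTIF = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
]


def background_likelihood(seq):
    """Probability of seq if every base is drawn from the background."""
    p = 1.0
    for base in seq:
        p *= BACKGROUND[base]
    return p


def motif_likelihood(seq, start):
    """Probability of seq if the motif occupies seq[start:start + len(MOTIF)]
    and every other position is background."""
    p = 1.0
    for i, base in enumerate(seq):
        offset = i - start
        if 0 <= offset < len(MOTIF):
            p *= MOTIF[offset][base]
        else:
            p *= BACKGROUND[base]
    return p


def mixture_likelihood(seq, motif_prior=0.5):
    """Mix 'background only' with 'one motif at a uniformly chosen position'."""
    n_positions = len(seq) - len(MOTIF) + 1
    if n_positions <= 0:
        # Sequence too short to contain the motif: background only.
        return background_likelihood(seq)
    with_motif = sum(motif_likelihood(seq, s) for s in range(n_positions)) / n_positions
    return (1 - motif_prior) * background_likelihood(seq) + motif_prior * with_motif
```

Under these assumptions, a sequence containing the favoured word (for example `"TTACGTT"`) scores higher under the mixture than under the background alone, while a sequence without it (for example `"TTTTTTT"`) scores lower.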
As well as sensitivity improvements, NestedMICA has a number of features which make it particularly suited to discovering multiple motifs ('regulatory vocabularies') in large datasets - up to and including whole-genome promoter/enhancer sets.
PLoS computational biology 2007;3;1;e7
PUBMED: 17238282; PMC: 1779301; DOI: 10.1371/journal.pcbi.0030007
Nature genetics 2006;38;4;431-40
PUBMED: 16518401; DOI: 10.1038/ng1760
NestedMICA was developed on Linux (ia32) and Mac OS X (powerpc) systems. The main program is written in Java, with a small amount of C code. It should run on any Unix-like platform with a good Java implementation. As of version 0.7.0, NestedMICA requires a Java 5 platform (Sun JDK 1.5.0 or later).
If you use NestedMICA, we suggest you join the nmica-users mailing list to receive information about new releases.
We now recommend that you use MotifExplorer to examine motif-sets learned using NestedMICA.
Also, a few old tools have been removed. The only one that most users might notice is motifviewer, which has been replaced by the MotifExplorer tool.
nminfer can now automatically infer motif lengths. To take advantage of this feature, you need to
specify a length range using the -minLength and -maxLength options.
"-threads N" where N is greater than zero. Please consult the manual for more details.
-numMotifs parameter. This should be fixed in the next release. For now, we would suggest removing all
sequences less than about 50 bases when running nminfer with a large -numMotifs value.
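The suggested workaround of dropping short sequences can be done with a small script before running nminfer. The following is a minimal sketch, not part of the NestedMICA distribution: the function names (`read_fasta`, `filter_short`, `write_fasta`) are made up for this example, it assumes plain FASTA input, and the 50-base threshold simply follows the "about 50 bases" guidance above.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_short(records, min_length=50):
    """Keep only records whose sequence has at least min_length bases."""
    for header, seq in records:
        if len(seq) >= min_length:
            yield header, seq


def write_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequences at `width` columns."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        for i in range(0, len(seq), width):
            out.append(seq[i:i + width])
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    import sys

    # Usage (file names are placeholders):
    #   python filter_fasta.py < input.fa > filtered.fa
    sys.stdout.write(write_fasta(filter_short(read_fasta(sys.stdin))))
```

The filtered file can then be passed to nminfer in place of the original sequence set.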
motifscanner
-workerThreads option has been renamed -threads
JAVA_HOME for a Java runtime
printmosaicbg program to inspect background model files
MotifExplorer is a graphical Java program for viewing and manipulating collections of short sequence motifs --
typically transcription factor binding sites. It is designed to work well with NestedMICA and uses XMS
(NestedMICA's output format) as its native file format.
Users of Windows and Linux systems with a reasonably up-to-date Java version installed (1.5 or later) should be able to start MotifExplorer via Java Web Start by clicking on the link below.
Note that you'll see a warning about the certificate that was used to digitally sign this MotifExplorer distribution. Hopefully future releases will be signed using a trusted certificate.
The Web Start package currently doesn't work well on Mac OS X. Please use this downloadable Mac-specific version instead (requires Mac OS X 10.4.0 or later and Java 1.5 or later).
Questions and comments about NestedMICA should be sent to Thomas Down at thomas.down@gurdon.cam.ac.uk.