Mascot Percolator is a software package that interfaces the database search algorithm Mascot [1] with Percolator [2], a well performing machine learning algorithm for rescoring database search results.
We have demonstrated that it is suitable for both low and high accuracy mass spectrometry data, outperforming all available Mascot scoring schemes and providing reliable significance measures [3].
[The Wellcome Trust Sanger Institute]
Follow the instructions provided by Sun. Type "java -version" at the command line to check the installation and the version.
Unzip the package.
Extract the files and copy everything from within the java subfolder into the root of the MascotPercolator folder. That should comprise two files: msparser.jar plus the native library for your platform, either libmsparserj.so (Linux) or msparser.dll (Windows).
You now need to compile Percolator (see the README file in the Percolator package). On UNIX machines you might need to make Percolator executable after compilation by running the following command: "chmod u+x percolator". You should then be able to run Percolator with "./percolator". The path to the executable must be specified in the config file as described in the next step.
- specify the path to the root folder of the Mascot results files by modifying the available example.
- specify the path to the Percolator executable by modifying the available example.
- enable or disable specific features as described in [3]. Changing the default feature set is not recommended.
java -jar MascotPercolator.jar
For help regarding the installation or execution, feel free to contact mb8[at]sanger[.]ac[.]uk (Markus Brosch).
java -cp MascotPercolator.jar cli.MascotPercolator [options ...]
Parameters (replacing the "[options ...]" expression):
Example:
java -cp MascotPercolator.jar cli.MascotPercolator -rankdelta 1 -newDat -target 11083 -decoy 11084 -out 11083-11084
Mascot Percolator extracts all necessary data from the Mascot dat file(s), trains Percolator and writes the results to the specified summary file. Mascot Percolator requires a separate target and decoy search, which can be achieved in two ways:
1. A single Mascot search is performed with the Mascot auto-decoy option enabled. In this case, the "-target" and "-decoy" parameters refer to the same log ID or results file.
2. Two independent searches against a target and decoy database are performed, using identical search parameter settings. The "-target" and "-decoy" parameters are set accordingly.
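The two cases differ only in the IDs passed on the command line. The following is a minimal sketch that prints the command rather than running it. It assumes that the "-target", "-decoy" and "-out" flags from the example above are sufficient (your setup may need further options). The helper name percolator_cmd and the output name chosen for the auto-decoy case are illustrative assumptions, not part of Mascot Percolator.

```shell
# Print (do not run) the Mascot Percolator command for a target/decoy pair.
# Only flags that appear in the example above are used.
percolator_cmd() {
    target=$1
    decoy=$2
    if [ "$target" = "$decoy" ]; then
        out=$target              # case 1: auto-decoy search, one log ID (naming is an assumption)
    else
        out="$target-$decoy"     # case 2: separate searches, named after both IDs
    fi
    printf 'java -cp MascotPercolator.jar cli.MascotPercolator -target %s -decoy %s -out %s\n' \
        "$target" "$decoy" "$out"
}

percolator_cmd 11083 11083   # case 1: auto-decoy search
percolator_cmd 11083 11084   # case 2: independent target and decoy searches
```

Once the printed command looks right, it can be pasted into the shell or piped to "sh".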
(*) Note: If the Mascot results are in the default results folder specified in the config file, the 'log ID' is the integer part of the file name of the Mascot results file of interest. Example: given /mascot/results/ is the root folder of the Mascot results and /mascot/results/20090330/F001234.dat is the results file of interest, the 'log ID' is 1234.
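The mapping from results file to log ID can be sketched as a small shell function. It assumes the file name always has the form F<zero-padded integer>.dat, as in the example above; the function name mascot_log_id is hypothetical and not part of Mascot Percolator.

```shell
# Derive the Mascot log ID from a results file path (sketch).
# Assumes the file name has the form F<zero-padded integer>.dat.
mascot_log_id() {
    name=$(basename "$1" .dat)   # e.g. F001234
    digits=${name#F}             # drop the leading "F", e.g. 001234
    # Strip leading zeros; keep a single 0 if the number is all zeros.
    printf '%s\n' "$digits" | sed -e 's/^0*//' -e 's/^$/0/'
}

mascot_log_id /mascot/results/20090330/F001234.dat   # prints 1234
```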
The queueing system was implemented to distribute the Mascot Percolator processes across several machines (nodes). This reduces the post-processing time linearly with the number of machines available.
If you have a Load Sharing Facility (LSF) installed and your nodes have access to the Mascot results files, you are certainly better off using LSF directly.
WARNING: Even though we run this queueing system without any problems in our IT environment, the distributed computing package should be regarded as experimental. Please feel free to send us bug reports.
There are four separate components involved:
java -cp libs/hsqldb.jar org.hsqldb.Server -database.0 file:mascotPercolatorLogDB -dbname.0 mascotPercolatorLog
This example starts up an HSQLDB database server.
'database.0' specifies the file where the database is saved.
'dbname.0' specifies the database name.
You can connect to this SQL database using the HSQLDB server JDBC driver:
'jdbc:hsqldb:hsql://localhost:9001/mascotpercolatorlog' with user 'sa' and no password.
Please note that user 'sa' has full read/write access.
java -Djava.rmi.server.hostname=yourhost -cp MascotPercolator.jar queue.Server [options ...]
Replace 'yourhost' with the hostname of your machine.
Parameters (replacing the "[options ...]" expression):
Example:
java -Djava.rmi.server.hostname=mascotsrv -cp MascotPercolator.jar queue.Server -dbHost localhost -dbAlias mascotpercolatorlog -htmlStatusFile /mascot/mascot/html/percolator/index.html -port 1198
Please note that nodes cannot connect to the Mascot queue.Server and will fail unless you
specifically allow them to do so. For this, you need to create a file called 'server.policy' before
starting queue.Server and set specific permissions that grant access to local system resources. Please read:
http://java.sun.com/developer/onlineTraining/Programming/JDCBook/appA.html.
We use the 'AllPermission' setting, but make sure you understand the implications. We do not take any responsibility
for your chosen settings.
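As an illustration of the 'AllPermission' setting, the sketch below prints a policy in standard Java policy-file syntax that grants every permission to all code. The function name policy_text is hypothetical. Where queue.Server expects to find the file is not stated here; the usage note assumes the working directory from which the server is started.

```shell
# Print a fully permissive Java policy (sketch).
# This removes all security restrictions; narrow it down if you can.
policy_text() {
    cat <<'EOF'
grant {
    permission java.security.AllPermission;
};
EOF
}

policy_text
```

To create the file, redirect the output, for example "policy_text > server.policy", before starting queue.Server.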
java -cp MascotPercolator.jar queue.Node [options ...]
Parameters (replacing the "[options ...]" expression):
Example:
java -cp MascotPercolator.jar queue.Node -copyDat -server mascotsrv
Note: 'copyDat' is currently only supported on UNIX machines. For this to work, make sure you run
the server and node processes as the same user to avoid file permission issues. If not all nodes
are in your ssh known_hosts file, the server will halt and ask for manual confirmation whenever it connects to a new, unknown
node. We set 'StrictHostKeyChecking no' in the ssh config to automatically accept all new hosts. Make sure
you understand the implications.
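The ssh setting mentioned above can be sketched as follows. The function prints an ssh config block for a given host pattern; the function name ssh_host_block and the pattern 'node*' are hypothetical. Restricting the block to your node names is a more cautious variant of accepting all hosts, which would use the pattern '*' instead.

```shell
# Print an ssh config block that disables host key confirmation
# for hosts matching the given pattern (sketch).
ssh_host_block() {
    printf 'Host %s\n    StrictHostKeyChecking no\n' "$1"
}

ssh_host_block 'node*'   # 'node*' is a placeholder for your node host names
```

The output would typically be appended to the ssh config file (usually ~/.ssh/config) of the user that runs the server process.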
java -cp MascotPercolator.jar queue.SubmitJob [options ...]
Parameters (replacing the "[options ...]" expression):
Example:
java -cp MascotPercolator.jar queue.SubmitJob -server mascotsrv -user 'markus' -target 12787 -decoy 12789 -out '/tmp/12787-12789'
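When several target/decoy pairs need to be queued, the submit command can be generated in a loop. The sketch below only prints the commands and uses only the flags from the example above. The function name submit_cmd, the second ID pair and the /tmp output location are illustrative assumptions.

```shell
# Print (do not run) one queue.SubmitJob command per target/decoy pair.
submit_cmd() {
    server=$1; user=$2; target=$3; decoy=$4
    printf "java -cp MascotPercolator.jar queue.SubmitJob -server %s -user '%s' -target %s -decoy %s -out '/tmp/%s-%s'\n" \
        "$server" "$user" "$target" "$decoy" "$target" "$decoy"
}

# Pairs are written as target:decoy; the second pair is a made-up example.
for pair in 12787:12789 12790:12791; do
    submit_cmd mascotsrv markus "${pair%%:*}" "${pair##*:}"
done
```

After checking the printed commands, the loop output can be piped to "sh" to submit the jobs.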
If you have an LSF queue on your system but the nodes have no access to the Mascot results files, this queue package is still useful: use 'OneShotNodes' instead of the standard nodes. Instead of starting up nodes manually and submitting jobs individually, a OneShotNode takes care of both and can therefore be embedded into a standard LSF command. A OneShotNode has a job associated with it at start-up and, unlike the standard nodes, terminates upon successful completion. The basic command is:
java -cp MascotPercolator.jar queue.OneShotNode [options ...]
Options are a superset of queue.SubmitJob and queue.Node.
Example of using OneShotNode as part of a bsub LSF command:
bsub -q long -M7500000 -R'select[mem>7500] rusage[mem=7500]' -o /lustre/log/percolator/9 "java -Djava.io.tmpdir=/lustre/temp -cp MascotPercolator.jar queue.OneShotNode -server mascotsrv -serverPort 1198 -copyDat -user mb8 -target 12865 -decoy 12866 -out /lustre/percolator/12865-12866"
» Please refer to Ref. [4], Ref. [5] and Ref. [6] at the end of this document.
» Percolator requires the pre- and post-fixes to be set; however, Mascot Percolator does not apportion the proteins, and since a peptide can match several proteins, we keep these blank ("X").
Electrophoresis 1999;20;18;3551-67
PUBMED: 10612281; DOI: 10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2
Nature methods 2007;4;11;923-5
PUBMED: 17952086; DOI: 10.1038/nmeth1113
Journal of proteome research 2009;8;6;3176-81
PUBMED: 19338334; DOI: 10.1021/pr800982s; PMC: 2734080
Journal of proteome research 2008;7;1;40-4
PUBMED: 18052118; DOI: 10.1021/pr700739d
Journal of proteome research 2008;7;1;29-34
PUBMED: 18067246; DOI: 10.1021/pr700600n