Peng Lab: JUMPm

A software program for mass spectrometry-based metabolite identification

Introduction

JUMPm is a program for untargeted metabolite identification based on liquid chromatography and tandem mass spectrometry. The algorithm determines chemical formulas from either unlabeled or stable-isotope-labeled metabolome data, and derives possible structures by predictive fragmentation during database search. JUMPm uses a target-decoy strategy based on the octet rule to estimate the false discovery rate (FDR). The user specifies a target FDR and JUMPm filters the data to reach that target. FDR is a critical measure of confidence that researchers can use when analyzing their data.

The program is written in Perl, Java and R. It is designed for high-performance parallel computing systems. More detailed information can be found in the README file in the compressed source code package.

Download JUMPm

JUMPm is open-source and free for academic and non-profit use. 

Download the source code

Two sample datasets with MISSILE labeling can be downloaded: negative data and positive data.

  1. A pre-built formula database
    The mass-formula database is used to determine the chemical formulas of observed peaks.

  2. A pre-built composite database
    Includes structures from PubChem, HMDB and YMDB. The structure database is used for MS2 metabolite identification once a formula is determined.

Additional datasets:

Reverse phase + mode:

HILIC - mode:

File Preparation

JUMPm runs natively with the Thermo™ .raw file format. For LC-MS data from other vendors, convert files to the .mzXML file format using ReAdW or MSConvert. A screenshot with example parameters is shown below for MSConvert. When converting, MS level 2 must be included for JUMPm to perform structure identification. If desired, JUMPm can search MS1-only data to identify metabolite formulas alone.
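For users who prefer the command line, the conversion can also be scripted. The following is only a sketch: it assumes ProteoWizard's msconvert is installed and on the PATH, and it does not reproduce the settings from the screenshot. The --mzXML, --filter and -o options belong to msconvert; the helper name msconvert_cmd, the input file sample.d and the output folder converted are hypothetical. The helper prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Print an msconvert command that writes .mzXML output and keeps
# MS level 1 and MS level 2 scans (MS2 is needed for structure identification).
# msconvert_cmd is a hypothetical helper, not part of JUMPm or ProteoWizard.
msconvert_cmd() {
    input=$1     # vendor data file to convert (assumed name)
    out_dir=$2   # folder that will receive the .mzXML file (assumed name)
    echo "msconvert $input --mzXML --filter \"msLevel 1-2\" -o $out_dir"
}

# Example: show the command for one hypothetical vendor file.
msconvert_cmd sample.d converted
```

Copy the printed line into the terminal once the file names match your data.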

Usage

Basic usage of JUMPm requires only one command. A search has three parts: 1) the raw file to be analyzed, 2) the structure database to be searched, and 3) the parameter file that describes the search settings. Simply point JUMPm to the structure database of your choice, the raw file to be analyzed, and the search parameter file:

  • jumpm -p <JUMPm parameter file> <Raw MS/MS data file(s)>
  • -p <file> specifies a JUMPm parameter file
  • <Raw MS/MS data file(s)> specifies one or multiple MS/MS data file(s)

For example, to analyze "Yeast_data_set_1," first change to the directory that contains the raw file to be searched. The choice of database and the search parameters can be edited inside the parameter file. To run the search, type the following command in the command prompt/terminal:

jumpm -p jumpm.params Yeast_data_set_1.mzXML

*Tip: the parameter file is most easily edited in WordPad, not Notepad.
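Because jumpm accepts one or multiple data files, every converted file in a directory can be passed in a single call. The sketch below assumes the data files end in .mzXML and that a parameter file such as jumpm.params already exists; the helper name jumpm_batch_cmd is hypothetical and not part of JUMPm. It prints the command rather than running it, so the file list can be checked first.

```shell
#!/bin/sh
# Print a jumpm command covering every .mzXML file in a directory.
# jumpm_batch_cmd is a hypothetical helper, not part of JUMPm.
jumpm_batch_cmd() {
    params=$1     # JUMPm parameter file
    data_dir=$2   # directory holding the .mzXML files

    # Stop early if the parameter file is missing.
    if [ ! -f "$params" ]; then
        echo "parameter file not found: $params" >&2
        return 1
    fi

    # Collect the data files; the pattern stays unexpanded when nothing matches.
    set -- "$data_dir"/*.mzXML
    if [ ! -e "$1" ]; then
        echo "no .mzXML files in $data_dir" >&2
        return 1
    fi

    # File names containing spaces would need quoting before the line is reused.
    echo "jumpm -p $params $*"
}
```

Calling jumpm_batch_cmd jumpm.params . from the data directory prints one jumpm line listing all .mzXML files found there.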

Installation

Your computer system must meet these minimum requirements to run JUMPm.

System Requirements

Hardware

To run on a cluster system

  • SGE or LSF job management system
  • 32 GB memory on each node

To run on a single server

  • 32 GB memory
  • 2 GHz CPU processors with a minimum of 4 cores

Software

WINE must be installed to analyze .raw files; .mzXML files can be processed without it.

The main program of JUMPm is written in Perl (v5.8 or above). The following modules are needed:

Perl modules:
Parallel::ForkManager
Class::Std
Statistics::R
Statistics::Basic
Statistics::Descriptive
Set::Partition
Regexp::Common
Number::Format
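Before running JUMPm, it may help to confirm that these modules can actually be loaded. The sketch below relies on the standard perl -M switch, which makes perl exit with an error when a module cannot be loaded; the helper name missing_perl_modules is hypothetical and not part of JUMPm.

```shell
#!/bin/sh
# Print the name of every listed Perl module that fails to load.
# missing_perl_modules is a hypothetical helper, not part of JUMPm.
missing_perl_modules() {
    for module in "$@"; do
        # "perl -MName -e 1" loads the module and exits non-zero on failure.
        if ! perl -M"$module" -e 1 >/dev/null 2>&1; then
            echo "$module"
        fi
    done
}

# Check the modules listed above; no output means all of them loaded.
missing_perl_modules \
    Parallel::ForkManager Class::Std Statistics::R Statistics::Basic \
    Statistics::Descriptive Set::Partition Regexp::Common Number::Format
```

Any module name that is printed still needs to be installed, for example with the cpan client.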

R v3.1.0:

The R source code is provided in the JUMPm folder. You need to rebuild it as follows:

  1. tar -vxf R-3.1.0.tar
  2. ./configure --with-readline=no
  3. make

Java v1.7

More detailed information about installation can be found in the README file.

Support

Please contact Xusheng Wang and Junmin Peng for software support.