userbz/DeMix-Q

DeMix-Q: Quantification-centered LC-MS/MS data processing


Dependencies:

Optional:


Procedures:

First, enable the external tool used for XIC extraction:

  • Copy the consensus2edta.ttd configuration file into the EXTERNAL tools directory of your OpenMS installation (e.g. C:\Program Files\OpenMS-2.0\share\OpenMS\TOOLS\EXTERNAL).
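The copy step can also be scripted. The following is a minimal sketch in Python 3 (the repository's own scripts target Python 2.7). The helper name `install_ttd` is hypothetical, and the sub-path `share/OpenMS/TOOLS/EXTERNAL` is taken from the example path above; adjust it if your installation is laid out differently.

```python
import shutil
from pathlib import Path


def install_ttd(ttd_file, openms_root):
    """Copy a .ttd tool description into <openms_root>/share/OpenMS/TOOLS/EXTERNAL.

    Hypothetical helper: the directory layout follows the example path in
    this README and is an assumption for other OpenMS versions.
    """
    target_dir = Path(openms_root) / "share" / "OpenMS" / "TOOLS" / "EXTERNAL"
    if not target_dir.is_dir():
        # Fail early rather than creating a directory TOPPAS will not read.
        raise FileNotFoundError(f"EXTERNAL tools directory not found: {target_dir}")
    # copy2 keeps file metadata and returns the path of the new file.
    return Path(shutil.copy2(ttd_file, target_dir))


# Example call (commented out; needs write access to the OpenMS folder):
# install_ttd("consensus2edta.ttd", r"C:\Program Files\OpenMS-2.0")
```

On Windows, writing below C:\Program Files usually requires an elevated prompt.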

After copying the configuration file to the OpenMS path:

  • Load either the DeMix_PoseCluster.toppas pipeline (new version, recommended) or the DeMix.toppas pipeline (old version) in TOPPAS.

For DeMix_PoseCluster.toppas:

  • Prepare centroided MS1 spectra in mzML format and load them in "Node 1".
  • Load the corresponding MS/MS identification files (".idXML" format) in "Node 2". mzid files produced by MS-GF+ can be converted with the Python script mzid2idxml_theomass.py. (Note: the script only reads RT information from spectrum titles generated by DeMix. If you do not use the DeMix identification workflow and MS-GF+, revise the script before use.)
  • In "Node 3", load the corresponding ".featureXML" files (i.e. individual feature maps) previously generated by FeatureFinderCentroided (for example, in the DeMix workflow).
  • In "Node 4", load ONE ".featureXML" file as the reference for retention time alignment. (We recommend choosing the file from Node 3 that has the largest number of features. The RT scales of the other runs are re-calibrated to the scale of the reference run after alignment.)
  • Modify "Node 14" in the pipeline: set the converter_script option to point to the Python script consensus2edta.py.
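To pick the reference file for "Node 4", the feature maps can be ranked by feature count. The sketch below (Python 3) is illustrative and not part of the repository. It assumes that a featureXML file holds its features as `<feature>` elements directly under a `<featureList>` element with no default XML namespace. The names `count_features` and `pick_reference` are hypothetical.

```python
import xml.etree.ElementTree as ET


def count_features(featurexml_path):
    """Count top-level <feature> elements of one featureXML file.

    Assumption: features sit directly under <featureList>; nested
    subordinate features are deliberately not counted.
    """
    root = ET.parse(featurexml_path).getroot()
    feature_list = root.find("featureList")
    if feature_list is None:
        return 0
    return len(feature_list.findall("feature"))


def pick_reference(paths):
    """Return the path whose feature map contains the most features."""
    return max(paths, key=count_features)


# Example call (file names are placeholders):
# print(pick_reference(["run1.featureXML", "run2.featureXML"]))
```

`ET.parse` loads the whole file into memory, so very large feature maps may call for a streaming parser instead.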

Alternatively, for DeMix.toppas:

  • Prepare centroided MS1 spectra in mzML format and load them in "Node 1".
  • Load the corresponding MS/MS identification files (".idXML" format) in "Node 3". mzid files produced by MS-GF+ can be converted with the Python script mzid2idxml_theomass.py. (Note: the script only reads RT information from spectrum titles generated by DeMix. If you do not use the DeMix identification workflow and MS-GF+, revise the script before use.)
  • In "Node 4", load ONE ".idXML" file as the reference for retention time alignment. (We recommend choosing the file from Node 3 that has the largest number of identifications. The RT scales of the other runs are re-calibrated to the scale of the reference run after alignment.)
  • Modify "Node 17" in the pipeline: set the converter_script option to point to the Python script consensus2edta.py.
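For this older pipeline the reference is an idXML file, so the candidates can be ranked by number of identifications. The following Python 3 sketch is illustrative only. It treats each `<PeptideIdentification>` element as one identification, which is an assumption about how "number of identifications" should be counted, and it assumes the idXML files use no default XML namespace. The function name `count_peptide_ids` is hypothetical.

```python
import xml.etree.ElementTree as ET


def count_peptide_ids(idxml_path):
    """Count <PeptideIdentification> elements in an idXML file.

    The file is streamed, and each counted element is cleared once its
    end tag has been read, to keep memory use low on large files.
    """
    n = 0
    for _event, elem in ET.iterparse(idxml_path, events=("end",)):
        if elem.tag == "PeptideIdentification":
            n += 1
            elem.clear()
    return n


# Example call (file names are placeholders):
# best = max(["run1.idXML", "run2.idXML"], key=count_peptide_ids)
```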

After configuration:

  • Execute the pipeline in TOPPAS (by pressing F5).
  • Collect the two output files produced by the TextExporter and EICExtractor nodes.
Finally:

  • Run the post-processing Python script DeMixQ_data_processing.py in the script folder. Set the parameters as follows:

    -consensus   The consensus feature map (default: consensus.csv)
    -eic         The extracted ion chromatograms (default: eic.csv)
    -n_samples   Number of samples (default: 4)
    -n_repeats   Number of replicate experiments per sample (default: 3)
    -fdr         Quality cutoff by estimating FDR from decoy extractions (default: 0.05)
    -knn_k       K nearest neighbors for abundance correction; set 0 to disable correction (default: 5)
    -knn_step    Number of sequential features used for estimating local medians (default: 500)
    -out         Output table in csv format (default: peptides.DeMixQ.csv)

  • Alternatively, the script folder contains an example IPython notebook that can be executed step by step in Jupyter Notebook (Python 2.7).
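As an illustration of how the parameters listed above fit together, the Python 3 sketch below assembles a command line for the post-processing script. The flag names and defaults are taken from this README. The helper name `build_demixq_command`, the interpreter name, and the assumption that each flag is followed by its value as a separate argument are not confirmed by the source, so check them against the script before use.

```python
def build_demixq_command(
    script="DeMixQ_data_processing.py",
    interpreter="python",  # assumption: an interpreter able to run the script
    consensus="consensus.csv",
    eic="eic.csv",
    n_samples=4,
    n_repeats=3,
    fdr=0.05,
    knn_k=5,
    knn_step=500,
    out="peptides.DeMixQ.csv",
):
    """Return the argument list for one post-processing run."""
    options = [
        ("-consensus", consensus),
        ("-eic", eic),
        ("-n_samples", n_samples),
        ("-n_repeats", n_repeats),
        ("-fdr", fdr),
        ("-knn_k", knn_k),
        ("-knn_step", knn_step),
        ("-out", out),
    ]
    cmd = [interpreter, script]
    for flag, value in options:
        # Assumption: every flag takes its value as the next argument.
        cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # Six samples, abundance correction disabled (knn_k = 0).
    print(" ".join(build_demixq_command(n_samples=6, knn_k=0)))
```

The returned list can be handed to `subprocess.call` to start the run. Building the arguments as a list rather than a single string avoids quoting problems with file paths that contain spaces.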

Note:
The example workflow was tested with high-resolution Orbitrap (FTMS 70,000) datasets. You may need to adjust the parameters to fit your dataset.

Map RT alignment can be done without MS/MS identifications, or with other tools whose output is then converted into the OpenMS-compatible format (i.e. trafoXML).

For a large dataset (e.g. >10 samples), feature detection and linking may require considerable computing resources: CPU, RAM and disk space.


Download the example data set
https://ki.box.com/s/15kv73pu2cvi4gwakzd6kdxcwch7dbl0


Reference
  1. Zhang, B., Käll, L., & Zubarev, R. A. (2016). DeMix-Q: Quantification-centered Data Processing Workflow. Molecular & Cellular Proteomics, mcp.O115.055475. https://www.ncbi.nlm.nih.gov/pubmed/26729709
  2. Zhang, B., Pirmoradian, M., Chernobrovkin, A., & Zubarev, R. A. (2014). DeMix Workflow for Efficient Identification of Co-fragmented Peptides in High Resolution Data-dependent Tandem Mass Spectrometry. Molecular & Cellular Proteomics. doi:10.1074/mcp.O114.038877. https://www.ncbi.nlm.nih.gov/pubmed/25100859
