Sina Jasim

The Open Proteomics Journal

2010, 3: 8-19
Published online 2010 May 31. DOI: 10.2174/1875039701003010008
Publisher ID: TOPROTJ-3-8

The Proteome Discovery Pipeline – A Data Analysis Pipeline for Mass Spectrometry-Based Differential Proteomics Discovery

Catherine P. Riley, Erik S. Gough, Jing He, Shrinivas S. Jandhyala, Brad Kennedy, Seza Orcun, Mourad Ouzzani, Charles Buck, Ali M. Roumani and Xiang Zhang

Institute for Intrinsically Disordered Protein Research, Center for Computational Biology & Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University Schools of Medicine and Informatics, 410 W. 10th Street, HS 5009, Indianapolis, IN 46202, USA.

ABSTRACT

Proteomics approaches enable interrogation of large numbers of proteins to provide a more comprehensive understanding of biological systems. High-throughput proteomics typically uses liquid chromatography-mass spectrometry technology for data acquisition. Bioinformatic analysis tools are essential to manage and mine the resulting high-volume proteomics data sets. Data analysis is currently a bottleneck for many proteomics researchers because complete, freely accessible analysis systems are not available. In addition, most analysis systems require input from an experienced bioinformatician immediately upon data acquisition. For proteomics to achieve its greatest impact in biology, data analysis must become more efficient and effective.

We present the Proteome Discovery Pipeline (PDP), a web-based analysis platform that performs initial proteomics data analyses without requiring specialized hardware or input from bioinformatics specialists. The functionalities of the PDP include spectrum visualization, deconvolution, alignment, normalization, statistical significance tests, and pattern recognition. The PDP provides proteomics researchers with a user-friendly, web-based data analysis package that handles multiple file formats and facilitates analysis of data from multiple proteomics technology platforms. The system is flexible and extensible to enable further development. In this paper, we describe the development of the PDP and illustrate its capabilities through a case study of human plasma proteomics data analysis.
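As a rough illustration of two of the steps listed above, normalization and a statistical significance test, the following Python sketch scales each sample to a common median intensity and then computes Welch's t statistic for one peak across two groups. It is not the PDP's implementation: the function names `median_normalize` and `welch_t`, the choice of median scaling and Welch's test, and the toy intensities are all assumptions made for illustration only.

```python
import math
import statistics


def median_normalize(samples):
    """Scale each sample so its median intensity equals the median of all sample medians.

    `samples` is a list of samples; each sample is a list of peak intensities
    for the same (already aligned) peaks. Hypothetical helper, not a PDP API.
    """
    medians = [statistics.median(s) for s in samples]
    target = statistics.median(medians)
    return [[x * target / m for x in s] for s, m in zip(samples, medians)]


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va, vb = statistics.variance(a), statistics.variance(b)
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    # Toy data (invented): three aligned peaks per sample, three samples per group.
    control = [[100.0, 200.0, 300.0], [110.0, 215.0, 330.0], [95.0, 190.0, 280.0]]
    treated = [[100.0, 260.0, 310.0], [120.0, 300.0, 350.0], [90.0, 240.0, 275.0]]

    normalized = median_normalize(control + treated)
    norm_control, norm_treated = normalized[:3], normalized[3:]

    # Compare the intensities of the second peak between the two groups.
    peak = 1
    t, df = welch_t([s[peak] for s in norm_control],
                    [s[peak] for s in norm_treated])
    print(f"peak {peak}: t = {t:.3f}, df = {df:.2f}")
```

In practice the t statistic would be converted to a p-value and corrected for multiple testing across all peaks; those steps are omitted here to keep the sketch dependent only on the standard library.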

Keywords:

Proteomic pipeline, Data mining, Data analysis, Oligomerization, Mass spectrometry.