A Pseudo Covariance Wavelet-based Feature Extraction Method to Biomarker Selection from Ovarian Cancer Proteomic Patterns
Subject Areas: electrical and computer engineering
H. Montazery Kordy 1, M. H. Miran-Baygi 2, M. H. Moradi 3
1 - Tarbiat Modares University
2 - Tarbiat Modares University
3 -
Keywords: proteomics, pattern recognition, discrete wavelet transform, pseudo-covariance weight function, biomarker
Abstract :
Pathological changes within an organ can be reflected as proteomic patterns in blood. Mass spectrometry has been used as a powerful tool to generate proteomic patterns from serum. The resulting profiles are high-dimensional, correlated data in which the features of scientific interest are the peaks. Because of this complexity, an appropriate analysis method such as the wavelet transform is needed. In this study, we propose a pseudo-covariance wavelet-based feature extraction method for dimension reduction and de-correlation of mass spectra data. The algorithm was applied to ovarian cancer datasets obtained from the National Cancer Institute of the USA. It was used to extract, from the reconstructed mass spectra, a set of proteins as potential biomarkers in each dataset. The selected biomarkers distinguished ovarian cancer patients from non-cancer subjects with high accuracy under standard diagnostic criteria. Using different classification algorithms, our approach yielded an accuracy of 98%, specificity of 97%, and sensitivity of 98%.
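As a rough illustration of two ideas mentioned in the abstract, the sketch below shows (a) how one level of a discrete wavelet transform halves the length of a spectrum, and (b) how accuracy, sensitivity, and specificity are computed from predicted labels. It is only a minimal sketch under stated assumptions: the Haar wavelet, the function names `haar_step` and `diagnostic_metrics`, and the toy data are all chosen here for illustration and are not taken from the paper, and the pseudo-covariance weight function itself is not reproduced because the abstract does not define it.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT (assumed wavelet, for illustration).

    Returns (approximation, detail) coefficient lists, each half the input length.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels.

    Label convention (an assumption of this sketch): 1 = cancer, 0 = non-cancer.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancer, detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-cancer, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-cancer, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancer, missed
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy intensity values standing in for a (very short) mass spectrum.
approx, detail = haar_step([4.0, 2.0, 1.0, 3.0])

# Toy labels standing in for classifier output on five samples.
metrics = diagnostic_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

Keeping only the approximation coefficients (or a selected subset of coefficients) is one common way a wavelet transform reduces dimension; which coefficients the authors retain, and how the weight function ranks them, would have to be taken from the full paper.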