Converting protein sequence to image for classification with convolutional neural network
Subject Areas: General
Reza Ahsan 1, Mansour Ebrahimi 2, Dianat Dianat 3
1 - Faculty member
2 -
3 -
Keywords: Converting protein sequence to image, Gabor filter, Convolutional Neural Network, Protein classification
Abstract:
Because conventional machine learning methods for sequence analysis have not succeeded in distinguishing healthy from cancerous proteins, a way of representing these sequences is needed so that healthy and ill individuals can be classified with deep learning approaches. This study examined different protein sequence representations for classifying protein sequences from healthy individuals and from individuals with leukemia. Converting amino acid letters to one-dimensional feature vectors did not succeed in the two-class problem: only one class, the disease class, was detected. Changing the feature vector to colored numbers slightly improved recognition accuracy for the healthy class. The binary representation of protein sequences, applied in both one-dimensional form and two-dimensional form (an image processed with a Gabor filter), was more efficient than the previous methods. With the protein sequence represented as a binary image and a Gabor filter applied, classification reached 100% accuracy for protein sequences of healthy individuals and 98.6% for those of individuals with leukemia. These findings show that representing a protein sequence as a binary image with a Gabor filter applied can serve as a new, effective representation of protein sequences for classification.
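The binary-image representation with Gabor filtering described in the abstract might look roughly like the sketch below. The abstract does not specify the encoding, the image layout, or the filter parameters, so everything here is an assumption made for illustration: the 5-bit code per residue, one image row per residue, the kernel size, and the function names `sequence_to_binary_image`, `gabor_kernel`, and `filter2d` are all hypothetical. The kernel uses the standard real-valued Gabor formula (a Gaussian envelope times a cosine carrier), not necessarily the authors' configuration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
BITS = 5  # assumption: 5 bits are enough to index 20 residues


def sequence_to_binary_image(seq):
    """Encode each residue as one row of 5 bits (hypothetical encoding)."""
    rows = []
    for residue in seq.upper():
        idx = AMINO_ACIDS.find(residue)
        if idx < 0:
            continue  # assumption: silently skip non-standard symbols
        code = idx + 1  # 1..20, so no residue maps to an all-zero row
        rows.append([(code >> b) & 1 for b in reversed(range(BITS))])
    return rows


def gabor_kernel(size=5, sigma=2.0, theta=0.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor kernel; parameter values are illustrative."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter2d(image, kernel):
    """Same-size 2D correlation of a non-empty image, with zero padding."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - half, j + dj - half
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


if __name__ == "__main__":
    image = sequence_to_binary_image("MKTAYIAKQR")  # made-up example sequence
    filtered = filter2d(image, gabor_kernel())
    for row in filtered:
        print(" ".join(f"{v:6.2f}" for v in row))
```

In a pipeline of this kind, the filtered image would then be resized or padded to a fixed shape and passed to a convolutional neural network; that step is omitted here because the abstract gives no architectural details.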