Membrane Cholesterol Prediction from Human Receptor using Rough Set based Mean-Shift Approach
Subject areas: Machine learning
Rudra Kalyan Nayak 1, Ramamani Tripathy 2, Hitesh Mohapatra 3, Amiya Kumar Rath 4, Debahuti Mishra 5
1 - School of Computing Science and Engineering, VIT Bhopal University, Bhopal-Indore Highway, Kothrikalan, Sehore, MP, India
2 - Department of Computer Science and Engineering, Chitkara University Himachal Pradesh Campus, Pinjore-Nalagarh National Highway, Dist-Baddi, Himachal Pradesh, India
3 - School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar 751024, Odisha, India
4 - Department of Computer Science and Engineering, Veer Surendra Sai University of Technology, Burla, Odisha 768018, India
5 - Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan (Deemed to be) University, Bhubaneswar, Odisha, India
Keywords: GPCR, CRAC, CARC, ANN, Decision Tree, Rough set, Mean shift
Abstract:
In human physiology, cholesterol plays an essential role in cell membranes, where it regulates the function of the G-protein-coupled receptor (GPCR) family. Cholesterol is a distinct type of lipid, and about 90 percent of cellular cholesterol is located in the plasma membrane. The Cholesterol Recognition/interaction Amino acid Consensus (CRAC) sequence is generally written as (L/V)-X1-5-(Y)-X1-5-(K/R), where X1-5 denotes one to five residues of any amino acid. A newer cholesterol-binding domain, CARC, is similar to CRAC but has the inverse orientation along the polypeptide chain: (K/R)-X1-5-(Y/F)-X1-5-(L/V). GPCRs are regarded as the largest superfamily in human physiology, and probably more than 900 protein genes belong to this family. Among membrane proteins, GPCRs are a major target for novel drug discovery across the pharmaceutical industry. Earlier studies did not find the required number of valid motifs in terms of helices and motif types, and so lacked clinical relevance. The research gap is that they could not effectively predict motifs that belong to multiple motif types. To find better motif sequences in human GPCRs, we explored a hybrid computational model that combines Rough Set theory with the Mean-Shift algorithm. In this paper we compare our results with those of other techniques, namely fuzzy C-means (FCM) and FCM with spectral clustering, and we conclude that the proposed method targeted the CRAC region well in comparison to the CARC region, which has higher biological relevance in the medicine industry and drug discovery.
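The two consensus patterns quoted in the abstract can be read directly as regular expressions. The following is a minimal sketch of that reading only, not the Rough Set and Mean-Shift pipeline used in the paper. The function name `find_motifs`, the example sequences, and the choice to report at most one match per start position are illustrative assumptions.

```python
import re

# CRAC: (L/V)-X(1-5)-Y-X(1-5)-(K/R)
# CARC: (K/R)-X(1-5)-(Y/F)-X(1-5)-(L/V)
# X is any residue; [A-Z] is used here as a loose stand-in for the
# one-letter amino acid alphabet (an assumption of this sketch).
# The lookahead wrapper lets matches that overlap each other be reported.
CRAC = re.compile(r"(?=([LV][A-Z]{1,5}Y[A-Z]{1,5}[KR]))")
CARC = re.compile(r"(?=([KR][A-Z]{1,5}[YF][A-Z]{1,5}[LV]))")


def find_motifs(sequence, pattern):
    """Return (start_index, matched_text) pairs, one per start position.

    Hypothetical helper: for each start position only the first match
    found by the regex engine is reported, not every possible spacing.
    """
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up toy sequences, not taken from any real receptor.
    print(find_motifs("AALAAYAAKAA", CRAC))
    print(find_motifs("KAAFAAL", CARC))
```

A scan like this only lists candidate motifs in a sequence; deciding which candidates are biologically meaningful is the part the abstract assigns to the clustering model.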