Classification Of G Protein Coupled Receptors Using Machine Learning Techniques

Zia-ur-Rehman (2013) Classification Of G Protein Coupled Receptors Using Machine Learning Techniques. Doctoral thesis, Pakistan Institute Of Engineering And Applied Sciences, Islamabad.


G protein-coupled receptors (GPCRs) are located at the cell boundary and are used for intercellular communication. They are found mostly in eukaryotic cells but also occur in some prokaryotic cells. GPCRs modulate synaptic transmission in the spinal cord and brain, and can trigger signaling pathways that regulate cell proliferation and gene expression. They are physiologically very important: according to one estimate, more than 50% of marketed drugs target GPCRs. Computational prediction of unknown GPCRs is of great importance in pharmacology because malfunction of GPCRs can cause many diseases. The goal of this thesis is to propose new methods for the classification of GPCRs using machine learning approaches. The work is divided into two parts. The first part covers the classification of GPCRs using machine learning methods. We analyze biological, statistical, and transform-domain feature extraction strategies and exploit various physicochemical properties to generate discriminative features of GPCR sequences. We have developed three GPCR classification methods. In the first method, GPCRs are predicted using a hybrid of pseudo amino acid composition and a multi-scale energy representation of physicochemical properties; the focus here is on introducing various physicochemical properties (hydrophobicity, electronic property, and bulk property). In the second method, GPCRs are predicted using the grey incidence degree measure and principal component analysis, whereby the relations between the various components of GPCR sequences are exploited. In the third method, we perform weighted ensemble classification of GPCRs using evolutionary information and multi-scale energy based features. The weight of each classifier is optimized using a genetic algorithm, which improves classification performance.
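The weighted ensemble step of the third method can be illustrated with a minimal sketch. It assumes each base classifier outputs a vector of class probabilities and that the weights are already given; the function name `weighted_ensemble` and the example numbers are hypothetical, and the genetic-algorithm search for the weights described in the abstract is not reproduced here.

```python
def weighted_ensemble(prob_sets, weights):
    """Combine per-classifier class probabilities by a weighted average.

    prob_sets: one list of class probabilities per classifier.
    weights:   one non-negative weight per classifier (assumed given;
               the thesis optimizes them with a genetic algorithm).
    Returns (index of the winning class, combined probabilities).
    """
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(prob_sets[0])
    combined = [0.0] * n_classes
    for probs, w in zip(prob_sets, weights):
        for k, p in enumerate(probs):
            combined[k] += (w / total) * p  # normalized weight times vote

    best = max(range(n_classes), key=combined.__getitem__)
    return best, combined


# Hypothetical outputs of three classifiers for one sequence, two classes.
votes = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
equal_label, _ = weighted_ensemble(votes, [1, 1, 1])
skewed_label, _ = weighted_ensemble(votes, [8, 1, 1])
```

With equal weights the two classifiers favoring the second class win; giving the first classifier a much larger weight flips the decision, which is the effect a weight search would tune.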
The second part of the thesis is based on multiple sequence alignment of GPCRs, whereby we utilize the structural information of GPCRs. The three-dimensional structures of several Rhodopsin-like GPCRs have been resolved at atomic resolution, validating the prediction, made from sequence information alone, that the GPCR fold is a bundle of seven transmembrane helices (TMs). The dataset is first aligned using multiple sequence alignment methods, and the TMs are extracted. The dataset comprises 19 subfamilies of Rhodopsin receptors from 62 species. Weights are assigned to avoid bias toward any particular species. Position-specific scoring matrices (PSSMs) are computed for the seven TMs, and pseudocounts derived from the conventional BLOSUM62 scoring matrix are added. Unknown receptors are classified using the PSSMs of the known receptors and by TM similarity methods. Our research may make valuable contributions to the fields of Bioinformatics, Pattern Classification, and Computational Biology, and has yielded results comparable to those of existing approaches. We conclude that our research may help researchers further explore membrane protein classification and other subcellular localization classification problems.
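The PSSM step can be sketched as follows. This is a simplified illustration, not the thesis implementation: it assumes pre-aligned TM segments of equal length, per-sequence weights supplied by the caller, and a uniform background distribution for the pseudocounts in place of the BLOSUM62-derived pseudocounts mentioned above. The names `build_pssm` and `score_segment` and the toy sequences are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, weights=None, pseudocount=1.0):
    """Build a log-odds PSSM from aligned segments of equal length.

    weights:     per-sequence weights (e.g. to down-weight over-represented
                 species); defaults to 1.0 for every sequence.
    pseudocount: total pseudocount mass per column, spread uniformly
                 (an assumption; the thesis uses BLOSUM62-based pseudocounts).
    """
    if weights is None:
        weights = [1.0] * len(aligned)
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned segments must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = dict.fromkeys(AMINO_ACIDS, 0.0)
        total = 0.0
        for seq, w in zip(aligned, weights):
            aa = seq[pos]
            if aa in counts:  # gaps and unknown symbols are skipped
                counts[aa] += w
                total += w
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount * background) / (total + pseudocount)
            column[aa] = math.log2(freq / background)  # log-odds vs. background
        pssm.append(column)
    return pssm


def score_segment(pssm, segment):
    """Sum the per-position log-odds scores of a segment against a PSSM."""
    if len(segment) != len(pssm):
        raise ValueError("segment length must match the PSSM length")
    return sum(col.get(aa, 0.0) for col, aa in zip(pssm, segment))


# Toy example: two made-up "subfamilies" of 4-residue segments.
pssm_a = build_pssm(["LLAV", "LLAV", "LIAV"])
pssm_b = build_pssm(["GGST", "GGSS", "GAST"])
query = "LLAV"
predicted = "A" if score_segment(pssm_a, query) > score_segment(pssm_b, query) else "B"
```

An unknown segment is assigned to the subfamily whose PSSM scores it highest; in the thesis setting this would be repeated for each of the seven TMs.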

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: Techniques, Classification, Protein Coupled, Machine, Receptors, Learning
Subjects: Q Science > QA Mathematics > QA76 Computer software
Depositing User: Muhammad Khan Khan
Date Deposited: 05 Sep 2016 10:20
Last Modified: 05 Sep 2016 10:20
