School Of Basic And Applied Sciences

Permanent URI for this community: https://kr.cup.edu.in/handle/32116/17

  • Item
    Understanding the principles of protein-protein interactions: Designing novel means for virtual proteomics
    (Central University of Punjab, 2020) Kumar, Vicky; Kulharia, Mahesh and Munshi, Anjana
    Proteins are the basic functional units in the cellular world of life. They are nano-machines programmed to associate with other biomolecules in order to enact an array of molecular functions in response to biological events at cellular and system levels. Understanding the biomolecular phenomena governing such associations may provide insights into the principles of protein chemistry that have a wide range of applications. In the current work, two databases (PPInS and NRDB) were developed. They hold information on interacting protein chains from experimentally determined protein-protein complexes (PPCs) for which structural information in terms of SCOP superfamily was available, demarcated in the form of protein-protein interaction interfaces (PPIIs). The PPIIs contained in these databases were made available on a web server for public use and were analysed with respect to the physicochemical and geometrical characteristics of PPI sites. In the belief that a computational prediction tool must be trained and tested on real instances of the phenomenon for which it is designed, the information obtained from the analysis of PPIIs from NRDB was incorporated into the development of a computational tool, Anveshan, for prediction of putative protein-protein interaction (PPI) sites. The training and test datasets for Anveshan development were obtained from PPInS. PPInS is a high-performance database of PPIIs that provides atomic-level information on the molecular interactions amongst the protein chains in PPCs, together with their evolutionary information from the Structural Classification of Proteins database (SCOPe release 2.06). A total of 32,468 PDB files representing X-ray crystallized multimeric PPCs with structural resolution better than 2.5 Å were shortlisted to demarcate the PPIIs.
A total of 111,857 PPIIs with approximately 32.24 million atomic contact pairs (ACPs) were generated and made available on a web server, named PPInS (http://www.cup.edu.in:99/ppins/home.php), for on-site analysis and download. A non-redundant database (NRDB) of PPInS, containing 2,265 PPIIs with over 1.8 million ACPs corresponding to 1,931 PPCs, was also designed by removing structural redundancies at the level of SCOP superfamily (SCOP release 1.75), to provide the foundation for the development of Anveshan. All the PPIIs and PPIPs in both databases were analysed with respect to residue interface propensity (RIP), hydrophobic content, solvation free energy, compactness of the interacting residues' neighbourhood, planarity, and depth index. The PPIIs were also examined in the context of the sequence similarity shared by the protein chains forming each PPII, which revealed an abundance of homodimers in the PDB. Therefore, prior to analysing the PPIIs with respect to other parameters, PPIIs from both databases were categorized into three classes reflecting low sequence similarity (LSS), moderate sequence similarity (MSS), and high sequence similarity (HSS) between the protein chains involved. The RIP analysis showed an abundance of aliphatic and aromatic residues at interaction sites and the lowest occurrence of charged residues (except Arg). Physicochemical and structural analysis of PPIPs initially showed a significant difference between the parametric scores of the three PPII classes from PPInS and NRDB. However, on removing less than 1% of each PPII class as statistical outliers, the parametric scores from all three classes of PPInS and NRDB converged to a statistically indistinguishable common sub-range and followed similar distribution trends.
This indicates that the principles of molecular recognition among proteins are not driven by sequence similarity, and it reinforces the importance of geometrical and electrostatic complementarity as the main determinants of PPIs. The parametric scores obtained by analysing 4,530 PPIPs from NRDB with respect to their RIP, hydrophobic content, and associated solvation free energy provided the basis for the implementation of Anveshan. By applying Anveshan to another dataset of 4,290 PPIPs from 2,145 PPIIs, the optimal range of these parametric scores and of the protein-probe van der Waals interaction energy was determined. Subsequently, taking into consideration the optimal range of PPIP parametric scores and the threshold for protein-probe van der Waals interaction energy, Anveshan was tested on a blind dataset of 554 protein chains. When 10 sites were predicted for each protein chain and the best-predicted patch was taken into account, Anveshan predicted 69.67% of sites correctly, with at least 50% in both precision and coverage separately. When only one PPI site was predicted for each protein chain, the sites predicted by Anveshan covered on average 21.91% of the actual sites. For the sites predicted by SPPIDER, 22.7% of actual sites were covered. However, when two sites were predicted for each protein chain, the coverage of actual sites by Anveshan's predictions nearly doubled (to 41.81%), thus making Anveshan a superior approach.
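
The evaluation criterion described above (at least 50% in both precision and coverage, judged on the best-predicted patch) can be illustrated with a minimal Python sketch. It is not Anveshan's implementation. It assumes that a site is represented as a set of residue identifiers, that precision is the fraction of predicted residues that are actual interface residues, that coverage is the fraction of actual interface residues that were predicted, and that the "best" patch is the one with the highest minimum of the two scores. The function names are hypothetical.

```python
def precision_and_coverage(predicted, actual):
    """Return (precision, coverage) for two collections of residue identifiers.

    Assumed definitions (not taken from the abstract):
      precision = |predicted ∩ actual| / |predicted|
      coverage  = |predicted ∩ actual| / |actual|
    """
    predicted, actual = set(predicted), set(actual)
    overlap = len(predicted & actual)
    precision = overlap / len(predicted) if predicted else 0.0
    coverage = overlap / len(actual) if actual else 0.0
    return precision, coverage


def is_correct(predicted, actual, threshold=0.5):
    """A prediction counts as correct when both scores reach the threshold."""
    precision, coverage = precision_and_coverage(predicted, actual)
    return precision >= threshold and coverage >= threshold


def best_patch(patches, actual):
    """Pick the patch with the highest min(precision, coverage).

    This ranking rule is an assumption made for illustration only.
    """
    return max(patches, key=lambda p: min(precision_and_coverage(p, actual)))


# Hypothetical residue numbers for one protein chain.
actual_site = {3, 4, 5, 6}
predicted_patches = [{1, 2, 3, 4}, {3, 4, 5}]

chosen = best_patch(predicted_patches, actual_site)
print(chosen, precision_and_coverage(chosen, actual_site), is_correct(chosen, actual_site))
```

Under these assumptions, averaging the coverage over all chains in a test set would give a figure comparable in kind to the per-chain coverage percentages quoted in the abstract.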