Open Access
Nucleic Acids Research, volume 40, issue D1, pages D472-D478

AH-DB: collecting protein structure pairs before and after binding

Publication type: Journal Article
Publication date: 2011-11-13
scimago Q1
SJR: 7.048
CiteScore: 27.1
Impact factor: 16.6
ISSN: 03051048, 13624962
PubMed ID: 22084200
Genetics
Abstract
This work presents the Apo-Holo DataBase (AH-DB, http://ahdb.ee.ncku.edu.tw/ and http://ahdb.csbb.ntu.edu.tw/), which provides corresponding pairs of protein structures before and after binding. Conformational transitions are commonly observed in various protein interactions that are involved in important biological functions. For example, copper-zinc superoxide dismutase (SOD1), which destroys free superoxide radicals in the body, undergoes a large conformational transition from an 'open' state (apo structure) to a 'closed' state (holo structure). Many studies have utilized collections of apo-holo structure pairs to investigate the conformational transitions and critical residues. However, the collection process is usually complicated, varies from study to study and produces a small-scale data set. AH-DB is designed to provide an easy and unified way to prepare such data, which is generated by identifying/mapping molecules in different Protein Data Bank (PDB) entries. Conformational transitions are identified based on a refined alignment scheme to overcome the challenge that many structures in the PDB database are only protein fragments and not complete proteins. There are 746,314 apo-holo pairs in AH-DB, which is about 30 times those in the second largest collection of similar data. AH-DB provides sophisticated interfaces for searching apo-holo structure pairs and exploring conformational transitions from apo structures to the corresponding holo structures.
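The pairing idea in the abstract (mapping the same molecule across different structure entries, then matching unbound with bound forms) can be illustrated with a minimal sketch. This is not the AH-DB pipeline: it assumes exact sequence identity stands in for the refined alignment scheme, so it would miss the fragment cases that scheme is designed to handle. The `Chain` record layout, the function name `apo_holo_pairs`, and all identifiers and sequences below are hypothetical.

```python
from collections import defaultdict
from itertools import product
from typing import NamedTuple


class Chain(NamedTuple):
    """One protein chain from a structure entry (hypothetical record layout)."""
    entry_id: str          # a PDB-style entry identifier
    chain_id: str
    sequence: str          # one-letter amino acid sequence
    partners: frozenset    # bound molecules seen with this chain; empty means unbound


def apo_holo_pairs(chains):
    """Pair unbound (apo) and bound (holo) chains that share an identical sequence.

    Simplifying assumption: exact sequence identity defines "the same protein".
    """
    by_sequence = defaultdict(list)
    for chain in chains:
        by_sequence[chain.sequence].append(chain)

    pairs = []
    for group in by_sequence.values():
        apo = [c for c in group if not c.partners]
        holo = [c for c in group if c.partners]
        # Every apo chain is paired with every holo chain of the same sequence.
        pairs.extend(product(apo, holo))
    return pairs


# Toy records; identifiers and sequences are made up for illustration.
chains = [
    Chain("E001", "A", "MKTAYIAK", frozenset()),
    Chain("E002", "A", "MKTAYIAK", frozenset({"ZN"})),
    Chain("E003", "B", "MKTAYIAK", frozenset({"ZN", "CU"})),
    Chain("E004", "A", "GSHMLEDP", frozenset({"ATP"})),  # no apo partner, so no pair
]

for apo, holo in apo_holo_pairs(chains):
    bound = ", ".join(sorted(holo.partners))
    print(f"{apo.entry_id}:{apo.chain_id} -> {holo.entry_id}:{holo.chain_id} (bound: {bound})")
```

Because every apo chain is paired with every holo chain of the same sequence, the number of pairs grows multiplicatively with the number of deposited structures per protein, which is one plausible reason a pair count can far exceed the number of distinct proteins.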
Mészáros B., Simon I., Dosztányi Z.
Physical Biology scimago Q2 wos Q3
2011-05-13 citations by CoLab: 52
A frequently neglected aspect of protein-protein interactions is flexibility. Small-scale fluctuations are present even in globular proteins, and alternative conformations can have a significant influence on the binding process. However, flexibility becomes highly prominent in complexes involving intrinsically disordered proteins. The importance of disordered regions in protein interactions has been recognized only relatively recently. In this survey we examine the basic properties of the complexes of disordered and ordered proteins from three different directions. The comparison of the interface properties shows that although disordered proteins can also adopt well-defined conformations in their bound form, their inherently dynamic nature is cast into their complexes. Furthermore, an overview of prediction methods indicates that disordered proteins as well as their binding regions can be recognized from the amino acid sequence by capturing the basic biophysical properties of these segments. Finally, we propose the generalization of the 'energy landscape model' for the description of complex formation that can help to put the various types of protein associations on a common ground.
Rose P.W., Beran B., Bi C., Bluhm W.F., Dimitropoulos D., Goodsell D.S., Prlic A., Quesada M., Quinn G.B., Westbrook J.D., Young J., Yukich B., Zardecki C., Berman H.M., Bourne P.E.
Nucleic Acids Research scimago Q1 wos Q1 Open Access
2010-10-29 citations by CoLab: 513
The RCSB Protein Data Bank (RCSB PDB) web site (http://www.pdb.org) has been redesigned to increase usability and to cater to a larger and more diverse user base. This article describes key enhancements and new features that fall into the following categories: (i) query and analysis tools for chemical structure searching, query refinement, tabulation and export of query results; (ii) web site customization and new structure alerts; (iii) pair-wise and representative protein structure alignments; (iv) visualization of large assemblies; (v) integration of structural data with the open access literature and binding affinity data; and (vi) web services and web widgets to facilitate integration of PDB data and tools with other resources. These improvements enable a range of new possibilities to analyze and understand structure data. The next generation of the RCSB PDB web site, as described here, provides a rich resource for research and education.
London N., Movshovitz-Attias D., Schueler-Furman O.
Structure scimago Q1 wos Q2
2010-02-10 citations by CoLab: 361
Peptide-protein interactions are very prevalent, mediating key processes such as signal transduction and protein trafficking. How can peptides overcome the entropic cost involved in switching from an unstructured, flexible peptide to a rigid, well-defined bound structure? A structure-based analysis of peptide-protein interactions unravels that most peptides do not induce conformational changes on their partner upon binding, thus minimizing the entropic cost of binding. Furthermore, peptides display interfaces that are better packed than protein-protein interfaces and contain significantly more hydrogen bonds, mainly those involving the peptide backbone. Additionally, "hot spot" residues contribute most of the binding energy. Finally, peptides tend to bind in the largest pockets available on the protein surface. Our study is based on peptiDB, a new and comprehensive data set of 103 high-resolution peptide-protein complex structures. In addition to improved understanding of peptide-protein interactions, our findings have direct implications for the structural modeling, design, and manipulation of these interactions.
Bai F., Branch R.W., Nicolau D.V., Pilizota T., Steel B.C., Maini P.K., Berry R.M.
Science scimago Q1 wos Q1 Open Access
2010-02-05 citations by CoLab: 169
Complex Cooperativity. Cooperativity in multisubunit protein complexes is classically understood in terms of either a concerted model, in which all subunits switch conformation simultaneously, or a sequential model, in which a subunit switches conformation whenever a ligand binds. More recently, a “conformational spread” model has suggested that a conformational coupling between subunits and between subunit and ligand is probabilistic. Using high-resolution optical microscopy, Bai et al. (p. 685; see the Perspective by Hilser) observed multistate switching of the bacterial flagellar switch complex that was previously understood in terms of a concerted allosteric model. The conformational spread model gives quantitative agreement with the data.
You Z., Cao X., Taylor A.B., Hart P.J., Levine R.L.
Biochemistry scimago Q1 wos Q3
2010-01-20 citations by CoLab: 34
In the course of studies on human copper-zinc superoxide dismutase (SOD1), we observed a modified form of the protein whose mass was increased by 158 mass units. The covalent modification was characterized, and we established that it is a novel heptasulfane bridge connecting the two Cys111 residues in the SOD1 homodimer. The heptasulfane bridge was visualized directly in the crystal structure of a recombinant human mutant SOD1, H46R/H48Q, produced in yeast. The modification is reversible, with the bridge being cleaved by thiols, by cyanide, and by unfolding of the protein to expose the polysulfane. The polysulfane bridge can be introduced in vitro by incubation of purified SOD1 with elemental sulfur, even under anaerobic conditions and in the presence of a metal chelator. Because polysulfanes and polysulfides can catalyze the generation of reactive oxygen and sulfur species, the modification may endow SOD1 with a toxic gain of function.
Galaleldeen A., Strange R.W., Whitson L.J., Antonyuk S.V., Narayana N., Taylor A.B., Schuermann J.P., Holloway S.P., Hasnain S.S., Hart P.J.
2009-12-01 citations by CoLab: 67
Amyotrophic lateral sclerosis (ALS) is a fatal, progressive neurodegenerative disease characterized by the destruction of motor neurons in the spinal cord and brain. A subset of ALS cases are linked to dominant mutations in copper-zinc superoxide dismutase (SOD1). The pathogenic SOD1 variants A4V and G93A have been the foci of multiple studies aimed at understanding the molecular basis for SOD1-linked ALS. The A4V variant is responsible for the majority of familial ALS cases in North America, causing rapidly progressing paralysis once symptoms begin and the G93A SOD1 variant is overexpressed in often studied murine models of the disease. Here we report the three-dimensional structures of metal-free A4V and of metal-bound and metal-free G93A SOD1. In the metal-free structures, the metal-binding loop elements are observed to be severely disordered, suggesting that these variants may share mechanisms of aggregation proposed previously for other pathogenic SOD1 proteins.
Lobanov M.Y., Shoemaker B.A., Garbuzynskiy S.O., Fong J.H., Panchenko A.R., Galzitskaya O.V.
Nucleic Acids Research scimago Q1 wos Q1 Open Access
2009-11-10 citations by CoLab: 28
Most of the proteins in a cell assemble into complexes to carry out their function. In this work, we have created a new database (named ComSin) of protein structures in bound (complex) and unbound (single) states to provide a researcher with exhaustive information on structures of the same or homologous proteins in bound and unbound states. From the complete Protein Data Bank (PDB), we selected 24 910 pairs of protein structures in bound and unbound states, and identified regions of intrinsic disorder. For 2448 pairs, the proteins in bound and unbound states are identical, while 7129 pairs have sequence identity 90% or larger. The developed server enables one to search for proteins in bound and unbound states with several options including sequence similarity between the corresponding proteins in bound and unbound states, and validation of interaction interfaces of protein complexes. Besides that, through our web server, one can obtain necessary information for studying disorder-to-order and order-to-disorder transitions upon complex formation, and analyze structural differences between proteins in bound and unbound states. The database is available at http://antares.protres.ru/comsin/.
Dan A., Ofran Y., Kliger Y.
2009-07-07 citations by CoLab: 12
Conformational changes in proteins often involve secondary structure transitions. Such transitions can be divided into two types: disorder-to-order changes, in which a disordered segment acquires an ordered secondary structure (e.g., disorder to alpha-helix, disorder to beta-strand), and order-to-order changes, where a segment switches from one ordered secondary structure to another (e.g., alpha-helix to beta-strand, alpha-helix to turn). In this study, we explore the distribution of these transitions in the proteome. Using a comprehensive, yet highly conservative method, we compared solved three-dimensional structures of identical protein sequences, looking for differences in the secondary structures with which they were assigned. Protein chains in which such secondary structure transitions were detected, were classified into two sets according to the type of transition that is involved (disorder-to-order or order-to-order), allowing us to characterize each set by examining enrichment of gene ontology terms. The results reveal that the disorder-to-order set is significantly enriched with nucleotide binding proteins, whereas the order-to-order set is more diverse. Remarkably, further examination reveals that >22% of the purine nucleotide binding proteins include segments which undergo disorder-to-order transitions, suggesting that such transitions play an important role in this process.
Fong J.H., Shoemaker B.A., Garbuzynskiy S.O., Lobanov M.Y., Galzitskaya O.V., Panchenko A.R.
PLoS Computational Biology scimago Q1 wos Q1 Open Access
2009-03-13 citations by CoLab: 100
We perform a large-scale study of intrinsically disordered regions in proteins and protein complexes using a non-redundant set of hundreds of different protein complexes. In accordance with the conventional view that folding and binding are coupled, in many of our cases the disorder-to-order transition occurs upon complex formation and can be localized to binding interfaces. Moreover, analysis of disorder in protein complexes depicts a significant fraction of intrinsically disordered regions, with up to one third of all residues being disordered. We find that the disorder in homodimers, especially in symmetrical homodimers, is significantly higher than in heterodimers and offer an explanation for this interesting phenomenon. We argue that the mechanisms of regulation of binding specificity through disordered regions in complexes can be as common as for unbound monomeric proteins. The fascinating diversity of roles of disordered regions in various biological processes and protein oligomeric forms shown in our study may be a subject of future endeavors in this area.
Russel D., Lasker K., Phillips J., Schneidman-Duhovny D., Velázquez-Muriel J.A., Sali A.
Current Opinion in Cell Biology scimago Q1 wos Q1
2009-02-15 citations by CoLab: 66
Dynamic processes involving macromolecular complexes are essential to cell function. These processes take place over a wide variety of length scales from nanometers to micrometers, and over time scales from nanoseconds to minutes. As a result, information from a variety of different experimental and computational approaches is required. We review the relevant sources of information and introduce a framework for integrating the data to produce representations of dynamic processes.
Lange O.F., Lakomek N., Farès C., Schröder G.F., Walter K.F., Becker S., Meiler J., Grubmüller H., Griesinger C., de Groot B.L.
Science scimago Q1 wos Q1 Open Access
2008-06-13 citations by CoLab: 918
Protein dynamics are essential for protein function, and yet it has been challenging to access the underlying atomic motions in solution on nanosecond-to-microsecond time scales. We present a structural ensemble of ubiquitin, refined against residual dipolar couplings (RDCs), comprising solution dynamics up to microseconds. The ensemble covers the complete structural heterogeneity observed in 46 ubiquitin crystal structures, most of which are complexes with other proteins. Conformational selection, rather than induced-fit motion, thus suffices to explain the molecular recognition dynamics of ubiquitin. Marked correlations are seen between the flexibility of the ensemble and contacts formed in ubiquitin complexes. A large part of the solution dynamics is concentrated in one concerted mode, which accounts for most of ubiquitin's molecular recognition heterogeneity and ensures a low entropic complex formation cost.
Gao M., Skolnick J.
Nucleic Acids Research scimago Q1 wos Q1 Open Access
2008-05-31 citations by CoLab: 136
The structures of DNA-protein complexes have illuminated the diversity of DNA-protein binding mechanisms shown by different protein families. This lack of generality could pose a great challenge for predicting DNA-protein interactions. To address this issue, we have developed a knowledge-based method, DNA-binding Domain Hunter (DBD-Hunter), for identifying DNA-binding proteins and associated binding sites. The method combines structural comparison and the evaluation of a statistical potential, which we derive to describe interactions between DNA base pairs and protein residues. We demonstrate that DBD-Hunter is an accurate method for predicting DNA-binding function of proteins, and that DNA-binding protein residues can be reliably inferred from the corresponding templates if identified. In benchmark tests on approximately 4000 proteins, our method achieved an accuracy of 98% and a precision of 84%, which significantly outperforms three previous methods. We further validate the method on DNA-binding protein structures determined in DNA-free (apo) state. We show that the accuracy of our method is only slightly affected on apo-structures compared to the performance on holo-structures cocrystallized with DNA. Finally, we apply the method to approximately 1700 structural genomics targets and predict that 37 targets with previously unknown function are likely to be DNA-binding proteins. DBD-Hunter is freely available at http://cssb.biology.gatech.edu/skolnick/webservice/DBD-Hunter/.
Henzler-Wildman K.A., Thai V., Lei M., Ott M., Wolf-Watz M., Fenn T., Pozharski E., Wilson M.A., Petsko G.A., Karplus M., Hübner C.G., Kern D.
Nature scimago Q1 wos Q1
2007-11-18 citations by CoLab: 795
The mechanisms by which enzymes achieve extraordinary rate acceleration and specificity have long been of key interest in biochemistry. It is generally recognized that substrate binding coupled to conformational changes of the substrate–enzyme complex aligns the reactive groups in an optimal environment for efficient chemistry. Although chemical mechanisms have been elucidated for many enzymes, the question of how enzymes achieve the catalytically competent state has only recently become approachable by experiment and computation. Here we show crystallographic evidence for conformational substates along the trajectory towards the catalytically competent ‘closed’ state in the ligand-free form of the enzyme adenylate kinase. Molecular dynamics simulations indicate that these partially closed conformations are sampled in nanoseconds, whereas nuclear magnetic resonance and single-molecule fluorescence resonance energy transfer reveal rare sampling of a fully closed conformation occurring on the microsecond-to-millisecond timescale. Thus, the larger-scale motions in substrate-free adenylate kinase are not random, but preferentially follow the pathways that create the configuration capable of proficient chemistry. Such preferred directionality, encoded in the fold, may contribute to catalysis in many enzymes. The presence of conformational substates of a catalytically competent 'closed' state in the ligand-free form of adenylate kinase is detected. Molecular dynamics simulations indicated that the partially closed conformations were sampled in nanoseconds, and NMR and single-molecule FRET experiments revealed the sampling of a fully closed conformation occurring on the microsecond-to-millisecond timescale.
Michel Espinoza-Fonseca L., Kast D., Thomas D.D.
Biophysical Journal scimago Q1 wos Q2
2007-09-01 citations by CoLab: 58
We have performed molecular dynamics simulations of the phosphorylated (at S-19) and the unphosphorylated 25-residue N-terminal phosphorylation domain of the regulatory light chain (RLC) of smooth muscle myosin to provide insight into the structural basis of regulation. This domain does not appear in any crystal structure, so these simulations were combined with site-directed spin labeling to define its structure and dynamics. Simulations were carried out in explicit water at 310 K, starting with an ideal alpha-helix. In the absence of phosphorylation, large portions of the domain (residues S-2 to K-11 and R-16 through Y-21) were metastable throughout the simulation, undergoing rapid transitions among alpha-helix, pi-helix, and turn, whereas residues K-12 to Q-15 remained highly disordered, displaying a turn motif from 1 to 22.5 ns and a random coil pattern from 22.5 to 50 ns. Phosphorylation increased alpha-helical order dramatically in residues K-11 to A-17 but caused relatively little change in the immediate vicinity of the phosphorylation site (S-19). Phosphorylation also increased the overall dynamic stability, as evidenced by smaller temporal fluctuations in the root mean-square deviation. These results on the isolated phosphorylation domain, predicting a disorder-to-order transition induced by phosphorylation, are remarkably consistent with published experimental data involving site-directed spin labeling of the intact RLC bound to the two-headed heavy meromyosin. The simulations provide new insight into structural details not revealed by experiment, allowing us to propose a refined model for the mechanism by which phosphorylation affects the N-terminal domain of the RLC of smooth muscle myosin.
Theobald D.L., Wuttke D.S.
Bioinformatics scimago Q1 wos Q1 Open Access
2006-06-15 citations by CoLab: 179
THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. ANSI C source code and selected binaries for various computing platforms are available under the GNU open source license from http://monkshood.colorado.edu/theseus/ or http://www.theseus3d.org.
Raisinghani N., Parikh V., Foley B., Verkhivker G.
2024-12-02 citations by CoLab: 1
Proteins often exist in multiple conformational states, influenced by the binding of ligands or substrates. The study of these states, particularly the apo (unbound) and holo (ligand-bound) forms, is crucial for understanding protein function, dynamics, and interactions. In the current study, we use AlphaFold2, which combines randomized alanine sequence masking with shallow multiple sequence alignment subsampling to expand the conformational diversity of the predicted structural ensembles and capture conformational changes between apo and holo protein forms. Using several well-established datasets of structurally diverse apo-holo protein pairs, the proposed approach enables robust predictions of apo and holo structures and conformational ensembles, while also displaying notably similar dynamics distributions. These observations are consistent with the view that the intrinsic dynamics of allosteric proteins are defined by the structural topology of the fold and favor conserved conformational motions driven by soft modes. Our findings provide evidence that AlphaFold2 combined with randomized alanine sequence masking can yield accurate and consistent results in predicting moderate conformational adjustments between apo and holo states, especially for proteins with localized changes upon ligand binding. For large hinge-like domain movements, the proposed approach can predict functional conformations characteristic of both apo and ligand-bound holo ensembles in the absence of ligand information. These results are relevant for using this AlphaFold adaptation for probing conformational selection mechanisms according to which proteins can adopt multiple conformations, including those that are competent for ligand binding. The results of this study indicate that robust modeling of functional protein states may require more accurate characterization of flexible regions in functional conformations and the detection of high-energy conformations. 
By incorporating a wider variety of protein structures in training datasets, including both apo and holo forms, the model can learn to recognize and predict the structural changes that occur upon ligand binding.
Raisinghani N., Parikh V., Foley B., Verkhivker G.
2024-11-06 citations by CoLab: 0
Proteins often exist in multiple conformational states, influenced by the binding of ligands or substrates. The study of these states, particularly the apo (unbound) and holo (ligand-bound) forms, is crucial for understanding protein function, dynamics, and interactions. In the current study, we use AlphaFold2 that combines randomized alanine sequence masking with shallow multiple sequence alignment subsampling to expand the conformational diversity of the predicted structural ensembles and capture conformational changes between apo and holo protein forms. Using several well-established datasets of structurally diverse apo-holo protein pairs, the proposed approach enables robust predictions of apo and holo structures and conformational ensembles, while also displaying notably similar dynamics distributions. These observations are consistent with the view that the intrinsic dynamics of allosteric proteins is defined by the structural topology of the fold and favors conserved conformational motions driven by soft modes. Our findings support the notion that AlphaFold2 approaches can yield reasonable accuracy in predicting minor conformational adjustments between apo and holo states, especially for proteins with moderate localized changes upon ligand binding. However, for large, hinge-like domain movements, AlphaFold2 tends to predict the most stable domain orientation which is typically the apo form rather than the full range of functional conformations characteristic of the holo ensemble. These results indicate that robust modeling of functional protein states may require more accurate characterization of flexible regions in functional conformations and detection of high energy conformations. By incorporating a wider variety of protein structures in training datasets including both apo and holo forms, the model can learn to recognize and predict the structural changes that occur upon ligand binding.
Zhang Y., Dong M., Deng J., Wu J., Zhao Q., Gao X., Xiong D.
Communications Biology scimago Q1 wos Q1 Open Access
2024-10-26 citations by CoLab: 2
Assessing mutation impact on the binding affinity change (ΔΔG) of protein–protein interactions (PPIs) plays a crucial role in unraveling structural-functional intricacies of proteins and developing innovative protein designs. In this study, we present a deep learning framework, PIANO, for improved prediction of ΔΔG in PPIs. The PIANO framework leverages a graph masked self-distillation scheme for protein structural geometric representation pre-training, which effectively captures the structural context representations surrounding mutation sites, and makes predictions using a multi-branch network consisting of multiple encoders for amino acids, atoms, and protein sequences. Extensive experiments demonstrated its superior prediction performance and the capability of pre-trained encoder in capturing meaningful representations. Compared to previous methods, PIANO can be widely applied on both holo complex structures and apo monomer structures. Moreover, we illustrated the practical applicability of PIANO in highlighting pathogenic mutations and crucial proteins, and distinguishing de novo mutations in disease cases and controls in PPI systems. Overall, PIANO offers a powerful deep learning tool, which may provide valuable insights into the study of drug design, therapeutic intervention, and protein engineering. PIANO: a deep learning framework providing a powerful tool and potentially unforeseen avenues for the prediction of mutation impact on the binding affinity changes of protein–protein interactions
Feidakis C.P., Krivak R., Hoksza D., Novotny M.
Journal of Molecular Biology scimago Q1 wos Q1
2024-09-01 citations by CoLab: 3
A single protein structure is rarely sufficient to capture the conformational variability of a protein. Both bound and unbound (holo and apo) forms of a protein are essential for understanding its geometry and making meaningful comparisons. Nevertheless, docking or drug design studies often still consider only single protein structures in their holo form, which are for the most part rigid. With the recent explosion in the field of structural biology, large, curated datasets are urgently needed. Here, we use a previously developed application (AHoJ) to perform a comprehensive search for apo-holo pairs for 468,293 biologically relevant protein-ligand interactions across 27,983 proteins. In each search, the binding pocket is captured and mapped across existing structures within the same UniProt, and the mapped pockets are annotated as apo or holo, based on the presence or absence of ligands. We assemble the results into a database, AHoJ-DB (www.apoholo.cz/db), that captures the variability of proteins with identical sequences, thereby exposing the agents responsible for the observed differences in geometry. We report several metrics for each annotated pocket, and we also include binding pockets that form at the interface of multiple chains. Analysis of the database shows that about 24% of the binding sites occur at the interface of two or more chains and that less than 50% of the total binding sites processed have an apo form in the PDB. These results can be used to train and evaluate predictors, discover potentially druggable proteins, and reveal protein- and ligand-specific relationships that were previously obscured by intermittent or partial data. Availability: http://apoholo.cz/db
Feidakis C.P., Krivak R., Hoksza D., Novotny M.
Bioinformatics scimago Q1 wos Q1 Open Access
2022-10-25 citations by CoLab: 7
Summary: Understanding the mechanism of action of a protein, or designing better ligands for it, often requires access to a bound (holo) and an unbound (apo) state of the protein. Resources for the quick and easy retrieval of such conformations are severely limited. Apo-Holo Juxtaposition (AHoJ) is a web application for retrieving apo-holo structure pairs for user-defined ligands. Given a query structure and one or more user-specified ligands, it retrieves all other structures of the same protein that feature the same binding site(s), aligns them, and examines the superimposed binding sites to determine whether each structure is apo or holo, in reference to the query. The resulting superimposed datasets of apo-holo pairs can be visualized and downloaded for further analysis. AHoJ accepts multiple input queries, allowing the creation of customized apo-holo datasets. Availability: Freely available for non-commercial use at http://apoholo.cz. Source code available at https://github.com/cusbg/AHoJ-project. Supplementary information: Supplementary data are available at Bioinformatics online.
Feidakis C.P., Krivak R., Hoksza D., Novotny M.
2022-09-06 citations by CoLab: 0
Understanding the mechanism of action of a protein or designing better ligands for it often requires access to a bound (holo) and an unbound (apo) state of the protein. Resources for the quick and easy retrieval of such conformations are severely limited. Apo-Holo Juxtaposition (AHoJ) is a web application for retrieving apo-holo structure pairs for user-defined ligands. Given a query structure and one or more defined ligands, it retrieves all other structures of the same protein that feature the same binding site(s), aligns them, and examines the superimposed binding sites to determine whether each structure is apo or holo, in reference to the query. The resulting superimposed datasets of apo-holo pairs can be visualized and downloaded for further analysis. AHoJ accepts multiple input queries, allowing the creation of customized apo-holo datasets. To demonstrate AHoJ’s functionality, we present a newly constructed dataset of apo-holo pairs featuring 13 ion ligands, by complementing an existing database of biologically relevant holo interactions (BioLiP). Availability and Implementation: Freely available for non-commercial use at http://apoholo.cz.
Peng C., Zhang X., Xu Z., Chen Z., Yang Y., Cai T., Zhu W.
BMC Bioinformatics scimago Q1 wos Q1 Open Access
2022-02-14 citations by CoLab: 4
Knowledge of protein motions is important for understanding protein function. While currently available databases of protein motions focus mostly on overall domain motions, little attention is paid to local residue motions. Although relatively small in scale, local residue motions, especially those of residues in binding pockets, may play crucial roles in protein function and ligand binding. A comprehensive protein motion database, namely D3PM, was constructed in this study to facilitate the analysis of protein motions. The protein motions in D3PM range from overall structural changes of macromolecules to local flip motions of binding pocket residues. Currently, D3PM has collected 7679 proteins with overall motions and 3513 proteins with pocket residue motions. The motion patterns are classified into 4 types of overall structural changes and 5 types of pocket residue motions. Impressively, we found that less than 15% of protein pairs have obvious overall conformational adaptations induced by ligand binding, while more than 50% of protein pairs have significant structural changes in ligand binding sites, indicating that ligand-induced conformational changes are drastic and mainly confined around ligand binding sites. Based on the residue preference in binding pockets, we classified amino acids into “pocketphilic” and “pocketphobic” residues, which should be helpful for pocket prediction and drug design. D3PM is a comprehensive database about protein motions ranging from residue to domain, which should be useful for exploring diverse protein motions and for understanding protein function and drug design. D3PM is available at www.d3pharma.com/D3PM/index.php.
Koike R., Ota M.
2021-01-01 citations by CoLab: 1 Abstract  
Advances in structural biology have provided a wealth of information on protein structures. For many proteins, multiple structures in distinct functional states are available. The comparison of such structures reveals structural changes during the transition between the states, for example, those from ligand-free to ligand-bound states. These structural changes are important for understanding the molecular mechanism of protein function. A number of computational methods have been developed to compare distinct structural states of the same protein and describe protein structural changes. The resulting structural changes are stored in databases. After a brief introduction of pre-existing methods and databases, we introduce Motion Tree, which illustrates various structural changes using a tree diagram, and explain how to use it. We also introduce PSCDB, which presents structural changes for 837 proteins including homodimers. Structural changes are classified into seven categories based on the types of motions and bound ligands. PSCDB is available via the Internet.
Clark J.J., Benson M.L., Smith R.D., Carlson H.A.
PLoS Computational Biology scimago Q1 wos Q1 Open Access
2019-01-30 citations by CoLab: 63 PDF Abstract  
Understanding how ligand binding influences protein flexibility is important, especially in rational drug design. Protein flexibility upon ligand binding is analyzed herein using 305 proteins with 2369 crystal structures with ligands (holo) and 1679 without (apo). Each protein has at least two apo and two holo structures for analysis. The inherent variation in structures with and without ligands is first established as a baseline. This baseline is then compared to the change in conformation in going from the apo to holo states to probe induced flexibility. The inherent backbone flexibility across the apo structures is roughly the same as the variation across holo structures. The induced backbone flexibility across apo-holo pairs is larger than that of the apo or holo states, but the increase in RMSD is less than 0.5 Å. Analysis of χ1 angles revealed a distinctly different pattern with significant influences seen for ligand binding on side-chain conformations in the binding site. Within the apo and holo states themselves, the variation of the χ1 angles is the same. However, the data combining both apo and holo states show significant displacements. Upon ligand binding, χ1 angles are frequently pushed to new orientations outside the range seen in the apo states. Influences on binding-site variation could not be easily attributed to features such as ligand size or x-ray structure resolution. By combining these findings, we find that most binding site flexibility is compatible with the common practice in flexible docking, where backbones are kept rigid and side chains are allowed some degree of flexibility.
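The two quantities this abstract relies on, backbone RMSD and the change in χ1 angle, can be sketched as follows. This is an illustrative sketch only: it assumes the apo and holo coordinates are already superimposed and matched atom for atom, and the function names `rmsd` and `chi1_shift` are hypothetical, not taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched, pre-superimposed
    lists of (x, y, z) coordinates, in the same units as the input."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def chi1_shift(apo_deg, holo_deg):
    """Smallest absolute difference between two chi1 angles in degrees,
    accounting for the wrap-around at +/-180 degrees."""
    diff = (holo_deg - apo_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


if __name__ == "__main__":
    apo = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    holo = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(apo, holo))
    print(chi1_shift(170.0, -170.0))
```

The wrap-around handling matters for side chains: angles of 170° and -170° differ by 20°, not 340°, so a naive subtraction would overstate the displacement.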
Uroshlev L.A., Kulakovskiy I.V., Esipova N.G., Tumanyan V.G., Rahmanov S.V., Makeev V.J.
2017-01-22 citations by CoLab: 2 Abstract  
Structures of many metal-binding proteins are often obtained without structural cations in their apoprotein forms. Missing cation coordinates are usually updated from structural templates constructed from many holoprotein structures. Such templates usually do not include structural water, an important contributor to the ion binding energy. Structural templates are also inconvenient for taking into account structural modifications around the binding site at apo-holo transitions. An approach based upon statistical potentials readily takes into account structural modifications associated with binding as well as the contribution of structural water molecules. Here, we construct a set of statistical potentials for Mg2+, Ca2+, and Zn2+ in contact with protein atoms of different types or structural water oxygens. Each type of cation tends to form tight contacts with protein atoms of specific types. Structural water contributes relatively more to the binding pseudo-energy of Mg2+ and Ca2+ than of Zn2+. We have developed PIONCA (Protein-Ion Calculator), a fast CUDA GPGPU-based algorithm that predicts ion-binding sites in apoproteins. Comparative tests demonstrate that PIONCA outperforms most of the tools based on structural templates or docking. Our software can also be used for locating bound cations in holoprotein structures with missing cation heteroatoms. PIONCA is equipped with an interactive web interface based upon JSmol.
Shen Q., Wang G., Li S., Liu X., Lu S., Chen Z., Song K., Yan J., Geng L., Huang Z., Huang W., Chen G., Zhang J.
Nucleic Acids Research scimago Q1 wos Q1 Open Access
2015-09-13 citations by CoLab: 112 PDF Abstract  
Allosteric regulation, the most direct and efficient way of regulating protein function, is induced by the binding of a ligand at one site that is topographically distinct from an orthosteric site. Allosteric Database (ASD, available online at http://mdl.shsmu.edu.cn/ASD) has been developed to provide comprehensive information featuring allosteric regulation. With increasing data, fundamental questions pertaining to allostery are currently receiving more attention, from the mechanism of allosteric changes in an individual protein to the overall effect of the changes in the interconnected network in the cell. Thus, the following novel features were added to this updated version: (i) structural mechanisms of more than 1600 allosteric actions were elucidated by a comparison of site structures before and after the binding of a modulator; (ii) 261 allosteric networks were identified to unveil how the allosteric action in a single protein would propagate to affect downstream proteins; (iii) two of the largest human allosteromes, protein kinases and GPCRs, were thoroughly constructed; and (iv) web interface and data organization were completely redesigned for efficient access. In addition, allosteric data have largely expanded in this update. These updates are useful for facilitating the investigation of allosteric mechanisms, dynamic networks and drug discovery.
Chen X., Xuan J., Wang C., Shajahan A.N., Riggins R.B., Clarke R.
2013-11-01 citations by CoLab: 12 Abstract  
Reliable inference of transcription regulatory networks is a challenging task in computational biology. Network component analysis (NCA) has become a powerful scheme to uncover regulatory networks behind complex biological processes. However, the performance of NCA is impaired by the high rate of false connections in binding information. In this paper, we integrate stability analysis with NCA to form a novel scheme, namely stability-based NCA (sNCA), for regulatory network identification. The method mainly addresses the inconsistency between gene expression data and binding motif information. Small perturbations are introduced to the prior regulatory network, and the distance among multiple estimated transcription factor (TF) activities is computed to reflect the stability of each TF's binding network. For target gene identification, multivariate regression and the t-statistic are used to calculate the significance of each TF-gene connection. Simulation studies are conducted and the experimental results show that sNCA can achieve an improved and robust performance in TF identification as compared to NCA. The approach for target gene identification is also demonstrated to be suitable for identifying true connections between TFs and their target genes. Furthermore, we have successfully applied sNCA to breast cancer data to uncover the role of TFs in regulating endocrine resistance in breast cancer.
Chang D.T., Li W., Bai Y., Wu W.
Gene scimago Q2 wos Q2
2013-04-01 citations by CoLab: 5 Abstract  
The advance of high-throughput experimental technologies generates many gene sets with different biological meanings, where many important insights can only be extracted by identifying the biological (regulatory/functional) features that are distinct between different gene sets (e.g. essential vs. non-essential genes, TATA box-containing vs. TATA box-less genes, induced vs. repressed genes under certain biological conditions). Although many servers have been developed to identify enriched features in a gene set, most of them were designed to analyze one gene set at a time but cannot compare two gene sets. Moreover, the features used in existing servers were mainly focused on functional annotations (GO terms), pathways, transcription factor binding sites (TFBSs) and/or protein–protein interactions (PPIs). In yeast, various important regulatory features, including promoter bendability, nucleosome occupancy, 5′-UTR length, and TF–gene regulation evidence, are available but have not been used in any enrichment analysis servers. This motivates us to develop the Yeast Genes Analyzer (YGA), a web server that simultaneously analyzes various biological (regulatory/functional) features of two gene sets and performs statistical tests to identify the distinct features between them. Many well-studied gene sets such as essential, stress-response, TATA box-containing and cell cycle genes were pre-compiled in YGA for users, if they have only one gene set, to compare with. In comparison with the existing enrichment analysis servers, YGA tests more comprehensive regulatory features (e.g. promoter bendability, nucleosome occupancy, 5′-UTR length, experimental evidence of TF–gene binding and TF–gene regulation) and functional features (e.g. PPI, GO terms, pathways and functional groups of genes, including essential/non-essential genes, stress-induced/-repressed genes, TATA box-containing/-less genes, occupied/depleted proximal-nucleosome genes and cell cycle genes). Furthermore, YGA uses various statistical tests to provide objective comparison measures. The two major contributions of YGA, comprehensive features and statistical comparison, help to mine important information that cannot be obtained from other servers. The sophisticated analysis tools of YGA can identify distinct biological features between two gene sets, which help biologists to form new hypotheses about the underlying biological mechanisms responsible for the observed difference between these two gene sets. YGA can be accessed from the following web pages: http://cosbi.ee.ncku.edu.tw/yga/ and http://yga.ee.ncku.edu.tw/.
► YGA can compare various biological features between two yeast gene sets.
► YGA includes six unique regulatory features and one unique functional feature.
► YGA provides an overall index for biological features with multiple tests.
► YGA proposes a novel procedure to compare two profiles.
► YGA can form hypotheses responsible for the observed difference between gene sets.
Fan C., Bai Y., Huang C., Yao T., Chiang W., Chang D.T.
Gene scimago Q2 wos Q2
2013-04-01 citations by CoLab: 1 Abstract  
This work presents the Protein Association Analyzer (PRASA) (http://zoro.ee.ncku.edu.tw/prasa/) that predicts protein interactions as well as interaction types. Protein interactions are essential to most biological functions. The existence of diverse interaction types, such as physically contacted or functionally related interactions, makes protein interactions complex. Different interaction types are distinct and should not be confused. However, most existing tools focus on a specific interaction type or mix different interaction types. This work collected 7,234,058 associations with experimentally verified interaction types from five databases and compiled individual probabilistic models for different interaction types. The PRASA result page shows predicted associations and their related references by interaction type. Experimental results demonstrate the performance difference when distinguishing between different interaction types. The PRASA provides a centralized and organized platform for easy browsing, downloading and comparing of interaction types, which helps reveal insights into the complex roles that proteins play in organisms.
► The Protein Association Analyzer (PRASA) predicts protein interactions with types.
► Available at http://prasa.ee.ncku.edu.tw/ and http://prasa.csbb.ntu.edu.tw/
► Accommodates three interaction types: physical, genetic and pathway interactions
► de novo predictor, which requires only protein primary sequences
► Evaluated on human and yeast and achieved good performance

