Recent Patents on Biotechnology, volume 18, issue 1, pages 35-52

Role of Artificial Intelligence in Drug Discovery to Revolutionize the Pharmaceutical Industry: Resources, Methods and Applications

Pranjal Kumar Singh 1
Kapil Sachan 2
Vishal Khandelwal 3
Sumita Singh 4
Smita Singh 5
1 Department of Pharmacy, Kalka Institute for Research and Advanced Studies, Meerut, Uttar Pradesh, India
2 KIET School of Pharmacy, KIET Group of Institutions, Ghaziabad, Uttar Pradesh, India
4 Faculty of Pharmacy, Swami Vivekanand Subharti University, Meerut, Uttar Pradesh, India
Publication type: Journal Article
Publication date: 2025-03-01
scimago: Q3
SJR: 0.277
CiteScore: 2.9
Impact factor
ISSN: 1872-2083, 2212-4012
Applied Microbiology and Biotechnology
Biotechnology
Bioengineering
Abstract

Traditional drug discovery methods such as wet-lab testing, validation, and synthetic techniques are time-consuming and expensive. Artificial Intelligence (AI) approaches have progressed to the point where they can have a significant impact on the drug discovery process. Using massive volumes of open data, AI methods are revolutionizing the pharmaceutical industry. In the last few decades, many AI-based models have been developed and implemented in many areas of the drug development process. These models have been used as a supplement to conventional research to uncover superior pharmaceuticals more quickly. In the pharmaceutical industry, AI has been used mostly for reverse engineering of existing patents and the invention of new synthesis pathways; its applications extend from drug research and development to repurposing, clinical trials, and productivity gains in the pharmaceutical business. This article examines AI's numerous potential uses. We discuss how AI can be put to use in the pharmaceutical sector, specifically for predicting a drug's toxicity, bioactivity, and physicochemical characteristics, among other things. We also discuss its application to a variety of problems, including de novo drug discovery, target structure prediction, interaction prediction, and binding affinity prediction. The use of AI for predicting drug interactions and in nanomedicine is also considered.

Qureshi R., Irfan M., Gondal T.M., Khan S., Wu J., Hadi M.U., Heymach J., Le X., Yan H., Alam T.
Heliyon scimago Q1 wos Q1 Open Access
2023-07-01 citations by CoLab: 100 Abstract  
The COVID-19 pandemic has emphasized the need for novel drug discovery processes. However, the journey from conceptualizing a drug to its eventual implementation in clinical settings is a long, complex, and expensive process, with many potential points of failure. Over the past decade, a vast growth in medical information has coincided with advances in computational hardware (cloud computing, GPUs, and TPUs) and the rise of deep learning. Medical data generated from large molecular screening profiles, personal health or pathology records, and public health organizations could benefit from analysis by Artificial Intelligence (AI) approaches to speed up and prevent failures in the drug discovery pipeline. We present applications of AI at various stages of drug discovery pipelines, including the inherently computational approaches of de novo design and prediction of a drug's likely properties. Open-source databases and AI-based software tools that facilitate drug design are discussed along with their associated problems of molecule representation, data collection, complexity, labeling, and disparities among labels. How contemporary AI methods, such as graph neural networks, reinforcement learning, and generative models, along with structure-based methods (i.e., molecular dynamics simulations and molecular docking), can contribute to drug discovery applications and analysis of drug responses is also explored. Finally, recent developments and investments in AI-based start-up companies for biotechnology, drug design and their current progress, hopes and promotions are discussed in this article.
Wang M., Wang Z., Sun H., Wang J., Shen C., Weng G., Chai X., Li H., Cao D., Hou T.
2022-02-01 citations by CoLab: 91 Abstract  
De novo drug design is the process of generating novel lead compounds with desirable pharmacological and physicochemical properties. The application of deep learning (DL) in de novo drug design has become a hot topic, and many DL-based approaches have been developed for molecular generation tasks. Generally, these approaches were developed as per four frameworks: recurrent neural networks; encoder-decoder; reinforcement learning; and generative adversarial networks. In this review, we first introduced the molecular representation and assessment metrics used in DL-based de novo drug design. Then, we summarized the features of each architecture. Finally, the potential challenges and future directions of DL-based molecular generation were outlined.
Moumné L., Marie A., Crouvezier N.
Pharmaceutics scimago Q1 wos Q1 Open Access
2022-01-22 citations by CoLab: 95 PDF Abstract  
Following the first proof of concept of using small nucleic acids to modulate gene expression, a long period of maturation led, at the end of the last century, to the first marketing authorization of an oligonucleotide-based therapy. Since then, 12 more compounds have hit the market and many more are in late clinical development. Many companies were founded to exploit their therapeutic potential and Big Pharma was quickly convinced that oligonucleotides could represent credible alternatives to protein-targeting products. Many technologies have been developed to improve oligonucleotide pharmacokinetics and pharmacodynamics. Initially targeting rare diseases and niche markets, oligonucleotides are now able to benefit large patient populations. However, there is still room for oligonucleotide improvement and further breakthroughs are likely to emerge in the coming years. In this review we provide an overview of therapeutic oligonucleotides. We present in particular the different types of oligonucleotides and their modes of action, the tissues they target and the routes by which they are administered to patients, and the therapeutic areas in which they are used. In addition, we present the different ways of patenting oligonucleotides. We finally discuss future challenges and opportunities for this drug-discovery platform.
Gliozzo E., Ionescu C.
2021-12-29 citations by CoLab: 69 PDF Abstract  
This review summarises the state-of-the-art of lead-based pigment studies, addressing their production, trade, use and possible alteration. Other issues, such as those related to the investigation and protection of artworks bearing lead-based pigments are also presented. The focus is mineralogical, as both raw materials and degradation products are mineral phases occurring in nature (except for very few cases). The minerals described are abellaite, anglesite, blixite, caledonite, challacolloite, cerussite, cotunnite, crocoite, galena, grootfonteinite, hydrocerussite, laurionite, leadhillite, litharge, macphersonite, massicot, mimetite, minium, palmierite, phosgenite, plattnerite, plumbonacrite, schulténite, scrutinyite, somersetite, susannite, vanadinite and an unnamed phase (PbMg(CO3)2). The pigments discussed are lead white, red lead, litharge, massicot, lead-tin yellow, lead-tin-antimony yellow, lead-chromate yellow and Naples yellow. An attempt is made to describe the history, technology and alteration of these pigments in the most complete manner possible, despite the topic's evident breadth. Finally, an insight into the analytical methods that can (and should) be used for accurate archaeometric investigations and a summary of key concepts conclude this review, along with a further list of references for use as a starting point for further research.
Selvaraj C., Chandra I., Singh S.K.
Molecular Diversity scimago Q2 wos Q2
2021-10-23 citations by CoLab: 85 Abstract  
The global spread of COVID-19 has made pharmaceutical drug development a difficult and active area of research. Developing new drug molecules to overcome any disease is a costly and lengthy process, but the process continues uninterrupted. The critical point in drug design is to use the available data resources to find new and novel leads. Once the drug target is identified, several interdisciplinary areas work together with artificial intelligence (AI) and machine learning (ML) methods to obtain enriched drug candidates. These AI and ML methods are applied in every step of computer-aided drug design, and integrating them results in a high success rate for hit compounds. In addition, the integration of AI and ML with high-dimensional data, together with its powerful capacity, has taken the field a step forward. Predicting clinical trial outcomes with integrated AI/ML models could further decrease clinical trial costs while also improving the success rate. In this review, we discuss the backend of AI and ML methods in supporting computer-aided drug design, along with the challenges and opportunities for the pharmaceutical industry. AI- and ML-based prediction from the available information or data supports high-throughput virtual screening. With this integration of AI and ML, the success rate of hit identification has gained momentum, yielding novel drugs.
Dara S., Dhamercherla S., Jadav S.S., Babu C.M., Ahsan M.J.
Artificial Intelligence Review scimago Q1 wos Q1
2021-08-11 citations by CoLab: 385 Abstract  
This review provides the feasible literature on drug discovery through ML tools and techniques that are applied in every phase of drug development to accelerate the research process and reduce the risk and expenditure in clinical trials. Machine learning techniques improve the decision-making in pharmaceutical data across various applications like QSAR analysis, hit discoveries, de novo drug architectures to retrieve accurate outcomes. Target validation, prognostic biomarkers, digital pathology are considered under problem statements in this review. ML challenges must be applicable for the main cause of inadequacy in interpretability outcomes that may restrict the applications in drug discovery. In clinical trials, absolute and methodological data must be generated to tackle many puzzles in validating ML techniques, improving decision-making, promoting awareness in ML approaches, and reducing risk failures in drug discovery.
Manelfi C., Gemei M., Talarico C., Cerchia C., Fava A., Lunghini F., Beccari A.R.
Journal of Cheminformatics scimago Q1 wos Q1 Open Access
2021-07-23 citations by CoLab: 16 PDF Abstract  
The scaffold representation is widely employed to classify bioactive compounds on the basis of common core structures or correlate compound classes with specific biological activities. In this paper, we present a novel approach called “Molecular Anatomy” as a flexible and unbiased molecular scaffold-based metric to cluster large sets of compounds. We introduce a set of nine molecular representations at different abstraction levels, combined with fragmentation rules, to define a multi-dimensional network of hierarchically interconnected molecular frameworks. We demonstrate that the introduction of a flexible scaffold definition and multiple pruning rules is an effective method to identify relevant chemical moieties. This approach allows active molecules belonging to different molecular classes to be clustered together, capturing most of the structure-activity information, in particular when libraries containing a huge number of singletons are analyzed. We also propose a procedure to derive a network visualization that allows a full graphical representation of a compound dataset, permitting efficient navigation of the scaffold space and contributing significantly to high-quality SAR analysis. The protocol is freely available as a web interface at https://ma.exscalate.eu.
Jumper J., Evans R., Pritzel A., Green T., Figurnov M., Ronneberger O., Tunyasuvunakool K., Bates R., Žídek A., Potapenko A., Bridgland A., Meyer C., Kohl S.A., Ballard A.J., Cowie A., et. al.
Nature scimago Q1 wos Q1
2021-07-15 citations by CoLab: 27230 Abstract  
Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort [1–4], the structures of around 100,000 unique proteins have been determined [5], but this represents a small fraction of the billions of known protein sequences [6,7]. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence (the structure prediction component of the ‘protein folding problem’ [8]) has been an important open research problem for more than 50 years [9]. Despite recent progress [10–14], existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14) [15], demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.
Tripathi M.K., Nath A., Singh T.P., Ethayathulla A.S., Kaur P.
Molecular Diversity scimago Q2 wos Q2
2021-06-23 citations by CoLab: 68 Abstract  
The accumulation of massive data in the plethora of Cheminformatics databases has made the role of big data and artificial intelligence (AI) indispensable in drug design. This has necessitated the development of newer algorithms and architectures to mine these databases and fulfil the specific needs of various drug discovery processes such as virtual drug screening, de novo molecule design and discovery in this big data era. The development of deep learning neural networks and their variants with the corresponding increase in chemical data has resulted in a paradigm shift in information mining pertaining to the chemical space. The present review summarizes the role of big data and AI techniques currently being implemented to satisfy the ever-increasing research demands in drug discovery pipelines.
Zhu J., Wang J., Wang X., Gao M., Guo B., Gao M., Liu J., Yu Y., Wang L., Kong W., An Y., Liu Z., Sun X., Huang Z., Zhou H., et. al.
Nature Biotechnology scimago Q1 wos Q1
2021-06-17 citations by CoLab: 103 Abstract  
Drug discovery focused on target proteins has been a successful strategy, but many diseases and biological processes lack obvious targets to enable such approaches. Here, to overcome this challenge, we describe a deep learning–based efficacy prediction system (DLEPS) that identifies drug candidates using a change in the gene expression profile in the diseased state as input. DLEPS was trained using chemically induced changes in transcriptional profiles from the L1000 project. We found that the changes in transcriptional profiles for previously unexamined molecules were predicted with a Pearson correlation coefficient of 0.74. We examined three disorders and experimentally tested the top drug candidates in mouse disease models. Validation showed that perillen, chikusetsusaponin IV and trametinib confer disease-relevant impacts against obesity, hyperuricemia and nonalcoholic steatohepatitis, respectively. DLEPS can generate insights into pathogenic mechanisms, and we demonstrate that the MEK–ERK signaling pathway is a target for developing agents against nonalcoholic steatohepatitis. Our findings suggest that DLEPS is an effective tool for drug repurposing and discovery. Drug discovery based on transcriptional profiling does not require knowledge of protein targets.
Chiba S., Lim K., Sheri N., Anwar S., Erkut E., Shah M., Aslesh T., Woo S., Sheikh O., Maruyama R., Takano H., Kunitake K., Duddy W., Okuno Y., Aoki Y., et. al.
Nucleic Acids Research scimago Q1 wos Q1 Open Access
2021-06-09 citations by CoLab: 21 PDF Abstract  
Exon skipping using antisense oligonucleotides (ASOs) has recently proven to be a powerful tool for mRNA splicing modulation. Several exon-skipping ASOs have been approved to treat genetic diseases worldwide. However, a significant challenge is the difficulty in selecting an optimal sequence for exon skipping. The efficacy of ASOs is often unpredictable, because of the numerous factors involved in exon skipping. To address this gap, we have developed a computational method using machine-learning algorithms that factors in many parameters as well as experimental data to design highly effective ASOs for exon skipping. eSkip-Finder (https://eskip-finder.org) is the first web-based resource for helping researchers identify effective exon skipping ASOs. eSkip-Finder features two sections: (i) a predictor of the exon skipping efficacy of novel ASOs and (ii) a database of exon skipping ASOs. The predictor facilitates rapid analysis of a given set of exon/intron sequences and ASO lengths to identify effective ASOs for exon skipping based on a machine learning model trained by experimental data. We confirmed that predictions correlated well with in vitro skipping efficacy of sequences that were not included in the training data. The database enables users to search for ASOs using queries such as gene name, species, and exon number.
Pakhrin S.C., Shrestha B., Adhikari B., KC D.B.
2021-05-24 citations by CoLab: 79 PDF Abstract  
Obtaining an accurate description of protein structure is a fundamental step toward understanding the underpinning of biology. Although recent advances in experimental approaches have greatly enhanced our capabilities to experimentally determine protein structures, the gap between the number of protein sequences and known protein structures is ever increasing. Computational protein structure prediction is one of the ways to fill this gap. Recently, the protein structure prediction field has witnessed a lot of advances due to Deep Learning (DL)-based approaches as evidenced by the success of AlphaFold2 in the most recent Critical Assessment of protein Structure Prediction (CASP14). In this article, we highlight important milestones and progress in the field of protein structure prediction due to DL-based methods as observed in CASP experiments. We describe advances in various steps of the protein structure prediction pipeline, viz. protein contact map prediction, protein distogram prediction, protein real-valued distance prediction, and Quality Assessment/refinement. We also highlight some end-to-end DL-based approaches for protein structure prediction. Additionally, as there have been some recent DL-based advances in protein structure determination using cryo-electron microscopy (cryo-EM), we also highlight some of the important progress in that field. Finally, we provide an outlook and possible future research directions for DL-based approaches in the protein structure prediction arena.
Rajan K., Brinkhaus H.O., Sorokina M., Zielesny A., Steinbeck C.
Journal of Cheminformatics scimago Q1 wos Q1 Open Access
2021-03-08 citations by CoLab: 15 PDF Abstract  
Chemistry looks back at many decades of publications on chemical compounds, their structures and properties, in scientific articles. Liberating this knowledge (semi-)automatically and making it available to the world in open-access databases is a current challenge. Apart from mining textual information, Optical Chemical Structure Recognition (OCSR), the translation of an image of a chemical structure into a machine-readable representation, is part of this workflow. As the OCSR process requires an image containing a chemical structure, there is a need for a publicly available tool that automatically recognizes and segments chemical structure depictions from scientific publications. This is especially important for older documents which are only available as scanned pages. Here, we present DECIMER (Deep lEarning for Chemical IMagE Recognition) Segmentation, the first open-source, deep learning-based tool for automated recognition and segmentation of chemical structures from the scientific literature. The workflow is divided into two main stages. During the detection step, a deep learning model recognizes chemical structure depictions and creates masks which define their positions on the input page. Subsequently, potentially incomplete masks are expanded in a post-processing workflow. The performance of DECIMER Segmentation has been manually evaluated on three sets of publications from different publishers. The approach operates on bitmap images of journal pages to be applicable also to older articles before the introduction of vector images in PDFs. By making the source code and the trained model publicly available, we hope to contribute to the development of comprehensive chemical data extraction workflows. In order to facilitate access to DECIMER Segmentation, we also developed a web application. The web application, available at https://decimer.ai, lets the user upload a PDF file and retrieve the segmented structure depictions.
Mercado R., Rastemo T., Lindelöf E., Klambauer G., Engkvist O., Chen H., Jannik Bjerrum E.
2021-03-02 citations by CoLab: 104 PDF Abstract  
Deep learning methods applied to chemistry can be used to accelerate the discovery of new molecules. This work introduces GraphINVENT, a platform developed for graph-based molecular design using graph neural networks (GNNs). GraphINVENT uses a tiered deep neural network architecture to probabilistically generate new molecules a single bond at a time. All models implemented in GraphINVENT can quickly learn to build molecules resembling the training set molecules without any explicit programming of chemical rules. The models have been benchmarked using the MOSES distribution-based metrics, showing how GraphINVENT models compare well with state-of-the-art generative models. This work compares six different GNN-based generative models in GraphINVENT, and shows that ultimately the gated-graph neural network performs best against the metrics considered here.
Blanchard A.E., Stanley C., Bhowmik D.
Journal of Cheminformatics scimago Q1 wos Q1 Open Access
2021-02-23 citations by CoLab: 47 PDF Abstract  
The process of drug discovery involves a search over the space of all possible chemical compounds. Generative Adversarial Networks (GANs) provide a valuable tool towards exploring chemical space and optimizing known compounds for a desired functionality. Standard approaches to training GANs, however, can result in mode collapse, in which the generator primarily produces samples closely related to a small subset of the training data. In contrast, the search for novel compounds necessitates exploration beyond the original data. Here, we present an approach to training GANs that promotes incremental exploration and limits the impacts of mode collapse using concepts from Genetic Algorithms. In our approach, valid samples from the generator are used to replace samples from the training data. We consider both random and guided selection along with recombination during replacement. By tracking the number of novel compounds produced during training, we show that updates to the training data drastically outperform the traditional approach, increasing potential applications for GANs in drug discovery.
Gudimitla R.B., Babu M.K., Bhimana S., Dasu C.D., Perla S.
2025-02-21 citations by CoLab: 0 Abstract  
The integration of Artificial Intelligence (AI), Machine Learning (ML), and Big Data is transitioning the pharmaceutical industry, specifically in drug discovery, process optimization, and clinical development. This revolution enhances drug target identification through advanced data analytics, empowering the discovery of novel biomarkers and compounds. AI-driven high-throughput and virtual screening methods accelerate the identification of promising drug candidates, while predictive modeling improves the accuracy of drug efficacy predictions. In manufacturing, AI optimizes supply chain management and quality control through real-time monitoring and predictive maintenance, ensuring efficient production processes.
