Open Access
Volume 3, issue 1, publication number 4

Automated extraction of chemical structure information from digital raster images

Jungkap Park 1, 2
Gus R Rosania 1, 3
Kerby A. Shedden 1, 4
Mandee Nguyen 3
Naesung Lyu 5
Kazuhiro Saitou 1, 2
Publication type: Journal Article
Publication date: 2009-02-05
ISSN: 1752-153X
General Chemistry
Abstract
Background
To search for chemical structures in research articles, diagrams or text representing molecules need to be translated to a standard chemical file format compatible with cheminformatic search engines. Nevertheless, chemical information contained in research articles is often referenced as analog diagrams of chemical structures embedded in digital raster images. To automate analog-to-digital conversion of chemical structure diagrams in scientific research articles, several software systems have been developed. But their algorithmic performance and utility in cheminformatic research have not been investigated.

Results
This paper aims to provide critical reviews for these systems and also report our recent development of ChemReader, a fully automated tool for extracting chemical structure diagrams in research articles and converting them into standard, searchable chemical file formats. Basic algorithms for recognizing lines and letters representing bonds and atoms in chemical structure diagrams can be independently run in sequence from a graphical user interface (and the algorithm parameters can be readily changed) to facilitate additional development specifically tailored to a chemical database annotation scheme. Compared with existing software programs such as OSRA, Kekule, and CLiDE, our results indicate that ChemReader outperforms other software systems on several sets of sample images from diverse sources in terms of the rate of correct outputs and the accuracy on extracting molecular substructure patterns.

Conclusion
The availability of ChemReader as a cheminformatic tool for extracting chemical structure information from digital raster images allows research and development groups to enrich their chemical structure databases by annotating the entries with published research articles. Based on its stable performance and high accuracy, ChemReader may be sufficiently accurate for annotating the chemical database with links to scientific research articles.
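
The abstract mentions algorithms for recognizing the lines that represent bonds, but it does not say which method is used. As a rough illustration only, the sketch below applies a basic Hough transform, a common line-detection technique chosen here as a stand-in, to the foreground pixels of a binarized image. The function name `hough_lines` and the parameters `n_theta` and `min_votes` are hypothetical and not taken from ChemReader.

```python
import math
from collections import Counter


def hough_lines(pixels, n_theta=180, min_votes=5):
    """Find straight lines supported by at least min_votes foreground pixels.

    pixels: iterable of (x, y) coordinates of foreground pixels.
    Each line is parameterized as rho = x*cos(theta) + y*sin(theta).
    Returns a list of (rho, theta_in_degrees, votes), strongest first.
    NOTE: illustrative stand-in, not the method described in the paper.
    """
    votes = Counter()
    for x, y in pixels:
        # Every pixel votes once per angle bin for the line through it.
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, t)] += 1
    step = 180 / n_theta  # angular width of one bin, in degrees
    found = [(rho, t * step, v) for (rho, t), v in votes.items() if v >= min_votes]
    return sorted(found, key=lambda item: -item[2])


# Ten pixels along the horizontal segment y = 3, standing in for one bond line.
bond = [(x, 3) for x in range(10)]
lines = hough_lines(bond, min_votes=10)

# The horizontal line corresponds to rho = 3 at theta = 90 degrees.
print((3, 90.0, 10) in lines)
```

Neighboring angle bins also collect full votes for such a short segment, so a practical detector would likely add peak suppression in the accumulator and then trim the detected lines to finite bond segments.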

Top-30

Journals

Journal of Cheminformatics: 8 publications, 13.33%
Journal of Chemical Information and Modeling: 8 publications, 13.33%
Chemical Science: 3 publications, 5%
Lecture Notes in Computer Science: 2 publications, 3.33%
Journal of Computer-Aided Molecular Design: 2 publications, 3.33%
Methods in Molecular Biology: 2 publications, 3.33%
Chemistry - Methods: 1 publication, 1.67%
Applied Physics Reviews: 1 publication, 1.67%
Journal of Applied Physics: 1 publication, 1.67%
Assay and Drug Development Technologies: 1 publication, 1.67%
Drug Repurposing, Rescue, and Repositioning: 1 publication, 1.67%
Pharmaceutical Patent Analyst: 1 publication, 1.67%
Frontiers in Bioengineering and Biotechnology: 1 publication, 1.67%
BMC Bioinformatics: 1 publication, 1.67%
Proceedings of the National Academy of Sciences India Section B - Biological Sciences: 1 publication, 1.67%
International Journal of Machine Learning and Cybernetics: 1 publication, 1.67%
PLoS ONE: 1 publication, 1.67%
Drug Discovery Today: 1 publication, 1.67%
Procedia Engineering: 1 publication, 1.67%
Computer Methods and Programs in Biomedicine: 1 publication, 1.67%
Molecular Informatics: 1 publication, 1.67%
Wiley Interdisciplinary Reviews: Computational Molecular Science: 1 publication, 1.67%
Topic Detection and Tracking: 1 publication, 1.67%
Advances in Intelligent Systems and Computing: 1 publication, 1.67%
Briefings in Bioinformatics: 1 publication, 1.67%
Annual Review of Physical Chemistry: 1 publication, 1.67%
Communications in Computer and Information Science: 1 publication, 1.67%
AIP Conference Proceedings: 1 publication, 1.67%
Frontiers in Computational Chemistry: 1 publication, 1.67%

Publishers

Springer Nature: 20 publications, 33.33%
American Chemical Society (ACS): 8 publications, 13.33%
Institute of Electrical and Electronics Engineers (IEEE): 7 publications, 11.67%
Association for Computing Machinery (ACM): 4 publications, 6.67%
Wiley: 3 publications, 5%
AIP Publishing: 3 publications, 5%
Elsevier: 3 publications, 5%
Royal Society of Chemistry (RSC): 3 publications, 5%
Mary Ann Liebert: 2 publications, 3.33%
Taylor & Francis: 1 publication, 1.67%
Frontiers Media S.A.: 1 publication, 1.67%
Public Library of Science (PLoS): 1 publication, 1.67%
Oxford University Press: 1 publication, 1.67%
Annual Reviews: 1 publication, 1.67%
Bentham Science Publishers Ltd.: 1 publication, 1.67%
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Cite this
GOST
Park J. et al. Automated extraction of chemical structure information from digital raster images // Chemistry Central Journal. 2009. Vol. 3. No. 1. 4
GOST (all authors)
Park J., Rosania G. R., Shedden K. A., Nguyen M., Lyu N., Saitou K. Automated extraction of chemical structure information from digital raster images // Chemistry Central Journal. 2009. Vol. 3. No. 1. 4
RIS
TY  - JOUR
DO  - 10.1186/1752-153X-3-4
UR  - https://doi.org/10.1186/1752-153X-3-4
TI  - Automated extraction of chemical structure information from digital raster images
T2  - Chemistry Central Journal
AU  - Park, Jungkap
AU  - Rosania, Gus R
AU  - Shedden, Kerby A.
AU  - Nguyen, Mandee
AU  - Lyu, Naesung
AU  - Saitou, Kazuhiro
PY  - 2009
DA  - 2009/02/05
PB  - Springer Nature
IS  - 1
VL  - 3
PMID  - 19196483
SN  - 1752-153X
ER  -
BibTeX
@article{2009_Park,
author = {Jungkap Park and Gus R Rosania and Kerby A. Shedden and Mandee Nguyen and Naesung Lyu and Kazuhiro Saitou},
title = {Automated extraction of chemical structure information from digital raster images},
journal = {Chemistry Central Journal},
year = {2009},
volume = {3},
publisher = {Springer Nature},
month = {feb},
url = {https://doi.org/10.1186/1752-153X-3-4},
number = {1},
pages = {4},
doi = {10.1186/1752-153X-3-4}
}