Open Access
Chemistry - Methods, volume 2, issue 1

Image2SMILES: Transformer‐Based Molecular Optical Recognition Engine

Publication type: Journal Article
Publication date: 2022-01-11
ISSN: 2628-9725
Materials Science (miscellaneous)
Abstract

The rise of deep learning in various scientific and technological areas promotes the development of AI‐based tools for information retrieval. Optical recognition of organic structures is a key part of the automated extraction of chemical information. However, this is a challenging task because of the large variety of representation styles. In this research, we present a Transformer‐based artificial neural network that converts images of organic structures to molecular structures. To train the model, we created a comprehensive data generator that stochastically simulates various drawing styles, functional groups, functional group placeholders (R‐groups), and visual contamination. We demonstrate that the Transformer‐based architecture can gather chemical insights from our generator with almost absolute confidence. This means that, with a Transformer, one can fully concentrate on data simulation to build a good recognition model. A web demo of our optical recognition engine is available online on the Syntelly platform, and the code for dataset generation is available on GitHub.

Citations by journals

Journal of Cheminformatics: 2 publications, 12.5%
Journal of Chemical Information and Modeling: 2 publications, 12.5%
Briefings in Bioinformatics: 1 publication, 6.25%
Molecular Informatics: 1 publication, 6.25%
Lecture Notes in Computer Science: 1 publication, 6.25%
28th International Conference on Intelligent User Interfaces: 1 publication, 6.25%
npj Computational Materials: 1 publication, 6.25%
Nature Communications: 1 publication, 6.25%
Macromolecules: 1 publication, 6.25%
Energy: 1 publication, 6.25%

Citations by publishers

Springer Nature: 5 publications, 31.25%
American Chemical Society (ACS): 3 publications, 18.75%
IEEE: 2 publications, 12.5%
Oxford University Press: 1 publication, 6.25%
Wiley: 1 publication, 6.25%
Association for Computing Machinery (ACM): 1 publication, 6.25%
Elsevier: 1 publication, 6.25%
  • Publications without a DOI are not taken into account.
  • Statistics are recalculated only for publications connected to researchers, organizations, and labs registered on the platform.
  • Statistics are recalculated weekly.
Cite this
GOST
Khokhlov I. et al. Image2SMILES: Transformer‐Based Molecular Optical Recognition Engine // Chemistry - Methods. 2022. Vol. 2. No. 1.
GOST (all authors)
Khokhlov I., Krasnov L., Fedorov M. V., Sosnin S. Image2SMILES: Transformer‐Based Molecular Optical Recognition Engine // Chemistry - Methods. 2022. Vol. 2. No. 1.
RIS
TY  - JOUR
DO  - 10.1002/cmtd.202100069
UR  - https://doi.org/10.1002/cmtd.202100069
TI  - Image2SMILES: Transformer‐Based Molecular Optical Recognition Engine
T2  - Chemistry - Methods
AU  - Khokhlov, Ivan
AU  - Krasnov, Lev
AU  - Fedorov, Maxim V.
AU  - Sosnin, Sergey
PY  - 2022
DA  - 2022/01/11
PB  - Wiley
IS  - 1
VL  - 2
SN  - 2628-9725
ER  -
BibTeX
@article{2022_Khokhlov,
author = {Ivan Khokhlov and Lev Krasnov and Maxim V. Fedorov and Sergey Sosnin},
title = {Image2SMILES: Transformer‐Based Molecular Optical Recognition Engine},
journal = {Chemistry - Methods},
year = {2022},
volume = {2},
publisher = {Wiley},
month = {jan},
url = {https://doi.org/10.1002/cmtd.202100069},
number = {1},
doi = {10.1002/cmtd.202100069}
}