Open Access
Journal of Cheminformatics, volume 13, issue 1, publication number 61

DECIMER 1.0: deep learning for chemical image recognition using transformers

Publication type: Journal Article
Publication date: 2021-08-17
SCImago quartile: Q1
WoS quartile: Q1
SJR: 1.745
CiteScore: 14.1
Impact factor: 7.1
ISSN: 1758-2946
Physical and Theoretical Chemistry
Computer Science Applications
Library and Information Sciences
Computer Graphics and Computer-Aided Design
Abstract
The amount of data available on chemical structures and their properties has increased steadily over the past decades. However, articles published before the mid-1990s are available only in printed or scanned form. The extraction and storage of data from those articles in a publicly accessible database are desirable, but doing this manually is a slow and error-prone process. To extract chemical structure depictions and convert them into a computer-readable format, Optical Chemical Structure Recognition (OCSR) tools were developed; the best-performing of these tools are mostly rule-based. The DECIMER (Deep lEarning for Chemical ImagE Recognition) project was launched to address the OCSR problem with the latest computational intelligence methods and to provide an automated open-source software solution. Various current deep learning approaches were explored to find the best-fitting solution to the problem. In a preliminary communication, we outlined the prospect of being able to predict SMILES encodings of chemical structure depictions with about 90% accuracy using a dataset of 50–100 million molecules. This article presents the new DECIMER model, a transformer-based network that can predict SMILES with above 96% accuracy from depictions of chemical structures without stereochemical information and above 89% accuracy for depictions with stereochemical information.