Research on automatic recognition of hand-drawn chemical molecular structures based on deep learning

Hengjie Ouyang 1
Wei Liu 1
Jiajun Tao 1
Yanghong Luo 1
Wanjia Zhang 1
Jiayu Zhou 1
Shuqi Geng 1
Chengpeng Zhang 1
Publication type: Posted Content
Publication date: 2023-08-17
Abstract

Chemical molecular structures are important in academic communication because they allow for a more direct and convenient representation of chemical knowledge. Drawing chemical molecular structures by hand is a common task for chemistry students and researchers. If hand-drawn chemical molecular structures could be converted into machine-readable data forms, such as SMILES codes, computers would be able to process and analyze them, greatly increasing the efficiency of chemical research. Furthermore, with the advancement of information technology in education, automatic marking is becoming increasingly popular. Teachers would benefit greatly from having a machine recognize hand-drawn chemical molecular structures and then determine whether they are drawn correctly. In this study, we investigate chemical molecular structures composed of three elements: C, H, and O. Because there has been little research on hand-drawn chemical molecular structures, the first major task of this paper is to create a dataset. This paper proposes an image synthesis method that quickly generates synthetic images resembling hand-drawn chemical molecular structures, improving dataset acquisition efficiency. In terms of model selection, the hand-drawn chemical structure recognition model designed in this paper employs an EfficientNet + Transformer encoder-decoder architecture, which outperforms other encoder-decoder combinations. Its final recognition accuracy is 96.90%.
