Volume 71, Issue 3, pages 690-705

Identification of Species by Combining Molecular and Morphological Data Using Convolutional Neural Networks

Bing Yang 1
Zhenxin Zhang 2, 3, 4
Cai Qing Yang 1
Ying Wang 1
M. V. Orr 5
Hongbin Wang 6
Aibing Zhang 1
Publication type: Journal Article
Publication date: 2021-09-15
Scimago: Q1
WoS: Q1
SJR: 2.945
CiteScore: 13.1
Impact factor: 5.7
ISSN: 1063-5157, 1076-836X
Genetics
Ecology, Evolution, Behavior and Systematics
Abstract

Integrative taxonomy, which draws on behavior, niche preference, distribution, morphological analysis, and DNA barcoding, is central to modern taxonomy and systematic biology. However, decades of use demonstrate that these methods can face challenges when used in isolation: morphological methods can misidentify specimens because of phenotypic plasticity, and DNA barcoding can give incorrect identifications because of introgression, incomplete lineage sorting, and horizontal gene transfer. Although researchers have advocated the use of integrative taxonomy, few detailed algorithms have been proposed. Here, we develop a convolutional neural network method (morphology-molecule network [MMNet]) that integrates morphological and molecular data for species identification. MMNet worked better than four currently available alternative methods when tested with 10 independent data sets representing varying genetic diversity from different taxa. High accuracies were achieved for all groups, including beetles (98.1% of 123 species), butterflies (98.8% of 24 species), fishes (96.3% of 214 species), and moths (96.4% of 150 total species). Further, MMNet demonstrated a high degree of accuracy (>98%) in four data sets including closely related species from the same genus. The average accuracy on two modest subgenomic (single nucleotide polymorphism) data sets, each comprising eight putative subspecies, is 90%. Additional tests show that the success rate of species identification under this method depends most strongly on the amount of training data and is robust to sequence length and image size. Analyses of the contribution of different data types (image vs. gene) indicate that both morphological and genetic data are important to the model, and that genetic data contribute slightly more.
The approaches developed here serve as a foundation for the future integration of multimodal information for integrative taxonomy, such as image, audio, video, 3D scanning, and biosensor data, to characterize organisms more comprehensively as a basis for improved investigation, monitoring, and conservation of biodiversity. [Convolutional neural network; deep learning; integrative taxonomy; single nucleotide polymorphism; species identification.]
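The abstract does not spell out how MMNet prepares or combines its two inputs, so the following is only a rough sketch of one common way a DNA sequence and an image could be turned into a single numeric input. Everything in it is an assumption made for illustration: the function names (`one_hot_dna`, `normalize_image`, `fuse_features`), the four-channel one-hot encoding, the all-zero row for ambiguous bases, and the simple concatenation step are hypothetical and are not taken from the paper.

```python
# Hypothetical illustration only: the names and the fusion strategy below are
# assumptions, not the MMNet architecture described by the authors.

NUCLEOTIDES = "ACGT"


def one_hot_dna(sequence):
    """Encode a DNA string as a list of 4-element one-hot rows.

    Ambiguous or unknown symbols (for example 'N' or '-') become an all-zero row.
    """
    rows = []
    for base in sequence.upper():
        row = [0.0] * len(NUCLEOTIDES)
        index = NUCLEOTIDES.find(base)
        if index >= 0:
            row[index] = 1.0
        rows.append(row)
    return rows


def normalize_image(pixels, max_value=255):
    """Scale integer grayscale pixel intensities into the range [0, 1]."""
    return [[value / max_value for value in row] for row in pixels]


def fuse_features(sequence, pixels):
    """Flatten both modalities and concatenate them into one input vector."""
    dna_part = [x for row in one_hot_dna(sequence) for x in row]
    image_part = [x for row in normalize_image(pixels) for x in row]
    return dna_part + image_part


# Toy inputs: a 5-base sequence fragment and a 2x2 grayscale image.
toy_sequence = "ACGTN"
toy_image = [[0, 255], [51, 204]]

features = fuse_features(toy_sequence, toy_image)
print(len(features))  # 5 bases * 4 channels + 4 pixels = 24
```

In a real convolutional model each modality would more likely pass through its own convolutional layers before the learned features are merged; the flat concatenation here only shows the idea of presenting both data types to one classifier.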


Top-30

Journals

Molecular Phylogenetics and Evolution: 3 publications, 5.26%
Zoologica Scripta: 2 publications, 3.51%
Methods in Ecology and Evolution: 2 publications, 3.51%
Molecular Ecology Resources: 2 publications, 3.51%
Animals: 1 publication, 1.75%
Agriculture (Switzerland): 1 publication, 1.75%
Plant Diversity: 1 publication, 1.75%
Systematic Entomology: 1 publication, 1.75%
Marine and Freshwater Research: 1 publication, 1.75%
Journal of Environmental and Public Health: 1 publication, 1.75%
Problemy Osobo Opasnykh Infektsii: 1 publication, 1.75%
Insects: 1 publication, 1.75%
National Science Review: 1 publication, 1.75%
Multimedia Tools and Applications: 1 publication, 1.75%
Frontiers in Microbiology: 1 publication, 1.75%
Biochemistry (Moscow) Supplement Series B: Biomedical Chemistry: 1 publication, 1.75%
Water (Switzerland): 1 publication, 1.75%
Weed Science: 1 publication, 1.75%
Diversity: 1 publication, 1.75%
Marine Biology: 1 publication, 1.75%
Trends in Ecology and Evolution: 1 publication, 1.75%
Agronomy: 1 publication, 1.75%
International Journal of Molecular Sciences: 1 publication, 1.75%
Systematic Biology: 1 publication, 1.75%
Ecology and Evolution: 1 publication, 1.75%
IOP Conference Series: Earth and Environmental Science: 1 publication, 1.75%
Parasites and Vectors: 1 publication, 1.75%
Journal of Applied Entomology: 1 publication, 1.75%
Oriental Insects: 1 publication, 1.75%

Publishers

Wiley: 10 publications, 17.54%
MDPI: 9 publications, 15.79%
Elsevier: 9 publications, 15.79%
Oxford University Press: 4 publications, 7.02%
Springer Nature: 4 publications, 7.02%
Cold Spring Harbor Laboratory: 3 publications, 5.26%
Institute of Electrical and Electronics Engineers (IEEE): 3 publications, 5.26%
Taylor & Francis: 2 publications, 3.51%
CSIRO Publishing: 1 publication, 1.75%
Hindawi Limited: 1 publication, 1.75%
Russian Research Anti-Plague Institute Microbe: 1 publication, 1.75%
Frontiers Media S.A.: 1 publication, 1.75%
Pleiades Publishing: 1 publication, 1.75%
Cambridge University Press: 1 publication, 1.75%
Research Square Platform LLC: 1 publication, 1.75%
IOP Publishing: 1 publication, 1.75%
Pensoft Publishers: 1 publication, 1.75%
AIP Publishing: 1 publication, 1.75%
Public Library of Science (PLoS): 1 publication, 1.75%
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Metrics: 57
GOST
Yang B. et al. Identification of Species by Combining Molecular and Morphological Data Using Convolutional Neural Networks // Systematic Biology. 2021. Vol. 71. No. 3. pp. 690-705.
GOST (all authors, up to 50)
Yang B., Zhang Z., Yang C. Q., Wang Y., Orr M. V., Wang H., Zhang A. Identification of Species by Combining Molecular and Morphological Data Using Convolutional Neural Networks // Systematic Biology. 2021. Vol. 71. No. 3. pp. 690-705.
RIS
TY - JOUR
DO - 10.1093/sysbio/syab076
UR - https://doi.org/10.1093/sysbio/syab076
TI - Identification of Species by Combining Molecular and Morphological Data Using Convolutional Neural Networks
T2 - Systematic Biology
AU - Yang, Bing
AU - Zhang, Zhenxin
AU - Yang, Cai Qing
AU - Wang, Ying
AU - Orr, M. V.
AU - Wang, Hongbin
AU - Zhang, Aibing
PY - 2021
DA - 2021/09/15
PB - Oxford University Press
SP - 690-705
IS - 3
VL - 71
PMID - 34524452
SN - 1063-5157
SN - 1076-836X
ER -
BibTeX (up to 50 authors)
@article{2021_Yang,
author = {Bing Yang and Zhenxin Zhang and Cai Qing Yang and Ying Wang and M. V. Orr and Hongbin Wang and Aibing Zhang},
title = {Identification of Species by Combining Molecular and Morphological Data Using Convolutional Neural Networks},
journal = {Systematic Biology},
year = {2021},
volume = {71},
publisher = {Oxford University Press},
month = {sep},
url = {https://doi.org/10.1093/sysbio/syab076},
number = {3},
pages = {690--705},
doi = {10.1093/sysbio/syab076}
}
MLA
Yang, Bing, et al. “Identification of Species by Combining Molecular and Morphological Data Using Convolutional Neural Networks.” Systematic Biology, vol. 71, no. 3, Sep. 2021, pp. 690-705. https://doi.org/10.1093/sysbio/syab076.