Application value of long-read sequencing in full characterization of thalassemia-associated structural variations: identifying a novel large segmental duplication and literature review
Background
Thalassemia is one of the most prevalent monogenic disorders in tropical and subtropical regions, imposing significant familial and social burdens on local populations. It is caused by point mutations or structural variations (SVs) in the α- or β-globin gene clusters. Because of the structural complexity involved, full characterization of SVs has long been both a focus and a challenge in the molecular diagnosis of thalassemia.
Methods
Peripheral blood samples were collected from a Chinese boy with a β-thalassemia intermedia phenotype and from his family members. Multiplex ligation-dependent probe amplification (MLPA), long-read sequencing (LRS), and Sanger sequencing were used to analyze the variant in this family.
Results
A novel large duplication (αααα280) was identified using LRS and validated by Sanger sequencing. Additionally, we conducted a systematic review of known SVs and evaluated the advantages and disadvantages of various methods for analyzing complex SVs.
Conclusions
Our study identified a novel SV in the α-globin gene cluster and demonstrated that LRS is a superior approach for detecting novel, rare SVs. Appropriate use of LRS significantly improves diagnostic accuracy when conventional methods cannot fully characterize complex SVs.