Nature Reviews Genetics, volume 14, issue 2, pages 125-138

Phenotypic impact of genomic structural variation: insights from and for human disease

Publication type: Journal Article
Publication date: 2013-01-18
scimago Q1
SJR: 14.293
CiteScore: 57.4
Impact factor: 39.1
ISSN: 1471-0056, 1471-0064
PubMed ID: 23329113
Molecular Biology
Genetics
Genetics (clinical)
Abstract
With the increased cataloguing of human structural variants, our understanding of their influence on phenotype is ever improving. Here, the influence of structural variants on phenotypes including disease is discussed, and strategies for further characterization are presented. Genomic structural variants have long been implicated in phenotypic diversity and human disease, but dissecting the mechanisms by which they exert their functional impact has proven elusive. Recently, however, developments in high-throughput DNA sequencing and chromosomal engineering technology have facilitated the analysis of structural variants in human populations and model systems in unprecedented detail. In this Review, we describe how structural variants can affect molecular and cellular processes, leading to complex organismal phenotypes, including human disease. We further present advances in delineating disease-causing elements that are affected by structural variants, and we discuss future directions for research on the functional consequences of structural variants.
Nature scimago Q1 wos Q1
2012-09-04 citations by CoLab: 15521 Abstract  
The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research. This overview of the ENCODE project outlines the data accumulated so far, revealing that 80% of the human genome now has at least one biochemical function assigned to it; the newly identified functional elements should aid the interpretation of results of genome-wide association studies, as many correspond to sites of association with human disease. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription-factor association, chromatin structure and histone modification. In this overview, the Consortium guides the readers through the project itself, the data and their integrated analyses. Eighty per cent of the human genome now has at least one biochemical function assigned to it. 
In addition to expanding our understanding of how gene expression is regulated on a genome-wide scale, the newly identified functional elements should help researchers to interpret the results of genome-wide association studies because many correspond to sites associated with human disease.
Berglund J., Nevalainen E.M., Molin A., Perloski M., André C., Zody M.C., Sharpe T., Hitte C., Lindblad-Toh K., Lohi H., Webster M.T.
Genome Biology scimago Q1 wos Q1 Open Access
2012-08-23 citations by CoLab: 66 Abstract  
Copy number variants (CNVs) account for substantial variation between genomes and are a major source of normal and pathogenic phenotypic differences. The dog is an ideal model to investigate mutational mechanisms that generate CNVs as its genome lacks a functional ortholog of the PRDM9 gene implicated in recombination and CNV formation in humans. Here we comprehensively assay CNVs using high-density array comparative genomic hybridization in 50 dogs from 17 dog breeds and 3 gray wolves. We use a stringent new method to identify a total of 430 high-confidence CNV loci, which range in size from 9 kb to 1.6 Mb and span 26.4 Mb, or 1.08%, of the assayed dog genome, overlapping 413 annotated genes. Of CNVs observed in each breed, 98% are also observed in multiple breeds. CNVs predicted to disrupt gene function are significantly less common than expected by chance. We identify a significant overrepresentation of peaks of GC content, previously shown to be enriched in dog recombination hotspots, in the vicinity of CNV breakpoints. A number of the CNVs identified by this study are candidates for generating breed-specific phenotypes. Purifying selection seems to be a major factor shaping structural variation in the dog genome, suggesting that many CNVs are deleterious. Localized peaks of GC content appear to be novel sites of CNV formation in the dog genome by non-allelic homologous recombination, potentially activated by the loss of PRDM9. These sequence features may have driven genome instability and chromosomal rearrangements throughout canid evolution.
Kloosterman W., Tavakoli-Yaraki M., van Roosmalen M., van Binsbergen E., Renkens I., Duran K., Ballarati L., Vergult S., Giardino D., Hansson K., Ruivenkamp C.L., Jager M., van Haeringen A., Ippel E., Haaf T., et al.
Cell Reports scimago Q1 wos Q1 Open Access
2012-06-15 citations by CoLab: 175
Devlin B., Scherer S.W.
2012-06-01 citations by CoLab: 398 Abstract  
Autism spectrum disorder (ASD) is characterized by impairments in reciprocal social interaction and communication, and by restricted and repetitive behaviors. Family studies indicate a significant genetic basis for ASD susceptibility, and genomic scanning is beginning to elucidate the underlying genetic architecture. Some 5-15% of individuals with ASD have an identifiable genetic etiology corresponding to known chromosomal rearrangements or single gene disorders. Rare (
Golzio C., Willer J., Talkowski M.E., Oh E.C., Taniguchi Y., Jacquemont S., Reymond A., Sun M., Sawa A., Gusella J.F., Kamiya A., Beckmann J.S., Katsanis N.
Nature scimago Q1 wos Q1
2012-05-15 citations by CoLab: 361 Abstract  
Overexpression of all 29 human transcripts of a region of the 16p11.2 chromosome in zebrafish embryos identifies KCTD13 as the message inducing the microcephaly phenotype associated with 16p11.2 duplication, whereas its suppression yields the macrocephalic phenotype associated with the reciprocal deletion, suggesting that KCTD13 is a major driver for the neurodevelopmental phenotypes associated with the 16p11.2 copy number variants. Copy number variants (CNVs) make an important contribution to genetic disorders, and some CNVs have been shown to have reciprocal phenotypic effects. For instance, duplication of chromosomal region 16p11.2 has been linked to autism, schizophrenia and microcephaly, and reciprocal deletion to autism, obesity and macrocephaly. By manipulating levels of expression (in pairwise combination) of zebrafish orthologues in this genomic interval, Nicholas Katsanis and colleagues identified KCTD13 as the locus that can recapitulate the macro- and microcephalic phenotype, which they show is underpinned by a proliferative defect. Together with further human genetic data, these results suggest that KCTD13 is a major driver for the neurodevelopmental phenotypes associated with 16p11.2 duplication/deletion. The approach used here also offers a way of identifying other dosage-sensitive loci. Copy number variants (CNVs) are major contributors to genetic disorders [1]. We have dissected a region of the 16p11.2 chromosome, which encompasses 29 genes, that confers susceptibility to neurocognitive defects when deleted or duplicated [2,3]. Overexpression of each human transcript in zebrafish embryos identified KCTD13 as the sole message capable of inducing the microcephaly phenotype associated with the 16p11.2 duplication [2,3,4,5], whereas suppression of the same locus yielded the macrocephalic phenotype associated with the 16p11.2 deletion [5,6], capturing the mirror phenotypes of humans. 
Analyses of zebrafish and mouse embryos suggest that microcephaly is caused by decreased proliferation of neuronal progenitors with concomitant increase in apoptosis in the developing brain, whereas macrocephaly arises by increased proliferation and no changes in apoptosis. A role for KCTD13 dosage changes is consistent with autism in both a recently reported family with a reduced 16p11.2 deletion and a subject reported here with a complex 16p11.2 rearrangement involving de novo structural alteration of KCTD13. Our data suggest that KCTD13 is a major driver for the neurodevelopmental phenotypes associated with the 16p11.2 CNV, reinforce the idea that one or a small number of transcripts within a CNV can underpin clinical phenotypes, and offer an efficient route to identifying dosage-sensitive loci.
Jaeger E., Leedham S., Lewis A., Segditsas S., Becker M., Cuadrado P.R., Davis H., Kaur K., Heinimann K., Howarth K., East J., Taylor J., Thomas H., Tomlinson I.
Nature Genetics scimago Q1 wos Q1
2012-05-06 citations by CoLab: 216 Abstract  
Ian Tomlinson and colleagues identify a 40-kb duplication upstream of the gene that encodes the BMP antagonist GREM1 in families with hereditary mixed polyposis syndrome. The mutation is associated with increased allele-specific and ectopic expression of GREM1. Hereditary mixed polyposis syndrome (HMPS) is characterized by apparent autosomal dominant inheritance of multiple types of colorectal polyp, with colorectal carcinoma occurring in a high proportion of affected individuals. Here, we use genetic mapping, copy-number analysis, exclusion of mutations by high-throughput sequencing, gene expression analysis and functional assays to show that HMPS is caused by a duplication spanning the 3′ end of the SCG5 gene and a region upstream of the GREM1 locus. This unusual mutation is associated with increased allele-specific GREM1 expression. Whereas GREM1 is expressed in intestinal subepithelial myofibroblasts in controls, GREM1 is predominantly expressed in the epithelium of the large bowel in individuals with HMPS. The HMPS duplication contains predicted enhancer elements; some of these interact with the GREM1 promoter and can drive gene expression in vitro. Increased GREM1 expression is predicted to cause reduced bone morphogenetic protein (BMP) pathway activity, a mechanism that also underlies tumorigenesis in juvenile polyposis of the large bowel.
Koolen D.A., Kramer J.M., Neveling K., Nillesen W.M., Moore-Barton H.L., Elmslie F.V., Toutain A., Amiel J., Malan V., Tsai A.C., Cheung S.W., Gilissen C., Verwiel E.T., Martens S., Feuth T., et al.
Nature Genetics scimago Q1 wos Q1
2012-04-29 citations by CoLab: 198 Abstract  
Bert DeVries and colleagues identify mutations in the chromatin regulator KANSL1 in 17q21.31 microdeletion syndrome. This syndrome is characterized by intellectual disability, hypotonia and distinctive facial features. We show that haploinsufficiency of KANSL1 is sufficient to cause the 17q21.31 microdeletion syndrome, a multisystem disorder characterized by intellectual disability, hypotonia and distinctive facial features. The KANSL1 protein is an evolutionarily conserved regulator of the chromatin modifier KAT8, which influences gene expression through histone H4 lysine 16 (H4K16) acetylation. RNA sequencing studies in cell lines derived from affected individuals and the presence of learning deficits in Drosophila melanogaster mutants suggest a role for KANSL1 in neuronal processes.
Talkowski M., Rosenfeld J., Blumenthal I., Pillalamarri V., Chiang C., Heilbut A., Ernst C., Hanscom C., Rossin E., Lindgren A.M., Pereira S., Ruderfer D., Kirby A., Ripke S., Harris D.J., et al.
Cell scimago Q1 wos Q1
2012-04-19 citations by CoLab: 495 Abstract  
Balanced chromosomal abnormalities (BCAs) represent a relatively untapped reservoir of single-gene disruptions in neurodevelopmental disorders (NDDs). We sequenced BCAs in patients with autism or related NDDs, revealing disruption of 33 loci in four general categories: (1) genes previously associated with abnormal neurodevelopment (e.g., AUTS2, FOXP1, and CDKL5), (2) single-gene contributors to microdeletion syndromes (MBD5, SATB2, EHMT1, and SNURF-SNRPN), (3) novel risk loci (e.g., CHD8, KIRREL3, and ZNF507), and (4) genes associated with later-onset psychiatric disorders (e.g., TCF4, ZNF804A, PDE10A, GRIN2B, and ANK3). We also discovered among neurodevelopmental cases a profoundly increased burden of copy-number variants from these 33 loci and a significant enrichment of polygenic risk alleles from genome-wide association studies of autism and schizophrenia. Our findings suggest a polygenic risk model of autism and reveal that some neurodevelopmental genes are sensitive to perturbation by multiple mutational mechanisms, leading to variable phenotypic outcomes that manifest at different life stages.
Kou Y., Betancur C., Xu H., Buxbaum J.D., Ma'ayan A.
2012-04-12 citations by CoLab: 30 Abstract  
Autism spectrum disorders (ASD) are a group of related neurodevelopmental disorders with significant combined prevalence (∼1%) and high heritability. Dozens of individually rare genes and loci associated with high risk for ASD have been identified, which overlap extensively with genes for intellectual disability (ID). However, studies indicate that there may be hundreds of genes that remain to be identified. The advent of inexpensive massively parallel nucleotide sequencing can reveal the genetic underpinnings of heritable complex diseases, including ASD and ID. However, whole exome sequencing (WES) and whole genome sequencing (WGS) provide an embarrassment of riches, where many candidate variants emerge. It has been argued that genetic variation for ASD and ID will cluster in genes involved in distinct pathways and protein complexes. For this reason, computational methods that prioritize candidate genes based on additional functional information such as protein-protein interactions or association with specific canonical or empirical pathways, or other attributes, can be useful. In this study we applied several supervised learning approaches to prioritize ASD or ID disease gene candidates based on curated lists of known ASD and ID disease genes. We implemented two network-based classifiers and one attribute-based classifier to show that we can rank and classify known, and predict new, genes for these neurodevelopmental disorders. We also show that ID and ASD share common pathways that perturb an overlapping synaptic regulatory subnetwork. We also show that features relating to neuronal phenotypes in mouse knockouts can help in classifying neurodevelopmental genes. Our methods can be applied broadly to other diseases, helping to prioritize newly identified genetic variation that emerges from disease gene discovery based on WES and WGS.
Vazquez-Mena O., Medina-Martinez I., Juárez-Torres E., Barrón V., Espinosa A., Villegas-Sepulveda N., Gómez-Laguna L., Nieto-Martínez K., Orozco L., Roman-Basaure E., Muñoz Cortez S., Borges Ibañez M., Venegas-Vega C., Guardado-Estrada M., Rangel-López A., et al.
PLoS ONE scimago Q1 wos Q1 Open Access
2012-03-07 citations by CoLab: 41 PDF Abstract  
Several copy number-altered regions (CNAs) have been identified in the genome of cervical cancer, notably, amplifications of 3q and 5p. However, the contribution of copy-number alterations to cervical carcinogenesis is unresolved because genome-wide there exists a lack of correlation between copy-number alterations and gene expression. In this study, we investigated whether CNAs in the cell lines CaLo, CaSki, HeLa, and SiHa were associated with changes in gene expression. On average, 19.2% of the cell-line genomes had CNAs. However, only 2.4% comprised minimal recurrent regions (MRRs) common to all the cell lines. Whereas 3q had limited common gains (13%), 5p was entirely duplicated recurrently. Genome-wide, only 15.6% of genes located in CNAs changed gene expression; in contrast, the rate in MRRs was up to 3 times this. Chr 5p was confirmed to be entirely amplified by FISH; however, a maximum of 33.5% of the explored genes in 5p were deregulated. In 3q, this rate was 13.4%. Even in 3q26, which had 5 MRRs and 38.7% recurrently gained SNPs, the rate was only 15.1%. Interestingly, up to 19% of deregulated genes in 5p and 73% in 3q26 were downregulated, suggesting additional factors were involved in gene repression. The deregulated genes in 3q and 5p occurred in clusters, suggesting local chromatin factors may also influence gene expression. In regions amplified discontinuously, downregulated genes increased steadily as the number of amplified SNPs increased (p
Albers C.A., Paul D.S., Schulze H., Freson K., Stephens J.C., Smethurst P.A., Jolley J.D., Cvejic A., Kostadima M., Bertone P., Breuning M.H., Debili N., Deloukas P., Favier R., Fiedler J., et al.
Nature Genetics scimago Q1 wos Q1
2012-02-26 citations by CoLab: 336 Abstract  
Cornelis Albers, Cedric Ghevaert and colleagues report that a majority of thrombocytopenia with absent radii (TAR) syndrome cases are caused by compound heterozygosity of a null allele and a low-frequency SNP in the regulatory regions of the RBM8A gene, which encodes the Y14 subunit of the exon-junction complex (EJC). TAR syndrome is the first reported human disorder caused by a defect in an EJC component. The exon-junction complex (EJC) performs essential RNA processing tasks [1,2,3,4,5]. Here, we describe the first human disorder, thrombocytopenia with absent radii (TAR) [6], caused by deficiency in one of the four EJC subunits. Compound inheritance of a rare null allele and one of two low-frequency SNPs in the regulatory regions of RBM8A, encoding the Y14 subunit of EJC, causes TAR. We found that this inheritance mechanism explained 53 of 55 cases (P < 5 × 10^−228) of the rare congenital malformation syndrome. Of the 53 cases with this inheritance pattern, 51 carried a submicroscopic deletion of 1q21.1 that has previously been associated with TAR [7], and two carried a truncation or frameshift null mutation in RBM8A. We show that the two regulatory SNPs result in diminished RBM8A transcription in vitro and that Y14 expression is reduced in platelets from individuals with TAR. Our data implicate Y14 insufficiency and, presumably, an EJC defect as the cause of TAR syndrome.
MacArthur D.G., Balasubramanian S., Frankish A., Huang N., Morris J., Walter K., Jostins L., Habegger L., Pickrell J.K., Montgomery S.B., Albers C.A., Zhang Z.D., Conrad D.F., Lunter G., Zheng H., et al.
Science scimago Q1 wos Q1 Open Access
2012-02-17 citations by CoLab: 1031 PDF Abstract  
Defective Gene Detective. Identifying genes that give rise to diseases is one of the major goals of sequencing human genomes. However, putative loss-of-function genes, which are often some of the first identified targets of genome and exome sequencing, have often turned out to be sequencing errors rather than true genetic variants. In order to identify the true scope of loss-of-function genes within the human genome, MacArthur et al. (p. 823; see the Perspective by Quintana-Murci) extensively validated the genomes from the 1000 Genomes Project, as well as an additional European individual, and found that the average person has about 100 true loss-of-function alleles of which approximately 20 have two copies within an individual. Because many known disease-causing genes were identified in “normal” individuals, the process of clinical sequencing needs to reassess how to identify likely causative alleles.
Lettice L., Williamson I., Wiltshire J., Peluso S., Devenney P., Hill A., Essafi A., Hagman J., Mort R., Grimes G., DeAngelis C., Hill R.
Developmental Cell scimago Q1 wos Q1
2012-02-13 citations by CoLab: 137 Abstract  
Sonic hedgehog (Shh) expression during limb development is crucial for specifying the identity and number of digits. The spatial pattern of Shh expression is restricted to a region called the zone of polarizing activity (ZPA), and this expression is controlled from a long distance by the cis-regulator ZRS. Here, members of two groups of ETS transcription factors are shown to act directly at the ZRS mediating a differential effect on Shh, defining its spatial expression pattern. Occupancy at multiple GABPα/ETS1 sites regulates the position of the ZPA boundary, whereas ETV4/ETV5 binding restricts expression outside the ZPA. The ETS gene family is therefore attributed with specifying the boundaries of the classical ZPA. Two point mutations within the ZRS change the profile of ETS binding and activate Shh expression at an ectopic site in the limb bud. These molecular changes define a pathogenetic mechanism that leads to preaxial polydactyly (PPD).
Ahmed M.M., Sturgeon X., Ellison M., Davisson M.T., Gardiner K.J.
Journal of Proteome Research scimago Q1 wos Q1
2012-01-25 citations by CoLab: 36 Abstract  
The Ts65Dn mouse model of Down syndrome (DS) is trisomic for orthologs of 88 of 161 classical protein coding genes present on human chromosome 21 (HSA21). Ts65Dn mice display learning and memory impairments and neuroanatomical, electrophysiological, and cellular abnormalities that are relevant to phenotypic features seen in DS; however, little is known about the molecular perturbations underlying the abnormalities. Here we have used reverse phase protein arrays to profile 64 proteins in the cortex, hippocampus, and cerebellum of Ts65Dn mice and littermate controls. Proteins were chosen to sample a variety of pathways and processes and include orthologs of HSA21 proteins and phosphorylation-dependent and -independent forms of non-HSA21 proteins. Protein profiles overall show remarkable stability to the effects of trisomy, with fewer than 30% of proteins altered in any brain region. However, phospho-proteins are less resistant to trisomy than their phospho-independent forms, and Ts65Dn display abnormalities in some key proteins. Importantly, we demonstrate that Ts65Dn mice have lost correlations seen in control mice among levels of functionally related proteins, including components of the MAP kinase pathway and subunits of the NMDA receptor. Loss of normal patterns of correlations may compromise molecular responses to stimulation and underlie deficits in learning and memory.
Balaskas N., Ribeiro A., Panovska J., Dessaud E., Sasai N., Page K., Briscoe J., Ribes V.
Cell scimago Q1 wos Q1
2012-01-20 citations by CoLab: 420 Abstract  
Secreted signals, known as morphogens, provide the positional information that organizes gene expression and cellular differentiation in many developing tissues. In the vertebrate neural tube, Sonic Hedgehog (Shh) acts as a morphogen to control the pattern of neuronal subtype specification. Using an in vivo reporter of Shh signaling, mouse genetics, and systems modeling, we show that a spatially and temporally changing gradient of Shh signaling is interpreted by the regulatory logic of a downstream transcriptional network. The design of the network, which links three transcription factors to Shh signaling, is responsible for differential spatial and temporal gene expression. In addition, the network renders cells insensitive to fluctuations in signaling and confers hysteresis, a memory of the signal. Our findings reveal that morphogen interpretation is an emergent property of the architecture of a transcriptional network that provides robustness and reliability to tissue patterning.
Cheng S., Xie Z., Yu H., Wang C., Yu X., Wang J., Zheng H., Lu J., He X., Chen K., Gao J., Hu Y., Yao B., Lei D., You S., et al.
2025-04-15 citations by CoLab: 0
Landi M., Carluccio A.V., Shah T., Niazi A., Stavolone L., Falquet L., Gisel A., Bongcam-Rudloff E.
BMC Genomics scimago Q1 wos Q2 Open Access
2025-04-10 citations by CoLab: 0 PDF Abstract  
Background: Structural variants (SVs) are critical for plant genomic diversity and phenotypic variation. This study investigates a large, 9.7 Mbp highly repetitive segment on chromosome 12 of TMEB117, a region not previously characterized in cassava (Manihot esculenta Crantz). We aim to explore its presence and variability across multiple cassava landraces, providing insights into its genomic significance and potential implications. Results: We validated the presence of the 9.7 Mbp segment in the TMEB117 genome, distinguishing it from other published cassava genome assemblies. By mapping short-read sequencing data from 16 cassava landraces to TMEB117 chromosome 12, we observed variability in read mapping, suggesting that while all genotypes contain the insertion region, some exhibit missing segments or sequence differences. Further analysis revealed two unique genes associated with deacetylase activity, HDA14 and SRT2, within the insertion. Additionally, the MUDR-Mutator transposable element was significantly overrepresented in this region. Conclusions: This study uncovers a large structural variant in the TMEB117 cassava genome, highlighting its variability among different genotypes. The enrichment of HDA14 and SRT2 genes and the MUDR-Mutator elements within the insertion suggests potential functional significance, though further research is needed to explore this. These findings provide important insights into the role of structural variations in shaping cassava genomic diversity.
Zhong Z., Zheng G., Zhu D., Liu Y., Lin Z., Guan Z., Xiong F., Chen J., Shang X.
2025-04-02 citations by CoLab: 0 PDF Abstract  
Background: Thalassemia is one of the most prevalent monogenic disorders in tropical and subtropical regions, imposing significant familial and social burdens on local populations. It is caused by point mutations or structural variations (SVs) in the α- or β-globin gene clusters. Owing to their complex structure, the full characterization of SVs has long been both a focus and a difficulty in the molecular diagnosis of thalassemia patients. Methods: Peripheral blood samples were collected from a Chinese boy with a β-thalassemia intermedia phenotype and from his family members. Multiplex ligation-dependent probe amplification (MLPA), long-read sequencing (LRS) and Sanger sequencing were used to analyze the variant in this family. Results: A novel large duplication (αααα280) was identified using LRS and validated by Sanger sequencing. Additionally, we conducted a systematic review of known SVs and evaluated the advantages and disadvantages of various methods in analyzing complex SVs. Conclusions: Our study identified a novel SV in the α-globin gene cluster and demonstrated that LRS was a superior approach for detecting novel rare SVs. The appropriate use of LRS significantly improves diagnostic accuracy when conventional methods are not capable of completely identifying complex SVs.
Rao J., Luo H., An D., Liang X., Peng L., Chen F.
BMC Genomics scimago Q1 wos Q2 Open Access
2025-03-25 citations by CoLab: 0 PDF
Nachtigall P.G., Nystrom G.S., Broussard E.M., Wray K.P., Junqueira-de-Azevedo I.L., Parkinson C.L., Margres M.J., Rokyta D.R.
Molecular Biology and Evolution scimago Q1 wos Q1 Open Access
2025-03-18 citations by CoLab: 0 PDF Abstract  
Of all mutational mechanisms contributing to phenotypic variation, structural variants are both among the most capable of causing major effects as well as the most technically challenging to identify. Intraspecific variation in snake venoms is widely reported, and one of the most dramatic patterns described is the parallel evolution of streamlined neurotoxic rattlesnake venoms from hemorrhagic ancestors by means of deletion of snake venom metalloproteinase (SVMP) toxins and recruitment of neurotoxic dimeric phospholipase A2 (PLA2) toxins. While generating a haplotype-resolved, chromosome-level genome assembly for the eastern diamondback rattlesnake (Crotalus adamanteus), we discovered that our genome animal was heterozygous for a ∼225 Kb deletion containing six SVMP genes, paralleling one of the two steps involved in the origin of neurotoxic rattlesnake venoms. Range-wide population-genomic analysis revealed that, although this deletion is rare overall, it is the dominant homozygous genotype near the northwestern periphery of the species’ range, where this species is vulnerable to extirpation. Although major SVMP deletions have been described in at least five other rattlesnake species, C. adamanteus is unique in not additionally gaining neurotoxic PLA2s. Previous work established a superficially complementary north–south gradient in myotoxin (MYO) expression based on copy number variation with high expression in the north and low in the south, yet we found that the SVMP and MYO genotypes vary independently, giving rise to an array of diverse, novel venom phenotypes across the range. Structural variation, therefore, forms the basis for the major axes of geographic venom variation for C. adamanteus.
Poleg T., Hadar N., Heimer G., Dolgin V., Aminov I., Safran A., Agam N., Jean M.M., Freund O., Kaur S., Christodoulou J., Ben-Zeev B., Birk O.S.
npj Genomic Medicine scimago Q1 wos Q1 Open Access
2025-03-13 citations by CoLab: 0 PDF
Yang Q., Sun J., Wang X., Wang J., Liu Q., Ru J., Zhang X., Wang S., Hao R., Bian P., Dai X., Gong M., Zhang Z., Wang A., Bai F., et al.
Nature Communications scimago Q1 wos Q1 Open Access
2025-03-11 citations by CoLab: 0 PDF
Choi S.H., Jurgens S.J., Xiao L., Hill M.C., Haggerty C.M., Sveinbjörnsson G., Morrill V.N., Marston N.A., Weng L., Pirruccello J.P., Arnar D.O., Gudbjartsson D.F., Mantineo H., von Falkenhausen A.S., Natale A., et al.
Nature Genetics scimago Q1 wos Q1
2025-03-06 citations by CoLab: 0
Krishnamurthy N., Krishna D., Sanjana, Rathinasamy J., Kumar A., Francis A.M.
2025-03-01 citations by CoLab: 0
Qiao X., Shi J., Xu H., Liu K., Pu Y., Xue X., Zheng W., Guo Y., Ma H., Wang C., Bitsue H.K., Xu X., Wang S., Zhao J., Guo X., et al.
Communications Biology scimago Q1 wos Q1 Open Access
2025-02-22 citations by CoLab: 0 PDF
Gong J., Sun H., Wang K., Zhao Y., Huang Y., Chen Q., Qiao H., Gao Y., Zhao J., Ling Y., Cao R., Tan J., Wang Q., Ma Y., Li J., et al.
Nature Communications scimago Q1 wos Q1 Open Access
2025-02-10 citations by CoLab: 0 PDF Abstract  
Genomic structural variants (SVs) are a major source of genetic diversity in humans. Here, through long-read sequencing of 945 Han Chinese genomes, we identify 111,288 SVs, including 24.56% unreported variants, many with predicted functional importance. By integrating human population-level phenotypic and multi-omics data as well as two humanized mouse models, we demonstrate the causal roles of two SVs: one SV that emerges at the common ancestor of modern humans, Neanderthals, and Denisovans in GSDMD for bone mineral density and one modern-human-specific SV in WWP2 impacting height, weight, fat, craniofacial phenotypes and immunity. Our results suggest that the GSDMD SV could serve as a rapid and cost-effective biomarker for assessing the risk of cisplatin-induced acute kidney injury. The functional conservation from human to mouse and widespread signals of positive natural selection suggest that both SVs likely influence local adaptation, phenotypic diversity, and disease susceptibility across diverse human populations. Genetic studies of Chinese individuals have been performed, but mostly with short read sequencing, limiting the types of variants that can be identified. Here, the authors perform long read sequencing of 945 Han Chinese individuals, finding structural variants under natural selection and those associated with human traits and evolutionary history.
Duan D., Cheng C., Huang Y., Chung A., Chen P., Chen Y., Hsu J.S., Chen P.
PLoS ONE scimago Q1 wos Q1 Open Access
2025-02-06 citations by CoLab: 0 PDF Abstract  
Structural variants (SVs) have been associated with changes in gene expression, which may contribute to alterations in phenotypes and disease development. However, the precise identification and characterization of SVs remain challenging. While long-read sequencing offers superior accuracy for SV detection, short-read sequencing remains essential due to practical and cost considerations, as well as the need to analyze existing short-read datasets. Numerous algorithms for short-read SV detection exist, but none are universally optimal, each having limitations for specific SV sizes and types. In this study, we evaluated the efficacy of six advanced SV detection algorithms, including the commercial software DRAGEN, using the GIAB v0.6 Tier 1 benchmark and HGSVC2 cell lines. We employed both individual and combination strategies, with systematic assessments of recall, precision, and F1 scores. Our results demonstrate that the union combination approach enhanced detection capabilities, surpassing single algorithms in identifying deletions and insertions, and delivered comparable recall and F1 scores to the commercial software DRAGEN. Interestingly, expanding the number of algorithms from three to five in the combination did not enhance performance, highlighting the efficiency of a well-chosen ensemble over a larger algorithmic pool.
Li K., Qi L., Zhu Y., He M., Xiang Q., Zheng D.
2025-02-01 citations by CoLab: 2
Pinglay S., Lalanne J., Daza R.M., Kottapalli S., Quaisar F., Koeppel J., Garge R.K., Li X., Lee D.S., Shendure J.
Science scimago Q1 wos Q1 Open Access
2025-01-31 citations by CoLab: 1 PDF Abstract  
Studying the functional consequences of structural variants (SVs) in mammalian genomes is challenging because (i) SVs arise much less commonly than single-nucleotide variants or small indels and (ii) methods to generate, map, and characterize SVs in model systems are underdeveloped. To address these challenges, we developed Genome-Shuffle-seq, a method that enables the multiplex generation and mapping of thousands of SVs (deletions, inversions, translocations, and extrachromosomal circles) throughout mammalian genomes. We also demonstrate the co-capture of SV identity with single-cell transcriptomes, facilitating the measurement of SV impact on gene expression. We anticipate that Genome-Shuffle-seq will be broadly useful for the systematic exploration of the functional consequences of SVs on gene expression, the chromatin landscape, and three-dimensional nuclear architecture, while also initiating a path toward a minimal mammalian genome.
Oriowo T.O., Chrysostomakis I., Martin S., Kukowka S., Brown T., Winkler S., Myers E.W., Böhne A., Stange M.
GigaScience scimago Q1 wos Q1 Open Access
2025-01-29 citations by CoLab: 0 PDF Abstract  
Background: In this study, we present an in-depth analysis of the Eurasian minnow (Phoxinus phoxinus) genome, highlighting its genetic diversity, structural variations, and evolutionary adaptations. We generated an annotated haplotype-phased, chromosome-level genome assembly (2n = 50) by integrating high-fidelity (HiFi) long reads and chromosome conformation capture data (Hi-C).
Results: We achieved a haploid size of 940 megabase pairs (Mbp) for haplome 1 and 929 Mbp for haplome 2 with high scaffold N50 values of 36.4 Mb and 36.6 Mb and BUSCO scores of 96.9% and 97.2%, respectively, indicating a highly complete genome assembly. We detected notable heterozygosity (1.43%) and a high repeat content (approximately 54%), primarily consisting of DNA transposons, which contribute to genome rearrangements and variations. We found substantial structural variations within the genome, including insertions, deletions, inversions, and translocations. These variations affect genes enriched in functions such as dephosphorylation, developmental pigmentation, phagocytosis, immunity, and stress response. In the annotation of protein-coding genes, 30,980 messenger RNAs and 23,497 protein-coding genes were identified with a high completeness score, which further underpins the high contiguity of our genome assemblies. We performed a gene family evolution analysis by comparing our proteome to 10 other teleost species, which identified immune system gene families that prioritize histone-based disease prevention over NB-LRR-related-based immune responses. Additionally, demographic analysis indicates historical fluctuations in the effective population size of P. phoxinus, likely correlating with past climatic changes.
Conclusions: This annotated, phased reference genome provides a crucial resource for resolving the taxonomic complexity within the genus Phoxinus and highlights the importance of haplotype-phased assemblies in understanding haplotype diversity in species characterized by high heterozygosity.

Top-30 journals and publishers (bar charts; only the axis ticks survived extraction)
  • We do not take into account publications without a DOI.
  • Statistics recalculated only for publications connected to researchers, organizations and labs registered on the platform.
  • Statistics recalculated weekly.
