Wellcome Sanger Institute

Short name
Sange
Country, city
United Kingdom, Cambridge
Publications
10 173
Citations
1 213 394
h-index
482
Top-3 journals
Nature Genetics (464 publications)
Nature Communications (453 publications)
Nature (398 publications)
Top-3 organizations
University of Cambridge (3089 publications)
University of Oxford (1368 publications)
Top-3 foreign organizations
Harvard University (767 publications)
University of Helsinki (492 publications)

Most cited in 5 years

Danecek P., Bonfield J.K., Liddle J., Marshall J., Ohan V., Pollard M.O., Whitwham A., Keane T., McCarthy S.A., Davies R.M., Li H.
GigaScience scimago Q1 wos Q1 Open Access
2021-01-29 citations by CoLab: 8521
Background: SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods. Findings: The first version appeared online 12 years ago and has been maintained and further developed ever since, with many new features and improvements added over the years. The SAMtools and BCFtools packages represent a unique collection of tools that have been used in numerous other software projects and countless genomic pipelines. Conclusion: Both SAMtools and BCFtools are freely available on GitHub under the permissive MIT licence, free for both non-commercial and commercial use. Both packages have been installed >1 million times via Bioconda. The source code and documentation are available from https://www.htslib.org.
Karczewski K.J., Francioli L.C., Tiao G., Cummings B.B., Alföldi J., Wang Q., Collins R.L., Laricchia K.M., Ganna A., Birnbaum D.P., Gauthier L.D., Brand H., Solomonson M., Watts N.A., Rhodes D., et al.
Nature scimago Q1 wos Q1
2020-05-27 citations by CoLab: 7342
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases. A catalogue of predicted loss-of-function variants in 125,748 whole-exome and 15,708 whole-genome sequencing datasets from the Genome Aggregation Database (gnomAD) reveals the spectrum of mutational constraints that affect these human protein-coding genes.
Harvey W.T., Carabelli A.M., Jackson B., Gupta R.K., Thomson E.C., Harrison E.M., Ludden C., Reeve R., Rambaut A., Peacock S.J., Robertson D.L.
Nature Reviews Microbiology scimago Q1 wos Q1
2021-06-01 citations by CoLab: 2932
Although most mutations in the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genome are expected to be either deleterious and swiftly purged or relatively neutral, a small proportion will affect functional properties and may alter infectivity, disease severity or interactions with host immunity. The emergence of SARS-CoV-2 in late 2019 was followed by a period of relative evolutionary stasis lasting about 11 months. Since late 2020, however, SARS-CoV-2 evolution has been characterized by the emergence of sets of mutations, in the context of ‘variants of concern’, that impact virus characteristics, including transmissibility and antigenicity, probably in response to the changing immune profile of the human population. There is emerging evidence of reduced neutralization of some SARS-CoV-2 variants by postvaccination serum; however, a greater understanding of correlates of protection is required to evaluate how this may impact vaccine effectiveness. Nonetheless, manufacturers are preparing platforms for a possible update of vaccine sequences, and it is crucial that surveillance of genetic and antigenic changes in the global virus population is done alongside experiments to elucidate the phenotypic impacts of mutations. In this Review, we summarize the literature on mutations of the SARS-CoV-2 spike protein, the primary antigen, focusing on their impacts on antigenicity and contextualizing them in the protein structure, and discuss them in the context of observed mutation frequencies in global sequence datasets. The evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been characterized by the emergence of mutations and so-called variants of concern that impact virus characteristics, including transmissibility and antigenicity. 
In this Review, members of the COVID-19 Genomics UK (COG-UK) Consortium and colleagues summarize mutations of the SARS-CoV-2 spike protein, focusing on their impacts on antigenicity and contextualizing them in the protein structure, and discuss them in the context of observed mutation frequencies in global sequence datasets.
Alexandrov L.B., Kim J., Haradhvala N.J., Huang M.N., Tian Ng A.W., Wu Y., Boot A., Covington K.R., Gordenin D.A., Bergstrom E.N., Islam S.M., Lopez-Bigas N., Klimczak L.J., McPherson J.R., Morganella S., et al.
Nature scimago Q1 wos Q1
2020-02-05 citations by CoLab: 2554
Somatic mutations in cancer genomes are caused by multiple mutational processes, each of which generates a characteristic mutational signature [1]. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [2] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we characterized mutational signatures using 84,729,690 somatic mutations from 4,645 whole-genome and 19,184 exome sequences that encompass most types of cancer. We identified 49 single-base-substitution, 11 doublet-base-substitution, 4 clustered-base-substitution and 17 small insertion-and-deletion signatures. The substantial size of our dataset, compared with previous analyses [3–15], enabled the discovery of new signatures, the separation of overlapping signatures and the decomposition of signatures into components that may represent associated—but distinct—DNA damage, repair and/or replication mechanisms. By estimating the contribution of each signature to the mutational catalogues of individual cancer genomes, we revealed associations of signatures to exogenous or endogenous exposures, as well as to defective DNA-maintenance processes. However, many signatures are of unknown cause. This analysis provides a systematic perspective on the repertoire of mutational processes that contribute to the development of human cancer. The characterization of 4,645 whole-genome and 19,184 exome sequences, covering most types of cancer, identifies 81 single-base substitution, doublet-base substitution and small-insertion-and-deletion mutational signatures, providing a systematic overview of the mutational processes that contribute to cancer development.
Efremova M., Vento-Tormo M., Teichmann S.A., Vento-Tormo R.
Nature Protocols scimago Q1 wos Q1
2020-02-26 citations by CoLab: 2333
Cell–cell communication mediated by ligand–receptor complexes is critical to coordinating diverse biological processes, such as development, differentiation and inflammation. To investigate how the context-dependent crosstalk of different cell types enables physiological processes to proceed, we developed CellPhoneDB, a novel repository of ligands, receptors and their interactions. In contrast to other repositories, our database takes into account the subunit architecture of both ligands and receptors, representing heteromeric complexes accurately. We integrated our resource with a statistical framework that predicts enriched cellular interactions between two cell types from single-cell transcriptomics data. Here, we outline the structure and content of our repository, provide procedures for inferring cell–cell communication networks from single-cell RNA sequencing data and present a practical step-by-step guide to help implement the protocol. CellPhoneDB v.2.0 is an updated version of our resource that incorporates additional functionalities to enable users to introduce new interacting molecules and reduces the time and resources needed to interrogate large datasets. CellPhoneDB v.2.0 is publicly available, both as code and as a user-friendly web interface; it can be used by both experts and researchers with little experience in computational genomics. In our protocol, we demonstrate how to evaluate meaningful biological interactions with CellPhoneDB v.2.0 using published datasets. This protocol typically takes ~2 h to complete, from installation to statistical analysis and visualization, for a dataset of ~10 GB, 10,000 cells and 19 cell types, and using five threads. CellPhoneDB combines an interactive database and a statistical framework for the exploration of ligand–receptor interactions inferred from single-cell transcriptomics measurements.
Sungnak W., Huang N., Bécavin C., Berg M., Queen R., Litvinukova M., Talavera-López C., Maatz H., Reichart D., Sampaziotis F., Worlock K.B., Yoshida M., Barnes J.L.
Nature Medicine scimago Q1 wos Q1
2020-04-23 citations by CoLab: 2123
We investigated SARS-CoV-2 potential tropism by surveying expression of viral entry-associated genes in single-cell RNA-sequencing data from multiple tissues from healthy human donors. We co-detected these transcripts in specific respiratory, corneal and intestinal epithelial cells, potentially explaining the high efficiency of SARS-CoV-2 transmission. These genes are co-expressed in nasal epithelial cells with genes involved in innate immunity, highlighting the cells’ potential role in initial viral infection, spread and clearance. The study offers a useful resource for further lines of inquiry with valuable clinical samples from COVID-19 patients and we provide our data in a comprehensive, open and user-friendly fashion at www.covid19cellatlas.org. An analysis of single-cell transcriptomics datasets from different tissues shows that ACE2 and TMPRSS2 are co-expressed in respiratory, corneal and intestinal epithelial cell populations, and that respiratory expression of ACE2 is associated with genes involved in innate immunity.
Durbin R., Wang Y., Howe K., Wood J., McCarthy S.A., Guan D.
Bioinformatics scimago Q1 wos Q1 Open Access
2020-01-23 citations by CoLab: 1997
Motivation: Rapid development in long-read sequencing and scaffolding technologies is accelerating the production of reference-quality assemblies for large eukaryotic genomes. However, haplotype divergence in regions of high heterozygosity often results in assemblers creating two copies rather than one copy of a region, leading to breaks in contiguity and compromising downstream steps such as gene annotation. Several tools have been developed to resolve this problem. However, they either focus only on removing contained duplicate regions, also known as haplotigs, or fail to use all the relevant information and hence make errors. Results: Here we present a novel tool, purge_dups, that uses sequence similarity and read depth to automatically identify and remove both haplotigs and heterozygous overlaps. In comparison with current tools, we demonstrate that purge_dups can reduce heterozygous duplication and increase assembly continuity while maintaining completeness of the primary assembly. Moreover, purge_dups is fully automatic and can easily be integrated into assembly pipelines. Availability and implementation: The source code is written in C and is available at https://github.com/dfguan/purge_dups. Supplementary information: Supplementary data are available at Bioinformatics online.
Nurk S., Koren S., Rhie A., Rautiainen M., Bzikadze A.V., Mikheenko A., Vollger M.R., Altemose N., Uralsky L., Gershman A., Aganezov S., Hoyt S.J., Diekhans M., Logsdon G.A., Alonge M., et al.
Science scimago Q1 wos Q1 Open Access
2022-04-11 citations by CoLab: 1911
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion–base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
Rhie A., McCarthy S.A., Fedrigo O., Damas J., Formenti G., Koren S., Uliano-Silva M., Chow W., Fungtammasan A., Kim J., Lee C., Ko B.J., Chaisson M., Gedman G.L., Cantin L.J., et al.
Nature scimago Q1 wos Q1
2021-04-28 citations by CoLab: 1900
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1–4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences. The Vertebrate Genome Project has used an optimized pipeline to generate high-quality genome assemblies for sixteen species (representing all major vertebrate classes), which have led to new biological insights.
Challis R., Richards E., Rajan J., Cochrane G., Blaxter M.
G3: Genes, Genomes, Genetics scimago Q1 wos Q3 Open Access
2020-04-01 citations by CoLab: 1568
Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether introduced during sample processing or through co-extraction alongside the target DNA, if insufficient care is taken during the assembly process, the final assembled genome may be a mixture of data from several species. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in downstream analyses by users unaware of underlying problems. We present BlobToolKit, a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies. BlobToolKit can be used to process assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. BlobToolKit can be used during assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility. We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Data Collaboration and are making the results available through a public instance of the Viewer at https://blobtoolkit.genomehubs.org/view. We aim to complete analysis of all publicly available genomes and then maintain currency with the flow of new genomes. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record with links out to allow full exploration in the Viewer.
Ganaie F.A., Beall B.W., Yu J., van der Linden M., McGee L., Satzke C., Manna S., Lo S.W., Bentley S.D., Ravenscroft N., Nahm M.H.
Clinical Microbiology Reviews scimago Q1 wos Q1
2025-03-13 citations by CoLab: 1
Summary: Streptococcus pneumoniae (the “pneumococcus”) is a significant human pathogen. The key determinant of pneumococcal fitness and virulence is its ability to produce a protective polysaccharide (PS) capsule, and anti-capsule antibodies mediate serotype-specific opsonophagocytic killing of bacteria. Notably, immunization with pneumococcal conjugate vaccines (PCVs) has effectively reduced the burden of disease caused by serotypes included in vaccines but has also spurred a relative upsurge in the prevalence of non-vaccine serotypes. Recent advancements in serotyping and bioinformatics surveillance tools coupled with high-resolution analytical techniques have enabled the discovery of numerous new capsule types, thereby providing a fresh perspective on the dynamic pneumococcal landscape. This review offers insights into the current pneumococcal seroepidemiology highlighting important serotype shifts in different global regions in the PCV era. It also comprehensively summarizes newly discovered serotypes from 2007 to 2024, alongside updates on revised chemical structures and the de-novo determinations of structures for previously known serotypes. Furthermore, we spotlight emerging evidence on non-pneumococcal Mitis-group strains that express capsular PS that are serologically and biochemically related to the pneumococcal capsule types. We further discuss the implications of these recent findings on capsule nomenclature, pneumococcal carriage detection, and future PCV design. The review maps out the current status and also outlines the course for future research and vaccine strategies, ensuring a continued effective response to the evolving pneumococcal challenge.
Matentzoglu N., Bello S.M., Stefancsik R., Alghamdi S.M., Anagnostopoulos A.V., Balhoff J.P., Balk M.A., Bradford Y.M., Bridges Y., Callahan T.J., Caufield H., Cuzick A., Carmody L.C., Caron A.R., de Souza V., et al.
Genetics scimago Q1 wos Q2
2025-03-06 citations by CoLab: 0
Phenotypic data are critical for understanding biological mechanisms and consequences of genomic variation, and are pivotal for clinical use cases such as disease diagnostics and treatment development. For over a century, vast quantities of phenotype data have been collected in many different contexts covering a variety of organisms. The emerging field of phenomics focuses on integrating and interpreting these data to inform biological hypotheses. A major impediment in phenomics is the wide range of distinct and disconnected approaches to recording the observable characteristics of an organism. Phenotype data are collected and curated using free text, single terms or combinations of terms, using multiple vocabularies, terminologies, or ontologies. Integrating these heterogeneous and often siloed data enables the application of biological knowledge both within and across species. Existing integration efforts are typically limited to mappings between pairs of terminologies; a generic knowledge representation that captures the full range of cross-species phenomics data is much needed. We have developed the Unified Phenotype Ontology (uPheno) framework, a community effort to provide an integration layer over domain-specific phenotype ontologies, as a single, unified, logical representation. uPheno comprises (1) a system for consistent computational definition of phenotype terms using ontology design patterns, maintained as a community library; (2) a hierarchical vocabulary of species-neutral phenotype terms under which their species-specific counterparts are grouped; and (3) mapping tables between species-specific ontologies. This harmonized representation supports use cases such as cross-species integration of genotype-phenotype associations from different organisms and cross-species informed variant prioritization.
Leong I.U., Cabrera C.P., Cipriani V., Ross P.J., Turner R.M., Stuckey A., Sanghvi S., Pasko D., Moutsianas L., Odhams C.A., Elgar G.S., Chan G., Giess A., Walker S., Foulger R.E., et al.
Journal of Clinical Oncology scimago Q1 wos Q1
2025-02-20 citations by CoLab: 1
Purpose: As part of the 100,000 Genomes Project, we set out to assess the potential viability and clinical impact of reporting genetic variants associated with drug-induced toxicity for patients with cancer recruited for whole-genome sequencing (WGS) as part of a genomic medicine service. Methods: Germline WGS from 76,805 participants was analyzed for pharmacogenetic (PGx) variants in four genes (DPYD, NUDT15, TPMT, UGT1A1) associated with toxicity induced by five drugs used in cancer treatment (capecitabine, fluorouracil, mercaptopurine, thioguanine, irinotecan). Linking genomic data with prescribing and hospital incidence records, a phenome-wide association study (PheWAS) was performed to identify whether phenotypes indicative of adverse drug reactions (ADRs) were enriched in drug-exposed individuals with the relevant PGx variants. In a subset of 7,081 patients with cancer, DPYD variants were reported back to clinicians and outcomes were collected. Results: We identified clinically relevant PGx variants across the four genes in 62.7% of participants in our cohort. Extending this to annual prescription numbers in England for the drugs affected by these PGx variants, approximately 14,540 patients per year could potentially benefit from a reduced dose or alternative drug to reduce the risk of ADRs. Validating PGx associations in a real-world data set, we found a significant association between PGx variants in DPYD and toxicity-related phenotypes in patients treated with capecitabine or fluorouracil. Reported DPYD variants were deemed informative for clinical decision making in a majority of cases. Conclusion: Reporting PGx variants from germline WGS relevant to patients with cancer alongside primary findings related to their cancer can be clinically informative, informing prescribing to reduce the risk of ADRs. Extending the range of actionable variants to those found in patients of non-European ancestry is important and will extend the potential clinical impact.
Zaremba B., Fallahshahroudi A., Schneider C., Schmidt J., Sarropoulos I., Leushkin E., Berki B., Van Poucke E., Jensen P., Senovilla-Ganzo R., Hervas-Sotomayor F., Trost N., Lamanna F., Sepp M., García-Moreno F., et al.
Science scimago Q1 wos Q1 Open Access
2025-02-14 citations by CoLab: 3
Innovations in the pallium likely facilitated the evolution of advanced cognitive abilities in birds. We therefore scrutinized its cellular composition and evolution using cell type atlases from chicken, mouse, and nonavian reptiles. We found that the avian pallium shares most inhibitory neuron types with other amniotes. Whereas excitatory neuron types in amniote hippocampal regions show evolutionary conservation, those in other pallial regions have diverged. Neurons in the avian mesopallium display gene expression profiles akin to the mammalian claustrum and deep cortical layers, while certain nidopallial cell types resemble neurons in the piriform cortex. Lastly, we observed substantial gene expression convergence between the dorsally located hyperpallium and ventrally located nidopallium during late development, suggesting that topological location does not always dictate gene expression programs determining functional properties in the adult avian pallium.
Oberstaller J., Xu S., Naskar D., Zhang M., Wang C., Gibbons J., Pires C.V., Mayho M., Otto T.D., Rayner J.C., Adams J.H.
Science scimago Q1 wos Q1 Open Access
2025-02-07 citations by CoLab: 3
Malaria parasites are highly divergent from model eukaryotes. Large-scale genome engineering methods effective in model organisms are frequently inapplicable, and systematic studies of gene function are few. We generated more than 175,000 transposon insertions in the Plasmodium knowlesi genome, averaging an insertion every 138 base pairs, and used this “supersaturation” mutagenesis to score essentiality for 98% of genes. The density of mutations allowed mapping of putative essential domains within genes, providing a completely new level of genome annotation for any Plasmodium species. Although gene essentiality was largely conserved across P. knowlesi, Plasmodium falciparum, and rodent malaria model Plasmodium berghei, a large number of shared genes are differentially essential, revealing species-specific adaptations. Our results indicated that Plasmodium essential gene evolution was conditionally linked to adaptive rewiring of metabolic networks for different hosts.
McIntyre J., Morrison A., Maitland K., Berger D., Price D.R., Dougan S., Grigoriadis D., Tracey A., Holroyd N., Bull K., Rose Vineer H., Glover M.J., Morgan E.R., Nisbet A.J., McNeilly T.N., et al.
PLoS Pathogens scimago Q1 wos Q1 Open Access
2025-02-06 citations by CoLab: 0
The parasitic nematode Teladorsagia circumcincta is one of the most important pathogens of sheep and goats in temperate climates worldwide and can rapidly evolve resistance to drugs used to control it. To understand the genetics of drug resistance, we have generated a highly contiguous genome assembly for the UK T. circumcincta isolate, MTci2. Assembly using PacBio long-reads and Hi-C long-molecule scaffolding together with manual curation resulted in a 573 Mb assembly (N50 = 84 Mb, total scaffolds = 1,286) with five autosomal and one sex-linked chromosomal-scale scaffolds consistent with its karyotype. The genome resource was further improved via annotation of 22,948 genes, with manual curation of over 3,200 of these, resulting in a robust and near complete resource (96.3% complete protein BUSCOs) to support basic and applied research on this important veterinary pathogen. Genome-wide analyses of drug resistance, combining evidence from three distinct experiments, identified selection around known candidate genes for benzimidazole, levamisole and ivermectin resistance, as well as novel regions associated with ivermectin and moxidectin resistance. These insights into contemporary and historic genetic selection further emphasise the importance of contiguous genome assemblies in interpreting genome-wide genetic variation associated with drug resistance and identifying key loci to prioritise in developing diagnostic markers of anthelmintic resistance to support parasite control.
Gutiérrez-Abril J., Gundem G., Fiala E., Liosis K., Farnoud N., Leongamornlert D., Amallraja A., Arango Ossa J.E., Domenico D., Levine M.F., Medina-Martínez J.S., Stockfisch E., You D., Walsh M.F., Jasinski S., et al.
Blood advances scimago Q1 wos Q1 Open Access
2025-02-05 citations by CoLab: 0
Portal E.A., Farley C., Iannetelli T., Coelho J., Efstratiou A., Bentley S.D., Chalker V.J., Spiller O.B.
Antibiotics scimago Q1 wos Q1 Open Access
2025-02-05 citations by CoLab: 0
Background: Streptococcus agalactiae (Group B Streptococcus, GBS) is a leading cause of neonatal sepsis in high-income countries. While intrapartum antibiotic screening reduces this risk, increasing resistance to macrolides and lincosamides in Europe since the 1990s has limited therapeutic options for penicillin-allergic patients. Reports of reduced beta-lactam susceptibility in GBS further emphasise the need for robust antimicrobial resistance (AMR) surveillance. However, broth microdilution (BMD) methods are unsuitable for large-scale antimicrobial susceptibility testing (AST). Objective: To demonstrate that agar-dilution AST provides equivalent results to broth dilution methods, with superior capacity for high-throughput screening. Methods: Agar-dilution and microdilution AST methods were compared using a panel of 24 characterised susceptible and resistant GBS strains for benzylpenicillin, chloramphenicol, clindamycin, erythromycin, gentamicin, levofloxacin, tetracycline, and vancomycin. Minimum inhibitory concentration (MIC) agreements were evaluated, and resistance profile correlations were assessed using Cohen’s kappa values. Results: Agar-dilution demonstrated >90% agreement with BMD MIC for most antimicrobials, except vancomycin (87.5%), erythromycin (83.33%), and tetracycline (52.78%). Cohen’s kappa values indicated strong agreement (0.88–1.00) for resistance determination. Agar-dilution avoided “trailing growth” issues associated with BMD and facilitated easier detection of non-GBS contaminants. Conclusions: Agar-dilution is a valid method for high-throughput AMR surveillance of retrospective cohorts (96 isolates per plate) and is critical for identifying emerging GBS resistance trends and informing therapeutic guidelines. However, due to the large number of plates required per antimicrobial, it is impractical for routine clinical diagnostics.
Cai Z., Apolinário S., Baião A.R., Pacini C., Sousa M.D., Vinga S., Reddel R.R., Robinson P.J., Garnett M.J., Zhong Q., Gonçalves E.
Nature Communications scimago Q1 wos Q1 Open Access
2025-02-04 citations by CoLab: 0
Al-Ajli F.O., Formenti G., Fedrigo O., Tracey A., Sims Y., Howe K., Al-Karkhi I.M., Althani A.A., Jarvis E.D., Rahman S., Ayub Q.
Scientific Reports scimago Q1 wos Q1 Open Access
2025-02-04 citations by CoLab: 0
The taxonomic classification of a falcon population found in the Mongolian Altai region in Asia has been heavily debated for two centuries and previous studies have been inconclusive, hindering a more informed conservation approach. Here, we generated a chromosome-level gyrfalcon reference genome using the Vertebrate Genomes Project (VGP) assembly pipeline. Using whole genome sequences of 49 falcons from different species and populations, including “Altai” falcons, we analyzed their population structure, admixture patterns, and demographic history. We find that the Altai falcons are genomic mosaics of saker and gyrfalcon ancestries, and carry distinct W and mitochondrial haplotypes that cluster with the lanner falcon. The Altai maternally-inherited haplotypes diverged 422,000 years before present (290,000–550,000 YBP) from the ancestor of sakers and gyrfalcons, both of which, in turn, split 109,000 YBP (70,000–150,000 YBP). The Altai W chromosome has 31 coding variants in 29 genes that may possibly influence important structural, behavioral, and reproductive traits. These findings provide insights into the question of Altai falcons as a candidate distinct species.
Li Y., Lim C., Dismuke T., Malawsky D.S., Oasa S., Bruce Z.C., Offenhäuser C., Baumgartner U., D’Souza R.C., Edwards S.L., French J.D., Ock L.S., Nair S., Sivakumaran H., Harris L., et. al.
Nature Communications scimago Q1 wos Q1 Open Access
2025-02-04 citations by CoLab: 2 PDF Abstract  
OLIG2-expressing tumor stem cells have been shown to drive recurrence in Sonic Hedgehog (SHH)-subgroup medulloblastoma (MB), and patients urgently need specific therapies to target this tumor cell population. Here, we investigate the therapeutic potential of the brain-penetrant, orally bioavailable OLIG2 inhibitor CT-179, using SHH-MB explant organoids, PDX and GEM SHH-MB models. We find that CT-179 disrupts OLIG2 dimerization, phosphorylation and DNA binding and alters tumor cell-cycle kinetics, increasing differentiation and apoptosis. CT-179 prolongs survival in SHH-MB PDX and GEM models and potentiates radiotherapy (RT) in vivo. Single cell transcriptomic studies (scRNA-seq) confirm that CT-179 increases differentiation and implicate Cdk4 up-regulation in maintaining proliferation during treatment. Consistent with CDK4 mediating CT-179 resistance, CT-179 combines effectively with the CDK4/6 inhibitor palbociclib, further prolonging survival in vivo. These data support therapeutic targeting of OLIG2+ tumor stem cells in regimens for SHH-driven MB, to improve response, delay recurrence and ultimately improve MB patient outcomes. Previously, OLIG2-expressing stem cells have been identified as having a role in medulloblastoma recurrence. Here, the authors investigate the effects of targeting this OLIG2+ stem cell population in Sonic Hedgehog (SHH) medulloblastoma using CT-179, an OLIG2 inhibitor.
Kregar L.D., Williams N., Lee J., Mitchell E., Laurenti E., Nangalia J., Campbell P.
Cancer Research scimago Q1 wos Q1
2025-02-01 citations by CoLab: 0 Abstract  
Abstract Ageing is the decline in an organism's function over time, driven by internal and external stresses. Key features of ageing include genomic instability, telomere shortening, epigenetic changes, and loss of proteostasis, and they tend to worsen and compound with age. However, how these molecular changes manifest in the systemic changes seen in ageing is unclear. Somatic mutations contribute to cancer but are thought to have a limited role in ageing, as the genome tolerates rare random errors. Yet, some mutations confer a fitness advantage to the cell, allowing it to overproliferate relative to other cells. The extent to which a clone's founding cell also confers a phenotype that is inherited by the expanding clone is incompletely understood. It would be interesting to see if constant mutation rates coupled with slow but exponential clonal expansion lead to a sizable proportion of cells that also carry a stable phenotype, which could result in tissue ageing at a systemic level. We sequenced whole genomes and methylomes of single human hematopoietic stem cells (HSCs) from several healthy individuals of different ages and patients with hematologic malignancies. Based on the patterns of shared and cell-specific somatic mutations, we built phylogenies of HSCs for each individual. To establish whether loss or gain of methylation, both mono- and bi-allelic, was heritable across cell division, we developed a method that recapitulates this process in a population of related cells. We analysed approximately 25 million CpG sites per individual, and found that methylation is highly heritable at the majority of sites. Moreover, the method allows us to accurately time when methylation changes occurred: each change is assigned to a branch, which represents a common ancestor of the descendant cells, and the branches of a phylogeny can be linked to a developmental period, thus disentangling their evolutionary history.
We observe that methylation changes are remarkably stable: thousands of such changes are acquired before germ layer formation and are stably inherited until old age. Interestingly, the same regions across individuals acquire these pre-gastrulation changes that are maintained indefinitely. Moreover, we found that the rates of methylation change are several-fold higher than the rates of somatic mutations and that methylation changes are stochastic and allele-specific. Lastly, we find that some of these changes precede the neoplasm’s clone, which suggests they might play a role in function. By comparing normal healthy individuals of different ages with cancer patients, we have been able to unravel the process of normal ageing from disease development at an unprecedented scale and granularity. Citation Format: Lori D Kregar, Nicholas Williams, Joe Lee, Emily Mitchell, Elisa Laurenti, Jyoti Nangalia, Peter Campbell. Characteristics of methylation heritability in human somatic cells [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: DNA Methylation, Clonal Hematopoiesis, and Cancer; 2025 Feb 1-4; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2025;85(3 Suppl):Abstract nr PR002.
Bellis K.L., Dissanayake O.M., Harrison E.M., Aggarwal D.
2025-02-01 citations by CoLab: 3 Abstract  
Community-acquired (CA), community-onset methicillin-resistant Staphylococcus aureus (CO-MRSA) infection presents a significant public health challenge, even where MRSA rates are historically lower. Despite successes in reducing hospital-onset MRSA, CO-MRSA rates are increasing globally, and both this trend and the potential risk factors for re-emergence need to be understood.
Brown M.R., Gonzalez de La Rosa P., Blaxter M.
Bioinformatics scimago Q1 wos Q1 Open Access
2025-01-31 citations by CoLab: 6 PDF Abstract  
Abstract Summary: tidk (short for telomere identification toolkit) uses a simple, fast algorithm to scan long DNA reads for the presence of short tandemly repeated DNA in runs, and to aggregate them based on canonical DNA string representation. These are telomeric repeat candidates. Our algorithm is shown to be accurate in genomes for which the telomeric repeat unit is known and is tested across a wide variety of newly assembled genomes to uncover new telomeric repeat units. Tools are provided to identify telomeric repeats de novo, scan genomes for known telomeric repeats, and to visualize telomeric repeats on the assembly. tidk is implemented in Rust and is available as a command line tool which can be compiled using the Rust toolchain or downloaded as a binary from bioconda. Availability: The tidk Rust crate is freely available under the MIT license (https://crates.io/crates/tidk), and the source code is available at https://github.com/tolkit/telomeric-identifier. Supplementary information: Supplementary data are available at Bioinformatics online.
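The idea of aggregating tandem repeats under a canonical string representation can be sketched as follows. This is a Python illustration only, not tidk's Rust implementation. It assumes that "canonical" means the lexicographically smallest string among all rotations of a repeat unit and of its reverse complement; the function names and the `min_copies` parameter are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_unit(unit):
    """Smallest rotation of the unit or of its reverse complement (assumed definition)."""
    candidates = []
    for strand in (unit, reverse_complement(unit)):
        candidates.extend(strand[i:] + strand[:i] for i in range(len(strand)))
    return min(candidates)


def count_tandem_units(seq, k, min_copies=3):
    """Count copies of k-mers repeated in tandem at least min_copies times in a row."""
    counts = Counter()
    i = 0
    while i + k <= len(seq):
        unit = seq[i:i + k]
        copies = 1
        # Extend the run while the next k bases repeat the same unit.
        while seq[i + copies * k:i + (copies + 1) * k] == unit:
            copies += 1
        if copies >= min_copies:
            counts[canonical_unit(unit)] += copies
            i += copies * k
        else:
            i += 1
    return counts


# Two toy reads carrying the same repeat on opposite strands and in different phases.
total = count_tandem_units("TTAGGG" * 4 + "ACGT", 6) + count_tandem_units("CCCTAA" * 3, 6)
print(total)  # Counter({'AACCCT': 7})
```

Under this assumed definition, TTAGGG and CCCTAA collapse to the same key, so runs seen on either strand or starting at any phase are counted together.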
Pinglay S., Lalanne J., Daza R.M., Kottapalli S., Quaisar F., Koeppel J., Garge R.K., Li X., Lee D.S., Shendure J.
Science scimago Q1 wos Q1 Open Access
2025-01-31 citations by CoLab: 1 PDF Abstract  
Studying the functional consequences of structural variants (SVs) in mammalian genomes is challenging because (i) SVs arise much less commonly than single-nucleotide variants or small indels and (ii) methods to generate, map, and characterize SVs in model systems are underdeveloped. To address these challenges, we developed Genome-Shuffle-seq, a method that enables the multiplex generation and mapping of thousands of SVs (deletions, inversions, translocations, and extrachromosomal circles) throughout mammalian genomes. We also demonstrate the co-capture of SV identity with single-cell transcriptomes, facilitating the measurement of SV impact on gene expression. We anticipate that Genome-Shuffle-seq will be broadly useful for the systematic exploration of the functional consequences of SVs on gene expression, the chromatin landscape, and three-dimensional nuclear architecture, while also initiating a path toward a minimal mammalian genome.

Since 1993

Total publications
10173
Total citations
1213394
Citations per publication
119.28
Average publications per year
317.91
Average authors per publication
19.61
h-index
482

Top-30

Fields of science

Field, publications, share of all publications
Genetics, 2621, 25.76%
Molecular Biology, 1644, 16.16%
Multidisciplinary, 1311, 12.89%
General Biochemistry, Genetics and Molecular Biology, 1132, 11.13%
Genetics (clinical), 998, 9.81%
Infectious Diseases, 952, 9.36%
General Medicine, 942, 9.26%
Microbiology, 877, 8.62%
Cell Biology, 808, 7.94%
Biochemistry, 797, 7.83%
Immunology, 707, 6.95%
Cancer Research, 657, 6.46%
Oncology, 506, 4.97%
General Chemistry, 448, 4.4%
General Immunology and Microbiology, 447, 4.39%
Microbiology (medical), 434, 4.27%
General Physics and Astronomy, 434, 4.27%
Ecology, Evolution, Behavior and Systematics, 432, 4.25%
Biotechnology, 422, 4.15%
Hematology, 364, 3.58%
Parasitology, 317, 3.12%
Virology, 313, 3.08%
Molecular Medicine, 288, 2.83%
Computer Science Applications, 279, 2.74%
Immunology and Allergy, 274, 2.69%
Developmental Biology, 231, 2.27%
General Neuroscience, 221, 2.17%
General Agricultural and Biological Sciences, 184, 1.81%
Structural Biology, 182, 1.79%
Computational Theory and Mathematics, 171, 1.68%

Journals


Publishers


With other organizations


With foreign organizations


With other countries

Country, publications, share of all publications
USA, 3748, 36.84%
Germany, 1720, 16.91%
Australia, 1194, 11.74%
Netherlands, 1121, 11.02%
France, 1027, 10.1%
Italy, 863, 8.48%
Canada, 858, 8.43%
Spain, 772, 7.59%
Sweden, 750, 7.37%
Switzerland, 669, 6.58%
Finland, 642, 6.31%
China, 564, 5.54%
Denmark, 556, 5.47%
Belgium, 525, 5.16%
Norway, 460, 4.52%
Japan, 437, 4.3%
Austria, 313, 3.08%
Singapore, 253, 2.49%
Ireland, 237, 2.33%
Estonia, 233, 2.29%
Greece, 228, 2.24%
Republic of Korea, 226, 2.22%
Iceland, 204, 2.01%
South Africa, 204, 2.01%
Brazil, 190, 1.87%
Thailand, 190, 1.87%
India, 176, 1.73%
Saudi Arabia, 173, 1.7%
Portugal, 167, 1.64%
  • We do not take into account publications without a DOI.
  • Statistics recalculated daily.
  • Publications published earlier than 1993 are ignored in the statistics.
  • The horizontal charts show the 30 top positions.
  • Journal quartile values reflect the current rankings.