Structural Basis for the GGGGCC Repeat RNA Binding to SRSF2 Protein†
Comprehensive Summary
RNA‐protein interactions are crucial for regulating cellular processes such as gene expression, RNA modification and translation. Aberrant RNA‐protein interactions, by contrast, often dysregulate cellular activities and are associated with many human diseases. RNA containing expanded GGGGCC repeats forms secondary structures that sequester various RNA binding proteins (RBPs), leading to the development of amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). However, the structural basis for the binding of GGGGCC repeat RNA to RBPs remains poorly understood. Here, we determine the first solution NMR structure of a natural GGGGCC repeat RNA containing a 2 × 2 GG/GG internal loop, and perform MD simulations and site‐directed mutagenesis to elucidate the mechanism by which GGGGCC repeat RNA binds to SRSF2, a splicing factor and key marker of nuclear speckles. We reveal that residues R47, T51 and R61 in the RNA recognition motif of SRSF2 and the 2 × 2 GG/GG internal loop in the GGGGCC repeat RNA are essential for binding. This work provides a high‐resolution structural basis for understanding how GGGGCC repeat RNA binds to RBPs, and may guide RNA structure‐based drug design.