May 23, 2024
In vivo single-molecule analysis reveals COOLAIR RNA structural diversity – Nature

Statistics

No statistical methods were used to predetermine the sample size. The experiments were not randomized, and investigators were not blinded to allocation during experiments and outcome assessment. Sampling in all cases was performed by collecting materials independently from separate plants.

Plant materials and growth conditions

The genotypes Col FRISF2 (Col FRI) and Var2–6 near-isogenic line have been described previously3,7. FLCWT, FLCWTTEX, FLCmut, FLCmut-r, FLCmutTEX and FLCmut-rTEX were transgenic lines carrying an approximately 12 kb wild-type or mutated FLC genomic fragment. FLCmut was generated by introducing four-nucleotide mutations using site-directed mutagenesis. FLCWTTEX and FLCmutTEX were generated by inserting a NOS terminator fragment in the first exon of COOLAIR in the wild-type or mutated FLC genomic fragment, respectively3. FLCmut-r was generated by inserting a fragment (GAAATAAAGCGAGAACAAATGAAAACCCAGGT) complementary to the big bulge in the H4–H6 region using site-directed mutagenesis. Primers used for the construction are listed in Supplementary Table 1. The fragments were then cloned into SLJ77515 (ref. 26) and transformed into the Arabidopsis flc-2 FRI genotype3 with a floral-dipping method. Transgenic lines with a single insertion that segregated 3:1 for Basta resistance were identified in the T2 generation to generate homozygous T3 lines. T3 homozygous lines with FLCmut in the flc-2 FRI background were crossed with Col FRI (WT) to produce the F1 generation (Extended Data Fig. 8g) or with the flc-2 fri background to obtain FLCmut fri (Extended Data Fig. 8h).

Seeds were surface-sterilized and sown on half-strength Murashige and Skoog medium. The plates were kept at 4 °C for 2–3 days. For warm-grown plants, seedlings were grown in warm conditions (16 h light, 8 h darkness with constant 20 °C) for 10 days. For the cold treatment, the plants were subjected to a two-week treatment at 5 °C (8 h light and 16 h dark conditions) after a 10-day pre-growth period in warm conditions.

(+)SHAPE and (−)SHAPE smStructure-seq library construction

We used the SHAPE reagent NAI for in vivo chemical probing of RNA secondary structure. NAI was prepared as reported previously13. In brief, A. thaliana seedlings were completely covered in 20 ml 1× SHAPE reaction buffer (100 mM KCl, 40 mM HEPES (pH 7.5) and 0.5 mM MgCl2) in a 50-ml Falcon tube. NAI was added to a final concentration of 1 M and the tube was swirled on a shaker (1,000 rpm). This high NAI concentration allows NAI to penetrate plant cells and modify the RNA in vivo. After quenching the reaction with freshly prepared dithiothreitol (DTT), the seedlings were washed with deionized water, immediately frozen in liquid nitrogen and ground into powder. Total RNA was extracted using the hot phenol method4, followed by DNase I treatment in accordance with the manufacturer’s protocol. The control group was prepared using DMSO (labelled as (−)SHAPE), following the same procedure as described above. Then, 2 µg of (+)SHAPE or (−)SHAPE RNA sample was added to a 19-µl buffer system containing 2 µl 0.5 µM RNA–DNA hybrid adaptors (5′-rArGrArUrCrGrGrArArGrArGrCrArCrArCrGrUrCrUrGrArArCrUrCrCrArGrUrCrArC/3SpC3/ and 5′-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTN (N = equimolar A, T, G, C)), 4 µl 5× reaction buffer (2.25 M NaCl, 25 mM MgCl2, 100 mM Tris-HCl, pH 7.5), 2 µl 10× DTT (50 mM; made fresh or from frozen stock) and 1 µl TGIRT-III enzyme (10 µM; InGex). The reaction system was pre-incubated at room temperature for 30 min, then 1 µl of 25 mM dNTPs (an equimolar mixture of dATP, dCTP, dGTP and dTTP; at 25 mM each; RNA-grade) was added. The whole reaction system in the tube was incubated at 60 °C for 120 min. To remove the TGIRT-III enzyme from the template, 1 µl of 5 M NaOH was added and the sample was incubated at 95 °C for 3 min. The sample was cooled down to room temperature and neutralized with 1 µl of 5 M HCl before the clean-up of the cDNAs with a MinElute Reaction Cleanup Kit (QIAGEN, 28204).
To capture class I and class II COOLAIR isoforms along with 18S rRNA, 10-cycle PCR reactions were performed with specific primers (Supplementary Table 1) using KOD Xtreme Hot Start DNA Polymerase (Novagen). The amplified DNA fragments from the eight replicates of the PCR reactions were merged to obtain sufficient DNA. The resulting DNA samples were size-selected using the Solid Phase Reversible Immobilization size-selection system (Beckman Coulter). Two independent biological replicates were generated for both (+)SHAPE and (−)SHAPE smStructure-seq libraries. The purified DNA samples were subjected to PacBio library construction by BGI using a PacBio Sequel 3.0.

smStructure-seq data analysis of COOLAIR isoforms

The raw reads from (+)SHAPE and (−)SHAPE libraries were converted into HiFi reads (circular consensus sequences) using ‘ccs’ (https://github.com/PacificBiosciences/ccs) with parameters ‘--minPasses=3’ in order to achieve around 99.8% predicted accuracy (Q30)14. The HiFi reads were demultiplexed using the demultiplex barcoding algorithm Lima v.1.11.0 (https://github.com/pacificbiosciences/barcoding). The derived HiFi reads were mapped to both COOLAIR references and 18S rRNA (Supplementary Table 1) using BLASR (v.5.3.3)27 with parameters ‘--minMatch 10 -m 5 --hitPolicy leftmost’. Each read was converted into a ‘bit vector’. In brief, each bit vector corresponds to a single read and consists of a series of zeroes (representing matches) and ones (mutations representing mismatches and unambiguously aligned deletions)11. To generate the overall SHAPE reactivity profiles, the mutation rate (MR) at a given nucleotide was calculated as the total number of ones divided by the total number of zeroes and ones at that location. Raw SHAPE reactivities of class II COOLAIR were then generated for each nucleotide using the following equation:

$$R=\frac{{\rm{MR}}_{(+){\rm{SHAPE}}}-{\rm{MR}}_{(-){\rm{SHAPE}}}}{1-{\rm{MR}}_{(-){\rm{SHAPE}}}}$$

where (+)SHAPE corresponds to a NAI-treated sample and (−)SHAPE refers to a DMSO-treated sample. The true-negative rate, 1 − MR(−)SHAPE, represents the specificity at a specific location. The raw SHAPE reactivity (R) mathematically estimates the positive likelihood ratio of SHAPE modification. The raw SHAPE reactivity was normalized to a standard scale that spanned from 0 (no reactivity) to around 1 (high SHAPE reactivity)28 for showing the mutational profiles.
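
The mutation-rate and raw-reactivity calculation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the representation of bit vectors as strings, and the use of '.' for positions a read does not cover are all assumptions made here. The final normalization step (ref. 28) is not reproduced because its details are not given in this section.

```python
def mutation_rates(bit_vectors):
    """Per-position mutation rate: ones / (zeroes + ones).

    Each bit vector is a string over '0' (match), '1' (mutation) and,
    as an assumption of this sketch, '.' for uncovered positions.
    """
    length = len(bit_vectors[0])
    rates = []
    for pos in range(length):
        calls = [bv[pos] for bv in bit_vectors if bv[pos] in "01"]
        rates.append(calls.count("1") / len(calls) if calls else 0.0)
    return rates


def raw_reactivity(mr_plus, mr_minus):
    """R = (MR(+)SHAPE - MR(-)SHAPE) / (1 - MR(-)SHAPE) per nucleotide."""
    reactivities = []
    for plus, minus in zip(mr_plus, mr_minus):
        if minus >= 1.0:
            # Denominator would be zero; no reactivity can be estimated.
            reactivities.append(float("nan"))
        else:
            reactivities.append((plus - minus) / (1.0 - minus))
    return reactivities


# Toy data: four reads per library over four nucleotides (hypothetical).
plus_reads = ["0100", "0110", "0100", "0000"]   # NAI-treated
minus_reads = ["0000", "0100", "0000", "0000"]  # DMSO control

mr_plus = mutation_rates(plus_reads)
mr_minus = mutation_rates(minus_reads)
reactivity = raw_reactivity(mr_plus, mr_minus)
```

In this toy example the second nucleotide has MR(+)SHAPE = 0.75 and MR(−)SHAPE = 0.25, giving R = 0.5/0.75, whereas the third nucleotide has no background mutations and R equals its (+)SHAPE mutation rate.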

Structural analysis of class II COOLAIR isoforms by DaVinci

The whole pipeline of DaVinci is illustrated in Extended Data Fig. 2a. The bitvectors generated in the previous step were transformed into constraint information (‘1’ representing single-stranded nucleotides) for each sequencing read of class II COOLAIR isoforms. The single-stranded constraints were incorporated into the SCFG engine of the DaVinci pipeline. The SCFG engine, including a set of transformation rules for SCFG and a probability distribution of the transformation rules for each non-terminal symbol, was provided by CONTRAfold29 with an extended function utility in CentroidFold30 (--engine CONTRAfold --sampling). The generated RNA structures with constraints derived from individual bitvectors were collected. Because different structures can have the same mutational profile during probing, we used the sampling function with the constraints of each bitvector to capture multiple structures of class II.ii COOLAIR isoforms. All of the collected RNA structures were transformed into dot-bracket strings followed by transformation into RNA structure elements using rnaConvert in the Forgi package31. The digitalized RNA secondary structure elements were extracted to create a numeric matrix and subjected to dimensionality reduction, such as PCA or multidimensional scaling. The dimensionality reduction results were clustered using k-means clustering with the k-means function from the scikit-learn Python package32. The value of k was chosen by visual inspection. The representative structure for each cluster was identified by calculating the most common RNA structure type at each position (that is, the maximum expected accuracy) and was determined by the RNA structure that is at the centre of the cluster and most similar to the most common RNA structure. The base-pair probability was calculated by counting the frequency of all present base pairs in the conformation space.
The positional base-pair probability was derived by \(P_{i}=\sum\limits_{j}^{J}P_{ij}\), where \(P_{ij}\) is the probability of base i being base-paired with base j, over all its potential J pairing partners. The likelihood of single strandedness was calculated by the expression \(1-P_{i}\). In addition, the Shannon entropy was calculated as \(E_{i}=\sum\limits_{j}^{J}-P_{ij}\log_{10}(P_{ij})\).
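
The base-pair statistics described above can be illustrated with a short, self-contained Python sketch. It assumes the conformation space is available as a list of equal-length dot-bracket strings without pseudoknots; the function names are hypothetical and this is not the DaVinci implementation. Pair probabilities are estimated by counting, and the entropy sums only over pairing partners, as in the expression given in the text.

```python
import math
from collections import Counter


def pairs_from_dot_bracket(structure):
    """Return (i, j) index pairs for matched brackets (0-based)."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def pair_probabilities(ensemble):
    """P_ij as the fraction of structures in which i pairs with j."""
    counts = Counter()
    for structure in ensemble:
        for i, j in pairs_from_dot_bracket(structure):
            counts[(i, j)] += 1
            counts[(j, i)] += 1  # store both directions
    total = len(ensemble)
    return {pair: n / total for pair, n in counts.items()}


def positional_profiles(ensemble):
    """Per-position P_i, single-strandedness 1 - P_i and entropy E_i."""
    length = len(ensemble[0])
    p_i = [0.0] * length
    entropy = [0.0] * length
    for (i, _j), p in pair_probabilities(ensemble).items():
        p_i[i] += p
        entropy[i] -= p * math.log10(p)
    unpaired = [1.0 - p for p in p_i]
    return p_i, unpaired, entropy


# Toy conformation space of four structures (hypothetical).
ensemble = ["((..))", "((..))", "(....)", "......"]
p_i, unpaired, entropy = positional_profiles(ensemble)
```

Here the outermost pair is present in three of four structures, so the first base has a pairing probability of 0.75, while the two central bases are never paired and have zero entropy under this definition.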

Structural analysis of HIV-1 RRE, RRE61, cspA and TenA

Probing data for HIV-1 RRE11 were obtained from RRE-invitroDMS_NL43rna.bam (https://codeocean.com/capsule/6175523/tree/v1). Probing data for the cspA 5′ untranslated region33 at 37 °C and 10 °C were obtained from the Sequence Read Archive (accession numbers SRR6123773 and SRR6123774). We performed the RNA structure probing experiments of in vitro folded HIV-1 RRE61 RNAs (3 pmol) containing the stem loops III, IV and V18 as described previously11. The TenA RNAs (3 pmol) were subjected to NAI chemical treatment13,34 in the presence or absence of 1 µM thiamine pyrophosphate (TPP). The NAI-modified RNA samples (TPP-treated and non-TPP-treated RNAs) were mixed at a ratio of 20:80 (vol/vol) or 50:50 (vol/vol) for the library construction. All of the sequencing data were mapped to the respective references as described above. The subsequent bitvectors were generated and subjected to the DaVinci analysis described above, including the creation of the numeric matrix for the digitalized RNA structure elements, dimensionality reduction, k-means clustering and representative structure construction. In silico structural ensemble analysis of RRE wild-type and mutant RRE61 was performed by Boltzmann sampling (10,000 times) using RNAfold35. The subsequent analysis for the in silico structure ensemble is the same as for the DaVinci analysis but includes only the steps of creating the numeric matrix for the digitalized RNA structure elements, dimensionality reduction, k-means clustering and representative structure construction.
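
The representative structure construction step can be illustrated as follows. This is a simplified sketch under stated assumptions, not the DaVinci code: each structure in a cluster is assumed to be already encoded as a string of one-letter structure-element labels per position (the alphabet used here, 's' for stem, 'h' for hairpin loop, 'f' and 't' for unpaired ends, is illustrative), and similarity to the most common structure is measured by Hamming distance, which is a choice made for this example.

```python
from collections import Counter


def consensus_labels(cluster):
    """Most common structure-element label at each position."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*cluster)
    )


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def representative(cluster):
    """Cluster member closest to the per-position consensus.

    The consensus itself need not be a valid structure, so an actual
    member of the cluster is returned instead.
    """
    consensus = consensus_labels(cluster)
    return min(cluster, key=lambda member: hamming(member, consensus))


# Toy cluster of three element strings (hypothetical).
cluster = ["sshhss", "sshhss", "fshhst"]
consensus = consensus_labels(cluster)
best = representative(cluster)
```

Returning a real cluster member rather than the position-wise consensus avoids reporting a label string that corresponds to no foldable structure.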

Total RNA extraction and RT–qPCR for gene expression analysis

Total RNA was extracted as previously described36. Genomic DNA was digested with TURBO DNA-free (Ambion Turbo DNase kit, AM1907) according to the manufacturer’s guidelines before reverse transcription was performed. Reverse transcription was performed with the SuperScript III Reverse Transcriptase (ThermoFisher, 18080093) following the manufacturer’s protocol using gene-specific primers. The standard reference gene UBC (At5g25760) for gene expression was used for normalization. All primers are listed in Supplementary Table 1.

Chromatin-bound RNA measurement assay

Chromatin-bound RNAs were extracted as previously outlined37. In brief, 2 g of warm-grown or cold-grown seedlings were ground into fine powder using a mortar in liquid nitrogen. Then, 1% of the materials (about 200 mg fine powder) was used for total RNA extraction as described above. The nuclei from the remaining material were prepared with Honda buffer in the presence of 50 ng μl−1 tRNA, 20 U ml−1 RNase inhibitor (SUPERase-In; Life Technologies), and 1× cOmplete protease inhibitor (Roche). The nuclei pellet was resuspended in an equal volume of resuspension buffer (50% (vol/vol) glycerol, 0.5 mM EDTA, 1 mM DTT, 100 mM NaCl and 25 mM Tris-HCl pH 7.5) and washed twice with urea wash buffer (300 mM NaCl, 1 M urea, 0.5 mM EDTA, 1 mM DTT, 1% Tween-20 and 25 mM Tris-HCl pH 7.5). Two volumes of wash buffer were added to the resuspended nuclei and vortexed for 1 s. The chromatin was spun down and protein was removed using phenol–chloroform. RNAs from the supernatant were precipitated with isopropanol, dissolved and DNase-treated. The chromatin-bound RNAs were reverse-transcribed with the SuperScript III Reverse Transcriptase (ThermoFisher, 18080093) following the manufacturer’s protocol. A mixture of gene-specific primers (Supplementary Table 1) and EF1alpha (At5g60390.2)37,38 primers was included in the reverse-transcription reaction to estimate how many RNAs were bound to genomic DNA (expressed as (chromatin-bound RNA)/EF1alpha). The total RNAs were also reverse-transcribed with the SuperScript III Reverse Transcriptase (ThermoFisher, 18080093) following the manufacturer’s protocol. A mixture of gene-specific primers (Supplementary Table 1) and PP2A (At1g13320) primers as a control was added to the reverse-transcription reaction to estimate the total expression level of class II (expressed as (total RNA)/PP2A). The chromatin-binding ratio was calculated using the equation:

$${\rm{Chromatin\text{-}binding}}\,{\rm{ratio}}=\frac{({\rm{Chromatin\text{-}bound}}\,{\rm{RNA}})/EF1\alpha}{({\rm{Total}}\,{\rm{RNA}})/PP2A}.$$
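
As a worked illustration of this ratio, the following Python sketch computes it from four relative RT–qPCR quantities. The function name, argument names and numerical values are hypothetical, and how the relative quantities are derived from the raw qPCR signal is not specified here.

```python
def chromatin_binding_ratio(chromatin_rna, ef1alpha, total_rna, pp2a):
    """(chromatin-bound RNA / EF1alpha) / (total RNA / PP2A).

    All four arguments are relative quantities measured by RT-qPCR:
    the target and EF1alpha in the chromatin fraction, and the target
    and PP2A in the total RNA fraction.
    """
    chromatin_normalized = chromatin_rna / ef1alpha
    total_normalized = total_rna / pp2a
    return chromatin_normalized / total_normalized


# Hypothetical relative quantities, for illustration only.
ratio = chromatin_binding_ratio(
    chromatin_rna=2.0, ef1alpha=4.0, total_rna=10.0, pp2a=5.0
)
```

With these made-up inputs the normalized chromatin-bound signal is 0.5 and the normalized total signal is 2.0, so the ratio is 0.25.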

ChIRP–qPCR assay

ChIRP was performed as previously outlined, with some modifications4,39,40. Antisense DNA probes were designed against the distal exon sequence of COOLAIR class II and biotinylated at the 3′ end; probes are listed in Supplementary Table 1. Then, 3 g of warm-grown seedlings were crosslinked in 3% (vol/vol) formaldehyde at room temperature under vacuum. Crosslinking was then quenched with 0.125 M glycine for 5 min. Crosslinked plants were ground into a fine powder and lysed in 50 ml of cell lysis buffer (20 mM Tris-HCl pH 7.5, 250 mM sucrose, 25% glycerol, 20 mM KCl, 2.5 mM MgCl2, 0.1% NP-40 and 5 mM DTT). The lysate was filtered through two layers of Miracloth (Merck, D00172956) and pelleted by centrifugation. The pellets were washed twice with 10 ml of nuclear wash buffer (20 mM Tris-HCl pH 7.5, 2.5 mM MgCl2, 25% glycerol, 0.3% Triton X-100 and 5 mM DTT). The nuclear pellet was then resuspended in nuclear lysis buffer (50 mM Tris-HCl pH 7.5, 10 mM EDTA, 1% SDS, 0.1 mM PMSF and 1 mM DTT) and sonicated using a Bioruptor ultrasonicator (Diagenode). All of the buffers were supplemented with 0.1 U μl−1 RNaseOUT (Life Technologies), 1 mM PMSF and Roche cOmplete tablets to preserve the integrity of any RNA–protein and protein–protein complexes. The following steps were performed as previously described40. For each reaction, 30 μl pre-blocked Streptavidin C1 magnetic beads (Thermo Fisher Scientific, 65001) were used. Then, 20 μl of RNase A/T1 Mix (Thermo Fisher Scientific, EN0551) instead of RNaseOUT was added into the RNase+ reactions (Fig. 4e), just before the hybridization (at 37 °C for 4 h) started; these samples were used as the control for background noise. RNA was eluted and reverse-transcribed using SuperScript IV Reverse Transcriptase (ThermoFisher, 18090050) with gene-specific primers. COOLAIR enrichment and eluted DNA were analysed by RT–qPCR. All primers used for reverse transcription and RT–qPCR are listed in Supplementary Table 1.

Electrophoretic mobility shift assays

Electrophoretic mobility shift assays (EMSAs) were performed as described previously21 using oligonucleotides end-labelled with Cy5 (DNA) or FAM (RNA). Oligonucleotide sequences are shown in Supplementary Table 1. EMSAs were run on home-made 15% polyacrylamide gels with 40 mM Tris-acetate (pH 7.4) and 10 mM MgCl2 at 15 V cm−1. Gel images were taken with a Typhoon FLA 9500 fluorescence reader (GE Healthcare Life Sciences). Sequences for the positive control rDNA enhancer En3-PAPAS were obtained from a previous study21.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.
