A simple method for directional transcriptome sequencing using Illumina technology

  • A simple method for directional transcriptome sequencing using Illumina technology.

    Croucher NJ, Fookes MC, Perkins TT, Turner DJ, Marguerat SB, Keane T, Quail MA, He M, Assefa S, Bähler J, Kingsley RA, Parkhill J, Bentley SD, Dougan G and Thomson NR

    Nucleic acids research 2009;37;22;e148

Abstract

High-throughput sequencing of cDNA has been used to study eukaryotic transcription on a genome-wide scale to single base pair resolution. In order to compensate for the high ribonuclease activity in bacterial cells, we have devised an equivalent technique optimized for studying complete prokaryotic transcriptomes that minimizes the manipulation of the RNA sample. This new approach uses Illumina technology to sequence single-stranded (ss) cDNA, generating information on both the direction and level of transcription throughout the genome. The protocol, and associated data analysis programs, are freely available from http://www.sanger.ac.uk/Projects/Pathogens/Transcriptome/. We have successfully applied this method to the bacterial pathogens Salmonella bongori and Streptococcus pneumoniae and the yeast Schizosaccharomyces pombe. This method enables experimental validation of genetic features predicted in silico and allows the easy identification of novel transcripts throughout the genome. We also show that there is a high correlation between the level of gene expression calculated from ss-cDNA and double-stranded-cDNA sequencing, indicating that ss-cDNA sequencing is both robust and appropriate for use in quantitative studies of transcription. Hence, this simple method should prove a useful tool in aiding genome annotation and gene expression studies in both prokaryotes and eukaryotes.

Publications using this method:

  • A strand-specific RNA-Seq analysis of the transcriptome of the typhoid bacillus Salmonella typhi.

    Perkins TT, Kingsley RA, Fookes MC, Gardner PP, James KD, Yu L, Assefa SA, He M, Croucher NJ, Pickard DJ, Maskell DJ, Parkhill J, Choudhary J, Thomson NR and Dougan G

    PLoS genetics 2009;5;7;e1000569

Raw data associated with this study can be downloaded from here.

Data formats and programs used in the computational pipeline:

The computational pipeline

  1. The Solexa reads should be in FASTQ format and the reference genome in FASTA format. Align the transcriptome reads to the reference genome with SSAHA2, passing the -solexa parameter, e.g.
    ssaha2 -solexa reads.fastq reference.fasta > output.cigar
    
  2. Use the cigar2CoverageStranded script to create the Artemis plots. The script is run by:
    perl -w cigar2CoverageStranded.pl ssaha_cigar_output_file reference_genome_fasta
    

This produces a file containing the plot data, which can be loaded directly into Artemis by selecting the 'Add user plot' option from the 'Graph' menu.
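To illustrate what the coverage step involves, the following Python sketch counts read depth per base separately for each strand and formats the result with one line per base. It is not the cigar2CoverageStranded.pl script, whose internals are not described here. The functions `stranded_coverage` and `format_user_plot` are hypothetical names, and the field layout of the `cigar::` lines, the 1-based inclusive coordinates and the rule for assigning a read to a strand are all assumptions that should be checked against real SSAHA2 output and the library protocol.

```python
from typing import Iterable, List, Tuple


def stranded_coverage(
    cigar_lines: Iterable[str], genome_length: int
) -> Tuple[List[int], List[int]]:
    """Return per-base read depth for the forward and reverse strands."""
    forward = [0] * genome_length
    reverse = [0] * genome_length
    for line in cigar_lines:
        # Assumed line layout (not confirmed by the page text):
        # cigar::<score> <read> <r_start> <r_end> <r_strand>
        #        <ref> <start> <end> <ref_strand> <score> <ops...>
        if not line.startswith("cigar::"):
            continue
        fields = line.split()
        if len(fields) < 9:
            continue
        read_strand, ref_strand = fields[4], fields[8]
        start, end = sorted((int(fields[6]), int(fields[7])))
        # Assumption: matching orientations mean a forward-strand alignment.
        # Depending on the protocol, the transcript strand may be the opposite.
        depth = forward if read_strand == ref_strand else reverse
        # Assumes 1-based inclusive coordinates; gaps inside the
        # alignment are ignored and the whole span is counted as covered.
        for pos in range(max(start, 1), min(end, genome_length) + 1):
            depth[pos - 1] += 1
    return forward, reverse


def format_user_plot(forward: List[int], reverse: List[int]) -> str:
    """One line per base, one whitespace-separated column per strand."""
    return "".join(f"{f} {r}\n" for f, r in zip(forward, reverse))
```

Writing the string returned by `format_user_plot` to a file gives a two-column, one-line-per-base table of the kind Artemis accepts as a user plot. Holding the depth arrays in memory is adequate for bacterial genomes, but a larger genome would probably need a more compact representation.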

To view in Artemis:

  1. Click to launch Artemis. Alternatively download Artemis from the Artemis Home Page.
  2. Load the S. Typhi sequence file.
  3. Download the S. Typhi transcriptome plot.
  4. From the 'Graph' menu, load the transcriptome data as a graph using the 'Add user plot...' option.

Fig 1: RNA-seq data displayed in Artemis. Mapped RNA-seq data are displayed as plots showing sequencing depth for the forward (blue) and reverse (red) strands. The S. bongori genome annotation is also shown. The graphs, from the top downwards, represent the result of sequencing i) undepleted ss-cDNA; ii) depleted ss-cDNA; iii) depleted ss-cDNA with actinomycin D (actD) present in the reverse transcription reaction; iv) ds-cDNA; v) ds-cDNA with actD present in the reverse transcription reaction.

* quick link - http://q.sanger.ac.uk/1mqbsui9