A simple method for directional transcriptome sequencing using Illumina technology
Nicholas J. Croucher*, Maria C. Fookes*, Timothy T. Perkins*, Daniel J. Turner, Samuel B. Marguerat, Thomas Keane, Michael A. Quail, Miao He, Sammey Assefa, Jürg Bähler, Robert A. Kingsley, Julian Parkhill, Stephen D. Bentley, Gordon Dougan & Nicholas R. Thomson
Published online in Nucleic Acids Research, 8th October, 2009
Abstract
High-throughput sequencing of cDNA has been used to study eukaryotic transcription on a genome-wide scale to single base pair resolution. In order to compensate for the high ribonuclease activity in bacterial cells, we have devised an equivalent technique optimized for studying complete prokaryotic transcriptomes that minimizes the manipulation of the RNA sample. This new approach uses Illumina technology to sequence single-stranded (ss) cDNA, generating information on both the direction and level of transcription throughout the genome. The protocol, and associated data analysis programs, are freely available from http://www.sanger.ac.uk/Projects/Pathogens/Transcriptome/. We have successfully applied this method to the bacterial pathogens Salmonella bongori and Streptococcus pneumoniae and the yeast Schizosaccharomyces pombe. This method enables experimental validation of genetic features predicted in silico and allows the easy identification of novel transcripts throughout the genome. We also show that there is a high correlation between the level of gene expression calculated from ss-cDNA and double-stranded-cDNA sequencing, indicating that ss-cDNA sequencing is both robust and appropriate for use in quantitative studies of transcription. Hence, this simple method should prove a useful tool in aiding genome annotation and gene expression studies in both prokaryotes and eukaryotes.
Publications using this method:
Perkins et al. (2009) A Strand-Specific RNA-Seq Analysis of the Transcriptome of the Typhoid Bacillus Salmonella Typhi. PLoS Genet 5(7): e1000569

Data formats and programs used in the computational pipeline:
1) The Solexa reads should be in fastq format and the reference genome in fasta format. Align the transcriptome reads to the reference genome with SSAHA2, using the -solexa parameter, e.g.
ssaha2 -solexa reads.fastq reference.fasta > output.cigar
2) Use the cigar2CoverageStranded script to create the Artemis plots. The script is run by:
perl -w cigar2CoverageStranded.pl ssaha_cigar_output_file reference_genome_fasta
This will produce a file containing the data for the plot, which can be loaded directly into Artemis (select the 'Add user plot' option from the 'Graph' menu).
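As a rough illustration of the kind of calculation this step involves, the Python sketch below counts read depth per base separately for each strand and writes one line per base with two columns, which is the general shape of an Artemis user plot file. It is not the cigar2CoverageStranded.pl script and may differ from it in detail. The function names stranded_coverage and write_user_plot are hypothetical. The sketch assumes an exonerate-style cigar line (tag, query id, query start, query end, query strand, target id, target start, target end, target strand, score, operations), assumes 1-based inclusive target coordinates, and assumes that a read maps to the reverse strand when its query and target strands differ. Check these assumptions against your own SSAHA2 output before relying on it.

```python
from typing import Iterable, List, Tuple


def stranded_coverage(
    cigar_lines: Iterable[str], genome_length: int
) -> Tuple[List[int], List[int]]:
    """Return per-base read depth for the forward and reverse strands."""
    forward = [0] * genome_length
    reverse = [0] * genome_length
    for line in cigar_lines:
        fields = line.split()
        # Assumed layout: tag, query id, q_start, q_end, q_strand,
        # target id, t_start, t_end, t_strand, score, then operations.
        if len(fields) < 10 or not fields[0].startswith("cigar"):
            continue  # skip headers and anything that is not an alignment
        q_strand, t_strand = fields[4], fields[8]
        # Reverse-strand hits may list the larger coordinate first.
        start, end = sorted((int(fields[6]), int(fields[7])))
        # Assumption: strands that differ mean a reverse-strand match.
        depth = forward if q_strand == t_strand else reverse
        # Assumption: coordinates are 1-based and inclusive.
        for pos in range(max(start, 1), min(end, genome_length) + 1):
            depth[pos - 1] += 1
    return forward, reverse


def write_user_plot(path: str, forward: List[int], reverse: List[int]) -> None:
    """Write one line per base: forward depth, then reverse depth."""
    with open(path, "w") as handle:
        for fwd, rev in zip(forward, reverse):
            handle.write(f"{fwd} {rev}\n")
```

Calling stranded_coverage(open("output.cigar"), genome_length) and passing the two lists to write_user_plot would give a two-column file, one row per reference base. The sketch counts the whole aligned span of each read and ignores the individual cigar operations, so insertions and deletions are not handled separately.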
To view in Artemis:
1) Click to launch Artemis. Alternatively download Artemis from the Artemis Home Page
2) Load the S. Typhi sequence file from here.
3) Download the S. Typhi transcriptome plot from here.
4) Load the transcriptome data as a graph using the Graph -> 'Add user plot...' menu option.
Figure to show how the RNA-seq data appears in Artemis once processed

Fig 1: RNA-seq data displayed in Artemis. Mapped RNA-seq data is displayed as a plot showing sequence depth for the forward (blue) and reverse (red) strands. The S. bongori genome annotation is also shown. The graphs, from the top downwards, represent the result of sequencing i) undepleted ss-cDNA; ii) depleted ss-cDNA; iii) depleted ss-cDNA with actinomycin D (actD) present in the reverse transcription reaction; iv) ds-cDNA; v) ds-cDNA with actD present in the reverse transcription reaction.



