JETTA was developed to detect alternatively spliced exons between two conditions, for example, between groups of treated and untreated patients in a typical clinical study. The current goal is to provide users with a convenient tool to identify alternatively spliced exons in their clinical data for experimental verification and follow-up studies.
RNA sequencing analysis is under active research, and many algorithms have been developed or are being developed for low-level RNA-Seq analysis, such as base calling, sequence mapping and alignment, transcript assembly, and expression calculation from RNA-Seq data. For alternative splicing analysis, JETTA currently uses the pre-calculated expression indices of exons and junctions from these existing tools (see below for details of the input files).
For low-level RNA-Seq analysis, please refer to SeqMap, rSeq, and SpliceMap or other tools.
JETTA can be utilized for the analysis of calculated expression indices of exons and junctions from any RNA-Seq platform. The software requires the following input files:
The first three files above contain the expression indices of genes, exons and junctions (optional) calculated from a typical RNA-Seq data set. The expression indices can be calculated with appropriate low-level analysis software (we used SeqMap and rSeq) using an appropriate measure, for example, unlogged RPKM (Wang et al, Nature, 2008). JETTA alternative splicing analysis requires multiple samples for each of the two conditions. For n samples, an expression matrix file should have the following tab-delimited format:
The alternative splicing structure file is an annotation file that defines the gene-exon, gene-junction and exon-junction associations, with the following columns:
Users can define their own alternative splicing structure files according to their annotations of genes, exons and junctions, which are used in the low-level analysis of sequence mapping and alignment. Here are the annotation files we used for RNA-Seq analysis in Xu et al, PNAS, 2011, and the alternative splicing structure file of the GG-H arrays.
JETTA calculates alternative splicing statistics from the given expression matrices and alternative splicing structure files. This function is currently available only in R, but will be merged into the automatic pipeline with the GUI.
# read data
ASS = read.table('./data/asa.structure.txt',header=TRUE,sep='\t');
TC = read.table('./data/expr.tc.txt',header=TRUE,sep='\t',row.names=1);
EXON = read.table('./data/expr.psr.txt',header=TRUE,sep='\t',row.names=1);
JUNC = read.table('./data/expr.junc.txt',header=TRUE,sep='\t',row.names=1);
AS.LIST = read.table('./data/as_candidates_from_ggh.txt')[,1];
TC.ANNOT = read.table('./data/tc.annotation.txt',header=TRUE,sep='\t');
# convert to asa
asa.structure = ASS;
# columns 2:9 hold the expression indices of the 8 samples,
# columns 10:17 hold the corresponding numbers of reads
tc.expr = as.matrix(TC[,2:9]);
tc.nreads = as.matrix(TC[,10:17]);
exon.expr = as.matrix(EXON[,2:9]);
exon.nreads = as.matrix(EXON[,10:17]);
junc.expr = as.matrix(JUNC[,2:9]);
junc.nreads = as.matrix(JUNC[,10:17]);
# class labels of the 8 samples: -1 for the first condition, 1 for the second
asa.class = c(-1,-1,-1,-1,1,1,1,1);
ASA = jetta.rnaseq.convert_to_asa(ASS,tc.expr,tc.nreads,exon.expr,exon.nreads,junc.expr,junc.nreads,asa.class);
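The column indices in the script above imply a particular layout of the expression tables once they are read with row.names=1: one leading column that the script skips, then one column of expression indices per sample, then one column of read counts per sample. The toy example below is a minimal sketch of that slicing with four samples instead of eight. The column names (annot, expr.s1, nreads.s1, ...) and all values are made up for illustration and are not part of JETTA.

```r
# Toy gene-level table mimicking the layout assumed above (hypothetical names and values).
# The row names play the role of the IDs obtained with read.table(..., row.names=1).
toy = data.frame(
  annot     = c('chr1', 'chr2', 'chr3'),   # leading column that the slicing skips
  expr.s1   = c(1.5, 0.0, 12.3),
  expr.s2   = c(1.7, 0.1, 11.8),
  expr.s3   = c(3.2, 0.0, 12.1),
  expr.s4   = c(3.0, 0.2, 12.6),
  nreads.s1 = c(15, 0, 130),
  nreads.s2 = c(18, 1, 121),
  nreads.s3 = c(33, 0, 127),
  nreads.s4 = c(31, 2, 135),
  row.names = c('geneA', 'geneB', 'geneC'),
  stringsAsFactors = FALSE
);

n = 4;                                              # number of samples in the toy table
toy.expr   = as.matrix(toy[, 2:(1 + n)]);           # expression indices, one column per sample
toy.nreads = as.matrix(toy[, (2 + n):(1 + 2 * n)]); # read counts, one column per sample

# class labels: -1 for the first condition, 1 for the second, in column order
toy.class = c(rep(-1, n / 2), rep(1, n / 2));

# the class vector must have one entry per sample column
stopifnot(ncol(toy.expr) == length(toy.class),
          ncol(toy.nreads) == length(toy.class));
```

A check like the final stopifnot can catch an off-by-one in the column ranges before the matrices are passed on to the conversion step.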
Using the calculated alternative splicing statistics, alternatively spliced exons can be detected using either R or the GUI.
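An example R script:
# select candidates from the statistics object created in the previous step
# (R is case-sensitive; the object was stored as ASA above)
asa.sel = jetta.asa.filtering(ASA,midas.cutoff=0.01,tc.expr.cutoff=1,tc.fc.cutoff=2,ps.fc.cutoff=2);
# unique IDs of the exons (PSRs) flagged as selected
asa.list = unique(asa.sel$asa$PSR[asa.sel$asa$Selected==1]);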
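The data-loading script earlier in this section also reads a list of candidates from the GG-H arrays (AS.LIST). One simple follow-up, sketched below with made-up IDs, is to check how many of the selected exons also appear in such a list. The sketch uses only base R set functions; the IDs and the toy.* variable names are hypothetical and stand in for asa.list and AS.LIST.

```r
# Hypothetical exon (PSR) IDs standing in for asa.list and AS.LIST
toy.asa.list = c('PSR001', 'PSR002', 'PSR005', 'PSR009');  # selected from RNA-Seq
toy.as.list  = c('PSR002', 'PSR003', 'PSR005');            # candidates from the arrays

# exons found by both analyses, and exons selected only from RNA-Seq
shared      = intersect(toy.asa.list, toy.as.list);
rnaseq.only = setdiff(toy.asa.list, toy.as.list);

# fraction of the RNA-Seq selections that are also array candidates
frac.shared = length(shared) / length(toy.asa.list);
```

With real data, both ID vectors would need to use the same identifiers and the same type (for example, both converted with as.character) for the comparison to be meaningful.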
The rest of the analysis is the same for RNA-Seq and microarrays. Please see the GUI tutorial or R-script Examples.
The JETTA RNA-Seq analysis was demonstrated with human liver and muscle tissue samples. Please find the details here.