FAQ for rSeq: RNA-Seq Analyzer

Hui Jiang


You've developed a Bayesian approach to estimate confidence intervals for isoform expression estimates in the paper "Statistical Inferences for Isoform Expression in RNA-Seq". Is that algorithm implemented in rSeq?

Yes and no. That work was written in MATLAB, so it is not part of rSeq yet. I am also planning to write an R package that does the job. For the MATLAB code that does Bayesian inference, please follow the instructions here.

How much memory does rSeq need?

Usually 2GB of memory is sufficient. For more than 10M reads, SeqMap may need more than 2GB of memory, which usually requires a 64-bit system. To reduce memory usage, you can split the read file into two or more parts, map each part separately, and then concatenate the mapping results. You can also use another aligner such as Eland, BWA, Bowtie or Bowtie2.

How can I split the read file and combine the results?

Use the Linux command "split" to split the reads and "cat" to concatenate the results.
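If "split" and "cat" are not available, the same idea can be sketched in Python. This is only an illustration, not part of rSeq or SeqMap: the function names, the ".partNNN" file suffix and the default of 4 lines per read (standard FASTQ; use 2 for single-line FASTA) are all assumptions. The point it shows is that each part should contain whole reads, so the number of lines per part must be a multiple of the lines per read. The same applies to "split -l".

```python
from itertools import islice


def split_reads(in_path, out_prefix, reads_per_part, lines_per_read=4):
    """Split a read file into parts without cutting a read in two.

    lines_per_read is an assumption about the input layout:
    4 for standard FASTQ, 2 for single-line FASTA.
    Returns the list of part file names, in order.
    """
    chunk = reads_per_part * lines_per_read
    parts = []
    with open(in_path) as src:
        while True:
            lines = list(islice(src, chunk))
            if not lines:
                break
            # Hypothetical naming scheme: <prefix>.part000, .part001, ...
            part = f"{out_prefix}.part{len(parts):03d}"
            with open(part, "w") as dst:
                dst.writelines(lines)
            parts.append(part)
    return parts


def concatenate(paths, out_path):
    """Append files in the given order, like `cat a b > out`."""
    with open(out_path, "w") as dst:
        for path in paths:
            with open(path) as src:
                for line in src:
                    dst.write(line)
```

Map each part separately, then pass the per-part mapping outputs to concatenate() in the same order as the parts.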

What does SeqMap do with duplicate reads? Does it count them in the output but just leave them out of the alignment?

SeqMap finds all the duplicate reads, maps one of them, and replicates the result for the rest of them. So no worries, all of them are there in the output file. For more questions about SeqMap, please refer to its website.

Are the mapping results in SeqMap output in strict order? I'm planning to split my reads data into several pieces, map them with SeqMap and combine the results as input to rSeq. I'm wondering whether this partition will break the structure in SeqMap output file.

Yes, they are in strict order. You can go ahead splitting the data files.

Can rSeq compute expressions on SeqMap data run with a 3 or more bp mismatch allowance?

Yes, simply run rSeq with option "-n k" where k is the number of mismatches allowed.

I am mapping some reads to a hg18 fasta file and I'm getting this message as SeqMap runs: "......Bad charactor R found when processing transcript chr16_random. Skipped......." What does this mean? Is it skipping the entire chromosome? Or just that bp?

No worries. Only that base is skipped.

I got the following error in running SeqMap:
bad format in line: @SRR002051.1 :8:1:325:773 length=33 AAAGAACATTAAAGCTATATTATAAGCAAAGAT NM
internal error: read file failed

It is because you have spaces in your read tag "@SRR002051.1 :8:1:325:773 length=33". Changing it to "@SRR002051.1" should solve the problem. You should use /eland:3 in the mapping.
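For a large read file, the tags can be shortened with a small script. The following Python sketch is an illustration only and is not part of SeqMap: it assumes the reads are in 4-line FASTQ records (tag, sequence, "+" line, qualities), and the function names are hypothetical. It keeps the part of each tag before the first whitespace, which assumes that this prefix is still unique for every read.

```python
def shorten_tag(tag):
    """Keep only the part of a read tag before the first whitespace."""
    fields = tag.split()
    return fields[0] if fields else tag


def clean_fastq_lines(lines):
    """Yield FASTQ lines with each read tag cut at its first whitespace.

    Assumes 4-line records, so every fourth line (starting with the
    first) is a tag; all other lines are passed through unchanged.
    """
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield shorten_tag(line.rstrip("\n")) + "\n"
        else:
            yield line
```

For example, with open("reads.fq") as src and open("reads.clean.fq", "w") as dst, calling dst.writelines(clean_fastq_lines(src)) writes a cleaned copy; both file names are placeholders.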

SeqMap told me that there is not enough memory, but I actually have enough.

Sometimes SeqMap has trouble detecting the memory that is really available. If you are sure that is the case, you can turn the detection off by using the option "/available_memory:8000" to specify 8G of available memory, as in the example.

What makes rSeq any different than ERANGE, Cufflinks and others?

rSeq is a set of tools for RNA-Seq analysis, including quality control, sequence alignment, gene and isoform expression calculation and so on. rSeq maps the reads to transcript sequences rather than to the genome, as is done in ERANGE, Cufflinks and others. This can help reduce multi-reads, reduce running time and fully exploit splice reads. For isoform expression, rSeq uses the approach described in "Statistical Inferences for Isoform Expression in RNA-Seq", Bioinformatics, 25(8):1026-1032, (2009). For paired-end RNA-Seq analysis, rSeq uses the method described in "Statistical Modeling of RNA-Seq Data", Statistical Science, 26(1):62-83, (2011).

What command will tell me the rSeq version I'm using?

Sorry, no such command yet. To make sure that you are using the latest version, you can download it again from the website.

Does rSeq do de novo isoform discovery?

No, it does not. For de novo splicing event discovery, you can try SpliceMap.

Can I use other mapping software with rSeq?

Yes. Right now SeqMap, Eland, BWA, Bowtie and Bowtie2 have been tested to work with rSeq. In principle, any aligner that outputs either Eland-multi or SAM format will work with rSeq.

Does rSeq output per exon RPKMs?

No, it does not.

Does rSeq do differential expression?

Not yet. It is the next thing on my list.