Bayesian inference for isoform-specific gene expression

1) Follow the three steps described in the manual for estimating isoform-specific gene expression. In the end you will get a file named "output.txt.2.b.matlab".

2) Download the MATLAB code for Bayesian inference for isoform-specific gene expression. "compute.m" is the main function. Run it with the file "output.txt.2.b.matlab" to get the final results.

The input format of the MATLAB code (text following // is a comment)

//file header

36280005 25 //#mapped reads, read length

//for each gene

Cacnb1 7 2 //gene name, #exons, #isoforms
NM_145121 NM_031173 //isoform IDs
462 204 684 20 155 628 217 //exon lengths
731 28 1709 0 4 1265 328 //read counts for exons
40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 //junction lengths
0 31 0 0 0 0 7 0 0 0 0 24 0 2 0 0 15 0 0 0 24 //read counts for junctions
1 0 1 1 0 1 0 //indicator matrix
0 1 1 0 1 1 1
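
The layout above can be read with a short script. The following is a minimal Python sketch, not part of the MATLAB package: the names GeneRecord and parse_input are hypothetical, all values are assumed to be integers as in the sample, and the check on the junction count assumes one value per exon pair (7 exons give 21 values in the sample).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeneRecord:  # hypothetical container, one per gene block
    name: str
    isoforms: List[str]
    exon_len: List[int]
    exon_reads: List[int]
    junc_len: List[int]
    junc_reads: List[int]
    indicator: List[List[int]]  # one row per isoform, one column per exon


def _ints(line: str) -> List[int]:
    return [int(tok) for tok in line.split()]


def parse_input(text: str) -> Tuple[int, int, List[GeneRecord]]:
    # Drop anything after "//" and skip lines that end up empty.
    rows = [ln.split("//", 1)[0].strip() for ln in text.splitlines()]
    it = iter([r for r in rows if r])
    total_reads, read_len = _ints(next(it))  # file header
    genes = []
    for header in it:  # each remaining block describes one gene
        name, m_str, n_str = header.split()
        m, n = int(m_str), int(n_str)
        isoforms = next(it).split()
        exon_len, exon_reads = _ints(next(it)), _ints(next(it))
        junc_len, junc_reads = _ints(next(it)), _ints(next(it))
        indicator = [_ints(next(it)) for _ in range(n)]
        n_junc = m * (m - 1) // 2  # assumed: one junction per exon pair
        if not (len(isoforms) == n
                and len(exon_len) == len(exon_reads) == m
                and len(junc_len) == len(junc_reads) == n_junc
                and all(len(row) == m for row in indicator)):
            raise ValueError(f"inconsistent record for gene {name}")
        genes.append(GeneRecord(name, isoforms, exon_len, exon_reads,
                                junc_len, junc_reads, indicator))
    return total_reads, read_len, genes


SAMPLE = """\
36280005 25 //#mapped reads, read length
Cacnb1 7 2 //gene name, #exons, #isoforms
NM_145121 NM_031173 //isoform IDs
462 204 684 20 155 628 217 //exon lengths
731 28 1709 0 4 1265 328 //read counts for exons
40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 //junction lengths
0 31 0 0 0 0 7 0 0 0 0 24 0 2 0 0 15 0 0 0 24 //read counts for junctions
1 0 1 1 0 1 0 //indicator matrix
0 1 1 0 1 1 1
"""

total_reads, read_len, genes = parse_input(SAMPLE)
gene = genes[0]
print(gene.name, len(gene.exon_len), len(gene.isoforms), len(gene.junc_reads))
```

Running the length checks before handing a file to the MATLAB code can catch truncated or misaligned gene blocks early.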

A sample input file is here

Some explanation of the variables and output of the MATLAB code

//variables

// global settings

NN // number of mapped reads
Read_len // read length

// each iteration for a gene

name // gene name
m // number of exons
n // number of isoforms
trans // name of each isoform
L // length of each exon
N // number of reads mapping to each exon
L1 // length of each exon-exon junction; the junctions are indexed by exon pair 1-2, 1-3, ..., (m-1)-m, for a total of m(m-1)/2
N1 // number of reads mapping to each junction
A // indicator matrix with n rows (isoforms) and m columns (exons); 1 means the isoform contains the exon
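
The junction ordering can be made concrete with a small example. The Python sketch below is illustrative only and not taken from the MATLAB code; the helper name junction_pairs is hypothetical. It lists the exon pairs in the order 1-2, 1-3, ..., and attaches the junction read counts from the Cacnb1 sample (7 exons, 21 junction values) to the pairs they belong to.

```python
from itertools import combinations


def junction_pairs(num_exons):
    """Exon pairs (i, j) with i < j, 1-based, in the order 1-2, 1-3, ..."""
    return list(combinations(range(1, num_exons + 1), 2))


pairs = junction_pairs(7)  # Cacnb1 has 7 exons in the sample input

# Junction read counts copied from the sample input (N1).
junc_reads = [0, 31, 0, 0, 0, 0, 7, 0, 0, 0, 0, 24, 0, 2, 0, 0, 15, 0, 0, 0, 24]

# Keep only the junctions that actually received reads.
nonzero = {pair: count for pair, count in zip(pairs, junc_reads) if count > 0}

for (i, j), count in nonzero.items():
    print(f"junction {i}-{j}: {count} reads")
```

In the sample, the junctions with reads (1-3, 2-3, 3-4, 3-6, 4-6 and 6-7) mostly connect exons that are adjacent within one of the two isoforms in the indicator matrix; the exception is 3-6, which has only 2 reads.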

//Results

toyGene 2.694110 2.704110 2.474467 2.944638 //name, MLE, posterior mean, lower and upper bounds of the 95% posterior interval; the first row is the gene, followed by one row per isoform
NM_1 1.474805 1.478264 1.314230 1.650764
NM_2 1.219305 1.225846 1.026216 1.440133
0.007426 -0.002060 //the posterior covariance matrix of the isoform expression levels
-0.002060 0.011215
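
As a reading aid for the results, the Python sketch below derives a few summary quantities from the toy output. It is not part of the MATLAB code and rests on two assumptions: the rows and columns of the covariance matrix follow the order of the isoform rows (NM_1, then NM_2), and the gene-level value is the sum of the isoform values, which the printed numbers agree with.

```python
import math

# Values copied from the toy output above.
mle = {"NM_1": 1.474805, "NM_2": 1.219305}
post_mean = {"NM_1": 1.478264, "NM_2": 1.225846}
cov = [[0.007426, -0.002060],
       [-0.002060, 0.011215]]  # assumed order: NM_1, NM_2

# Gene-level estimates as sums over isoforms (matches the toyGene row).
gene_mle = sum(mle.values())
gene_mean = sum(post_mean.values())

# Posterior standard deviation of each isoform and their correlation.
sd = [math.sqrt(cov[i][i]) for i in range(2)]
corr = cov[0][1] / (sd[0] * sd[1])

# Variance of the sum of the two isoforms is the sum of all covariance entries.
gene_sd = math.sqrt(sum(sum(row) for row in cov))

print(f"gene MLE {gene_mle:.6f}, gene posterior mean {gene_mean:.6f}")
print(f"isoform sd {sd[0]:.4f}, {sd[1]:.4f}; correlation {corr:.3f}")
print(f"gene posterior sd {gene_sd:.4f}")
```

The negative off-diagonal entry means the two isoform estimates are negatively correlated, so the gene-level uncertainty is smaller than it would be if the isoforms were treated as independent.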