Bayesian inference for isoform-specific gene expression
1) Follow the three steps described in the manual for estimating isoform-specific gene expression. In the end you will get a file named "output.txt.2.b.matlab".
2) Download the MATLAB code for Bayesian inference for isoform-specific gene expression. "compute.m" is the main function. Run it with the file "output.txt.2.b.matlab" to get the final results.
Input format of the MATLAB code (text after // is a comment)
//file header
36280005 25 //#mapped reads, read length
//for each gene
Cacnb1 7 2 //gene name, #exons, #isoforms
NM_145121 NM_031173 //isoform IDs
462 204 684 20 155 628 217 //exon lengths
731 28 1709 0 4 1265 328 //read counts for exons
40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 //junction lengths
0 31 0 0 0 0 7 0 0 0 0 24 0 2 0 0 15 0 0 0 24 //read counts for junctions
1 0 1 1 0 1 0 //indicator matrix
0 1 1 0 1 1 1
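The record layout above can be read with a short token-based parser. The following is a minimal Python sketch, not part of the distributed MATLAB code: the function name parse_input and the dictionary keys are hypothetical, and it assumes that any // comment sits on the same line as the data it annotates and that each gene carries m(m-1)/2 junction entries (21 for the 7-exon sample).

```python
def parse_input(text):
    """Parse the input format into (total_reads, read_len, genes)."""
    # Drop anything after '//' on a line, then flatten the file into tokens.
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split("//", 1)[0].split())
    it = iter(tokens)

    def take(k, cast=int):
        # Consume the next k tokens and convert each with `cast`.
        return [cast(next(it)) for _ in range(k)]

    total_reads, read_len = take(2)   # file header
    genes = []
    for name in it:                   # every gene record starts with its name
        m, n = take(2)                # number of exons, number of isoforms
        n_junc = m * (m - 1) // 2     # one entry per pair of exons
        gene = {"name": name, "m": m, "n": n}
        gene["isoforms"] = take(n, str)
        gene["exon_len"] = take(m)
        gene["exon_reads"] = take(m)
        gene["junc_len"] = take(n_junc)
        gene["junc_reads"] = take(n_junc)
        gene["A"] = [take(m) for _ in range(n)]  # n rows of m indicators
        genes.append(gene)
    return total_reads, read_len, genes


# The sample record from this page, without the annotations.
SAMPLE = """\
36280005 25
Cacnb1 7 2
NM_145121 NM_031173
462 204 684 20 155 628 217
731 28 1709 0 4 1265 328
40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40
0 31 0 0 0 0 7 0 0 0 0 24 0 2 0 0 15 0 0 0 24
1 0 1 1 0 1 0
0 1 1 0 1 1 1
"""

total_reads, read_len, genes = parse_input(SAMPLE)
print(total_reads, read_len, genes[0]["name"], len(genes[0]["junc_reads"]))
```

Reading the file as a flat token stream makes the sketch insensitive to how long lines are wrapped, which is why it does not rely on one field per line.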
A sample input file is here
Some explanation of the variables and output of the MATLAB code
//variables
// global settings
NN // number of mapped reads
Read_len // read length
// each iteration for a gene
name // gene name
m // number of exons
n // number of isoforms
trans // name of each isoform
L // length of each exon
N // number of reads mapped to each exon
L1 // length of each exon-exon junction; the junctions are indexed by 1-2, 1-3, ..., (m-1)-m, for a total of m(m-1)/2
N1 // number of reads mapped to each junction
A // indicator matrix (n rows, one per isoform; m columns, one per exon)
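To make the junction indexing concrete, the sketch below maps each position in the junction vectors to a pair of exons. It is an illustrative Python fragment, not taken from the MATLAB code: the helper name junction_pairs is hypothetical, and it assumes the ordering 1-2, 1-3, ... continues lexicographically over all exon pairs, which gives 7*6/2 = 21 entries for the 7-exon sample gene, matching the 21 values in its junction lines.

```python
from itertools import combinations


def junction_pairs(m):
    """Exon pairs (i, j), i < j, in the order 1-2, 1-3, ..., (m-1)-m."""
    return list(combinations(range(1, m + 1), 2))


# Junction read counts of the sample gene Cacnb1 (7 exons).
junc_reads = [0, 31, 0, 0, 0, 0, 7, 0, 0, 0, 0, 24, 0, 2, 0, 0, 15, 0, 0, 0, 24]

pairs = junction_pairs(7)
# Keep only the junctions that received at least one read.
observed = {pair: count for pair, count in zip(pairs, junc_reads) if count > 0}
print(observed)
```

Under this assumed ordering, the 31 reads in the second position belong to the junction between exons 1 and 3, which agrees with the first row of the sample indicator matrix (exons 1, 3, 4 and 6).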
//Results
toyGene 2.694110 2.704110 2.474467 2.944638 //MLE, posterior mean, lower and upper bounds of the 95% posterior interval
NM_1 1.474805 1.478264 1.314230 1.650764
NM_2 1.219305 1.225846 1.026216 1.440133
0.007426 -0.002060 //the posterior covariance matrix of the isoform expression
-0.002060 0.011215
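The sample output can be sanity-checked with a few lines of arithmetic. The Python sketch below is an illustration only. In this sample the gene-level values equal the sum of the isoform values, and the sketch checks that observation. It also compares the reported gene-level interval with a normal approximation built from the covariance matrix. The normal approximation is an assumption made here for illustration and is not claimed to be how the MATLAB code computes its interval.

```python
import math

# Values copied from the sample output: MLE, posterior mean, lower, upper.
gene = (2.694110, 2.704110, 2.474467, 2.944638)
isoforms = {
    "NM_1": (1.474805, 1.478264, 1.314230, 1.650764),
    "NM_2": (1.219305, 1.225846, 1.026216, 1.440133),
}
cov = [[0.007426, -0.002060],
       [-0.002060, 0.011215]]

# Gene-level estimates compared with the sum over isoforms.
sum_mle = sum(v[0] for v in isoforms.values())
sum_mean = sum(v[1] for v in isoforms.values())

# Variance of the sum of both isoforms = sum of all covariance entries.
var_total = sum(sum(row) for row in cov)
sd_total = math.sqrt(var_total)

# Normal approximation to the 95% interval (an assumption, see above).
approx = (gene[1] - 1.96 * sd_total, gene[1] + 1.96 * sd_total)

print(f"sum of isoform MLEs:  {sum_mle:.6f} (gene: {gene[0]:.6f})")
print(f"sum of isoform means: {sum_mean:.6f} (gene: {gene[1]:.6f})")
print(f"approx. interval: {approx[0]:.4f} to {approx[1]:.4f}")
```

The approximate interval lands within about 0.01 of the reported bounds but does not reproduce them exactly, which is expected if the posterior is not exactly normal.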