Some basic terminology used in RNA-seq papers
http://thegenomefactory.blogspot.com/2013/08/paired-end-read-confusion-library.html
Basic idea to calculate RPKM (Reads Per Kilobase of exon per Million mapped reads)
http://seqanswers.com/forums/showthread.php?t=29549
First of all, Cufflinks uses FPKM (Fragments Per Kilobase of exon per Million mapped fragments) instead of RPKM (Reads Per Kilobase of exon per Million mapped reads) to avoid confusion when dealing with paired-end data, where each fragment yields two reads.
Secondly, Cufflinks applies corrections when calculating FPKM, so a simple calculation will not match the values Cufflinks reports. Anyway, the crude calculation for a gene (NOT the one that Cufflinks uses) would be:
FPKM = [f / (e / 1000)] / (m / 1,000,000)
f - number of fragments mapping to gene
e - exonic length of gene (in bases)
m - total number of mapped fragments
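As a quick sanity check, the crude formula above can be written as a few lines of Python. This is only a sketch of the uncorrected calculation, not what Cufflinks does internally; the function name crude_fpkm and the example numbers are made up for illustration.

```python
def crude_fpkm(f, e, m):
    """Crude FPKM for one gene (no Cufflinks-style corrections).

    f -- number of fragments mapping to the gene
    e -- exonic length of the gene, in bases
    m -- total number of mapped fragments in the sample
    """
    if e <= 0 or m <= 0:
        raise ValueError("exonic length and total mapped fragments must be positive")
    fragments_per_kilobase = f / (e / 1000)   # normalise by gene length
    return fragments_per_kilobase / (m / 1_000_000)  # normalise by sequencing depth


# Hypothetical gene: 500 fragments, 2,000 bases of exon, 10 million mapped fragments
print(crude_fpkm(500, 2000, 10_000_000))  # 25.0
```

The same function gives a crude RPKM if you pass read counts instead of fragment counts.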
If you would like to know more about the corrections that Cufflinks applies to FPKM, see this paper:
Trapnell C, Williams BA, Pertea G, Mortazavi AM, Kwan G, van Baren MJ, Salzberg SL, Wold B, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.
Nature Biotechnology (2010), doi:10.1038/nbt.1621
Supplementary Text and Figures, 3. Transcript abundance estimation
Also, have a look at the Cufflinks FAQ.
Tools
1. Cufflinks (prerequisites: Boost, SAMtools, Eigen)
install:
http://cufflinks.cbcb.umd.edu/tutorial.html#inst
usage:
2. TopHat (prerequisites: Boost, SAMtools)
install:
usage:
3. HTSeq
4. Flux Capacitor
5. bedtools