cziegenhain edited this page Aug 9, 2018 · 12 revisions

zUMIs' output is structured in two subdirectories:

zUMIs_output/expression
zUMIs_output/stats
  • "expression" contains .rds files
    • list of count matrices as sparseMatrix (*.dgecounts.rds)
  • "stats" contains plots and data files with descriptive statistics
  • STAR output files and per-read featureCounts files are stored in the parent output directory defined by the user

Structure of the output dgecounts object in .dgecounts.rds

zUMIs writes its dge output in .rds format, which can be read into R with readRDS:

AllCounts <- readRDS("zUMIs_output/expression/example.dgecounts.rds")
names(AllCounts)
[1] "umicount"  "readcount"

names(AllCounts$umicount)
[1] "exon"   "inex"   "intron"

names(AllCounts$umicount$exon)
[1] "all"          "downsampling"

AllCounts is a list of lists that holds all count matrices in sparseMatrix format. The top-level list separates UMI counts ("umicount") from read counts ("readcount"). Each of these counting types contains the three feature types: exons ("exon"), introns ("intron"), and introns plus exons ("inex"). Each feature type in turn contains "all", a sparseMatrix generated from all observed reads, and "downsampling", a list with one sparseMatrix per requested downsampling size.
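The nesting described above can be explored without a real zUMIs run. The sketch below builds a small mock object with the same list structure and then pulls out individual matrices. All counts, the gene and cell names, the helper make_counts, and the downsampling element name "downsampled_100" are hypothetical placeholders, not names taken from zUMIs output. The genes-in-rows, cells-in-columns orientation is also an assumption.

```r
library(Matrix)

# Hypothetical helper: one feature-type entry with an "all" matrix and a
# "downsampling" list. Values and names are made up for illustration.
make_counts <- function() {
  m <- sparseMatrix(
    i = c(1, 2, 3, 1),
    j = c(1, 1, 2, 3),
    x = c(5, 1, 2, 7),
    dims = c(3, 3),
    dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
  )
  list(all = m, downsampling = list(downsampled_100 = m))
}

# Mock object mirroring the structure: counting type -> feature type -> matrices
feature_types <- c("exon", "inex", "intron")
AllCounts <- list(
  umicount  = setNames(lapply(feature_types, function(f) make_counts()), feature_types),
  readcount = setNames(lapply(feature_types, function(f) make_counts()), feature_types)
)

# Matrix generated from all reads for exon UMI counts
umi_exon <- AllCounts$umicount$exon$all
dim(umi_exon)

# List the available downsampling matrices, then select one by name
names(AllCounts$umicount$exon$downsampling)
umi_exon_ds <- AllCounts$umicount$exon$downsampling[["downsampled_100"]]
```

With a real object loaded via readRDS, only the access lines at the bottom are needed. Check names() at each level first, since the downsampling names depend on the sizes requested in the run.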

The sparseMatrix can be converted into a conventional count table with the as.matrix function in R and saved as a text file using the code below.

# Exon UMI counts as a dense, tab-separated table
dge <- as.matrix(AllCounts$umicount$exon$all)
write.table(dge, "exons.umicounts.txt", quote = FALSE, sep = "\t")

NOTE: It is highly recommended to retain the sparseMatrix format for downstream analysis to save time and space, especially when you have >20,000 cells.
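As a minimal sketch of what staying sparse can look like, the example below computes per-cell summaries directly on a sparseMatrix and writes it to disk in Matrix Market format using the Matrix package. The small matrix stands in for AllCounts$umicount$exon$all. Its values, the gene and cell names, and the output file names are hypothetical. The genes-in-rows, cells-in-columns orientation is an assumption.

```r
library(Matrix)

# Small hypothetical matrix standing in for AllCounts$umicount$exon$all
counts <- sparseMatrix(
  i = c(1, 2, 3, 1),
  j = c(1, 1, 2, 3),
  x = c(5, 1, 2, 7),
  dims = c(3, 3),
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# Per-column summaries computed without converting to a dense matrix
umis_per_cell  <- Matrix::colSums(counts)      # total counts per column
genes_per_cell <- Matrix::colSums(counts > 0)  # non-zero features per column

# Write the matrix in Matrix Market format. Row and column names are saved
# separately because the .mtx file stores only the numeric entries.
out_dir <- tempdir()
writeMM(counts, file.path(out_dir, "exons.umicounts.mtx"))
writeLines(rownames(counts), file.path(out_dir, "genes.txt"))
writeLines(colnames(counts), file.path(out_dir, "cells.txt"))
```

The Matrix Market file can be read back with readMM, after which the saved names need to be reattached as dimnames.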
