Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.

Chan Zuckerberg Initiative: Ensuring Reproducible Transcriptomic Analysis with DESeq2 and tximeta

And a more comprehensive set of links from Mike himself:

- Limma, the original paper and limma-voom
- The recent manuscript mentioned from the Kendziorski lab, which has a Gamma-Poisson hierarchical structure, although it does not in general reduce to the Negative Binomial
- We talk about robust steps for estimating the middle of the dispersion prior distribution, references are Anders and Huber 2010 (DESeq), Eling et al 2018 (one of the BASiCS papers), and Phipson et al 2016
- We talk about using publicly available data as a prior, references I mention are the McCall et al paper using publicly available data to ask if a gene is expressed, and a new manuscript from my lab that compares splicing in a sample to GTEx as a reference panel
- Regarding estimating the width of the dispersion prior, references are the Robinson and Smyth 2007 paper, McCarthy et al 2012 (edgeR), and Wu et al 2013 (DSS)
- Schurch et al 2016, an RNA-seq dataset with many replicates, helpful for benchmarking
- Stephens paper on the false sign rate (ash)
- Heavy-tailed distributions for effect sizes, Zhu et al 2018
- I credit Kevin Blighe and Alexander Toenges, who help to answer lots of DESeq2 questions on the support site
- The EOSS award, which has funded vizWithSCE by Kwame Forbes, and nullranges by Wancen Mu and Eric Davis
- One of the recent papers from my lab, MRLocus for eQTL and GWAS integration

You can also follow the podcast on Twitter and Mastodon.

Figure 1
Shrinkage estimation of dispersion. Plot of dispersion estimates over the average expression strength (A) for the Bottomly et al. dataset with six samples across two groups and (B) for five samples from the Pickrell et al. dataset. First, gene-wise MLEs are obtained using only the respective gene's data (black dots). Then, a curve (red) is fit to the MLEs to capture the overall trend of dispersion-mean dependence. This fit is used as a prior mean for a second estimation round, which results in the final MAP estimates of dispersion (arrow heads). This can be understood as a shrinkage (along the blue arrows) of the noisy gene-wise estimates toward the consensus represented by the red line. The black points circled in blue are detected as dispersion outliers and not shrunk toward the prior (shrinkage would follow the dotted line). For clarity, only a subset of genes is shown, which is enriched for dispersion outliers. Additional file 1: Figure S1 displays the same data but with dispersions of all genes shown. MAP, maximum a posteriori; MLE, maximum-likelihood estimate.

Figure 2
Effect of shrinkage on logarithmic fold change estimates. Plots of the (A) MLE (i.e., no shrinkage) and (B) MAP estimate (i.e., with shrinkage) for the LFCs attributable to mouse strain, over the average expression strength for a ten vs eleven sample comparison of the Bottomly et al. dataset. Small triangles at the top and bottom of the plots indicate points that would fall outside of the plotting window. Two genes with similar mean count and MLE logarithmic fold change are highlighted with green and purple circles. (C) The counts (normalized by size factors s_j) for these genes reveal low dispersion for the gene in green and high dispersion for the gene in purple. (D) Density plots of the likelihoods (solid lines, scaled to integrate to 1) and the posteriors (dashed lines) for the green and purple genes and of the prior (solid black line): due to the higher dispersion of the purple gene, its likelihood is wider and less peaked (indicating less information), and the prior has more influence on its posterior than for the green gene.
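The idea behind the shrinkage figure captions, that a wider likelihood lets the prior pull the estimate further, can be illustrated with a small sketch. It assumes a normal likelihood and a zero-centered normal prior, which is a simplification for illustration and not the negative binomial model DESeq2 actually fits. The function name `map_estimate` and all numbers are hypothetical.

```python
def map_estimate(mle, se, prior_sd, prior_mean=0.0):
    """Posterior mode under a normal likelihood N(mle, se^2) and a
    normal prior N(prior_mean, prior_sd^2) (illustrative assumption)."""
    w_data = 1.0 / se**2          # precision of the likelihood
    w_prior = 1.0 / prior_sd**2   # precision of the prior
    # Precision-weighted average of the MLE and the prior mean
    return (w_data * mle + w_prior * prior_mean) / (w_data + w_prior)


# Two hypothetical genes with the same MLE log fold change.
# "green" has low dispersion (narrow likelihood, small standard error);
# "purple" has high dispersion (wide likelihood, large standard error).
green = map_estimate(mle=2.0, se=0.5, prior_sd=1.0)
purple = map_estimate(mle=2.0, se=2.0, prior_sd=1.0)

print(f"green  MAP LFC: {green:.2f}")   # 1.60
print(f"purple MAP LFC: {purple:.2f}")  # 0.40
```

Both genes start from the same MLE of 2.0, but the noisier purple gene is shrunk much closer to zero, which is the qualitative behavior the figure describes.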