1.3. Differential Expression

Differential Expression for bulk RNAs

Pipeline

Data Structure

Inputs

Outputs

Running Scripts

Software/Tools

Most normalization and differential expression analysis tools share one assumption: the expression levels of most genes are similar across conditions, i.e., most genes are not differentially expressed.

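a) DESeq: estimates a scaling factor (also known as a size factor) for each sample against a pseudo-reference sample, which is built from the geometric mean of each gene's counts across all samples. The size factor of a sample is the median of its gene-wise count ratios to this pseudo-reference.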

b) edgeR (TMM): trimmed mean of M-values, where an M-value is a gene's log fold change between a sample and a reference sample

c) Wilcoxon test using RPM (reads per million of total mapped reads); alternatives: RPKM, TPM
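The size-factor idea in a) and the RPM scaling in c) can be sketched in base R. This is a minimal illustration on a made-up count matrix, not the DESeq or DESeq2 implementation; the object names (counts, size_factors, rpm) are assumptions for this example, and column sums stand in for the number of mapped reads. TMM adds trimming and weighting steps and is not reproduced here.

```r
# Toy count matrix: 4 genes (rows) x 3 samples (columns); values are made up.
counts <- matrix(
    c(10, 20, 40,
      100, 200, 400,
      50, 100, 200,
      5, 10, 20),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("gene", 1:4), c("S1", "S2", "S3"))
)

# a) Median-of-ratios size factors.
# Pseudo-reference: per-gene geometric mean across samples, kept on the log scale.
log_geo_mean <- rowMeans(log(counts))
# Genes with a zero count have a log geometric mean of -Inf and are skipped.
usable <- is.finite(log_geo_mean)
size_factors <- apply(counts, 2, function(sample_counts) {
    exp(median(log(sample_counts[usable]) - log_geo_mean[usable]))
})
# Divide each sample (column) by its size factor.
normalized <- sweep(counts, 2, size_factors, "/")

# c) RPM: scale each sample to one million reads.
# Assumption: column sums stand in for the total number of mapped reads.
rpm <- sweep(counts, 2, colSums(counts), "/") * 1e6

size_factors
```

In this toy matrix the samples differ only in sequencing depth, so after dividing by the size factors all three columns of normalized become identical.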

Performance:

Example of a single case

# experimental design
design <- read.table("design.txt", sep = "\t", header = TRUE)

# expression matrix (the first column is skipped below, presumably gene IDs)
mx <- read.table("miRNA.homer.ct.mx", sep = "\t", header = TRUE)

# filter genes: keep those with a count above 2 in at least 2 samples
filter_genes <- apply(
    mx[, 2:ncol(mx)],
    1,
    function(x) length(x[x > 2]) >= 2
)
mx_filterGenes <- mx[filter_genes, ]
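Once genes are filtered, the Wilcoxon approach tests each gene separately between the two groups. The sketch below uses simulated values in place of the real matrix so that it runs on its own; the group labels, the matrix expr, and the planted effect in gene1 are assumptions made for illustration only.

```r
set.seed(1)

# Simulated RPM-like matrix: 6 genes x 10 samples (5 NC, 5 HCC); values are made up.
group <- factor(rep(c("NC", "HCC"), each = 5), levels = c("NC", "HCC"))
expr <- matrix(
    rnorm(60, mean = 100, sd = 10), nrow = 6,
    dimnames = list(paste0("gene", 1:6), paste0(group, 1:5))
)
# Make gene1 clearly higher in HCC so there is something to detect.
expr["gene1", group == "HCC"] <- expr["gene1", group == "HCC"] + 100

# Wilcoxon rank-sum (Mann-Whitney U) test for each gene, HCC vs. NC.
p_values <- apply(expr, 1, function(x) {
    wilcox.test(x[group == "HCC"], x[group == "NC"])$p.value
})

# log2 fold change of the group means, HCC relative to NC.
log2_fc <- log2(rowMeans(expr[, group == "HCC"]) / rowMeans(expr[, group == "NC"]))

# Benjamini-Hochberg adjustment for multiple testing.
result <- data.frame(
    log2FC = log2_fc,
    pvalue = p_values,
    padj   = p.adjust(p_values, method = "BH")
)
result[order(result$pvalue), ]
```

With only five samples per group the smallest possible two-sided p-value of the exact test is 2/252, so small cohorts limit how many genes can pass a multiple-testing threshold with this method.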

Draw Plots

1. Heatmap for DESeq2 normalized count matrix

2. PCA analysis

3. MA plot

4. Distance between samples

5. Hierarchical clustering for differentially expressed genes
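The five plots listed above can be approximated with base R graphics alone. The following is a rough sketch on simulated counts, not the DESeq2 plotting helpers; the matrix norm_counts, the output file name, and the use of the 10 genes with the largest absolute fold change as a stand-in for differentially expressed genes are all assumptions for illustration.

```r
set.seed(2)

# Simulated normalized counts: 50 genes x 6 samples (3 NC, 3 HCC); values are made up.
group <- factor(rep(c("NC", "HCC"), each = 3), levels = c("NC", "HCC"))
norm_counts <- matrix(
    rnbinom(300, mu = 200, size = 5), nrow = 50,
    dimnames = list(paste0("gene", 1:50), paste0(group, 1:3))
)
log_expr <- log2(norm_counts + 1)  # pseudocount avoids log2(0)

# PCA: samples must be rows, so transpose the gene x sample matrix.
pca <- prcomp(t(log_expr))
# Euclidean distance between samples.
sample_dist <- dist(t(log_expr))
# MA values: A is the mean log2 expression, M the log2 fold change HCC vs. NC.
mean_nc  <- rowMeans(log_expr[, group == "NC"])
mean_hcc <- rowMeans(log_expr[, group == "HCC"])
A <- (mean_nc + mean_hcc) / 2
M <- mean_hcc - mean_nc
# Stand-in for a DE gene list: the 10 genes with the largest |M|.
top_genes <- order(abs(M), decreasing = TRUE)[1:10]

# Write all plots to a temporary PDF (hypothetical file name).
plot_file <- file.path(tempdir(), "de_qc_plots.pdf")
pdf(plot_file)
heatmap(log_expr, scale = "row", main = "1. Normalized counts")
plot(pca$x[, 1:2], col = as.integer(group), pch = 19, main = "2. PCA")
plot(A, M, pch = 19, main = "3. MA plot")
abline(h = 0, lty = 2)
heatmap(as.matrix(sample_dist), symm = TRUE, main = "4. Sample distances")
heatmap(log_expr[top_genes, ], scale = "row", main = "5. Clustering of top genes")
dev.off()
```

The heatmap function clusters rows and columns hierarchically by default, which is why it serves for both the count heatmap and the clustering of the selected genes.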

Tips/Utilities

edgeR User Guide

DESeq2 User Guide

Homework and more

  1. Identify differentially expressed genes for other RNA types between different conditions, i.e., Normal Control (NC) vs. HCC, using three methods: edgeR, DESeq2 and the Wilcoxon/Mann-Whitney U test.

  2. Draw a Venn diagram to show the differences among the results of the three methods above.
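Before drawing the diagram, it can help to count how many genes fall into each region. The sketch below does this with base R set operations; the three gene lists are made-up placeholders, not results from any real analysis.

```r
# Hypothetical significant-gene lists from the three methods (IDs are made up).
de_genes <- list(
    edgeR    = c("geneA", "geneB", "geneC", "geneD"),
    DESeq2   = c("geneA", "geneB", "geneC", "geneE"),
    Wilcoxon = c("geneA", "geneB", "geneF")
)

# Membership table: one row per gene, one logical column per method.
all_genes <- sort(unique(unlist(de_genes)))
membership <- sapply(de_genes, function(genes) all_genes %in% genes)
rownames(membership) <- all_genes

# Each Venn region corresponds to one combination of methods that report the gene.
region <- apply(membership, 1, function(in_method) {
    paste(names(de_genes)[in_method], collapse = " & ")
})
venn_counts <- table(region)
venn_counts
```

The counts in venn_counts are the numbers that would label the regions of a three-set Venn diagram.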

More Reading and Practice

Video

1. Differential expression

@YouTube

@Bilibili
