1.4 Normalization Issues


Problems and issues

  • Sparse data and technical noise (e.g. "batch effects") can mask the signal of interest.


Spike-in for Normalization

RNA content (both total amount and RNA species) varies between samples, so spike-in RNAs of known quantity are added as an external reference.

for mRNAs:

  • 92 ERCC synthetic transcripts

  • 8 mRNAs

  • whole transcriptome HeLa RNAs

for sRNAs:

  • 52(?) sRNA sequences


Typically, only half of the spike-ins are detected.
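
To check how many spike-ins were detected in your own data, the fraction can be computed per sample. The sketch below is a minimal illustration only: the function name `spike_in_detection_rate`, the `min_count` threshold, and the toy counts are assumptions, not part of any specific tool.

```python
import numpy as np

def spike_in_detection_rate(spike_counts, min_count=1):
    """Fraction of spike-in species detected in each sample.

    spike_counts: array of shape (n_spike_ins, n_samples) with raw read counts.
    min_count: assumed detection threshold in reads (hypothetical default).
    """
    spike_counts = np.asarray(spike_counts)
    detected = spike_counts >= min_count   # boolean matrix, one entry per spike-in and sample
    return detected.mean(axis=0)           # column-wise fraction of detected spike-ins

# Toy example: 4 spike-ins (rows) measured in 2 samples (columns).
counts = np.array([
    [120, 95],
    [0, 3],
    [14, 0],
    [0, 0],
])
print(spike_in_detection_rate(counts))  # prints [0.5 0.5]
```

A stricter threshold (e.g. `min_count=5`) lowers the detection rate, so the threshold should be reported together with the fraction.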

Computational Normalization Tools

For single-cell RNA-seq (and exRNA-seq):

  1. scran:

    1. pools multiple cells (samples) to estimate cell-specific size factors in the presence of zero inflation and unbalanced differential expression of genes across groups of cells;

    2. preclusters the cells (e.g. using rank-based clustering) into smaller, more homogeneous sets before pooling.

  2. SCnorm

  3. Census
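
The pooling idea behind scran can be sketched as follows. This is a simplified illustration, not scran's actual implementation (scran is an R/Bioconductor package): counts are summed over overlapping pools of cells, each pool is compared with an average reference pseudo-cell, and the pool-level factors are deconvolved into cell-level factors by least squares. The function name `pooled_size_factors` and the default `pool_sizes` are assumptions chosen for small examples, and preclustering is omitted.

```python
import numpy as np

def pooled_size_factors(counts, pool_sizes=(3, 5, 7)):
    """Estimate per-cell size factors by pooling and deconvolution.

    counts: array of shape (n_genes, n_cells) with raw counts.
    pool_sizes: sliding-window sizes (hypothetical defaults for small data).
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[1]
    lib = counts.sum(axis=0)
    if np.any(lib <= 0):
        raise ValueError("every cell needs at least one count")

    # Scale by library size so that no single deeply sequenced cell dominates a pool.
    adj = counts / lib
    # Reference pseudo-cell: average scaled profile across all cells.
    ref = adj.mean(axis=1)
    expressed = ref > 0
    adj, ref = adj[expressed], ref[expressed]

    # Arrange cells on a ring so that neighbouring cells have similar library sizes.
    order = np.argsort(lib)
    ring = np.concatenate([order[::2], order[1::2][::-1]])

    rows, pool_factors = [], []
    for size in pool_sizes:
        if size > n_cells:
            continue
        for start in range(n_cells):
            members = ring[(start + np.arange(size)) % n_cells]
            pooled = adj[:, members].sum(axis=1)
            # Pooling reduces zeros; the median ratio tolerates a minority of DE genes.
            pool_factors.append(np.median(pooled / ref))
            row = np.zeros(n_cells)
            row[members] = 1.0
            rows.append(row)
    if not rows:
        raise ValueError("all pool sizes exceed the number of cells")

    # Model each pool factor as the sum of its members' (scaled) factors.
    solution, *_ = np.linalg.lstsq(np.vstack(rows), np.array(pool_factors), rcond=None)
    size_factors = solution * lib          # undo the library-size scaling
    return size_factors / size_factors.mean()
```

In this sketch the returned factors are centred to a mean of 1, and dividing each cell's counts by its factor gives normalized values. With very sparse data or poorly chosen pool sizes the least-squares solution can contain non-positive factors, which should be checked before use.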

If spike-ins are used:

  1. SAMstrt

  2. GRM
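
As a point of comparison, the simplest form of spike-in normalization scales each cell so that its total spike-in count is the same. The sketch below shows only this basic scaling; it is not the algorithm used by SAMstrt or GRM, and the function names and toy numbers are assumptions for illustration. It relies on the same amount of spike-in RNA having been added to every cell.

```python
import numpy as np

def spike_in_size_factors(spike_counts):
    """Size factors proportional to each cell's total spike-in count.

    spike_counts: array of shape (n_spike_ins, n_cells) with raw counts.
    """
    totals = np.asarray(spike_counts, dtype=float).sum(axis=0)
    if np.any(totals <= 0):
        raise ValueError("every cell needs at least one spike-in read")
    return totals / totals.mean()          # centred so that the mean factor is 1

def normalize_counts(gene_counts, size_factors):
    """Divide each cell (column) of the gene count matrix by its size factor."""
    return np.asarray(gene_counts, dtype=float) / size_factors

# Toy example: the second cell was captured/sequenced twice as efficiently.
spikes = np.array([[10, 20],
                   [30, 60]])
genes = np.array([[4, 8],
                  [10, 20]])
factors = spike_in_size_factors(spikes)
normalized = normalize_counts(genes, factors)  # the two cells now agree gene by gene
```

Because the factors come only from the spike-ins, differences in total endogenous RNA content between cells are preserved rather than removed.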


More Reading & Practice

See more about normalization, imputation and confounders (e.g. batch effects) in:


a) Normalization 1



b) Normalization 2