1.4 Normalization Issues

Last updated 3 years ago

Background

Problems and issues

  • Sparsity of the data and technical noise ("batch effects") can mask the signal of interest.
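As a rough way to see how sparse a count matrix is, the sketch below computes the library size and the fraction of zero entries for each sample. The matrix `toy_counts` and the function name `sparsity_summary` are made up for illustration; they do not come from the course material.

```python
import numpy as np

def sparsity_summary(counts):
    """Return per-sample library size and fraction of zero entries.

    counts: 2-D array with genes in rows and samples in columns.
    """
    counts = np.asarray(counts, dtype=float)
    library_size = counts.sum(axis=0)           # total reads per sample
    zero_fraction = (counts == 0).mean(axis=0)  # share of genes with no reads
    return library_size, zero_fraction

# Hypothetical toy matrix: 4 genes x 3 samples
toy_counts = np.array([
    [10, 0, 5],
    [0, 0, 2],
    [3, 1, 0],
    [0, 0, 0],
])
library_size, zero_fraction = sparsity_summary(toy_counts)
print(library_size)
print(zero_fraction)
```

Samples with a very small library size and a high zero fraction are the ones where simple scaling normalization tends to be least reliable.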

Causes:

Spike-in for Normalization

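RNA content (total amount and species) varies between samples, so spike-in RNAs added in known amounts provide an external reference for normalization.

For mRNAs: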

  • 92 ERCC molecules

  • 8 mRNAs

  • whole transcriptome HeLa RNAs

For sRNAs:

  • 52(?) sRNA sequences

Caveat:

Typically only half of the spike-ins are detected.
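A minimal sketch of spike-in based scaling follows. It assumes that the same amount of spike-in RNA was added to every sample, so that differences in total spike-in counts reflect technical factors only. The function names and toy numbers are hypothetical. Undetected spike-ins (all-zero rows, as in the caveat above) contribute nothing to the totals, which is why the estimate becomes noisy when few spike-ins are detected.

```python
import numpy as np

def spike_in_size_factors(spike_counts):
    """Per-sample size factors from total spike-in counts, scaled to mean 1.

    spike_counts: 2-D array with spike-in molecules in rows, samples in columns.
    """
    spike_counts = np.asarray(spike_counts, dtype=float)
    totals = spike_counts.sum(axis=0)
    if np.any(totals == 0):
        raise ValueError("a sample has no spike-in reads and cannot be normalized")
    return totals / totals.mean()

def normalize(endogenous_counts, size_factors):
    """Divide each sample (column) by its size factor."""
    return np.asarray(endogenous_counts, dtype=float) / size_factors

# Hypothetical toy data: 3 spike-ins (the second one undetected) x 2 samples
spike = np.array([[100, 200],
                  [0, 0],
                  [50, 100]])
endo = np.array([[30, 60],
                 [8, 4]])

sf = spike_in_size_factors(spike)
norm = normalize(endo, sf)
print(sf)
print(norm)
```

In this toy case the second sample has twice the spike-in coverage of the first, so the first gene (30 vs. 60 raw reads) ends up equal after scaling, while the second gene still differs between the two samples.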

Computational Normalization Tools

For single-cell RNA-seq (and exRNA-seq):

  1. scran:

    1. Pools multiple cells (samples) to estimate cell-specific size factors in the presence of zero inflation and unbalanced differential expression of genes across groups of cells.

    2. Preclusters the cells (e.g. by rank-based clustering) into smaller, more homogeneous sets before pooling.

  2. SCnorm

  3. Census
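The pooling idea described for scran can be sketched in a few lines. This is a simplified illustration, not scran's actual implementation (scran is an R/Bioconductor package): it skips preclustering and the ordering of cells by library size, and the function name `pooled_size_factors` and the default pool sizes are assumptions made here. Counts from overlapping pools of cells are summed, each pool is compared against an average pseudo-cell using a median ratio, and the resulting linear system is solved for the per-cell factors.

```python
import numpy as np

def pooled_size_factors(counts, pool_sizes=(3, 4)):
    """Estimate per-cell size factors by pooling cells and deconvolving.

    counts: 2-D array with genes in rows and cells in columns.
    pool_sizes: sizes of the sliding pools taken around a ring of cells.
    """
    counts = np.asarray(counts, dtype=float)
    n_genes, n_cells = counts.shape
    if max(pool_sizes) > n_cells:
        raise ValueError("pool size cannot exceed the number of cells")

    reference = counts.mean(axis=1)   # average pseudo-cell
    keep = reference > 0              # ignore genes absent from every cell

    rows, rhs = [], []
    for size in pool_sizes:
        for start in range(n_cells):
            members = [(start + k) % n_cells for k in range(size)]
            pooled = counts[:, members].sum(axis=1)
            # Pooling reduces zeros, so the median ratio is more stable
            # than it would be for a single sparse cell.
            pool_factor = np.median(pooled[keep] / reference[keep])
            row = np.zeros(n_cells)
            row[members] = 1.0        # pool factor = sum of member cell factors
            rows.append(row)
            rhs.append(pool_factor)

    design = np.vstack(rows)
    factors, *_ = np.linalg.lstsq(design, np.array(rhs), rcond=None)
    return factors / factors.mean()

# Hypothetical noise-free toy data: every cell is a scaled copy of one profile
base = np.array([5.0, 20.0, 1.0, 8.0, 50.0])
true_factors = np.array([0.5, 1.0, 1.5, 2.0, 0.8, 1.2])
toy = np.outer(base, true_factors)

estimated = pooled_size_factors(toy)
print(estimated)
print(true_factors / true_factors.mean())
```

On this noise-free toy matrix the estimated factors match the true scaling factors after both are rescaled to mean 1. Using two different pool sizes keeps the linear system well determined; with real, sparse data the estimates are only approximate and negative values can occur, which is one reason preclustering into homogeneous sets is used.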

If considering spike-ins:

  1. SAMstrt

  2. GRM

References

Normalizing single-cell RNA sequencing data: challenges and opportunities, Nature Methods, 2017

More Reading & Practice

See more about normalization, imputation and confounder (e.g. batch effect) in Additional Tutorial: 4. QC and Normalization; 5. Imputation and Confounders

Video

a) Normalization 1: @Youtube, @Bilibili

b) Normalization 2: @Youtube, @Bilibili