2.1 Plot Utilities

We provide a Jupyter notebook so that users can run the code easily: notebook

# Initialize Notebook
from IPython.display import HTML, Image
#%run ../library/v1.0.5/init.ipy
HTML('''<script> code_show=true;  function code_toggle() {  if (code_show){  $('div.input').hide();  } else {  $('div.input').show();  }  code_show = !code_show }  $( document ).ready(code_toggle); </script> <form action="javascript:code_toggle()"><input type="submit" value="Toggle Code"></form>''')
%cd ~/projects/exSEEK_training/
import gc, argparse, sys, os, errno
from functools import reduce
%pylab inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm.notebook import tqdm
import scipy
import sklearn
from scipy.stats import pearsonr
import warnings
warnings.filterwarnings('ignore')
from bokeh.io import output_notebook, show
output_notebook()
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import roc_curve,roc_auc_score,auc,precision_recall_curve,average_precision_score
from sklearn.preprocessing import RobustScaler,MinMaxScaler,StandardScaler
from sklearn.neighbors import NearestNeighbors
from bokeh.palettes import Category20c
from ipywidgets import interact,interactive, FloatSlider,IntSlider, RadioButtons,Dropdown,Tab,Text

load plotting functions

embed pdf; std_plot; display dataframe

Python

numpy and pandas

numpy

http://cs231n.github.io/python-numpy-tutorial/

Pandas

https://github.com/adeshpande3/Pandas-Tutorial/blob/master/Pandas Tutorial.ipynb

qgrid filtering

Matplotlib

Seaborn

We use the boxplot as an example: https://seaborn.pydata.org/generated/seaborn.boxplot.html
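To make the boxplot example concrete, here is a minimal sketch. The table `ratio_long`, its column names, and the values in it are simulated for illustration only; they are not exSEEK output.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Simulated long-format table (an assumption for this sketch):
# one row per measurement, with the RNA type and its read fraction.
rng = np.random.default_rng(0)
rna_types = ["miRNA", "piRNA", "Y_RNA", "mRNA"]
records = [
    {"rna_type": rna, "ratio": ratio}
    for rna in rna_types
    for ratio in rng.uniform(0.05, 0.4, size=20)
]
ratio_long = pd.DataFrame(records)

# One box per RNA type, drawn on an explicit Axes so it can be styled later.
fig, ax = plt.subplots(figsize=(6, 4))
sns.boxplot(data=ratio_long, x="rna_type", y="ratio", ax=ax)
ax.set_xlabel("RNA type")
ax.set_ylabel("Fraction of mapped reads")
fig.tight_layout()
```

Passing a long-format DataFrame together with `x` and `y` column names keeps the call independent of how many RNA types are present.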

interactive plotting

ipywidgets lets you tune plot parameters interactively until the figure looks right: https://ipywidgets.readthedocs.io/en/stable/examples/Using Interact.html#Basic-interact

fixed arguments

Slider
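The following sketch shows how sliders and fixed arguments can be combined with `interact`. The function `plot_sine` and its parameters are hypothetical stand-ins for any plotting function you want to tune; only `interact`, `fixed`, `FloatSlider`, and `IntSlider` come from ipywidgets.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_sine(frequency=1.0, n_points=200, color="steelblue"):
    """Draw sin(2*pi*frequency*x) on [0, 1] (hypothetical example plot)."""
    x = np.linspace(0.0, 1.0, n_points)
    fig, ax = plt.subplots(figsize=(5, 3))
    ax.plot(x, np.sin(2 * np.pi * frequency * x), color=color)
    ax.set_title(f"frequency = {frequency:.1f}")


def make_interactive_plot():
    """Wrap plot_sine in widgets; call this from a notebook cell."""
    # Imported here so plot_sine stays usable without ipywidgets installed.
    from ipywidgets import FloatSlider, IntSlider, fixed, interact

    return interact(
        plot_sine,
        frequency=FloatSlider(min=0.5, max=10.0, step=0.5, value=1.0),
        n_points=IntSlider(min=50, max=500, step=50, value=200),
        color=fixed("steelblue"),  # held constant, so no widget is drawn
    )
```

In a notebook cell, calling `make_interactive_plot()` should render one slider per widget argument and redraw the figure whenever a slider moves, while `color` stays at the fixed value.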

Based on the tools above, you can build more elaborate plots...

display dataframe, std_plot, pdf_figure...

Basic plot

Now we use the following commands to generate summary plots for the exSEEK modules:

  • mapping

  • count matrix

  • differential expression

pie plot of RNA ratio

boxplot of rna ratio

line plot of rna length

3D barplot of rna length

stack bar plot of rna counts and ratio
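As a rough illustration of a stacked bar plot of counts and ratios, the sketch below uses a simulated count table. The names `counts` and `ratios` and all values are assumptions made for this example; this is not the implementation of `plot_bar_by_rna_total`.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Simulated count table (assumption): rows are samples, columns are RNA types.
rng = np.random.default_rng(1)
counts = pd.DataFrame(
    rng.integers(100, 5000, size=(6, 4)),
    index=[f"sample_{i}" for i in range(1, 7)],
    columns=["miRNA", "piRNA", "Y_RNA", "mRNA"],
)

# Divide each row by its total so that every sample sums to 1.
ratios = counts.div(counts.sum(axis=1), axis=0)

# Left panel: raw counts; right panel: per-sample ratios.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
counts.plot(kind="bar", stacked=True, ax=axes[0], legend=False)
axes[0].set_ylabel("Read count")
ratios.plot(kind="bar", stacked=True, ax=axes[1])
axes[1].set_ylabel("Fraction of reads")
axes[1].legend(title="RNA type", bbox_to_anchor=(1.02, 1), loc="upper left")
fig.tight_layout()
```

Showing counts and ratios side by side makes it easier to tell whether a sample differs in sequencing depth or in RNA composition.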

bar plot of RNA by sample

if sequencing_type == 'long':
    plot_bar_by_rna_total(percent_by_mapped/100., datatype='ratio')
else:
    plot_bar_by_rna_total(table_ratio, datatype='ratio')

embed_pdf_figure()

if sequencing_type == 'long':
    plot_bar_by_rna_total(percent_by_mapped/100., datatype='count')
else:
    plot_bar_by_rna_total(table_ratio, datatype='count')

embed_pdf_figure()

FastQC

Sample QC

Use PCA and t-SNE to visualize outliers
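A minimal sketch of the idea, using a simulated expression matrix with one artificially shifted sample. The data, the distance-based score, and the cutoff of 5 scaled MADs are assumptions chosen for illustration; they are not the exSEEK procedure.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Simulated matrix (assumption): 30 samples as rows, 50 features as columns.
rng = np.random.default_rng(2)
expr = rng.normal(size=(30, 50))
expr[-1] += 8.0  # shift the last sample to create an artificial outlier

# Standardize each feature, then project the samples onto two components.
scaled = StandardScaler().fit_transform(expr)
coords = PCA(n_components=2).fit_transform(scaled)

# PCA centers the data, so distance from the origin is a simple outlier score.
dist = np.linalg.norm(coords, axis=1)
median = np.median(dist)
mad = np.median(np.abs(dist - median))

# Flag samples far above the median distance (robust, MAD-based cutoff).
flagged = np.where(dist > median + 5 * 1.4826 * mad)[0]
print("flagged sample indices:", flagged)
```

The same `scaled` matrix can be passed to `sklearn.manifold.TSNE(n_components=2)` instead of PCA; note that its `perplexity` must be smaller than the number of samples. Plotting `coords[:, 0]` against `coords[:, 1]` and highlighting the flagged indices gives the usual visual check.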

Differential Expression

if you would like to try...

Find suitable methods and metrics for better visualization

abundance and diversity

def filter_mx(expression_mx, cutoff_ratio=0.2, counts_threshold=10):
    retain_index = np.where(np.sum(expression_mx > counts_threshold, axis=1) >= round(cutoff_ratio*expression_mx.shape[1]))[0]
    return expression_mx.iloc[retain_index, :]
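To see what this filter keeps, the sketch below restates the function with intermediate variable names and applies it to a toy matrix. The matrix `toy` and its gene and sample names are made up for illustration.

```python
import numpy as np
import pandas as pd


def filter_mx(expression_mx, cutoff_ratio=0.2, counts_threshold=10):
    """Keep rows whose count exceeds counts_threshold in enough samples."""
    # Minimum number of samples in which a row must pass the threshold.
    n_required = round(cutoff_ratio * expression_mx.shape[1])
    # Number of samples per row with a count above the threshold.
    n_detected = np.sum(expression_mx > counts_threshold, axis=1)
    retain_index = np.where(n_detected >= n_required)[0]
    return expression_mx.iloc[retain_index, :]


# Toy count matrix (assumption): rows are genes, columns are samples.
toy = pd.DataFrame(
    {
        "s1": [0, 50, 3, 12],
        "s2": [2, 40, 0, 0],
        "s3": [1, 60, 5, 0],
        "s4": [0, 55, 11, 0],
        "s5": [4, 45, 2, 0],
    },
    index=["gene_a", "gene_b", "gene_c", "gene_d"],
)

# With 5 samples and cutoff_ratio=0.4, a gene needs counts > 10 in 2 samples.
filtered = filter_mx(toy, cutoff_ratio=0.4, counts_threshold=10)
print(filtered.index.tolist())
```

Here only `gene_b` passes, because `gene_c` and `gene_d` each exceed the threshold in a single sample. With the default `cutoff_ratio=0.2`, one sample is enough, so those two genes are retained as well.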
