Introduction to Part I

A Few Words Up Front

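Today the problem is not that data are too scarce but that they are too large, and not that information is lacking but that it is overwhelming and cluttered. The key abilities for a modern person are “mining” and “discernment”. For bioinformatics work, the most important and most useful basic tools and skills have always been, and I believe always will be: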

  1. Google

  2. Wikipedia

Aim

These basic data skills give you freedom.

  • Running bioinformatics software isn’t all that difficult, doesn’t take much skill, and it doesn’t embody any of the significant challenges of bioinformatics. … These data skills give you freedom.

  • The two qualities emphasized here are reproducibility and robustness.

  • So what is a reproducible bioinformatics project? At the very least, it’s sharing your project’s code and data.

  • In wet lab biology, when experiments fail, it can be very apparent, but this is not always true in computing. Electrophoresis gels that look like Rorschach blots rather than tidy bands clearly indicate something went wrong. Unfortunately, without prior expectations, it can be quite difficult to distinguish good results from bad results.

  • The easy way to ensure everything is working properly is to adopt a cautious attitude and check everything between computational steps.

  • You will almost certainly have to rerun an analysis more than once.

  • Write Code for Humans, Write Data for Computers

  • Use Existing Libraries Whenever Possible

  • Treat Data as Read-Only

  • Document Everything (too geeky?). Just as a well-organized laboratory makes a scientist’s life easier, a well-organized and well-documented project makes a bioinformatician’s life easier.

-- <<Bioinformatics Data Skills>>
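
Two of the habits above, treating data as read-only and checking everything between computational steps, can be sketched in a few lines of Python. This is only a minimal illustration and is not taken from the book: the file names, the toy FASTA records, and the helper functions (sha256_of, make_read_only, count_fasta_records, filter_short_sequences, run_demo) are all hypothetical, and the filter assumes every sequence sits on a single line.

```python
import hashlib
import stat
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_read_only(path: Path) -> None:
    """Clear the write permission bits so raw data is not edited by accident."""
    mode = path.stat().st_mode
    path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def count_fasta_records(path: Path) -> int:
    """Count FASTA header lines (lines that start with '>')."""
    with path.open() as handle:
        return sum(1 for line in handle if line.startswith(">"))


def filter_short_sequences(src: Path, dst: Path, min_len: int) -> None:
    """Hypothetical pipeline step: keep records with at least min_len bases.

    Assumes each record is one header line followed by one sequence line.
    """
    with src.open() as fin, dst.open("w") as fout:
        header = None
        for line in fin:
            line = line.rstrip("\n")
            if line.startswith(">"):
                header = line
            elif header is not None:
                if len(line) >= min_len:
                    fout.write(f"{header}\n{line}\n")
                header = None


def run_demo() -> dict:
    """Run the toy step and check its output before moving on."""
    workdir = Path(tempfile.mkdtemp())
    raw = workdir / "raw.fa"  # made-up example data
    raw.write_text(">seq1\nACGTACGT\n>seq2\nAC\n>seq3\nACGTAC\n")

    # Record a checksum, then protect the raw file.
    checksum_before = sha256_of(raw)
    make_read_only(raw)

    # Results go to a new file; the raw file is never modified.
    filtered = workdir / "filtered.fa"
    filter_short_sequences(raw, filtered, min_len=4)

    # Checks between steps: fail loudly instead of passing bad data along.
    n_raw = count_fasta_records(raw)
    n_filtered = count_fasta_records(filtered)
    assert 0 < n_filtered <= n_raw, "filter produced an implausible record count"
    assert sha256_of(raw) == checksum_before, "raw data changed during the step"

    return {
        "raw": n_raw,
        "filtered": n_filtered,
        "raw_writable_bit": bool(raw.stat().st_mode & stat.S_IWUSR),
    }


if __name__ == "__main__":
    print(run_demo())
```

The design choice is that each step reads from one file and writes to another, so a failed check points at a single step and the original data can always be used to rerun the analysis.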

Step 1

Topics:

  1. Setup - How to do jobs efficiently and reproducibly

  2. Linux - How to work with command lines

Preparation:

  1. Practice with an editor (e.g. Vim, Atom)

  2. Read the following chapters of 《鸟哥的Linux私房菜-基础学习篇》 (VBird's Linux guide, basics volume)

Chapter 5: 5.3.1 man page

Chapter 6

6.1 Users and groups
6.2 Linux file permission concepts
6.3 Linux directory layout

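Chapter 7: Linux file and directory management

7.1 Directories and paths
7.2 File and directory management
7.3 Viewing file contents
7.5 Finding commands and files
7.6 How permissions relate to commands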

Chapter 8

8.2 Basic file system operations

Chapter 9

9.1 Purpose and techniques of file compression
9.2 Common compression commands on Linux
9.3 The archiving command: tar

Chapter 10: The vim editor (or the documentation for another editor)

Chapter 11: Getting to know and learning bash

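Chapter 25: Linux backup strategies

25.2.2 Differential backups of a full backup
25.3 VBird's backup strategy
25.4 Disaster recovery considerations
25.5 Key points review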

Step 2

Topics:

  1. Bash (and GitHub) - How to set up multiple jobs as a pipeline

  2. R - How to make professional and beautiful plots

Preparation:

  1. Read and work through the following chapters of 《鸟哥的Linux私房菜-基础学习篇》:

Chapter 11: Getting to know and learning bash
Chapter 12: Regular expressions and file formatting
Chapter 13: Learning shell scripts

  2. Read and work through the following sections of Quick R:

Learning R
R Interface
Data Input
Statistics
Descriptive Statistics

Step 3

Topics:

  1. Perl/Python - How to program for bioinformatics

Preparation:

  1. 《Beginning Perl for Bioinformatics》

Related book downloads: