Home
This course will teach core computing skills as well as project-specific approaches. Each student will develop and complete a research project targeting journal article submission by the end of the quarter. There will be an emphasis on developing habits that increase automation, which in turn will facilitate reproducibility. The primary course platform will be GitHub, with each student creating their own repository.
T 3:00-4:20 FSH 213
Th 9:30-11:20 FSH 213
Week (slides) | Description | Reading | Quiz & PP | Recordings |
---|---|---|---|---|
zero | Biology, Course Framework, Getting set-up | Preface xiii-xxv; How to Learn Bioinformatics 1-18; Roberts and Gavery (2018) Opportunities in Functional Genomics: A Primer on Lab and Computational Aspects | Questions | video-Th |
one | Bash, version control, Project Set-up | Setting Up and Managing a Bioinformatics Project 21-35; Remedial Unix Shell 37-54 | Questions | |
two | Jupyter, Annotation | Retrieving Bioinformatics Data 109-124, Unix Data tools 125-168 | Questions | video-Th |
three | Projects | Working with Sequence Data 339-354 | Questions | video-Tues video-Th |
four | FastQC | Git for Scientists 67-83 | Questions | video-Tues video-Th |
five | cloud resources | Working with Remote Machines 57-66 | Questions | video-Tues video-Th |
six | R and find_xargs | Bioinformatics Shell Scripting, Writing Pipelines 395-423 | Questions | video-Tues video-Th |
seven | Genome Browser | Working with Alignment Data 355-383, Working with Range Data 329-338 | Questions | video-Tues |
eight | Holiday | 🦃 🦃 🌽 🍰 💻 | No Questions | video-Tues |
nine | Visualize | Considering best ways to summarize your effort | Questions | video_Tues audio_Thurs |
ten | Projects | Presentations | Questions | video_Tues video_Thurs |
🔺 subject to change based on guest availability
Bioinformatics Data Skills: Reproducible and Robust Research with Open Source Tools
By Vince Buffalo
Publisher: O'Reilly Media
Final Release Date: July 2015
Pages: 538
- Quizzes (10✖️3) = 30 DUE Friday Midnight Weekly
- Project Progress (10✖️3) = 30 DUE Friday Midnight Weekly
- Draft Product (Week 5) = 15
- Final Product (Week 10) = 25
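
As a quick check, the point values above add up to 100. The sketch below assumes that "10✖️3" means ten weekly items worth 3 points each; the variable names are illustrative only.

```shell
# Assumption: ten weekly quizzes and ten weekly progress checks, 3 points each.
quizzes=$((10 * 3))
progress=$((10 * 3))
draft=15
final=25

# Sum the four graded components.
total=$((quizzes + progress + draft + final))
echo "Total points: $total"
```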
➡️ conversion
This course will be taught using students' personal computers. This approach (as opposed to using virtual machines in the cloud) has its disadvantages and advantages.
Any modern laptop should work fine. There will be some analyses that we will not complete during the course (given time constraints); however, students should be able to clearly understand how to carry out the analysis. We will also introduce students to cloud-based options. Generally speaking, what we will be doing is more straightforward on Unix-based machines (Linux and Mac OS X), though we will also show students Windows-centric solutions.
A good text editor will be very useful. There are several built-in options, with nano recommended by Software Carpentry. For this course I suggest standalone applications.
Windows
Mac OS X
Linux
We will use Markdown. Below are some recommended editors. Text editors above would also work.
Multi-Platform
Browser-based
- Jupyter will work - see below.
Windows
Mac OS X
We will be using the "command-line", specifically the Bash shell. Below are setup instructions for different operating systems, taken from the Software Carpentry website.
Bash is a commonly-used shell that gives you the power to do simple tasks more quickly.
Windows
Download the Git for Windows installer. Run the installer. This will provide you with both Git and Bash in the Git Bash program.
Detailed Instructions
Video Tutorial
Download the Git for Windows installer.
Run the installer and follow the steps below:
- Click on "Next".
- Click on "Next".
- Keep "Use Git from the Windows Command Prompt" selected and click on "Next". If you forget to do this, programs that you need for the workshop will not work properly. If this happens, rerun the installer and select the appropriate option.
- Click on "Next".
- Keep "Checkout Windows-style, commit Unix-style line endings" selected and click on "Next".
- Keep "Use Windows' default console window" selected and click on "Next".
- Click on "Install".
- Click on "Finish".
If your "HOME" environment variable is not set (or you don't know what this is):
- Open command prompt (open the Start Menu, then type `cmd` and press [Enter]).
- Type the following line into the command prompt window exactly as shown: `setx HOME "%USERPROFILE%"`
- Press [Enter]; you should see `SUCCESS: Specified value was saved.`
- Quit command prompt by typing `exit` then pressing [Enter].
This will provide you with both Git and Bash in the Git Bash program.
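
To confirm that the `HOME` setting took effect, a check along the following lines can be run in a newly opened Git Bash window (values saved with `setx` are only seen by programs started afterwards). This is a minimal sketch; the function name `check_home` is made up for illustration.

```shell
# check_home is a hypothetical helper: it reports whether the given value
# is non-empty and points to an existing directory.
check_home() {
    if [ -n "$1" ] && [ -d "$1" ]; then
        echo "HOME is set to: $1"
    else
        echo "HOME is not set or does not point to a directory"
    fi
}

# Pass the current value of HOME (empty if the variable is unset).
check_home "${HOME:-}"
```

If the second message appears, repeat the `setx` step above and open a new Git Bash window before checking again.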
Mac OS X
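Bash is included with every version of Mac OS X, so there is no need to install anything. (It is the default shell through macOS 10.14; from macOS 10.15 onward the default is zsh, and typing `bash` in the Terminal starts Bash.) You access the shell from the Terminal (found in `/Applications/Utilities`). You may want to keep Terminal in your dock for this workshop.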
Linux
The default shell is usually Bash, but if your machine is set up differently you can run it by opening a terminal and typing `bash`.
Note: once Jupyter is installed, you should be able to run a Bash shell within it on any platform.
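
Whichever platform you use, it can help to confirm that Bash is actually available before the first class. The following is a small sketch, not an official setup check; the variable name `bash_version` is illustrative.

```shell
# Show the login shell recorded for your account (it may not be Bash).
echo "Login shell: ${SHELL:-unknown}"

# Start Bash explicitly and ask it for its version.
# This works even when Bash is not your default shell.
bash_version="$(bash -c 'echo "$BASH_VERSION"')"
echo "Bash version: $bash_version"
```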
We will be using GitHub, a web-based Git repository hosting service. It offers the distributed revision control of Git and adds its own features.
- GitHub Desktop is available for Mac and Windows
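
For those who prefer the command line, creating a local repository and making a first commit might look like the sketch below. The project name, folder layout, and identity values are placeholders, not course requirements, and the repository is created in a scratch directory so nothing existing is touched.

```shell
# Hypothetical project name; replace with your own.
project="my-project"

# Work in a temporary scratch directory.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Create the repository and move into it.
git init "$project"
cd "$project" || exit 1

# One possible layout for a small analysis project (an assumption).
# Git does not track empty directories, so only the README is committed here.
mkdir -p data code output
echo "# $project" > README.md

git add README.md
# Identity is passed inline only so the sketch runs on an unconfigured machine.
git -c user.name="Student Name" -c user.email="student@example.com" \
    commit -m "Initial commit: add README"

# Show the commit history (one line per commit).
git log --oneline
```

On your own machine you would normally set your name and email once with `git config --global` instead of passing them on each commit.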
Formerly IPython Notebook
Installation instructions are available here. If you are new to Python and Jupyter, it is recommended you use Anaconda.
The newest version of BLAST+ for all operating systems is available @ ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis.
Below are setup instructions for different operating systems, taken from the Software Carpentry website.
Windows
Install R by downloading and running this .exe file from CRAN. Also, please install the RStudio IDE.
Mac OS X
Install R by downloading and running this .pkg file from CRAN. Also, please install the RStudio IDE.
Linux
You can download the binary files for your distribution from CRAN. Or you can use your package manager (e.g. for Debian/Ubuntu run `sudo apt-get install r-base` and for Fedora run `sudo yum install R`). Also, please install the RStudio IDE.
If you are new to R or would like a refresher, have a look at the material for Data Science for SAFS, and check out R for Data Science by Garrett Grolemund and Hadley Wickham. A version of the textbook is available free online: http://r4ds.had.co.nz/
"Collectively, the bedtools utilities are a swiss-army knife of tools for a wide-range of genomics analysis tasks"
Available for download @ https://github.com/arq5x/bedtools2/releases (likely only available for Linux and Mac OS).
"high-performance visualization tool for interactive exploration of large, integrated genomic datasets"
To download the software you will need to register. See https://www.broadinstitute.org/software/igv/log-in.
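
Once everything is installed, a short script can report which of the command-line tools above are reachable from your shell. This is a minimal sketch: the helper name `have_tool` is made up, the list of command names assumes default installs, and IGV is left out because it is a graphical application.

```shell
# have_tool is a hypothetical helper: it reports whether a command is on
# the PATH and returns non-zero when it is missing.
have_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok       $1 -> $(command -v "$1")"
    else
        echo "MISSING  $1"
        return 1
    fi
}

# Count how many of the course tools cannot be found.
missing=0
for tool in git bash jupyter blastn R bedtools; do
    have_tool "$tool" || missing=$((missing + 1))
done
echo "$missing tool(s) not found on the PATH"
```

For any tool reported as found, its version can then be printed with, for example, `git --version`, `jupyter --version`, `blastn -version`, `R --version`, or `bedtools --version`.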
Here is a list of free web services we will likely use during the course