stephengill5 / bootcamp-2020

2020 MSiA boot camp

This GitHub repository contains materials for the R sessions of the 2020 Master of Science in Analytics program boot camp, including lecture notes, slides, exercises, and recommended resources for continuing to develop your skills.

Workshop materials

| Session | Content | Lecture notes | Slides | Exercises |
|---|---|---|---|---|
| Day -1 (Mon, 8/24) | Intro to the Shell | Lecture notes | | |
| Day 0 (Tue, 8/25) | Intro to git | Lecture notes | | |
| Day 1 (Wed, 8/26) | Basic syntax and data structures; reading and writing files | Lecture notes | Slides | Exercises and Answers |
| Day 2 (Thur, 8/27) | Data manipulation and simple visualization in base R | Lecture notes | Slides | Exercises and Answers |
| Day 3 (Wed, 9/2) | Loops, conditionals, and functions | Lecture notes | Slides | Exercises and Answers |
| Day 4 (Thur, 9/3) | Git workflow in RStudio/R Markdown; Reshaping and merging | Lecture notes | Slides | Exercises and Answers |
| Day 5 (Tue, 9/8) | Advanced data manipulation in dplyr | Lecture notes | Slides | Exercises and Answers |
| Day 6 (Wed, 9/9) | Advanced data manipulation in data.table | Lecture notes | Slides | Exercises and Answers |
| Day 7 (Thur, 9/10) | Data visualization with ggplot2 | Lecture notes | Slides | Exercises and Answers |
| Day 8 (Tue, 9/15) | Final exercise | | | Instructions |

Resources

Shell

NUIT's command line workshop includes some exercises and a well-curated list of resources. Below, I add some commentary on which resources I think are most useful for data analysts who need a working understanding of Bash, shells, and Unix.

Software Carpentry's Unix Shell course is a useful and matter-of-fact introduction. It probably won't convince you just how broadly useful the shell really is, though.

Learn Enough Command Line to be Dangerous, by Michael Hartl, is an excellent and realistic introduction with good exercises. I wish it had existed when I started learning this stuff. My only quibble is that the author is a macOS proselytizer, which I find unhelpful and out of step with the current landscape--Bash is for Windows users too!

Once you're comfortable with that, you can follow it up with Learn Enough Text Editor to be Dangerous. This might not be that exciting, but practicing this stuff will make you faster and more productive.

If you'd like a really accessible intro to the nuts and bolts of how all this stuff actually works, I like Julia Evans' work. She writes a blog as well as ingenious comics that teach Linux and Bash.

If you liked the DataCamp introductory R course, DataCamp also offers a free shell course.

Git

Software Carpentry's Version Control with Git course is what we followed along with earlier. Like their Bash course, it doesn't really introduce you to a real-world workflow, but it does help you understand the basic mechanics in a straightforward way.

Michael Hartl also wrote Learn Enough Git to be Dangerous. Just like his Bash and Text Editor tutorials, this is great for developing a practical understanding of the parts of Git that you really need to know.

NUIT also has a Git resource list.

Authorship

The R materials used on 8/26 - 9/8 and 9/10 are based on the Intro to R workshop from NUIT Research Computing Services, created by Christina Maimone. They have been expanded and modified by Kumar Ramanathan and Richard Morel. The materials on data.table used on 9/9 were originally developed by Ali Ehlen. The materials on ggplot2 used on 9/10 were originally developed by Kumar Ramanathan. Richard Morel, Ali Ehlen, and Kumar Ramanathan jointly developed the overarching sequence of the sessions as well as the synthetic final exercise on 9/15.

Contact information
