Literate Programming and Statistics (CMP595)

Creative Commons License

Goal

This short 15-hour course presents the fundamental philosophy behind literate programming and shows how to apply it to conduct faithful, reproducible data analysis with sound statistical procedures and modern data analytics tools. The course uses RStudio as the IDE and the R programming language for data analysis. Every lecture is backed by practical sessions and worked-out examples.

Content

  1. (First Session) General Introduction
  2. (First Session) Literate Programming - Motivation & RStudio Case Study
    • Reproducibility and Literate Programming (PDF)
    • Why R? RStudio. (PDF)
    • Hands-on: Using RStudio for running a Statistical Analysis
      • Given Example Analysis
      • Data set #1: ping-pong measurements
      • Data set #2: iteration duration of a geophysics application
  3. (Second Session) Data Carpentry and Manipulation - Cleaning up data and using the dplyr R package
  4. (Third Session) Data Quality, Descriptive Statistics
  5. (Fourth Session) Data Visualization
  6. (Fifth Session) Statistics

Schedule

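| Day | Date              | Hour               | Room               |
|-----|-------------------|--------------------|--------------------|
| 0   | 24/10 (Tuesday)   | 8:30 – 10:30 (2h)  | Lab 67-104         |
| 1   | 25/10 (Wednesday) | 8:30 – 10:30 (2h)  | Lab 67-104         |
| 2   | 30/10 (Monday)    | 8:30 – 10:30 (2h)  | Lab 67-104         |
| 3   | 31/10 (Tuesday)   | 8:30 – 12:30 (4h)  | Lab 67-103         |
| 4   | 01/11 (Wednesday) | 8:30 – 12:30 (4h)  | Lab 67-103 / AUD-1 |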

Final project

The deadline for the final project is the 15th of December, 2017.

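| Student    | Dataset                                     |    |
|------------|---------------------------------------------|----|
| Eduardo    | Boston Marathon 2017                        | ok |
| Liza       | US Homicides                                | ok |
| Fábio      | Porto Alegre accidents                      | ok |
| Gabrielli  | Rainfall in India                           | ok |
| Felipe     | Online Retail Sales in Europe               | ok |
| Rodrigo F. | US Homicides                                | ok |
| Lucas      | Professional Hockey                         | ok |
| Matheus    | RS Homicide                                 | ok |
| Rodrigo N. | Video Game Sales                            | ok |
| Lizeth     | World Happiness                             | ok |
| Emmanuell  | Land usage and Agriculture & Climate change | ok |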

References

  • Literate Programming. Donald E. Knuth. CSLI Lecture Notes, no. 27, Stanford, California. ISBN 0-937073-80-6.
  • Applied Statistics and Probability for Engineers, 6th Edition. Douglas C. Montgomery, George C. Runger. Wiley.
  • R for Data Science. Garrett Grolemund, Hadley Wickham. http://r4ds.had.co.nz/
