zuranojp / granolarr

A reproducible resource for teaching geographic data science in R

Home Page: https://sdesabbata.github.io/granolarr

granolarr

GRANOLARR is a geoGRaphic dAta scieNce, reprOducibLe teAching resouRce in R

by Stefano De Sabbata

This work is licensed under the GNU General Public License v3.0 except where specified. Contains public sector information licensed under the Open Government Licence v3.0, see Data / README.md. See Lectures / Images / README.md, Practicals / Images / README.md and Utils / IOSlides / README.md for information regarding the images used in the materials.

This repository contains reproducible materials to teach geographic information and data science in R. Part of the materials are derived from the lectures and practical sessions for the module GY7702 Practical Programming in R of the MSc in Geographic Information Science at the School of Geography, Geology, and the Environment of the University of Leicester, by Dr Stefano De Sabbata.

This content was created using R, RStudio, RMarkdown, Bookdown, and GitHub.

Materials by topic

All the materials are available through the lectures bookdown and practical sessions bookdown pages. Links to the lecture slides and bookdown chapters for each week are listed below.

  1. Programming, a practical introduction
    • 101 Introduction
      • Lecture (slides, bookdown)
        • Basic types
        • Basic operators
        • Variables
        • Libraries
        • The pipe operator
        • Coding style
      • Practical session (bookdown)
        • The R programming language
        • Interpreting values
        • Variables
        • Basic types
        • Tidyverse
        • Coding style
    • 102 Data types
      • Lecture (slides, bookdown)
        • Vectors
        • Factors
        • Matrices
        • Arrays
        • Lists
        • Data Frames
      • Practical session (bookdown)
        • Vectors
        • Factors
        • Matrices
        • Arrays
        • Lists
        • Data Frames
    • 111 Control structures and functions
      • Lecture (slides, bookdown)
        • Conditional statements
        • Loops
        • Functions
        • Scope of a variable
      • Practical session (bookdown)
        • Conditional statements
        • Loops
        • Functions
  2. Data wrangling
    • 201 Selection and manipulation
      • Lecture (slides, bookdown)
        • Data selection
        • Data filtering
        • Data manipulation
      • Practical session (bookdown)
        • Creating R projects
        • Creating R scripts
        • Data wrangling script
    • 202 Table operations
      • Lecture (slides, bookdown)
        • Join operations
        • Table re-shaping
        • Read and write data
      • Practical session (bookdown)
        • Join operations
        • Table re-shaping
        • Read and write data
  3. Reproducibility
    • 301 Reproducible analysis
      • Lecture (slides, bookdown)
        • Reproducibility
        • Versioning
        • R and Markdown
        • Git
      • Practical session (bookdown)
        • Reproducible data analysis
        • RMarkdown
        • Git
  4. Data visualisation
    • Coming soon
  5. Data analysis
    • 501 Exploratory data analysis
      • Lecture (slides, bookdown)
        • Data visualisation
        • Descriptive statistics
        • Exploring assumptions
      • Practical session (bookdown)
        • Data visualisation
        • Descriptive statistics
        • Exploring assumptions
    • 502 Regression models
      • Lecture (slides, bookdown)
        • Comparing means
        • Correlation
        • Regression
      • Practical session (bookdown)
        • Comparing means
        • Correlation
        • Regression
  6. Machine learning
    • 601 Unsupervised
      • Lecture (slides, bookdown)
        • Machine Learning: definition and types
        • Unsupervised machine learning
        • Clustering
      • Practical session (bookdown)
        • Geodemographic classification
    • 602 Supervised
      • Coming soon

Suggested schedule

The lectures and practical sessions have been designed to follow the schedule below:

  • Programming, a practical introduction
    • 101 Introduction
    • 102 Data types
  • Data wrangling
    • 201 Selection and manipulation
    • 202 Table operations
  • Reproducibility
    • 301 Reproducible analysis
  • Programming
    • 111 Control structures and functions
  • Data analysis
    • 501 Exploratory data analysis
    • 502 Regression models
  • Machine learning
    • 601 Unsupervised

Reference books

Suggested reading

  • Programming Skills for Data Science: Start Writing Code to Wrangle, Analyze, and Visualize Data with R by Michael Freeman and Joel Ross, Addison-Wesley, 2019. See book webpage and repository.
  • Machine Learning with R: Expert techniques for predictive modeling by Brett Lantz, Packt Publishing, 2019. See book webpage.

Further reading

  • The Art of R Programming: A Tour of Statistical Software Design by Norman Matloff, No Starch Press, 2011. See book webpage.
  • Discovering Statistics Using R by Andy Field, Jeremy Miles and Zoë Field, SAGE Publications Ltd, 2012. See book webpage.
  • R for Data Science by Garrett Grolemund and Hadley Wickham, O'Reilly Media, 2016. See online book.
  • An Introduction to R for Spatial Analysis and Mapping by Chris Brunsdon and Lex Comber, Sage, 2015. See book webpage.

Reproducibility

To reproduce these materials:

  • install R, RStudio and Git
  • install the following R libraries
    • tidyverse, magrittr
    • knitr, stargazer
    • nycflights13
    • pastecs, car, psych, lmtest, lm.beta
    • e1071, dbscan
    • sp, rgdal, tmap
  • install tinytex
  • clone this repository as an RStudio project
    • open RStudio
    • make sure Git is correctly set up in Tools > Global Options... > Git/SVN
    • in Tools > Global Options... > Sweave, make sure that Typeset LaTeX into PDF using is set to XeLaTeX (practicals are also compiled locally as PDF files)
    • select File > New Project..., then Version Control and finally Git
    • paste https://github.com/sdesabbata/granolarr.git into the Repository URL field, select a folder in the Create project as subdirectory of field, and click on Create Project
  • execute Make.R
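
The package installation step above can be sketched in R as follows. This is a minimal sketch rather than part of the repository: the helper name missing_packages is hypothetical, and the assumption that Make.R sits in the project root is not confirmed by the text above. Note also that rgdal has since been archived on CRAN, so installing it on a recent R setup may need an archived version.

```r
# Packages listed in the installation steps above
granolarr_packages <- c(
  "tidyverse", "magrittr",
  "knitr", "stargazer",
  "nycflights13",
  "pastecs", "car", "psych", "lmtest", "lm.beta",
  "e1071", "dbscan",
  "sp", "rgdal", "tmap"
)

# Hypothetical helper: return the packages that are not yet installed
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(granolarr_packages)

# Only install when run from an interactive session (e.g. the RStudio console),
# so that sourcing this file in a script does not trigger downloads
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}

# The remaining steps, left commented out because they download a LaTeX
# distribution and rebuild all the materials:
# install.packages("tinytex")
# tinytex::install_tinytex()
# source("Make.R")  # assumes the working directory is the project root
```

Running the block in the RStudio console installs only the packages that are missing; the content of to_install shows which ones those are.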

Credits and acknowledgements

Stefano De Sabbata

This work is licensed under the GNU General Public License v3.0.

This repository includes teaching materials that were created by Dr Stefano De Sabbata for the module GY7702 Practical Programming in R, while working at the School of Geography, Geology, and the Environment of the University of Leicester. Stefano would also like to acknowledge the contributions made to parts of these materials by Prof Chris Brunsdon and Prof Lex Comber (see also An Introduction to R for Spatial Analysis and Mapping, Sage, 2015), Dr Marc Padilla, and Dr Nick Tate, who convened previous versions of the module (GY7022) at the University of Leicester.

Files in the Data folder have been derived from data by sources such as the Office for National Statistics, Ministry of Housing, Communities & Local Government, Ofcom, and other institutions of the UK Government under the Open Government Licence v3.0 (see the linked webpage above on the National Archives website or the LICENSE file in this folder).
