Repository for the Cambridge Digital Humanities Workshop: Harvesting Data and Visualising Cultural Transmission
January 30 & February 6, 2023
This repository holds the code for the first session, on web-scraping. It uses the Fitzwilliam Museum database (https://github.com/FitzwilliamMuseum). The Museum has an API that could be used instead, but for the purposes of the workshop we scrape the site to demonstrate the technique.
The files named CDH_webscape contain the practical in several file formats, each of which walks the reader through the exercise. The HTML document is the recommended version.
The Images folder contains the images used in the RMarkdown file for the practical.
fitz_scapinpotg.R contains the full code to scrape the Fitzwilliam Museum database for pottery pieces. The script takes several hours to run.
Point of contact: Leah Brainerd, lmb211@cam.ac.uk