Greg Williams's repositories
data_modelling_with_cassandra
A project undertaken as part of the Udacity course in Data Engineering
my-travel-plans
A toy repository used within a Udacity course
Stations_Data_from_XML
This was written at the Office of Rail and Road to automate the extraction of station data from Network Rail's open data feeds.
AdHocFaresIndexGrouping
This is a short toy example of grouping and summing using the pandas data analysis library
airflow_data_piplines
Fifth project in the Udacity Data Engineering course. Builds an Airflow pipeline to extract data from S3 and load it into Redshift, with an additional data quality check
bitesofpy
Sample solutions to problems from the PyBites website
cloud_data_warehouse_project
A project undertaken as part of the Udacity Course in Data Engineering.
course-collaboration-travel-plans
A toy example of working with forks in Git from the Udacity training course in Data Engineering
data-lake
A project within the Udacity Data Engineering course. This covers the use of S3, EMR clusters and Spark sessions
data_modelling_with_postgres
A project undertaken as part of a Udacity course in data engineering
FaresIndexPreparation
This was written to automate the production of the Rail Fares National Statistics. It will not insert data into the ORR Data Warehouse outside of the ORR's secure network
QA_of_loaded_dw_data
This was written at the Office of Rail and Road to perform various QA checks on data loaded into the Data Warehouse
Webscrape_Rail_Fares
This is a web scraping application for extracting fares-related data from the National Enquiries Website