- Student Name: Insert Name
- Student ID: Insert Student ID
- Due Date: Friday 13th of August 11:59:00 am (AEST).
- Report Link: Insert Report Link if applicable
- Language: e.g. Python 3.8.3 and/or R 4.0.5
- Packages / Libraries: e.g. pandas, pyspark, sklearn, statsmodels, folium, etc., OR add a requirements.txt
- NYC TLC: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page
- External dataset 1: (optional)
- External dataset 2: (optional)
- ...
- External dataset n: (optional)
Change this to fit your needs when you have started the project.
- raw_data: Contains all the raw data files. You may add this folder to .gitignore if your files are too large, but you must provide either code that downloads them automatically or links so that we can download them manually.
- preprocessed_data: Contains all the preprocessed data files. You may add this folder to .gitignore if your files are too large, but your script should automatically generate the files here given the correct dataset in raw_data.
- plots: Output and save all your figures here.
- code: Keep all notebooks and scripts in this folder. Ensure that you have a notebook for each stage of your code. Here's an example:
  - Notebook 1 for "Extracting Data" and "Installing Packages".
  - Notebook 2 for "Preprocessing" and/or "Exploratory Data Analysis".
  - Notebook 3 for "Analysis and Visualisation".
  - Notebook 4 for "Statistical Modelling".
- deprecated: A folder to store "old code" that you no longer use, or code that you experimented with but decided not to pursue. This is useful if you ever need to return to an older iteration of your code or want to show the other approaches you took to the problem.
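If raw_data is excluded from version control, the automatic download step could look something like the minimal sketch below. The base URL is a hypothetical placeholder (take the real file location from the NYC TLC page), and the file naming scheme and the function names `build_download_plan` and `download_all` are assumptions made for illustration.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Hypothetical placeholder: replace with the real file location from the NYC TLC page.
BASE_URL = "https://example.com/trip-data"
RAW_DIR = Path("raw_data")


def build_download_plan(months, base_url=BASE_URL, out_dir=RAW_DIR):
    """Return (url, local_path) pairs, one per 'YYYY-MM' month string."""
    plan = []
    for month in months:
        # Assumed naming scheme; adjust it to match the files you actually use.
        filename = f"yellow_tripdata_{month}.csv"
        plan.append((f"{base_url}/{filename}", Path(out_dir) / filename))
    return plan


def download_all(plan, fetch=urlretrieve):
    """Fetch every file in the plan, skipping files that already exist locally."""
    downloaded = []
    for url, path in plan:
        path.parent.mkdir(parents=True, exist_ok=True)
        if path.exists():
            continue  # already downloaded on a previous run
        fetch(url, str(path))
        downloaded.append(path)
    return downloaded


# Example usage (performs network requests, so it is left commented out):
# download_all(build_download_plan(["2019-01", "2019-02"]))
```

Skipping files that already exist makes the script safe to re-run, which helps anyone reproducing the project from a fresh clone.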
Feel free to add any other information that you deem useful.
- (You may delete these dot points once you have read and understood them)
- You should avoid uploading your datasets, as they are far too large (unless you use Git LFS). Please add them to the .gitignore file or remove them before pushing changes.
- You can delete all the .gitkeep files located inside each empty directory. These exist only to preserve the directory template, as Git does not track empty directories.
- Attach a requirements.txt if you are using non-standard Python libraries that are not officially taught or covered in this subject.
- Remember, marks are awarded for the readability of your code, as well as its reproducibility.