This repository holds the data for assignment 1 in NLP 220, as well as code for reproducing it (not needed for assignment completion).
On Mac/Linux (e.g. nlp-gpu-01):
wget "https://raw.githubusercontent.com/kingb12/nlp220_hw1_data/main/small_books_rating.csv"
On Windows:
I was unable to test this myself, but Wget for Windows looks useful and, once installed, should let you run the same command as above. Some more options are discussed here.
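If wget is not available, the same file can probably be fetched with the Python standard library alone, which should behave the same on Windows, Mac, and Linux. This is a minimal sketch, not part of the repository: `download_file` is a hypothetical helper name, and the URL is copied from the wget command above.

```python
import shutil
import urllib.request
from pathlib import Path

# URL copied from the wget command in this README.
DATA_URL = "https://raw.githubusercontent.com/kingb12/nlp220_hw1_data/main/small_books_rating.csv"


def download_file(url: str, dest: str) -> Path:
    """Stream the resource at `url` into `dest` and return the destination path.

    Hypothetical helper for illustration; it is not defined in this repository.
    """
    dest_path = Path(dest)
    # Copy the response body in chunks so the whole file is never held in memory.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

Calling `download_file(DATA_URL, "small_books_rating.csv")` from the assignment directory should leave the CSV next to your code, as the wget command does.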
To reproduce the dataset, complete the following steps, and then run python create_dataset.py.
- Sign up for a Kaggle account
- Set up an API token in your profile
- Move the API token (provided in kaggle.json) to your working computer (could be nlp-gpu-01) under ~/.kaggle.
- In your assignment environment:
pip install kaggle
kaggle datasets download -d mohamedbakhet/amazon-books-reviews
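The token-placement step above can be sketched in Python, in case copying the file by hand is inconvenient. This is a rough sketch under assumptions: `install_kaggle_token` is a hypothetical helper that is not part of this repository, and restricting the file to owner-only access is my addition (the Kaggle CLI typically warns when the key is readable by other users).

```python
import os
import shutil
from pathlib import Path


def install_kaggle_token(token_path: str, kaggle_dir: str = "~/.kaggle") -> Path:
    """Copy a downloaded kaggle.json into `kaggle_dir` and restrict its permissions.

    Hypothetical helper for illustration; it is not defined in this repository.
    """
    src = Path(token_path).expanduser()
    target_dir = Path(kaggle_dir).expanduser()
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "kaggle.json"
    # Skip the copy if the token is already in place (copying a file onto itself fails).
    if src.resolve() != target.resolve():
        shutil.copyfile(src, target)
    # Assumption: owner read/write only. On Windows, chmod only toggles the read-only flag.
    os.chmod(target, 0o600)
    return target
```

For example, `install_kaggle_token("~/Downloads/kaggle.json")` would place the token under ~/.kaggle, assuming the browser saved it to a Downloads folder.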