Group coursework for the Text Technologies for Data Science course.
- Frontend: No framework needed (unless additional time allows)
- Backend: Python 3.9.5 + Flask
- DB: MongoDB
- Git Version Control
- Create your own branch and push your changes to it
- Create a Pull Request on GitHub and request a reviewer
- One reviewer should review your code; once approved, it should be merged into the main branch
conda env create -f environment.yml
- To configure the settings file, type in:
python setup_settings.py
- Go to the `settings.ini` file and update the password (refer to the MSTeams Wiki if needed). This is absolutely needed to avoid freely exposing API keys and passwords.
- Ensure MongoDB is installed on your computer
- Install MongoDB Compass for better visualization
- Install Flask, or ensure it is already installed
- Ensure you are in the project folder
- For Windows users, use PowerShell:
py -3 -m venv venv
venv\Scripts\activate
pip install Flask
- For Mac/Linux users, use Terminal:
python3 -m venv venv
. venv/bin/activate
pip install Flask
- Please ensure you have Python version 3.9.5 (other versions will likely work too)
- Install the following dependencies:
pip install pymongo
pip install dnspython
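Once the packages above are installed, a quick sanity check can confirm that the interpreter and imports are in place. The following is only a minimal sketch: the script and its function names are hypothetical and not part of the project, and the minimum version of 3.9 is an assumption based on the Python version mentioned above.

```python
import importlib.util
import sys

# Import names for the packages this README installs.
# Note that dnspython is imported as "dns", not "dnspython".
REQUIRED_MODULES = {"Flask": "flask", "pymongo": "pymongo", "dnspython": "dns"}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for pip_name, import_name in modules.items()
        if importlib.util.find_spec(import_name) is None
    ]


def python_version_ok(minimum=(3, 9)):
    """Check that the running interpreter is at least the given version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    status = "OK" if python_version_ok() else "older than expected"
    print("Python", sys.version.split()[0], status)
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Run it from inside the activated virtual environment so that it inspects the same interpreter the project will use.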
- To configure the settings file, type in:
python setup_settings.py
- Go to the `settings.ini` file and update it (please refer to the MSTeams Wiki if needed). This is absolutely needed to avoid freely exposing API keys and passwords.
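To illustrate why credentials live in `settings.ini` rather than in the source code, here is a hedged sketch of how a connection string could be built from that file. The `mongodb` section and the `user`, `password`, and `host` option names are hypothetical; check the file generated by `setup_settings.py` for the real ones. The `mongodb+srv` scheme is also an assumption, suggested by the dnspython dependency, which pymongo needs for SRV lookups.

```python
import configparser
from urllib.parse import quote_plus


def build_mongo_uri(config, section="mongodb"):
    """Build a mongodb+srv URI from credentials stored in settings.ini.

    The section and option names used here are hypothetical.
    """
    # Percent-encode credentials so characters like "@" or ":" stay valid in a URI.
    user = quote_plus(config.get(section, "user"))
    password = quote_plus(config.get(section, "password"))
    host = config.get(section, "host")
    return f"mongodb+srv://{user}:{password}@{host}/?retryWrites=true&w=majority"


def connect(settings_path="settings.ini"):
    """Read settings.ini and open a client; needs pymongo and dnspython."""
    # Imported lazily so build_mongo_uri can be used without pymongo installed.
    from pymongo import MongoClient

    # interpolation=None keeps "%" characters in passwords from being misread.
    config = configparser.ConfigParser(interpolation=None)
    config.read(settings_path)
    client = MongoClient(build_mongo_uri(config), serverSelectionTimeoutMS=5000)
    client.admin.command("ping")  # raises if the server cannot be reached
    return client
```

Because the password is only read at runtime, `settings.ini` can stay out of version control while the code itself remains safe to push.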
- Ensure your virtual environment is already set up properly
- Type the following in your console:
./run
- Assuming `settings.ini` is already set up, go to the `geniuslyrics_datacollection` section and change the `data_collection_type` option value along with the `batch_starting_initial` option value
- NOTE: `data_collection_type` can be either `sample_data` or `test_data`
- NOTE: `batch_starting_initial` can be any folder from a to z, but the initial must be clearly defined
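The two options described above can be checked before a data collection run starts. This is a minimal sketch using the standard `configparser` module: the function name `read_collection_options` is hypothetical, and treating the initial as a single lowercase letter is an assumption about how the folders are named.

```python
import configparser
import string

SECTION = "geniuslyrics_datacollection"
ALLOWED_TYPES = {"sample_data", "test_data"}


def read_collection_options(config):
    """Return (data_collection_type, batch_starting_initial) after validating both."""
    data_type = config.get(SECTION, "data_collection_type").strip()
    initial = config.get(SECTION, "batch_starting_initial").strip()

    if data_type not in ALLOWED_TYPES:
        raise ValueError(
            f"data_collection_type must be one of {sorted(ALLOWED_TYPES)}, got {data_type!r}"
        )
    # Assumption: the initial is exactly one lowercase letter, a to z.
    if len(initial) != 1 or initial not in string.ascii_lowercase:
        raise ValueError(
            f"batch_starting_initial must be a single letter a-z, got {initial!r}"
        )
    return data_type, initial


if __name__ == "__main__":
    config = configparser.ConfigParser(interpolation=None)
    config.read("settings.ini")
    print(read_collection_options(config))
```

Failing early on a bad value is cheaper than discovering it partway through a batch.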