- Make sure Docker Desktop and Skaffold are installed.
- Make sure Docker Desktop has write access to the /tmp folder, since the k8s persistent volumes store files under that path. You can change the path in these two files: mysql-pv.yaml and celery-pv.yaml
- Enable ingress first: $ kubectl apply -f infra/others/ingress-config-macos.yaml
- Start the app: $ skaffold dev --no-prune=false --cache-artifacts=false
  You should then be able to see the webpage at localhost:3000
- Assume that the database structure is the same as the "order" CSV file provided.
- Assume that memory efficiency, without breaking the user experience, is preferred over speed, so large dataframes are processed in batches.
- Besides some unit tests, the following files are provided for testing:
  - One xxx.bin file to test whether non-CSV files are rejected.
  - One empties.csv file for empty cases.
  - One mixed_empties file for other empty cases, such as whitespace-only cells or the NA string. However, I did not validate every malformed NA string, such as "nA" or "nUlL", since I assume these kinds of mistakes are rare.
  - 500000 sales xx.csv: the file provided.
  - orignal.csv: a smaller file that contains a subset of the 500000xxx.csv data.
- Although large files must be handled, I still set a limit of 500 MB in dataservice/app/app.py, since larger files are a bit slow to handle in my local environment.
- Secrets are provided in infra/k8s/secrets.yaml and are committed to GitHub together with the other files, although this is not recommended. I committed them only to simplify the installation process.
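
To illustrate the batch-processing and empty-value assumptions above, here is a minimal sketch of how a large CSV could be scanned in fixed-size chunks with pandas. It is not the code in dataservice/app/app.py: the function name `count_missing_in_batches`, the chunk size, and the column names in the usage example are hypothetical. It relies on pandas' default NA tokens (which include the empty string and "NA", matched case-sensitively) and additionally treats whitespace-only cells as missing, so strings such as "nA" are left untouched, in line with the assumption stated above.

```python
import io

import pandas as pd


def count_missing_in_batches(csv_source, chunksize=50_000):
    """Count missing cells per column while holding only one chunk in memory.

    Hypothetical helper for illustration; the chunk size is an assumed value.
    """
    totals = {}
    # dtype=str keeps every cell as text so whitespace-only cells can be detected.
    reader = pd.read_csv(csv_source, chunksize=chunksize, dtype=str)
    for chunk in reader:
        # Cells that contain only whitespace, e.g. " ".
        blank = chunk.apply(lambda col: col.str.fullmatch(r"\s+", na=False))
        # isna() covers pandas' default NA tokens such as "" and "NA".
        missing = chunk.isna() | blank
        for column, count in missing.sum().items():
            totals[column] = totals.get(column, 0) + int(count)
    return totals


if __name__ == "__main__":
    # Assumed sample data: one empty cell, one "NA", one blank, one "nA".
    sample = "id,amount\n1,10\n2,\n3,NA\n4, \n5,nA\n"
    print(count_missing_in_batches(io.StringIO(sample), chunksize=2))
```

Because each chunk is discarded after its counts are added to the running totals, peak memory depends on the chunk size rather than on the file size, at the cost of some speed.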