Outdated data source links in RetailHero (RU+EN) notebooks.
Ksyula opened this issue
Documentation
The Quick Start documentation page https://www.uplift-modeling.com/en/latest/quick_start.html links to the following notebooks:
https://nbviewer.jupyter.org/github/maks-sh/scikit-uplift/blob/master/notebooks/RetailHero_EN.ipynb
https://nbviewer.jupyter.org/github/maks-sh/scikit-uplift/blob/master/notebooks/RetailHero.ipynb
https://colab.research.google.com/github/maks-sh/scikit-uplift/blob/master/notebooks/RetailHero_EN.ipynb
https://colab.research.google.com/github/maks-sh/scikit-uplift/blob/master/notebooks/RetailHero.ipynb
These are the same notebooks as in the /notebooks folder of the repo, and they contain an outdated link.
The current link to the retailhero-uplift dataset (https://drive.google.com/u/0/uc?id=1fkxNmihuS15kk0PP0QcphL_Z3_z8LLeb&export=download) is outdated and returns a 404 error.
The new link to the same dataset is https://storage.yandexcloud.net/datasouls-ods/materials/9c6913e5/retailhero-uplift.zip
PR #116 fixes this for the notebooks in the project's /notebooks folder.
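For anyone who wants to check the new link outside the notebooks, here is a minimal sketch using only the Python standard library. The helper name `download_archive` is hypothetical and not part of scikit-uplift; the only thing taken from this issue is the dataset URL. It fails early if the downloaded file is not a zip archive.

```python
import urllib.request
import zipfile
from pathlib import Path

# New dataset location quoted in this issue.
DATASET_URL = (
    "https://storage.yandexcloud.net/datasouls-ods/materials/"
    "9c6913e5/retailhero-uplift.zip"
)


def download_archive(url: str, dest: Path) -> Path:
    """Download `url` to `dest` and verify that the result is a zip archive.

    Hypothetical helper for illustration only.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Raises urllib.error.HTTPError if the server answers with e.g. 404.
    urllib.request.urlretrieve(url, str(dest))
    if not zipfile.is_zipfile(dest):
        # A stale link can return an error page instead of the archive.
        raise ValueError(f"{dest} is not a valid zip archive; check the URL: {url}")
    return dest
```

Calling `download_archive(DATASET_URL, Path("retailhero-uplift.zip"))` should save the archive to the working directory, assuming the link above is still live.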
Hello all! I ran into the same problem when using this tutorial as an example.
I agree, we need to change the link to the following: https://storage.yandexcloud.net/datasouls-ods/materials/9c6913e5/retailhero-uplift.zip
But we also need to change the data-reading code in the next cell (the one that runs after the download):
df_clients = pd.read_csv('/content/data/clients.csv', index_col='client_id')
df_train = pd.read_csv('/content/data/uplift_train.csv', index_col='client_id')
df_test = pd.read_csv('/content/data/uplift_test.csv', index_col='client_id')
Otherwise we will get an error again.
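Since the paths above depend on how the new archive unpacks, a more defensive option is to find the CSV files after extraction instead of hard-coding their location. This is only a sketch: the helper name `extract_and_locate` is hypothetical, the three file names are taken from the cell above, and the archive layout is an assumption.

```python
import zipfile
from pathlib import Path

# File names used by the notebook cell quoted in this thread.
EXPECTED_FILES = ("clients.csv", "uplift_train.csv", "uplift_test.csv")


def extract_and_locate(archive: Path, target_dir: Path) -> dict:
    """Extract `archive` into `target_dir` and return {file name: path}.

    Hypothetical helper: searches recursively, so it works whether the
    CSVs sit at the top level or inside a subfolder such as data/.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)

    found = {}
    for name in EXPECTED_FILES:
        matches = sorted(target_dir.rglob(name))
        if not matches:
            raise FileNotFoundError(f"{name} not found under {target_dir}")
        found[name] = matches[0]  # first match in sorted order
    return found
```

The returned paths can then be passed to the same calls as above, for example `pd.read_csv(paths['clients.csv'], index_col='client_id')`, where `paths` is the dictionary returned by the helper.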