This project was carried out as part of the Data Mining course of the MoSEF Master's program at the University of Paris 1 Panthéon-Sorbonne. The main goal is to apply our knowledge of machine learning and feature engineering. The project is organized in two parts: a PCA and clustering analysis, and a regression task (predicting electricity consumption per sector).
The datasets come from open data sources such as INSEE. You can find them at the following addresses:
- https://www.insee.fr/fr/statistiques/2021703
- https://www.insee.fr/fr/statistiques/3698339
- https://www.insee.fr/fr/information/6051727
- https://opendata.agenceore.fr/explore/dataset/conso-elec-gaz-annuelle-par-secteur-dactivite-agregee-commune/
- https://agreste.agriculture.gouv.fr/agreste-web/disaron/G_2002/detail/
To run the notebook without problems, place the datasets in the data folder. To install the required dependencies, run:
pip install -r requirements.txt
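
Before launching the notebook, it can help to confirm that the downloaded files actually landed in the data folder. The snippet below is a minimal sketch of such a check, not part of the project itself: the helper name `list_data_files` is hypothetical, and it assumes the folder is called `data` and sits next to the notebook. It makes no assumption about the dataset file names, since the README does not list them.

```python
from pathlib import Path


def list_data_files(data_dir="data"):
    """Return the sorted names of the regular files found in data_dir.

    Hypothetical helper: the folder name "data" is taken from the README,
    its location relative to the notebook is an assumption.
    """
    folder = Path(data_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Expected a folder at {folder.resolve()}")
    # Skip sub-folders and hidden files such as .gitkeep or .DS_Store.
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and not p.name.startswith(".")
    )


if __name__ == "__main__":
    try:
        files = list_data_files()
    except FileNotFoundError as err:
        print(err)
    else:
        print(f"{len(files)} file(s) found in data/: {files}")
```

If the list comes back empty, or the folder is reported missing, the datasets still need to be downloaded from the addresses above.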
- Lucie Gabagnou👸 - Lucie.Gabagnou@etu.univ-paris1.fr
- Yanis Rehoune👨‍🎓 - Yanis.Rehoune@etu.univ-paris1.fr