RomainChor / DataScience

About Data Science.

Welcome!

About me

I am currently preparing a PhD thesis at Huawei Technologies France and Laboratoire d'informatique Gaspard Monge (LIGM). I am interested in distributed statistical learning, which I explore with information-theoretic and compressibility approaches, studying both theoretical limits and algorithms. More generally, I am eager to learn about exciting scientific topics in various fields.

News

My recent joint work with Milad Sefidgaran, Abdellatif Zaidi and Yijun Wan, titled "Lessons from Generalization Error Analysis of Federated Learning: You May Communicate Less Often!", has been accepted at ICML 2024. (May 2024)
You can find the source code for reproducing the experiments in this repository.

Links

Mail: romain.chor@yahoo.fr
Website
LinkedIn
Google Scholar

What you will find here

Here you will find some work produced over recent years, along with helpers I actually use.

  • toolbox.py: (helper) Python functions for data analysis and inference.
    It is not "perfect" and is continuously being modified.
    Please feel free to give your opinion so I can improve it :)

  • Generalization_NeurIPS2022 folder: code for the simulations of a joint work with Milad Sefidgaran and Abdellatif Zaidi, which resulted in a paper accepted at NeurIPS 2022. The work focuses on theoretical guarantees for the generalization error of distributed learning algorithms and shows some improvement over previous work. (June 2022)

  • AI_for_medicine folder: lecture notes from a Coursera MOOC on machine learning for medical diagnosis, prognosis and treatment. (February 2021)

  • Cassava_classification_challenge folder: notebooks used for the Kaggle "Cassava leaf disease classification" challenge, in which I took part from December 2020 to February 2021. Link

  • Drugs_consumption folder: a study of the "Drugs consumption" dataset from the UCI Machine Learning Repository: data exploration followed by a classification problem. Various methods are used, from boosting to neural networks, as well as advanced aggregation models such as stacking. (March 2020)

  • Pegasos folder: a study of Pegasos and online convex optimization algorithms: theoretical results, implementation and simulations on the famous MNIST handwritten digits dataset. (March 2020)

  • no_SQL folder: lecture notes from several courses on NoSQL databases, and MongoDB cheatsheets. (March 2021)


Languages

Jupyter Notebook 99.4%, Python 0.6%