Alexis Laks's repositories
signalML
@CoditEU Project to train a real-time anomaly detection model. MelSpectrogram extraction from .wav -> Data augmentation (rolling, stretching, etc.) -> Generating multiclass labels (specific failures) with PCA+KMeans -> CNN training on the multiclass-labeled MelSpectrograms -> Configuring an HTTP endpoint with Flask & deploying with Docker -> Uploading to ACR (Azure Container Registry) & testing with ACI (Azure Container Instances)
StructureMe
Repo for my master's thesis. Program to structure text, images and tables as JSON with Data/MetaData/Summarization layers for structured server-side querying.
EuropeanComission_NLP_NetworkAnalysis
This project was carried out during my internship at PwC in collaboration with DG EAC and DG Informatics from the European Commission. Their end goal was to use data and machine learning to gain insights into education in Europe and help Policy Officers in their decision making. In this context we developed two PoCs (delivered as Shiny apps), which were deployed on the Commission's servers. The project involved web scraping, NLP and graph theory. http://34.245.7.232:3838/myapp/poc1/prototype1/
stsi-faiss
Vector-based search index with FAISS and sentence transformers
awesome-public-datasets
A topic-centric list of high-quality open datasets in public domains.
CenterParcs_NLP_SentimentAnalysis_Webscraping
This project was conducted in collaboration with Capgemini Consulting during my MSc at Ecole Polytechnique. We were asked to conduct a thorough analysis of customer reviews of Center Parcs facilities. The project is divided into days, each day corresponding to a specific task.
Eleven_Seven_Segment_OCR
This repo contains the data, model and development work for a project I did with my colleagues for Eleven Strategy as part of my MSc. We were tasked with developing a program for number recognition on images of fuel consumption. We combined computer vision (for digit detection) and deep learning (for number recognition) methods to build this program.
MapReduce_Pyspark_Large_Matrix_Multiplication
This project was part of the final exam for my Database Management course at Ecole Polytechnique. We were tasked with developing two programs for large-scale matrix multiplication, one in Python MapReduce and one in PySpark.
DataScienceForFinance
Project applying data science to finance
HackerRankStuff
My solutions to HackerRank problems
Imagenet_FineTuning_TransferLearning
Deep learning tutorial (TD) from my advanced machine learning class at Ecole Polytechnique. We were tasked with trying both transfer learning and fine-tuning to build an image classifier (Dogs vs. Cats).
Kaggle_competition_Restaurant_Visit_Forecast
For the final evaluation of my R course at Polytechnique, I was tasked with carrying out a data project that involved developing both a Shiny app and an R package. With my group, we chose to enter a Kaggle competition on forecasting the number of visitors to restaurants in Japan.
Medias_francais
Who owns what?
MetaDataExtractor
Python repo to extract metadata from a variety of documents (MS Office docs, PDF, images)
Mixed_Effects_Regression_Sales
This project was part of the final evaluation of my Statistics in Action course at Polytechnique. We were asked to fit a series of linear and nonlinear mixed-effects models to sales data.
Multiple_Comparison_Mice_Genetics
This project was part of the final evaluation of my Statistics in Action course at Ecole Polytechnique. We were tasked with running a series of single and multiple comparison tests on genetics data from mice.
Novotel_NLP_Rating_Prediction
This project was part of the work I conducted with Capgemini Consulting during my MSc at Ecole Polytechnique. We were asked to build a classifier to predict the ratings of Novotel hotels.
onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
OverTheWire
Hacking tutorial for newbies. SSH connection to configured servers. CTF
Regression_Analysis_RealEstate
This project was the midterm of my Regression course at Ecole Polytechnique. We were tasked with conducting a thorough regression analysis of real estate data, providing an appendix containing all our development work and a summarized version containing the main results.
SMS_spam_detection
This project was the final evaluation of my Python class at Polytechnique. My goal was to create a classifier to detect spam in SMS messages using two approaches. The first combined feature engineering with XGBoost. The second was a classic approach pairing NLP methods with a Naive Bayes classifier.
Unsupervised_Learning_Diecasting_data
This project was the midterm of my Introduction to Machine Learning course at Ecole Polytechnique. We were tasked with using different unsupervised learning methods to thoroughly analyze data on die-casting parts coming from various suppliers. The dataset contains many variables and very few rows.