KunalArora / LOreal-coding-test-kunal

L'Oréal Tech Accelerator Technical Test

General instructions

This test evaluates some basic skills in machine learning and programming. A few clarifications:

  • You have 3 hours to complete the test at home.
  • You can search for help on the internet (e.g. StackOverflow) or in books, as you would on a normal day at work.
  • We expect Python 3 code, but place no restrictions on frameworks or packages.

You will not be evaluated on model performance specifically. It's more about methodology and coding than raw performance.

The test is divided into two parts. In the first, you will implement a solution to a classical machine learning problem.

In the second, you will describe the pipeline and the methods you would use to address a business problem.

Once everything is done, please send your work. We will debrief it with you and ask a few questions.

Note: if you do not have access to a computer for this test, you can ask for an on-site test, with a laptop provided and a 3-hour time slot.

Problem overview

Here you will work on a very classical machine learning task: binary classification on a tabular dataset.

The dataset is provided with these instructions. Please take the time to look at the data before jumping into solving the problem. Not all rows or columns are necessarily useful for building the model.

For this assignment you will need to answer two questions:

  • First, build a model that predicts whether a person's income is lower or higher than 50k. The code is expected to be well annotated so that your strategy and the methods used to build the model are easy to identify.
  • Second, make your model interpretable using the method of your choice (feature importance, for example).
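As an illustration of the first question, here is a minimal sketch of a tabular binary classifier, assuming scikit-learn and pandas. The provided dataset is not reproduced here, so the data below is synthetic, and the column names (`age`, `hours_per_week`, `education`, `income_gt_50k`) as well as the choice of logistic regression are hypothetical stand-ins rather than the expected solution.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the provided dataset; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "hours_per_week": rng.integers(10, 60, size=n),
    "education": rng.choice(["primary", "secondary", "tertiary"], size=n),
})
# Made-up target rule, only so that the example has some signal to learn.
score = 0.03 * df["age"] + 0.05 * df["hours_per_week"] + (df["education"] == "tertiary") * 1.5
df["income_gt_50k"] = (score + rng.normal(0, 0.5, size=n) > 4.0).astype(int)

numeric = ["age", "hours_per_week"]
categorical = ["education"]

# Scale numeric columns and one-hot encode categorical ones in a single step,
# so that the same preprocessing is applied at training and prediction time.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X = df[numeric + categorical]
y = df["income_gt_50k"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Wrapping preprocessing and estimator in one `Pipeline` keeps the held-out split untouched by the fitting of the scaler and encoder, which is one way to make the methodology easy to follow in the annotated code.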

Expected output

  • Reproducible code (a Jupyter notebook is fine if well documented)
  • Trained model

Instructions

  • Train a classifier on the provided data. You may enrich the dataset, but you must then document your strategy.
  • Report the performance of your model.
  • Make your model interpretable (using feature importance, for example).
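The reporting and interpretability steps could look like the following sketch, assuming scikit-learn. The data comes from `make_classification` rather than the provided dataset, the feature names are hypothetical, and the random forest and permutation importance are just one possible choice of model and interpretation method.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for the provided dataset.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Performance on held-out data: per-class precision, recall, F1, plus ROC AUC.
print(classification_report(y_test, clf.predict(X_test)))
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"ROC AUC: {auc:.3f}")

# Permutation importance: mean drop in score when a single feature is shuffled.
result = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=0)
ranking = sorted(
    zip(feature_names, result.importances_mean), key=lambda pair: pair[1], reverse=True
)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Computing the importances on the held-out split rather than the training split shows which features the model relies on when generalizing, not just which ones it memorized.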

Business Problem overview and instructions

For this problem, you are not asked to code a solution. You only need to describe a possible pipeline and the methods (libraries) that you would use to address this problem.

The business situation is the following:

  • L'Oréal marketing is interested in a visualization tool for the Instagram beauty ecosystem (business accounts only, for privacy reasons).
  • The data consists of thousands of beauty images with no labels.
  • Describe a pipeline and the associated methods that go from data gathering to the application. You are not expected to be an expert in, or familiar with, every step of the pipeline. Be more exhaustive and precise on the data preprocessing and machine learning parts, which are inherent to the data scientist's job.
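Although no code is required for this part, the machine learning core of one possible pipeline for unlabeled images can be sketched as embed, cluster, then project to 2D for display. The sketch below assumes scikit-learn and NumPy, and assumes each image has already been converted to a fixed-length embedding (for example by a pretrained CNN); random vectors stand in for those embeddings, and the embedding size, the number of clusters, and the use of KMeans and PCA are all illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Assumption: image embeddings are already available. Random vectors drawn
# around 3 centres stand in for embeddings of 300 images, 64 dimensions each.
rng = np.random.default_rng(0)
centres = rng.normal(0, 5, size=(3, 64))
embeddings = np.vstack([c + rng.normal(0, 1, size=(100, 64)) for c in centres])

# Group similar images without labels; the number of clusters is a guess that
# would normally be tuned (for example with silhouette scores).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(embeddings)

# Project to 2D so that the clusters can be drawn as a scatter plot in the tool.
coords_2d = PCA(n_components=2, random_state=0).fit_transform(embeddings)

for k in range(3):
    members = coords_2d[cluster_ids == k]
    print(f"cluster {k}: {len(members)} images")
```

In a real pipeline the 2D coordinates and cluster ids would feed the visualization layer, with each point linked back to its source image.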
