mbalcerzak / Code4Life_MAB

Tasks: Roche Polska - Junior Data Scientist

What you need to provide for this task:
1. A report (PDF/HTML/Markdown) that includes a description of the model used.
2. The source code, written in Python or R and published on a Git platform such as GitHub or Bitbucket.

Task 1

Your customer imports flowers (irises) from China. He always buys the same three species: Virginica, Versicolor and Setosa. Unfortunately, in the last order all the flowers got mixed up and nobody knows how to tell them apart (surprisingly, even expert florists could not identify them). Luckily, there are still 150 correctly tagged flowers left from the previous order. The client instructed his workers to measure the length and width of their sepals and petals so that these measurements can be used to classify the newest batch correctly.

Goal: Based on the attached data, build a model that classifies the flower species in the last order.
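One possible starting point for Task 1 is a k-nearest-neighbours classifier: a new flower gets the species that is most common among the labelled flowers with the closest measurements. The sketch below uses only the Python standard library. The nine labelled rows, the `classify` helper, the choice of `k=3` and the `new_batch` values are illustrative assumptions, not the attached data or the method used in this repository; a real solution would load all 150 tagged flowers and evaluate the model on a held-out split.

```python
import math
from collections import Counter

# Illustrative labelled flowers (NOT the task's attached data):
# (sepal length, sepal width, petal length, petal width) in cm, plus species.
LABELLED = [
    ((5.1, 3.5, 1.4, 0.2), "Setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Setosa"),
    ((4.7, 3.2, 1.3, 0.2), "Setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "Versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Virginica"),
    ((5.8, 2.7, 5.1, 1.9), "Virginica"),
    ((7.1, 3.0, 5.9, 2.1), "Virginica"),
]


def classify(sample, labelled=LABELLED, k=3):
    """Return the majority species among the k nearest labelled flowers."""
    # Sort labelled flowers by Euclidean distance to the sample, keep the k closest.
    nearest = sorted(labelled, key=lambda row: math.dist(row[0], sample))[:k]
    votes = Counter(species for _, species in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical unlabelled flowers standing in for the mixed-up order.
new_batch = [
    (5.0, 3.4, 1.5, 0.2),
    (6.5, 3.0, 4.6, 1.4),
    (6.7, 3.1, 5.8, 2.2),
]
predictions = [classify(flower) for flower in new_batch]
print(predictions)
```

Nearest neighbours is chosen here only because it is short and easy to inspect; any multi-class classifier trained on the four measurements would fit the task statement equally well.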

Task 2

The enclosed dataset contains data about press headlines: their content and type. The aim of this task is to create a binary classification model that predicts the headline type (sarcastic / not sarcastic) from the headline content.

Goal: Based on the attached data, build a model that classifies headline types. Prepare a report describing how you approached the problem and the steps you took to solve it.
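A common baseline for this kind of text classification is a bag-of-words Naive Bayes model: count how often each word appears in sarcastic and non-sarcastic headlines, then score a new headline under both classes. The sketch below is a minimal standard-library version with Laplace (add-one) smoothing. The six training headlines are invented for illustration, and `tokenize`, `train` and `predict` are hypothetical helper names; none of this is taken from the enclosed dataset or from the repository's notebook.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the headline and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Fit word and document counts from (headline, label) pairs.

    label 1 = sarcastic, label 0 = not sarcastic.
    """
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for headline, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(headline))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict(model, headline):
    """Return the label with the higher log-probability score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in tokenize(headline):
            if word in vocab:  # words never seen in training are skipped
                smoothed = (word_counts[label][word] + 1) / (total_words + len(vocab))
                score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy headlines, for illustration only.
toy_examples = [
    ("area man heroically finishes entire sandwich", 1),
    ("nation shocked to learn monday follows sunday", 1),
    ("local cat declares itself mayor", 1),
    ("city council approves new budget", 0),
    ("scientists publish study on ocean temperatures", 0),
    ("central bank raises interest rates", 0),
]
model = train(toy_examples)
print(predict(model, "area man declares sandwich mayor"))
print(predict(model, "city council raises budget"))
```

On the real dataset, the report would also need a train/test split and a metric such as accuracy or F1, since a model that only memorises training words says little about unseen headlines.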

Task 3

Write a SQL query that retrieves information about the students who scored 4 or above on their algebra exam.
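The task does not state a schema, so the sketch below assumes one: a hypothetical `students` table and a hypothetical `exam_results` table with one row per student and subject. It runs the query against an in-memory SQLite database through Python's built-in `sqlite3` module so that it can be tried without any setup. The table names, column names and sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample rows; the task statement does not define them.
conn.executescript("""
CREATE TABLE students (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE exam_results (
    student_id INTEGER REFERENCES students(id),
    subject    TEXT NOT NULL,
    grade      REAL NOT NULL
);
INSERT INTO students VALUES (1, 'Anna'), (2, 'Piotr'), (3, 'Maria');
INSERT INTO exam_results VALUES
    (1, 'algebra', 4.5),
    (2, 'algebra', 3.0),
    (3, 'algebra', 4.0),
    (2, 'geometry', 5.0);
""")

# Students whose algebra grade is 4 or above.
QUERY = """
SELECT s.id, s.name, e.grade
FROM students AS s
JOIN exam_results AS e ON e.student_id = s.id
WHERE e.subject = 'algebra'
  AND e.grade >= 4
ORDER BY s.id;
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 'Anna', 4.5), (3, 'Maria', 4.0)]
```

The comparison uses `>=` because the task asks for "4 and above"; Piotr is excluded even though he has a 5, since that grade belongs to a different subject.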

Languages

Language:Jupyter Notebook 100.0%