DmytroZH123 / Kaggle-Tabular-playground-series-aug-2022

Kaggle Competition - Predicting the probability that a product is successful or not, based on experiment data representing various lab testing methods.

Kaggle-Tabular-playground-series-aug-2022

Participated in the Kaggle competition "Tabular Playground Series August 2022", finishing in the top 45%.
The Keep It Dry company's main product, the Super Soaker, which is used in factories to absorb spills and leaks, needs to be improved.

This data represents the results of a large product testing study that the company has just completed for different product prototypes. For each product_code, a number of product attributes are given, as well as a number of measurement values for each individual product, representing a variety of different lab testing methods. Each product is used in a simulated real-world environment experiment, where it absorbs a certain amount of fluid to see whether or not it fails.

The task is to use the data to predict individual product failures for new product codes, given their individual lab test results.

To see the full project, open the ZhukDmytro Kaggle Tabular Playground.ipynb file.


The data:

https://www.kaggle.com/competitions/tabular-playground-series-aug-2022/data

  1. product_code - Code identifying the product
  2. loading - Amount of fluid absorbed
  3. attribute_0, attribute_1, attribute_2, attribute_3 - Product attributes
  4. measurement_0, measurement_1, ..., measurement_17 - Individual measurement values
  5. failure - Target value
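
As a rough illustration of the column layout listed above, the following sketch builds the expected column names and splits a DataFrame into features and target. The helper name `split_features_target` and the constants are hypothetical and are not taken from the notebook; only the column names come from the list above.

```python
import pandas as pd

# Column names as described in the list above.
ATTRIBUTES = [f"attribute_{i}" for i in range(4)]        # attribute_0 .. attribute_3
MEASUREMENTS = [f"measurement_{i}" for i in range(18)]   # measurement_0 .. measurement_17
FEATURES = ["product_code", "loading"] + ATTRIBUTES + MEASUREMENTS
TARGET = "failure"


def split_features_target(df: pd.DataFrame):
    """Hypothetical helper: return (features, target) after checking the columns exist."""
    missing = [col for col in FEATURES + [TARGET] if col not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df[FEATURES], df[TARGET]
```

With the competition data downloaded, this might be used as `X, y = split_features_target(pd.read_csv("train.csv"))`, where the file name `train.csv` is an assumption about the download.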

Comparison of Machine Learning algorithms, with hyperparameters selected by RandomizedSearchCV, used for probability estimation:
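
For readers unfamiliar with randomized hyperparameter search, here is a minimal sketch of how scikit-learn's `RandomizedSearchCV` can tune a single model. It is not the notebook's code: the synthetic data, the search range for `C`, the scaling step, the `roc_auc` scoring, and the number of iterations are all assumptions chosen for illustration.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the competition data (assumption, for illustration only).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(penalty="l1", solver="saga", max_iter=5000)),
])

# Sample the regularization strength C on a log scale (range is an assumption).
search = RandomizedSearchCV(
    pipeline,
    param_distributions={"clf__C": loguniform(1e-3, 1e2)},
    n_iter=10,
    scoring="roc_auc",
    cv=5,
    random_state=0,
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```

Repeating the same search for each candidate algorithm and comparing the cross-validated scores gives a comparison of the kind described here.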


Summary:

The best algorithm here is Logistic Regression with C=0.33, L1 regularization, and the 'saga' solver.
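
A minimal sketch of a model with these settings in scikit-learn is shown below. Only `C=0.33`, the L1 penalty, and the `saga` solver come from the summary; the synthetic data, the `StandardScaler` step, and `max_iter` are assumptions, and the notebook's actual preprocessing may differ.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the competition data (assumption, for illustration only).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

model = make_pipeline(
    StandardScaler(),  # saga tends to converge faster on scaled features
    LogisticRegression(C=0.33, penalty="l1", solver="saga", max_iter=5000),
)
model.fit(X, y)

# Probability of the positive class (here standing in for "failure").
failure_proba = model.predict_proba(X)[:, 1]
```

The second column of `predict_proba` holds the estimated probability of the positive class, which is the kind of output a probability-based submission would need.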
