Megh-Bhatt / health_insurance_predictor

It's a machine learning model using the Random Forest algorithm to predict health insurance premium amounts based on individual characteristics like age, sex, BMI, and smoker status. It includes data exploration, feature engineering, model training, and prediction.

Health Insurance Predictor

This repository explores predicting health insurance premium amounts from individual characteristics. The code uses the Random Forest algorithm, with scikit-learn for modeling and pandas for data manipulation and analysis. The dataset used for this task comes from Kaggle. Each record contains the person's age, gender, Body Mass Index (BMI), number of children, smoking status, and region of residence, along with the insurance premium charges.
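As a rough illustration of the dataset described above, the sketch below builds a tiny table with the same kinds of fields and converts the text columns to numbers. The column names (`age`, `sex`, `bmi`, `children`, `smoker`, `region`, `charges`), the sample rows, and the helper `encode_features` are assumptions made for this example; they are not taken from the repository's code or from the Kaggle file.

```python
import pandas as pd

# Hypothetical rows for illustration only; they are NOT taken from the Kaggle dataset.
sample = pd.DataFrame(
    {
        "age": [19, 33, 45, 52],
        "sex": ["female", "male", "male", "female"],
        "bmi": [27.9, 22.7, 30.1, 35.2],
        "children": [0, 1, 2, 3],
        "smoker": ["yes", "no", "no", "yes"],
        "region": ["southwest", "northwest", "southeast", "northeast"],
        "charges": [16800.0, 4400.0, 8200.0, 41000.0],
    }
)


def encode_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a numeric feature table (an assumed encoding, not necessarily the repository's)."""
    features = df.drop(columns=["charges"]).copy()
    # Map the two-valued text columns to 0/1.
    features["sex"] = features["sex"].map({"female": 0, "male": 1})
    features["smoker"] = features["smoker"].map({"no": 0, "yes": 1})
    # One-hot encode region so that no artificial ordering is implied.
    features = pd.get_dummies(features, columns=["region"], dtype=int)
    return features


X = encode_features(sample)   # model inputs
y = sample["charges"]         # prediction target
print(X.head())
```

Tree-based models such as Random Forest do not need feature scaling, so a simple numeric encoding like this is usually enough as a starting point.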

Highlights

What the dataset looks like: Screenshot 2024-02-18 182352
Regression function: Screenshot 2024-02-18 182651
Final results: Screenshot 2024-02-18 182501

Libraries & Roles:

  • scikit-learn: Provides the RandomForestRegressor used to predict premium charges.
  • pandas: Handles data loading, manipulation, and exploration through DataFrames.
  • NumPy: Supplies the numerical computations and array operations used in feature engineering and model training.
  • plotly.express: Creates interactive visualizations such as histograms and pie charts for data exploration.
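To show how these libraries fit together, here is a minimal sketch that trains a RandomForestRegressor on synthetic data. The data, the formula that generates `charges`, the feature subset, and the hyperparameters are all assumptions chosen for illustration; they do not reproduce the repository's code or results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only; the real project uses the Kaggle dataset.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame(
    {
        "age": rng.integers(18, 65, size=n),
        "bmi": rng.uniform(18.0, 40.0, size=n),
        "smoker": rng.integers(0, 2, size=n),  # 0 = non-smoker, 1 = smoker
    }
)
# Made-up relationship between the features and the premium, plus noise.
data["charges"] = (
    250 * data["age"]
    + 300 * data["bmi"]
    + 20000 * data["smoker"]
    + rng.normal(0, 500, size=n)
)

X = data[["age", "bmi", "smoker"]]
y = data["charges"]

# Hold out 20% of the rows to evaluate the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
print(f"MAE on synthetic hold-out set: {mae:.2f}")

# Predict for two hypothetical people who differ only in smoking status.
query = pd.DataFrame({"age": [40, 40], "bmi": [28.0, 28.0], "smoker": [0, 1]})
non_smoker_pred, smoker_pred = model.predict(query)
print(non_smoker_pred, smoker_pred)
```

Fixing `random_state` keeps the split and the forest reproducible between runs, which makes it easier to compare experiments.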

How to use on your local device:

  • Download the repository.
  • Create a new folder and extract the files into it.
  • Install the necessary packages with pip install -r requirements.txt.
  • Once the requirements are installed, run the code to train and experiment with the model.


Languages

Language: Python 100.0%