Megh-Bhatt/health_insurance_predictor

data-science health-insurance machine-learning model-training numpy pandas prediction python random-forest regression scikit-learn

Health Insurance Predictor

This repository explores predicting health insurance premium amounts from individual characteristics. The code uses the Random Forest algorithm, with scikit-learn for modeling and pandas for data manipulation and analysis.

The dataset I use for this task comes from Kaggle. For each person it records: age, gender, Body Mass Index (BMI), number of children, whether the person smokes, the region where the person lives, and the insurance premium charged.
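As a rough illustration of how data with these fields could be prepared for a model, the sketch below builds a tiny hand-written table and converts the text columns to numbers. The column names (age, sex, bmi, children, smoker, region, charges), the sample values, and the helper encode_features are assumptions made for this example; the actual CSV file and preprocessing used in the repository may differ.

```python
import pandas as pd

# Hypothetical rows that mimic the fields described above. These values are
# made up for illustration; the real data would be loaded from the Kaggle CSV.
sample = pd.DataFrame({
    "age": [19, 33, 45, 52],
    "sex": ["female", "male", "male", "female"],
    "bmi": [27.9, 22.7, 30.1, 35.2],
    "children": [0, 1, 2, 3],
    "smoker": ["yes", "no", "no", "yes"],
    "region": ["southwest", "northwest", "southeast", "northeast"],
    "charges": [17000.0, 4500.0, 8500.0, 40000.0],
})


def encode_features(df):
    """Return a numeric copy of df.

    Two-valued text columns are mapped to 0/1 and the region column is
    one-hot encoded, so every column can be fed to a scikit-learn model.
    """
    out = df.copy()
    out["sex"] = out["sex"].map({"female": 0, "male": 1})
    out["smoker"] = out["smoker"].map({"no": 0, "yes": 1})
    out = pd.get_dummies(out, columns=["region"], dtype=int)
    return out


encoded = encode_features(sample)
print(encoded.columns.tolist())
```

With the real dataset, the hand-written table would be replaced by a pd.read_csv call on the downloaded file.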

Highlights

How the Dataset Looks

Regression Function:
Final Results:

Libraries & Roles:

Scikit-learn: Provides RandomForestRegressor, the model used to predict premiums.
Pandas: Handles data loading, manipulation, and exploration through DataFrames.
NumPy: Supplies the numerical computations and array operations used in feature engineering and model training.
Plotly.express: Creates interactive visualizations such as histograms and pie charts for data exploration.
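To show how these libraries fit together, here is a minimal sketch of training a RandomForestRegressor on synthetic data. The feature names, the made-up formula that generates charges, and the hyperparameters (n_estimators=100, an 80/20 split) are assumptions for illustration only and are not taken from the repository's code or results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumed feature names).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.uniform(18.0, 40.0, size=n),
    "children": rng.integers(0, 5, size=n),
    "smoker": rng.integers(0, 2, size=n),  # 1 = smoker, 0 = non-smoker
})
# Made-up target so the example runs end to end; not real premium data.
data["charges"] = (
    250 * data["age"]
    + 300 * data["bmi"]
    + 20000 * data["smoker"]
    + rng.normal(0, 500, size=n)
)

# Separate features from the target and hold out 20% for evaluation.
X = data.drop(columns="charges")
y = data["charges"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the forest and evaluate it on the held-out rows.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
r2 = model.score(X_test, y_test)  # coefficient of determination (R^2)
print(f"R^2 on held-out data: {r2:.3f}")
```

The score printed here only reflects the synthetic data above; results on the Kaggle dataset depend on the actual features and preprocessing.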

How to use on your local device:

Download the repository.
Create a new folder and extract the files into it.
Install the necessary packages using pip install -r requirements.txt.
Once the requirements are installed, run the code to train and experiment with the model.

About

It's a machine learning model using the Random Forest algorithm to predict health insurance premium amounts based on individual characteristics like age, sex, BMI, and smoker status. It includes data exploration, feature engineering, model training, and prediction.


Languages

Language: Python 100.0%