This repository contains the code and analysis for my exploratory data analysis (EDA) and regression on the medical cost dataset from Kaggle.
- Dataset: Medical Cost Dataset
- Kaggle Notebook: EDA + Regression on Medical Cost Dataset
In this project, I explored the medical cost dataset and fit regression models to predict individuals' medical costs. The dataset contains factors such as age, gender, BMI, smoking status, and chronic conditions. I trained three kinds of prediction models: linear regression, random forests, and gradient boosting machines.
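As a rough illustration of how the three model families can be compared, here is a minimal sketch using scikit-learn. It is not the notebook's actual code: the data is synthetic, the feature columns and cost formula are hypothetical stand-ins, and the hyperparameters are library defaults chosen for brevity.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the Kaggle dataset): age, BMI, smoker flag.
rng = np.random.default_rng(0)
n = 500
age = rng.integers(18, 65, size=n)
bmi = rng.normal(30, 6, size=n)
smoker = rng.integers(0, 2, size=n)
X = np.column_stack([age, bmi, smoker])

# Hypothetical cost formula, chosen only so the models have a signal to learn.
y = 250 * age + 300 * bmi + 20000 * smoker + rng.normal(0, 3000, size=n)

# Hold out 20% of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

Scoring every model on the same held-out split keeps the comparison fair. With the real dataset, categorical columns would need to be encoded before fitting.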
I achieved an R-squared score of 0.87 on the test set, meaning the predictions account for roughly 87% of the variance in test-set medical costs. However, this is a relatively small dataset, so the results may not generalize to other populations.
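For readers unfamiliar with the metric, R-squared compares the model's squared errors against those of a baseline that always predicts the mean. The sketch below computes it in plain Python. The function name `r_squared` and the numbers are illustrative only and are not taken from the notebook or the dataset.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up costs and predictions, for illustration only.
actual = [1000, 2000, 3000, 4000]
predicted = [1100, 1900, 3200, 3800]
print(round(r_squared(actual, predicted), 2))  # prints 0.98
```

A score of 1.0 means perfect predictions, and 0.0 means the model does no better than predicting the mean cost for everyone.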
I invite you to explore my notebook and analysis, and to provide suggestions, corrections, and insights. Your feedback is incredibly important to me, and I am eager to learn from the community's expertise.
Thank you!