ammar0211 / Heart-Disease-Prediction

Warm Up: Machine Learning with a Heart

Heart disease is the number one cause of death worldwide. To learn how to prevent heart disease, we must first learn to detect it reliably. This dataset comes from a heart disease study that has been open to the public for many years. The study collects various measurements of patient health and cardiovascular statistics and, of course, anonymizes patient identities.

Data is provided courtesy of the Cleveland Heart Disease Database via the UCI Machine Learning Repository. Researchers still use this dataset today.

  • Aha, D., and Dennis Kibler. "Instance-based prediction of heart-disease presence with the Cleveland database." University of California 3.1 (1988): 3-2.

In this project, I analyzed the data and then built models to make predictions. In brief, I did the following:

  • Performed feature engineering, data transformation, feature extraction, and predictive modeling using scikit-learn, NumPy, and XGBoost.
  • Performed some analyses on the dataset to draw insights.
  • Created logistic regression, kNN, random forests, and gradient boosting models and also tried blending.
  • Evaluated models using log_loss, accuracy, precision, and recall.
  • Successfully predicted the probability of heart disease in patients with high accuracy.
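
The notebook code is not reproduced in this README, so the following is only a minimal sketch of what the evaluation metrics listed above compute and of one simple way to blend models. It uses only the Python standard library. The function names (`log_loss`, `classification_metrics`, `blend`), the 0.5 threshold, the equal blending weights, and the toy labels and probabilities are all assumptions made for illustration; they are not taken from the project. In practice, scikit-learn's `sklearn.metrics` module provides `log_loss`, `accuracy_score`, `precision_score`, and `recall_score` for the same purpose.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, and recall after thresholding the probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, yh in pairs if y == 1 and yh == 1)
    fp = sum(1 for y, yh in pairs if y == 0 and yh == 1)
    fn = sum(1 for y, yh in pairs if y == 1 and yh == 0)
    tn = sum(1 for y, yh in pairs if y == 0 and yh == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def blend(prob_lists, weights=None):
    """Weighted average of per-model probabilities (equal weights by default)."""
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total_weight = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, probs)) / total_weight
        for probs in zip(*prob_lists)
    ]


# Toy data, invented for illustration: 1 = heart disease present, 0 = absent.
y_true = [1, 0, 1, 1, 0, 0]
model_a = [0.9, 0.2, 0.6, 0.4, 0.3, 0.1]  # hypothetical probabilities from one model
model_b = [0.7, 0.4, 0.8, 0.7, 0.1, 0.6]  # hypothetical probabilities from another
blended = blend([model_a, model_b])

for name, probs in [("model_a", model_a), ("model_b", model_b), ("blended", blended)]:
    print(name, round(log_loss(y_true, probs), 4), classification_metrics(y_true, probs))
```

Log loss scores the predicted probabilities directly, while accuracy, precision, and recall depend on the chosen threshold, which is why the two kinds of metric are reported side by side. On this toy data each model makes a different mistake, so the averaged probabilities classify every case correctly; real data will not necessarily behave this way.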

Languages

Language:Jupyter Notebook 100.0%