This project aims to develop regression models to predict diamond prices using historical transaction data.
The dataset describes diamond transactions with features such as carat weight, cut type, clarity, color, and city of transaction. We use these features to build regression models that predict the price of each diamond.
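As a rough illustration of how such features can feed a regression model, here is a minimal sketch using pandas and scikit-learn. The column names (`carat`, `cut`, `color`, `clarity`, `city`, `price`), the toy rows, and the hyperparameters are all assumptions made for this example; they are not taken from the project's data or notebooks.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Toy rows invented for illustration only; not taken from the project's data.
train = pd.DataFrame({
    "carat": [0.3, 0.7, 1.0, 1.5, 0.5, 1.2],
    "cut": ["Ideal", "Premium", "Good", "Ideal", "Good", "Premium"],
    "color": ["E", "G", "H", "F", "E", "G"],
    "clarity": ["VS1", "SI1", "SI2", "VS2", "VS1", "SI1"],
    "city": ["Paris", "Dubai", "Madrid", "Paris", "Dubai", "Madrid"],
    "price": [500, 2400, 4200, 11000, 1300, 6500],
})

# Hypothetical list of categorical columns to one-hot encode.
categorical = ["cut", "color", "clarity", "city"]
X = pd.get_dummies(train.drop(columns="price"), columns=categorical, dtype=int)
y = train["price"]

# Hyperparameters are arbitrary choices for the sketch.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X, y)

# Encode an unseen diamond and align its columns with the training matrix,
# filling categories that do not appear in this row with 0.
new = pd.DataFrame({
    "carat": [0.9], "cut": ["Ideal"], "color": ["G"],
    "clarity": ["VS2"], "city": ["Paris"],
})
X_new = pd.get_dummies(new, columns=categorical, dtype=int)
X_new = X_new.reindex(columns=X.columns, fill_value=0)

predicted = model.predict(X_new)
print(predicted)
```

Reindexing the new rows against the training columns matters because one-hot encoding a different set of rows can yield a different set of dummy columns. An XGBoost regressor could presumably be swapped in for the random forest through its scikit-learn-style interface.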
data/: This directory contains the datasets used in the project.
notebooks/: This directory contains the Jupyter notebooks used for data analysis and model construction.
1_RandomForestRegressor.ipynb: Notebook containing data preprocessing, including data loading, removal of unnecessary columns, and one-hot encoding of categorical variables.
2_Model_Training.ipynb: Notebook where regression models are trained using Random Forest and XGBoost.
solution_1.csv: CSV file containing predictions from the Random Forest model on the test dataset.
solution_2.csv: CSV file containing predictions from the XGBoost model on the test dataset.
README.md: This file, which provides information about the project.
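The preprocessing steps listed above (data loading, removal of unnecessary columns, and one-hot encoding of categorical variables) might look roughly like the sketch below. The in-memory CSV, the `id` column being dropped, and the other column names are hypothetical stand-ins; the actual files in data/ and the columns the notebook removes may differ.

```python
import io

import pandas as pd

# Stand-in for a CSV file in data/; the rows and column names are invented.
raw_csv = io.StringIO(
    "id,carat,cut,color,clarity,city,price\n"
    "0,0.3,Ideal,E,VS1,Paris,500\n"
    "1,0.7,Premium,G,SI1,Dubai,2400\n"
    "2,1.0,Good,H,SI2,Madrid,4200\n"
)

# 1. Load the data.
df = pd.read_csv(raw_csv)

# 2. Remove columns that carry no predictive signal (here, a hypothetical id).
df = df.drop(columns=["id"])

# 3. One-hot encode the categorical variables.
encoded = pd.get_dummies(
    df, columns=["cut", "color", "clarity", "city"], dtype=int
)

print(encoded.columns.tolist())
```

Each categorical column is replaced by one indicator column per category, while numeric columns such as `carat` pass through unchanged.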
This project requires the following dependencies:
- Python 3.x
- pandas
- scikit-learn
- XGBoost
Contributions are welcome! If you'd like to contribute to this project, please open an issue or submit a pull request.
└── ih_datamadpt0923_project_m3
    ├── 1_RandomForestRegressor
    ├── 2_XGBOOST
    ├── data
    ├── LICENSE
    ├── solution_1
    └── solution_2