This project predicts whether a passenger was satisfied with their travel experience on the Shinkansen Bullet Train in Japan. Machine learning models are used to analyze the factors that contribute to passenger satisfaction and to predict the satisfaction of future passengers.
The Shinkansen, also known as the Bullet Train, is a high-speed train service in Japan. Passenger satisfaction varies with factors such as on-time performance, seating comfort, and overall service quality, and this project uses data analytics and machine learning to model that relationship.
The dataset includes information from two main sources:
- Traveldata: Contains information about the travel details of passengers.
- Surveydata: Contains feedback from passengers regarding various aspects of their travel experience.
Both datasets are divided into training and test sets. The Overall_Experience variable, which indicates passenger satisfaction, is available in the training set.
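As a rough illustration of how the two sources could be combined before modeling, the sketch below joins a travel table and a survey table on a shared passenger key. The key name `ID`, the `Travel_Distance` column, the helper `merge_travel_and_survey`, and the toy values are all assumptions made for this example; only `Overall_Experience` comes from the description above.

```python
import pandas as pd


def merge_travel_and_survey(travel: pd.DataFrame, survey: pd.DataFrame,
                            key: str = "ID") -> pd.DataFrame:
    """Join travel details with survey feedback, one row per passenger.

    `key` is a hypothetical shared identifier column; adjust it to the real data.
    """
    # validate="one_to_one" raises pandas.errors.MergeError if the key is
    # duplicated in either table, which guards against silently multiplied rows.
    return travel.merge(survey, on=key, how="inner", validate="one_to_one")


# Toy stand-ins for the real tables (illustrative values, not the actual data).
travel_train = pd.DataFrame({"ID": [1, 2, 3], "Travel_Distance": [272, 2200, 1061]})
survey_train = pd.DataFrame({"ID": [1, 2, 3], "Overall_Experience": [0, 1, 1]})

train = merge_travel_and_survey(travel_train, survey_train)

# Separate predictors from the target, which is available in the training set.
X = train.drop(columns=["ID", "Overall_Experience"])
y = train["Overall_Experience"]
```

The same helper could be applied to the test tables, where there would be no `Overall_Experience` column to split off.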
Key features analyzed in this project include:
- Demographic details of passengers
- Travel class and distance
- Delays in departure and arrival
- Passenger ratings on various service aspects
Several machine learning models were trained to predict passenger satisfaction, including:
- LightGBM
- XGBoost
- Random Forest
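A minimal sketch of evaluating one of these model families is shown below, using scikit-learn's `RandomForestClassifier` with 5-fold cross-validation. It runs on synthetic data rather than the Shinkansen dataset, and the hyperparameters and the choice of accuracy as the metric are assumptions, not the notebook's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary-classification data standing in for the merged passenger
# table (an assumption for illustration; this is not the real dataset).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           random_state=0)

# Illustrative hyperparameters, not tuned values from the project.
model = RandomForestClassifier(n_estimators=50, random_state=0)

# 5-fold cross-validated accuracy: one score per held-out fold.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"Random Forest CV accuracy: {mean_accuracy:.3f}")
```

LightGBM's `LGBMClassifier` and XGBoost's `XGBClassifier` follow the same scikit-learn estimator interface, so either could replace `model` here to compare the three families under the same evaluation.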
To run this project, you will need Python and the following libraries:
- pandas
- numpy
- matplotlib
- seaborn
- missingno
- statsmodels
- scipy
- hyperopt
- lightgbm
- xgboost
- scikit-learn
All of these can be installed via pip:
pip install pandas numpy matplotlib seaborn missingno statsmodels scipy hyperopt lightgbm xgboost scikit-learn
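To confirm the environment is ready before opening the notebook, a small check like the following can report which libraries are not importable. It is a sketch using only the standard library; the helper name `find_missing` is hypothetical, and note that scikit-learn is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Import names for the libraries listed above; scikit-learn is imported as "sklearn".
REQUIRED_MODULES = [
    "pandas", "numpy", "matplotlib", "seaborn", "missingno", "statsmodels",
    "scipy", "hyperopt", "lightgbm", "xgboost", "sklearn",
]


def find_missing(modules):
    """Return the top-level modules that cannot be located in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in modules if find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```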
To use this project, clone the repository and run the Jupyter notebook:
- Clone the repository:
  git clone <repository-url>
- Navigate to the project directory:
  cd Shinkansen-Passenger-Satisfaction-Prediction
- Run Jupyter notebook:
  jupyter notebook
Open the MIT Program Hackathon.ipynb notebook and execute the cells to train the models and make predictions.