Prometheussx / knn-customer-segmentation

This repository contains the code for a K-Nearest Neighbors (KNN) model built to classify customer segments in Türkiye using the teleCust1000T dataset. The project includes data cleaning, visualization, feature scaling, model training, and evaluation with accuracy metrics.

Customer Segmentation With K-NN

Project: Customer Segmentation with K-Nearest Neighbors Algorithm

This project aims to segment customers in the teleCust1000T dataset using the K-Nearest Neighbors (KNN) algorithm. The project involves data visualization, feature analysis, model training and evaluation, and identification of the optimal number of neighbors for KNN.
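For readers new to the algorithm, the idea behind KNN can be sketched in a few lines of plain Python: measure the distance from a new point to every training point, keep the K closest, and take a majority vote over their labels. This is an illustrative sketch only, not the repository's code; the function name `knn_predict` and the toy points are made up for the example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_X, train_y)
    ]
    # Keep the k closest points and count how often each label occurs
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up points: two features per customer, two segments "A" and "B"
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))
print(knn_predict(train_X, train_y, (5.1, 5.0), k=3))
```

Because the vote depends on raw distances, features on larger scales (income, for example) would dominate unless the features are standardized first, which is why the project scales them before training.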

Data and Libraries

The project utilizes the following:

  • Data: teleCust1000T.csv, which contains customer information such as tenure, age, income, and customer category.
  • Libraries: NumPy, pandas, scikit-learn, seaborn, Matplotlib
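As a rough illustration of how the data might be loaded and split into features and labels with pandas, here is a minimal sketch. The rows below are made up, and the column names (`tenure`, `age`, `income`, `custcat`) are assumptions for the example; check them against the header of the real teleCust1000T.csv.

```python
import io

import pandas as pd

# Made-up rows standing in for teleCust1000T.csv; column names are assumptions
sample_csv = io.StringIO(
    "tenure,age,income,custcat\n"
    "13,44,64,1\n"
    "11,33,136,4\n"
    "68,52,116,3\n"
    "33,33,33,1\n"
)

# With the real file this would be: pd.read_csv("teleCust1000T.csv")
df = pd.read_csv(sample_csv)

# Separate the feature matrix from the customer category labels
X = df.drop(columns=["custcat"]).to_numpy(dtype=float)
y = df["custcat"].to_numpy()

print(df.describe())                  # summary statistics per column
print(df["custcat"].value_counts())   # number of customers per category
```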

You can install these libraries using pip:

pip install numpy
pip install pandas
pip install scikit-learn
pip install seaborn
pip install matplotlib

Project Structure

The project is organized into the following sections:

  1. Data Import and Exploration: Reads the CSV data, analyzes data distribution, and identifies potential outliers.

  2. Feature Selection: Selects relevant features for the KNN model.

  3. Data Preprocessing: Standardizes numeric features and encodes categorical features.

  4. Train-Test Split: Divides data into training and testing sets for model training and evaluation.

  5. KNN Model Training: Trains a KNN model with different values of K.

  6. Model Evaluation: Evaluates the performance of trained models using metrics like accuracy and confusion matrix.

  7. Finding Optimal K: Identifies the optimal number of neighbors for KNN based on model performance.

  8. Visualization: Plots data distributions, accuracy curves, and confusion matrices for different K values.

  9. Results and Conclusion: Summarizes key findings and interpretations of the KNN model's performance.
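Steps 3 to 7 above can be sketched with scikit-learn roughly as follows. This is a minimal sketch under stated assumptions, not the repository's actual script: it uses synthetic data from `make_classification` in place of the real CSV, assumes four customer categories, and the K range of 1 to 20, the 80/20 split, and the random seed are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer data: 1000 rows, 4 classes (assumption)
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=5, n_redundant=0,
    n_classes=4, random_state=42,
)

# Hold out 20% of the rows for evaluation, keeping class proportions similar
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training rows only, then apply it to both sets
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Train one KNN model per K and record its test accuracy
k_values = range(1, 21)
accuracies = []
for k in k_values:
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train_s, y_train)
    accuracies.append(accuracy_score(y_test, model.predict(X_test_s)))

# Pick the K with the highest accuracy and inspect its confusion matrix
best_k = k_values[int(np.argmax(accuracies))]
best_model = KNeighborsClassifier(n_neighbors=best_k).fit(X_train_s, y_train)
cm = confusion_matrix(y_test, best_model.predict(X_test_s))

print(f"best k = {best_k}, accuracy = {max(accuracies):.3f}")
print(cm)
```

The sketch splits before scaling so that test rows do not influence the scaler's statistics. Note also that picking K by test accuracy gives an optimistic estimate; cross-validating on the training set (for example with `cross_val_score`) would be a cleaner way to choose it.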

Clone the project repository:

git clone https://github.com/Prometheussx/knn-customer-segmentation.git
cd knn-customer-segmentation

Key Results

  • The implemented KNN model achieved a best accuracy of [accuracy value]% with [optimal K value] neighbors.
  • The model assigns customers to categories based on their features.
  • The results suggest that KNN is a workable approach to customer segmentation and could help inform targeted marketing campaigns.

Future Work

  • This project can be extended by incorporating additional features and exploring other machine learning algorithms for customer segmentation.
  • Further analysis could be done to understand the influence of individual features on customer segmentation and develop explainable models.
  • The model could be integrated into a real-world application for customer targeting and personalized recommendations.

Images

The README.md file includes images of the data distributions, accuracy curves, and confusion matrices for different K values, as visual aids for interpreting the KNN model's performance.

License

This project is released under the MIT License.

Author

Feel free to reach out if you have any questions or need further information about the project.
