ZahirAhmadChaudhry / Pulsar_dataset_Classification_using_SVM

This project classifies pulsar star candidates using a Support Vector Machine (SVM), a supervised learning algorithm. The goal is to automate the identification of pulsar stars among the candidates collected during surveys by training a predictive model.

Repository from GitHub: https://github.com/ZahirAhmadChaudhry/Pulsar_dataset_Classification_using_SVM

Pulsar Star Classification using SVM

Project Structure

  • Datasets: Holds the processed and raw datasets.
    • Processed_data: Contains processed data ready for analysis.
    • Raw_data: Contains raw data files.
  • v_pred_test: Stores predicted outcomes on test data.
  • notebooks: Jupyter notebooks for Exploratory Data Analysis (EDA) and model training.
  • venv: A virtual environment directory for project dependencies.
  • .gitignore: Specifies untracked files to ignore.
  • README.md: Provides an overview of the project.
  • requirements.txt: Lists all the necessary Python packages.

Setup

To run this project, follow these steps:

  1. Make sure Python 3.8 or later is installed on your machine.

  2. Clone the repository to your local environment.

  3. Navigate to the project's root directory and set up a Python virtual environment:

    python -m venv venv
  4. Activate the virtual environment:

    On Windows:

    .\venv\Scripts\activate

    On macOS and Linux:

    source venv/bin/activate
  5. Install the required dependencies:

    pip install -r requirements.txt
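As an optional sanity check after the steps above, the following minimal sketch reports whether the interpreter meets the Python 3.8 requirement and whether a virtual environment is active. The helper names `python_version_ok` and `in_virtualenv` are hypothetical and are not part of this repository.

```python
import sys


def python_version_ok(min_version=(3, 8)):
    """Return True if the running interpreter is at least min_version."""
    return sys.version_info >= min_version


def in_virtualenv():
    """Return True if the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment,
    # while sys.base_prefix points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python 3.8 or later:", python_version_ok())
    print("Virtual environment active:", in_virtualenv())
```

If the second line prints False, the activation command from step 4 has most likely not been run in the current shell.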

Usage

To perform EDA or train the SVM model, open the Jupyter notebooks located in the notebooks directory:

  • EDA_Test_Data.ipynb: For exploratory data analysis on test data.
  • EDA_Train_Data.ipynb: For exploratory data analysis on training data.
  • MODEL_TRAINING.ipynb: For training the SVM model.

Run the notebooks in the order listed: explore the data with the two EDA notebooks first, then train the model.
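For orientation, here is a rough sketch of what an SVM training step can look like. It assumes scikit-learn is available, which this README does not confirm, and it uses synthetic data because the dataset's column names are not documented here. The feature count, class imbalance, kernel, and hyperparameters are illustrative choices, not the settings used in MODEL_TRAINING.ipynb.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the pulsar candidates (assumption: 8 numeric
# features and roughly 10% positive examples, chosen only for illustration).
X, y = make_classification(
    n_samples=600,
    n_features=8,
    n_informative=5,
    n_redundant=1,
    weights=[0.9, 0.1],
    class_sep=2.0,
    random_state=0,
)

# Hold out a stratified test split so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# SVMs are sensitive to feature scale, so standardize before fitting.
# class_weight="balanced" compensates for the rare positive class.
model = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf", C=1.0, class_weight="balanced"),
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.3f}")
```

Putting the scaler and the classifier in one pipeline keeps the scaling statistics tied to the training split, so the test data does not leak into preprocessing.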

Data

The Datasets directory is organized as follows:

  • Processed_data: Processed files, such as pulsar_data_test_processed.csv, ready for use in modeling.
  • Raw_data: The original, unprocessed data files.

Predictions on the test data are saved in v_pred_test under filenames that mark them as predictions, such as Pulsar_data_test_Predicted.csv.
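One way to write such a prediction file is sketched below using only the standard library. The function name `save_predictions` and the column names `row_id` and `predicted_class` are hypothetical. The directory and filename defaults follow the names mentioned above, but the actual file layout produced by the notebooks may differ.

```python
import csv
from pathlib import Path


def save_predictions(predictions, out_dir="v_pred_test",
                     filename="Pulsar_data_test_Predicted.csv"):
    """Write one predicted label per row and return the output path."""
    out_path = Path(out_dir) / filename
    # Create the output directory if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        # Hypothetical header; adjust to match the real prediction files.
        writer.writerow(["row_id", "predicted_class"])
        for row_id, label in enumerate(predictions):
            writer.writerow([row_id, int(label)])
    return out_path
```

Calling `save_predictions(y_pred)` from the project root would then create or overwrite the file inside v_pred_test.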

Contributing

If you'd like to contribute, please fork the repository and create a pull request with your features or changes.

License

This project is open-source software licensed under the MIT License.


