linear-svm machine-learning machine-learning-algorithms polynomial-kernel pulsars rbf-svm support-vector-machines

Pulsar Star Classification using SVM

This project classifies pulsar star candidates with a Support Vector Machine (SVM), a supervised learning algorithm. The goal is to automate the identification of pulsars among the candidates collected during surveys by training a predictive model.

Project Structure

- Datasets: Holds the processed and raw datasets.
  - Processed_data: Contains processed data ready for analysis.
  - Raw_data: Contains raw data files.
- v_pred_test: Stores predicted outcomes on test data.
- notebooks: Jupyter notebooks for Exploratory Data Analysis (EDA) and model training.
- venv: A virtual environment directory for project dependencies.
- .gitignore: Specifies untracked files to ignore.
- README.md: Provides an overview of the project.
- requirements.txt: Lists all the necessary Python packages.

Setup

To run this project, follow these steps:

Make sure Python 3.8 or later is installed on your machine.
Clone the repository to your local environment.
Navigate to the project's root directory and set up a Python virtual environment:
```
python -m venv venv
```
Activate the virtual environment:

On Windows:
```
.\venv\Scripts\activate
```
On macOS and Linux:
```
source venv/bin/activate
```
Install the required dependencies:
```
pip install -r requirements.txt
```
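
The Python 3.8 requirement stated in the steps above can be checked from inside the activated environment. The following is a minimal sketch; the helper name `python_is_supported` and the constant `MINIMUM_PYTHON` are hypothetical and not part of the repository.

```python
import sys

# Minimum version stated in the setup steps; the constant name is hypothetical.
MINIMUM_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if python_is_supported():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Python 3.8 or later is required, found:", sys.version.split()[0])
```

Running this with the interpreter in `venv` confirms that the environment was created from a suitable Python installation.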

Usage

To perform EDA or train the SVM model, open the Jupyter notebooks located in the notebooks directory:

- EDA_Test_Data.ipynb: For exploratory data analysis on test data.
- EDA_Train_Data.ipynb: For exploratory data analysis on training data.
- MODEL_TRAINING.ipynb: For training the SVM model.

Run the notebooks sequentially to explore the data and train the model.
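
The training notebook is not reproduced here. As a rough illustration of comparing the linear, polynomial, and RBF kernels listed in the project's topics, the sketch below fits one SVM per kernel and reports a held-out F1 score. The use of scikit-learn, the synthetic data from `make_classification`, the feature count, the class imbalance, and every hyperparameter are assumptions made for illustration; they are not taken from the notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_kernels(X, y, kernels=("linear", "poly", "rbf"), seed=0):
    """Fit one SVM per kernel and return the held-out F1 score for each."""
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    scores = {}
    for kernel in kernels:
        # SVMs are sensitive to feature scale, so standardize inside the pipeline.
        model = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0))
        model.fit(X_train, y_train)
        scores[kernel] = f1_score(y_val, model.predict(X_val))
    return scores


# Synthetic stand-in for the candidate data (assumed shape and imbalance).
X, y = make_classification(
    n_samples=1000,
    n_features=8,
    n_informative=5,
    n_redundant=1,
    weights=[0.9, 0.1],
    random_state=0,
)
scores = compare_kernels(X, y)
for kernel, score in scores.items():
    print(f"{kernel}: F1 = {score:.3f}")
```

F1 is used here instead of accuracy because the synthetic classes are imbalanced, so a classifier that always predicts the majority class would still score high on accuracy. To try the same comparison on the real data, replace the synthetic `X` and `y` with the features and labels loaded from the processed training file.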

Data

The Datasets directory is organized as follows:

- Processed_data: Processed files like pulsar_data_test_processed.csv for use in modeling.
- Raw_data: The original, unprocessed data files.

Predictions from the test data are saved in v_pred_test with filenames indicating they are predictions, such as Pulsar_data_test_Predicted.csv.
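
Writing a predictions file of this kind could look like the sketch below, which uses only the standard library. The helper `save_predictions`, the column name `predicted_class`, and the feature names in the demo rows are hypothetical; the actual notebooks may store predictions differently.

```python
import csv
import tempfile
from pathlib import Path


def save_predictions(rows, predictions, out_path, column="predicted_class"):
    """Write the feature rows plus one prediction column to a CSV file."""
    if not rows:
        raise ValueError("rows must not be empty")
    if len(rows) != len(predictions):
        raise ValueError("rows and predictions must have the same length")
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    fieldnames = list(rows[0].keys()) + [column]
    with out_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for row, label in zip(rows, predictions):
            writer.writerow({**row, column: label})
    return out_path


# Demo in a temporary directory so nothing in the repository is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    rows = [
        {"feature_a": 0.1, "feature_b": 2.0},  # placeholder feature names
        {"feature_a": 0.7, "feature_b": 1.5},
    ]
    target = Path(tmp) / "v_pred_test" / "Pulsar_data_test_Predicted.csv"
    written = save_predictions(rows, [0, 1], target)
    with written.open(newline="") as handle:
        saved = list(csv.DictReader(handle))
```

Keeping the input features alongside the predicted label in one file makes it easier to inspect individual candidates later without joining against the processed data.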

Contributing

If you'd like to contribute, please fork the repository and create a pull request with your features or changes.

License

This project is open-source software licensed under the MIT License.
