fork-commit-merge / fork-commit-merge

Fork, Commit, Merge. A project designed to help you familiarize yourself with the open source contribution workflow on GitHub!

Home Page: https://forkcommitmerge.vercel.app

Fork, Commit, Merge - Hard Issue (Python)

nikohoffren opened this issue

Implementing a Decision Tree Classifier from Scratch

Note: You don't have to ask for permission or get assigned before you start solving the issue, since these issues are meant to stay open for new contributors. After your commit is merged, the actions-user bot resets the file to its previous state for the next contributor. So you can start working on the issue right away!

How to get started

For this task you need to have Python and NumPy installed. Check out the Installing Python section in the README if you need to install Python. If you already have Python installed, you can install NumPy using pip (Python's package installer) with the following terminal command:

pip install numpy

Or, if you're using Python 3 specifically and have both Python 2 and Python 3 installed, you may need to use:

pip3 install numpy

After that, open the tasks/python/hard directory from the root of the project.
Then open the decision_tree.py file and start working on your solution!

Description

Implement a Decision Tree Classifier from scratch using Python. Do not use libraries like scikit-learn that have pre-implemented classifiers; instead, use basic libraries like NumPy for numerical calculations.

Objectives:

  • Understand the theory behind Decision Trees.
  • Implement a simple yet functional Decision Tree Classifier.
  • Test the classifier on a dataset.

Use any freely available classification dataset. Ensure the dataset has at least 3 features and at least 2 classes.
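
A quick sanity check can confirm that a loaded dataset meets the two requirements above before any training happens. This is only a sketch: the helper name check_dataset and its parameters are illustrative and not part of the task template.

```python
import numpy as np


def check_dataset(X, y, min_features=3, min_classes=2):
    """Raise ValueError unless X and y satisfy the task's dataset requirements.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    X, y = np.asarray(X), np.asarray(y)
    if X.ndim != 2 or X.shape[0] != y.shape[0]:
        raise ValueError("X must be 2-D with one row per label in y")
    n_features = X.shape[1]
    n_classes = len(np.unique(y))
    if n_features < min_features:
        raise ValueError(f"need at least {min_features} features, got {n_features}")
    if n_classes < min_classes:
        raise ValueError(f"need at least {min_classes} classes, got {n_classes}")
    return n_features, n_classes
```

Calling it right after loading the data makes a mismatched file fail early instead of producing a confusing error inside the tree code.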

Implementation Steps:

  • Write functions to calculate Gini impurity or Information Gain.
  • Implement a class DecisionTree with methods for fitting and predicting.
  • Implement a function to print/visualize the tree.
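
The three steps above could be sketched roughly as follows. This is one possible shape, not the reference solution: it uses Gini impurity (Information Gain would work equally well), stores nodes as plain dicts, and the names max_depth, min_samples_split, and print_tree are assumptions made for illustration.

```python
import numpy as np


def gini(y):
    """Gini impurity of a 1-D array of class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


class DecisionTree:
    """Minimal CART-style classifier; nodes are plain dicts (an illustrative choice)."""

    def __init__(self, max_depth=3, min_samples_split=2):
        self.max_depth = max_depth
        self.min_samples_split = min_samples_split
        self.root = None

    def fit(self, X, y):
        self.root = self._grow(np.asarray(X, dtype=float), np.asarray(y), depth=0)
        return self

    def _best_split(self, X, y):
        best = None  # (weighted impurity, feature index, threshold)
        for j in range(X.shape[1]):
            values = np.unique(X[:, j])
            # Candidate thresholds: midpoints between consecutive distinct values.
            for t in (values[:-1] + values[1:]) / 2.0:
                left = X[:, j] <= t
                n_left = left.sum()
                score = (n_left * gini(y[left])
                         + (len(y) - n_left) * gini(y[~left])) / len(y)
                if best is None or score < best[0]:
                    best = (score, j, t)
        return best

    def _grow(self, X, y, depth):
        classes, counts = np.unique(y, return_counts=True)
        leaf = {"leaf": True, "label": classes[np.argmax(counts)]}  # majority class
        if depth >= self.max_depth or len(y) < self.min_samples_split or len(classes) == 1:
            return leaf
        split = self._best_split(X, y)
        if split is None or split[0] >= gini(y):  # no split reduces impurity
            return leaf
        _, j, t = split
        mask = X[:, j] <= t
        return {"leaf": False, "feature": j, "threshold": t,
                "left": self._grow(X[mask], y[mask], depth + 1),
                "right": self._grow(X[~mask], y[~mask], depth + 1)}

    def predict(self, X):
        return np.array([self._predict_one(row) for row in np.asarray(X, dtype=float)])

    def _predict_one(self, row):
        node = self.root
        while not node["leaf"]:
            node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
        return node["label"]

    def print_tree(self, node=None, indent=""):
        """Print the tree as an indented list of threshold tests and leaf labels."""
        node = self.root if node is None else node
        if node["leaf"]:
            print(f"{indent}predict {node['label']}")
            return
        print(f"{indent}x[{node['feature']}] <= {node['threshold']:.3f}")
        self.print_tree(node["left"], indent + "  ")
        self.print_tree(node["right"], indent + "  ")
```

The split search is greedy and tries every midpoint of every feature, which is simple but slow on large datasets; sorting each feature once and updating class counts incrementally is a common optimization if speed matters.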

Testing:

  • Split the dataset into training and test sets and evaluate your classifier.
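
Since scikit-learn is off the table for this task, the split can be done with NumPy alone. The sketch below assumes a simple shuffled hold-out split; the function name train_test_split and the test_ratio and seed parameters are chosen here for illustration.

```python
import numpy as np


def train_test_split(X, y, test_ratio=0.2, seed=0):
    """Shuffle the rows and split them into training and test subsets."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)       # seeded for reproducible splits
    indices = rng.permutation(len(X))       # shuffled row indices
    n_test = int(round(len(X) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]
```

Fixing the seed keeps the evaluation numbers stable between runs, which makes it easier to tell whether a code change actually improved the classifier.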

Acceptance Criteria:

  • Successfully implement a Decision Tree Classifier from scratch.
  • Evaluate the classifier on a dataset, showing reasonable performance metrics (e.g., accuracy, precision, recall).
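
The metrics named above can be computed directly from the true and predicted labels. The following is a minimal sketch; the function names and the positive parameter (the class treated as "positive" for precision and recall) are assumptions, and multi-class datasets would need these computed per class.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for one class, treated as the positive class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))  # true positives
    fp = np.sum((y_pred == positive) & (y_true != positive))  # false positives
    fn = np.sum((y_pred != positive) & (y_true == positive))  # false negatives
    # Fall back to 0.0 when a denominator is zero (a convention, not the only option).
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return float(precision), float(recall)
```

For example, with true labels [1, 0, 1, 1, 0] and predictions [1, 0, 0, 1, 1], three of five predictions match, there are two true positives, one false positive, and one false negative, so accuracy is 0.6 and precision and recall are both 2/3.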

How to run

Make sure you are in the right directory:

cd tasks/python/hard

Execute the following command to run your Python script:

python decision_tree.py

Expected output

Output should look similar to this:

The predicted class for the sample [0.4, 0.6] is 0.

If the output looks correct, you are ready to make a pull request!


To work on this issue, you need to have Python installed on your local machine.
Check out README.md for instructions on installing Python and making a pull request.

Feel free to ask questions here if you run into any problems!

Also, kindly give this project a star to enhance its visibility for new developers!