Writing A Scikit-Learn Compatible Clustering Algorithm

About

In this post, I will go over how to write a K-means clustering algorithm from scratch using NumPy. The algorithm is explained in the next section and, while seemingly simple, it can be tricky to implement efficiently! As an added bonus, I will go over how to make the clustering algorithm Scikit-Learn compatible so that we can use it with Scikit-Learn's framework, including Pipelines and GridSearchCV.
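To give a rough idea of where the post is heading, here is a minimal sketch of a K-means estimator that follows Scikit-Learn's conventions. It is not the notebook's implementation: the class name `SimpleKMeans`, the `_nearest` helper, the parameter defaults, and the choice of random initialization from the data points are all assumptions made for illustration. It uses the standard Lloyd iteration (assign each point to its nearest center, then recompute each center as the mean of its points) and inherits from `BaseEstimator` and `ClusterMixin` so that `get_params`, `set_params`, and `fit_predict` come for free.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClusterMixin


class SimpleKMeans(BaseEstimator, ClusterMixin):
    """Hypothetical minimal K-means (Lloyd's algorithm), for illustration only."""

    def __init__(self, n_clusters=3, max_iter=100, tol=1e-4, random_state=None):
        # Scikit-Learn convention: __init__ only stores the hyperparameters.
        self.n_clusters = n_clusters
        self.max_iter = max_iter
        self.tol = tol
        self.random_state = random_state

    def _nearest(self, X, centers):
        # Squared Euclidean distances via broadcasting: (n_samples, n_clusters).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(self.random_state)
        # Assumed initialization: pick n_clusters distinct data points at random.
        centers = X[rng.choice(len(X), size=self.n_clusters, replace=False)]
        for _ in range(self.max_iter):
            labels = self._nearest(X, centers)
            # Recompute each center; keep the old one if a cluster is empty.
            new_centers = np.array([
                X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
                for k in range(self.n_clusters)
            ])
            shift = np.linalg.norm(new_centers - centers)
            centers = new_centers
            if shift < self.tol:
                break
        # Learned attributes end with an underscore, per Scikit-Learn convention.
        self.cluster_centers_ = centers
        self.labels_ = self._nearest(X, centers)
        return self

    def predict(self, X):
        return self._nearest(np.asarray(X, dtype=float), self.cluster_centers_)
```

Because the hyperparameters are plain `__init__` arguments, an estimator like this can be placed as the final step of a `Pipeline`. Using it with `GridSearchCV` would additionally need a `score` method or an explicit `scoring` argument, which this sketch leaves out.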

Using The Notebook

You can install the dependencies and access the notebook using Docker by building the Docker image with the following:

docker build -t kmeans .

Then run the container with the command:

docker run -ip 8888:8888 -v `pwd`:/home/jovyan -t kmeans

See here for more info.

Otherwise, without Docker, make sure to use Python 3.9 and install the libraries listed in requirements.txt. These can be installed with the command:

pip install -r requirements.txt
