mdh266 / KMeans

Creating A Scikit-Learn Compatible Clustering Algorithm

Writing A Scikit Learn Compatible Clustering Algorithm


About


In this post, I will go over how to write a K-means clustering algorithm from scratch using NumPy. The algorithm will be explained in the next section and, while seemingly simple, it can be tricky to implement efficiently! As an added bonus, I will go over how to implement a Scikit-Learn compatible clustering algorithm so that we can use Scikit-Learn's framework, including Pipelines and GridSearchCV.
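To give a rough idea of what this looks like, here is a minimal sketch of K-means (Lloyd's algorithm) wrapped as a Scikit-Learn style estimator. It is not the code from the notebook: the class name `SimpleKMeans`, its parameters, and the random-point initialisation are assumptions made for illustration. The only Scikit-Learn pieces it relies on are `BaseEstimator` (which supplies `get_params` and `set_params`) and `ClusterMixin` (which supplies `fit_predict`).

```python
import numpy as np
from sklearn.base import BaseEstimator, ClusterMixin


class SimpleKMeans(ClusterMixin, BaseEstimator):
    """Hypothetical minimal K-means; an illustration, not the notebook's code."""

    def __init__(self, n_clusters=3, max_iter=100, tol=1e-4, random_state=None):
        # Scikit-Learn convention: __init__ only stores hyperparameters.
        self.n_clusters = n_clusters
        self.max_iter = max_iter
        self.tol = tol
        self.random_state = random_state

    def _assign(self, X, centers):
        # Squared Euclidean distance from every point to every center via
        # broadcasting; the result has shape (n_samples, n_clusters).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(self.random_state)
        # Assumption: initialise centers as distinct random data points.
        centers = X[rng.choice(len(X), size=self.n_clusters, replace=False)]
        for _ in range(self.max_iter):
            labels = self._assign(X, centers)
            new_centers = centers.copy()
            for k in range(self.n_clusters):
                members = X[labels == k]
                if len(members) > 0:  # keep the old center if a cluster is empty
                    new_centers[k] = members.mean(axis=0)
            shift = np.linalg.norm(new_centers - centers)
            centers = new_centers
            if shift < self.tol:  # stop once the centers barely move
                break
        # Fitted attributes end with an underscore, as Scikit-Learn expects.
        self.cluster_centers_ = centers
        self.labels_ = self._assign(X, centers)
        return self

    def predict(self, X):
        return self._assign(np.asarray(X, dtype=float), self.cluster_centers_)


# Toy usage: two well-separated blobs in 2D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
model = SimpleKMeans(n_clusters=2, random_state=0).fit(X)
labels = model.predict(X)
```

Because the hyperparameters are stored unchanged in `__init__` and exposed through `get_params`, an estimator written this way can in principle be dropped into a `Pipeline` or tuned over `n_clusters` with `GridSearchCV`, provided a scoring function is supplied, since this sketch does not define `score`.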

Using The Notebook


You can install the dependencies and access the notebook using Docker by building the Docker image with the following:

docker build -t kmeans .

Then start the container with:

docker run -ip 8888:8888 -v `pwd`:/home/jovyan -t kmeans

See here for more info.

Otherwise, without Docker, make sure to use Python 3.9 and install the libraries listed in requirements.txt with:

pip install -r requirements.txt



Languages

Jupyter Notebook 99.4%, Dockerfile 0.6%