jzyxn / ball-k-means


ball-k-means

  • The ball k-means algorithm is described in detail in https://ieeexplore.ieee.org/document/9139397.

  • The C++ implementation of the ball k-means algorithm is in the "C++Version" folder.

  • The Python implementation of the ball k-means algorithm is in the "PythonVersion" folder.

  • All data used in the paper is in the compressed file "data+centers(1).zip".

C++/Python version:

  • The implementations of the ball k-means algorithm are "ball_k_means_Xf.cpp"/"ball_k_means_Xf.py" and "ball_k_means_Xd.cpp"/"ball_k_means_Xd.py", which are the "float" and "double" versions respectively.

  • The "isRing" parameter switches between the ring version and the no-ring version of the algorithm.

  • In our experience, the "Xd" version gives more accurate results but runs slightly slower than "Xf". The "Xf" version is the fastest, but it may lose accuracy on data with many decimal places.
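The accuracy gap between the "Xf" and "Xd" versions can be illustrated without the library itself. The sketch below is a NumPy illustration, not code from this repository; the two points are made-up values that differ only in the ninth decimal place, and the helper name squared_distance is hypothetical. It computes the same squared distance once in single precision and once in double precision.

```python
import numpy as np

# Two hypothetical points that differ only in the 9th decimal place.
a = np.array([0.123456789, 0.987654321])
b = np.array([0.123456780, 0.987654321])


def squared_distance(x, y, dtype):
    """Squared Euclidean distance computed entirely in the given dtype."""
    x = x.astype(dtype)
    y = y.astype(dtype)
    diff = x - y
    return float(np.dot(diff, diff))


d64 = squared_distance(a, b, np.float64)  # double precision, like "Xd"
d32 = squared_distance(a, b, np.float32)  # single precision, like "Xf"

print("float64:", d64)
print("float32:", d32)
```

With these values the float32 result differs noticeably from the float64 one, because float32 keeps only about seven significant decimal digits. Distances between nearly identical points are where this kind of rounding is most visible.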

Requirements

Minimal installation requirements (C++):

  • C++ compiler supporting C++11

  • Linux or Windows operating system

  • Eigen 3 template library

Optional but recommended (C++):

Installation requirements (Python):

  • The Python version depends only on the DLL files in the "PythonVersion" folder.

Installation (C++)

Usage

C++ version:

Step 1: call the "ball_k_means" function.
Parameters:
  • dataset: the data to cluster, as an Eigen matrix.

  • centroids: the initial center points, as an Eigen matrix.

  • isRing: bool, optional. Switches between the ring version and the no-ring version of the algorithm: "true" selects the ring version, "false" selects the no-ring version. The default is false.

  • detail: bool, optional. "true" prints detailed information (including the k value, the number of distance calculations, the running time, etc.); "false" prints no detailed information. The default is false.

Output:
  • labels: the cluster label of each data point, as an Eigen matrix.
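To make the input and output shapes concrete, here is a minimal NumPy sketch of plain Lloyd's k-means, written in Python for brevity. It is a baseline for illustration only, not the ball k-means implementation from this repository; the function name lloyd_kmeans and the sample points are hypothetical. Like the function described above, it takes a data matrix and a matrix of initial centers and returns one label per data row.

```python
import numpy as np


def lloyd_kmeans(dataset, centroids, max_iter=100):
    """Plain Lloyd's k-means (a baseline, not ball k-means).

    dataset:   (n_samples, n_features) array of points to cluster
    centroids: (k, n_features) array of initial centers
    returns:   (n_samples,) integer array of cluster labels
    """
    dataset = np.asarray(dataset, dtype=np.float64)
    centroids = np.array(centroids, dtype=np.float64)  # copy, caller's array is untouched
    labels = np.full(len(dataset), -1)
    for _ in range(max_iter):
        # Squared distance from every point to every centroid, shape (n, k).
        dists = ((dataset[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its members; empty clusters stay put.
        for j in range(len(centroids)):
            members = dataset[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels


points = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
initial_centers = np.array([[0.0, 0.0], [5.0, 5.0]])
labels = lloyd_kmeans(points, initial_centers)
print(labels)
```

A baseline like this may also be useful as a sanity check: running it on the same dataset and initial centers gives a set of labels to compare against.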

Python version:

Step 1: create an instance of the "ball_k_means" class to initialize the algorithm.
Parameters:
  • isRing: bool, optional. Switches between the ring version and the no-ring version of the algorithm: "true" selects the ring version, "false" selects the no-ring version. The default is false.

  • detail: bool, optional. "true" prints detailed information (including the k value, the number of distance calculations, the running time, etc.); "false" prints no detailed information. The default is false.

Step 2: call the "fit" function.
Parameters:
  • dataset: absolute path of the CSV file containing the data to cluster.

  • centroids: absolute path of the CSV file containing the initial center points.

Output:
  • labels: the cluster label of each data point, as a NumPy matrix.
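Because "fit" takes file paths rather than arrays, the data has to be written to disk first. The following sketch shows one way to prepare the two CSV files with NumPy and obtain absolute paths. The helper name write_csv, the file names, and the random sample data are hypothetical, and the file layout (comma-separated, no header row) is an assumption that should be checked against the files in "data+centers(1).zip". The final call is left as a comment because the exact import and argument forms are not confirmed here.

```python
import os
import tempfile

import numpy as np


def write_csv(array, directory, name):
    """Save a 2-D array as a comma-separated file and return its absolute path."""
    path = os.path.abspath(os.path.join(directory, name))
    np.savetxt(path, array, delimiter=",")  # assumed layout: no header row
    return path


workdir = tempfile.mkdtemp()
rng = np.random.default_rng(0)
data = rng.random((100, 2))  # 100 hypothetical 2-D points
init = data[:3]              # first three points as initial centers (k = 3)

dataset_path = write_csv(data, workdir, "dataset.csv")
centroids_path = write_csv(init, workdir, "centroids.csv")

# Hypothetical call following Step 1 and Step 2 above (argument forms are assumptions):
# clf = ball_k_means(isRing=True, detail=True)
# labels = clf.fit(dataset_path, centroids_path)
print(dataset_path)
print(centroids_path)
```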

Examples:

Doesn't work?

About

License: Apache License 2.0


Languages

C++ 96.4%, Python 3.6%