motiwari / BanditPAM

BanditPAM C++ implementation and Python package

Categorical features in banditPAM

retzerjj opened this issue

I was wondering if there is a way to include categorical features, along with numeric ones, for clustering in BanditPAM.
Many thanks,
Joe

Hi @retzerjj, my apologies for the delayed response.

It's possible to include categorical features by converting them to numeric features, for example with dummy (one-hot) variables; see https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html.
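As a rough sketch of that conversion, the snippet below one-hot encodes a single categorical column with `pandas.get_dummies` and produces a purely numeric matrix. The toy data and the column names (`height`, `weight`, `color`) are made up for illustration and do not come from this issue.

```python
import pandas as pd

# Hypothetical toy data: two numeric columns and one categorical column.
df = pd.DataFrame({
    "height": [1.2, 3.4, 2.2, 0.7],
    "weight": [10.0, 12.5, 9.1, 11.3],
    "color": ["red", "blue", "red", "green"],
})

# Replace "color" with one 0/1 indicator column per category
# (color_blue, color_green, color_red); numeric columns pass through.
encoded = pd.get_dummies(df, columns=["color"], dtype=float)

# Dense float matrix that a numeric metric such as L2 can operate on.
X = encoded.to_numpy(dtype=float)
```

The resulting `X` can then be clustered like any other numeric array. One thing to keep in mind: with a distance such as $L_2$, columns with a larger numeric range contribute more to the distance than the 0/1 indicator columns, so rescaling the numeric columns first may be worth considering.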

It's not possible to use categorical features directly, because the distance metrics (e.g., $L_2$ distance) require numeric features to operate on.
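To illustrate why the encoding step is needed, here is a small sketch using only Python's standard library: the $L_2$ distance is defined on numeric vectors, so raw labels like `"red"` and `"blue"` have no distance until they are encoded. The one-hot vectors below are hypothetical examples.

```python
import math

# Hypothetical one-hot encodings of a three-level categorical feature.
red = [1.0, 0.0, 0.0]
blue = [0.0, 1.0, 0.0]

# Euclidean (L2) distance between two different categories: sqrt(2).
d_diff = math.dist(red, blue)

# Distance between identical categories: 0.
d_same = math.dist(red, red)
```

Under this encoding, any two distinct categories are the same distance apart, which is usually what one wants for unordered labels.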