mhahsler / dbscan

Density Based Clustering of Applications with Noise (DBSCAN) and Related Algorithms - R package

Add broom tidier methods

joeroe opened this issue

Hello, thank you for this very helpful package.

Would you be open to a pull request adding broom tidier methods for dbscan and hdbscan objects?

Yes, definitely. The important thing is that dbscan's hard dependencies do not get expanded. Adding a package under Suggests would be fine.

Great, I'll do that.

You wouldn't have to have a dependency on broom, but the recommended approach involves importing and re-exporting the generic functions from the generics package (which is designed to be as light a dependency as possible).
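To make that concrete, here is a minimal sketch of what a tidier method could look like. It is not the package's actual implementation: `tidy.dbscan` and its output columns are hypothetical, the `fit` object is a mock that only mimics the `cluster` component of a dbscan result (where cluster 0 denotes noise), and a local stand-in generic replaces `generics::tidy` so the snippet runs without any extra packages.

```r
# In a real package the generic would be imported and re-exported with
# roxygen2 tags instead of being defined locally, e.g.:
#   #' @importFrom generics tidy
#   #' @export
#   generics::tidy
# Stand-in generic so this sketch is self-contained:
tidy <- function(x, ...) UseMethod("tidy")

# Hypothetical tidier: one row per cluster with its size.
tidy.dbscan <- function(x, ...) {
  sizes <- table(x$cluster)
  data.frame(
    cluster = as.integer(names(sizes)),
    size = as.integer(sizes),
    noise = names(sizes) == "0",  # dbscan labels noise points as cluster 0
    stringsAsFactors = FALSE
  )
}

# Mock object standing in for a fitted dbscan result (assumption:
# only the `cluster` vector and the "dbscan" class are reproduced).
fit <- structure(
  list(cluster = c(0L, 1L, 1L, 2L, 2L, 2L)),
  class = "dbscan"
)

res <- tidy(fit)
print(res)
```

With the generic re-exported from generics, users who load broom (or any other package that uses the same generic) would dispatch to this method without dbscan depending on broom itself.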

It looks like it might be possible to have generics only as a Suggests using this trick, but if it were me I'd prefer the extra dependency over messing with the namespace on load. Up to you, of course.

I agree and think that importing generics is better.

Perhaps I'm misunderstanding, but broom itself implements the tidier functions, so there should never be a broom dependency in dbscan; rather, broom would reference dbscan. See the following example: https://github.com/tidymodels/broom/blob/HEAD/R/mclust-tidiers.R#L63. Since dbscan tidiers are not yet implemented in the broom package, that would be an open topic. I don't see an open issue for it, but perhaps they're open to PRs in case this is relevant.

Good point. I will look into that.

@m-muecke broom used to implement the transformation methods, but due to the volume it no longer accepts them.

Ah, my mistake. I must have missed that. The generics package does seem very light, as it only imports from methods. Otherwise, there would always be the option to create an extension package for it, like broomstick or broom.mixed. For clustering I don't believe there is one yet, but there is tidyclust in case you are looking for tidymodels compatibility, although it doesn't support dbscan. On a related note, there is support for dbscan in mlr3cluster.

I have added tidiers to the package. Check them out @joeroe

@m-muecke If you want to look at how to get integration with tidyclust, that would be great. I assume that since we now have tidiers, that should be possible.