vedranav / hierarchy-decomposition-pipeline

Hierarchy decomposition pipeline is a supervised machine learning tool that constructs random forest ensembles from data sets with hierarchical class.

Home Page: https://vedranav.github.io/hierarchy-decomposition-pipeline/

Hierarchy decomposition pipeline

Data set with hierarchical class

Suitable data sets have:

  • Class labels organised in a hierarchy
  • Hierarchy in the shape of a tree or directed acyclic graph
  • Examples annotated with one or several paths from the hierarchy
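The three properties above can be illustrated with a minimal sketch. It is not taken from the pipeline's code: the class name `HierarchySketch`, its methods, and the labels `root`, `A`, `B`, `C`, `D` are hypothetical. It stores a hierarchy as a directed acyclic graph and lists every path from a root to a given class label, which shows how a single label can correspond to several paths when a node has more than one parent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the pipeline's code base.
class HierarchySketch {
    // Maps each class label to its parents; labels without an entry are roots.
    private final Map<String, List<String>> parents = new HashMap<>();

    // Registers a parent -> child edge of the hierarchy.
    void addEdge(String parent, String child) {
        parents.computeIfAbsent(child, k -> new ArrayList<>()).add(parent);
    }

    // Returns every path from a root to the given label (assumes the graph is acyclic).
    List<List<String>> pathsTo(String label) {
        List<List<String>> result = new ArrayList<>();
        List<String> ps = parents.getOrDefault(label, Collections.emptyList());
        if (ps.isEmpty()) {
            // A root: the only path is the label itself.
            List<String> path = new ArrayList<>();
            path.add(label);
            result.add(path);
            return result;
        }
        for (String p : ps) {
            // Extend each path that reaches the parent with the current label.
            for (List<String> prefix : pathsTo(p)) {
                prefix.add(label);
                result.add(prefix);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        HierarchySketch h = new HierarchySketch();
        h.addEdge("root", "A");
        h.addEdge("root", "B");
        h.addEdge("A", "C");
        h.addEdge("B", "C"); // C has two parents, so the hierarchy is a DAG, not a tree
        h.addEdge("A", "D");
        System.out.println(h.pathsTo("D")); // [[root, A, D]]
        System.out.println(h.pathsTo("C")); // [[root, A, C], [root, B, C]]
    }
}
```

In a tree every label has exactly one such path; in a directed acyclic graph an example annotated with `C` is described by both paths.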

Features

  • Five algorithms that construct ensemble models from data sets with hierarchical class
  • Tool that estimates models' predictive performance using cross-validation
  • Tool for predicting paths from the hierarchy that best describe unlabelled examples
  • Tool that computes data set properties
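For readers unfamiliar with cross-validation, the following is a generic sketch of how examples can be divided into k folds. It does not reproduce the pipeline's own splitting strategy, which is not described here; the class `KFoldSketch`, the method `folds`, and the seed value are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of plain k-fold splitting; not the pipeline's implementation.
class KFoldSketch {
    // Shuffles the example indices 0..n-1 with a fixed seed and deals them into k folds.
    static List<List<Integer>> folds(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        // Round-robin assignment keeps fold sizes within one example of each other.
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    public static void main(String[] args) {
        int n = 10;
        List<List<Integer>> folds = folds(n, 3, 42L);
        for (int f = 0; f < folds.size(); f++) {
            // Each fold serves once as the test set; the remaining examples form the training set.
            int testSize = folds.get(f).size();
            System.out.println("fold " + f + ": train=" + (n - testSize) + ", test=" + testSize);
        }
    }
}
```

Each example lands in exactly one test fold, so averaging a performance measure over the folds uses every example for testing once.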

Quick start

java -jar hierarchy-decomposition-pipeline-0.0.1.jar settings.s

Reference

The pipeline is based on ideas presented in the following paper:

Vidulin V., Džeroski S. (2020) Hierarchy Decomposition Pipeline: A Toolbox for Comparison of Model Induction Algorithms on Hierarchical Multi-label Classification Problems. In: Appice A., Tsoumakas G., Manolopoulos Y., Matwin S. (eds) Discovery Science. DS 2020. Lecture Notes in Computer Science, vol 12323. Springer, Cham. https://doi.org/10.1007/978-3-030-61527-7_32

If you find the pipeline useful, please cite that reference.

Contact

If you have a data mining problem with hierarchical class and are interested in cooperation, feel free to contact me.

Project website

The pipeline is described in more detail on the project website: https://vedranav.github.io/hierarchy-decomposition-pipeline/

Warning

This project is a work in progress. If you run into problems with the code or documentation, please report them as issues.

About

License: MIT License

Languages

Language: Java 100.0%