tmacmilan / 2021.0104



This archive is distributed in association with the INFORMS Journal on Computing under the MIT License.

The software and data in this repository are a snapshot of the software and data that were used in the research reported on in the paper An Ensemble Learning Approach with Gradient Resampling for Class-imbalanced problems by Chuang Zhao. The snapshot is based on JOC in the development repository.

Important: This code is being developed on an ongoing basis at https://github.com/Data-Designer/JOC. Please go there if you would like a more recent version or need support.

Cite


Below is the BibTeX for citing this data.

@article{PutABibTexKeyHere,
  author =        {Hongke Zhao and Chuang Zhao and Xi Zhang and Nanlin Liu and Hengshu Zhu and Qi Liu and Hui Xiong},
  publisher =     {INFORMS Journal on Computing},
  title =         {An Ensemble Learning Approach with Gradient Resampling for Class-imbalanced problems v2021.0104},
  year =          {2022},
  doi =           {10.5281/zenodo.6360996},
  note =          {https://github.com/INFORMSJoC/2021.0104},
}

Description

The goal of this software is to demonstrate the effect of the ensemble learning approach with gradient resampling that the paper proposes for class-imbalanced problems.

In this paper, we propose a new approach that combines sample-level identification of classification difficulty, sampling, and ensemble learning. Accordingly, we design a pipelined ensemble approach with sample-level gradient resampling, i.e., Balanced Cascade with Filters (BCWF). Before that, as a preliminary exploration, we design a Hard Examples Mining Algorithm (HEM) to explore the gradient distribution of the samples' classification difficulty and to identify the hard examples.

The figure below gives an overview of our framework.

image-20220615204204109
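
The idea of scoring samples by classification difficulty and then resampling can be illustrated with a minimal sketch. This is not the repository's HEM or BCWF implementation. It assumes a binary problem where label 1 is the minority class, and it assumes difficulty is measured as |p - y|, the magnitude of the binary cross-entropy gradient with respect to the logit. The function names, the threshold, and the number of bins are hypothetical choices made only for illustration.

```python
import random
from collections import defaultdict


def gradient_difficulty(probs, labels):
    """Return |p - y| per sample: the magnitude of the binary cross-entropy
    gradient with respect to the logit (assumed difficulty measure)."""
    return [abs(p - y) for p, y in zip(probs, labels)]


def find_hard_examples(scores, threshold=0.5):
    """Indices of samples whose difficulty exceeds a hypothetical threshold."""
    return [i for i, g in enumerate(scores) if g > threshold]


def balanced_resample(scores, labels, n_bins=5, seed=0):
    """Under-sample the majority class (label 0) down to the minority size,
    drawing roughly evenly from equal-width difficulty bins."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    target = min(len(minority), len(majority))

    # Group majority samples by difficulty bin; a score of 1.0 goes to the last bin.
    bins = defaultdict(list)
    for i in majority:
        bins[min(int(scores[i] * n_bins), n_bins - 1)].append(i)

    # Take an equal quota from every non-empty bin, then trim at random.
    quota = max(1, target // max(1, len(bins)))
    chosen = []
    for members in bins.values():
        rng.shuffle(members)
        chosen.extend(members[:quota])
    rng.shuffle(chosen)
    chosen = chosen[:target]

    # Top up at random if sparse bins left the target unfilled.
    picked = set(chosen)
    leftover = [i for i in majority if i not in picked]
    rng.shuffle(leftover)
    chosen.extend(leftover[: target - len(chosen)])
    return sorted(chosen + minority)


# Toy data: six majority samples (label 0) and two minority samples (label 1).
probs = [0.1, 0.2, 0.9, 0.35, 0.05, 0.7, 0.8, 0.3]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
scores = gradient_difficulty(probs, labels)
hard = find_hard_examples(scores)
subset = balanced_resample(scores, labels)
print("hard examples:", hard)
print("balanced subset:", subset)
```

In a cascade-style ensemble, a step like `balanced_resample` would be repeated for each base learner, with the difficulty scores recomputed from the current ensemble's predictions. How the repository actually does this is defined by its own code, not by this sketch.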

Building

The main dependencies are listed in requirements.txt.

To install requirements, run:

pip install -r requirements.txt

Usage

A typical usage example:

# Define model
model_class = BcwfH(dataset_name, T=15) 
# Model calculation
model_class.apply_all()
# Metrics
metrics = model_class.display()

You can also run the .py files directly.

Here is an example:

# Run Model
python main.py
# Hyper-test
python hyper.py
# Get comparison
python result.py

Result

image-20221119101216770

image-20221119101233632

image-20221119101312282

Replicating

To replicate the results in any of the tables in the paper, follow the steps in the Usage section or refer to https://github.com/Data-Designer/JOC.
