pyIDS

pyIDS is a custom implementation of the IDS (Interpretable Decision Sets) algorithm introduced in:

LAKKARAJU, Himabindu; BACH, Stephen H.; LESKOVEC, Jure. Interpretable decision sets: A joint framework for description and prediction. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, 2016. p. 1675-1684.

If you find this package useful in your research, please cite our paper describing this implementation:

Jiri Filip, Tomas Kliegr. PyIDS - Python Implementation of Interpretable Decision Sets Algorithm by Lakkaraju et al, 2016. RuleML+RR2019@Rule Challenge 2019. http://ceur-ws.org/Vol-2438/paper8.pdf

Installation

The pyarc, pandas, scipy and numpy packages need to be installed before using pyIDS.

All of these packages can be installed using pip.

For pyarc, please refer to the Installation section of its README file.
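Before running the examples, it can help to confirm that the prerequisites are importable. The following is a minimal sketch, not part of pyIDS; the helper name find_missing_packages is hypothetical, and it assumes each package's import name matches the name listed above.

```python
import importlib.util

# Packages the README lists as prerequisites for pyIDS.
# Assumption: each import name is identical to the package name.
REQUIRED_PACKAGES = ["pyarc", "pandas", "scipy", "numpy"]


def find_missing_packages(package_names):
    """Return the names from package_names that cannot be located for import."""
    return [
        name for name in package_names
        if importlib.util.find_spec(name) is None
    ]


missing = find_missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All pyIDS prerequisites are importable.")
```

The check only locates the packages and does not import them, so it is quick to run even in a large environment.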

Examples

Training a simple IDS model

import pandas as pd
from pyids.algorithms.ids_classifier import mine_CARs
from pyids.algorithms.ids import IDS

from pyarc.qcba.data_structures import QuantitativeDataFrame
import io
import requests

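# Download the discretized iris training fold and load it into a DataFrame.
url = "https://raw.githubusercontent.com/kliegr/arcBench/master/data/folds_discr/train/iris0.csv"
s = requests.get(url).content
df = pd.read_csv(io.StringIO(s.decode('utf-8')))

# Mine candidate class association rules (CARs); rule_cutoff limits how many are kept.
cars = mine_CARs(df, rule_cutoff=50)

# One weight per term of the IDS objective function (seven terms in total).
lambda_array = [1, 1, 1, 1, 1, 1, 1]

# Wrap the data in the structure that pyIDS expects.
quant_dataframe = QuantitativeDataFrame(df)

# Fit the model using the SLS (smooth local search) optimizer.
ids = IDS(algorithm="SLS")
ids.fit(quant_dataframe=quant_dataframe, class_association_rules=cars, lambda_array=lambda_array)

# Accuracy on the same data the model was trained on.
acc = ids.score(quant_dataframe)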
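
The lambda_array above holds one weight per term of the objective. In the cited paper, the objective is a non-negative weighted sum of seven sub-objectives. The sketch below only illustrates that weighting and is not pyIDS code: the function name weighted_objective and the sub-objective values are made up for the example.

```python
def weighted_objective(lambda_array, sub_objective_values):
    """Combine sub-objective values into one score using per-term weights."""
    if len(lambda_array) != len(sub_objective_values):
        raise ValueError("expected one weight per sub-objective")
    return sum(
        weight * value
        for weight, value in zip(lambda_array, sub_objective_values)
    )


# Equal weights, as in the example above.
lambda_array = [1, 1, 1, 1, 1, 1, 1]

# Hypothetical sub-objective values for some candidate rule set.
example_values = [10, 20, 5, 5, 1, 30, 40]

total = weighted_objective(lambda_array, example_values)
print(total)
```

Raising one weight relative to the others makes the corresponding sub-objective count for more in the combined score, which is why the choice of lambdas affects the rule set that is selected.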

Optimizing the lambda parameters with coordinate ascent, as described in the original paper

import pandas as pd
import io
import requests

from pyids.algorithms.ids_classifier import mine_CARs
from pyids.algorithms.ids import IDS
from pyids.model_selection.coordinate_ascent import CoordinateAscent

from pyarc.qcba.data_structures import QuantitativeDataFrame


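# Download the Titanic dataset and load it into a DataFrame.
url = "https://raw.githubusercontent.com/jirifilip/pyids/master/data/titanic.csv"
s = requests.get(url).content
df = pd.read_csv(io.StringIO(s.decode('utf-8')))
quant_df = QuantitativeDataFrame(df)

# Mine candidate class association rules (CARs).
cars = mine_CARs(df, 20)


def fmax(lambda_dict):
    # Objective for coordinate ascent: fit an IDS model with the given
    # lambdas and return its AUC on quant_df (the data it was trained on).
    print(lambda_dict)
    ids = IDS(algorithm="SLS")
    # The lambda values are passed in the order the dict stores them (l1 ... l7).
    ids.fit(class_association_rules=cars, quant_dataframe=quant_df, lambda_array=list(lambda_dict.values()))
    auc = ids.score_auc(quant_df)
    print(auc)
    return auc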



coord_asc = CoordinateAscent(
    func=fmax,
    func_args_ranges=dict(
        l1=(1, 1000),
        l2=(1, 1000),
        l3=(1, 1000),
        l4=(1, 1000),
        l5=(1, 1000),
        l6=(1, 1000),
        l7=(1, 1000)
    ),
    ternary_search_precision=50,
    max_iterations=3
)

best_lambdas = coord_asc.fit()
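
To make the idea behind the example above concrete, here is a minimal, generic sketch of coordinate ascent that uses ternary search along each coordinate. It is not the pyIDS CoordinateAscent implementation, and the class may differ in how it picks starting points and step sizes. The function names and the toy objective are hypothetical; only the parameter names (ranges, precision, max_iterations) loosely mirror the arguments shown above.

```python
def ternary_search_max(func, low, high, precision):
    """Approximate the maximizer of a unimodal func on [low, high]."""
    while high - low > precision:
        third = (high - low) / 3
        left, right = low + third, high - third
        if func(left) < func(right):
            low = left    # the maximum cannot lie left of `left`
        else:
            high = right  # the maximum cannot lie right of `right`
    return (low + high) / 2


def coordinate_ascent(func, ranges, precision, max_iterations):
    """Maximize func(dict) by optimizing one parameter at a time."""
    # Start from the midpoint of every range.
    current = {name: (lo + hi) / 2 for name, (lo, hi) in ranges.items()}
    for _ in range(max_iterations):
        for name, (lo, hi) in ranges.items():
            def along_axis(value, name=name):
                # Vary only `name`; keep all other parameters fixed.
                trial = dict(current)
                trial[name] = value
                return func(trial)

            current[name] = ternary_search_max(along_axis, lo, hi, precision)
    return current


def toy_score(params):
    # Made-up concave objective with its maximum at l1=300, l2=700.
    return -(params["l1"] - 300) ** 2 - (params["l2"] - 700) ** 2


best = coordinate_ascent(
    toy_score,
    ranges=dict(l1=(1, 1000), l2=(1, 1000)),
    precision=1,
    max_iterations=3,
)
print(best)
```

Ternary search assumes the objective is unimodal along each coordinate. A score such as AUC from a fitted model gives no such guarantee, so the result should be treated as a heuristic rather than a guaranteed optimum. A coarser precision means fewer evaluations of the objective, which matters when every evaluation fits a model.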

License

MIT License

