usercontext / TaskHierarchy138K

Hierarchy of Tasks crawled from WikiHow

Home Page: https://usercontext.github.io/TaskHierarchy138K/

TaskHierarchy138K

Please check the website for more details: https://usercontext.github.io/TaskHierarchy138K/

Hierarchical classifier

The task hierarchy has one binary classifier at each node. Each classifier is trained individually, using the WikiHow datapoints of its own node as positive examples and the WikiHow datapoints of its sibling nodes as negative examples. An incoming query is passed through the hierarchy top-down, and each binary classifier acts as a gate that decides whether the query continues into that node's subtree.
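The top-down gating described above can be illustrated with a small sketch. This is not the repository's implementation: the toy hierarchy, the keyword-based stand-in for the trained classifiers, the function names, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy hierarchy: each node maps to its list of children.
HIERARCHY: Dict[str, List[str]] = {
    "ROOT": ["Food and Entertaining", "Cars"],
    "Food and Entertaining": ["Recipes", "Parties"],
    "Recipes": ["Baking"],
    "Baking": [],
    "Parties": [],
    "Cars": [],
}

# Stand-in for the trained binary classifiers (assumption): a keyword
# set per node instead of a real bag-of-words model.
KEYWORDS = {
    "Food and Entertaining": {"bake", "cake", "party"},
    "Cars": {"engine", "tire"},
    "Recipes": {"bake", "cake"},
    "Parties": {"party"},
    "Baking": {"bake"},
}


def node_probability(node: str, tokens: List[str]) -> float:
    """Toy score: fraction of the node's keywords found in the query."""
    keywords = KEYWORDS[node]
    return len(keywords & set(tokens)) / len(keywords)


def greedy_descend(tokens: List[str], threshold: float = 0.5) -> List[str]:
    """Follow the best-scoring child at each level until a gate closes."""
    path: List[str] = []
    node = "ROOT"
    while HIERARCHY[node]:
        scored = [(node_probability(c, tokens), c) for c in HIERARCHY[node]]
        best_p, best_child = max(scored)
        if best_p < threshold:  # gate closed: stop descending
            break
        path.append(best_child)
        node = best_child
    return path


print(greedy_descend("how to bake a cake".split()))
```

A query that no top-level classifier accepts returns an empty path, which is one plausible way to signal "no matching task".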

Demonstration

Train

Go to the exp folder:

cd exp

Download the bag-of-words data attached to the task hierarchy (bow_data.zip), unzip it, and run the training script:

python3 tasktrain.py
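The training logic itself lives in tasktrain.py and is not reproduced here. As a rough sketch of the labeling scheme described under "Hierarchical classifier" (own datapoints as positives, sibling datapoints as negatives), the per-node training set could be assembled as follows. The data structures and the function name are hypothetical, and whether a node's datapoints also include those of its descendants is not stated in this README.

```python
from typing import Dict, List, Tuple

# Hypothetical parent -> children map and node -> documents map.
CHILDREN: Dict[str, List[str]] = {"Recipes": ["Baking", "Soups"]}
DOCS: Dict[str, List[str]] = {
    "Baking": ["bake a cake", "bake bread"],
    "Soups": ["make tomato soup"],
}


def training_pairs(parent: str, node: str) -> List[Tuple[str, int]]:
    """Label the node's own documents 1 and its siblings' documents 0."""
    positives = [(doc, 1) for doc in DOCS[node]]
    negatives = [
        (doc, 0)
        for sibling in CHILDREN[parent]
        if sibling != node
        for doc in DOCS[sibling]
    ]
    return positives + negatives


print(training_pairs("Recipes", "Baking"))
```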

Test

From the same exp folder, open a python3 shell and run the following:

$ cd exp
$ python3
>>> import json
>>> with open('../category_article.json', 'r') as f:
...   json_new = json.load(f)
...
>>> from tasktest import load_hier_model, greedytrickle, beamtrickle, clean_text
>>> load_hier_model(json_new)
>>> greedytrickle(json_new, clean_text("how to bake a strawberry cake"))
['Food and Entertaining', 'Recipes', 'Baking', 'Cakes', 'Fruit Cakes']
>>> beamtrickle(json_new, clean_text("how to bake a strawberry cake"))
[['Food and Entertaining', 0.58, 0.58, ['Recipes', 0.91, 0.53], ['Baking', 0.82, 0.44], ['Cakes', 1.0, 0.44]], 
['Food and Entertaining', 0.58, 0.58, ['Recipes', 0.91, 0.53], ['Baking', 0.82, 0.44], ['Scones', 0.19, 0.08]], 
['Food and Entertaining', 0.58, 0.58, ['Recipes', 0.91, 0.53], ['Fruits and Vegetables', 0.19, 0.1], ['Berries', 0.79, 0.08]], 
['Food and Entertaining', 0.58, 0.58, ['Recipes', 0.91, 0.53], ['Baking', 0.82, 0.44], ['Donuts and Doughnuts', 0.15, 0.07]], 
['Food and Entertaining', 0.58, 0.58, ['Recipes', 0.91, 0.53], ['Baking', 0.82, 0.44], ['Buns', 0.12, 0.05]]]
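In the beamtrickle output above, each node appears to be followed by two numbers: the classifier's local probability at that node and a cumulative score that looks like the product of the local probabilities along the path (for example, 0.58 × 0.91 ≈ 0.53). That reading is inferred from the printed values, not from documentation. Under that assumption, a minimal beam search over a hierarchy might look like the sketch below. The toy hierarchy, the fixed probabilities, and the function name are hypothetical and stand in for the trained classifiers.

```python
from typing import Dict, List, Tuple

# Hypothetical hierarchy and fixed local probabilities for one query.
CHILDREN: Dict[str, List[str]] = {
    "ROOT": ["Food", "Cars"],
    "Food": ["Recipes", "Parties"],
    "Cars": [],
    "Recipes": [],
    "Parties": [],
}
LOCAL_P: Dict[str, float] = {
    "Food": 0.6, "Cars": 0.1, "Recipes": 0.9, "Parties": 0.2,
}


def beam_descend(beam_width: int = 2) -> List[Tuple[float, List[str]]]:
    """Keep the beam_width best partial paths at every level."""
    beam: List[Tuple[float, List[str]]] = [(1.0, ["ROOT"])]
    finished: List[Tuple[float, List[str]]] = []
    while beam:
        candidates: List[Tuple[float, List[str]]] = []
        for score, path in beam:
            children = CHILDREN[path[-1]]
            if not children:
                # Leaf reached: record the path without the ROOT marker.
                finished.append((score, path[1:]))
                continue
            for child in children:
                # Cumulative score = product of local probabilities.
                candidates.append((score * LOCAL_P[child], path + [child]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    finished.sort(key=lambda c: c[0], reverse=True)
    return finished[:beam_width]


for score, path in beam_descend():
    print(round(score, 2), path)
```

Compared with the greedy pass, the beam keeps lower-scoring branches alive for a while, which is why the real output above includes alternatives such as 'Scones' and 'Berries' alongside 'Cakes'.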
