- guidelines (in Dutch)
- DONE human annotated data (NAF)
- DONE machine annotated data (NAF)
- lexicon based on annotated data (LMF)
- overview of annotation tags and emotion hierarchies
- short description per dataset
- data license
- Translation between Dutch and English labels
- Corpus metadata?
- anything else
The dataset consists of four subsets:
- Annotation corpus: texts selected from Nederlab, manually annotated with HEEM labels (including humor modifiers and intensifiers) (29 texts)
- Ceneton: texts selected from Ceneton (34 texts)
- Corpus big: other texts selected from Nederlab (149 texts)
- EDBO: texts selected from Early Dutch Books Online (67 texts)
The naf directory contains the annotations and predicted labels in NAF format. The emotions can be found in the emotions layer:
<emotions>
  <emotion id="emo0">
    <emotion_target/>
    <emotion_holder/>
    <emotion_expression/>
    <span>
      <!-- kop -->
      <target id="t1874"/>
    </span>
    <externalReferences>
      <externalRef reference="conceptType:bodyPart" resource="heem"/>
      <externalRef reference="head" resource="heem:bodyParts"/>
    </externalReferences>
  </emotion>
  <emotion id="emo1">
    <emotion_target/>
    <emotion_holder/>
    <emotion_expression/>
    <span>
      <!-- maek my de kop niet warm -->
      <target id="t1871"/>
      <target id="t1872"/>
      <target id="t1873"/>
      <target id="t1874"/>
      <target id="t1875"/>
      <target id="t1876"/>
    </span>
    <externalReferences>
      <externalRef reference="conceptType:bodilyProcess" resource="heem"/>
      <externalRef reference="emotionType:anger" resource="heem"/>
      <externalRef reference="anger" resource="heem:clusters"/>
      <externalRef reference="negative" resource="heem:posNeg"/>
    </externalReferences>
  </emotion>