This repository provides human typicality ratings for every animal exemplar in three categories (bird, fish, mammal) taken from the original Zoo dataset (see below).
Human typicality ratings were used in:
Belohlavek, R., Mikula, T.: Typicality: a formal concept analysis account (2021 - preprint).
The original Zoo dataset [1] consists of 101 animals and 17 features. Each animal is a member of one of 7 categories (types).
The dataset can be downloaded here: https://archive.ics.uci.edu/ml/datasets/zoo.
The bird, fish, and mammal categories (types 2, 4, and 1 in the Zoo dataset, respectively) were selected for assessing typicality ratings. All exemplars for each category are listed in the following table. Note that the girl exemplar was omitted.
Category | Count | Exemplars |
---|---|---|
bird | 20 | chicken, crow, dove, duck, flamingo, gull, hawk, kiwi, lark, ostrich, parakeet, penguin, pheasant, rhea, skimmer, skua, sparrow, swan, vulture, wren |
fish | 13 | bass, carp, catfish, chub, dogfish, haddock, herring, pike, piranha, seahorse, sole, stingray, tuna |
mammal | 40 | aardvark, antelope, bear, boar, buffalo, calf, cavy, cheetah, deer, dolphin, elephant, fruitbat, giraffe, goat, gorilla, hamster, hare, leopard, lion, lynx, mink, mole, mongoose, opossum, oryx, platypus, polecat, pony, porpoise, puma, pussycat, raccoon, reindeer, seal, sealion, squirrel, vampire, vole, wallaby, wolf |
Mean typicality ratings for each exemplar are available in the `data/typicality ratings/` folder. Alongside the mean, the sample standard deviation (std) and the number of non-missing human assessments (nonmissing) were calculated.
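The three summary statistics can be illustrated with a minimal Python sketch. It assumes that a skipped exemplar is represented as `None`; the function name `summarize_ratings` and the example ratings are hypothetical and made up for illustration. They are not taken from the collected responses, and this is not necessarily the script used to produce the published files.

```python
import math
from statistics import mean, stdev


def summarize_ratings(ratings):
    """Summarize one exemplar's ratings (1-5); None marks a skipped answer."""
    observed = [r for r in ratings if r is not None]
    n = len(observed)
    return {
        # Mean over the respondents who actually rated the exemplar.
        "mean": mean(observed) if n else math.nan,
        # statistics.stdev is the sample standard deviation (n - 1 denominator).
        "std": stdev(observed) if n > 1 else math.nan,
        # Number of non-missing assessments.
        "nonmissing": n,
    }


# Invented ratings for illustration only, NOT real survey data.
example = {
    "sparrow": [5, 5, 4, None, 5],
    "penguin": [1, 2, None, None, 3],
}
summary = {name: summarize_ratings(vals) for name, vals in example.items()}
```

Because skipped exemplars are excluded before averaging, `nonmissing` can differ between exemplars of the same category.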
Respondents were native Czech and Slovak speakers. Each exemplar was translated into Czech according to the `exemplar_translation.py` file.
Each exemplar from the selected categories was rated on a scale from 1 (least typical) to 5 (most typical) by up to 242 respondents (136 women, 106 men). Participants could see all exemplars from a given category at once and were allowed to skip unknown exemplars, so not every exemplar was rated by all 242 respondents. The median, minimum, and maximum ages of participants were 23, 17, and 81, respectively.
The `original_responses.csv` file contains the unprocessed responses from participants.
For convenient experimentation, a subset of the original Zoo dataset is provided with this dataset in `data/features/mini_zoo.csv`.
Type | Features |
---|---|
bool | hair, feathers, eggs, milk, airborne, aquatic, predator, toothed, backbone, breathes, venomous, fins, tail, domestic, catsize, no legs, two legs, four legs |
str | exemplar, category |
List of modifications to the original Zoo dataset:
- The original numeric `legs` feature was converted to multiple boolean features (`no legs`, `two legs`, `four legs`).
- As mentioned, the girl exemplar was removed.
- The original `animal name` feature was renamed to `exemplar`.
- The original `type` feature was renamed to `category`, and the original numeric values (1, 2, 4) were converted to strings (mammal, bird, fish, respectively).
- Exemplars from other categories were removed.
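The modifications listed above can be sketched as a per-row conversion in Python. This is a hedged illustration, not the script used to build `mini_zoo.csv`: the function name `convert_row` is hypothetical, the input is assumed to be a dictionary keyed by the original Zoo column names with values stored as strings, and the type-to-category mapping (1 = mammal, 2 = bird, 4 = fish) is assumed from the UCI documentation of the Zoo dataset.

```python
# Assumed mapping of the original numeric `type` to category names.
CATEGORY_NAMES = {1: "mammal", 2: "bird", 4: "fish"}
# The numeric `legs` value is replaced by three boolean features.
LEG_FEATURES = {0: "no legs", 2: "two legs", 4: "four legs"}


def convert_row(row):
    """Convert one original Zoo row; return None if the row is dropped."""
    if row["animal name"] == "girl":
        return None  # the girl exemplar is removed
    category = CATEGORY_NAMES.get(int(row["type"]))
    if category is None:
        return None  # exemplars from other categories are removed

    converted = {"exemplar": row["animal name"], "category": category}
    for key, value in row.items():
        if key not in ("animal name", "type", "legs"):
            converted[key] = bool(int(value))  # remaining features are 0/1
    legs = int(row["legs"])
    for count, name in LEG_FEATURES.items():
        converted[name] = legs == count
    return converted


# Shortened example row (only a few of the original columns are shown).
row = {"animal name": "sparrow", "feathers": "1", "milk": "0",
       "legs": "2", "type": "2"}
converted = convert_row(row)
```

For the shortened row, `converted` has `exemplar` set to `sparrow`, `category` set to `bird`, and `two legs` set to `True`, while `no legs` and `four legs` are `False`.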
[1] Dua, D., Graff, C.: UCI Machine Learning Repository. University of California, Irvine, School of Information and Computer Sciences (2019). http://archive.ics.uci.edu/ml