Morghulis is an attempt to create a common API for face datasets. There are many face datasets available. Each has its own conventions and annotation format, but in the end they all consist of a set of images with their annotated faces.
To make things worse, the existing object detection libraries (Detectron, the TensorFlow Object Detection API, and Darknet's YOLO, to name a few) all use different formats for train/eval/test. Detectron uses the COCO JSON format, TensorFlow uses TFRecords, and so on.
Once Morghulis loads a dataset, it can easily be exported to different formats.
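As a rough illustration of the idea (not Morghulis's actual classes), a dataset can be modeled as a list of images, each carrying its face boxes, and then serialized to a COCO-style dictionary. The names `Face`, `Image`, and `to_coco` are hypothetical and exist only for this sketch; the single `face` category with id 1 is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Face:
    # Top-left corner plus width and height, in pixels.
    x: float
    y: float
    w: float
    h: float


@dataclass
class Image:
    filename: str
    width: int
    height: int
    faces: List[Face] = field(default_factory=list)


def to_coco(images: List[Image]) -> Dict:
    """Build a COCO-style detection dictionary with a single 'face' category."""
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": "face"}],  # assumed category layout
    }
    ann_id = 1
    for image_id, img in enumerate(images, start=1):
        coco["images"].append({
            "id": image_id,
            "file_name": img.filename,
            "width": img.width,
            "height": img.height,
        })
        for face in img.faces:
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": 1,
                # COCO boxes are [x, y, width, height] in pixels.
                "bbox": [face.x, face.y, face.w, face.h],
                "area": face.w * face.h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco
```

The resulting dictionary can be written to disk with `json.dump`. The point of the sketch is that once every dataset is reduced to the same in-memory shape, each export format only needs one writer.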
Currently the following datasets are supported:
- WIDER FACE - 32,203 images and 393,703 faces.
- FDDB - 2,845 images and 5,171 faces.
- AFW - 205 images and 473 faces.
- PASCAL faces - 850 images and 1,335 faces.
- TODO MAFA - 30,811 images and 35,806 masked faces.
- TODO IJB-C
- TODO Caltech faces - 450 frontal face images of about 27 unique people.
TODO
# Download WIDER FACE
docker run --rm -it \
-v ${PWD}/datasets:/datasets \
housebw/morghulis \
./download_dataset.py --dataset widerface --output_dir /datasets/widerface
# Download FDDB (assumes a container named "ds" already exists and exposes a /ds volume)
docker run --rm -it \
--volumes-from ds \
housebw/morghulis \
./download_dataset.py --dataset fddb --output_dir /ds/fddb/
# Generate TFRecords for FDDB
docker run --rm -it \
--volumes-from ds \
housebw/morghulis \
./export.py --dataset=fddb --format=tensorflow --data_dir=/ds/fddb/ --output_dir=/ds/fddb/tensorflow/
# Generate COCO JSON files for WIDER FACE
docker run --rm -it \
-v ${PWD}/datasets:/ds \
housebw/morghulis \
./export.py --dataset=widerface --format=coco --data_dir=/ds/widerface/ --output_dir=/ds/widerface/coco/
Use a Wider or FDDB dataset object to download and export to different formats:
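# Wider and FDDB are assumed to be already imported from the morghulis package
data_dir = '/datasets/WIDER'
# example output directories (placeholder paths, adjust as needed)
darknet_output_dir = '/datasets/WIDER/darknet'
tf_output_dir = '/datasets/WIDER/tensorflow'
coco_output_dir = '/datasets/WIDER/coco'

ds = Wider(data_dir)  # or FDDB(data_dir)
# download the train and validation sets and the annotations
ds.download()
# generate Darknet (YOLO) files
ds.export(darknet_output_dir, target_format='darknet')
# generate TensorFlow TFRecords
ds.export(tf_output_dir, target_format='tensorflow')
# generate a COCO JSON file (useful for Detectron)
ds.export(coco_output_dir, target_format='coco')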
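To give a sense of what a format conversion involves, the Darknet (YOLO) label format stores one line per box: a class index followed by the box center and size, all normalized by the image dimensions. The sketch below shows that conversion for a box given in pixels. `to_darknet_line` is a hypothetical helper written for this example, not part of Morghulis, and using class index 0 for faces is an assumption.

```python
def to_darknet_line(x, y, w, h, img_w, img_h, class_id=0):
    """Convert a pixel box (top-left x, y, width, height) to a Darknet label line."""
    # Darknet expects the box center, not the top-left corner.
    x_center = (x + w / 2.0) / img_w
    y_center = (y + h / 2.0) / img_h
    # Width and height are expressed as fractions of the image size.
    return "{} {:.6f} {:.6f} {:.6f} {:.6f}".format(
        class_id, x_center, y_center, w / img_w, h / img_h
    )


# A 20x10 box at (10, 5) in a 100x50 image.
print(to_darknet_line(10, 5, 20, 10, 100, 50))
# prints: 0 0.200000 0.200000 0.200000 0.200000
```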