Rhythmblue / Sign-Language-Datasets

Intro of some sign language datasets suitable for research

Sign Language Datasets

Here, we introduce several publicly available sign language datasets. They are suitable for multiple sign language (SL) processing tasks, including SL recognition, translation and generation.

We also provide scripts for building LMDB databases, which save disk space and load quickly. All frames are converted to JPG format and stored as binary values in the LMDB database.
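As a rough illustration of storing JPG frames as binary values, the sketch below writes already-encoded .jpg files into an LMDB database and reads one back with the third-party `lmdb` package. The key layout (`video_name/frame_index`), the helper names, and the `map_size` default are assumptions made for this example; the repository's scripts may organize keys differently.

```python
def frame_key(video_name, frame_index):
    """Build a byte key for one frame (hypothetical layout: name/000123)."""
    return f"{video_name}/{frame_index:06d}".encode("utf-8")


def write_jpg_frames(lmdb_path, video_name, jpg_paths, map_size=1 << 30):
    """Store the raw bytes of each .jpg file under its frame key."""
    import lmdb  # third-party package: pip install lmdb

    env = lmdb.open(lmdb_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for index, path in enumerate(jpg_paths):
            with open(path, "rb") as f:
                txn.put(frame_key(video_name, index), f.read())
    env.close()


def read_jpg_frame(lmdb_path, video_name, frame_index):
    """Return the JPG bytes of one frame, or None if the key is missing."""
    import lmdb  # third-party package: pip install lmdb

    env = lmdb.open(lmdb_path, readonly=True, lock=False)
    with env.begin() as txn:
        data = txn.get(frame_key(video_name, frame_index))
    env.close()
    return data
```

The returned bytes are still JPG-encoded, so they need to be decoded with an image library before use.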

Usage

usage: lmdb_dataset_modality.py [-h] [-nw NUM_WORKERS] [-tt TARGET_TMP_PATH] source_path target_path
  • source_path: the path where the original data is stored
  • target_path: the path where the LMDB database will be stored
  • target_tmp_path: the path where the transformed images are stored. If -tt ... is not set, the temporary .jpg files are deleted after they are stored in LMDB.
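The usage line above could be produced by an `argparse` parser along the following lines. This is a minimal sketch, not the repository's actual code: the long option names (`--num_workers`, `--target_tmp_path`), the default values, and the help strings are assumptions inferred from the usage text.

```python
import argparse


def build_parser():
    """Build a parser matching the usage line (option details are assumed)."""
    parser = argparse.ArgumentParser(prog="lmdb_dataset_modality.py")
    # Assumed: number of parallel workers; the default of 1 is hypothetical.
    parser.add_argument("-nw", "--num_workers", type=int, default=1,
                        help="number of workers")
    # When omitted, the value is None, meaning temporary .jpg files are not kept.
    parser.add_argument("-tt", "--target_tmp_path", default=None,
                        help="path where transformed images are stored")
    parser.add_argument("source_path",
                        help="path where the original data is stored")
    parser.add_argument("target_path",
                        help="path where the LMDB database will be stored")
    return parser
```

Calling `build_parser().parse_args(["-nw", "4", "src", "dst"])` yields a namespace with `num_workers` set to 4 and `target_tmp_path` left as `None`.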

RWTH-PHOENIX-Weather 2014 (German SL)

Keywords: continuous SL, sign gloss

Links: Homepage, Paper (CVIU'2015)

phoenix-2014-multisigner

LMDB Database

fullFrame-210x260px

python scripts/lmdb_ph14_full_rgb.py .../fullFrame-210x260px lmdb/ph14/full_rgb_224 -nw 4

trackedRightHand-92x132px

python scripts/lmdb_ph14_hand_rgb.py .../trackedRightHand-92x132px lmdb/ph14/hand_rgb_112 -nw 4

RWTH-PHOENIX-Weather 2014 T (German SL)

Keywords: continuous SL, sign gloss, German translation

Links: Homepage, Paper (CVPR'2018)

LMDB Database

fullFrame-210x260px

python scripts/lmdb_ph14-t_full_rgb.py .../fullFrame-210x260px lmdb/ph14T/full_rgb_224 -nw 4

Pose Annotation

In STMC (AAAI'20), the authors use HRNet (CVPR'19) for automatic pose annotation. The estimated upper-body keypoints of each video form an array of shape (T, 7, 2), and these arrays are saved in a dict indexed by video name. Each keypoint is recorded as (w, h) and normalized to [0, 1].
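Since the keypoints are normalized, they usually need to be scaled back to pixel coordinates before being drawn on a frame. The sketch below shows one way to do this with NumPy. It assumes that T is the number of frames, and that the 210x260 in the fullFrame-210x260px folder name means a width of 210 and a height of 260; the function name `to_pixels` is hypothetical.

```python
import numpy as np


def to_pixels(pose, width, height):
    """Scale normalized (w, h) keypoints of shape (T, 7, 2) to pixel units."""
    pose = np.asarray(pose, dtype=np.float32)
    # Broadcasting multiplies the w column by width and the h column by height.
    scale = np.array([width, height], dtype=np.float32)
    return pose * scale


# Example with a dummy one-frame array; real arrays come from the pose dict.
dummy = np.full((1, 7, 2), 0.5, dtype=np.float32)
pixels = to_pixels(dummy, width=210, height=260)
```

The output keeps the input shape (T, 7, 2), with the last axis holding pixel coordinates in (w, h) order.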

Download Links

Dataset          HRNet
PHOENIX-2014     GoogleDrive
PHOENIX-2014-T   GoogleDrive

How to read

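import pickle as pkl

# Load the pose dict: video name -> keypoint array of shape (T, 7, 2)
with open('pose_phoenix2014_up_hrnet_TxN_wh.pkl', 'rb') as f:
    dict_pose = pkl.load(f)

# Look up one video by name and print the shape of its keypoint array
print(dict_pose['01April_2010_Thursday_heute_default-0'].shape)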

License:Other

