Dialog generator tool

This tool generates dialog data from a video by combining speech-to-text (audio) with an optional Computer Vision cross-check.

Quickstart

Environment: Python 3.8.0 (tested on Ubuntu/WSL). Install the Python dependencies with:

pip3 install -r requirements.txt

Set up face detection:

cd dialog_system/FaceRecognition/process/module/face_detection
make

Speech to text: copy the 2 weights from server23:/AIHCM/ASR/total_weight/dialog_generator_tool to STT/asr_model/base
Face recognition: copy the weights folder from server23:/AIHCM/ASR/total_weight/dialog_generator_tool/FaceRecognize to FaceRecognition/weights

python inference.py -p 'video_path' -c True

-p: path to the input video (str); the video must be saved in the "video" folder
-c: cross-check the result with Computer Vision (bool)
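
When many videos need processing, the command above can be driven from a script. The following is a minimal sketch, not part of the tool: the helper names build_inference_command and run_inference are hypothetical, and it assumes that -c accepts the literal strings "True" and "False" and that the script is started from the folder containing inference.py.

```python
import subprocess
from typing import List


def build_inference_command(video_path: str, cross_check: bool = True) -> List[str]:
    """Build the argument list for the documented inference.py call."""
    # Assumption: -c takes the literal strings "True" / "False",
    # mirroring the `-c True` form shown in the quickstart.
    return ["python", "inference.py", "-p", video_path, "-c", str(cross_check)]


def run_inference(video_path: str, cross_check: bool = True) -> None:
    """Run inference.py in the current directory; raise if it exits with an error."""
    subprocess.run(build_inference_command(video_path, cross_check), check=True)
```

For example, run_inference("video/sample.mp4") would issue the same call as the command line shown above, where video/sample.mp4 is a placeholder path.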

The result folders (ASR_result and CV_result) are created in the folder that contains the input video.
The dialog file is written to the ASR_result folder as a .txt file.
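
To pick up the output programmatically, the layout described above can be searched as in the sketch below. The helper find_dialog_files is hypothetical, and since the exact name of the dialog file is not stated here, it simply lists every .txt file in ASR_result.

```python
from pathlib import Path
from typing import List


def find_dialog_files(video_path: str) -> List[Path]:
    """Return the .txt files in the ASR_result folder next to the input video."""
    # ASR_result is created in the folder that contains the input video.
    asr_dir = Path(video_path).resolve().parent / "ASR_result"
    if not asr_dir.is_dir():
        # Inference has not been run yet (or wrote its output elsewhere).
        return []
    return sorted(asr_dir.glob("*.txt"))
```

An empty list means no ASR_result folder or no .txt file was found for that video.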

Audio:
- Advantages: speech to text, denoising, reliable verification when the voices are different
- Disadvantages: performs poorly when the same person speaks in different tones, or when different people have similar voices
Computer Vision:
- Advantages: filters out deformed faces, reliable verification on frontal faces, mouth motion detection
- Disadvantages: performs poorly when the same person appears at different angles

Note: Highly recommended for videos containing 2 or 3 people of different sex or with different voices; the less noise, the better.
