feizc / Cleaned-Webvid

A strategy for producing a cleaned WebVid-10M dataset

Cleaned-Webvid

A watermark-removal strategy for producing a cleaned WebVid-10M video-text dataset for video generation modeling.

  1. Extract the watermark against a black background.

black

  2. Expand the watermark area using erosion and dilation.

black_binary

  3. Inpaint the image with OpenCV.

image dst

Repeat the watermark removal for each frame in the video.


Languages

Language:Python 100.0%