magic-research / PLLaVA

Can you provide a captioning demo that takes a folder containing a large number of videos as input and outputs a caption for each video?

Hi @963658029, we have a recaption recipe for captioning a batch of videos. To use it, you need to download our models from Hugging Face and run them on your own machine.

To use your own dataset, you should replace the video folders and annotation.json here: https://github.com/magic-research/PLLaVA/blob/6c81a8867574dc44ee7a96319297fdc976a867a4/tasks/eval/recaption/__init__.py#L212C1-L213C1
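As a rough illustration of preparing your own data, the sketch below scans a video folder and writes an annotation.json listing every video it finds. The entry keys ("video", "caption"), the extension list, and the function name build_annotation are assumptions made for this example, not the format the repo is confirmed to expect; check the linked lines in tasks/eval/recaption/__init__.py for the fields the loader actually reads and adjust accordingly.

```python
import json
from pathlib import Path

# Assumed set of video extensions; extend as needed for your data.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mkv", ".mov", ".webm"}


def build_annotation(video_dir, output_path):
    """Write a JSON list with one entry per video found under video_dir.

    NOTE: the "video" and "caption" keys are hypothetical placeholders,
    not the schema confirmed by the PLLaVA recaption loader.
    """
    video_dir = Path(video_dir)
    entries = [
        # Store paths relative to the video folder, with forward slashes.
        {"video": p.relative_to(video_dir).as_posix(), "caption": ""}
        for p in sorted(video_dir.rglob("*"))
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    ]
    Path(output_path).write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries
```

Calling build_annotation("my_videos", "annotation.json") would produce a file you can then point the recaption dataset definition at, once the keys match what the loader expects.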

To run, use the script PLLaVA/scripts/eval.sh, which sets the conversation mode on line 46 (commit 6c81a88):

conv_mode=eval_recaption

We will update the readme to show this functionality soon.

We've updated the instructions in #8. Try constructing a video gallery dataset for recaptioning.

You could also consider sharing the dataset with us! ❤️

provide more demo