dvlab-research / MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Some questions about the demo

cyy-1234 opened this issue

Hi, Author,
I recently tried the demo you released and found it very interesting. What general instruction can be used to obtain the prompt text that the model produces before it generates a picture? These prompts seem to follow a very regular format. Could you share the instruction used to generate them? I would be grateful if you could.

Hi, during training and inference, we use <GEN> as a trigger to initiate the generation process. In the website demo, for user convenience, we insert the <GEN> trigger into the instruction prompt when the user inputs the keyword "generate". You can also click the Yes button in the bottom-left corner to enable image generation whenever you want to use it.
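The trigger insertion described above could be sketched roughly as follows. This is only an illustration, not the demo's actual code: the helper name `add_gen_trigger`, the `force_generate` flag (standing in for the Yes button), and the choice to append the trigger at the end of the prompt are all assumptions; only the `<GEN>` token and the "generate" keyword come from the reply.

```python
import re

GEN_TRIGGER = "<GEN>"  # trigger token named in the reply above


def add_gen_trigger(prompt: str, force_generate: bool = False) -> str:
    """Hypothetical helper: append <GEN> when image generation is requested.

    Assumption: the trigger goes at the end of the prompt. The reply only
    says it is inserted into the instruction prompt, not where.
    """
    # Do not add the trigger twice if the user already typed it.
    if GEN_TRIGGER in prompt:
        return prompt

    # Match the standalone keyword "generate", ignoring case.
    has_keyword = re.search(r"\bgenerate\b", prompt, re.IGNORECASE) is not None

    if force_generate or has_keyword:
        return f"{prompt.rstrip()} {GEN_TRIGGER}"
    return prompt
```

With this sketch, a prompt such as "Please generate a picture of a cat" gets the trigger appended, while "Describe this image" is left unchanged unless `force_generate=True` is passed.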

When I input a longer or shorter sentence, or a picture, how should I phrase the question so that the model outputs the prompt in that format (red box)? I only need the prompt in this format.

image
How should I phrase my request to get the “output text”?