gpt-2 huggingface pytorch visual-transformers

MMTOD

Multimedia dialogue systems have become more prevalent in industries such as travel and retail. However, most previous research on conversational systems has concentrated exclusively on text, so multimedia systems have received comparatively little attention. Earlier studies obtained strong results, but other difficulties, such as selecting the appropriate medium for the response and retrieving the most suitable image during the conversation, have been largely overlooked. This thesis applies advanced image question answering approaches to address these issues, significantly improving on earlier multimedia dialogue system models in terms of the image matching criterion.

About

Multi Modal Task Oriented Dialogue System (MMTOD)

MIT License

Languages

Python 70.7%, CSS 12.0%, HTML 11.4%, JavaScript 5.9%