microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Home Page: https://aka.ms/GeneralAI

Can LayoutLM be used for language generation?

pzdkn opened this issue · comments

I am using LayoutLMv2 and LayoutLMv3 for key information extraction. Because the output annotations are normalized, it is difficult to derive token-level annotations from them.

I thought about reframing such tasks as a language generation problem instead, similar to Marksend et al., "Doc2Dict: Information Extraction as Text Generation". However, is LayoutLM even capable of, or good at, language generation?

@pzdkn LayoutLM can be used as a general-purpose encoder for downstream tasks. It does not include a decoder, so for language generation tasks you would need to design one yourself, either to generate tokens or to copy them from the input.
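
To make the "copy operation" idea concrete, here is a minimal sketch of a single copy step: a decoder query attends over the encoder's per-token states and copies the source token with the highest attention weight. Everything in it is an assumption for illustration only: the function names are hypothetical, the hand-written 3-dimensional vectors stand in for real LayoutLM encoder outputs, and the query stands in for a decoder state. It is not code from this repository.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def copy_distribution(query, encoder_states):
    """Scaled dot-product attention of one decoder query over encoder states.

    Returns one probability per source token (the "copy distribution").
    """
    scale = math.sqrt(len(query))
    scores = [
        sum(q * h for q, h in zip(query, state)) / scale
        for state in encoder_states
    ]
    return softmax(scores)

def copy_token(query, encoder_states, source_tokens):
    """Pick the source token with the highest copy probability."""
    probs = copy_distribution(query, encoder_states)
    best = max(range(len(probs)), key=probs.__getitem__)
    return source_tokens[best], probs

# Toy stand-ins (assumptions): in practice encoder_states would be the
# encoder's last hidden states and query a learned decoder state.
source_tokens = ["Invoice", "total", ":", "42.00"]
encoder_states = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
]
query = [0.0, 2.0, 2.0]

token, probs = copy_token(query, encoder_states, source_tokens)
print(token, [round(p, 3) for p in probs])
```

A real decoder would typically mix this copy distribution with a distribution over a fixed vocabulary, so that normalized values that never appear verbatim in the document can still be generated.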

@pzdkn any update?