Image & Text attention

Question

SuryaThiru opened this issue a year ago

Surya Krishnamurthy commented a year ago

I'm looking to visualize, in a meaningful way, the attention heads of models like LayoutLMv3 that take in both text tokens and image inputs. Is there a way I can do this with BertViz?

Thank you!

Jesse Vig · Answer 1 · Mon Mar 06 2023 03:58:00 GMT+0800 (China Standard Time)

Hi @SuryaThiru, there may be a way to extend it somehow with the patch embeddings, but it isn't something I've looked at in detail. Sorry not to be more helpful here.
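
As a rough illustration of the patch-embedding idea, and not something BertViz provides, one possible starting point is to split each head's attention matrix into text and image blocks before visualizing. The sketch below assumes the model places the text tokens first in the sequence, followed by the image patch tokens; that ordering should be checked against the actual model outputs. The helper names `split_attention_blocks` and `renormalize_rows` are hypothetical, and the matrix is a toy example rather than real model output.

```python
def split_attention_blocks(attn, num_text):
    """Split one head's square attention matrix into four modality blocks.

    attn: list of rows (seq_len x seq_len); row i holds the attention
          weights of query position i over all key positions.
    num_text: number of leading positions that are text tokens.
              ASSUMPTION: text tokens come first, image patches after.
    """
    seq_len = len(attn)
    if any(len(row) != seq_len for row in attn):
        raise ValueError("attention matrix must be square")
    if not 0 <= num_text <= seq_len:
        raise ValueError("num_text must be between 0 and seq_len")

    text = range(num_text)
    image = range(num_text, seq_len)

    def block(rows, cols):
        # Copy out the sub-matrix for the given query rows and key columns.
        return [[attn[i][j] for j in cols] for i in rows]

    return {
        "text_to_text": block(text, text),
        "text_to_image": block(text, image),
        "image_to_text": block(image, text),
        "image_to_image": block(image, image),
    }


def renormalize_rows(block):
    """Rescale each row to sum to 1 so a sliced block is a distribution again."""
    out = []
    for row in block:
        total = sum(row)
        out.append([v / total for v in row] if total > 0 else list(row))
    return out


# Toy example: 2 text tokens followed by 1 image patch token.
attn = [
    [0.5, 0.25, 0.25],
    [0.25, 0.5, 0.25],
    [0.125, 0.125, 0.75],
]
blocks = split_attention_blocks(attn, num_text=2)
text_only = renormalize_rows(blocks["text_to_text"])
```

The `text_to_text` block is square over the text tokens, so in principle it lines up with a plain token list, while the `text_to_image` block could be reshaped onto the patch grid and drawn as a heatmap over the image. Whether either fits BertViz's views without further changes is untested here.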