kulits / IG-LLM

Code for "Re-Thinking Inverse Graphics With Large Language Models"

Home Page: https://ig-llm.is.tue.mpg.de

Re-Thinking Inverse Graphics With Large Language Models

Peter Kulits*, Haiwen Feng*, Weiyang Liu, Victoria Abrevaya, Michael J. Black

Project Page: https://ig-llm.is.tue.mpg.de

Data and code coming soon.

Summary

We present the Inverse-Graphics Large Language Model (IG-LLM) framework, a general approach to solving inverse-graphics problems. We instruction-tune an LLM to decode a visual (CLIP) embedding into graphics code that can be used to reproduce the observed scene using a standard graphics engine. Leveraging the broad reasoning abilities of LLMs, we demonstrate that our framework exhibits natural generalization across a variety of distribution shifts without the use of special inductive biases.
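To make the idea of "graphics code" concrete, here is a minimal sketch of how a decoded scene description might be parsed back into objects that a graphics engine could consume. The `add(...)` call format, the `SceneObject` fields, and both helper functions are hypothetical and chosen only for illustration; they are not the released IG-LLM format or API, and the LLM and CLIP stages are not modeled here.

```python
import ast
from dataclasses import dataclass


@dataclass
class SceneObject:
    """One object in a scene (hypothetical fields, for illustration only)."""
    shape: str
    color: str
    loc: tuple


def parse_graphics_code(code: str) -> list:
    """Parse lines such as add(shape='cube', color='red', loc=(1.0, 2.0, 0.0)).

    The text is parsed with `ast` rather than executed, so only literal
    keyword arguments to the hypothetical `add` call are accepted.
    """
    objects = []
    for node in ast.parse(code).body:
        if not (isinstance(node, ast.Expr) and isinstance(node.value, ast.Call)):
            raise ValueError("expected a call statement")
        call = node.value
        if not (isinstance(call.func, ast.Name) and call.func.id == "add"):
            raise ValueError("only add(...) calls are supported")
        if call.args:
            raise ValueError("positional arguments are not supported")
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
        objects.append(SceneObject(**kwargs))
    return objects


def to_graphics_code(objects) -> str:
    """Serialize scene objects back into the same hypothetical text format."""
    return "\n".join(
        f"add(shape={o.shape!r}, color={o.color!r}, loc={o.loc!r})" for o in objects
    )


# Stand-in for text that a model might emit for an observed image.
decoded = (
    "add(shape='cube', color='red', loc=(1.0, 2.0, 0.0))\n"
    "add(shape='sphere', color='blue', loc=(-0.5, 0.0, 0.0))"
)
scene = parse_graphics_code(decoded)
```

Because the scene is plain text, the same string can serve both as a training target for the language model and as input to a renderer, which is the property the framework relies on.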

