“Task Success” is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors