stepjam / RLBench

A large-scale benchmark and learning environment.

Home Page: https://sites.google.com/corp/view/rlbench

Question: Understanding the "task description" given by init_episode() for each task.py

yiqiwang8177 opened this issue

Hi,

I'm new to RLBench and want to do research on language-conditioned imitation learning with it.

Could someone help me confirm my understanding of the task descriptions returned by each task's init_episode() function?
Taking straighten_rope.py as an example, these instructions are returned:
['straighten rope', 'pull the rope straight', 'grasping each end of the rope in turn, leave the rope straight' ' on the table', 'pull each end of the rope until it is straight', 'tighten the rope', 'pull the rope tight']
I found that the reset method in task_environment.py calls init_episode(), so the initial observation and the set of sentences listed above are returned to the agent at timestep t = 0.
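One detail worth checking when reading that list: in Python, two adjacent string literals are joined into a single string, so 'grasping each end of the rope in turn, leave the rope straight' ' on the table' is one description, not two. The snippet below is a minimal sketch that copies the list exactly as quoted above and counts its entries; the variable name `descriptions` is only chosen for illustration.

```python
# The list exactly as quoted above. Note the two adjacent string literals
# in the third entry: Python concatenates them into one string.
descriptions = ['straighten rope', 'pull the rope straight', 'grasping each end of the rope in turn, leave the rope straight' ' on the table', 'pull each end of the rope until it is straight', 'tighten the rope', 'pull the rope tight']

# Six descriptions in total, not seven.
num_descriptions = len(descriptions)

# The third description is the full joined sentence.
third = descriptions[2]

for i, text in enumerate(descriptions):
    print(i, text)
```

So "on the table" is the tail of the third sentence rather than a separate instruction.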

Does this mean a language-conditioned policy has to be conditioned on "straighten ..., grasping ..., on the table, ...., pull the rope tight" all at once, starting from t = 0? Naively, one could summarize all the sentences into one big vector and condition the policy on it.

Or is the policy first conditioned on "straighten rope", and then at some point during evaluation it receives "pull the rope straight"; after that sub-goal is done, the agent receives the new instruction "grasping each end of the rope in turn, leave the rope straight", and this continues until all sub-goals (the instructions in the list) are consumed?

I personally prefer the first setting, so that the agent is exposed to all the instructions and is responsible for figuring out how to follow them on its own, instead of receiving a new instruction after each sub-goal of the task is completed.
I found that lots of other benchmarks use the second setting, which is the main reason I want to use RLBench for my research!

Thank you very much

Problem solved!

I noticed that those task descriptions have overlapping meanings: they are alternative phrasings of the same task, not a sequence of sub-goals!
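If the descriptions are treated as interchangeable phrasings of one task, one simple way to use them is to pick a single sentence per episode and condition the policy on that. The sketch below illustrates this under that assumption; `pick_instruction` is a hypothetical helper, not part of RLBench, and the list is copied by hand from the question rather than obtained from the environment.

```python
import random

# Descriptions copied from the question above (hard-coded here so the
# sketch runs without RLBench installed).
DESCRIPTIONS = [
    'straighten rope',
    'pull the rope straight',
    'grasping each end of the rope in turn, leave the rope straight on the table',
    'pull each end of the rope until it is straight',
    'tighten the rope',
    'pull the rope tight',
]


def pick_instruction(descriptions, rng):
    """Hypothetical helper: choose one phrasing to condition on this episode."""
    return rng.choice(descriptions)


# A seeded generator keeps the choice reproducible across runs.
rng = random.Random(0)
instruction = pick_instruction(DESCRIPTIONS, rng)
print(instruction)
```

Sampling a different phrasing each episode exposes the policy to varied wording for the same goal; always taking the first entry would be an equally valid, deterministic choice.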