Efficient Recovery Learning using Model Predictive Meta-Reasoning

This code is associated with our ICRA'23 paper: [Paper] [Project Website]

If you find our paper or code useful, please cite us as:

@inproceedings{vats2023efficient,
  title={Efficient Recovery Learning using Model Predictive Meta-Reasoning},
  author={Vats, Shivam and Likhachev, Maxim and Kroemer, Oliver},
  booktitle={2023 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={7258--7264},
  year={2023},
  organization={IEEE}
}

Installation Instructions

Dependencies

Download MuJoCo 2.0 (mujoco200) if you want to run the door-opening environment.

Steps

Clone this repository with git clone --recurse-submodules.
cd metareskill
pip install -e robosuite
pip install -e rl-utils
pip install -e .
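
After the editable installs, a quick import check can confirm that the packages are visible to Python. The following is a minimal sketch; the top-level import names in ASSUMED_PACKAGES are assumptions based on the directory names above and may differ from the names the packages actually expose.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed top-level import names (hypothetical); adjust them to match your install.
ASSUMED_PACKAGES = ["robosuite", "rl_utils", "metareskill"]

if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    print("missing packages:", missing if missing else "none")
```

An empty result only shows that the packages can be located, not that MuJoCo itself is set up correctly.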

Training

Learning Preconditions

Before learning the recovery skills, you need to learn the task subgoals. This requires a set of nominal controllers. You might need to repeat the four steps below a few times to learn good subgoal classifiers.

Generate subgoals from nominal skill chain

python scripts/learn_skill_chain_preconditions.py initialize=True

Input - Nominal skill chain
Output - all_subgoals.pkl + all_gt_subgoals.pkl + all_labels.pkl

Move the outputs to data/door_opening/final/nominal_skills/subgoals/<dir>
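
The move can also be scripted. The sketch below assumes the three output files sit together in one source directory (where the script actually writes them is not stated here); move_outputs is a hypothetical helper, not part of this repository.

```python
import shutil
from pathlib import Path

# Output files named in the step above.
SUBGOAL_FILES = ["all_subgoals.pkl", "all_gt_subgoals.pkl", "all_labels.pkl"]


def move_outputs(src_dir, dst_dir, filenames=SUBGOAL_FILES):
    """Move `filenames` from src_dir to dst_dir, creating dst_dir if needed."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    # Check everything up front so that a partial move never happens.
    missing = [name for name in filenames if not (src_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {src_dir}: {missing}")
    dst_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in filenames:
        target = dst_dir / name
        shutil.move(str(src_dir / name), str(target))
        moved.append(target)
    return moved


# Example (replace <dir> with your own directory name):
# move_outputs(".", "data/door_opening/final/nominal_skills/subgoals/<dir>")
```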

Train subgoal classifiers

python scripts/learn_subgoals.py

Input - all_subgoals.pkl + all_gt_subgoals.pkl + all_labels.pkl
Output - subgoals.pkl

Move the generated subgoals.pkl to nominal_skills/subgoals

Learn skill preconditions iteratively by chaining

python scripts/learn_skill_chain_preconditions.py finetune=True skill_ids=[2]

Input - subgoals.pkl
Output - learnt_preconds.pkl + learnt_gt_preconds.pkl

Move the generated files to nominal_skills/preconds/, then repeat this step, training the preconditions backward from the goal to the start.
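
The backward iteration can be sketched as a list of commands, one per skill. This assumes that skill IDs are zero-indexed and that the last skill in the chain is the one closest to the goal; num_skills and both helper functions are hypothetical and not part of this repository.

```python
import subprocess
import sys

SCRIPT = "scripts/learn_skill_chain_preconditions.py"


def backchaining_commands(num_skills):
    """Build one finetune command per skill, from the last skill to the first."""
    return [
        [sys.executable, SCRIPT, "finetune=True", f"skill_ids=[{skill_id}]"]
        for skill_id in reversed(range(num_skills))
    ]


def run_backchaining(num_skills, after_each=None):
    """Run the commands in order; `after_each` is an optional callback,
    e.g. one that moves the generated files into nominal_skills/preconds/."""
    for cmd in backchaining_commands(num_skills):
        subprocess.run(cmd, check=True)  # stop at the first failing step
        if after_each is not None:
            after_each(cmd)


# Example for a chain of three skills (assumed length):
# run_backchaining(3)
```

Running each command by hand works just as well; the loop only makes the goal-to-start ordering explicit.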

Verify the preconditions

python scripts/visualize_sim_states.py view_learnt_preconds_probs=True
python scripts/visualize_sim_states.py view_backchaining_states=True

Failure Discovery

Collect failures

python scripts/evaluate_nominal_skills.py

Input - Nominal skill chain + init sets
Output - failures.pkl
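
Before clustering, it can help to check what failures.pkl contains. The helper below is a generic sketch that makes no assumption about the stored type; run it inside the environment where this package is installed, since unpickling needs the original classes to be importable, and only unpickle files you trust.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and return (type name, length or None)."""
    with Path(path).open("rb") as f:
        obj = pickle.load(f)
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


# Example:
# print(summarize_pickle("failures.pkl"))
```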

Cluster failures

python scripts/cluster_failures.py -i <path_to_failures.pkl>

Input - list of failures
Output - list of GMM models

Recovery Learning

Learn recovery skills

python scripts/learn_recovery_skills.py mode=train

Input - failures.pkl + Nominal skill chain + failure clusters
Output - recovery_skills.pkl
