RL Difficulty
Can environment difficulty predict agent performance?
Environment Types
- Fully vs partially observable
- Deterministic vs stochastic
- Competitive vs collaborative
- Single agent vs multiple agents
- Static vs dynamic
- Discrete vs continuous
- Episodic vs sequential
- Known vs unknown
However, other properties of the environment also affect task difficulty.
Other Difficulties in RL
- Reward sparseness
- Long-term credit assignment
- State rarity
- Safety
- Presence of distractions
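As an illustration of the first item, the sketch below contrasts a sparse and a dense reward on the same grid trajectory. The function names, the Manhattan-distance shaping, and the "fraction of zero-reward steps" measure are assumptions made for this example, not definitions taken from these notes.

```python
from typing import Iterable, Tuple

Cell = Tuple[int, int]  # (row, column) position on a grid


def sparse_reward(state: Cell, goal: Cell) -> float:
    """Return 1.0 only when the goal is reached, 0.0 everywhere else."""
    return 1.0 if state == goal else 0.0


def dense_reward(state: Cell, goal: Cell) -> float:
    """Return the negative Manhattan distance to the goal (feedback on every step)."""
    return -float(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))


def reward_sparseness(rewards: Iterable[float]) -> float:
    """Hypothetical measure: fraction of steps whose reward is exactly zero."""
    rewards = list(rewards)
    return sum(1 for r in rewards if r == 0.0) / len(rewards)


goal = (3, 3)
trajectory = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
sparse = [sparse_reward(s, goal) for s in trajectory]
dense = [dense_reward(s, goal) for s in trajectory]
print(reward_sparseness(sparse), reward_sparseness(dense))
```

Under the sparse reward the agent receives no signal until the final step, which is one way the same maze can become harder without changing its layout.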
Experiment Design
The maze is a well-studied environment whose complexity can be adjusted in a controlled way.
- Formally define parameters of difficulty in the context of a maze, and build the corresponding gym environments.
- Benchmark Stable-Baselines3 (SB3) algorithms across varying difficulty levels.
- Formulate a measure of difficulty by relating different difficulty parameters.
- Generalize the measure of difficulty and difficulty parameters to environments in general.
- Measure the correlation between environment difficulty and agent performance.
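The steps above could be prototyped roughly as follows. This is a minimal sketch using only the standard library: `MazeParams`, `generate_maze`, the choice of grid size and wall density as difficulty parameters, shortest-path length as a difficulty proxy, and Spearman rank correlation as the statistic are all assumptions for illustration. It does not wrap the maze in a gym environment or train any SB3 agent, and the scores are placeholders, not results.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class MazeParams:
    """Hypothetical difficulty parameters for a square grid maze."""
    size: int = 8              # side length of the grid
    wall_density: float = 0.2  # probability that a cell is a wall
    seed: int = 0              # makes generation reproducible


def generate_maze(p: MazeParams):
    """Return a size x size grid of booleans, where True marks a wall."""
    rng = random.Random(p.seed)
    grid = [[rng.random() < p.wall_density for _ in range(p.size)]
            for _ in range(p.size)]
    grid[0][0] = False                    # start cell is always open
    grid[p.size - 1][p.size - 1] = False  # goal cell is always open
    return grid


def shortest_path_length(grid):
    """Breadth-first search from top-left to bottom-right; None if unreachable."""
    n = len(grid)
    start, goal = (0, 0), (n - 1, n - 1)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < n and 0 <= nc < n
                    and not grid[nr][nc] and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return None


def ranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation; undefined (division by zero) if either input is constant."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


sizes = [4, 6, 8, 10]
difficulty = [shortest_path_length(generate_maze(MazeParams(size=s, wall_density=0.0)))
              for s in sizes]
placeholder_scores = [0.9, 0.7, 0.6, 0.2]  # made-up values, NOT experimental results
rho = spearman(difficulty, placeholder_scores)
```

A rank correlation is used here because it only assumes a monotonic relationship between difficulty and performance; whether that is the right statistic for the actual experiment is an open choice.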
Other Approaches
- Use the entropy of a predictive model as the measure of difficulty
- Use multiple environments to benchmark agent capabilities (bsuite)
- Use the frequency and magnitude of direction changes to measure the difficulty of mazes (McClendon 2001)
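Two of these ideas can be sketched in a few lines. The first function computes the Shannon entropy of a predictive model's output distribution. The second counts direction changes along a grid path. Both are simplified illustrations with hypothetical names: the turn counter covers only the frequency of direction changes, not their magnitude, and does not reproduce the formula from McClendon 2001.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def direction_changes(path):
    """Count how often consecutive steps along a path of (row, col) cells differ."""
    steps = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(path, path[1:])]
    return sum(1 for s, t in zip(steps, steps[1:]) if s != t)


# A model that is unsure between four next states vs. one that is certain.
uncertain = entropy([0.25, 0.25, 0.25, 0.25])
certain = entropy([1.0, 0.0, 0.0, 0.0])

# An L-shaped path: right twice, then down twice.
turns = direction_changes([(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)])
```

Higher predictive entropy suggests an environment whose next state is harder to anticipate, and more turns suggest a more convoluted solution path. Neither number is a complete difficulty measure on its own.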