crzdg / environment-framework

Loose building blocks to create agent-environment loops.


🌐 Environment Framework

This repository contains the Python package environment-framework. The project aims to provide loose building blocks to manage the logic, observation, estimation and visualization of an agent-environment loop. It can be used to implement problems which might be solved with reinforcement learning or dynamic programming algorithms.

A wrapper around gymnasium is provided to connect to well-known frameworks in the field.

The wrapper for gymnasium uses the gymnasium>=0.26 API structure!
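
For orientation, the gymnasium>=0.26 API structure means that `reset` returns an `(observation, info)` pair and `step` returns the five-tuple `(observation, reward, terminated, truncated, info)`. The sketch below mimics that call shape in plain Python. The `CountdownEnv` class is hypothetical and is not part of environment-framework or gymnasium; it only illustrates the pattern that code consuming the wrapper can expect.

```python
from typing import Any, Dict, Optional, Tuple


class CountdownEnv:
    """Hypothetical toy environment following the gymnasium>=0.26 call shape."""

    def __init__(self, start: int = 3, max_steps: int = 10) -> None:
        self.start = start
        self.max_steps = max_steps
        self.value = start
        self.steps = 0

    def reset(
        self, *, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None
    ) -> Tuple[int, Dict[str, Any]]:
        # seed and options are accepted for signature compatibility, unused here.
        # reset returns (observation, info) instead of a bare observation.
        self.value = self.start
        self.steps = 0
        return self.value, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        # step returns (observation, reward, terminated, truncated, info).
        self.value -= action
        self.steps += 1
        terminated = self.value <= 0  # the task itself has ended
        truncated = not terminated and self.steps >= self.max_steps  # step limit hit
        reward = 0.0 if terminated else -1.0
        return self.value, reward, terminated, truncated, {}


env = CountdownEnv(start=3)
observation, info = env.reset(seed=0)
total_reward = 0.0
terminated = truncated = False
while not (terminated or truncated):
    observation, reward, terminated, truncated, info = env.step(1)
    total_reward += reward
print(observation, total_reward, terminated, truncated)
```

The loop checks both `terminated` and `truncated`, which is the main difference from older gym code that relied on a single `done` flag.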

🤔 Why create this project?

This project emerged from a previous project of mine, where it was used to separate the different elements of that project's agent-environment loop.

🚀 Get Started


pip3 install environment-framework

πŸ‘©β€πŸ« GridWorld Example

The GridWorld example is also available as a Jupyter notebook, grid_world.ipynb.

pip3 install "environment-framework[extra]"
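jupyter lab

import math
from enum import Enum
from random import randint
from typing import Any

import numpy as np
from gymnasium.spaces import Box, Discrete, Space
from numpy.typing import NDArray

# Import path assumed; adjust it if Level and Simulator live elsewhere in the package.
from environment_framework import Level, Simulator

class Action(Enum):
    UP = 0
    DOWN = 1
    RIGHT = 2
    LEFT = 3

class GridWorldGame:
    def __init__(self, size: int) -> None:
        self.size = size
        self.player_position = (0, 0)
        self.target_position = (0, 0)
        self.reset()  # start from a random, non-terminal layout

    # Read as a property, matching `if self.done:` and `simulator.done` below.
    @property
    def done(self) -> bool:
        return self.player_position == self.target_position

    # Assumed to be a property as well.
    @property
    def space(self) -> Space:
        return Discrete(4)

    def act(self, action: Action, **_: Any) -> None:
        # The origin is the top-left corner, so UP decreases y.
        if action == Action.UP:
            self.player_position = (self.player_position[0], self.player_position[1] - 1)
        if action == Action.DOWN:
            self.player_position = (self.player_position[0], self.player_position[1] + 1)
        if action == Action.RIGHT:
            self.player_position = (self.player_position[0] + 1, self.player_position[1])
        if action == Action.LEFT:
            self.player_position = (self.player_position[0] - 1, self.player_position[1])
        # Clamp the position so the player cannot leave the grid.
        corrected_x = max(0, min(self.size - 1, self.player_position[0]))
        corrected_y = max(0, min(self.size - 1, self.player_position[1]))
        self.player_position = (corrected_x, corrected_y)

    def reset(self) -> None:
        def get_random_position() -> int:
            return randint(0, self.size - 1)
        self.player_position = (get_random_position(), get_random_position())
        self.target_position = (get_random_position(), get_random_position())
        if self.done:
            self.reset()  # redraw if player and target landed on the same cell

class GridWorldObserver:
    def __init__(self, game: GridWorldGame) -> None:
        self.game = game

    @property
    def space(self) -> Space:
        return Box(shape=(4,), low=-math.inf, high=math.inf)

    def observe(self, _: Any) -> NDArray:
        # Four values: player x, player y, target x, target y (order assumed).
        return np.array([*self.game.player_position, *self.game.target_position])

class GridWorldEstimator:
    def __init__(self, game: GridWorldGame) -> None:
        self.game = game

    def estimate(self, _: Any) -> float:
        # -1 for every step, 0 for the step that reaches the target.
        return -1 + float(self.game.done)

class GridWorldVisualizer:
    # We use BGR
    BLUE = [255, 0, 0]
    GREEN = [0, 255, 0]

    def __init__(self, game: GridWorldGame) -> None:
        self.game = game

    def render(self, _: Any) -> Any:
        size = self.game.size
        frame = [[[0 for k in range(3)] for j in range(size)] for i in range(size)]
        # Rows are indexed by y, columns by x. Which colour marks which cell is assumed.
        frame[self.game.target_position[1]][self.game.target_position[0]] = self.BLUE
        frame[self.game.player_position[1]][self.game.player_position[0]] = self.GREEN
        return frame

class GridWorldLevel(Level):
    _game: GridWorldGame
    _observer: GridWorldObserver
    _estimator: GridWorldEstimator
    _visualizer: GridWorldVisualizer

    def __init__(
        self,
        game: GridWorldGame,
        observer: GridWorldObserver,
        estimator: GridWorldEstimator,
        visualizer: GridWorldVisualizer,
    ) -> None:
        super().__init__(game, observer, estimator, visualizer)

    def reset(self) -> None:
        # Assumes Level stores the game as self._game, as the annotations above suggest.
        self._game.reset()

    def step(self, action: Action) -> Any:
        if isinstance(action, np.int64):  # handle integer inputs
            action = Action(action)
        self._game.act(action)

game = GridWorldGame(7)
level = GridWorldLevel(game, GridWorldObserver(game), GridWorldEstimator(game), GridWorldVisualizer(game))
simulator = Simulator(level)
while not simulator.done:
    action = Action(randint(0, 3))
    simulator.step(action)  # assumed: the simulator forwards the action to the level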
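
To illustrate the dynamic-programming use case mentioned in the introduction, the following sketch runs value iteration on a GridWorld with a fixed target. It applies the same movement and clamping rules as `GridWorldGame.act` and assumes a reward of -1 for every step that does not reach the target and 0 for the step that does, which is how the estimator's `-1 + float(...)` expression reads if it wraps the game's done flag. The sketch is self-contained and does not use environment-framework; `MOVES`, `step` and `value_iteration` are hypothetical helpers introduced only for illustration.

```python
from typing import Dict, Tuple

Position = Tuple[int, int]

# Same movement convention as GridWorldGame.act: UP decreases y, RIGHT increases x.
MOVES: Dict[str, Position] = {"UP": (0, -1), "DOWN": (0, 1), "RIGHT": (1, 0), "LEFT": (-1, 0)}


def step(position: Position, move: Position, size: int) -> Position:
    """Apply a move and clamp the result to the grid."""
    x = max(0, min(size - 1, position[0] + move[0]))
    y = max(0, min(size - 1, position[1] + move[1]))
    return (x, y)


def value_iteration(size: int, target: Position, tolerance: float = 1e-9) -> Dict[Position, float]:
    """Compute undiscounted optimal state values for a fixed target."""
    values = {(x, y): 0.0 for x in range(size) for y in range(size)}
    while True:
        largest_change = 0.0
        for state in values:
            if state == target:
                continue  # the terminal state keeps value 0
            # Bellman optimality update: best immediate reward plus successor value.
            best = max(
                (-1.0 + float(nxt == target)) + values[nxt]
                for nxt in (step(state, move, size) for move in MOVES.values())
            )
            largest_change = max(largest_change, abs(best - values[state]))
            values[state] = best
        if largest_change < tolerance:
            return values


values = value_iteration(size=4, target=(3, 3))
print(values[(0, 0)], values[(3, 2)])
```

On a 4x4 grid with the target at (3, 3), the corner (0, 0) is six steps away, so its value is -5.0: five steps cost -1 each and the final step into the target costs 0. A greedy policy that always moves to the neighbouring cell with the highest value follows a shortest path.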

📃 Documentation

Some docstrings are already in place. The documentation is a work in progress and will be updated from time to time.

💃🕺 Contribution

I welcome everybody who wants to contribute to this project. Please read the contribution guidelines for more information. Feel free to open an issue on the project if you have any further questions.

💻 Development

The repository provides tools for development using hatch.

All dependencies for the project can also be found in a requirements file.

Install the development dependencies with either of the following commands.

pip3 install -r requirements/dev.txt

pip3 install "environment-framework[dev]"


Running all development tools (type checking, linting and tests) requires hatch.

make all

Type checking

Type checking with mypy.

make mypy


Linting

Linting with pylint.

make lint


Tests

Run tests with pytest.

make test

Update dependencies

Update the Python requirements with pip-compile.

make update

🧾 License

This repository is licensed under the MIT License.


