Markus Semmler's repositories
burrolib
Burrolib is a library for multi-agent Markov games, aimed at researchers. It considers Markov games from an economic perspective. The modular agent design allows different agent implementations for a single game. The user can choose whether to design an expert system or use a free learning approach.
abstract_rl
A modular Python implementation of various policy gradient algorithms for control problems on experimental Quanser robots. This repository includes implementations of Maximum A Posteriori Policy Optimization and Trust Region Policy Optimization, as well as a draft of Soft Actor-Critic.
boed-pytorch
A simple project that explores the variational estimators of Foster et al. (https://arxiv.org/abs/1903.05480) in a Bayesian linear regression setting. The exact (convex) information gain for the regression is calculated with nested Monte Carlo estimators.
bootstrapped-dqn
An implementation of bootstrapped DQN (https://arxiv.org/abs/1602.04621). It was created during my bachelor thesis at TU Darmstadt, and you can find the thesis at http://www.ias.tu-darmstadt.de/uploads/Theses/Abschlussarbeiten/markus_semmler_bsc.pdf.
sticky-hdp-slds-hmm
An implementation of a hierarchical Dirichlet process (HDP) combined with a switching linear dynamical system (SLDS) from https://arxiv.org/abs/1003.3829. It is a rather complex model and thus computationally expensive. Note that the hyperparameters have to be adjusted.
approximate-signal-cancellation
This is a small framework for simulating signal processing algorithms. It includes a GUI and offers simple transformations, ranging from direct inversion and Fourier transform techniques to regression techniques.
qlearn
This repository contains scripts for running Q-learning algorithms on different environments. It uses TensorFlow and features several environments with discrete states and actions. One can display different plots, such as the value function or a comparison between different agents. The focus lies on exploration efficiency.
rnn-tetherball-dynamics
A bachelor project that uses recurrent neural networks to predict the 3-dimensional dynamics of a tetherball. It implements highway networks and gated recurrent units. The implementation is highly modular, based on TensorFlow, and can be adapted accordingly.
sampleproject
A sample project that exists for PyPUG's "Tutorial on Packaging and Distributing Projects".
univariate-distributions
This repository contains several continuous and discrete univariate distributions. It uses the MRG32k3a generator to create uniform samples. These uniform samples are then transformed to yield a sample from any supported distribution. A space system is used to represent the domain of the samples.
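The uniform-to-distribution step described for univariate-distributions can be sketched as follows. This is only an illustration under stated assumptions, not code from the repository: the class name `MRG32k3a`, the helper functions, and the default seed are hypothetical. The transformation shown is inverse transform sampling, which may differ from what the repository does. The recurrence and constants follow L'Ecuyer's published MRG32k3a generator.

```python
import math

# MRG32k3a constants as published by L'Ecuyer (1999).
M1 = 4294967087           # 2**32 - 209
M2 = 4294944443           # 2**32 - 22853
A12, A13N = 1403580, 810728
A21, A23N = 527612, 1370589
NORM = 1.0 / (M1 + 1)


class MRG32k3a:
    """Hypothetical minimal wrapper around the MRG32k3a recurrence."""

    def __init__(self, seed=12345):
        # Assumption: all six state words start at the same seed value.
        if not 0 < seed < M2:
            raise ValueError("seed must satisfy 0 < seed < M2")
        self.s1 = [seed, seed, seed]
        self.s2 = [seed, seed, seed]

    def uniform(self):
        """Return one sample from the open interval (0, 1)."""
        # First component: x_n = (a12 * x_{n-2} - a13n * x_{n-3}) mod m1
        p1 = (A12 * self.s1[1] - A13N * self.s1[0]) % M1
        self.s1 = [self.s1[1], self.s1[2], p1]
        # Second component: y_n = (a21 * y_{n-1} - a23n * y_{n-3}) mod m2
        p2 = (A21 * self.s2[2] - A23N * self.s2[0]) % M2
        self.s2 = [self.s2[1], self.s2[2], p2]
        # Combine both components; map a zero difference to M1 so 0.0 never occurs.
        z = (p1 - p2) % M1
        return (z if z > 0 else M1) * NORM


def sample_exponential(gen, rate):
    """Continuous example: invert the exponential CDF F(x) = 1 - exp(-rate * x)."""
    return -math.log(1.0 - gen.uniform()) / rate


def sample_bernoulli(gen, p):
    """Discrete example: return 1 with probability p, otherwise 0."""
    return 1 if gen.uniform() < p else 0
```

Calling `sample_exponential(MRG32k3a(), 2.0)` draws a single exponential variate. Reusing the same seed reproduces the same stream, which is convenient when comparing distributions.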