This SARSA implementation uses a neural network with eligibility traces to approximate the Q-function.
SARSA (State-Action-Reward-State-Action) - http://en.wikipedia.org/wiki/SARSA
To play with it, download and build
Sarsa Lander - https://github.com/apancik/SarsaLander
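
For readers new to the technique, here is a minimal sketch of SARSA(lambda) with a small neural network as the Q-function approximator. It is not the code from Sarsa Lander: the class name `SarsaLambdaNet`, the one-hidden-layer tanh network, the accumulating traces, and all hyperparameter defaults are assumptions chosen for illustration. It keeps one eligibility trace per network weight and moves every weight along its trace, scaled by the TD error.

```python
import numpy as np


class SarsaLambdaNet:
    """Hypothetical one-hidden-layer Q-network trained with SARSA(lambda)."""

    def __init__(self, n_inputs, n_hidden, n_actions,
                 alpha=0.01, gamma=0.99, lam=0.9, epsilon=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W1 = self.rng.normal(0.0, 0.1, (n_hidden, n_inputs))
        self.b1 = np.zeros(n_hidden)
        self.W2 = self.rng.normal(0.0, 0.1, (n_actions, n_hidden))
        self.b2 = np.zeros(n_actions)
        self.alpha, self.gamma = alpha, gamma
        self.lam, self.epsilon = lam, epsilon
        self.n_actions = n_actions
        self.reset_traces()

    def params(self):
        # References to the weight arrays, so in-place updates stick.
        return [self.W1, self.b1, self.W2, self.b2]

    def reset_traces(self):
        # One eligibility trace per weight, zeroed at episode start.
        self.traces = [np.zeros_like(p) for p in self.params()]

    def q_values(self, state):
        hidden = np.tanh(self.W1 @ state + self.b1)
        return self.W2 @ hidden + self.b2

    def act(self, state):
        # Epsilon-greedy action selection.
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.n_actions))
        return int(np.argmax(self.q_values(state)))

    def gradient(self, state, action):
        # Gradient of Q(state, action) with respect to each weight array.
        hidden = np.tanh(self.W1 @ state + self.b1)
        dW2 = np.zeros_like(self.W2)
        dW2[action] = hidden
        db2 = np.zeros_like(self.b2)
        db2[action] = 1.0
        dhidden = self.W2[action] * (1.0 - hidden ** 2)
        return [np.outer(dhidden, state), dhidden, dW2, db2]

    def update(self, s, a, r, s_next, a_next, done):
        # TD error uses the action actually chosen next (on-policy).
        q_sa = self.q_values(s)[a]
        target = r if done else r + self.gamma * self.q_values(s_next)[a_next]
        delta = target - q_sa
        grads = self.gradient(s, a)
        for p, e, g in zip(self.params(), self.traces, grads):
            e *= self.gamma * self.lam   # decay the old trace
            e += g                       # accumulate the current gradient
            p += self.alpha * delta * e  # step every weight along its trace
        if done:
            self.reset_traces()
        return delta
```

In an episode loop you would call `act` to pick `a_next` in the new state, pass the full (s, a, r, s_next, a_next) tuple to `update`, and then carry `s_next` and `a_next` forward as the current state and action.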