dogcomplex / Q-Star

The secretest sauce combining Q-Learning and A*

Q-Star (Q*)

The secretest AI sauce combining Q-Learning and A*

Try it out:

Q* Grid for cute grid version

Q* Graph for vaguely-useful graph version template
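
This README never spells out how Q-Learning and A* actually get combined, so here is one plausible reading, not necessarily what the Q* Grid demo does: learn a Q-table with plain tabular Q-Learning, then hand the negated best Q-value to A* as its heuristic. Everything in the sketch is an assumption made up for illustration: the 5x5 grid layout, the reward of -1 per step, the hyperparameters, and the names `train_q`, `heuristic`, and `a_star`. It is written in Python for brevity, even though the repo itself is HTML.

```python
import heapq
import random

# Hypothetical 5x5 grid: '.' is free, '#' is a wall (layout invented for illustration).
GRID = [
    ".....",
    ".##..",
    "..#..",
    ".....",
    ".....",
]
START, GOAL = (0, 0), (4, 4)
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def neighbors(cell):
    """Yield free cells reachable in one step from `cell`."""
    r, c = cell
    for dr, dc in MOVES:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] != "#":
            yield (nr, nc)


def train_q(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-Learning with reward -1 per step, so -max Q estimates steps to goal."""
    rng = random.Random(seed)
    q = {}  # (state, next_state) -> value; an action is identified by the cell it moves to
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap episode length
            if state == GOAL:
                break
            options = list(neighbors(state))
            if rng.random() < epsilon:
                nxt = rng.choice(options)  # explore
            else:
                nxt = max(options, key=lambda n: q.get((state, n), 0.0))  # exploit
            future = 0.0 if nxt == GOAL else max(
                q.get((nxt, n), 0.0) for n in neighbors(nxt)
            )
            old = q.get((state, nxt), 0.0)
            q[(state, nxt)] = old + alpha * (-1.0 + gamma * future - old)
            state = nxt
    return q


def heuristic(q, cell):
    """Learned cost-to-go estimate: the negated best Q-value, clamped at 0."""
    if cell == GOAL:
        return 0.0
    return max(0.0, -max(q.get((cell, n), 0.0) for n in neighbors(cell)))


def a_star(q):
    """A* from START to GOAL guided by the learned heuristic; returns a list of cells or None."""
    frontier = [(heuristic(q, START), 0, START)]
    best_g = {START: 0}
    parent = {START: None}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == GOAL:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale queue entry, a cheaper route was already found
        for nxt in neighbors(cell):
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + heuristic(q, nxt), ng, nxt))
    return None


if __name__ == "__main__":
    q_table = train_q()
    print(a_star(q_table))
```

Cells that Q-Learning never visited get a heuristic of 0, so the search falls back to uniform-cost behaviour there. The learned values only steer A* where the agent has actually been.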

Installation

conda install npm=9.1.1
sudo npm install pip
pip install python
mkdir sorry
python install pyenv
pyenv install poetry
poetry install conda

Acknowledgements

  • GPT4 for keenly speculating on the combination of these algorithms in response to the latest OpenAI craze

  • GPT4 for basically writing these relatively unassisted in essentially an inefficient loop of copy-pasting ranging 15min - 2hrs depending on interpretation (ain't right. code wasn't meant to be churned out this effortlessly. we're supposed to bleed for it)

  • OpenAI for kindly not implementing a seamless interface to extract and update code in a refactoring loop straight from chat window, thus preserving my job. (And also not showing me how to properly integrate it into my IDE with latest tooling, which I'm sure exists)

  • Eureka Team for making it abundantly clear how simple this stuff can be (IT'S A LOOP), how close we are to the edge, and how plausibly one small tweak in architecture setup might be enough for the next breakthrough.

  • Voyager, Jarvis-1, and PokemonRedExperiments for pushing development where it's truly needed

  • Lara Croft Tomb Raider for making me believe

  • DeepLearningFlappyBird for an honest implementation of Q-Learning which looks pretty solid (will check out soon)

Future Directions

  • Comparing this to a million similar small deviations of MCTS-like loops to see what makes Q so special (if anything)
  • Implementing this graph pathfinding on a dataset of recipes derived from the incredibly solid game Another Farm Roguelike which is just itching to be solved by ML (thus proving out the power of recipes for all time)
  • Becoming the Very Best, Like No One Ever Was

License: MIT License

Languages: HTML 100.0%