This is a small project exploring systems that teach themselves to play simple games. It is inspired by DeepMind's AlphaGo, but aimed at Tic Tac Toe and Checkers. Hopefully this will keep the systems simple, so that the code can be concise and the agents can be trained on hardware available to mere mortals.
To use this library, clone the repository, then run either train.py or game.py.
- The train.py file trains agents from scratch and saves the best agent it finds in a file called champion.p.
- The game.py file allows a human to play against this agent on the console. Enter a move as the row and column numbers (both counted from 0) separated by a space, e.g. 0 1 [ENTER] to play in the top middle space.
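
The move format above can be illustrated with a small parser. This is only a sketch of how "row col" input might be validated; the function name `parse_move` and the `size` parameter are hypothetical and are not taken from game.py.

```python
def parse_move(text, size=3):
    """Parse a 'row col' string into a (row, col) tuple.

    Hypothetical helper, not part of game.py. Raises ValueError on
    malformed or out-of-range input.
    """
    parts = text.split()
    if len(parts) != 2:
        raise ValueError("expected two numbers separated by a space, e.g. '0 1'")
    # int() itself raises ValueError if a part is not a whole number
    row, col = int(parts[0]), int(parts[1])
    if not (0 <= row < size and 0 <= col < size):
        raise ValueError(f"row and col must be between 0 and {size - 1}")
    return row, col


if __name__ == "__main__":
    # "0 1" is the top middle space on a 3x3 board
    print(parse_move("0 1"))
```

Raising ValueError for every kind of bad input lets a console loop catch a single exception type and prompt the player again.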
To do:
- Code seems to be working well, but agent performance against a human is poor.
  + Investigate adding penalties to losers. Currently the code is designed so that only the agent who moved last can receive a reward or penalty, so this will require some refactoring.
  + train.py produces an agent that seems to play optimally in manual testing.
- Implement unit tests
- Implement a minimax (provably optimal) agent to benchmark the RL agent against
- Implement a checkers ruleset
- Tackle the larger state space.
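
As a rough idea of what the minimax benchmark agent could look like for Tic Tac Toe, here is a minimal sketch. It assumes a board stored as a 9-character string of "X", "O" and spaces in row-major order; that representation and the names `LINES`, `winner` and `minimax` are illustrative assumptions, not the data structures used in this repository.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board, as indices into the board string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Return (value, move) with `player` to move.

    value is from X's point of view: +1 X wins, 0 draw, -1 O wins.
    move is a board index (row = move // 3, col = move % 3), or None
    if the game is already over.
    """
    w = winner(board)
    if w is not None:
        return (1 if w == "X" else -1), None
    if " " not in board:
        return 0, None

    other = "O" if player == "X" else "X"
    best_value, best_move = None, None
    for i, cell in enumerate(board):
        if cell != " ":
            continue
        child = board[:i] + player + board[i + 1:]
        value, _ = minimax(child, other)
        # X maximises the value, O minimises it.
        better = (best_value is None
                  or (player == "X" and value > best_value)
                  or (player == "O" and value < best_value))
        if better:
            best_value, best_move = value, i
    return best_value, best_move


if __name__ == "__main__":
    value, move = minimax(" " * 9, "X")
    print(value, divmod(move, 3))
```

Because the search is exhaustive, an RL agent that never loses against this player over many games, moving both first and second, is playing optimally in the game-theoretic sense. The same approach does not scale to Checkers without depth limits and pruning.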