In this project, I utilise Tabular Q-Learning, a Reinforcement Learning technique, to train competent Artificial Intelligence agents that play Tic Tac Toe. The agents are trained adversarially, starting from random initial play, both against each other and against a random-move algorithm (to improve model robustness and generalisation).
Tabular Q-Learning is a model-free Reinforcement Learning algorithm closely related to the value-iteration methods of Dynamic Programming. Reinforcement Learning (RL) is a branch of Machine Learning which powers innovations such as AlphaGo. RL agents learn from trial and error: rewards and penalties are assigned for each decision made in each state. The objective of an RL agent is to maximise its cumulative reward while minimising its penalty, which, for a well-designed reward function, incrementally improves the performance of the AI. Q-Learning is built on Q-value iteration: with each iteration, the agent refines its estimate of the expected future reward Q(s, a) of taking action a in state s. In Tabular Q-Learning, we seek the optimal Q-values via the following update rule:
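Q(s, a) ← Q(s, a) + α · [ r + γ · max_{a'} Q(s', a') − Q(s, a) ]

where α is the learning rate, γ is the discount factor, r is the immediate reward received after taking action a in state s, and s' is the resulting next state. This is the standard Tabular Q-Learning update.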
Tabular Q-Learning is grounded in the principle of Dynamic Programming, an optimisation method that improves on naive recursion by storing and reusing the results of sub-problems rather than recomputing them, often reducing an exponential running time to polynomial (or even linear) time. Echoing the program-flow models learned in ENGG1110, the tabulation method implemented in this project is a Bottom-Up approach: Q-values are first estimated for individual state-action pairs, and these increasingly refined estimates accumulate upwards into the overall Q-table structure.
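As a concrete illustration, the following is a minimal sketch of this tabular, bottom-up update. The board encoding, reward values, hyperparameters, and function names are assumptions made for illustration, not the project's actual implementation.

```python
import random
from collections import defaultdict

# Q-table: maps a (state, action) pair to its estimated Q-value.
# A state is assumed to be a string encoding of the 3x3 board,
# and an action is the index (0-8) of the cell to play.
Q = defaultdict(float)

ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor
EPSILON = 0.1  # exploration rate for epsilon-greedy play

def choose_action(state, legal_actions):
    """Epsilon-greedy action selection over the current Q-table."""
    if random.random() < EPSILON:
        return random.choice(legal_actions)
    return max(legal_actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_legal_actions):
    """One tabular Q-Learning update for a single state-action pair."""
    best_next = max((Q[(next_state, a)] for a in next_legal_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

Each call to update refines the Q-value of one state-action pair; repeated over many self-play games, these local improvements accumulate into the full Q-table described above.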