A theory of optimal sequential decision-making under uncertainty. It describes how an optimal AI agent would act.
Formal Definition
- Assume a reinforcement learning setup
- An agent interacts with an environment by taking actions and receiving observations and rewards
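- We do not assume the environment is Markovian: observations and rewards may depend on the entire interaction history, not only on the most recent step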
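
The setup above can be sketched as a small interaction loop. This is only an illustrative sketch, not part of the theory itself: the names `counting_env`, `history_agent`, `constant_agent`, and `run` are hypothetical, and the toy environment is an assumption chosen to show what "not Markovian" means in practice. The reward depends on how many steps have elapsed, which the latest observation alone does not reveal, so the agent is modeled as a function of the full history.

```python
from typing import Callable, List, Tuple

# One interaction step: (action, observation, reward)
Step = Tuple[int, int, float]
History = List[Step]


def counting_env(history: History, action: int) -> Tuple[int, float]:
    """Hypothetical non-Markovian environment.

    The observation is always 0, so it carries no information.
    The reward is 1.0 when the action matches the parity of the
    number of steps taken so far, which is a property of the
    whole history rather than of the latest observation.
    """
    target = len(history) % 2
    observation = 0
    reward = 1.0 if action == target else 0.0
    return observation, reward


def history_agent(history: History) -> int:
    """Hypothetical agent that conditions on the full history."""
    return len(history) % 2


def constant_agent(history: History) -> int:
    """Hypothetical agent that ignores the history entirely."""
    return 0


def run(
    agent: Callable[[History], int],
    env: Callable[[History, int], Tuple[int, float]],
    steps: int,
) -> History:
    """Alternate agent actions and environment responses for `steps` rounds."""
    history: History = []
    for _ in range(steps):
        action = agent(history)
        observation, reward = env(history, action)
        history.append((action, observation, reward))
    return history


def total_reward(history: History) -> float:
    """Sum the rewards collected over an interaction history."""
    return sum(reward for _, _, reward in history)
```

In this toy case, `total_reward(run(history_agent, counting_env, 4))` is 4.0, while `total_reward(run(constant_agent, counting_env, 4))` is 2.0, because the agent that ignores the history can only match the target on even-numbered steps.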