A theory of optimal sequential decision-making under uncertainty. It describes how an optimal AI agent would act.

Formal Definition

  • We assume a reinforcement learning setup
  • An agent interacts with an environment by taking actions and receiving observations and rewards
  • We do not assume the environment is Markovian: its responses may depend on the entire interaction history, not only on the most recent step
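
The setup above can be illustrated with a minimal Python sketch. It is not part of the formal definition: the names `delayed_echo_environment`, `alternate`, and `run` are hypothetical, and the toy environment is an assumption chosen only to show a reward that depends on an action two steps back, so the latest observation alone is not enough to predict it.

```python
from typing import Callable, List, Tuple

# One interaction step: (action, observation, reward).
Step = Tuple[int, int, float]
History = List[Step]


def delayed_echo_environment(history: History, action: int) -> Tuple[int, float]:
    """Hypothetical non-Markovian environment (an illustrative assumption).

    The observation echoes the previous action (0 on the first step).
    The reward is 1.0 when the current action equals the action taken
    two steps earlier, so it cannot be computed from the latest step alone.
    """
    observation = history[-1][0] if history else 0
    reward = 1.0 if len(history) >= 2 and action == history[-2][0] else 0.0
    return observation, reward


def alternate(history: History) -> int:
    """Hypothetical policy: a function of the history that alternates 0, 1, 0, 1, ..."""
    return len(history) % 2


def run(
    policy: Callable[[History], int],
    environment: Callable[[History, int], Tuple[int, float]],
    steps: int,
) -> History:
    """Run the agent-environment loop and return the full interaction history."""
    history: History = []
    for _ in range(steps):
        action = policy(history)                            # agent acts on the history so far
        observation, reward = environment(history, action)  # environment responds
        history.append((action, observation, reward))
    return history


if __name__ == "__main__":
    trajectory = run(alternate, delayed_echo_environment, steps=5)
    for action, observation, reward in trajectory:
        print(action, observation, reward)
    print("total reward:", sum(r for _, _, r in trajectory))
```

Both the policy and the environment take the whole history as input rather than a single state, which is how the sketch reflects the absence of a Markov assumption.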