Definition

The value function of

  • a policy $\pi$
  • in an environment $\nu$
  • with a discount factor $\gamma$
  • given a history $h_{<t}$

is defined as:

$$V_{\nu}^{\pi}(h_{<t}) := \mathbb{E}_{\nu}^{\pi}\left[\sum_{k=t}^{\infty}\gamma^{k-t}r_{k} \,\middle|\, h_{<t}\right]$$

- The optimal value is defined as $V_{\nu}^{*}(h_{<t}) := \sup_{\pi}V_{\nu}^{\pi}(h_{<t})$
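
As a rough illustration of the definition, the expectation can be approximated by averaging discounted returns over sampled reward sequences. The sketch below is only that: it assumes the sum runs over rewards from step t onward, truncates the infinite sum at the length of each sequence, and assumes every rollout was generated by the same policy in the same environment after the same history. The function names `discounted_return` and `estimate_value` are hypothetical and not part of the definition.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], t: int, gamma: float) -> float:
    """Finite-horizon version of sum_{k>=t} gamma**(k - t) * r_k.

    Steps are 1-indexed as in the definition, so r_k is stored at
    rewards[k - 1]. The infinite sum is truncated at len(rewards).
    """
    return sum(
        gamma ** (k - t) * rewards[k - 1]
        for k in range(t, len(rewards) + 1)
    )


def estimate_value(
    rollouts: Sequence[Sequence[float]], t: int, gamma: float
) -> float:
    """Monte Carlo estimate of the value: mean discounted return.

    Assumes each rollout is a reward sequence sampled under one fixed
    policy and environment, all sharing the same history before step t.
    """
    returns = [discounted_return(r, t, gamma) for r in rollouts]
    return sum(returns) / len(returns)
```

For a finite set of candidate policies, the supremum in the optimal value reduces to a `max` over the per-policy estimates; over all policies it generally cannot be computed by enumeration.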