Effective planning horizon class

Let $J_{R} : Π \to \mathbb{R}$ be a return function and let $π^{*} = \arg \max_{π} J_{R} (π)$ be an optimal policy. Fix a sequence of policies ${π^{k}}_{k \in N}$ and assume $J_{R} (π^{*}) \neq J_{R} (π^{0})$ . For $ε > 0$ , we define:

H_{eff} (R) = min {k : \frac{J_{R} (π^{*}) - J_{R} (π^{k})}{J_{R} (π^{*}) - J_{R} (π^{0})} \leq ε}
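
The definition can be made concrete with a minimal sketch, assuming the sequence is given as a finite list of returns. The function names `normalized_gap` and `effective_horizon` and all numeric values are hypothetical and chosen only for illustration; returning `None` when no listed policy qualifies is likewise a convention of this sketch, not part of the definition.

```python
from fractions import Fraction

def normalized_gap(returns, j_star, k):
    """Remaining gap to the optimum at step k, relative to the initial gap."""
    delta_0 = Fraction(j_star - returns[0])
    if delta_0 == 0:
        raise ValueError("J(pi*) must differ from J(pi^0)")
    return Fraction(j_star - returns[k]) / delta_0

def effective_horizon(returns, j_star, eps):
    """Smallest k whose normalized gap is <= eps, or None if there is none."""
    for k in range(len(returns)):
        if normalized_gap(returns, j_star, k) <= eps:
            return k
    return None

returns = [0, 4, 7, 9, 10]  # hypothetical J_R(pi^k) for k = 0..4
j_star = 10                 # hypothetical J_R(pi*)

print(effective_horizon(returns, j_star, Fraction(1, 2)))   # 2
print(effective_horizon(returns, j_star, Fraction(1, 10)))  # 3
```

Exact rational arithmetic is used so that the comparison with $ε$ is not affected by floating-point rounding.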

Then we prove that:

(\forall ε > 0 : H_{eff} (R^{'}) = H_{eff} (R)) ⟺ \exists a \neq 0, b \in \mathbb{R} : J_{R^{'}} (π) = a J_{R} (π) + b \forall π \in {π^{*}, π^{0}} \cup {π^{k}}_{k \in N}

Lemma

Let $J_{R} (π^{*}) - J_{R} (π^{0}) \neq 0$ . Then:

(\forall ε > 0 : H_{eff} (R^{'}) = H_{eff} (R)) ⟺ \forall k \in N : f (k, R^{'}) = f (k, R)

where $f (k, R) := \frac{J_{R} (π^{*}) - J_{R} (π^{k})}{J_{R} (π^{*}) - J_{R} (π^{0})}$ .

Proof

$(⟸)$ : Suppose $\forall k : f (k, R^{'}) = f (k, R)$ . Then $\forall ε > 0 : {k : f (k, R^{'}) \leq ε} = {k : f (k, R) \leq ε}$ , and hence $H_{eff} (R^{'}) = H_{eff} (R)$ for every $ε > 0$ .

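$(⟹)$ : We argue by contraposition. Assume there exists $k_{0}$ such that $f (k_{0}, R^{'}) \neq f (k_{0}, R)$ . Without loss of generality, let $f (k_{0}, R^{'}) < f (k_{0}, R)$ (the case $>$ is symmetric). Choose $ε_{0} > 0$ with:

ε_{0} \in (f (k_{0}, R^{'}), f (k_{0}, R))

Such a positive $ε_{0}$ exists whenever $f (k_{0}, R) > 0$ . Then $k_{0}$ belongs to the set ${k : f (k, R^{'}) \leq ε_{0}}$ but not to ${k : f (k, R) \leq ε_{0}}$ , so $H_{eff} (R^{'}) \leq k_{0}$ at $ε = ε_{0}$ . To conclude $k_{0} < H_{eff} (R)$ , no $k < k_{0}$ may satisfy $f (k, R) \leq ε_{0}$ . This step is not automatic: it holds, for example, when the normalized gap is non-increasing in $k$ for both reward functions, which the argument implicitly assumes. Consequently, for $ε = ε_{0}$ :

H_{eff} (R^{'}) \leq k_{0} < H_{eff} (R) ⟹ H_{eff} (R^{'}) \neq H_{eff} (R)

Thus the two horizons differ at $ε = ε_{0}$ , which establishes this direction by contraposition. With both directions shown, the lemma is proven.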
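
The separation step in the proof can be illustrated numerically. The sketch below assumes two hypothetical normalized-gap sequences that are both non-increasing in $k$ and differ only at $k_{0} = 1$ ; the helper name `effective_horizon` and all values are made up for illustration.

```python
from fractions import Fraction

def effective_horizon(gaps, eps):
    """Smallest k with gaps[k] <= eps, or None if no listed k qualifies."""
    return next((k for k, g in enumerate(gaps) if g <= eps), None)

# Hypothetical values of f(k, R) and f(k, R'), both non-increasing in k.
f_r = [Fraction(1), Fraction(3, 5), Fraction(3, 10), Fraction(0)]
f_rp = [Fraction(1), Fraction(2, 5), Fraction(3, 10), Fraction(0)]

k0 = 1                           # index at which the two sequences differ
eps0 = (f_rp[k0] + f_r[k0]) / 2  # a point strictly between the two values

h_rp = effective_horizon(f_rp, eps0)
h_r = effective_horizon(f_r, eps0)
print(eps0, h_rp, h_r)  # 1/2 1 2
```

Here $H_{eff} (R^{'}) = 1 \leq k_{0} < 2 = H_{eff} (R)$ at $ε_{0} = 1/2$ , matching the chain of inequalities in the proof for this monotone example.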

Now, based on the lemma, it suffices to prove:

(\forall k : f (k, R^{'}) = f (k, R)) ⟺ \exists a \neq 0, b : J_{R^{'}} (π) = a J_{R} (π) + b \forall π \in {π^{*}, π^{0}} \cup {π^{k}}_{k \in N}

Let's introduce the notation: $J := J_{R}$ , $J^{'} := J_{R^{'}}$ , $Δ_{k} := J (π^{*}) - J (π^{k})$ , $Δ_{0} := J (π^{*}) - J (π^{0}) \neq 0$ .

$(⟸)$ : Suppose $J^{'} (π) = a J (π) + b$ for all $π \in {π^{*}, π^{0}} \cup {π^{k}}_{k \in N}$ , with $a \neq 0$ . Then for any $k$ :

J^{'} (π^{*}) - J^{'} (π^{k}) = (a J (π^{*}) + b) - (a J (π^{k}) + b) = a (J (π^{*}) - J (π^{k})) = a Δ_{k}

J^{'} (π^{*}) - J^{'} (π^{0}) = a (J (π^{*}) - J (π^{0})) = a Δ_{0}

Since $a \neq 0$ , the denominator $a Δ_{0} \neq 0$ , and:

f (k, R^{'}) = \frac{a Δ_{k}}{a Δ_{0}} = \frac{Δ_{k}}{Δ_{0}} = f (k, R)
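
As a quick sanity check of this cancellation, the following sketch applies several affine maps, including one with $a < 0$ , to a hypothetical list of returns and confirms that the normalized gaps are unchanged. The helper `normalized_gaps` and the numbers are illustrative assumptions, not part of the original argument.

```python
from fractions import Fraction

def normalized_gaps(returns, j_star):
    """List of (J(pi*) - J(pi^k)) / (J(pi*) - J(pi^0)) for every k."""
    delta_0 = Fraction(j_star - returns[0])
    return [Fraction(j_star - j) / delta_0 for j in returns]

returns = [0, 4, 7, 9, 10]  # hypothetical J(pi^k)
j_star = 10                 # hypothetical J(pi*)
base = normalized_gaps(returns, j_star)

# Each pair (a, b) defines J' = a * J + b on the policies considered.
for a, b in [(2, 3), (-1, 5), (Fraction(1, 3), 0)]:
    returns_p = [a * j + b for j in returns]
    j_star_p = a * j_star + b
    assert normalized_gaps(returns_p, j_star_p) == base

print(base)
```

The case $a = -1$ passes as well, since the sign of $a$ cancels between numerator and denominator.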

$(⟹)$ : Suppose $\forall k \in N : f (k, R^{'}) = f (k, R)$ . This means:

\frac{J^{'} (π^{*}) - J^{'} (π^{k})}{J^{'} (π^{*}) - J^{'} (π^{0})} = \frac{J (π^{*}) - J (π^{k})}{J (π^{*}) - J (π^{0})} \qquad (1)

Here $J^{'} (π^{*}) \neq J^{'} (π^{0})$ , since otherwise $f (k, R^{'})$ would be undefined for all $k$ , contradicting the assumption $\forall k : f (k, R^{'}) = f (k, R)$ .

Let's define:

a := \frac{J^{'} (π^{*}) - J^{'} (π^{0})}{J (π^{*}) - J (π^{0})}

b := J^{'} (π^{*}) - a J (π^{*})

Since $J^{'} (π^{*}) \neq J^{'} (π^{0})$ , we have $a \neq 0$ . From $(1)$ , multiplying both sides by $J^{'} (π^{*}) - J^{'} (π^{0}) = a Δ_{0}$ gives:

J^{'} (π^{*}) - J^{'} (π^{k}) = a (J (π^{*}) - J (π^{k})) ⟹ J^{'} (π^{k}) = J^{'} (π^{*}) - a J (π^{*}) + a J (π^{k}) ⟹ J^{'} (π^{k}) = a J (π^{k}) + b

We also check the cases $π^{*}$ and $π^{0}$ separately.

For $π^{*}$ : $a J (π^{*}) + b = a J (π^{*}) + J^{'} (π^{*}) - a J (π^{*}) = J^{'} (π^{*})$ .

For $π^{0}$ , from the definition of $a$ : $J^{'} (π^{0}) = J^{'} (π^{*}) - a Δ_{0} = J^{'} (π^{*}) - a (J (π^{*}) - J (π^{0})) = a J (π^{0}) + b$ .
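
The construction of $a$ and $b$ can be sketched in code. The example below assumes hypothetical returns stored in dictionaries keyed by policy label (`"star"` for $π^{*}$ , integers for $π^{k}$ ), with $J^{'}$ generated as $-3 J + 5$ so that the normalized gaps agree by construction; the function name `recover_affine` is likewise an illustrative choice.

```python
from fractions import Fraction

def recover_affine(J, Jp):
    """Return (a, b) defined from the values at pi* and pi^0, as in the proof."""
    a = Fraction(Jp["star"] - Jp[0]) / Fraction(J["star"] - J[0])
    b = Jp["star"] - a * J["star"]
    return a, b

# Hypothetical returns under R and R'; here Jp = -3 * J + 5 on every policy.
J = {"star": 10, 0: 0, 1: 4, 2: 7, 3: 9}
Jp = {"star": -25, 0: 5, 1: -7, 2: -16, 3: -22}

a, b = recover_affine(J, Jp)
print(a, b)  # -3 5

# The affine relation then holds at pi*, pi^0, and every pi^k in the example.
assert all(Jp[p] == a * J[p] + b for p in J)
```

Only the values at $π^{*}$ and $π^{0}$ are used to define $a$ and $b$ ; the relation at the remaining $π^{k}$ follows from the equality of normalized gaps, which this example satisfies by construction.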

Thus $J^{'} (π) = a J (π) + b$ holds for all $π \in {π^{*}, π^{0}} \cup {π^{k}}_{k \in N}$ with $a \neq 0$ , meaning $J^{'}$ is an affine transformation of $J$ on the specified set. Together with the converse direction, this proves the equivalence.

The proof establishes that $\forall ε > 0 : H_{eff} (R^{'}) = H_{eff} (R)$ if and only if $J_{R^{'}}$ is an affine transformation of $J_{R}$ with $a \neq 0$ on ${π^{*}, π^{0}} \cup {π^{k}}_{k \in N}$ . STARC-equivalence, by contrast, requires $J_{R^{'}} = a J_{R} + b$ with $a > 0$ on the entirety of $Π$ , so it is a strictly stronger condition. In particular, $[R]_{STARC} \subseteq [R]_{H_{eff}}$ , because $a > 0$ is a special case of $a \neq 0$ , and a condition that holds on all of $Π$ also holds on any subset.