Weight Functions: Standard RL (\(T = 1\)), MaxRL (\(T = N\)), Maximum Likelihood (\(T \to \infty\)).
In practice we use \(T = N\), where \(N\) is the number of rollouts.