Weight Functions

Standard RL: \(T = 1\)
MaxRL: \(T = N\)
Maximum Likelihood: \(T \to \infty\)
In practice we use \(T = N\), where \(N\) is the number of rollouts.
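The three regimes can be compared numerically with a small sketch. The text does not give the weight function itself, so the form used here is an assumption: \(w_T(p) = \sum_{k=1}^{T} (1 - p)^{k-1} = \frac{1 - (1 - p)^T}{p}\), where \(p\) is the success probability. It is chosen only because it reduces to a constant weight of 1 at \(T = 1\) and to \(1/p\) as \(T \to \infty\), matching the limits listed above. The name `weight` is hypothetical.

```python
import math


def weight(p: float, T: float) -> float:
    """Assumed weight w_T(p) = (1 - (1 - p)**T) / p for success probability p.

    This closed form is an illustrative assumption, not taken from the text.
    T = 1 gives 1 (standard RL); T = math.inf gives 1 / p (maximum likelihood).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if T < 1:
        raise ValueError("T must be at least 1")
    if math.isinf(T):
        # Limit of the geometric sum as T grows without bound.
        return 1.0 / p
    return (1.0 - (1.0 - p) ** T) / p


if __name__ == "__main__":
    p = 0.25
    for T in (1, 8, 32, math.inf):
        print(f"T = {T}: w = {weight(p, T):.4f}")
```

Under this assumed form, the weight grows monotonically with \(T\) for any fixed \(p < 1\), so intermediate values such as \(T = N\) interpolate between the standard RL and maximum likelihood weightings.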