Variance reduction for policy gradient with action-dependent factorized baselines

7 years ago 14