Reinforcement Learning

TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning

New Tricks for Estimating Gradients of Expectations