A paper on the "Defense-in-Depth Verifier", devoted mainly to defeating reward hacking; a piece of work on RL environments.
Been a while since we've had a paper on provers.
This "Defense-in-Depth Verifier" is actually a clever trick. Most of the paper is dedicated to defeating reward hacks. An exemplary work on what actually goes into "RL environments". https://t.co/pzbJPE3Vs6
likes: 9 | retweets: 0 | replies: 2 | views: 2665