@NVIDIAAI: RL post-training is hitting a rollout bottleneck. This new paper from #NVIDIAResearch shows how speculative decoding in NeMo-RL + @vllm_pr...

@NVIDIAAI 3 | Information level 3 | Published: 2026-05-01T20:00 | Scraped: 2026-05-03 15:25

AI Compute Research

Summary

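NVIDIA Research has published a new paper showing how speculative decoding in NeMo-RL with vLLM can losslessly accelerate rollouts in reinforcement learning (RL) post-training. It reports 1.8x higher throughput for an 8B model and a projected 2.5x end-to-end speedup for a 235B model.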

Key facts

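NVIDIA Research proposes speculative decoding to accelerate rollouts in RL post-training
NeMo-RL + vLLM achieves 1.8x higher throughput (8B model)
Projected end-to-end speedup of 2.5x for the 235B model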
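
As a rough illustration of why speculative decoding can speed up rollouts without changing their output, here is a minimal toy sketch of the greedy case. It is not NeMo-RL or vLLM code and does not reproduce the paper's method: `target_next` and `draft_next` are hypothetical stand-ins for a large policy model and a cheap draft model, and the sampling case (which relies on rejection sampling) is not covered. The draft proposes `k` tokens, the target checks them, and the first disagreement is replaced by the target's own token, so the result matches plain greedy decoding while needing fewer target verification rounds.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]


def target_next(prefix: List[int]) -> int:
    # Hypothetical stand-in for the large policy model's greedy next token.
    return (sum(prefix) * 31 + len(prefix)) % 50


def draft_next(prefix: List[int]) -> int:
    # Hypothetical cheap draft model: agrees with the target except at
    # positions where len(prefix) % 4 == 3.
    tok = target_next(prefix)
    return (tok + 1) % 50 if len(prefix) % 4 == 3 else tok


def greedy_decode(model: NextToken, prompt: List[int], n: int) -> List[int]:
    # Baseline: one model call per generated token.
    out = list(prompt)
    for _ in range(n):
        out.append(model(out))
    return out[len(prompt):]


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], n: int, k: int = 4
) -> Tuple[List[int], int]:
    # Returns the generated tokens and the number of verification rounds.
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < n:
        # 1) The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        ctx = list(out)
        for _ in range(k):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)

        # 2) The target verifies the proposal. A real engine would score all
        #    positions in one batched forward pass; here we count one round.
        rounds += 1
        ctx = list(out)
        for tok in proposal:
            expected = target(ctx)
            if expected != tok:
                ctx.append(expected)  # replace the first mismatch, drop the rest
                break
            ctx.append(tok)  # accepted draft token
        else:
            ctx.append(target(ctx))  # all accepted: target adds a bonus token
        out = ctx
    return out[len(prompt):len(prompt) + n], rounds


prompt = [1, 2, 3]
spec_tokens, rounds = speculative_decode(target_next, draft_next, prompt, 20)
# Lossless in the greedy case: identical to decoding with the target alone.
assert spec_tokens == greedy_decode(target_next, prompt, 20)
print(f"verification rounds: {rounds} for {len(spec_tokens)} tokens")
```

Every emitted token is either confirmed or produced by the target model, which is why the output is unchanged. The speedup depends on how often the draft agrees with the target, so the figures above should not be read off this toy.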

NVIDIA NeMo-RL vLLM

Original text

RL post-training is hitting a rollout bottleneck.

This new paper from #NVIDIAResearch shows how speculative decoding in NeMo-RL + @vllm_project can accelerate rollouts losslessly, with 1.8x higher throughput at 8B and projected 2.5x end-to-end speedup at 235B.

Read the full paper: https://t.co/twR4LEQNmy

likes: 571 | retweets: 86 | replies: 12 | views: 49758