
@NVIDIAAI: SGLang is hitting 180 tok/s/GPU on DeepSeek-V4 decode with ~1M context on Blackwell. Good to see fast progress in open source DeepSeek-V4 ...

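@NVIDIAAI | Information level: 3 (scale: 1 noise/discard; 2 weak; 3 ordinary fact; 4 important industry development; 5 extremely major event. The score measures information salience and is not investment advice.) | Published: 2026-04-30T21:31 | Scraped: 2026-05-03 15:25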
Summary

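NVIDIA AI announced that SGLang reaches 180 tok/s/GPU on DeepSeek-V4 decode on Blackwell hardware with roughly 1M tokens of context. The result comes from Blackwell-specific optimizations by lmsysorg that make better use of the model's hybrid sparse attention.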

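Objective facts
  • SGLang reaches 180 tok/s/GPU on DeepSeek-V4 decode on Blackwell
  • The figure is reported at roughly 1M tokens of context
  • The gain comes from Blackwell-specific optimizations by lmsysorg that better use the model's hybrid sparse attention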
NVIDIA SGLang DeepSeek-V4 Blackwell lmsysorg

Original text

SGLang is hitting 180 tok/s/GPU on DeepSeek-V4 decode with ~1M context on Blackwell.

Good to see fast progress in open source DeepSeek-V4 inference on new hardware.

This comes from Blackwell-specific optimizations by @lmsysorg that better use the model’s hybrid sparse attention.

likes: 311 | retweets: 32 | replies: 11 | views: 32014