@SemiAnalysis_: AMD ALERT 🚀 MI355 is now 40% cheaper than B200 on GLM5 architecture for Single Node serving FP8 14 weeks after the initial launch of GLM5 on...

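@SemiAnalysis_ | Information level: 3 (scale: 1 = noise/discard; 2 = weak; 3 = ordinary fact; 4 = important industry development; 5 = extremely major event). The score measures information salience and is not investment advice. Published: 2026-05-19T17:01 | Fetched: 2026-05-19 23:19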
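Summary

According to @SemiAnalysis_, AMD MI355 is now 40% cheaper than NVIDIA B200 for single-node FP8 serving of the GLM5 architecture. The result was reported 14 weeks after the initial launch of GLM5 and covers both non-MTP and MTP with speculative decoding on SGLang v0.12, for both CUDA and ROCm.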

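Objective facts
  • AMD MI355 is reported to be 40% cheaper than NVIDIA B200
  • The comparison applies to single-node FP8 inference on the GLM5 architecture
  • It covers SGLang v0.12 on both CUDA and ROCm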
AMD MI355 NVIDIA B200 GLM5 SGLang CUDA ROCm

Original text

AMD ALERT 🚀 MI355 is now 40% cheaper than B200 on GLM5 architecture for Single Node serving FP8 14 weeks after the initial launch of GLM5 on both non-MTP & MTP with spec decode for SGLang v0.12 for both CUDA & ROCm.  SPEED IS THE MOAT!! Great work to @AnushElangovan, @roaner, HaiShaw & his team!

Next step is for MI355X to catch up to CUDA when composing production inference optimizations like FP4 & on distributed inferencing where you can gang up MI355 boxes such that per GPU performance goes up thus the cost per million tokens goes down.

likes: 314 | retweets: 32 | replies: 6 | views: 26875
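
The post links higher per-GPU performance to a lower cost per million tokens. A minimal sketch of that arithmetic follows, assuming a fixed hourly node price and steady throughput. The function names and every number in the example are hypothetical placeholders, not measurements from the post or from any benchmark.

```python
def cost_per_million_tokens(node_cost_per_hour: float,
                            num_gpus: int,
                            tokens_per_sec_per_gpu: float) -> float:
    """Dollars per one million tokens for a node at steady throughput.

    All inputs are assumptions supplied by the caller; nothing here is
    taken from a real benchmark.
    """
    if node_cost_per_hour < 0 or num_gpus <= 0 or tokens_per_sec_per_gpu <= 0:
        raise ValueError("cost must be >= 0; GPU count and throughput must be > 0")
    tokens_per_hour = num_gpus * tokens_per_sec_per_gpu * 3600
    return node_cost_per_hour / tokens_per_hour * 1_000_000


def relative_savings(cost_a: float, cost_b: float) -> float:
    """Fraction by which cost_a is cheaper than cost_b (0.4 means 40% cheaper)."""
    if cost_b <= 0:
        raise ValueError("cost_b must be > 0")
    return 1 - cost_a / cost_b


if __name__ == "__main__":
    # Hypothetical placeholders: same node price, per-GPU throughput raised by 25%.
    baseline = cost_per_million_tokens(20.0, 8, 1000.0)
    improved = cost_per_million_tokens(20.0, 8, 1250.0)
    print(f"baseline: ${baseline:.4f} per 1M tokens")
    print(f"improved: ${improved:.4f} per 1M tokens")
    print(f"savings:  {relative_savings(improved, baseline):.0%}")
```

With the hourly price held fixed, cost per million tokens scales inversely with per-GPU throughput, which is why the post treats performance gains and cost reductions as the same goal.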