
@SemiAnalysis_: As we've come to expect from a DeepSeek release, DeepSeek V4 comes with more flashy ML systems optimizations. This time? MegaMoE, a 1400 lin...

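@SemiAnalysis_ | Information level: 3 (scale: 1 noise/discard; 2 weak; 3 ordinary fact; 4 important industry development; 5 extremely significant event. The score measures information significance and is not investment advice.) | Published: 2026-05-15T23:00 | Fetched: 2026-05-16 04:03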
Summary

DeepSeek has released V4, which introduces MegaMoE: a 1400-line fused CUDA kernel that computes the entire MoE (mixture-of-experts) forward pass.
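For readers unfamiliar with what an "entire MoE forward pass" covers, the following is a minimal, unfused reference sketch in plain Python. It is not MegaMoE and not DeepSeek's code: the source gives no details of the kernel's internals, so the function names, the ReLU activation, the top-k selection, and the gate renormalization are all illustrative assumptions. It only shows the stages (routing, expert selection, expert feed-forward, weighted combine) that a fused kernel would presumably execute in one launch instead of several.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def expert_ffn(x, W_in, W_out):
    """Hypothetical two-layer expert MLP; ReLU is an assumption."""
    hidden = [max(0.0, h) for h in matvec(W_in, x)]
    return matvec(W_out, hidden)


def moe_forward(tokens, router_W, experts, top_k=2):
    """Reference MoE forward pass, one token at a time.

    tokens:   list of input vectors
    router_W: one row of router weights per expert
    experts:  list of (W_in, W_out) pairs, one per expert
    top_k:    number of experts each token is routed to (assumed)
    """
    outputs = []
    for x in tokens:
        # 1. Routing: score every expert for this token.
        probs = softmax(matvec(router_W, x))
        # 2. Selection: keep the top_k highest-scoring experts.
        chosen = sorted(range(len(probs)), key=lambda e: probs[e], reverse=True)[:top_k]
        norm = sum(probs[e] for e in chosen)
        # 3. Expert compute and 4. weighted combine.
        y = [0.0] * len(x)
        for e in chosen:
            W_in, W_out = experts[e]
            out = expert_ffn(x, W_in, W_out)
            gate = probs[e] / norm  # renormalize over the chosen experts
            y = [yi + gate * oi for yi, oi in zip(y, out)]
        outputs.append(y)
    return outputs
```

In a conventional implementation each of these stages tends to be a separate kernel launch with intermediate results written to memory between them; fusing them is the kind of optimization the tweet describes.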

Objective facts
  • DeepSeek V4 has been released
  • MegaMoE is a 1400-line fused CUDA kernel
  • It computes the entire MoE forward pass
DeepSeek CUDA

Original text

As we've come to expect from a DeepSeek release, DeepSeek V4 comes with more flashy ML systems optimizations. This time? MegaMoE, a 1400 line fused CUDA kernel that computes the entire MoE forward pass. Let's see how it works (1/4) 🧵 https://t.co/rqv6y2i3JV

likes: 122 | retweets: 14 | replies: 4 | views: 15997