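DeepSeek v4 released, showcasing long-context efficiency techniques such as CSA, HCA, and mHC; the flash variant costs only 8% of the pro version, and the release includes the best open base models.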
IMO DeepSeek v4 demonstrated utter confidence and competence by not benchmaxxing, not focusing on some BS final run cost, not even spending inference-optimal compute.
just showed up, demonstrated SOTA long context efficiency techniques (CSA, HCA, mHC; flash at 8% the cost of pro, which itself is 14% the cost of opus), dropped the best open base models in the world, peaced out.
BYO posttraining. leave that to the agent labs to pick up the scraps. bravo.