Deep|DeepSeek V4: The Inflection Point for Large-Scale NAND-Based KV Cache

FundaAI 4 Information level 4 Published: 2026-04-29T13:45 Retrieved: 2026-05-06 07:19

AI compute industry

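Summary

In its second round of API price cuts, DeepSeek V4 lowers the cache-hit tier to just ¥0.025 per million tokens, with a cache hit rate above 95%. V4 compresses the KV cache to 10% of V3.2's size and has engineered large-scale migration of KV cache onto SSD, driving growth in NAND demand.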

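Objective facts

DeepSeek V4's second round of API price cuts lowers the cache-hit price to ¥0.025 per million tokens
V4's cache hit rate exceeds 95%, and its KV cache size is compressed to 10% of V3.2's
SSD-based KV cache technology is driving exponential growth in NAND demand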

DeepSeek NAND SSD

Original text

In our previous article, we discussed DeepSeek V4’s architectural customization on non-NVIDIA hardware and the first round of API price cuts (75% off). This article focuses on V4’s second round of cuts: DeepSeek separately cut the input cache-hit tier to 1/10 of its list price, stacked on top of the previous round’s 75% discount, which brings the floor to ¥0.025 per million tokens. The cache-hit price thus falls from 1/12 to 1/120 of the cache-miss price (cache hit ¥0.025 vs. cache miss ¥3). DeepSeek V4’s real-world cache hit rate in agent settings has reached 95%+, and based on our research, DeepSeek’s SSD configuration and utilization have stepped up materially. Two things sit behind this: V4 compresses KV cache size to 10% of V3.2’s, and DeepSeek has accumulated engineering experience with SSD-based KV cache. Together they allow KV cache to migrate at scale from expensive, capacity-limited DRAM / HBM onto larger and cheaper SSD. We believe DeepSeek V4’s cache-hit repricing implies upside for SSD, with NAND demand set to grow exponentially.
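
The pricing arithmetic above can be checked with a short Python sketch. The ¥0.025 and ¥3 prices and the 95% hit rate come from the paragraph; the earlier cache-hit price of ¥0.25 is not stated in the text and is inferred here from the 1/12 ratio. The function names are illustrative, and the blended price is a simplification that covers input tokens only and assumes every input token is billed at one of these two tiers.

```python
from fractions import Fraction

# Prices in CNY per million input tokens, taken from the paragraph above.
CACHE_MISS = Fraction("3")
CACHE_HIT_NEW = Fraction("0.025")
# Inferred, not stated in the article: a 1/12 ratio implies 3 / 12 = 0.25.
CACHE_HIT_OLD = CACHE_MISS / 12


def spread(hit_price, miss_price):
    """Return the cache-hit price as a fraction of the cache-miss price."""
    return hit_price / miss_price


def blended_input_price(hit_rate, hit_price, miss_price):
    """Average input price per million tokens at a given cache hit rate."""
    return hit_rate * hit_price + (1 - hit_rate) * miss_price


hit_rate = Fraction("0.95")  # lower bound of the "95%+" figure
old = blended_input_price(hit_rate, CACHE_HIT_OLD, CACHE_MISS)
new = blended_input_price(hit_rate, CACHE_HIT_NEW, CACHE_MISS)
miss_share = (1 - hit_rate) * CACHE_MISS / new

print("ratio before:", spread(CACHE_HIT_OLD, CACHE_MISS))
print("ratio after:", spread(CACHE_HIT_NEW, CACHE_MISS))
print("blended price before:", float(old))
print("blended price after:", float(new))
print("share of blended price from cache misses:", round(float(miss_share), 3))
```

Under these assumptions, a workload with a 95% hit rate sees its blended input price fall from ¥0.3875 to ¥0.17375 per million tokens, and roughly 86% of what remains comes from the 5% of tokens that miss the cache. In this simplified model, any further gain in hit rate therefore has an outsized effect on cost.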