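The tweet says someone has recreated DeepSeek's optical context compression from first principles, and points out that its "tokens" are actually KV slots that still take more disk space than both the text and the rendered image, so the technique works only because most cache designs are bloated.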
They've recreated DeepSeek-Optical Contexts Compression from first principles…
to be clear, these "tokens" are, rather, KV slots. In disk space it's still larger than both the text and even the rendered image. So the only reason this works is that most cache designs are BLOATED https://t.co/clpsLC2SXW
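The size argument can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: the layer count, KV head count, head dimension, fp16 storage, and the slot count of 100 are hypothetical values chosen for the example, not figures from DeepSeek or from the recreation mentioned in the tweet. It assumes a plain cache layout in which each slot holds one key vector and one value vector per layer.

```python
def kv_bytes_per_slot(num_layers: int, num_kv_heads: int, head_dim: int,
                      bytes_per_value: int = 2) -> int:
    """Bytes one KV slot occupies across all layers.

    The factor of 2 covers the key vector plus the value vector.
    bytes_per_value=2 assumes fp16/bf16 storage (an assumption).
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


def text_bytes(text: str) -> int:
    """Size of the raw text when stored as UTF-8."""
    return len(text.encode("utf-8"))


# Hypothetical small-model configuration, for illustration only.
slot_size = kv_bytes_per_slot(num_layers=12, num_kv_heads=8, head_dim=64)

# Hypothetical scenario: a 5000-byte passage held in 100 KV slots.
passage = "word " * 1000
num_slots = 100
cache_size = num_slots * slot_size

print(f"bytes per KV slot: {slot_size}")
print(f"raw text bytes:    {text_bytes(passage)}")
print(f"KV cache bytes:    {cache_size}")
print(f"cache / text:      {cache_size / text_bytes(passage):.1f}x")
```

Under these assumed numbers, even a cache with ten times fewer slots than words is far larger than the text itself, which is the sense in which a low "token" count does not mean a small footprint.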
likes: 21 | retweets: 1 | replies: 0 | views: 9819