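Twitter user @teortaxesTex points out that a V4 model's cache footprint has shrunk further, to 360 MB per 1M tokens of context. That works out to 360 bytes per token, only about two orders of magnitude above the raw plaintext limit.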
This is pretty crazy ("Project status" under the abstract is also an insane detail). Further shrinking of V4 cache footprint to… 360 MB per 1M context? 360 bytes per token? Just 2 OOMs from the raw plaintext limit?
Calling CSA «conventional» is crazy work lmao.
@antirez !! https://t.co/X6AY2muwrW
likes: 46 | retweets: 2 | replies: 1 | views: 5604
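
The arithmetic in the tweet can be checked with a rough sketch. It assumes decimal megabytes (1 MB = 10^6 bytes) and a plaintext cost of about 4 bytes per token. That last figure is a common rule of thumb for English text, not something stated in the tweet, and all names below are illustrative.

```python
import math

# Figures quoted in the tweet.
CACHE_BYTES = 360 * 10**6      # 360 MB, assuming decimal megabytes
CONTEXT_TOKENS = 10**6         # 1M tokens of context

# Assumption, not from the tweet: raw plaintext costs roughly
# 4 bytes per token (rule of thumb for English text).
PLAINTEXT_BYTES_PER_TOKEN = 4.0


def bytes_per_token(cache_bytes: int, tokens: int) -> float:
    """Average cache footprint per token of context."""
    return cache_bytes / tokens


def orders_of_magnitude(a: float, b: float) -> float:
    """How many powers of ten separate a from b."""
    return math.log10(a / b)


per_token = bytes_per_token(CACHE_BYTES, CONTEXT_TOKENS)
gap = orders_of_magnitude(per_token, PLAINTEXT_BYTES_PER_TOKEN)

print(f"{per_token:.0f} bytes per token")
print(f"{gap:.2f} orders of magnitude above plaintext")
```

Under these assumptions the cache costs 360 bytes per token and the gap comes out just under two orders of magnitude, consistent with the "2 OOMs" in the tweet. A different bytes-per-token estimate for plaintext shifts the gap slightly but not the overall picture.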