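According to a tweet, someone derived from first principles the number of tokens GPT-5 was pretrained on, the KV cache byte count of Gemini 3, and the type of memory that Claude cache hits sit on.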
Btw a bunch of the questions were just off the cuff - nothing @reinerpope prepped for.
The guy is just first-principles deriving how many tokens GPT-5 was pretrained on, or the bytes per token in Gemini 3's KV cache, or which kind of memory each Claude cache hit sits on.
likes: 838 | retweets: 23 | replies: 19 | views: 72180