@teortaxesTex: no, that era has *ended*, nobody is seriously benchmaxxing now. Originally this term meant training on actual test, at best on paraphrases, ...

@teortaxesTex | Information level: 3 | Published: 2026-06-08T01:10 | Fetched: 2026-06-08 05:19

AI industry

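Summary

A tweet on AI benchmarking trends argues that the benchmaxxing era has ended and that nobody seriously trains on test sets anymore. Continuously updated benchmarks such as LCB were introduced in response, and around early 2025 models stopped crashing on post-cutoff slices.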

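Objective facts

The benchmaxxing era has ended
Continuously updated benchmarks such as LCB have appeared
Around early 2025 (≈R1), models stopped crashing on post-cutoff slices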

LCB

Original text

no, that era has ended, nobody is seriously benchmaxxing now. Originally this term meant training on actual test, at best on paraphrases, and we came up with continuously updated benchmarks like LCB. Around early 2025 (≈R1), models stopped crashing on post-cutoff slices. https://t.co/Y87LKngIXk

likes: 34 | retweets: 0 | replies: 2 | views: 4553
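
The "post-cutoff slice" check the post refers to can be illustrated with a minimal sketch: split benchmark problems by release date relative to a model's training cutoff and compare pass rates on each side. A large drop after the cutoff would suggest the earlier problems leaked into training. Everything here is an assumption for illustration: `Result`, `pass_rate`, and `cutoff_gap` are hypothetical names, the data is made up, and none of this is LCB's actual API or methodology.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Result:
    problem_id: str
    released: date  # when the problem was published (hypothetical field)
    passed: bool    # whether the model solved it


def pass_rate(results):
    """Return the fraction of solved problems, or None for an empty slice."""
    if not results:
        return None
    return sum(r.passed for r in results) / len(results)


def cutoff_gap(results, cutoff):
    """Return (pre-cutoff pass rate, post-cutoff pass rate)."""
    pre = [r for r in results if r.released <= cutoff]
    post = [r for r in results if r.released > cutoff]
    return pass_rate(pre), pass_rate(post)


# Made-up results, for illustration only.
results = [
    Result("p1", date(2024, 3, 1), True),
    Result("p2", date(2024, 5, 1), True),
    Result("p3", date(2024, 9, 1), True),
    Result("p4", date(2024, 11, 1), False),
]

pre_rate, post_rate = cutoff_gap(results, cutoff=date(2024, 6, 30))
print(pre_rate, post_rate)
```

In this toy data the pre-cutoff rate is higher than the post-cutoff rate, which is the pattern the post says stopped showing up around early 2025. A real comparison would also need to control for changes in problem difficulty over time.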