
@huggingface: RT @nathanhabib1011: The SWE-bench Verified leaderboard on @huggingface now compares almost 50 models... Community benchmarking > closed...

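@huggingface · Information level: 3 (scale: 1 noise/discard; 2 weak; 3 ordinary fact; 4 important industry development; 5 extremely major event). The score reflects information salience and is not investment advice. Published: 2026-05-06T13:58 · Fetched: 2026-05-06 16:02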
Summary

The SWE-bench Verified leaderboard on Hugging Face now compares almost 50 models; the post stresses that community benchmarking beats closed testing.

Objective facts
  • The SWE-bench Verified leaderboard on Hugging Face compares almost 50 models
Hugging Face SWE-bench

Original text

RT @nathanhabib1011: The SWE-bench Verified leaderboard on @huggingface now compares almost 50 models...

Community benchmarking > closed b…

likes: 14 | retweets: 10 | replies: 7 | views: 4121