The SWE-bench Verified leaderboard on Hugging Face now compares nearly 50 models, emphasizing that community benchmarking beats closed benchmarking.
RT @nathanhabib1011: The SWE-bench Verified leaderboard on @huggingface now compares almost 50 models...
Community benchmarking > closed b…
likes: 14 | retweets: 10 | replies: 7 | views: 4121