← 返回列表

@ClementDelangue: We just crossed 1,000,000 public datasets on Hugging Face! That's petabytes of data available that millions of AI builders are downloading, ...

@ClementDelangue 3 信息等级 3 1 噪音/剔除;2 较弱;3 普通事实;4 重要行业动态;5 极重大事件。该分数是信息显著性,不是投资建议。 发布:2026-05-12T15:16 抓取:2026-05-12 16:02
🔗 原文链接
摘要

Hugging Face 公开数据集数量突破100万个,过去8个月数量翻倍(从50万到100万),加速原因与AI Agents能力提升相关。数据被视为AI构建的下一个瓶颈。

客观事实
  • Hugging Face 公开数据集达到100万个
  • 过去8个月数据集数量翻倍
  • 数据被认为是AI构建的下一个瓶颈
Hugging Face

原文

We just crossed 1,000,000 public datasets on Hugging Face! That's petabytes of data available that millions of AI builders are downloading, analyzing, and training AI models on every day!

What's interesting is that we see a clear acceleration since agents started to be good as the number of datasets doubled over the past 8 months (it took 4 years to reach the first 500k). It's becoming easier and faster to build, share and use your own datasets!

Many are saying the next bottleneck for more people to build AI themselves (instead of relying on APIs) is better data so we're just getting started! Thanks everyone for your amazing contributions, we couldn't do it without you!

likes: 77 | retweets: 18 | replies: 12 | views: 3477