@ClementDelangue: We just crossed 1,000,000 public datasets on Hugging Face! That's petabytes of data available that millions of AI builders are downloading, ...

@ClementDelangue | Information level: 3 | Published: 2026-05-12T15:16 | Fetched: 2026-05-12 16:02

AI, Industry News, Compute

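Summary

The number of public datasets on Hugging Face has passed 1 million, doubling over the past 8 months (from 500,000 to 1 million). The post links this acceleration to improvements in AI agents' capabilities. Better data is described as the next bottleneck for people building AI themselves.

Objective facts

Public datasets on Hugging Face have reached 1 million
The number of datasets doubled over the past 8 months
Better data is regarded as the next bottleneck for building AI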

Hugging Face

Original text

We just crossed 1,000,000 public datasets on Hugging Face! That's petabytes of data available that millions of AI builders are downloading, analyzing, and training AI models on every day!

What's interesting is that we see a clear acceleration since agents started to be good as the number of datasets doubled over the past 8 months (it took 4 years to reach the first 500k). It's becoming easier and faster to build, share and use your own datasets!

Many are saying the next bottleneck for more people to build AI themselves (instead of relying on APIs) is better data so we're just getting started! Thanks everyone for your amazing contributions, we couldn't do it without you!

likes: 77 | retweets: 18 | replies: 12 | views: 3477