Sourcing — Feed

3 @huggingface: RT @abidlabs: This week, I got our GitHub Actions to use @HuggingFace Jobs instead of the default GitHub CI runners, making workflows run o…

2026-05-29T17:59

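abidlabs reports on X that, this week, the team's GitHub Actions workflows were moved from the default GitHub CI runners to HuggingFace Jobs, and that the workflows reportedly run faster as a result.

abidlabs switched GitHub Actions to HuggingFace Jobs
Workflows reportedly run faster

@huggingface ↗ X AI Cloud Computing Compute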

3 @huggingface: RT @oscarmartin: El mundo de la IA es local, ya no me cabe duda 💪 @_nasch_ sacando 87 tok/s con Qwen3.6 27B en una AMD de consumo. Yo en…

2026-05-29T16:58

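According to a post on X, user @_nasch_ is getting 87 tok/s from Qwen3.6 27B on consumer-grade AMD hardware, which the poster presents as evidence that AI is going local.

Qwen3.6 27B reaches 87 tok/s on consumer-grade AMD hardware

@huggingface ↗ X AI Compute Semiconductors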

3 @huggingface: RT @VikParuchuri: Announcing Surya OCR 2: - 650M params - 83.3% olmocr bench score (top under 3B) - 87% on internal 91-lang benchmark - 5…

2026-05-29T02:19

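VikParuchuri announced Surya OCR 2, a 650M-parameter model that scores 83.3% on the olmocr benchmark (the top score among models under 3B parameters) and 87% on an internal 91-language benchmark.

Surya OCR 2 has 650M parameters.
It scores 83.3% on the olmocr benchmark, the top result under 3B.
It scores 87% on an internal 91-language benchmark.

@huggingface ↗ X AI Compute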

3 @huggingface: RT @Gradio: A hackathon called "Build Small" max 32B params. the model fits on a laptop. somehow that pitch got us OpenAI, NVIDIA, OpenBMB…

2026-05-29T02:11

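Gradio announced a hackathon called "Build Small": models are capped at 32B parameters and must fit on a laptop. According to the post, the pitch has brought OpenAI, NVIDIA, and OpenBMB on board, among others.

The Build Small hackathon caps models at 32B parameters
Models must fit on a laptop
Organizations on board include OpenAI, NVIDIA, and OpenBMB

@huggingface ↗ X Industry AI Compute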

3 @huggingface: RT @mr_r0b0t: Official @NVIDIAAI GLM5.1-NVFP4 spotted on @huggingface 🤩 https://t.co/A2ycGBIpDq

2026-05-28T13:53

An official NVIDIA GLM5.1-NVFP4 model has been spotted on Hugging Face, according to a post by mr_r0b0t.

NVIDIA's GLM5.1-NVFP4 model appears on Hugging Face

@huggingface ↗ X AI Compute

3 @huggingface: RT @julien_c: With 104M of image-text pairs, this is one of the largest, if not the largest, openly-licensed image dataset And it's on @hu…

2026-05-28T13:52

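An openly licensed dataset of 104M image-text pairs is now available on Hugging Face. julien_c describes it as one of the largest, if not the largest, openly licensed image datasets.

A dataset of 104M image-text pairs has been released
It is among the largest openly licensed image datasets
It is hosted on Hugging Face

@huggingface ↗ X AI Compute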

3 @huggingface: RT @ClementDelangue: The HF science team just made async RL weight sync ~100x cheaper on bandwidth, and you don't need a shared cluster any…

2026-05-28T13:45

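According to ClementDelangue, the Hugging Face science team has made weight sync for asynchronous RL about 100x cheaper in bandwidth, and a shared cluster is no longer required.

The Hugging Face science team cut the bandwidth cost of async RL weight sync by about 100x
The approach does not need a shared cluster

@huggingface ↗ X AI Compute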

3 @huggingface: RT @multimodalart: NVidia just released PiD: super resolution in pixel space directly from model latents 🔎 4X resolution for any generated…

2026-05-26T16:34

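NVIDIA released PiD, a super-resolution method that works in pixel space directly from model latents and delivers 4x resolution for generated output.

NVIDIA released PiD super resolution
PiD delivers a 4x increase in resolution

@huggingface ↗ X AI Compute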

3 @huggingface: RT @ClementDelangue: llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀 Qwen3.6-27B dense generation bel…

2026-05-24T22:13

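HuggingFace retweeted a post saying that llama.cpp with MTP support makes local models fast enough to use as daily drivers, citing dense generation with Qwen3.6-27B. The post had 122 likes, 12 reposts, 11 replies, and 9,051 views.

llama.cpp gains MTP support
Local generation with Qwen3.6-27B gets faster
Local models become fast enough for daily use

@huggingface ↗ X AI Compute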

3 @huggingface: RT @ggerganov: Highlighting the new WebGPU backend in llama.cpp/ggml The work to bring full-fledged WebGPU support in llama.cpp started ab…

2026-05-22T16:13

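ggerganov highlighted the new WebGPU backend in llama.cpp/ggml, the result of an effort to bring full-fledged WebGPU support to llama.cpp. WebGPU is the browser GPU API, so the backend opens the way to GPU-accelerated inference in web browsers.

llama.cpp adds a WebGPU backend
The backend is part of llama.cpp/ggml
The goal is full-fledged WebGPU support

@huggingface ↗ X AI Compute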

3 @huggingface: RT @populartourist: llama.cpp release b9235 added some new toys for boosting inference. Benchmarked Qwen3.6 27B on an RTX 5090 with llama.…

2026-05-20T14:39

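llama.cpp release b9235 added new features for boosting inference. populartourist benchmarked Qwen3.6 27B on an RTX 5090 with the new release.

llama.cpp release b9235 adds inference-boosting features
Qwen3.6 27B was benchmarked on an RTX 5090

@huggingface ↗ X AI Compute Industry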

3 @huggingface: RT @alvarobartt: Latest `hf-mem` now breaks down Mixture-of-Experts (MoE) memory estimations into base weights, routed experts, and KV cach…

2026-05-18T22:10

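According to alvarobartt, the latest `hf-mem` now breaks down Mixture-of-Experts (MoE) memory estimates into three components: base weights, routed experts, and KV cache.

`hf-mem` now splits MoE memory estimates into three components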
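
As a rough illustration of what such a three-way breakdown involves, the sketch below estimates each component from first principles. It does not use or reproduce `hf-mem`: the function name, the parameter counts, and the model shape are all hypothetical, and the KV-cache formula assumes standard attention with one key and one value vector cached per layer, KV head, and token.

```python
def moe_memory_gib(base_params, routed_expert_params, bytes_per_param,
                   n_layers, n_kv_heads, head_dim, context_len,
                   kv_bytes_per_value=2, batch_size=1):
    """Split a rough MoE memory estimate into three parts, in GiB.

    All arguments describe a hypothetical model; nothing here is read
    from a real checkpoint or from the hf-mem tool.
    """
    gib = 1024 ** 3
    # Weights shared by every token (attention, embeddings, router, ...).
    base = base_params * bytes_per_param / gib
    # Routed experts: all of them must be resident, even if few are active.
    experts = routed_expert_params * bytes_per_param / gib
    # KV cache: keys and values (factor 2) per layer, KV head, and token.
    kv_cache = (2 * n_layers * n_kv_heads * head_dim
                * context_len * batch_size * kv_bytes_per_value) / gib
    return {"base_weights": base, "routed_experts": experts, "kv_cache": kv_cache}


# Hypothetical 30B-parameter MoE stored in 16-bit precision.
estimate = moe_memory_gib(
    base_params=3e9, routed_expert_params=27e9, bytes_per_param=2,
    n_layers=48, n_kv_heads=4, head_dim=128, context_len=32768,
)
for name, value in estimate.items():
    print(f"{name}: {value:.2f} GiB")
```

Separating the routed experts matters because they dominate the weight footprint while only a few are active per token, and the KV cache is the only term that grows with context length.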

@huggingface ↗ X AI Compute

3 @huggingface: RT @victormustar: llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀 Qwen3.6-27B dense generation (on A10…

2026-05-18T22:09

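victormustar writes that llama.cpp with MTP support makes local models fast enough to use as daily drivers, citing Qwen3.6-27B dense generation (on A10…).

llama.cpp adds MTP support, speeding up local inference.
Qwen3.6-27B dense generation is cited as the example (on A10…).

@huggingface ↗ X AI Compute Updates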

3 @huggingface: RT @ggerganov: llama.cpp adds MTP for the Qwen3.6 family This is a significant milestone for the local AI ecosystem. The performance jump…

2026-05-18T18:41

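llama.cpp adds multi-token prediction (MTP) support for the Qwen3.6 family. ggerganov calls it a significant milestone for the local AI ecosystem and points to a jump in performance.

llama.cpp adds MTP for the Qwen3.6 family
ggerganov describes the update as a milestone for the local AI ecosystem

@huggingface ↗ X AI Compute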

3 @huggingface: RT @ngxson: Qwen3.6-27B running 100% on WebGPU. Not the best speed but still 😁 https://t.co/Z1dpMkzykr

2026-05-18T13:26

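ngxson reports that Qwen3.6-27B is running 100% on WebGPU, though not at the best speed. The demo suggests that WebGPU inference is feasible for a model of this size.

Qwen3.6-27B runs 100% on WebGPU
Speed is not yet the best

@huggingface ↗ X AI Compute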

3 @huggingface: RT @neural_avb: I am working on porting SAM models and harness into Apple silicon. Already seeing 1.25x inference speed increase on mlx w…

2026-05-18T13:24

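neural_avb is porting SAM models and harness to Apple silicon and is already seeing a 1.25x inference speedup on MLX.

SAM models are being ported to Apple silicon
Inference on MLX is already 1.25x faster

@huggingface ↗ X AI Compute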

3 @huggingface: RT @ErikKaum: Releasing my first kernel on @huggingface: MaxSim Late-interaction retrieval (ColBERT / PyLate) bottlenecks on materializing…

2026-05-18T13:22

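ErikKaum released a first kernel on Hugging Face, MaxSim, which targets a bottleneck in late-interaction retrieval (ColBERT / PyLate).

ErikKaum released the MaxSim kernel on Hugging Face
MaxSim targets a bottleneck in ColBERT / PyLate late-interaction retrieval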
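
For readers unfamiliar with the operation, MaxSim in late-interaction retrieval is commonly defined as follows: for each query token embedding, take the best dot-product match among the document token embeddings, then sum those maxima. The sketch below is a naive pure-Python reference of that definition, not ErikKaum's kernel; the function name and the toy vectors are hypothetical.

```python
def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: sum over query tokens of the maximum
    dot product against any document token."""
    total = 0.0
    for q in query_vecs:
        # Similarity of this query token to every document token.
        sims = [sum(qi * di for qi, di in zip(q, d)) for d in doc_vecs]
        total += max(sims)
    return total


# Toy example: 2 query tokens and 3 document tokens, each 2-dimensional.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]
score = maxsim(query, doc)
print(score)  # 1.5
```

This naive version builds every query-token by document-token similarity before reducing, so its cost grows with the product of the two token counts.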

@huggingface ↗ X AI Compute

3 @huggingface: RT @stingning: We’re releasing a 30B-A3B reasoning model that reaches gold-medal level across both physics and math Olympiad evaluations: I…

2026-05-15T17:02

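stingning announced the release of a 30B-A3B reasoning model that reaches gold-medal level on both physics and math Olympiad evaluations. HuggingFace retweeted the announcement.

A 30B-A3B reasoning model is being released
It reaches gold-medal level on physics and math Olympiad evaluations

@huggingface ↗ X AI Research Compute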

3 @huggingface: RT @stevibe: Unsloth just published MTP-enabled quantized GGUFs for Qwen3.6-35B-A3B. https://t.co/9iuepdo5AW

2026-05-12T14:53

Unsloth published MTP-enabled quantized GGUFs for Qwen3.6-35B-A3B.

Unsloth published MTP-enabled quantized GGUF files
They target the Qwen3.6-35B-A3B model

@huggingface ↗ X AI Compute Industry

3 @huggingface: RT @mervenoyann: Meta silently dropped Sapiens2 last week 🔥 a family of high-res models trained on 1B human images > for pose estimati...

2026-05-12T14:20

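Meta quietly released Sapiens2 last week: a family of high-resolution models trained on 1B human images, for tasks such as pose estimation.

Meta released the Sapiens2 model family
The models were trained on 1B human images
They are used for pose estimation and related tasks

@huggingface ↗ X AI Compute Research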

3 @huggingface: RT @sudoingX: update: qwen 3.6 27b dense q4 just one shotted octopus invaders game on a single 3090. hermes agent drove the whole thing, ~4…

2026-05-11T16:23

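In an update, sudoingX reports that Qwen 3.6 27B dense (q4) one-shotted an Octopus Invaders game on a single NVIDIA RTX 3090, that is, it produced the game in a single attempt, with the Hermes agent driving the whole process.

Qwen 3.6 27B dense (q4) ran on a single 3090
The model one-shotted an Octopus Invaders game

@huggingface ↗ X AI Compute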

3 @huggingface: RT @mervenoyann: Gemma 4 just got a massive speed-up with MTP drafters ⚡️ > speculative decoding (up to 3x tokens/sec improvement compa...

2026-05-05T21:45

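Gemma 4 gets a large speed-up from MTP drafters used for speculative decoding, with up to a 3x improvement in tokens/sec.

Gemma 4 uses MTP drafters for speculative decoding
Throughput improves by up to 3x in tokens/sec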
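
To show where a speed-up of this kind can come from, here is a minimal sketch of one step of greedy speculative decoding: a cheap drafter proposes several tokens, and the target model keeps the longest prefix it agrees with, plus one token of its own. This is a generic toy, not the Gemma 4 or llama.cpp implementation; `speculative_step`, `target_next`, and `draft_next` are hypothetical names, and real systems verify the whole proposal in one batched forward pass rather than token by token.

```python
def speculative_step(target_next, draft_next, prefix, k):
    """Run one greedy speculative-decoding step and return the new tokens."""
    # 1. The drafter proposes k tokens autoregressively.
    proposal, ctx = [], list(prefix)
    for _ in range(k):
        token = draft_next(ctx)
        proposal.append(token)
        ctx.append(token)

    # 2. The target checks the proposal and keeps the agreeing prefix.
    accepted, ctx = [], list(prefix)
    for token in proposal:
        expected = target_next(ctx)
        if expected != token:
            accepted.append(expected)  # the target's token replaces the miss
            return accepted
        accepted.append(token)
        ctx.append(token)

    # 3. Every draft token matched: the target adds one bonus token.
    accepted.append(target_next(ctx))
    return accepted


# Toy "models" over digit tokens: the target counts upward modulo 10,
# and the drafter agrees except right after seeing a 3.
def target_next(ctx):
    return (ctx[-1] + 1) % 10

def draft_next(ctx):
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10

new_tokens = speculative_step(target_next, draft_next, prefix=[0], k=4)
print(new_tokens)  # [1, 2, 3, 4]
```

The output is identical to what greedy decoding with the target alone would produce; the gain comes from accepting several tokens per target verification when the drafter guesses well.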

@huggingface ↗ X AI Compute