
@ClementDelangue: llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀 Qwen3.6-27B dense generation below on A10G: From 25 t...

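@ClementDelangue · Information level: 3 (scale: 1 noise/discard; 2 weak; 3 ordinary fact; 4 important industry development; 5 extremely significant event. The score measures information salience and is not investment advice.) Published: 2026-05-24T22:12 · Fetched: 2026-05-24 23:18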
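Summary

llama.cpp has added MTP (multi-token prediction) support, which noticeably speeds up local model inference. On an A10G, generation with the dense Qwen3.6-27B model went from 25 tok/s to 45 tok/s, a reported gain of 78%.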
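
As a quick sanity check on the reported figures, the relative gain can be recomputed from the two throughput numbers. The sketch below is illustrative only: the function name `speedup_percent` is hypothetical and does not come from the post or from llama.cpp. With the rounded values of 25 and 45 tok/s it yields 80%, so the quoted +78% presumably reflects unrounded measurements.

```python
def speedup_percent(before_tps: float, after_tps: float) -> float:
    """Relative throughput gain in percent (hypothetical helper, not from the post)."""
    if before_tps <= 0:
        raise ValueError("before_tps must be positive")
    return (after_tps - before_tps) / before_tps * 100.0


# Rounded figures quoted in the post: 25 tok/s before MTP, 45 tok/s after.
gain = speedup_percent(25.0, 45.0)
print(f"{gain:.1f}%")  # prints 80.0%
```
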

Objective facts
  • llama.cpp adds MTP support
  • Qwen3.6-27B generation on an A10G is reported to be 78% faster (25 tok/s to 45 tok/s)
llama.cpp Qwen3.6-27B A10G

Original text

llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀

Qwen3.6-27B dense generation below on A10G: From 25 tok/s to 45 tok/s (+78%)! https://t.co/rLjBVa3Yzh

likes: 122 | retweets: 12 | replies: 11 | views: 9051