llama.cpp 新增 MTP 支持,使本地模型运行速度足够日常使用。Qwen3.6-27B 模型在 A10 GPU 上实现快速文本生成。
RT @victormustar: llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀
Qwen3.6-27B dense generation (on A10…
likes: 429 | retweets: 51 | replies: 19 | views: 51513