llama.cpp release b9235 adds new inference-acceleration features; a benchmark of the Qwen3.6 27B model on an RTX 5090 shows a performance improvement.
RT @populartourist: llama.cpp release b9235 added some new toys for boosting inference.
Benchmarked Qwen3.6 27B on an RTX 5090 with llama.…
likes: 237 | retweets: 25 | replies: 17 | views: 18542