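Unsloth Studio has launched a new feature: support for automatic speculative decoding and multi-token prediction (MTP), which can speed up inference by up to 2x with no accuracy loss, with parameters optimized for Mac, GPU, and CPU.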
Unsloth Studio now has auto speculative decoding & MTP support for GGUFs! Get up to 2x faster inference with no accuracy loss!
We ran many experiments from small models to MoEs, and optimized the params for Mac, GPUs & CPUs.
There's also a new toggle to choose between MTP, n-gram, or auto!
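
For readers wondering why speculative decoding can be faster with no accuracy loss, here is a minimal toy sketch of the n-gram variant under greedy decoding. It is not Unsloth Studio's or any library's implementation. `toy_target`, `ngram_draft`, and the parameters `n` and `k` are hypothetical names made up for illustration, and the "model" is just a deterministic function. The idea: a cheap drafter guesses several tokens by copying what followed an earlier occurrence of the current n-gram, and the target model checks each guess. Every emitted token is the target's own choice, so the output matches plain greedy decoding.

```python
from typing import Callable, List, Tuple

Token = int
NextFn = Callable[[List[Token]], Token]  # stand-in for the target model's greedy step


def toy_target(ctx: List[Token]) -> Token:
    """Hypothetical deterministic 'model': next token depends on the last two tokens."""
    a = ctx[-1] if ctx else 0
    b = ctx[-2] if len(ctx) > 1 else 0
    return (a * 3 + b + 1) % 5


def ngram_draft(ctx: List[Token], n: int = 2, k: int = 4) -> List[Token]:
    """Propose up to k tokens by copying what followed the most recent
    earlier occurrence of the last n tokens. Returns [] if there is no match."""
    if len(ctx) < n:
        return []
    key = ctx[-n:]
    for start in range(len(ctx) - n - 1, -1, -1):
        if ctx[start:start + n] == key:
            return ctx[start + n:start + n + k]
    return []


def greedy_decode(target: NextFn, prompt: List[Token], max_new: int) -> List[Token]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


def speculative_decode(
    target: NextFn, prompt: List[Token], max_new: int
) -> Tuple[List[Token], int]:
    """Draft with n-grams, verify with the target. Returns (tokens, accepted drafts)."""
    out = list(prompt)
    limit = len(prompt) + max_new
    accepted = 0
    while len(out) < limit:
        draft = ngram_draft(out)
        # A real engine would score all draft positions in one batched forward
        # pass; this toy calls the target sequentially, so it shows the
        # correctness argument, not the speedup itself.
        for tok in draft:
            if len(out) >= limit:
                break
            expected = target(out)
            out.append(expected)  # always keep the target's token
            if tok != expected:
                break  # first mismatch: discard the rest of the draft
            accepted += 1
        else:
            # Draft was empty or fully accepted: take one more target token.
            if len(out) < limit:
                out.append(target(out))
    return out, accepted


if __name__ == "__main__":
    prompt = [1, 2]
    baseline = greedy_decode(toy_target, prompt, 40)
    spec, accepted = speculative_decode(toy_target, prompt, 40)
    print("identical output:", spec == baseline)
    print("draft tokens accepted:", accepted)
```

In a real engine the gain comes from verifying several accepted draft tokens in a single forward pass of the large model. How much that helps depends on how often the drafts are right, which is presumably why the post describes tuning parameters per hardware type.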