NVIDIA发布Nemotron-Labs-Diffusion系列扩散语言模型,参数规模3B至14B,支持并行生成多个token并在生成过程中进行修订,提升推理速度,还包括视觉语言变体。
Most language models only generate one token at a time.
We just released Nemotron-Labs-Diffusion, a family of diffusion language models that take a different approach, generating multiple tokens in parallel within a single model. Rather than committing to each token permanently, these models can revise as they go, resulting in faster inference that better utilizes modern GPUs.
The full model family ranges from 3B to 14B, including vision-language variants. Available now: https://t.co/L1Tp2aQDLJ
likes: 918 | retweets: 142 | replies: 28 | views: 58909