Nvidia announced the Nemotron 3 Super and Ultra models, at 120B and roughly 500B parameters respectively. Both were pretrained in NVFP4, and Super was trained on 25T tokens.
RT @ctnzr: We've gone even farther:
Nemotron 3 Super is 120B and pretrained on 25T tokens in NVFP4.
Nemotron 3 Ultra is ~500B and also pret…
likes: 451 | retweets: 48 | replies: 16 | views: 56414