NVIDIA官方推特表示Perplexity运行在NVIDIA上,并介绍了团队使用CUTLASS Python堆栈优化推理模型的细节。
Perplexity runs on NVIDIA.
Nice breakdown from the team on how they’re using the CUTLASS Python stack to optimize their models for inference 👇
likes: 231 | retweets: 21 | replies: 7 | views: 30158