Sourcing — Feed

3 @teortaxesTex: Huawei has finally credibly (?) pretrained a big LLM on Ascends. "hyper-node optimized training" suggests 950s I guess. Builds on DSA ("with...

2026-06-12T14:53

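Huawei has pretrained a large language model on Ascend chips, using hyper-node optimized training and DSA, with the aim of demonstrating its hardware capability.

Huawei pretrained a large language model on Ascend chips
The run uses hyper-node optimized training and DSA
Huawei aims to show that its hardware can handle large-model training

Teortaxes ↗ X AI Semiconductors Compute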

3 @teortaxesTex: Been a while since we've had a paper on provers. This "Defense-in-Depth Verifier" is actually a clever trick. Most of the paper is dedicated...

2026-06-12T08:45

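A paper on a "Defense-in-Depth Verifier," devoted mainly to defeating reward hacking; it is a piece of work on RL environments.

The paper proposes the Defense-in-Depth Verifier method
Its main goal is to defeat reward hacking
It concerns verifier design in RL environments

Teortaxes ↗ X AI Research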

3 @teortaxesTex: > It doesn’t matter how powerful China’s supernodes are if it can only make ~100 of them 100 SuperPODs is 820K Ascend 950DTs. That's act...

2026-06-12T08:35

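A Twitter user comments on China's capacity to build supernodes, arguing that even if it can make only about 100 SuperPODs (which would take 820K Ascend 950DTs), more than 1 million Hopper-class chips will still be produced this year.

100 SuperPODs require 820K Ascend 950DTs
More than 1 million Hopper-class chips are expected to be made this year

Teortaxes ↗ X Industry AI Semiconductors Compute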
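The per-pod chip count implied by the SuperPOD entry above can be checked in a few lines. This is a rough sketch that uses only the two numbers quoted in the tweet (100 SuperPODs, 820K Ascend 950DTs); the function name is illustrative.

```python
def chips_per_pod(total_chips: int, pods: int) -> float:
    """Average number of accelerators per pod."""
    return total_chips / pods


# Figures quoted in the tweet: 100 SuperPODs, 820K Ascend 950DTs.
per_pod = chips_per_pod(820_000, 100)
print(f"about {per_pod:,.0f} chips per SuperPOD")
```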

3 @teortaxesTex: They've recreated DeepSeek-Optical Contexts Compression from first principles… to be clear, these "tokens" are, rather, KV slots. In disk sp...

2026-06-12T08:21

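The tweet says someone has recreated DeepSeek's optical context compression from first principles, and notes that its "tokens" are really KV slots that still take more disk space than either the text or the rendered image, so the technique works because most cache designs are bloated.

DeepSeek's optical context compression has been recreated from first principles
The technique's "tokens" are actually KV slots
They still take more disk space than the text or the rendered image; the gain comes from bloated cache designs

Teortaxes ↗ X AI Compute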

3 @teortaxesTex: RT @ibab: We are releasing River API, our first product, in early access. The API gives you access to the same battle-tested tools that we’…

2026-06-10T22:12

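The team announced that its first product, River API, is entering early access; the API provides access to battle-tested tools.

River API released in early access

Teortaxes ↗ X Industry Updates AI Cloud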

3 @teortaxesTex: 1024 "NotSuperPod" actually makes a lot of sense. 1EFLOPS FP8, 20 cabinets, 125 m^2. ≈3x of GB300 NVL72. It's a decent mini-cluster, a manag...

2026-06-10T19:46

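Twitter user @teortaxesTex discusses a configuration dubbed "NotSuperPod": 1 EFLOPS of FP8 compute, 20 cabinets, and a 125 m² footprint, roughly 3x a GB300 NVL72. The post also notes that it resembles a configuration DeepSeek listed earlier.

The NotSuperPod configuration is 1 EFLOPS FP8, 20 cabinets, 125 m²
Its scale is roughly 3x that of a GB300 NVL72
The configuration resembles one DeepSeek listed earlier

Teortaxes ↗ X AI Compute Industry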
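For a sense of scale, the NotSuperPod figures above can be broken down per cabinet. The sketch below uses only the numbers quoted in the tweet; the implied GB300 NVL72 figure is simply the tweet's "≈3x" ratio inverted, not a vendor specification, and the variable names are illustrative.

```python
PFLOPS = 10**15

pod_fp8_flops = 1000 * PFLOPS  # 1 EFLOPS FP8, as quoted
cabinets = 20
floor_area_m2 = 125
ratio_vs_nvl72 = 3             # "≈3x of GB300 NVL72", as quoted

pflops_per_cabinet = pod_fp8_flops / cabinets / PFLOPS
area_per_cabinet_m2 = floor_area_m2 / cabinets
# Back-derived from the tweet's ratio only (an approximation, not a spec).
implied_nvl72_pflops = pod_fp8_flops / ratio_vs_nvl72 / PFLOPS

print(f"{pflops_per_cabinet:.0f} PFLOPS FP8 per cabinet")
print(f"{area_per_cabinet_m2:.2f} m^2 per cabinet")
print(f"implied GB300 NVL72: about {implied_nvl72_pflops:.0f} PFLOPS FP8")
```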

3 @teortaxesTex: > this is the first time DeepSeek has fully shown its hand on owning compute infrastructure rather than just renting it. … DeepSeek is li...

2026-06-10T17:24

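DeepSeek has shown publicly for the first time that it owns compute infrastructure rather than renting it, and has published a paper on its data center. This marks a strategic shift in DeepSeek's compute buildout and sets it apart from other AGI labs.

DeepSeek has shown for the first time that it owns compute infrastructure rather than renting it
DeepSeek has published a paper on its data center

Teortaxes ↗ X AI Datacenters Compute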

2 @teortaxesTex: RT @ibab: We’re launching River AI, a new AI company with the mission to build AI systems that are owned and shaped by you. I’m extremely e…

2026-06-10T16:51

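A new AI company named River AI has announced its launch, with a mission to build AI systems that are owned and shaped by their users. The announcement was posted on Twitter by @ibab and drew some attention.

River AI has formally announced its launch
The company's mission is to build AI systems owned and shaped by users

Teortaxes ↗ X Industry AI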

3 @teortaxesTex: 8 months, Q1 2027 the bigger problem is that Mythos itself is a last generation pretrain, and by then Anthropic will likely have completed s...

2026-06-10T13:52

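The tweet argues that Anthropic's Mythos is a last-generation pretrain and that by Q1 2027 Anthropic will likely have completed a more powerful model, with the goal of extending its lead to 24 months.

Mythos is a last-generation pretrain
Anthropic aims for a 24-month lead
Anthropic will likely have completed a more powerful model by then

Teortaxes ↗ X Industry AI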

3 @teortaxesTex: This is pretty crazy ("Project status" under the abstract is also an insane detail). Further shrinking of V4 cache footprint to… 360 MB per ...

2026-06-10T03:29

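Twitter user @teortaxesTex notes that the cache footprint of a V4 model has been shrunk further to 360 MB per 1M tokens of context, i.e. 360 bytes per token, which is within two orders of magnitude of the raw plaintext limit.

V4 cache footprint shrinks to 360 MB per 1M tokens of context
That is 360 bytes of cache per token

Teortaxes ↗ X AI Compute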
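The per-token figure in the V4 cache entry above follows directly from the quoted total. The sketch below redoes that division and compares it with plain text; the 4 bytes per token used for plain text is an assumed rule of thumb, not a number from the source, and decimal megabytes are assumed as well.

```python
MB = 10**6  # decimal megabyte (assumption)


def cache_bytes_per_token(cache_bytes: float, context_tokens: int) -> float:
    """KV-cache footprint per token of context."""
    return cache_bytes / context_tokens


per_token = cache_bytes_per_token(360 * MB, 1_000_000)

# Assumption, not from the source: plain text costs roughly 4 bytes per token.
ASSUMED_TEXT_BYTES_PER_TOKEN = 4
overhead = per_token / ASSUMED_TEXT_BYTES_PER_TOKEN

print(f"{per_token:.0f} bytes of cache per token")
print(f"about {overhead:.0f}x the assumed plaintext size")
```

Under that assumption the cache is a bit under 100x the size of the text it encodes, which is consistent with the "two orders of magnitude" remark.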

3 @teortaxesTex: My personal simple trial for t2i models: feed a real image to a good captioner (eg 3.5-Flash) to generate a prompt, then have it be executed...

2026-06-09T22:07

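The user tested the text-to-image capabilities of HiDream-O1-Image and GPT-Image-2 with prompts produced by an image captioner, and found that HiDream-O1-Image is heavily optimized for photorealism but still falls short of GPT-Image-2.

The user generates prompts from real images to test t2i models
HiDream-O1-Image is heavily optimized for photorealism
HiDream falls short of GPT-Image-2

Teortaxes ↗ X AI Industry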

3 @teortaxesTex: Did you notice anon? They've moved the schedule up, compared to the September-2025 roadmap. Ascend 950DT was supposed to go live in Q4 2026,...

2026-06-08T23:53

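According to Twitter user @teortaxesTex, Huawei has moved up the launch of the Ascend 950DT, which its September 2025 roadmap had scheduled for Q4 2026. The post, which has drawn attention, also touches on HBM supply.

Huawei's Ascend 950DT was originally scheduled to go live in Q4 2026
The launch has been moved up relative to the September 2025 roadmap

Teortaxes ↗ X Industry AI Semiconductors Updates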

3 @teortaxesTex: First confirmation that DeepSeek is targeting to build infrastructure from megawatt to *gigawatt* range. And again, as in other listings: it...

2026-06-08T22:55

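DeepSeek has confirmed for the first time that it aims to build infrastructure in the megawatt-to-gigawatt range, and has revealed that it is developing its own systems rather than buying Huawei's prebuilt 950 pods.

DeepSeek aims to build gigawatt-scale infrastructure
DeepSeek is building its own systems instead of buying Huawei's prebuilt pods

Teortaxes ↗ X Industry Updates AI Compute Datacenters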

3 @teortaxesTex: no, that era has *ended*, nobody is seriously benchmaxxing now. Originally this term meant training on actual test, at best on paraphrases, ...

2026-06-08T01:10

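A tweet on AI benchmarking trends argues that the benchmaxxing era has ended and that nobody seriously trains on test data any more; continuously updated benchmarks such as LCB have appeared; and since around early 2025, models no longer collapse on post-cutoff slices.

The benchmaxxing era has ended
Continuously updated benchmarks such as LCB have appeared
Models stopped collapsing on post-cutoff slices around early 2025

Teortaxes ↗ X AI Industry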

3 @teortaxesTex: I have no idea where "2M HBM stacks" comes from. Now CXMT should be around 50-60K WPM for HBM. You get, what, 2TB HBM3E per wafer? 75PB nee...

2026-06-06T02:56

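The author estimates HBM capacity against demand: CXMT's HBM capacity is about 50-60K WPM (wafers per month), each wafer yields roughly 2 TB of HBM3E, and each SuperCluster needs 75 PB, which works out to 37,500 wafers or about three weeks of output. On that basis, current capacity is not seen as a bottleneck.

CXMT's HBM capacity is about 50-60K WPM
Each wafer yields roughly 2 TB of HBM3E
Each SuperCluster needs 75 PB of HBM

Teortaxes ↗ X Industry Semiconductors Compute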
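The HBM estimate in the entry above can be reproduced as a short back-of-the-envelope calculation. This is a sketch using only the tweet's own inputs (75 PB per SuperCluster, about 2 TB of HBM3E per wafer, 50-60K wafers per month); it assumes decimal units and that the whole monthly output goes to one SuperCluster, and the function names are illustrative.

```python
TB = 10**12
PB = 10**15
WEEKS_PER_MONTH = 52 / 12


def wafers_needed(demand_bytes: float, bytes_per_wafer: float) -> float:
    """Wafers required to supply a given amount of HBM."""
    return demand_bytes / bytes_per_wafer


def weeks_of_output(wafers: float, wafers_per_month: float) -> float:
    """Weeks of fab output needed, if all output goes to this demand."""
    return wafers / wafers_per_month * WEEKS_PER_MONTH


wafers = wafers_needed(75 * PB, 2 * TB)
print(f"{wafers:,.0f} wafers")
for wpm in (50_000, 60_000):
    print(f"at {wpm:,} WPM: {weeks_of_output(wafers, wpm):.1f} weeks")
```

The result lands between roughly 2.7 and 3.3 weeks, in line with the "about three weeks" quoted in the summary.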

3 @teortaxesTex: In 1995 (when that Armitage III OVA was released), median age in Japan was 39 years, and GDP per capita was $40K, the third after Switzerlan...

2026-06-06T02:25

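The tweet compares median age and GDP per capita in Japan and China: in 1995 Japan's median age was 39 and its GDP per capita was $40K, ranking third; today Japan's GDP per capita has fallen to $35.7K, ranking 39th. In 2023 China's median age reached 39, with GDP per capita of $13K.

In 1995 Japan's median age was 39 and its GDP per capita was $40K
Japan's GDP per capita is now $35.7K, ranking 39th worldwide
In 2023 China's median age was 39 and its GDP per capita was $13K

Teortaxes ↗ X Macro Strategy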

3 @teortaxesTex: But seriously. Does anyone believe this? Eric Xu at Huawei Connect 2025 (Sep 18) announced Ascend SuperCluster by Q4'2026. It's a system occ...

2026-06-05T12:58

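At Huawei Connect 2025, Huawei announced its Ascend SuperCluster plan, targeted for Q4 2026. The system occupies 64,000 m² and has 524,000 NPUs with 75.5 PB of HiZQ 2.0 HBM; its total bill of materials is more than 5x that of all CloudMatrix 384 systems sold to date.

Huawei announced the Ascend SuperCluster plan, targeting Q4 2026
The system has 524K NPUs and 75.5 PB of HBM on 64K m²
Its BOM is more than 5x that of all CloudMatrix 384 systems sold

Teortaxes ↗ X AI Semiconductors Compute Datacenters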
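The SuperCluster figures above imply a per-NPU memory size and a floor density, which can be derived as a sanity check. The sketch uses only the three quoted numbers (524K NPUs, 75.5 PB of HBM, 64,000 m²) and assumes decimal units; the function name is illustrative.

```python
PB = 10**15
GB = 10**9


def per_unit(total: float, units: int) -> float:
    """Evenly divide a cluster-wide total across its units."""
    return total / units


npus = 524_000
hbm_per_npu_gb = per_unit(75.5 * PB, npus) / GB
npus_per_m2 = npus / 64_000

print(f"about {hbm_per_npu_gb:.0f} GB of HBM per NPU")
print(f"about {npus_per_m2:.1f} NPUs per square metre of floor space")
```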

3 @teortaxesTex: What's the largest WSE-3-based cluster? I see there were plans for max 2048 "systems" (ie wafers). That's 47 MW, plus change. Huawei 950 Sup...

2026-06-05T12:17

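The post discusses the largest Cerebras WSE-3 cluster, planned at up to 2048 systems (47 MW), and notes that a Huawei 950 SuperCluster may exceed 500 MW, a 960 cluster in 2027 may exceed 1 GW, and Huawei plans a 30 kW chip by 2030.

The largest planned Cerebras WSE-3 cluster is 2048 systems, drawing 47 MW
A Huawei 950 SuperCluster may draw more than 500 MW
Huawei plans a 30 kW chip by 2030

Teortaxes ↗ X Industry Updates Compute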
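The power comparison in the entry above can be made concrete by dividing it out. This is a rough sketch using only the quoted figures (2048 WSE-3 systems at 47 MW, a Huawei 950 SuperCluster above 500 MW, a 30 kW chip planned for 2030); the 500 MW value is treated as a lower bound, and the variable names are illustrative.

```python
MW = 10**6
KW = 10**3

cerebras_cluster_w = 47 * MW
cerebras_systems = 2048
huawei_950_cluster_w = 500 * MW  # quoted as "more than 500 MW": a lower bound
planned_chip_w = 30 * KW         # the 30 kW chip Huawei plans for 2030

kw_per_wse3_system = cerebras_cluster_w / cerebras_systems / KW
cluster_ratio = huawei_950_cluster_w / cerebras_cluster_w
chip_vs_system = planned_chip_w / (kw_per_wse3_system * KW)

print(f"about {kw_per_wse3_system:.1f} kW per WSE-3 system")
print(f"Huawei 950 SuperCluster: at least {cluster_ratio:.1f}x the power")
print(f"planned 30 kW chip: {chip_vs_system:.2f}x one WSE-3 system")
```

On these numbers a single planned 30 kW chip would draw somewhat more than one whole WSE-3 system does today.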

3 @teortaxesTex: ok what the hell. I missed this completely So, Huawei expects to deploy LogicFolding Ascends in 2030/31, and have >400 Mtr/mm^2, *and* ha...

2026-06-05T10:53

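Huawei plans to deploy LogicFolding Ascend chips in 2030/31, with density above 400 Mtr/mm² and 30 kW of power per chip, possibly using a wafer-scale engine design.

Huawei plans to deploy LogicFolding Ascend chips in 2030/31
Chip density exceeds 400 Mtr/mm²
Each chip draws 30 kW and may use a wafer-scale engine design

Teortaxes ↗ X AI Semiconductors Industry Compute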

3 @teortaxesTex: Another chance for China to make it into the club of nations with truly reusable rockets (population: the US). BF-20 is pretty mature now. A...

2026-06-05T01:12

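One view holds that China's BF-20 engine is now mature and may be used on the ZQ-3 rocket, giving China a chance to become the second country with truly reusable rockets.

The BF-20 engine is mature
The ZQ-3 rocket may use the BF-20 engine

Teortaxes ↗ X Industry Updates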