
@ClementDelangue: The future of biology shouldn’t stay behind black-box APIs. Especially when it touches personal health. Whether you’re @bryan_johnson measu...

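@ClementDelangue | Information level: 3 (scale: 1 = noise/discard; 2 = weak; 3 = ordinary fact; 4 = important industry development; 5 = extremely significant event. The score rates information significance and is not investment advice.) | Published: 2026-05-20T12:10 | Fetched: 2026-05-20 23:20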
Summary

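Hugging Face has released Carbon, an open-source DNA foundation model with open weights, training code, and data pipeline. According to the announcement, it is 275x faster than the next best model at its size, can run locally on a laptop, and uses a DNA-native tokenizer to split sequences.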

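Objective facts
  • Hugging Face released Carbon, an open-source DNA foundation model
  • Carbon is stated to be 275x faster than the next best model at its size
  • The model's tokenizer splits sequences into 6-base chunks for efficiency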
Hugging Face Carbon
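
To make the 6-base chunking concrete, here is a minimal sketch in Python. It is not Carbon's actual tokenizer: the function names `chunk_tokenize` and `detokenize` are hypothetical, and the post does not describe how single-base resolution is preserved during training and inference. The sketch only illustrates that splitting a sequence into non-overlapping 6-base chunks cuts the token count roughly sixfold while remaining lossless, so every base can still be recovered.

```python
from typing import List

K = 6  # chunk size in bases, taken from the post; everything else is assumed


def chunk_tokenize(seq: str, k: int = K) -> List[str]:
    """Split a DNA string into non-overlapping k-base chunks.

    Hypothetical helper for illustration only. The final chunk may be
    shorter than k when len(seq) is not a multiple of k.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq), k)]


def detokenize(tokens: List[str]) -> str:
    """Concatenate chunks back into the original sequence."""
    return "".join(tokens)


sequence = "ACGTACGTACGTAC"          # 14 bases
tokens = chunk_tokenize(sequence)    # 3 tokens instead of 14
print(tokens)                        # ['ACGTAC', 'GTACGT', 'AC']
print(detokenize(tokens) == sequence)  # True: no base is lost
```

Because the chunks concatenate back to the original string, base `j` of the sequence is always character `j % 6` of token `j // 6`. A real model would still need its own mechanism to predict or score individual bases, which this sketch does not attempt.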

Original text

The future of biology shouldn’t stay behind black-box APIs. Especially when it touches personal health.

Whether you’re @bryan_johnson measuring every biomarker, or @sytses openly sharing and analyzing his own immune-genetics data, you need open, local, transparent AI.

@huggingface wasn’t created to be a biology company. It’s not the most obvious focus for us. But it feels too important not to do something.

That’s why we built and released Carbon 🧬: a frontier DNA base model with open weights, training code and data pipeline, designed to be fine-tuned or continually pretrained for downstream biological tasks.

Carbon is 275x faster than the next best model at its size. Fast enough to run locally on your laptop. Powerful enough to process a whole human genome on a single GPU in less than 2 days.

The technical unlock: a DNA-native tokenizer that splits sequences into 6-base chunks for efficiency, while preserving single-base resolution during training and inference. More people able to inspect, run, fine-tune, improve and build on top of the models shaping biology.

Open weights: https://t.co/vgEklL5q4q
Dataset: https://t.co/R960HgOvSP
Demo: https://t.co/tnujkPeaNb

Let's go open AI biology!

likes: 210 | retweets: 43 | replies: 17 | views: 17061