CANN Recipes
Inference Recipes
A collection of practical recipes for deploying and accelerating LLM and multimodal inference.
Repo: https://gitcode.com/cann/cann-recipes-infer
Featured Recipes
| Recipe | Level | Description | Link |
| --- | --- | --- | --- |
| GPT-OSS | Beginner | Inference for the 20B model on a single card and the 120B model on 8 cards | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/gpt-oss/README.md |
| LongCat-Flash | Intermediate | Low-latency inference with core control and weight prefetching | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/longcat-flash/README.md |
| HunyuanImage-3.0 | Intermediate | CFG / VAE parallelism plus operator fusion | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/hunyuan-image-3.0/README.md |
| Qwen3-MoE | Intermediate | Inference adaptation for Atlas A3 with TP / EP deployment | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/qwen3-moe/README.md |
| HunyuanVideo | Intermediate | xDiT inference with Ulysses / RingAttention / TeaCache | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/hunyuan-video/README.md |
| Wan2.2-I2V | Intermediate | Transformers inference adaptation | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/wan2.2-i2v/README.md |
| DeepSeek-R1 / Kimi-K2 | Intermediate | Low-latency and high-throughput deployment with DP / TP+SP / EP | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/deepseek-r1/README.md |
| Kimi-K2-Thinking | Advanced | 256K long-sequence inference with native W4A16 quantization | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/kimi-k2-thinking/README.md |
| DeepSeek-V3.2-Exp | Advanced | CP parallelism plus large-scale EP parallelism, with operator fusion and multi-stream optimization | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/deepseek-v3.2-exp/README.md |