By CANN Capability
Find the relevant practices by CANN capability.
Operator Fusion
| Card | Description | Link |
|---|---|---|
| GPT-OSS Fusion | Fused MoE/Attention operators | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/gpt-oss/gpt_oss_optimization.md |
| HunyuanImage-3.0 Fusion | MoE and parallelism optimization | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/hunyuan-image-3.0/hunyuan_image_3_optimization.md |
| Pi0 Fusion | Fused-operator adaptation for an embodied model | https://gitcode.com/cann/cann-recipes-embodied-intelligence/-/blob/master/manipulation/pi0/infer_with_torch/README.md |
| VGGT Fusion | Replacement with NPU fused operators | https://gitcode.com/cann/cann-recipes-spatial-intelligence/-/blob/master/models/vggt/vggt_optimization.md |
Graph Engine / Graph Mode
| Card | Description | Link |
|---|---|---|
| TorchAir GE (RL) | Graph-mode optimization for training | https://gitcode.com/cann/cann-recipes-train/-/blob/master/docs/llm_rl/deepseek_rl_train_optimization.md |
| Hunyuan3D Graph Mode | End-to-end graph-mode inference | https://gitcode.com/cann/cann-recipes-spatial-intelligence/-/blob/master/models/Hunyuan3D/Hunyuan3D_optimization.md |
| Pi0 Graph Mode | Graph mode for embodied inference | https://gitcode.com/cann/cann-recipes-embodied-intelligence/-/blob/master/manipulation/pi0/infer_with_torch/README.md |
Multi-Stream / Compute-Communication Overlap
| Card | Description | Link |
|---|---|---|
| DeepSeek RL Multi-Stream | MoE/MLA multi-stream parallelism | https://gitcode.com/cann/cann-recipes-train/-/blob/master/docs/llm_rl/deepseek_rl_train_optimization.md |
| LongCat-Flash Multi-Stream | Low-latency inference optimization | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/longcat-flash/README.md |
| HunyuanImage-3.0 Multi-Stream | MoE multi-stream parallelism | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/hunyuan-image-3.0/hunyuan_image_3_optimization.md |
Parallelism (TP/EP/CP/DP)
| Card | Description | Link |
|---|---|---|
| Kimi-K2-Thinking | Parallelism strategy details | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/kimi-k2-thinking/kimi_k2_thinking_inference_guide.md |
| DeepSeek-V3.2-Exp | CP/EP parallelism strategy | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/models/deepseek-v3.2-exp/README.md |
| Qwen3 Long Seq RL | Training parallelism and scheduling | https://gitcode.com/cann/cann-recipes-train/-/blob/master/docs/llm_rl/qwen3_235B_32k_longseq_rl_train_optimization.md |
Scheduling / Load Balancing
| Card | Description | Link |
|---|---|---|
| RL Rollout Rebalance | Load balancing in the inference stage | https://gitcode.com/cann/cann-recipes-train/-/blob/master/docs/features/rollout_rebalance.md |
| 3DGS Load Balance | AscendC load balancing | https://gitcode.com/cann/cann-recipes-spatial-intelligence/-/blob/master/docs/algorithms/gaussian_splatting/gaussian_splatting_load_balance_optimization.md |
Memory Layout / KV Cache
| Card | Description | Link |
|---|---|---|
| KV Cache NZ | KV Cache layout optimization for inference | https://gitcode.com/cann/cann-recipes-train/-/blob/master/docs/llm_rl/deepseek_rl_train_optimization.md |
Custom Operator Development
| Card | Description | Link |
|---|---|---|
| AscendC Dev Guide | Custom operator development and graph integration | https://gitcode.com/cann/cann-recipes-harmony-infer/-/blob/master/docs/ascendc_develop_guide.md |
| TileLang Operator | Operator development practice with a DSL | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/docs/models/deepseek-v3.2-exp/deepseek_v3.2_exp_tilelang_operator_guide.md |
| AscendC Operator | Fused-operator optimization practice | https://gitcode.com/cann/cann-recipes-infer/-/blob/master/docs/models/deepseek-v3.2-exp/deepseek_v3.2_exp_ascendc_operator_guide.md |