Kimi K2
Moonshot AI's Kimi K2: 61 decoder layers, Multi-head Latent Attention (MLA; 64 heads, TP=8 → 8 heads per rank), and sparse MoE feed-forward layers (384 experts, top-8 DeepSeek-style routing, EP=8 → 48 local experts per rank). Servable on 8×B200 (4×B200×2) at FP8.
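The expert-parallel figures above follow directly from the routing arithmetic: with 384 experts sharded contiguously across EP=8 ranks, each rank holds 384 / 8 = 48 experts, and every token activates its top-8 experts by router score. A minimal NumPy sketch of this top-k routing under those assumptions (names and shapes are illustrative, not the actual Kimi K2 implementation):

```python
import numpy as np

NUM_EXPERTS = 384                         # total routed experts
TOP_K = 8                                 # experts activated per token
EP_RANKS = 8                              # expert-parallel degree
LOCAL_EXPERTS = NUM_EXPERTS // EP_RANKS   # 48 experts resident per rank

def route(router_logits: np.ndarray):
    """Pick top-k experts per token and renormalize their gate weights.

    router_logits: (num_tokens, NUM_EXPERTS)
    returns (indices, weights), each of shape (num_tokens, TOP_K)
    """
    # Softmax over the expert dimension.
    probs = np.exp(router_logits - router_logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    # Top-k expert indices per token.
    idx = np.argsort(probs, axis=-1)[:, -TOP_K:]
    w = np.take_along_axis(probs, idx, axis=-1)
    # Renormalize over the selected experts only.
    w /= w.sum(axis=-1, keepdims=True)
    return idx, w

def owner_rank(expert_idx: np.ndarray) -> np.ndarray:
    """Contiguous sharding: expert e lives on EP rank e // LOCAL_EXPERTS."""
    return expert_idx // LOCAL_EXPERTS

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, NUM_EXPERTS))  # 4 hypothetical tokens
idx, w = route(logits)
ranks = owner_rank(idx)                     # which EP rank serves each pick
```

At dispatch time, `ranks` determines the all-to-all pattern: each token's hidden state is sent to the (at most 8) ranks that own its selected experts.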
Architecture Overview
Architecture Summary
Total Modules: 22
Blocks: 6
Kernels: 16
Traced Kernels: 9
