Kimi K2

Moonshot AI's Kimi K2: 61 decoder layers with MLA attention (64 heads; with TP=8, 8 heads per rank) and a sparse MoE (384 experts, top-8 DeepSeek-style routing; with EP=8, 48 local experts per rank). Servable at FP8 on 8×B200 (4×B200×2).
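The shard sizes above follow directly from the parallelism degrees: 64 heads / TP=8 gives 8 heads per rank, and 384 experts / EP=8 gives 48 local experts. The sketch below, using hypothetical names and a generic softmax-then-top-k gate (not Moonshot's exact router), shows the arithmetic and the top-8 expert selection:

```python
import numpy as np

# Parameters from the Kimi K2 summary above; names are illustrative.
NUM_HEADS, TP = 64, 8                # attention heads, tensor-parallel degree
NUM_EXPERTS, EP, TOP_K = 384, 8, 8   # MoE experts, expert-parallel degree, routed experts/token

heads_per_rank = NUM_HEADS // TP     # 64 / 8 = 8 local heads per rank
experts_per_rank = NUM_EXPERTS // EP # 384 / 8 = 48 local experts per rank

def top_k_route(logits: np.ndarray, k: int = TOP_K):
    """Select the top-k experts per token and renormalize their gate weights.
    A generic softmax-then-top-k sketch, not the production DeepSeek router."""
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)
    top = np.argsort(probs, axis=-1)[..., -k:]     # indices of the k largest gates
    gates = np.take_along_axis(probs, top, axis=-1)
    gates /= gates.sum(-1, keepdims=True)          # renormalize over the selected experts
    return top, gates

rng = np.random.default_rng(0)
expert_ids, gate_weights = top_k_route(rng.standard_normal((4, NUM_EXPERTS)))
print(heads_per_rank, experts_per_rank, expert_ids.shape)  # 8 48 (4, 8)
```

Each token's router output is a set of 8 expert indices plus normalized gate weights; under EP=8, tokens are then dispatched to whichever rank holds each selected expert.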

Architecture Overview


Architecture Summary

- Total Modules: 22
- Blocks: 6
- Kernels: 16
- Traced Kernels: 9