
kimi-k2.6 (Moonshot AI) vs Grok 4.20 (xAI)

Decision Recommendation

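⚖️ Trade-off decision: Choose kimi-k2.6 for coding and reasoning-heavy tasks, where it leads both benchmarks (58.6% vs 51.8% on swe-bench-pro, 90.5% vs 90% on gpqa-diamond) and costs about 1.8x less. Choose Grok 4.20 for real-time user experiences, where its higher speed (233 tps vs 193 tps) and larger 1M-token context window matter more.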

Model Specs

kimi-k2.6

Benchmarks & Scores

Coding (swe-bench-pro): Winner (+6.8 percentage points)

58.6%

Reasoning (gpqa-diamond): Winner (+0.5 percentage points)

90.5%

Cost & Performance

Cost (per 1M tokens): 1.8x cheaper

$1.71 (Input: $0.95 | Output: $4.00)

Speed

193 tps

Context Window

262.14k tokens

Model Specs

Grok 4.20

Benchmarks & Scores

Coding (swe-bench-pro)

51.8%

Reasoning (gpqa-diamond)

90%

Cost & Performance

Cost (per 1M tokens)

$3.00 (Input: $2.00 | Output: $6.00)

Speed: 1.2x faster

233 tps

Context Window: Larger

1M tokens
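
The page does not say how the headline cost per 1M tokens is derived from the input and output prices. Both figures happen to match a blend that weights input and output tokens 3:1, so the sketch below checks that reading and recomputes the "1.8x cheaper" and "1.2x faster" labels from the listed numbers. The 3:1 mix and the helper name blended_cost are assumptions made for illustration, not something the source states.

```python
def blended_cost(input_price: float, output_price: float,
                 input_share: float = 0.75) -> float:
    """Blend per-1M-token prices for an assumed input:output token mix.

    input_share=0.75 corresponds to a 3:1 input:output ratio, which is an
    assumption inferred from the listed figures, not a documented formula.
    """
    return input_share * input_price + (1.0 - input_share) * output_price


# Prices and speeds as listed on the comparison page.
kimi_cost = blended_cost(0.95, 4.00)
grok_cost = blended_cost(2.00, 6.00)
cost_ratio = grok_cost / kimi_cost   # how many times cheaper kimi-k2.6 is
speed_ratio = 233 / 193              # how many times faster Grok 4.20 is

print(f"kimi-k2.6 blended cost: ${kimi_cost:.2f}")
print(f"Grok 4.20 blended cost: ${grok_cost:.2f}")
print(f"cost ratio:  {cost_ratio:.1f}x")
print(f"speed ratio: {speed_ratio:.1f}x")
```

If a workload is output-heavy, lowering input_share shifts the comparison: at a 1:1 mix (input_share=0.5) the same function gives roughly $2.48 versus $4.00, a ratio closer to 1.6x.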
