Snowflake: GLM‑5.2 competitive with Opus 4.7 on coding tasks at much lower price

In detail

103 tasks, each run three times: GLM solved 66% vs Opus 67%
First‑attempt accuracy: Opus 53.7% vs GLM 47.6%
GLM averaged 99 runs per task and used 860M tokens; Opus averaged 80 runs and used 439M tokens
Pricing (from Zhipu and comparative sheets): GLM $1.40/M input, $4.40/M output; Opus $5/M input, $25/M output
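
To see how the token counts and list prices above interact, here is a rough Python sketch. The figures are the ones quoted above; the input/output split is not given, so `output_share` is a hypothetical parameter, and the sketch assumes the token totals are billed at plain list prices with no caching or discounts.

```python
# Prices in USD per million tokens, as quoted above.
PRICES = {
    "GLM": {"input": 1.40, "output": 4.40},
    "Opus": {"input": 5.00, "output": 25.00},
}
# Tokens used in the benchmark, in millions, as quoted above.
TOKENS_M = {"GLM": 860, "Opus": 439}


def blended_cost(model: str, output_share: float) -> float:
    """Estimated USD cost if `output_share` of the tokens are output tokens.

    `output_share` is an assumption: the real input/output split is not reported.
    """
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    price = PRICES[model]
    per_million = price["input"] * (1 - output_share) + price["output"] * output_share
    return TOKENS_M[model] * per_million


if __name__ == "__main__":
    for share in (0.0, 0.1, 0.5, 1.0):  # hypothetical splits, for illustration only
        glm = blended_cost("GLM", share)
        opus = blended_cost("Opus", share)
        print(f"output share {share:.0%}: GLM ${glm:,.0f} vs Opus ${opus:,.0f}")
```

If every token were billed as input, this gives about $1,204 for GLM and $2,195 for Opus; if every token were billed as output, about $3,784 and $10,975. The real bill sits somewhere between those bounds and depends on each model's actual split, which may differ between the two.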

Why it matters

Lower per‑token pricing from models like GLM‑5.2 can undercut the economics of coding use cases, but higher token consumption and lower first‑attempt accuracy eat into that saving and add latency.

For you

Benchmark alternative models on your actual dev tasks and include token usage and first‑attempt success in cost and SLA calculations before switching providers.
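
One way to fold token usage and first‑attempt success into a single comparison is sketched below in Python. The `Attempt` record and `summarize` function are hypothetical names, not part of any benchmark tool, and counting a task as solved when any attempt succeeds is an assumption that may differ from how the figures above were computed.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One attempt at one task from your own benchmark logs (hypothetical schema)."""
    task_id: str
    attempt: int  # 1 for the first try
    solved: bool
    input_tokens: int
    output_tokens: int


def summarize(attempts, price_in_per_m, price_out_per_m):
    """Return success rates and token cost for one model's attempts.

    Prices are in USD per million tokens. A task counts as solved if any
    attempt solved it (an assumption of this sketch).
    """
    tasks = {a.task_id for a in attempts}
    if not tasks:
        raise ValueError("no attempts to summarize")
    first_ok = {a.task_id for a in attempts if a.attempt == 1 and a.solved}
    solved = {a.task_id for a in attempts if a.solved}
    cost = sum(
        a.input_tokens / 1e6 * price_in_per_m
        + a.output_tokens / 1e6 * price_out_per_m
        for a in attempts
    )
    return {
        "first_attempt_rate": len(first_ok) / len(tasks),
        "solved_rate": len(solved) / len(tasks),
        "total_cost_usd": cost,
        "cost_per_solved_task_usd": cost / len(solved) if solved else float("inf"),
    }
```

Running `summarize` once per model on the same task set, with each provider's prices, gives a cost per solved task and a first‑attempt rate that can be compared directly and checked against latency targets.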

Sources

The Decoder