Mistral AI has released Leanstral 1.5, a free, open-source model under the Apache 2.0 license that specializes in formal verification with the Lean 4 programming language. The key distinction: the model not only masters mathematical proofs but also finds real bugs in production code, a concrete advance in AI-assisted software verification.
The essentials
- Leanstral 1.5 achieves 100 percent on the miniF2F benchmark for formal mathematics and solves 587 of 672 problems (about 87 percent) on the demanding PutnamBench
- When scanning 57 open-source repositories, the model found five previously unknown bugs, including an overflow bug in the Rust library varinteger
- The model is available at no cost, both as a download on Hugging Face and through an API
- Training combined mid-training, supervised fine-tuning, and reinforcement learning
Mathematics at olympiad level
The benchmark results show the model's reach. On miniF2F, which covers problems from school level to olympiad level, Leanstral 1.5 achieves a perfect score of 100 percent. On PutnamBench, a benchmark of 672 problems from the renowned Putnam Mathematical Competition, it solves 587 problems, or about 87 percent. On the more demanding algebra benchmarks FATE-H and FATE-X, which test master's-level and doctoral-level problems in areas such as group theory and ring theory, the model achieves 87 and 34 percent, respectively.
| Benchmark | Size or level | Leanstral 1.5 result | Scope |
|---|---|---|---|
| miniF2F | variable | 100 % | School to olympiad level |
| PutnamBench | 672 problems | 587 solved | Renowned math competition |
| FATE-H | Master's level | 87 % | Group theory, ring theory |
| FATE-X | Doctoral level | 34 % | Highest difficulty |
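For readers who have not seen Lean 4, the following sketch shows what "solving" a benchmark problem means in this setting: the statement is written as a theorem, and the proof counts only if Lean's checker accepts it. The statement below is a deliberately trivial illustration chosen for this article. It is not taken from miniF2F, PutnamBench, or the FATE benchmarks, and it says nothing about how Leanstral 1.5 writes its proofs.

```lean
-- Illustrative only: a toy statement, far easier than any benchmark problem.
-- The theorem name `sum_swap` is made up for this example.
theorem sum_swap (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b  -- term-style proof: apply the library lemma directly

-- The same statement proved with a tactic instead of a proof term.
example (a b : Nat) : a + b = b + a := by
  rw [Nat.add_comm]  -- rewrites the goal to `b + a = b + a`, closed by reflexivity
```

A benchmark problem has the same shape: a formal statement that the model must complete with a proof the checker accepts. That is why results can be reported as a plain count of solved problems.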
From math to real bugs
What sets the model apart: although it was trained primarily on mathematical verification, it also shows strong capabilities in practical code verification, according to Mistral. The claim rests on a concrete result. When scanning 57 open-source repositories, Leanstral 1.5 found five previously unknown bugs. Among them was an overflow bug in the Rust library varinteger, a potential security risk in production code.
In other words, the model can do more than formally verify mathematical proofs: it can also scan real software projects for errors, and in these five cases it found bugs that had gone unnoticed until then.
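The article does not describe the varinteger bug in detail, so the sketch below is not a reproduction of it. It only illustrates the general class of problem: a decoder for variable-length integers has to guard against inputs that carry more bits than the target type can hold. The function name `decode_varint` and the LEB128-style layout (7 payload bits per byte, high bit as continuation flag) are assumptions made for this example and are not varinteger's API.

```rust
/// Decode an unsigned varint (7 payload bits per byte, high bit = "more bytes follow").
/// Returns the value and the number of bytes consumed, or `None` if the input is
/// truncated or encodes more than 64 bits.
/// Hypothetical helper for illustration; not part of the varinteger crate.
pub fn decode_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in bytes.iter().enumerate() {
        let shift = 7 * i as u32;
        // A u64 has room for at most ten 7-bit groups; an eleventh byte
        // would require a shift of 70, which overflows the type.
        if shift >= 64 {
            return None;
        }
        let chunk = u64::from(byte & 0x7f);
        // In the tenth byte (shift 63) only the lowest payload bit still fits.
        if shift == 63 && chunk > 1 {
            return None;
        }
        value |= chunk << shift;
        // A cleared high bit marks the final byte of the encoding.
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    // Input ended before a terminating byte was seen.
    None
}
```

With this sketch, `decode_varint(&[0xac, 0x02])` returns `Some((300, 2))`, while an eleven-byte run of `0xff` is rejected with `None` rather than silently producing a wrong value. A formal verification tool aims to prove that such guards hold for every possible input, not just for the cases a test suite happens to cover.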
Training with three techniques
The performance builds on a combined training approach: mid-training, supervised fine-tuning, and reinforcement learning were used together to optimize the model for both mathematical precision and practical code analysis.
What this means for enterprises
For software developers and security teams, Leanstral 1.5 could become relevant for code review and bug detection. A free, open-source model that finds real bugs could become part of a local AI infrastructure, without depending on external APIs. Questions remain about how reliable the model is on larger, more complex codebases and how it performs in compliance scenarios. Even so, the release shows that formal verification with AI is no longer a future prospect but practically deployable today.
Sources
Editorially owned by Ideal Syka. Sources and method: Newsroom & method. Tips and corrections: ai@i6eal.de.




