v0.23.0 OS-model smoke leaderboard

Auto-generated by ailang eval-publish v0.23.0.

Per-benchmark pass rate

| Benchmark | opencode-gemma4-26b |
|---|---|
| adt_option | 0% (n=3) |
| balanced_parens | 0% (n=3) |
| binary_tree_sum | 0% (n=3) |
| canonical_convergence | 0% (n=3) |
| canonical_normalization | 0% (n=3) |
| dense_operator_program | 0% (n=3) |
| explicit_state_threading | 0% (n=3) |
| fizzbuzz | 0% (n=3) |
| gcd_lcm | 0% (n=3) |
| immutable_data_structures | 0% (n=3) |
| inline_tests | 0% (n=3) |
| nested_records | 0% (n=3) |
| numeric_modulo | 0% (n=3) |
| record_update | 0% (n=3) |
| records_book | 0% (n=3) |
| recursion_fibonacci | 0% (n=3) |
| type_safe_record_access | 0% (n=3) |

Generated from N-trial rotation data (here n=3 trials per benchmark) via the local-ollama eval rig.
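
For readers who want to reproduce the cell format above, the following is a minimal Python sketch of how a `P% (n=K)` cell could be derived from per-trial pass/fail outcomes. It is not the `ailang eval-publish` implementation. The function names `pass_rate_cell` and `render_table`, and the input shape (benchmark name mapped to a list of booleans), are assumptions made for illustration.

```python
# Hypothetical sketch: names and input shape are assumptions,
# not the actual ailang eval-publish code.

def pass_rate_cell(outcomes):
    """Format a list of per-trial booleans as 'P% (n=K)'."""
    n = len(outcomes)
    if n == 0:
        # No trials recorded, so a percentage would be meaningless.
        return "n/a (n=0)"
    passed = sum(1 for ok in outcomes if ok)
    return f"{round(100 * passed / n)}% (n={n})"


def render_table(model, results):
    """Render a one-model Markdown table, benchmarks sorted by name."""
    lines = [f"| Benchmark | {model} |", "|---|---|"]
    for name in sorted(results):
        lines.append(f"| {name} | {pass_rate_cell(results[name])} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Three failed trials, matching the fizzbuzz row in the table above.
    print(render_table("opencode-gemma4-26b", {"fizzbuzz": [False, False, False]}))
```

With three failed trials, the sketch yields the same `0% (n=3)` cell shown in the table. With only three trials per benchmark, each cell can take just four values (0%, 33%, 67%, 100%), so small differences between runs should be read cautiously.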