v0.23.0 OS-model smoke leaderboard

Auto-generated by ailang eval-publish v0.23.0.

Per-benchmark pass rate

| Benchmark | opencode-gemma4-26b |
| --- | --- |
| adt_option | 0% (n=3) |
| balanced_parens | 0% (n=3) |
| binary_tree_sum | 0% (n=3) |
| canonical_convergence | 0% (n=3) |
| canonical_normalization | 0% (n=3) |
| dense_operator_program | 0% (n=3) |
| explicit_state_threading | 0% (n=3) |
| fizzbuzz | 0% (n=3) |
| gcd_lcm | 0% (n=3) |
| immutable_data_structures | 0% (n=3) |
| inline_tests | 0% (n=3) |
| nested_records | 0% (n=3) |
| numeric_modulo | 0% (n=3) |
| record_update | 0% (n=3) |
| records_book | 0% (n=3) |
| recursion_fibonacci | 0% (n=3) |
| type_safe_record_access | 0% (n=3) |

Generated from N-trial rotation data via the local-ollama eval rig.
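
For readers who want to reproduce the arithmetic, each cell is presumably the number of passing trials divided by the number of trials n for that benchmark, shown as a percentage. The sketch below is a minimal illustration under that assumption. The record shape (benchmark name plus a pass flag) and the names `pass_rates` and `format_cell` are hypothetical; they are not taken from ailang eval-publish or from the eval rig.

```python
from collections import defaultdict


def pass_rates(trials):
    """Aggregate (benchmark, passed) pairs into {benchmark: (percent, n)}.

    The input shape is an assumption made for illustration only.
    """
    counts = defaultdict(lambda: [0, 0])  # benchmark -> [passes, total trials]
    for benchmark, passed in trials:
        if passed:
            counts[benchmark][0] += 1
        counts[benchmark][1] += 1
    return {b: (100.0 * p / n, n) for b, (p, n) in counts.items()}


def format_cell(rate, n):
    """Render a cell in the same style as the table, e.g. '0% (n=3)'."""
    return f"{rate:.0f}% (n={n})"


# Hypothetical trial records: three failing trials for each of two benchmarks.
trials = [("fizzbuzz", False)] * 3 + [("gcd_lcm", False)] * 3
rates = pass_rates(trials)
for name in sorted(rates):
    print(name, format_cell(*rates[name]))
```

With n=3, a cell can only take the values 0%, 33%, 67%, or 100%, so a smoke run at this size gives a coarse signal rather than a precise pass rate.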