# AILANG Vision
A symbolic reasoning kernel for neural program synthesis.
AILANG sits between probabilistic program synthesis (LLMs) and deterministic execution environments. It constrains, verifies, and traces programs synthesized by neural systems.
```
        LLM (Neural Synthesis)
                │
                │ Symbolic Program
                ▼
   ┌──────────────────────────┐
   │      AILANG KERNEL       │
   │                          │
   │  • Effect relations      │
   │  • Contract verification │
   │  • Budget enforcement    │
   │  • Trace generation      │
   │  • Authority bounds      │
   └──────────────────────────┘
                │
                │ Constrained Execution
                ▼
        OS / Runtime / Cloud
```
## The Five Pillars
AILANG's design rests on five pillars that together form a coherent system for trustworthy AI execution:
| Pillar | What It Means | Status |
|---|---|---|
| 1. Symbolic/Concrete Separation | Programs represent symbolic intent; concrete values arise at execution within constraints | Core AST, elaboration, deterministic eval |
| 2. Effects as Relations | Effects are (Intent × EffContext) ⇒ Outcome: constraints, not opaque functions | Capabilities, budgets (`@limit=N`), policies |
| 3. Contracts as Specs | requires/ensures are symbolic predicates for AI-facing specifications | Runtime enforcement, panic/warn modes |
| 4. Traces as Artifacts | Execution produces structured, replayable, auditable traces | Agent traces (ailang chains) + program traces (--emit-trace) |
| 5. Explicit Authority | All authority must be declared, bounded, and enforced — no ambient capabilities | Deny-by-default --caps, budget limits, structured failures |
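As an illustration of Pillar 2, the relation (Intent × EffContext) ⇒ Outcome can be sketched in a few lines of Python. This is an analogy, not AILANG's actual kernel code; every name below (`EffContext`, `Outcome`, `admit`) is invented for the sketch:

```python
# Minimal model of an effect as a relation (Intent × EffContext) ⇒ Outcome:
# an intended effect is admissible only when the context grants the
# capability (deny-by-default) and has budget remaining.
from dataclasses import dataclass

@dataclass
class EffContext:
    caps: frozenset   # granted capabilities, e.g. frozenset({"IO"})
    budget: dict      # remaining uses per effect, e.g. {"IO": 3}

@dataclass
class Outcome:
    ok: bool
    reason: str = ""

def admit(effect: str, ctx: EffContext) -> Outcome:
    """Relate an intended effect to an outcome under the given context."""
    if effect not in ctx.caps:
        return Outcome(False, f"capability {effect} not granted")
    if ctx.budget.get(effect, 0) <= 0:
        return Outcome(False, f"budget exhausted for {effect}")
    ctx.budget[effect] -= 1          # charge one unit of the budget
    return Outcome(True)
```

Note that denial is a structured value rather than an exception, mirroring the "structured failures" the table describes.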
## How AILANG Differs from Traditional Languages
| Aspect | Traditional Languages | AILANG |
|---|---|---|
| Primary author | Human | AI |
| Execution model | Direct | Mediated through kernel |
| Effects | Invisible side-effects | Explicit relations with budgets |
| Verification | Type-centric | Contract + trace-centric |
| Traces | Ad-hoc logs | Structured semantic artifacts |
| Authority | Ambient (can do anything) | Declared and bounded |
| Purpose | Program correctness | Trustworthy AI execution |
## Effects as Capability Relations
Normal compilers:

- `doAnything()` is fine as long as types line up
- Side-effects are invisible to the type system
- No semantic boundary between "pure logic" and "this talks to the database"
AILANG:

```ailang
-- The type signature constrains what this function CAN do
let processUser: User -> ! {DB} UserRecord =
  \user. dbWrite(transformUser(user))

-- This CANNOT compile: no DB capability granted
let pureTransform: User -> UserRecord =
  \user. dbWrite(user)  -- ERROR: Effect mismatch
```
Why this matters for AI:
- The model literally cannot "hallucinate" a network call in a pure function
- Effect boundaries are machine-checkable constraints, not just documentation
- Budget limits (`! {IO @limit=5}`) bound resource consumption per function
## Traces as Semantic Artifacts
AILANG captures traces at two complementary levels:
### Level 1: Agent Execution Traces
The `ailang chains` system captures what AI agents do during task execution:
- Full conversation history with reasoning blocks
- Tool usage timeline (file edits, commands, searches)
- Cost, tokens, turns, and duration metrics
- Session correlation across multi-step workflows
- Approval workflow tracking
```bash
ailang chains view <chain-id> --spans     # See agent execution flow
ailang chains tree <chain-id> --detailed  # Tree view with tool timeline
ailang chains stats --by-agent            # Cost and token breakdown
```
### Level 2: Program Execution Traces
The `--emit-trace` flag captures what happens when AILANG code evaluates:
- Function call entry/exit with arguments and results
- Effect invocations (IO, FS, Net operations)
- Contract check results (requires/ensures)
- Budget consumption deltas
- Call depth tracking
```bash
# Export program-level trace as JSONL
ailang run --emit-trace jsonl --caps IO --entry main program.ail

# Validate with jq
ailang run --emit-trace jsonl --quiet --caps IO --entry main program.ail 2>/dev/null | jq .
```
Example trace output:

```json
{"version":"1.0","event":"function_enter","timestamp_ns":42459,"depth":1,"function":{"name":"processFile","args":["\"input.txt\""]}}
{"version":"1.0","event":"function_enter","timestamp_ns":44125,"depth":2,"function":{"name":"std/io.println","args":["\"Processing...\""]}}
{"version":"1.0","event":"function_exit","timestamp_ns":45750,"depth":2,"function":{"name":"std/io.println","result":"()","duration_ns":1625}}
{"version":"1.0","event":"function_exit","timestamp_ns":46500,"depth":1,"function":{"name":"processFile","result":"()","duration_ns":4041}}
```
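As a sketch of how downstream tooling might consume this format, the following Python reads exit events like those shown above and recovers per-function durations. Only fields visible in the example (`event`, `function.name`, `function.duration_ns`) are assumed; the helper name `durations` is invented here:

```python
# Parse JSONL trace events and collect duration_ns per function_exit.
import json

trace_jsonl = '''
{"version":"1.0","event":"function_enter","timestamp_ns":42459,"depth":1,"function":{"name":"processFile","args":["\\"input.txt\\""]}}
{"version":"1.0","event":"function_exit","timestamp_ns":46500,"depth":1,"function":{"name":"processFile","result":"()","duration_ns":4041}}
'''

def durations(lines: str) -> dict:
    """Map each exited function name to its recorded duration in ns."""
    out = {}
    for line in lines.strip().splitlines():
        ev = json.loads(line)
        if ev["event"] == "function_exit":
            out[ev["function"]["name"]] = ev["function"]["duration_ns"]
    return out
```

Because each event is one self-describing JSON object per line, this kind of analysis needs no schema beyond the fields it actually reads.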
Why both levels matter: When something goes wrong, agent traces show what the AI decided to do; program traces show what the code actually did. Together they provide full observability from intent to execution.
## Explicit Authority and Bounded Power
AILANG eliminates ambient authority. Every capability must be explicitly declared:
```ailang
-- Declare: this function needs IO and FS, with bounded IO usage
export func process(path: string) -> string ! {IO @limit=3, FS} =
  let data = readFile(path)      -- FS capability required
  let _ = println("Processing")  -- IO capability, budget: 1 of 3
  transform(data)                -- Pure: no capability needed
```
Budget enforcement:
- Functions declare maximum effect usage: `! {IO @limit=5}`
- Runtime enforces the limit; exceeding it is a structured failure
- Callers are charged the callee's declared budget (semantic charging)
- Minimum budgets ensure effects were actually exercised
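The semantic-charging rule above can be sketched in Python. This is a conceptual analogy, not AILANG's runtime API; `Budget` and `call` are invented names:

```python
# Semantic charging: a caller is charged the callee's *declared* budget up
# front, so budgets compose predictably regardless of actual usage.
class BudgetError(Exception):
    pass

class Budget:
    def __init__(self, limits: dict):
        self.remaining = dict(limits)   # e.g. {"IO": 5}

    def charge(self, effect: str, n: int = 1):
        left = self.remaining.get(effect, 0)
        if left < n:
            raise BudgetError(f"{effect}: need {n}, have {left}")
        self.remaining[effect] = left - n

def call(budget: Budget, declared: dict, fn, *args):
    """Charge the callee's full declared budget before running it."""
    for effect, n in declared.items():
        budget.charge(effect, n)
    return fn(*args)
```

Charging the declaration rather than the observed usage is what makes the accounting compositional: a caller can reason about its own budget from callee signatures alone.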
Contract verification:
- `requires` blocks check preconditions before execution
- `ensures` blocks verify postconditions after execution
- Contract checks are recorded in traces for auditability
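Conceptually, requires/ensures checking works like the following Python sketch. The decorator API is invented for illustration (AILANG enforces contracts in its kernel); only the check-then-run-then-verify-then-record shape is the point:

```python
# Precondition / postcondition checking with an audit record per call.
import functools

class ContractError(Exception):
    pass

def contract(requires=None, ensures=None, trace=None):
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args):
            if requires and not requires(*args):
                raise ContractError(f"{fn.__name__}: precondition failed")
            result = fn(*args)
            if ensures and not ensures(result, *args):
                raise ContractError(f"{fn.__name__}: postcondition failed")
            if trace is not None:                      # auditability
                trace.append({"fn": fn.__name__, "contract": "ok"})
            return result
        return inner
    return wrap

checks = []

@contract(requires=lambda x: x >= 0,
          ensures=lambda r, x: abs(r * r - x) < 1e-9,
          trace=checks)
def sqrt_checked(x):
    return x ** 0.5
```

A failed precondition stops execution before the body runs; a passed check leaves an entry in the trace, which is what makes contract outcomes auditable after the fact.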
## Better Feedback Loops
AI code generation is iterative. AILANG makes each iteration more productive:
| Traditional | AILANG |
|---|---|
| Arbitrary logs | Structured effect traces |
| "Something crashed" | "Effect mismatch at line 42" |
| Blind retry | Targeted fix with trace evidence |
| No resource bounds | Budget-limited, auditable execution |
The result: faster convergence to working code.
## Research Foundation: LoCoBench
Recent research validates AILANG's design thesis. The LoCoBench benchmark (Salesforce AI Research, 2025) evaluated LLMs on 8,000 multi-file software engineering tasks and found:
- 30-50% performance degradation as context grows from 10K to 1M tokens
- Systems languages (C, Rust) score worst; high-level languages (Python, PHP) score best
- Architectural reasoning is a key failure mode — especially for tightly-coupled systems
AILANG's explicit effects, static imports, and deterministic semantics directly address these failure modes.
## Benchmark Results — M-EVAL
We continuously track AI code generation success across 46 benchmarks with Claude, GPT, and Gemini.
Key metrics:
- Zero-Shot: Code works on first try
- Final Success: Code works after self-repair
- Agent Success: Multi-turn agent completes task
The jump from zero-shot to agent mode shows that structured error feedback helps models self-correct.
See the Benchmark Dashboard for live results, trends, and per-model breakdowns.
## Core Capabilities
AILANG provides:
- Algebraic Effects — `! {IO, FS, Net, AI, SharedMem}` declares capabilities in types
- Deterministic Core — pure functions are referentially transparent
- Structured Traces — agent-level (`ailang chains`) and program-level (`--emit-trace`)
- Contracts — `requires`/`ensures` as machine-readable specifications
- Budget Enforcement — `@limit=N` bounds effect usage per function
- Go Codegen — compile to native performance
- Semantic Caching — multi-agent coordination through shared memory
- AI Effect — unified interface for LLM calls across providers
See Implementation Status for detailed feature availability.
## What AILANG Deliberately Excludes
AILANG prioritizes machine reasoning over human ergonomics:
- No LSP/IDE servers — AIs use CLI/API, not text editors
- No multiple syntaxes — one canonical way to express each concept
- No implicit behaviors — all effects are explicit
- No ambient authority — everything requires explicit capability grants
These aren't limitations — they're design choices that make the language a reliable substrate for AI-synthesized programs.
See Design Lineage for the full adopted/rejected analysis.
## Get Involved
- Try it: See Getting Started
- Benchmark it: Run `ailang eval-suite` to test AI code generation
- Contribute: github.com/sunholo-data/ailang
AILANG — a symbolic reasoning kernel for trustworthy AI execution.
## Theoretical Foundations
For the academic grounding behind these design choices:
- Design Axioms — The 12 non-negotiable principles
- Philosophical Foundations — Block-universe determinism and computation as navigation
- Design Lineage — What we adopted, rejected, and why
- Citations & Bibliography — Full academic references