# Execution Traces
AILANG captures detailed execution traces when programs run — every function call, effect invocation, contract check, and budget change. These traces enable determinism verification, debugging, and AI training data export.
## Why Traces Matter
AILANG is designed as a symbolic reasoning kernel where AI agents write and execute programs. Traces provide the evidence chain:
- Determinism proof: Run a program twice, compare traces — identical traces mean identical behavior
- Debugging: See exactly which functions ran, in what order, with what arguments
- Quality scoring: Automatically score program executions for complexity, correctness, and resource efficiency
- Training data: Export high-quality traces as fine-tuning data for AI models
## Quick Start

```bash
# 1. Capture a trace
ailang run --emit-trace jsonl --caps IO --entry main program.ail > trace.jsonl

# 2. Inspect the trace
cat trace.jsonl | jq .

# 3. Verify determinism (re-runs and compares)
ailang replay trace.jsonl

# 4. Score the trace quality
ailang export-training --score trace.jsonl

# 5. Export as training data
ailang export-training --min-score 0.5 traces/
```
## Capturing Traces

Add `--emit-trace jsonl` to any `ailang run` command. When JSONL tracing is active, all status messages and program output go to stderr, leaving stdout as clean JSONL that pipes directly to `jq`:

```bash
# Capture trace, see program output on screen
ailang run --emit-trace jsonl --caps IO --entry main module.ail > trace.jsonl

# Pipe directly to jq (stdout is pure JSONL)
ailang run --emit-trace jsonl --caps IO --entry main module.ail 2>/dev/null | jq .

# Capture both trace and program output separately
ailang run --emit-trace jsonl --caps IO --entry main module.ail > trace.jsonl 2> output.txt

# Also emit OTEL spans (for Cloud Trace integration)
ailang run --emit-trace jsonl,otel --caps IO --entry main module.ail > trace.jsonl
```
## Trace Event Types

Each line in the JSONL file is one event:

| Event | When | Key Fields |
|---|---|---|
| `module_start` | Program begins | module name, granted capabilities |
| `function_enter` | Function called | function name, arguments, call depth |
| `function_exit` | Function returns | function name, result, duration |
| `effect` | Side effect invoked | effect name, operation, args |
| `contract_check` | Pre/postcondition tested | kind, passed/failed, message |
| `budget_delta` | Resource consumed | used, limit, remaining |
| `error` | Runtime error | message, position |
| `module_end` | Program ends | module name, error count |
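Because every line carries an `event` field naming its type, a trace can be summarized with ordinary JSONL tooling. A minimal Python sketch (the sample lines are abbreviated illustrations, not real `ailang` output):

```python
import json
from collections import Counter

# Abbreviated sample lines; in practice, read them from trace.jsonl:
#   with open("trace.jsonl") as f: trace_lines = f.readlines()
trace_lines = [
    '{"version":"1.0","event":"module_start"}',
    '{"version":"1.0","event":"function_enter"}',
    '{"version":"1.0","event":"effect"}',
    '{"version":"1.0","event":"function_exit"}',
    '{"version":"1.0","event":"module_end"}',
]

# Count events by type: every JSONL line has an "event" field
counts = Counter(json.loads(line)["event"] for line in trace_lines)
print(dict(counts))
# → {'module_start': 1, 'function_enter': 1, 'effect': 1, 'function_exit': 1, 'module_end': 1}
```

The same one-liner works on real traces, since each line is independently valid JSON.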
## Example Trace

Running a simple "Hello, AILANG!" program produces:

```json
{"version":"1.0","event":"module_start","timestamp_ns":750,"module":{"name":"examples/runnable/hello","caps":["IO"]}}
{"version":"1.0","event":"function_enter","timestamp_ns":24375,"depth":1,"function":{"name":"std/io.print","args":["Hello, AILANG!"]}}
{"version":"1.0","event":"effect","timestamp_ns":30000,"depth":1,"effect":{"effect_name":"IO","op_name":"print","args":["Hello, AILANG!"],"result":"()"}}
{"version":"1.0","event":"function_exit","timestamp_ns":39167,"depth":1,"function":{"name":"std/io.print","result":"()","duration_ns":15000}}
{"version":"1.0","event":"module_end","timestamp_ns":39375,"module":{"name":"examples/runnable/hello","duration_ns":38625}}
```
You can see: the module started with the `IO` capability, called `std/io.print` with the greeting, the IO effect was invoked (`effect` event), the function returned `()` in 15μs, and the module ended with total duration recorded.

Non-deterministic effects (like `Net.httpGet`, `Clock.now`, `IO.readLine`) are automatically flagged with `"deterministic": false` in their `effect` event. The replay comparator skips argument/result comparison for these events, allowing traces with network calls or time-dependent operations to replay successfully.
## Replaying Traces (Determinism Verification)

The `replay` command re-executes the source program and compares the new trace against a baseline:

```bash
# Basic replay — auto-resolves source file and capabilities from the trace
ailang replay trace.jsonl

# JSON output for programmatic comparison
ailang replay --json trace.jsonl

# Override the source file (e.g., to test a modified version)
ailang replay --file modified.ail trace.jsonl
```
### How Auto-Resolution Works

Replay reads the `module_start` event from the baseline trace to determine:

- Source file: the module name `examples/runnable/hello` resolves to `examples/runnable/hello.ail`
- Capabilities: `"caps":["IO"]` passes `--caps IO` to the re-execution

This means you typically don't need `--file` or `--caps` overrides — just point at the trace file.
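The resolution step itself is small enough to sketch in Python. This is illustrative only (the real logic lives inside the `ailang` CLI); it reads the first `module_start` event and derives the source path and capability list from its fields:

```python
import json

def resolve_from_trace(trace_lines):
    """Mimic replay's auto-resolution: find module_start, return (source, caps)."""
    for line in trace_lines:
        event = json.loads(line)
        if event["event"] == "module_start":
            module = event["module"]
            return module["name"] + ".ail", module["caps"]
    raise ValueError("no module_start event in trace")

trace = ['{"version":"1.0","event":"module_start","timestamp_ns":750,'
         '"module":{"name":"examples/runnable/hello","caps":["IO"]}}']
source, caps = resolve_from_trace(trace)
print(source, caps)  # → examples/runnable/hello.ail ['IO']
```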
### Exit Codes
| Code | Meaning |
|---|---|
| 0 | Traces match — deterministic |
| 1 | Traces differ — non-deterministic behavior detected |
| 2 | Error (file not found, parse error, etc.) |
### What Gets Compared
Replay compares:
- Event types and order
- Function names and arguments
- Effect operations and parameters
- Contract check results
Replay ignores:
- Timestamps (execution speed varies)
- Durations (performance isn't determinism)
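These rules can be approximated in Python: strip timing fields, mask the args/results of effects flagged `"deterministic": false`, then compare the remaining structure line by line. A sketch only; the normalization details here are assumptions, not the replay comparator's actual code:

```python
import json

TIMING_KEYS = {"timestamp_ns", "duration_ns"}  # timing varies run to run

def normalize(event):
    """Drop timing fields; mask args/results of non-deterministic effects."""
    e = {k: v for k, v in event.items() if k not in TIMING_KEYS}
    if "function" in e:
        e["function"] = {k: v for k, v in e["function"].items()
                         if k not in TIMING_KEYS}
    eff = e.get("effect")
    if eff is not None and eff.get("deterministic") is False:
        e["effect"] = {**eff, "args": None, "result": None}
    return e

def traces_match(baseline_lines, replay_lines):
    a = [normalize(json.loads(l)) for l in baseline_lines]
    b = [normalize(json.loads(l)) for l in replay_lines]
    return a == b

# Same events, different timings: still a match
baseline = ['{"event":"function_exit","timestamp_ns":100,'
            '"function":{"name":"f","result":"()","duration_ns":50}}']
replay   = ['{"event":"function_exit","timestamp_ns":900,'
            '"function":{"name":"f","result":"()","duration_ns":75}}']
print(traces_match(baseline, replay))  # → True
```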
### Use Cases

**Regression testing**: Capture a baseline trace, make code changes, replay to verify behavior is preserved:

```bash
# Before changes
ailang run --emit-trace jsonl --caps IO --entry main module.ail > baseline.jsonl

# Make your changes...

# After changes — exit 0 means behavior unchanged
ailang replay baseline.jsonl
```

**CI/CD integration**: Add replay checks to your CI pipeline:

```bash
# Exits non-zero on mismatch — fails the build
ailang replay tests/traces/critical_path.jsonl
```
## Scoring Traces

Every trace can be scored for quality on a 0.0-1.0 scale:

```bash
# Human-readable score report
ailang export-training --score trace.jsonl

# Machine-readable (JSON)
ailang export-training --score --json trace.jsonl

# Score all traces in a directory
ailang export-training --score traces/
```
### Scoring Components
| Component | Weight | What It Measures |
|---|---|---|
| Completion | 30% | 1.0 if clean module_end, 0.0 if errors |
| Complexity | 25% | Function count, call depth, total calls (log scale) |
| Contracts | 20% | Pass rate of pre/postconditions (0.5 neutral if none) |
| Budget efficiency | 15% | 1.0 if 20-80% of budget used; penalizes waste or exhaustion |
| Effect diversity | 10% | More effect types = higher score |
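The weighted combination is a plain dot product of component scores and the weights above. A sketch with illustrative component values (the weights come from the table; the component values themselves are made up, not computed from a real trace):

```python
# Weights as documented in the scoring table
WEIGHTS = {
    "completion": 0.30,
    "complexity": 0.25,
    "contracts": 0.20,
    "budget_efficiency": 0.15,
    "effect_diversity": 0.10,
}

def overall_score(components):
    """Weighted sum of per-component scores, each in [0.0, 1.0]."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)

components = {
    "completion": 1.0,         # clean module_end, no errors
    "complexity": 0.6,         # illustrative value
    "contracts": 0.5,          # neutral: no contracts in the trace
    "budget_efficiency": 1.0,  # within the 20-80% band
    "effect_diversity": 0.2,   # illustrative value
}
print(round(overall_score(components), 2))  # → 0.72
```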
### Interpreting Scores
- 0.0-0.3: Trivial or broken — program crashed or did very little
- 0.3-0.5: Simple — basic execution, few interesting behaviors
- 0.5-0.7: Good — non-trivial logic with some verification
- 0.7-1.0: Excellent — complex logic, contracts passing, efficient resource use
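For automated reporting, the bands above map naturally onto a small classifier. The band edges follow the list; how ties on a boundary are broken is an arbitrary choice of this sketch:

```python
def interpret(score):
    """Map a 0.0-1.0 trace score to its rough quality band."""
    if score < 0.3:
        return "trivial or broken"
    if score < 0.5:
        return "simple"
    if score < 0.7:
        return "good"
    return "excellent"

print(interpret(0.85))  # → excellent
```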
### Why Score?

Scoring enables automated quality gating for AI training pipelines. Instead of training on every program execution, you can filter for high-quality examples:

```bash
# Only export traces scoring above 0.7
ailang export-training --min-score 0.7 traces/ > training.jsonl
```
## Exporting Training Data

Convert scored traces into AI fine-tuning data:

```bash
# Export all traces as training JSONL
ailang export-training traces/

# Filter by quality
ailang export-training --min-score 0.5 traces/

# Write to file instead of stdout
ailang export-training --output training.jsonl traces/

# Include source code resolution
ailang export-training --source-dir src/ traces/
```
### Output Format

Each line is a complete training example:

```json
{
  "source": "module examples/runnable/hello\nimport std/io...",
  "trace": "{\"version\":\"1.0\",\"event\":\"module_start\"...}\n{...}\n...",
  "score": 0.85,
  "metadata": {
    "module": "examples/runnable/hello",
    "caps": ["IO"],
    "event_count": 4,
    "function_count": 1,
    "max_depth": 1,
    "has_errors": false
  }
}
```
The `source` field contains the original AILANG source code (resolved from the module name), and the `trace` field contains the full JSONL trace. Together they form a (program, execution) pair suitable for training.
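Downstream tooling can consume this format with ordinary JSONL parsing. A sketch that re-applies score filtering client-side (assuming a threshold keeps examples at or above it, matching the spirit of `--min-score`):

```python
import json

def filter_examples(lines, min_score=0.5):
    """Keep training examples whose score is at or above min_score."""
    kept = []
    for line in lines:
        example = json.loads(line)
        if example["score"] >= min_score:
            kept.append(example)
    return kept

# Abbreviated examples in the documented shape
lines = [
    '{"source":"module a","trace":"...","score":0.85,"metadata":{"module":"a"}}',
    '{"source":"module b","trace":"...","score":0.30,"metadata":{"module":"b"}}',
]
good = filter_examples(lines, min_score=0.5)
print([ex["metadata"]["module"] for ex in good])  # → ['a']
```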
### Source Resolution

The exporter tries to find source files in this order:

1. `--source-dir` flag (if provided)
2. Same directory as the trace file
3. Current working directory

It resolves the module name from the `module_start` event (e.g., `examples/runnable/hello` → `examples/runnable/hello.ail`).
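The lookup order above can be sketched as a candidate list tried in sequence. This is a sketch of the order only; details such as whether `--source-dir` is searched by bare filename or by full module path are assumptions here, not confirmed exporter behavior:

```python
from pathlib import Path
import tempfile

def resolve_source(module_name, trace_path, source_dir=None):
    """Return the first existing candidate for module_name's .ail file."""
    filename = Path(module_name + ".ail").name
    candidates = []
    if source_dir is not None:
        candidates.append(Path(source_dir) / filename)      # 1. --source-dir (assumed flat)
    candidates.append(Path(trace_path).parent / filename)   # 2. next to the trace file
    candidates.append(Path.cwd() / (module_name + ".ail"))  # 3. cwd, full module path
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None

# Demo: put hello.ail in a temp dir and resolve it via source_dir
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "hello.ail").write_text("module examples/runnable/hello")
    found = resolve_source("examples/runnable/hello", "traces/hello.jsonl", source_dir=d)
    print(found is not None)  # → True
```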
## Full Pipeline Example

Here's a complete workflow for building an AI training dataset from AILANG program executions:

```bash
# Step 1: Run multiple programs, capturing traces
mkdir -p traces
for f in examples/runnable/*.ail; do
  name=$(basename "$f" .ail)
  ailang run --emit-trace jsonl --caps IO --entry main "$f" > "traces/${name}.jsonl" 2>/dev/null
done

# Step 2: Verify all traces are deterministic
for t in traces/*.jsonl; do
  if ! ailang replay "$t" > /dev/null 2>&1; then
    echo "WARNING: Non-deterministic trace: $t"
  fi
done

# Step 3: Score and review
ailang export-training --score traces/

# Step 4: Export high-quality examples
ailang export-training --min-score 0.5 --source-dir examples/runnable/ --output training.jsonl traces/

# Result: training.jsonl contains scored (source, trace) pairs
echo "Training examples: $(wc -l < training.jsonl)"
```
## Two Levels of Traces
AILANG has traces at two complementary levels:
| Aspect | Program Traces (`--emit-trace`) | Agent Traces (`ailang chains`) |
|---|---|---|
| What | AILANG program execution | AI agent workflows |
| Granularity | Functions, effects, contracts | Sessions, turns, tool calls |
| When | `ailang run program.ail` | Coordinator task execution |
| Storage | JSONL files (standalone) | `observatory.db` (SQLite) |
| Example | "println called 3 times, budget 3/5 used" | "Agent ran 12 turns, called Bash 5 times" |
Program traces (this guide) capture what happens inside AILANG code. Agent traces capture what happens around it — the AI agent's reasoning, tool usage, and multi-step workflows.
For agent-level tracing, see the Telemetry & Tracing and Coordinator guides.
## Known Limitations

- Non-module files: Traces currently require module files (a `module` declaration plus `--entry main`). Single-expression files are not yet supported.
- Step-through mode: There is no interactive step-through replay yet (planned for a future release).