
Three Camps of AI-Native Languages

In May 2026, Negroni Venture Studios published a survey of programming languages designed specifically for LLM-generated code. The post catalogued ~20 such languages — most of them less than 12 months old — and grouped them into three camps based on what their authors believe is the critical failure mode of LLM codegen.

This page mirrors that survey, places AILANG inside it, and adds something the original post didn't: a gap analysis showing where each camp's hypothesis would shine on benchmarks AILANG (and the field) doesn't yet test.

Bottom line

Twenty teams arriving at the same three answers in six months is not coincidence. It's the early shape of a consensus that agents need languages built for them, not languages tolerated by them. The three camps disagree about which property matters most. AILANG's bet: verification + orchestration are the same problem.

The Three Camps

Camp | Claim | Mechanism | Example languages
Syntactic | LLMs fail because tokens are ambiguous | Restructure syntax to remove token-level ambiguity | X07, NERD, Magpie, Laze
Verification | LLMs fail because output isn't checked | Make contracts mechanically verifiable; ship a checker | Vera, Aver, Raskell, Prove, Pact, MoonBit, Zero, AILANG
Orchestration | LLMs fail because the loop around them is wrong | Reframe as agent coordination, not language design | Pel, Marsha, Plumbing, Quasar, Boruna

Each camp encodes a testable hypothesis about where LLM codegen breaks down. They're not abstract disagreements — they're concrete design choices producing measurable outcomes.

The Camps in Detail

Syntactic Camp — fix the tokens going in

Authors in this camp believe the LLM's tokenizer is the bottleneck: ambiguous operators, optional punctuation, and inconsistent whitespace handling produce more codegen failures than missing knowledge. Their fix is to restructure the language surface itself.

Language | Distinguishing mechanism | URL
X07 | Eliminates text syntax; programs are JSON ASTs edited through RFC 6902 patches | x07lang.org
NERD | Replaces all operators with English keywords (PLUS, EQUALS) | nerd-lang.org
Magpie | Surfaces Static Single-Assignment (SSA) form as user-facing syntax | magpie-lang.com
Laze | Minimal indentation-based, no punctuation, compiles to C | github.com/kerv/laze
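
To make the X07 row concrete, here is a minimal Python sketch of editing a program stored as a JSON AST with RFC 6902-style patch operations. The node schema (`kind`, `fn`, `args`) and the helper names are hypothetical and are not taken from X07; only the `add`, `remove`, and `replace` operations are covered, and paths follow JSON Pointer (RFC 6901).

```python
import copy

def resolve_parent(doc, pointer):
    """Walk a JSON Pointer and return (parent container, final token)."""
    if not pointer.startswith("/"):
        raise ValueError("this sketch only supports non-empty absolute pointers")
    # RFC 6901 unescaping: "~1" becomes "/", then "~0" becomes "~".
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in pointer[1:].split("/")]
    node = doc
    for tok in tokens[:-1]:
        node = node[int(tok)] if isinstance(node, list) else node[tok]
    return node, tokens[-1]

def apply_patch(doc, patch):
    """Apply a subset of RFC 6902 (add, remove, replace) to a deep copy of doc."""
    doc = copy.deepcopy(doc)
    for op in patch:
        parent, key = resolve_parent(doc, op["path"])
        kind = op["op"]
        if isinstance(parent, list):
            # "-" means "append at the end of the array" for add operations.
            idx = len(parent) if key == "-" else int(key)
            if kind == "add":
                parent.insert(idx, op["value"])
            elif kind == "remove":
                del parent[idx]
            elif kind == "replace":
                parent[idx] = op["value"]
            else:
                raise ValueError(f"unsupported op: {kind}")
        else:
            if kind == "add":
                parent[key] = op["value"]
            elif kind == "remove":
                del parent[key]
            elif kind == "replace":
                if key not in parent:
                    raise KeyError(key)
                parent[key] = op["value"]
            else:
                raise ValueError(f"unsupported op: {kind}")
    return doc

# Hypothetical AST for the expression add(1, 2).
program = {
    "kind": "call",
    "fn": "add",
    "args": [{"kind": "int", "value": 1}, {"kind": "int", "value": 2}],
}

# A structural edit: change the callee and append a third argument.
patch = [
    {"op": "replace", "path": "/fn", "value": "mul"},
    {"op": "add", "path": "/args/-", "value": {"kind": "int", "value": 3}},
]

patched = apply_patch(program, patch)
```

The point of the design is that a malformed edit fails at a specific path rather than producing text that no longer parses.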

Verification Camp — fix the contract on what comes out

This is the largest camp. Authors here believe LLM codegen looks plausible but is semantically wrong too often to trust — the fix is mechanical verification that runs before the human or downstream agent sees the output.

Language | Distinguishing mechanism | URL
Vera | Z3 verification + De Bruijn slot references (no variable names) | veralang.dev
Aver | Lean 4 proof export, co-located verify blocks, decision blocks (ADRs in code) | averlang.dev
Raskell | Builds on Haskell; focus is fixing the tooling/runtime, not creating a new language | raskell.io
Prove | Deliberately AI-resistant; license explicitly forbids use as training data | prove.botwork.se
Pact | Intent annotations per function, explicit effects, MCP server | github.com/KikotVit/pact-lang
MoonBit | Semantics-aware token sampler that constrains LLM generation to valid code | moonbitlang.com
Zero | "One canonical form" + structured diagnostics for agents; no mandatory contracts | github.com/vercel-labs/zerolang
AILANG | Z3-backed requires/ensures contracts, row-polymorphic effects, HM types | ailang.sunholo.com
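
The "no variable names" idea in the Vera row can be illustrated with classic De Bruijn indices. The sketch below, in Python, converts named lambda terms to index form; the term encoding is hypothetical, and whether Vera's slot references match this textbook scheme exactly is an assumption.

```python
def to_de_bruijn(term, env=()):
    """Convert a named lambda term to De Bruijn index form.

    Input terms (hypothetical encoding):
      ("var", name) | ("lam", name, body) | ("app", fn, arg)
    Output terms:
      ("idx", n)    | ("lam", body)       | ("app", fn, arg)
    env lists the enclosing binder names, innermost first.
    """
    tag = term[0]
    if tag == "var":
        name = term[1]
        # The index is the number of binders between the use and its binder.
        for depth, bound in enumerate(env):
            if bound == name:
                return ("idx", depth)
        raise NameError(f"free variable: {name}")
    if tag == "lam":
        _, name, body = term
        return ("lam", to_de_bruijn(body, (name,) + env))
    if tag == "app":
        _, fn, arg = term
        return ("app", to_de_bruijn(fn, env), to_de_bruijn(arg, env))
    raise ValueError(f"unknown term tag: {tag}")

# \x. \y. x  and  \a. \b. a  differ only in names.
k_xy = ("lam", "x", ("lam", "y", ("var", "x")))
k_ab = ("lam", "a", ("lam", "b", ("var", "a")))

# \x. \x. x  shadows the outer binder; the inner one wins.
shadowed = ("lam", "x", ("lam", "x", ("var", "x")))
```

Terms that differ only in naming convert to the same structure, and shadowing resolves to an explicit index, which is the property the shadowing_heavy_contract gap benchmark is meant to probe.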

Orchestration Camp — fix the loop around the language

Authors here believe the language itself is not the problem; what's missing is primitives for coordinating agents. Their fix is at the runtime, harness, or coordination layer — sometimes wrapping a conventional language, sometimes baking coordination into the language itself.

Language | Distinguishing mechanism | URL
Pel | Reframes the problem as agent coordination as a language primitive | arxiv.org/abs/2505.13453
Marsha | Agent coordination framework | github.com/alantech/marsha
Plumbing | Graph-level wiring connecting agents into typed, streaming pipelines with static well-formedness | Baez blog
Quasar | Python-subset transpile + automated parallelization + uncertainty quantification (UPenn, 42% time reduction) | arxiv.org/abs/2506.12202
Boruna | Capability-gated bytecode VM + hash-chained tamper-evident audit logs | github.com/escapeboy/boruna
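
The "hash-chained tamper-evident audit log" in the Boruna row is a general technique, sketched below in Python. This is not Boruna's format: the entry layout, the genesis value, and the function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry's predecessor

def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(
        {"prev": prev_hash, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append(log, payload):
    """Append an entry whose hash commits to everything before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})

def verify(log):
    """Return the index of the first inconsistent entry, or None if the chain is intact."""
    prev = GENESIS
    for i, entry in enumerate(log):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return i
        prev = entry["hash"]
    return None

log = []
append(log, {"action": "fs.read", "path": "/tmp/in.txt"})
append(log, {"action": "net.get", "host": "api.example"})
append(log, {"action": "fs.write", "path": "/tmp/out.txt"})

# Simulate tampering: rewrite the payload of the middle entry without rehashing.
tampered = [dict(e) for e in log]
tampered[1] = {**tampered[1], "payload": {"action": "net.get", "host": "evil.example"}}
```

Because each hash covers the previous hash, changing any earlier entry invalidates the chain from that point forward unless every later entry is rewritten as well.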

Where AILANG Fits

AILANG sits in the Verification camp by the original survey's grouping. That's correct as far as it goes — requires/ensures with Z3 is squarely verification-camp machinery — but it understates AILANG's full position. The honest scorecard:

Camp | Membership | Evidence
Verification | ✅ Full member | Z3-backed contracts via ailang verify; row-polymorphic effect rows that mechanically check what a function can do; HM type inference
Orchestration | ✅ Strong member (under-surfaced publicly) | std/ai as a first-class effect; coordinator + executor/provider architecture; managed_agents; chain telemetry; eval harness; agent messaging; MCP server
Syntactic | ❌ Deliberate non-member | ML-family readable syntax with conventional operators. Bets that contract on output matters more than tokens on input

The interesting bit is that orchestration and verification turn out to be the same problem once you commit to both:

  • Contracts specify what the agent must produce.
  • Effect rows specify how the agent's code touches the world.
  • std/ai makes the agent itself a typed citizen of the program it's writing.

No other language in the survey occupies this intersection.

The Gap Analysis

The original post stopped at categorization. The natural next question is: do the camps' hypotheses actually hold up under measurement? This requires benchmarks designed specifically to probe each camp's claims — most of which don't yet exist in AILANG's eval suite or anywhere else in the field.

The table below maps each peer language's distinguishing capability to a benchmark gap. Each row is a testable hypothesis about why that camp exists.

Gaps driven by Syntactic camp claims

Gap benchmark | Inspired by | Hypothesis under test
ast_patch_roundtrip | X07 | Does generating a code transformation as a structural diff produce fewer errors than free text?
dense_operator_program | NERD | Do tokenizer-ambiguous operators (<<, >>, &&, ==) measurably hurt LLM pass rate? Direct refutation test for AILANG.
explicit_dataflow_ssa | Magpie | Does SSA-shaped code (heavy let-chain, single assignment) improve LLM reasoning?

Gaps driven by Verification camp claims

Gap benchmark | Inspired by | Hypothesis under test
shadowing_heavy_contract | Vera | Do named identifiers break down under heavy shadowing? AILANG's HM should hold.
decision_block_capture | Aver | Does requiring agents to emit structured rationale alongside code improve auditability?
intent_annotated_solver | Pact | Does @intent("...")-style prompting measurably improve LLM pass rate? Direct test of Pact's hypothesis.
canonical_convergence | Zero | Run N=20 generations of the same prompt; measure how often the LLM converges on semantically-equivalent code.
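
One way to score a canonical_convergence run is sketched below in Python. It uses Python's own `ast` module as a stand-in for an AILANG parser, and it treats "same AST" as a conservative proxy for "semantically equivalent"; both choices are assumptions for illustration and are not how AILANG's harness is known to work. The four samples are hand-written, not model output.

```python
import ast
from collections import Counter

def canonical_key(source):
    """Parse source and dump its AST without positions.

    Whitespace, comments, and redundant parentheses disappear, so samples
    that differ only in formatting map to the same key.
    """
    return ast.dump(ast.parse(source), include_attributes=False)

def convergence_rate(samples):
    """Fraction of samples that share the most common canonical form."""
    counts = Counter(canonical_key(s) for s in samples)
    return counts.most_common(1)[0][1] / len(samples)

# Hand-written stand-ins for N generations of the same prompt.
samples = [
    "def add(a, b):\n    return a + b\n",
    "def add(a,b):  # sum\n    return a+b\n",
    "def add(a, b):\n    return (a + b)\n",
    "def add(a, b):\n    return b + a\n",  # equivalent for ints, but a different AST
]

rate = convergence_rate(samples)
```

The last sample shows the limit of the proxy: structural comparison undercounts semantic equivalence, so a real benchmark would likely need a stronger equivalence check.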

Gaps driven by Orchestration camp claims

Gap benchmark | Inspired by | Hypothesis under test
multi_agent_handoff | Pel / Marsha | Agent uses std/ai to delegate a subtask, composes the result. AILANG's std/ai makes this expressible without external wrapping.
typed_stream_pipeline | Plumbing | Static well-formedness of a streaming transform: does the type system catch wiring errors?
parallel_independent_subtasks | Quasar | Code structure that exposes parallelism: can the LLM produce code an optimizer could parallelize?
audit_chain_replay | Boruna | Execute → capture chain → replay; bit-identical output. Direct test of AILANG's A2 (replayability).
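
The execute, capture, replay loop behind audit_chain_replay can be sketched in Python as a record-and-replay wrapper around an effect. Everything here is hypothetical (`Recorder`, `Replayer`, the dice-roll effect) and is not AILANG's or Boruna's mechanism; it only shows why capturing effect results makes a nondeterministic run reproducible byte for byte.

```python
import random

class Recorder:
    """Wrap a live effect and record every (request, response) pair."""
    def __init__(self, effect):
        self.effect = effect
        self.trace = []

    def __call__(self, request):
        response = self.effect(request)
        self.trace.append((request, response))
        return response

class Replayer:
    """Serve recorded responses in order instead of calling the live effect."""
    def __init__(self, trace):
        self.pending = list(trace)

    def __call__(self, request):
        recorded_request, response = self.pending.pop(0)
        if recorded_request != request:
            raise RuntimeError("replay diverged from the recorded trace")
        return response

def program(ask):
    """A program whose only source of nondeterminism is the effect `ask`."""
    first = ask("roll")
    second = ask("roll")
    return f"total={first + second}".encode("utf-8")

def nondeterministic_effect(request):
    # Stand-in for any effect with an unpredictable result (AI call, clock, network).
    return random.randint(1, 6)

recorder = Recorder(nondeterministic_effect)
live_output = program(recorder)                      # execute and capture
replayed_output = program(Replayer(recorder.trace))  # replay from the capture
```

Replay only works when every source of nondeterminism goes through a recorded effect, which is why explicit effect tracking and replayability tend to be designed together.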

Gaps for AILANG's own untested differentiators

Gap benchmark | Why it matters
ai_effect_summarize | std/ai is AILANG's biggest unique capability, currently unbenchmarked
ai_effect_json_schema | Structured AI calls with schema enforcement (callJson)
unauthorized_fs_refused | Tests A4 (explicit authority): code that should fail because the FS capability wasn't granted
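
The shape of an unauthorized_fs_refused check can be sketched in Python as follows. The page does not say whether AILANG rejects such code statically (through effect rows) or at run time; this sketch uses a runtime check as a stand-in, and `Sandbox`, `CapabilityError`, and the in-memory file table are hypothetical.

```python
class CapabilityError(PermissionError):
    """Raised when code uses a capability it was not granted."""

class Sandbox:
    """Toy execution context with an explicit set of granted capabilities."""
    def __init__(self, granted, files):
        self.granted = frozenset(granted)
        self.files = dict(files)  # in-memory stand-in for a real filesystem

    def read_file(self, path):
        # Authority is explicit: no FS grant means no read, whatever the path.
        if "FS" not in self.granted:
            raise CapabilityError("FS capability not granted")
        return self.files[path]

def was_refused(sandbox, path):
    """Return True if the read was refused for lack of capability."""
    try:
        sandbox.read_file(path)
    except CapabilityError:
        return True
    return False

files = {"/data/notes.txt": "hello"}
denied = Sandbox(granted=[], files=files)
allowed = Sandbox(granted=["FS"], files=files)
```

The benchmark passes when the refusal happens, so the expected result for the `denied` sandbox is a failure of the program rather than a successful read.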

Total: 14 gap benchmarks, each one a testable claim about why a particular camp exists. Some will refute their camp's hypothesis. Some will surface real AILANG weaknesses. Both outcomes are informative.

The self-audit results — AILANG running against all 14 — are on the Three Camps Self-Audit companion page.

What This Map Tells Us

A few observations from sitting with the survey for a few days:

The camps aren't separable. AILANG's design treats verification and orchestration as the same problem. Pact's MCP server is an orchestration-camp move from a verification-camp language. Zero's "canonical form" is a verification-camp argument applied to a systems-flavored design. The categories are useful shorthand, but they're not load-bearing.

The syntactic camp's claim is the most testable. dense_operator_program tests NERD's hypothesis directly, and the result will either confirm or refute it. If LLMs reach pass-rate parity on operator-heavy AILANG code versus Python, the tokenizer-ambiguity bet doesn't hold up under measurement.

No one is building "a normal language, but better." Every team in the survey took a strong position about what to change. That's the real consensus, and it suggests the field collectively believes that conventional language design cannot be patched into agent usability.

AILANG's eval harness is reusable. Adding peer languages to AILANG's benchmark grid is mechanically straightforward: write a teaching prompt (~3k tokens) from public docs, install the toolchain, register a runner. The harness then measures what actually matters: how well an LLM can learn a new language from documentation alone. This is the methodology AILANG already uses on itself — AILANG isn't in any LLM's training data either.

What's Next

This survey is a 2026-05 snapshot. The post is still being edited; some languages may move between camps; new ones will be added. Tracking the field at this density is itself a contribution.

The follow-up work on this site:

  • Three Camps Self-Audit — AILANG's own results on the 14 gap benchmarks (initial run live)
  • Peer-language comparison data (MoonBit, Vera, Aver — work in progress)
  • Audit memos for borrowable ideas (decision blocks, intent annotations, audit chains, typed streaming pipelines)

If you're working on an AI-native language that should appear here, open an issue — the goal is for this page to be a useful reference for the whole field, not just AILANG's positioning.

See Also