Skip to main content

Custom AI Providers (Config-Driven)

Status: Available since AILANG v0.15.0.

Add a new AI provider as a package — no Go code, no binary fork. Declare an [[ai_provider]] block in your ailang.toml, and AILANG registers it as a first-class provider with full budget tracking, AI capability gating, and trace integration. Same machinery as the hardcoded built-ins (openai, anthropic, gemini, ollama, openrouter).

Quick start

# ailang.toml in your project root
[[ai_provider]]
schema_version = 1
name = "my-llm"
endpoint = "http://localhost:8000/v1/chat/completions"
request_shape = "openai_chat"
response_path = "$.choices[0].message.content"
auth = { type = "none" }

Then call it from AILANG:

import std/ai (call)

export func main() -> string ! {AI} = call("Hello!")
ailang run --caps AI --ai my-llm --model my-llm/some-model main.ail

That's it. Budget, AI cap, and trace spans all flow through the standard AILANG effect machinery.

When to use this vs a built-in provider

Use casePath
OpenAI / Anthropic / Gemini / Ollama / OpenRouterBuilt-in providers — already wired
vLLM, llama.cpp, Together, Groq, Anyscale, Fireworks, DeepInfra, Perplexity, Mistral native, any OpenAI-compatible HTTP endpointConfig-driven (this guide)
Bedrock, Vertex AI, Azure OpenAI (custom auth flows like SigV4 / Azure AD / OAuth)Stay built-in or escape via auth_headers
Non-HTTP transports (gRPC, WebSocket-only)Stay built-in

Schema reference

[[ai_provider]]
# REQUIRED FIELDS
schema_version = 1 # only "1" accepted in v0.15.0
name = "my-llm" # routing prefix; call("my-llm/<model>", ...) routes here
endpoint = "https://api.example.com/v1/chat/completions"
request_shape = "openai_chat" # "openai_chat" | "anthropic_messages" | "simple_completion" | "custom"
response_path = "$.choices[0].message.content" # JSONPath to extract text on 2xx
auth = { type = "none" } # required — see auth shapes below

# OPTIONAL FIELDS
error_path = "$.error.message" # JSONPath to extract error text on 4xx/5xx
cost = { input_per_1m_usd = 1.0, output_per_1m_usd = 2.0, currency = "USD" }
# OR
cost = { per_call_usd = 0.01 }

capabilities = { tool_calling = false, json_mode = true, streaming = true, vision = false, structured_outputs = false }

# CUSTOM AUTH ESCAPE (use either `auth.type` OR `auth_headers`, or both)
auth_headers = { Authorization = "Bearer ${MY_TOKEN}", X-Org = "${MY_ORG}" }

# CUSTOM REQUEST SHAPE ESCAPE (only when request_shape = "custom")
request_template = "{ \"model\": \"{{model}}\", \"input\": \"{{prompt}}\" }"

# STREAMING (consumed by M-AI-STREAMING-HELPER, v0.17.0)
[ai_provider.streaming]
enabled = true
delta_path = "$.choices[0].delta.content"
reasoning_path = "$.choices[0].delta.reasoning_content" # optional; o1, DeepSeek-R1
done_sentinel = "[DONE]"

# MODELS ALLOW-LIST (optional; if absent, prefix-match accepts any model)
[ai_provider.models]
allowed = ["llama-3.1-70b", "llama-3.1-8b"]

Request shapes (v1 catalog)

ShapeWire formatUse for
openai_chatOpenAI Chat Completions ({messages, model, max_tokens, ...})OpenAI, OpenRouter, Together, Groq, Anyscale, Fireworks, vLLM, llama.cpp openai-compat, anything else OpenAI-shaped
anthropic_messagesAnthropic Messages API ({messages: [{role, content: [{type:"text", text}]}], model, max_tokens, system})Anthropic native, Anthropic-shaped proxies
simple_completionSingle-prompt format ({prompt, model, max_tokens, ...})Ollama-style endpoints
customYour request_template Go templateAnything else (deferred to schema v2 — not implemented in v0.16)

Auth shapes

auth.typeWire effectRequired fields
bearerAuthorization: Bearer ${env}env
x-api-keyx-api-key: ${env}env
query-param?<name>=${env} appended to URLenv, name
noneNo auth header

For anything else (additional headers, custom prefixes, multi-header schemes), use auth_headers — a literal header dict with ${VAR} interpolation. ${VAR} patterns are restricted to uppercase-snake-case ([A-Z_][A-Z0-9_]*) for safety; literal substitution only, no shell expansion.

Capabilities

The TOML keys match the wire identifiers used in internal/ai/routing.go (the AICapability type). Routing-policy requirements (when an OpenRouter call needs tool_calling, etc.) and registration-time declarations share the same vocabulary.

KeyMeaning
tool_callingFunction/tool calling supported
json_modeJSON-mode output supported
streamingSSE token streaming supported. Surface: std/ai/streaming.callStream (v0.15.1+) and openaiCompatStream/anthropicStream (v0.15.0+)
visionMultimodal image input/output supported
structured_outputsSchema-enforced structured output (response_format = json_schema) supported

Calls requiring an unsupported capability fail fast with AIError{ code: "CapabilityNotSupported" }.

Recipe 1: vLLM (local, no auth)

# ailang.toml
[[ai_provider]]
schema_version = 1
name = "vllm"
endpoint = "http://localhost:8000/v1/chat/completions"
request_shape = "openai_chat"
response_path = "$.choices[0].message.content"
auth = { type = "none" }
cost = { input_per_1m_usd = 0.0, output_per_1m_usd = 0.0 }
capabilities = { tool_calling = false, json_mode = true, streaming = true, vision = false, structured_outputs = false }
ailang run --caps AI --ai vllm --model vllm/llama-3.1-70b my_app.ail

Recipe 2: llama.cpp (local server with custom auth header)

[[ai_provider]]
schema_version = 1
name = "llamacpp"
endpoint = "http://localhost:8080/completion"
request_shape = "simple_completion"
response_path = "$.content"
auth_headers = { X-Internal-Token = "${LLAMACPP_TOKEN}" }
cost = { per_call_usd = 0.0 }

The X-Internal-Token header is constructed at call time from $LLAMACPP_TOKEN. If the env var is unset the call fails with a clear "missing credential" error before any HTTP request happens.

Recipe 3: Anthropic native via the config-driven path

This works but the built-in anthropic provider is recommended for production — it has features (tool use, image input, native streaming, error normalization) that the v1 schema doesn't yet cover. Use config-driven Anthropic when you specifically need to test the schema's expressiveness, or when wrapping a proxy that's Anthropic-shaped but doesn't have a dedicated provider.

[[ai_provider]]
schema_version = 1
name = "anthropic-via-config"
endpoint = "https://api.anthropic.com/v1/messages"
request_shape = "anthropic_messages"
response_path = "$.content[0].text"
auth = { type = "x-api-key", env = "ANTHROPIC_API_KEY" }
auth_headers = { anthropic-version = "2023-06-01" }
cost = { input_per_1m_usd = 3.0, output_per_1m_usd = 15.0 }
capabilities = { tool_calling = true, json_mode = false, streaming = true, vision = true, structured_outputs = false }

Multiple providers per package

A single ailang.toml may declare multiple [[ai_provider]] blocks — they're independent registrations. Useful when wrapping a multi-tenant proxy or shipping a curated set of related providers.

Conflicts and dispatch order

When multiple installed packages declare [[ai_provider]] blocks:

  • Within one package: duplicate names error at manifest load (per-manifest validation).
  • Across packages: duplicate names error at registration time, naming both source manifests so the user can resolve.
  • Built-in vs config-driven: built-in always wins at dispatch with a warning to stderr. Don't shadow built-in names.

What lives in the registry

The runtime tracks the harvested providers in a global registry (see internal/ai/registry.go). The dispatch chain in cmd/ailang/ai_handlers.go consults this registry after the built-in switch falls through.

Limitations (v0.15.0)

  • Custom auth flows (AWS SigV4, Azure AD, OAuth) — not supported; use the auth_headers escape with ${ENV_VAR} for static tokens, or rely on built-in providers for full custom auth.
  • Non-HTTP transports (gRPC, persistent WebSocket) — Go-side; not in v1 schema.
  • Streaming[ai_provider.streaming] blocks are runtime-supported via std/ai/streaming: callStream for the accumulated final string (v0.15.1+), openaiCompatStream/anthropicStream + the manual event loop for per-delta hooks (v0.15.0+).
  • Tool-use templatingcapabilities.tool_calling flag works, but the request-shape templating for tool definitions is deferred to schema v2.
  • Image input templating — same.
  • request_shape = "custom" + request_template — reserved in the schema; runtime support deferred to schema v2.
  • Go plugin / WASM provider extensions — explicitly rejected (see m-ai-provider-config.md Decision D2).

If you hit a limitation, file feedback via the MCP submit_feedback tool or open an issue at github.com/sunholo-data/ailang/issues.

See also