# AXF Specification

**Version:** 0.1.0  
**Status:** Draft  
**Editors:** AXF contributors  
**Format name:** AXF  
**File extension:** `.axf`  
**MIME type:** `application/axf`  
**Module name:** `axf`

---

# Abstract

AXF is a text-based wire format for AI agent communication, tool invocation, and structured message exchange. It is designed to preserve the density and streaming ergonomics of legacy segment-based interchange systems such as EDI X12 and EDIFACT while adapting those ideas for modern agent workloads, tokenizer behavior, and schema discovery.

AXF defines a positional, segment-oriented encoding in which message semantics are primarily established by schema rather than by repeated key names. The format is intended to be:

- materially more token-efficient than equivalent JSON or XML payloads;
- human-debuggable in plain text terminals;
- stream-friendly and incrementally parseable;
- schema-discoverable in-band or via well-known URLs; and
- suitable for agent-to-agent, agent-to-tool, and tool-to-agent communication.

This document defines AXF v0.1, including lexical structure, message framing, headers, body segments, trailers, schema discovery, versioning expectations, and an illustrative comparison with JSON.

---

# Status of This Document

This document is a draft specification for experimentation and implementation feedback.

Version 0.1 is intentionally conservative. It establishes a minimal interoperable core and leaves some policy-heavy concerns—authentication, signatures, binary mode, compression negotiation, and advanced schema evolution—for future versions.

Implementers should treat this specification as stable enough for prototypes and controlled interop tests, but not yet as a final internet standard.

---

# 1. Conformance

The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **NOT RECOMMENDED**, **MAY**, and **OPTIONAL** in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.

An **AXF sender** is any implementation that produces an AXF message.

An **AXF receiver** is any implementation that parses or validates an AXF message.

A **schema** is any machine-readable definition that assigns meaning to segment identifiers, positional fields, cardinality, requiredness, types, and validation rules.

A **segment** is the fundamental line- or record-level unit of an AXF message.

An **element** is a positional field inside a segment.

A **component** or **sub-element** is a positional subdivision inside an element.

A **repetition** is one of several repeated values in a single element slot.

An **atomic word** is a reserved, top-level intent marker such as `QUERY`, `RESULT`, `DEFER`, `ERROR`, or `ACK`.

---

# 2. Design Goals

## 2.1 Token Efficiency

AXF is designed to reduce token usage relative to structurally equivalent JSON and XML by:

- avoiding repeated property names in every message instance;
- relying on schema-defined positional meaning;
- using short segment identifiers;
- minimizing punctuation variety; and
- enabling compact, repetitive message patterns that tokenize favorably.

An AXF message is intended to achieve roughly a **60–80% token reduction** relative to a comparable JSON payload when the same semantics can be expressed through a known schema and compact segment vocabulary.

This target is aspirational rather than normative. Actual savings depend on tokenizer, schema design, identifier length, and payload shape.

## 2.2 Human Readability

AXF MUST remain text-based and inspectable with ordinary terminal tools.

A developer SHOULD be able to:

- read a message in `cat`, `less`, or a log viewer;
- identify segment boundaries quickly;
- see high-level message intent without first decoding binary framing; and
- manually construct or repair small messages when debugging.

AXF does not aim to be self-explanatory without schema knowledge. It aims to be **human-debuggable**, not fully self-describing.

## 2.3 Streaming

AXF is designed for incremental parsing.

A receiver SHOULD be able to:

- parse the header before the full message is available;
- learn the message intent from the atomic word immediately;
- process complete segments as they arrive;
- route or reject a message before the final trailer is read; and
- validate segment count and checksum when the trailer is eventually received.

## 2.4 Schema Flexibility

AXF separates the transport syntax from the domain schema.

The core syntax SHOULD remain small and stable, while application domains define their own segment vocabularies and positional meanings. This allows:

- tool-call schemas;
- agent chat schemas;
- planning or delegation schemas;
- telemetry schemas; and
- future domain-specific conventions.

## 2.5 Tokenizer Friendliness

AXF is specifically intended for LLM-era systems.

Accordingly, it SHOULD:

- prefer short repeated delimiters over many punctuation classes;
- allow recurring prefixes and reserved words to become stable tokenizer units;
- avoid heavy nesting syntax where possible;
- support schemas that cluster semantically related values in predictable positions; and
- make “first-pass” intent legible to both machines and language models.

## 2.6 Interoperability Over Cleverness

The format SHOULD prefer explicit, regular syntax over aggressive micro-optimizations that make parsers fragile.

The parser model should be simple enough to implement in a few hundred lines in most languages.

---

# 3. Design Lineage and Inspirations

AXF does not pretend to be invented from thin air. It explicitly draws from several prior traditions.

## 3.1 EDI X12 and EDIFACT

The primary structural inspiration is EDI X12, with additional conceptual affinity to EDIFACT.

Those systems demonstrated, over roughly 40 years of production use, that:

- positional records can be dramatically denser than tagged trees;
- segment-oriented interchange works well in streaming and batch pipelines;
- compact delimiters are operationally practical; and
- schemas, implementation guides, and conventions can provide rich semantics without bloating each message.

AXF adopts that lesson directly: **battle-tested density is a feature, not a historical accident**.

AXF differs from X12 and EDIFACT in several important ways:

- it is designed for internet-native agents and tools rather than business document interchange;
- it assumes modern UTF-8 text rather than legacy character set constraints;
- it formalizes schema discovery over HTTP;
- it introduces explicit atomic intent words for agent workflows; and
- it targets tokenizer efficiency as a first-class design concern.

## 3.2 MessagePack and Other Dense Encodings

MessagePack, CBOR, protobuf, and similar formats show the value of compact encodings and schema-guided semantics.

AXF shares the goal of dense interchange but intentionally remains plain text for debuggability, transport convenience, and LLM legibility.

## 3.3 JSON and XML

JSON and XML provide broad interoperability and good tooling, but they pay repeated overhead in punctuation, field names, and deeply nested syntax. AXF is an answer to workloads where those costs matter materially.

## 3.4 MCP and Agent Protocols

Model Context Protocol (MCP) and other agent/tool messaging patterns provide useful modern examples of structured tool calls, responses, capabilities, and error envelopes.

AXF is not a replacement for every agent protocol. Instead, it is a candidate transport syntax and framing model for the messages those protocols exchange.

---

# 4. Overview of the Format

An AXF message consists of:

1. one **atomic word** indicating top-level intent;
2. one **header segment** declaring protocol metadata;
3. zero or more **body segments** defined by schema; and
4. one **trailer segment** declaring integrity metadata.

A canonical message therefore has this abstract shape:

```text
ATOMIC_WORD
FXH*version*sender*receiver*schema-ref*auth-slot
...
body segments...
...
FXT*segment-count*checksum
```

Where newline-delimited framing is used, each segment occupies one line.

Where tilde-delimited framing is used, each segment ends with `~`.

The atomic word is logically above the segment layer. It is not merely another data element in the header; it is a top-level framing cue intended to support immediate routing and coarse-grained intent handling.

---

# 5. Lexical Structure

## 5.1 Character Encoding

AXF text MUST be encoded as UTF-8.

Senders SHOULD emit Unicode text in NFC normalization form when practical.

Receivers MUST accept well-formed UTF-8 and MAY reject malformed byte sequences.

## 5.2 Reserved Delimiters

Version 0.1 defines the following canonical delimiter characters:

- **segment delimiter:** `~` or newline, depending on framing mode;
- **element delimiter:** `*`;
- **sub-element delimiter:** `:`;
- **repetition delimiter:** `^`;
- **escape character:** `?`.

These delimiters are fixed for v0.1 core interoperability.

Future versions MAY define negotiable delimiter profiles, but implementations claiming v0.1 compatibility MUST support the canonical set above.

## 5.3 Atomic Words

AXF reserves the following atomic words in v0.1:

- `QUERY`
- `RESULT`
- `DEFER`
- `ERROR`
- `ACK`

An atomic word MUST appear before the header segment.

An atomic word MUST occupy its own top-level frame token:

- in newline mode, it MUST appear on its own line; or
- in tilde mode, it MUST appear as its own atomic frame before the first segment.

The atomic word tier exists above segments because many agent systems need to know message intent before parsing schema-specific body content.

### 5.3.1 Semantics of Reserved Atomic Words

#### `QUERY`

Indicates a request for action, information, or computation.

Typical uses include:

- tool invocation;
- agent question/answer prompts;
- structured data retrieval; and
- delegated work requests.

#### `RESULT`

Indicates successful or at least substantive completion of a prior query.

A `RESULT` may represent:

- a final success;
- a partial result when allowed by schema; or
- a normal response containing data.

#### `DEFER`

Indicates that processing has been accepted but not yet completed.

A `DEFER` message SHOULD include body fields sufficient to correlate future completion, estimated timing, or continuation channels.

#### `ERROR`

Indicates unsuccessful processing.

An AXF `ERROR` is a transport-level or application-level negative result frame, not merely a string inside a successful payload.

#### `ACK`

Indicates receipt, acceptance, or protocol acknowledgment.

Schemas MAY distinguish subtypes of acknowledgments in body segments.

### 5.3.2 Extensibility of Atomic Words

Receivers MUST recognize the reserved atomic words above.

Schemas MAY define additional atomic words, but senders SHOULD prefer the reserved set when the semantics fit.

A receiver that encounters an unknown atomic word MAY reject the message or downgrade handling according to local policy.

### Rationale

Atomic words give the format a fast-path for routing:

- `ERROR` can be surfaced immediately;
- `DEFER` can update state without waiting for a full parse;
- `QUERY` can be directed to execution systems before schema-specific validation finishes.

This is one of the few places where AXF intentionally adds a visible abstraction not present in classic X12.

## 5.4 Segment Identifiers

Each segment begins with a segment identifier in the first element position.

A segment identifier:

- SHOULD consist of 2 to 6 uppercase ASCII letters or digits;
- MUST NOT contain delimiter characters;
- MUST be unique within a schema namespace for a given semantic role.

Examples:

- `FXH` — header
- `FXT` — trailer
- `CAL` — tool call
- `ARG` — argument
- `OBJ` — object field bundle
- `ERR` — error detail
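
The recommended identifier shape can be checked with a single regular expression. The following is a minimal sketch, not part of the specification; the helper name `is_recommended_segment_id` is hypothetical, and the check covers only the SHOULD-level shape (2 to 6 uppercase ASCII letters or digits), which also rules out delimiter characters.

```python
import re

# RECOMMENDED shape: 2 to 6 uppercase ASCII letters or digits.
# Delimiters (* : ^ ~ ?) are excluded automatically by the character class.
SEGMENT_ID_RE = re.compile(r"[A-Z0-9]{2,6}")


def is_recommended_segment_id(segment_id: str) -> bool:
    """Return True if segment_id follows the recommended identifier shape."""
    return SEGMENT_ID_RE.fullmatch(segment_id) is not None


for candidate in ("FXH", "CAL", "fxh", "C*L", "TOOLONG"):
    print(candidate, is_recommended_segment_id(candidate))
```

Because the shape is a SHOULD rather than a MUST, a receiver might log a warning on a mismatch instead of rejecting the message.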

## 5.5 Elements

Elements are positional fields separated by `*`.

Empty elements are allowed and represented by adjacent delimiters.

For example:

```text
CAL*weather.get*req-9**metric
```

In the example above, the fourth element (counting the segment identifier `CAL` as the first) is empty.

## 5.6 Sub-elements

Sub-elements divide a composite element using `:`.

Example:

```text
LOC*geo:40.7128:-74.0060
```

The meaning and cardinality of sub-elements are schema-defined.

## 5.7 Repetitions

Repeated values within a single element are separated by `^`.

Example:

```text
CAP*streaming^partial^idempotent
```

## 5.8 Escaping

The escape character is `?`.

A literal delimiter character inside element data MUST be escaped.

The v0.1 escape sequences are:

- `?*` for literal `*`
- `?:` for literal `:`
- `?^` for literal `^`
- `?~` for literal `~`
- `??` for literal `?`
- `?n` for newline when newline framing is used and a literal newline is needed inside data

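Receivers MUST locate structural delimiters first, treating any delimiter character that immediately follows an unescaped `?` as data, and MUST resolve escape sequences only after the segment has been split according to framing mode.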

Senders SHOULD avoid unnecessary escaping by choosing schemas that keep large free-text fields isolated and infrequent.
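
The escape rules above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the helper names `split_unescaped`, `unescape`, and `escape` are hypothetical, and the sketch assumes the sender always writes a literal newline as `?n` regardless of framing mode.

```python
ESCAPE = "?"
# Character following "?" -> literal it stands for (the v0.1 sequences).
UNESCAPE_MAP = {"*": "*", ":": ":", "^": "^", "~": "~", "?": "?", "n": "\n"}


def split_unescaped(text: str, delimiter: str) -> list[str]:
    """Split on delimiter, ignoring occurrences that follow the escape character.

    Escape pairs are kept intact in the pieces so a later split on another
    delimiter, and the final unescape pass, still see them.
    """
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == ESCAPE and i + 1 < len(text):
            current.append(text[i:i + 2])  # copy the escape pair untouched
            i += 2
        elif ch == delimiter:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def unescape(value: str) -> str:
    """Resolve escape sequences in a single already-split value."""
    out, i = [], 0
    while i < len(value):
        if value[i] == ESCAPE:
            if i + 1 >= len(value) or value[i + 1] not in UNESCAPE_MAP:
                raise ValueError(f"malformed escape sequence at offset {i}")
            out.append(UNESCAPE_MAP[value[i + 1]])
            i += 2
        else:
            out.append(value[i])
            i += 1
    return "".join(out)


def escape(value: str) -> str:
    """Escape delimiter characters and newlines in raw element data."""
    out = []
    for ch in value:
        if ch == "\n":
            out.append("?n")
        elif ch in "*:^~?":
            out.append(ESCAPE + ch)
        else:
            out.append(ch)
    return "".join(out)


elements = split_unescaped("NOTE*a?*b*c", "*")
print([unescape(e) for e in elements])
```

Splitting first and unescaping second keeps the two concerns separate: an escaped `*` never creates an element boundary, and the data only regains its literal form once the structure is known.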

## 5.9 Whitespace

Whitespace is significant inside element data and MUST be preserved exactly except where a schema explicitly defines trimming behavior.

Outside element data:

- leading and trailing whitespace around a whole segment line is NOT RECOMMENDED;
- receivers MAY reject messages containing extraneous whitespace outside the lexical grammar.

## 5.10 Comments

Comments are not part of the wire format and MUST NOT appear in canonical messages.

Diagnostic tooling MAY support out-of-band annotations for human display.

---

# 6. Framing Modes

AXF v0.1 supports two framing modes for segment boundaries:

1. **newline-delimited mode**; and
2. **tilde-delimited mode**.

Both modes carry the same logical content.

## 6.1 Newline-Delimited Mode

In newline-delimited mode:

- the atomic word occupies one line;
- each segment occupies one line; and
- line feed (`LF`, `U+000A`) terminates each frame.

Carriage return + line feed (`CRLF`) MAY be accepted by receivers, but senders SHOULD emit `LF`.

### Advantages

- easier to read in terminals and logs;
- easier to stream through line-oriented tooling;
- simpler visual scanning;
- less need to escape `~` in human text.

### Disadvantages

- literal embedded newlines in data become slightly more awkward;
- some legacy stream parsers expect explicit non-line record terminators;
- line normalization by intermediary systems can occasionally be lossy.

## 6.2 Tilde-Delimited Mode

In tilde-delimited mode, each segment ends with `~`, following the X12 tradition.

The atomic word MUST still be isolated as its own top-level frame before the first segment. Implementations MAY encode it as `QUERY~FXH*...~...` or as `QUERY\nFXH*...~...` so long as the atomic word remains unambiguous.

### Advantages

- direct continuity with X12-style record framing;
- clearer separation from platform-specific newline handling;
- convenient when messages are embedded in streams that already use newlines for logging.

### Disadvantages

- marginally less pleasant for humans to scan;
- more escaping pressure if `~` is common in payload text;
- slightly less natural for line-oriented Unix tooling.

## 6.3 Recommended Default

For agent-facing and developer-facing systems, **newline-delimited mode is RECOMMENDED as the default presentation and transport form**.

For environments that prioritize legacy EDI familiarity or explicit non-line record delimiters, tilde-delimited mode MAY be used.

Receivers SHOULD support both modes.

Schemas and transports MAY define a preferred mode, but SHOULD NOT alter the meaning of segments or elements based on framing mode alone.
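
One way to show that both modes carry the same logical content is a frame splitter that returns the atomic word and the segment list for either mode. The sketch below is illustrative only: `split_frames` and its `mode` argument are hypothetical names, the mode is assumed to be known in advance (for example from the transport profile), and in tilde mode the atomic word may be followed by either `~` or a newline.

```python
def _split_on_unescaped_tilde(text: str) -> list[str]:
    """Split on "~", leaving escape pairs such as "?~" inside the frame."""
    frames, current, i = [], [], 0
    while i < len(text):
        if text[i] == "?" and i + 1 < len(text):
            current.append(text[i:i + 2])
            i += 2
        elif text[i] == "~":
            frames.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    frames.append("".join(current))
    return frames


def split_frames(message: str, mode: str) -> tuple[str, list[str]]:
    """Return (atomic_word, segments) for mode "newline" or "tilde"."""
    if mode == "newline":
        # Tolerate CRLF from senders, as receivers MAY accept it.
        frames = [line.rstrip("\r") for line in message.split("\n")]
    elif mode == "tilde":
        frames = _split_on_unescaped_tilde(message)
        # The atomic word may be separated from FXH by a newline instead of "~".
        if "\n" in frames[0]:
            word, first_segment = frames[0].split("\n", 1)
            frames[0:1] = [word.rstrip("\r"), first_segment]
    else:
        raise ValueError(f"unknown framing mode: {mode!r}")
    if frames and frames[-1] == "":
        frames.pop()  # a trailing terminator leaves one empty piece
    if not frames:
        raise ValueError("empty message")
    return frames[0], frames[1:]


newline_form = "ACK\nFXH*0.1.0*a*b*s*\nFXT*2*none\n"
tilde_form = "ACK~FXH*0.1.0*a*b*s*~FXT*2*none~"
print(split_frames(newline_form, "newline") == split_frames(tilde_form, "tilde"))
```

Everything after this step (element splitting, schema lookup, validation) can then be written once and shared by both modes.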

### Rationale

The spec keeps `~` because X12 got something very right: a single segment delimiter is extremely efficient. But newline mode wins on everyday debuggability. v0.1 therefore allows both and recommends newline mode for modern agent stacks.

---

# 7. Message Structure

## 7.1 Canonical Order

A canonical AXF message MUST appear in this order:

1. atomic word;
2. header segment `FXH`;
3. zero or more body segments;
4. trailer segment `FXT`.

No body segment may appear before `FXH`.

No segment may appear after `FXT`.

## 7.2 Header and Trailer Reservation

The segment identifiers `FXH` and `FXT` are reserved by the core specification.

Schemas MUST NOT redefine their syntax or positional meaning.

## 7.3 Empty Body

A message MAY have no body segments between `FXH` and `FXT`.

This is valid, for example, for a bare acknowledgment.

---

# 8. Header Segment

## 8.1 Purpose

The header segment declares the minimal metadata needed to interpret, route, validate, and authorize an AXF message.

The core header segment identifier is `FXH`.

## 8.2 Header Syntax

The v0.1 header layout is:

```text
FXH*fx-version*sender-id*receiver-id*schema-ref*auth-slot
```

All six positions are reserved. Empty values are permitted where noted below.
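
As a rough illustration of the six-position layout, the header can be parsed into a small record. The `Header` class and `parse_header` function are hypothetical names, and the sketch makes two assumptions: header values contain no escaped `*`, and all six positions must be present even when `auth-slot` is empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    version: str
    sender_id: str
    receiver_id: str
    schema_ref: str
    auth_slot: str  # empty string when no credential is supplied


def parse_header(segment: str) -> Header:
    """Parse an FXH segment (without its terminator) into a Header."""
    elements = segment.split("*")  # assumption: no escaped "*" in header values
    if len(elements) != 6 or elements[0] != "FXH":
        raise ValueError("expected FXH with six positions")
    parts = elements[1].split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError("fx-version must be MAJOR.MINOR.PATCH")
    return Header(*elements[1:])


header = parse_header(
    "FXH*0.1.0*agent://planner.alpha*tool://weather.local*tool-call-v1*"
)
print(header.schema_ref, repr(header.auth_slot))
```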

## 8.3 Header Elements

### 8.3.1 Element 1: Segment Identifier

Value: `FXH`

### 8.3.2 Element 2: `fx-version`

The AXF protocol version.

This MUST be a semantic version string of the form `MAJOR.MINOR.PATCH`.

Example:

```text
0.1.0
```

### 8.3.3 Element 3: `sender-id`

A sender identifier meaningful to the communicating parties.

Examples include:

- `agent://planner.alpha`
- `tool://weather.local`
- `did:example:abc123`
- `urn:axf:node:west-2`

The core specification does not constrain the sender namespace beyond excluding delimiter characters unless escaped.

### 8.3.4 Element 4: `receiver-id`

A receiver identifier meaningful to the communicating parties.

The same identifier conventions as `sender-id` apply.

### 8.3.5 Element 5: `schema-ref`

A schema reference describing how to interpret body segment identifiers and positional fields.

This MUST be one of:

- an absolute HTTPS URL;
- a URN;
- a schema ID resolvable by prior agreement or well-known URL convention.

Examples:

- `https://axf.ai/.well-known/axf/tool-call-v1.axf-schema`
- `urn:fx:schema:tool-call:v1`
- `tool-call-v1`

### 8.3.6 Element 6: `auth-slot`

An optional authorization or credential slot.

This element MAY be empty.

When populated, it MAY contain:

- a bearer token reference;
- a capability handle;
- a proof identifier;
- an HMAC key ID reference; or
- another schema- or deployment-defined auth token.

The core specification does not standardize auth semantics in v0.1.

### Rationale

The header stays intentionally small. It should answer five questions fast:

1. what version is this;
2. who sent it;
3. who should receive it;
4. how should I interpret it; and
5. is there an auth hook I should consult.

That is the minimum viable envelope for dense machine traffic.

## 8.4 Additional Header Extensions

Schemas or profiles MAY define optional extension segments immediately after `FXH` for richer transport metadata, such as trace IDs, timestamps, correlation IDs, locale hints, or capability negotiation.

Such extensions MUST NOT alter the meaning of the reserved `FXH` positions.

---

# 9. Body Segments

## 9.1 General Rule

All body segment semantics are schema-driven.

The core syntax does not assign universal meaning to body segment identifiers other than `FXH` and `FXT`.

## 9.2 Positional Interpretation

Each body segment is interpreted positionally.

For a segment such as:

```text
CAL*weather.get*req-9*false*metric
```

the meaning of each element is determined by the schema, not by inline key names.

## 9.3 Recommended Schema Pattern

For interoperability, schemas SHOULD define for each segment identifier:

- segment purpose;
- minimum and maximum occurrences;
- ordered element list;
- required vs optional elements;
- type and value constraints;
- sub-element structure, if any;
- repetition rules, if any; and
- whether segment order is significant.

## 9.4 Segment Ordering

A schema MAY require body segments in:

- a fixed order;
- grouped blocks;
- a header-detail-summary pattern; or
- an order-insensitive set, where correlation IDs or segment types make order irrelevant.

When order is significant, receivers MUST preserve it.

## 9.5 Recommended Core Body Patterns

Although v0.1 does not standardize a full universal body vocabulary, the following patterns are RECOMMENDED for common agent workloads.

### 9.5.1 Call Segment

A schema MAY define a call segment such as `CAL` for top-level operation metadata.

Example:

```text
CAL*weather.getForecast*req-184*0*metric
```

Illustrative positional meaning of the elements that follow the segment identifier:

1. operation name
2. request ID
3. streaming allowed flag
4. units profile

### 9.5.2 Argument Segment

A schema MAY define repeated `ARG` segments, one per argument.

Example:

```text
ARG*location*Austin, TX
ARG*days*5
ARG*hourly*temp_c^precip_mm^wind_kph
```

This pattern is slightly less dense than fully positional schemas but often much more debuggable. v0.1 permits either style.
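
A receiver for the `ARG` style might fold the repeated segments into a mapping keyed by argument name. This is a sketch under assumptions: `collect_args` is a hypothetical helper, the layout `ARG*name*value` is taken from the example above rather than from any published schema, values are assumed to contain no escaped delimiters, and every value is returned as a list of repetitions.

```python
def collect_args(segments: list[str]) -> dict[str, list[str]]:
    """Collect ARG*name*value segments into {name: [repetition, ...]}."""
    args: dict[str, list[str]] = {}
    for segment in segments:
        elements = segment.split("*")
        if elements[0] != "ARG":
            continue  # other segment types are handled elsewhere
        if len(elements) < 3:
            raise ValueError(f"ARG needs a name and a value: {segment!r}")
        name, value = elements[1], elements[2]
        args[name] = value.split("^")  # "^" separates repeated values
    return args


body = [
    "ARG*location*Austin, TX",
    "ARG*days*5",
    "ARG*hourly*temp_c^precip_mm^wind_kph",
]
print(collect_args(body))
```

Type conversion (for example turning `days` into an integer) is left out because it depends on the schema.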

### 9.5.3 Fully Positional Segment Families

For maximum density, a schema MAY use short segment codes where all positions are pre-agreed.

Example:

```text
CAL*weather.getForecast*req-184*0*metric
LOC*Austin, TX*US*30.2672:-97.7431
RNG*2026-04-25*2026-04-29
OPT*temp_c^precip_mm^wind_kph*daily*json
```

This usually compresses better than repeated `ARG*name*value` segments.

### 9.5.4 Nested Structures

Nested structures SHOULD be represented through one of:

- dedicated child segments;
- composite elements using sub-elements;
- repeated segments with shared correlation identifiers.

Deep JSON-style nesting SHOULD be flattened into segment sequences where practical.

### Rationale

AXF’s real win comes from moving structure out of punctuation soup and into schema and repeated segment patterns. If an implementation keeps stuffing mini-JSON blobs inside elements, it missed the point a little.

## 9.6 Correlation and Identity

Schemas SHOULD define how segments correlate when multiple related objects appear in one message.

Common strategies include:

- a shared request ID element;
- sequence numbers;
- parent-child references;
- segment groups delimited by start/end markers.

## 9.7 Unknown Segments

If a receiver encounters a body segment identifier unknown to the referenced schema, it:

- MAY reject the message;
- MAY ignore the segment if the schema or profile allows forward-compatible extension; or
- MAY store the raw segment for later inspection.

The handling policy SHOULD be defined by schema or deployment profile.

---

# 10. Trailer Segment

## 10.1 Purpose

The trailer segment provides end-of-message integrity checks.

The core trailer segment identifier is `FXT`.

## 10.2 Trailer Syntax

The v0.1 trailer layout is:

```text
FXT*segment-count*checksum
```

## 10.3 Trailer Elements

### 10.3.1 Element 1: Segment Identifier

Value: `FXT`

### 10.3.2 Element 2: `segment-count`

The number of segments from `FXH` through `FXT`, inclusive.

The atomic word is **not** included in `segment-count`.

This count MUST be expressed as a base-10 integer.

### 10.3.3 Element 3: `checksum`

A checksum over the message content.

For v0.1, senders SHOULD use one of the following textual forms:

- `crc32:<8-hex>`
- `sha256:<64-hex>`
- `none`

Receivers MUST accept `none`.

Receivers SHOULD accept `crc32` and `sha256`.

When a checksum is present, the exact canonicalization input MUST be defined by the transport profile or schema profile. In the absence of such a profile, the checksum SHOULD be computed over the UTF-8 bytes of the message from the first byte of `FXH` through the last byte before `FXT`, using the framing actually transmitted.

### Rationale

Segment count catches truncation and obvious framing bugs. Checksum catches the sneakier gremlins.

## 10.4 Validation

A receiver SHOULD validate:

- that `FXT` is present;
- that `segment-count` matches the observed number of segments; and
- that checksum verification succeeds when a recognized checksum algorithm is used.

A receiver MAY continue processing despite checksum failure if local policy permits partial or best-effort handling, but it SHOULD surface the failure clearly.
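
The validation steps can be sketched as a function that returns a list of problems instead of raising, which fits the "surface the failure clearly" guidance. The names `compute_checksum` and `validate_trailer` are hypothetical. The sketch assumes newline framing, lowercase hexadecimal digests, and the default canonicalization from Section 10.3.3: the UTF-8 bytes from the first byte of `FXH` through the last byte before `FXT`, including each segment's terminating line feed.

```python
import hashlib
import zlib


def compute_checksum(algorithm: str, covered: bytes) -> str:
    """Return the textual checksum form for the covered bytes."""
    if algorithm == "crc32":
        crc = zlib.crc32(covered) & 0xFFFFFFFF
        return f"crc32:{crc:08x}"
    if algorithm == "sha256":
        return "sha256:" + hashlib.sha256(covered).hexdigest()
    raise ValueError(f"unrecognized checksum algorithm: {algorithm}")


def validate_trailer(segments: list[str]) -> list[str]:
    """Validate FXH..FXT segments (no terminators); return a list of problems."""
    if not segments or segments[-1].split("*")[0] != "FXT":
        return ["missing FXT trailer"]
    fields = segments[-1].split("*")
    if len(fields) != 3:
        return ["malformed FXT trailer"]
    _, count_text, checksum = fields
    problems = []
    # segment-count covers FXH through FXT inclusive; the atomic word is excluded.
    if not count_text.isdigit() or int(count_text) != len(segments):
        problems.append(f"segment-count {count_text} != observed {len(segments)}")
    algorithm = checksum.split(":", 1)[0]
    if algorithm in ("crc32", "sha256"):
        covered = "".join(s + "\n" for s in segments[:-1]).encode("utf-8")
        if compute_checksum(algorithm, covered) != checksum.lower():
            problems.append("checksum mismatch")
    return problems


print(validate_trailer(["FXH*0.1.0*a*b*s*", "CAL*x*req-1", "FXT*3*none"]))
```

An unrecognized checksum label is skipped rather than reported, matching the rule that verification applies only when a recognized algorithm is used.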

---

# 11. Schema Discovery

## 11.1 Goal

An AXF receiver needs a way to discover what body segment identifiers mean.

v0.1 supports two primary discovery models:

1. explicit schema reference in the header; and
2. resolution through a well-known URL convention.

Both models can coexist.

## 11.2 Inline Header Reference

The preferred discovery path is the `schema-ref` value in `FXH`.

If `schema-ref` is an absolute HTTPS URL, the receiver MAY fetch it directly.

If `schema-ref` is a URN or compact schema ID, the receiver MAY resolve it using local registry rules or the well-known convention below.

## 11.3 Well-Known URL Convention

A deployment or public host MAY expose schemas at:

```text
https://<authority>/.well-known/axf/<schema-id>.axf-schema
```

Example:

```text
https://axf.ai/.well-known/axf/tool-call-v1.axf-schema
```

When `schema-ref` is a compact identifier such as `tool-call-v1`, a receiver MAY combine it with a configured authority or a trusted sender domain to construct the lookup URL.
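
A resolver for the three `schema-ref` forms could look like the sketch below. The function names are hypothetical, and the policy is an assumption made for illustration: HTTPS URLs are used as-is, URNs are deferred to a local registry (signalled here by returning `None`), and compact identifiers are expanded against a configured authority.

```python
from typing import Optional
from urllib.parse import quote


def well_known_schema_url(authority: str, schema_id: str) -> str:
    """Build the well-known lookup URL for a compact schema identifier."""
    # Percent-encode the identifier so it cannot introduce extra path segments.
    return f"https://{authority}/.well-known/axf/{quote(schema_id, safe='')}.axf-schema"


def resolve_schema_ref(schema_ref: str, authority: str) -> Optional[str]:
    """Return a fetchable URL for schema_ref, or None if a registry is needed."""
    if schema_ref.startswith("https://"):
        return schema_ref
    if schema_ref.lower().startswith("urn:"):
        return None  # assumption: URNs are resolved by a local registry
    return well_known_schema_url(authority, schema_ref)


print(resolve_schema_ref("tool-call-v1", "axf.ai"))
```

In a real deployment the `authority` argument should come from configuration or a trusted sender domain, not from unauthenticated message content.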

## 11.4 Schema Media Type

The schema document format is not fully standardized in v0.1, but a schema resource SHOULD be served with a content type such as:

```text
application/axf-schema
```

or, pending registration, a text or JSON-based provisional type agreed by implementations.

## 11.5 Minimum Schema Contents

A v0.1 schema SHOULD define at minimum:

- schema identifier and version;
- supported AXF version range;
- permitted atomic words, if constrained;
- recognized segment identifiers;
- positional definition of each segment;
- requiredness and cardinality;
- type constraints;
- extension and forward-compatibility policy.

## 11.6 Caching

Receivers SHOULD cache schemas according to normal HTTP caching semantics where applicable.

Implementations MAY also pin schemas by digest or version to avoid repeated fetches and reduce trust ambiguity.

## 11.7 Failure Modes

If a receiver cannot resolve `schema-ref`, it MAY:

- reject the message;
- treat it as opaque for logging/storage only; or
- parse only the atomic word, header, and trailer.

Receivers SHOULD NOT guess segment meaning from identifier names alone except for diagnostics.

### Rationale

The whole point is to avoid repeating keys everywhere. That only works if the schema is easy to find. So the spec makes schema lookup a first-class citizen instead of an afterthought.

---

# 12. Worked Example: Tool Call Comparison

This section compares a representative tool call expressed as JSON and as AXF.

## 12.1 Example Semantics

The payload expresses a request to call a weather forecast tool with:

- tool name;
- request ID;
- location;
- number of days;
- units;
- requested output fields;
- one nested options object containing language and cache policy.

## 12.2 JSON Representation

```json
{
  "type": "tool_call",
  "intent": "query",
  "tool": "weather.getForecast",
  "request_id": "req-184",
  "args": {
    "location": "Austin, TX",
    "days": 5,
    "units": "metric",
    "fields": ["temp_c", "precip_mm", "wind_kph"],
    "options": {
      "lang": "en",
      "cache": "prefer"
    }
  }
}
```

## 12.3 AXF Representation

Example using newline framing and a hypothetical `tool-call-v1` schema:

```text
QUERY
FXH*0.1.0*agent://planner.alpha*tool://weather.local*tool-call-v1*
CAL*weather.getForecast*req-184*0*metric
LOC*Austin, TX
DAY*5
FLD*temp_c^precip_mm^wind_kph
OPT*en*prefer
FXT*7*none
```

Illustrative meaning by segment:

- `CAL` → operation name, request ID, stream flag, units
- `LOC` → location string
- `DAY` → day count
- `FLD` → requested output fields
- `OPT` → nested options object flattened to positional fields
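
On the sending side, the trailer count follows mechanically from the rule in Section 10.3.2: every segment from `FXH` through `FXT` is counted, and the atomic word is not. The sketch below assembles the example in newline mode; `build_message` is a hypothetical helper, and it always emits `none` as the checksum for simplicity.

```python
def build_message(atomic_word: str, header: str, body: list[str]) -> str:
    """Assemble a newline-framed message and append a matching FXT trailer."""
    segment_count = len(body) + 2  # FXH + body segments + FXT itself
    frames = [atomic_word, header, *body, f"FXT*{segment_count}*none"]
    return "\n".join(frames) + "\n"


message = build_message(
    "QUERY",
    "FXH*0.1.0*agent://planner.alpha*tool://weather.local*tool-call-v1*",
    [
        "CAL*weather.getForecast*req-184*0*metric",
        "LOC*Austin, TX",
        "DAY*5",
        "FLD*temp_c^precip_mm^wind_kph",
        "OPT*en*prefer",
    ],
)
print(message, end="")
```

With five body segments the trailer count is 7; a message with an empty body would carry a count of 2.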

## 12.4 Token Count Estimate

Exact counts depend on tokenizer and whitespace. Using a `cl100k_base`-style tokenizer as a rough reference, this example typically lands in the following range:

- JSON: approximately **85–105 tokens**
- AXF: approximately **30–42 tokens**

A plausible midpoint comparison is:

- JSON: **94 tokens**
- AXF: **36 tokens**

That yields a reduction of about **61.7%**.
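
The percentage is simply one minus the ratio of the two counts. A small helper makes it easy to repeat the calculation with counts from a real tokenizer; `token_reduction` is a hypothetical name, and the inputs below are the midpoint estimates from this section, not measured values.

```python
def token_reduction(baseline_tokens: int, compact_tokens: int) -> float:
    """Return the fractional reduction of compact_tokens relative to the baseline."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1 - compact_tokens / baseline_tokens


# Midpoint estimates from above: 94 JSON tokens versus 36 AXF tokens.
print(f"{token_reduction(94, 36):.1%}")  # prints 61.7%
```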

## 12.5 Why the Savings Occur

The savings come from several sources:

1. repeated field names such as `request_id`, `location`, `units`, `fields`, and `options` disappear from each message instance;
2. braces, brackets, quotes, commas, and colons are replaced by a smaller delimiter alphabet;
3. the nested object becomes a short positional segment;
4. arrays become repetition-delimited values instead of bracketed lists; and
5. the schema carries structure once instead of the payload carrying it every time.

## 12.6 Alternate Even-Denser Form

If the schema is more aggressively positional, the same message could be represented as:

```text
QUERY
FXH*0.1.0*agent://planner.alpha*tool://weather.local*tool-call-v1*
CAL*weather.getForecast*req-184*Austin, TX*5*metric*temp_c^precip_mm^wind_kph*en:prefer
FXT*3*none
```

This is denser still, but some implementations may prefer the previous multi-segment form because it is easier to inspect, stream, and evolve.

### Rationale

There is no free lunch. Max density usually means more schema dependence and less eyeball clarity. AXF deliberately lets deployments choose where to sit on that spectrum.

---

# 13. Streaming Considerations

## 13.1 Incremental Parse Model

A receiver SHOULD be able to parse an AXF message incrementally in this order:

1. read atomic word;
2. read and interpret `FXH`;
3. resolve or select schema;
4. parse body segments as they arrive;
5. validate `FXT` at end of stream.
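
The order above maps naturally onto a generator that emits one event per frame as lines arrive. This is a sketch under assumptions, not a reference parser: `parse_stream` and its event names are hypothetical, the input is newline-framed, elements are assumed to contain no escaped `*`, and schema resolution (step 3) is left to the caller when it receives the `header` event.

```python
from typing import Iterable, Iterator, List, Tuple

Event = Tuple[str, List[str]]


def parse_stream(lines: Iterable[str]) -> Iterator[Event]:
    """Yield ("atomic" | "header" | "segment" | "trailer", elements) events."""
    frames = (line.rstrip("\r\n") for line in lines)

    atomic_word = next(frames, None)
    if not atomic_word or "*" in atomic_word:
        raise ValueError("missing atomic word")
    yield ("atomic", [atomic_word])  # available before the header is read

    header = next(frames, None)
    if header is None or header.split("*")[0] != "FXH":
        raise ValueError("missing FXH")
    yield ("header", header.split("*"))

    seen = 1  # segments observed so far, starting with FXH
    for frame in frames:
        seen += 1
        elements = frame.split("*")
        if elements[0] == "FXT":
            count = elements[1] if len(elements) == 3 else ""
            if not count.isdigit() or int(count) != seen:
                raise ValueError("trailer count mismatch")
            yield ("trailer", elements)
            return  # anything after FXT is not consumed
        yield ("segment", elements)
    raise ValueError("missing FXT")


stream = ["QUERY\n", "FXH*0.1.0*a*b*s*\n", "CAL*x*req-1\n", "FXT*3*none\n"]
for kind, elements in parse_stream(stream):
    print(kind, elements)
```

Because the generator is lazy, a caller can act on the `atomic` and `header` events while later lines are still in flight.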

## 13.2 Early Routing

Because the atomic word appears first, a receiver can often route the message immediately:

- `QUERY` → executor path;
- `RESULT` → result aggregation path;
- `DEFER` → async state store;
- `ERROR` → failure path;
- `ACK` → transport/session state.
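
The routing table above translates directly into a lookup on the first frame. The path names and the `route` function are hypothetical, and sending unknown atomic words to a `reject` path is only one of the local policies the specification allows.

```python
# Illustrative mapping from reserved atomic words to processing paths.
ROUTES = {
    "QUERY": "executor",
    "RESULT": "result-aggregation",
    "DEFER": "async-state-store",
    "ERROR": "failure",
    "ACK": "session-state",
}


def route(first_frame: str) -> str:
    """Pick a processing path from the first frame, before any segment is parsed."""
    atomic_word = first_frame.rstrip("\r\n")
    return ROUTES.get(atomic_word, "reject")  # assumption: unknown words are rejected


print(route("DEFER\n"))
```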

## 13.3 Partial Processing

A receiver MAY process complete early segments before the full body is present, provided the schema permits such streaming behavior.

Examples:

- begin authorization checks once `FXH` is read;
- allocate request state after `CAL` is read;
- start result rendering as `ROW` or `TOK` segments arrive;
- surface `ERROR` detail immediately upon reading the relevant segment.

## 13.4 Segment Atomicity

Parsers SHOULD treat each segment as the smallest independently valid processing unit.

A receiver SHOULD NOT assume that a multi-segment logical object is complete until required companion segments are observed or the trailer is reached.

## 13.5 Transport Independence

AXF may be carried over:

- HTTP request or response bodies;
- WebSocket streams;
- TCP streams;
- message queues;
- append-only logs;
- standard input/output between cooperating processes.

The syntax does not require a particular transport so long as segment framing is preserved.

## 13.6 Backpressure and Long Results

Schemas intended for long-running or high-volume results SHOULD define chunk or continuation segments.

A `DEFER` followed later by one or more `RESULT` messages is RECOMMENDED for workflows where the producer cannot complete quickly.

## 13.7 Error Recovery in Streams

If a receiver detects malformed framing mid-stream, it MAY:

- abort parsing immediately;
- attempt resynchronization at the next clear segment boundary;
- emit a transport-specific parse error.

Resynchronization is more feasible in newline mode and when segment identifiers are strongly patterned.

---

# 14. Versioning Strategy

## 14.1 Protocol Version

The `fx-version` element in `FXH` identifies the AXF core protocol version.

AXF uses semantic versioning:

- **MAJOR** for incompatible syntax or core semantic changes;
- **MINOR** for backward-compatible additions or clarifications;
- **PATCH** for editorial fixes or non-breaking corrections.

## 14.2 Schema Version

Schemas SHOULD version independently from the core protocol.

A schema identifier MAY embed versioning, for example:

- `tool-call-v1`
- `urn:fx:schema:tool-call:1.2`
- `https://example.com/.well-known/axf/tool-call/1.0.axf-schema`

## 14.3 Forward Compatibility

Older receivers handling newer minor versions SHOULD:

- accept reserved core syntax they understand;
- ignore optional schema extensions when policy allows;
- reject only features they cannot safely interpret.

## 14.4 Unknown Elements and Segments

Forward compatibility is primarily a schema concern.

Schemas SHOULD explicitly state whether:

- trailing optional elements may be ignored;
- unknown extension segments may be skipped;
- new repetition values are allowed; and
- additional atomic words are fatal or non-fatal.

## 14.5 Major Version Breaks

A receiver encountering an unsupported major `fx-version` SHOULD reject the message unless a compatibility profile exists.
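
A minimal sketch of a version gate that follows sections 14.1, 14.3, and 14.5 might look like the following. The function name, the three result labels, and the decision to reject strings that are not exactly `MAJOR.MINOR.PATCH` are assumptions of this sketch.

```python
SUPPORTED_MAJOR = 0  # this sketch targets the 0.x line
SUPPORTED_MINOR = 1


def classify_version(fx_version: str) -> str:
    """Return 'ok', 'newer-minor', or 'reject' for an fx-version string."""
    try:
        major, minor, _patch = (int(part) for part in fx_version.split("."))
    except ValueError:
        # Wrong number of parts or a non-numeric part.
        return "reject"
    if major != SUPPORTED_MAJOR:
        return "reject"
    if minor > SUPPORTED_MINOR:
        # Proceed, but only with features the receiver can safely interpret.
        return "newer-minor"
    return "ok"
```

A compatibility profile, where one exists, would replace the unconditional `reject` for an unsupported major version.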

## 14.6 Recommended Evolution Rules

To support real-world interop, schema designers SHOULD prefer the following evolution order:

1. add optional trailing elements;
2. add optional extension segments;
3. add new segment identifiers gated by schema version;
4. only then consider reinterpreting existing positions.

Reinterpreting existing positional fields is the most compatibility-hostile change and SHOULD be avoided.
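
The first rule, adding optional trailing elements, works because an older receiver can read the positions it knows and disregard the rest. A sketch of that tolerant reader, assuming the schema allows unknown trailing elements to be ignored (`read_positional` is a hypothetical helper):

```python
from typing import Dict, List


def read_positional(elements: List[str], names: List[str]) -> Dict[str, str]:
    """Map elements to known names; pad missing ones, ignore extra trailing ones."""
    padded = elements + [""] * (len(names) - len(elements))
    # zip stops at the shorter sequence, so unknown trailing elements are dropped.
    return dict(zip(names, padded))


known = ["operation", "requestId", "stream"]
print(read_positional(["calendar.findOpenings", "req-77", "0", "iso8601"], known))
```

Here a receiver that only knows three positions still reads a four-element segment correctly, which is exactly what reinterpreting an existing position would break.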

### Rationale

Positional formats are wonderfully dense and occasionally vindictive about bad versioning discipline. If you change the meaning of slot 4 out from under everyone, chaos goblins win.

---

# 15. Error Handling

## 15.1 Parse Errors

A receiver SHOULD treat the following as parse errors:

- missing atomic word;
- missing `FXH`;
- missing `FXT`;
- malformed escape sequence;
- invalid segment structure under the claimed framing mode;
- unsupported major protocol version;
- trailer count mismatch when validation is required.

## 15.2 Schema Validation Errors

A message that is syntactically valid AXF but invalid under its schema SHOULD be reported as a schema or application validation error, not as a lexical parse error.
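
One way to keep the two error classes apart in code is to give each its own exception type. The sketch below is illustrative: the exception names and helpers are hypothetical, only a few of the 15.1 conditions are checked, only the five core atomic words are recognized, and the `CAL` rule is borrowed from the schema sketch in Appendix B.

```python
from typing import List


class AxfParseError(Exception):
    """The input is not well-formed AXF (section 15.1)."""


class AxfSchemaError(Exception):
    """Well-formed AXF that is invalid under its schema (section 15.2)."""


ATOMIC_WORDS = {"QUERY", "RESULT", "DEFER", "ERROR", "ACK"}  # core words only


def check_structure(segments: List[str]) -> None:
    """Raise AxfParseError for the structural problems this sketch covers."""
    if not segments or segments[0] not in ATOMIC_WORDS:
        raise AxfParseError("missing atomic word")
    if len(segments) < 2 or not segments[1].startswith("FXH*"):
        raise AxfParseError("missing FXH")
    if not segments[-1].startswith("FXT*"):
        raise AxfParseError("missing FXT")


def check_cal(elements: List[str]) -> None:
    """Schema-level check: CAL needs operation, requestId, and stream."""
    if len(elements) < 3 or not all(elements[:3]):
        raise AxfSchemaError("CAL is missing a required element")
```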

## 15.3 Error Responses

When appropriate to the transport, a receiver MAY return an AXF `ERROR` message describing:

- parse failure;
- schema resolution failure;
- authorization failure;
- validation failure;
- execution failure.

Example:

```text
ERROR
FXH*0.1.0*tool://weather.local*agent://planner.alpha*tool-call-v1*
ERR*REQ_SCHEMA*Unknown schema ref*tool-call-v999
FXT*3*none
```

---

# 16. Security Considerations

## 16.1 Auth Slot Is a Hook, Not a Full Security Model

The `auth-slot` in `FXH` exists to support integration, but v0.1 does not define a complete authentication or authorization framework.

Implementers MUST NOT assume that the presence of a non-empty auth slot is sufficient for trust.

## 16.2 Schema Trust

Schema resolution over HTTP introduces trust questions.

Receivers SHOULD:

- prefer HTTPS;
- validate host trust boundaries;
- cache trusted schema versions;
- optionally pin schema digests;
- avoid fetching arbitrary attacker-controlled schemas on hot paths without policy.
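
A sketch of how a receiver might apply these recommendations before and after fetching a schema is shown below. The allowlist, the pinned digest table, and both function names are hypothetical. Treating an unpinned URL as acceptable reflects that pinning is optional here, and a stricter deployment could refuse it instead.

```python
import hashlib
from urllib.parse import urlsplit

TRUSTED_HOSTS = {"example.com"}  # hypothetical host allowlist

SCHEMA_URL = "https://example.com/.well-known/axf/tool-call/1.0.axf-schema"
KNOWN_GOOD = b"id: tool-call-v1\n"  # stand-in for a schema body vetted earlier
PINNED_DIGESTS = {SCHEMA_URL: hashlib.sha256(KNOWN_GOOD).hexdigest()}


def may_fetch(schema_url: str) -> bool:
    """Allow only HTTPS URLs whose host is inside the trust boundary."""
    parts = urlsplit(schema_url)
    return parts.scheme == "https" and parts.hostname in TRUSTED_HOSTS


def digest_ok(schema_url: str, body: bytes) -> bool:
    """Compare a fetched body against its pinned SHA-256 digest, if one exists."""
    expected = PINNED_DIGESTS.get(schema_url)
    if expected is None:
        return True  # no pin configured for this URL in this sketch
    return hashlib.sha256(body).hexdigest() == expected
```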

## 16.3 Injection and Escaping

Any implementation converting AXF into command lines, SQL, JSON, HTML, or prompts MUST apply context-appropriate escaping and validation.

AXF density does not magically prevent injection; it just moves the punctuation around.
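
To make "context-appropriate" concrete, the sketch below passes one decoded element value into four different sinks using Python standard-library facilities. The value, the `lookup` command name, and the `jobs` table are made up for illustration.

```python
import html
import json
import shlex
import sqlite3

value = "x'; DROP TABLE jobs; --"  # a decoded AXF element, treated as untrusted

# Shell: quote the value as one argument instead of interpolating it raw.
command = "lookup " + shlex.quote(value)

# SQL: bind the value as a parameter rather than building the statement by hand.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (ref TEXT)")
db.execute("INSERT INTO jobs (ref) VALUES (?)", (value,))

# JSON: let the serializer produce the string literal.
payload = json.dumps({"ref": value})

# HTML: escape markup-significant characters before rendering.
fragment = "<td>" + html.escape(value) + "</td>"
```

Escaping handles syntax only. Whether the value is acceptable at all (length, character set, allowed operations) remains a separate validation step.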

## 16.4 Logging Exposure

Because AXF is plain text, logs may accidentally capture identifiers, tokens, or sensitive payloads. Deployments SHOULD redact or hash sensitive elements where appropriate.
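
A minimal redaction sketch for log lines follows. It assumes the auth slot is the final `FXH` element, as the header examples in this document suggest, and that header elements contain no escaped `*`. The helper name and the `sha256-` prefix are hypothetical.

```python
import hashlib


def redact_fxh(segment: str) -> str:
    """Replace a non-empty auth slot in an FXH segment with a truncated hash."""
    elements = segment.split("*")  # simplification: ignores "?*" escapes
    if elements[0] != "FXH" or len(elements) < 6 or not elements[5]:
        return segment
    digest = hashlib.sha256(elements[5].encode("utf-8")).hexdigest()[:12]
    elements[5] = "sha256-" + digest
    return "*".join(elements)


print(redact_fxh("FXH*0.1.0*agent://a*tool://b*demo-v1*secret-token"))
```

Hashing keeps log lines correlatable without exposing the token. For low-entropy secrets, dropping the value entirely may be the safer choice.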

## 16.5 Checksums vs Signatures

Checksums protect integrity against accidental corruption, not malicious tampering.

Sensitive deployments SHOULD layer cryptographic signing or transport security on top of AXF v0.1, which defines neither.
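
The difference can be sketched with the Python standard library. Anyone who alters the bytes can also recompute a CRC32 or SHA-256 value, whereas a keyed MAC cannot be recomputed without the key. Which bytes the checksum covers, the hex letter case, and the shared key are all assumptions of this sketch, and HMAC is shown only as one possible layering, not as something v0.1 defines.

```python
import hashlib
import hmac
import zlib

covered = b"FXH*0.1.0*a*b*s*\nCAL*op*req-1*0\n"  # assumed checksum coverage

# Unkeyed integrity values in the trailer's checksum notation.
crc = "crc32:{:08x}".format(zlib.crc32(covered))
sha = "sha256:" + hashlib.sha256(covered).hexdigest()

# Keyed integrity: requires a secret that an attacker does not hold.
key = b"shared-secret"  # hypothetical key, provisioned out of band
tag = hmac.new(key, covered, hashlib.sha256).hexdigest()


def verify(key: bytes, data: bytes, tag: str) -> bool:
    """Recompute the HMAC and compare it in constant time."""
    expected = hmac.new(key, data, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```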

---

# 17. Implementation Guidance

## 17.1 Parser Simplicity

A minimal parser can be implemented in three phases:

1. split the input into top-level frames using the configured framing mode;
2. decode escapes and split each segment into elements, then sub-elements and repetitions as needed;
3. apply header, trailer, and schema validation.
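
The three phases can be sketched as follows. This is an illustrative minimal parser, not a reference implementation: the function names are hypothetical, the escape set is taken from the grammar sketch in Appendix A, schema validation is omitted, and elements are left escaped so that callers can split sub-elements or repetitions only where the schema calls for it.

```python
from typing import Dict, List

ESCAPES = {"*": "*", ":": ":", "^": "^", "~": "~", "?": "?", "n": "\n"}


def split_unescaped(text: str, sep: str) -> List[str]:
    """Split on sep, leaving any '?'-escaped pair untouched."""
    parts, current, i = [], [], 0
    while i < len(text):
        if text[i] == "?" and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair intact
            i += 2
        elif text[i] == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts


def unescape(text: str) -> str:
    """Decode escape pairs; raise ValueError on a malformed sequence."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "?":
            if i + 1 >= len(text) or text[i + 1] not in ESCAPES:
                raise ValueError("malformed escape sequence")
            out.append(ESCAPES[text[i + 1]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_message(raw: str, frame_sep: str = "\n") -> Dict[str, object]:
    # Phase 1: split into top-level frames.
    frames = [f for f in split_unescaped(raw, frame_sep) if f]
    if not frames:
        raise ValueError("missing atomic word")
    # Phase 2: split each segment into (still escaped) elements.
    segments = [split_unescaped(f, "*") for f in frames[1:]]
    # Phase 3: header and trailer presence checks.
    if len(segments) < 2 or segments[0][0] != "FXH" or segments[-1][0] != "FXT":
        raise ValueError("missing FXH or FXT")
    return {"atomic": frames[0], "header": segments[0],
            "body": segments[1:-1], "trailer": segments[-1]}


def repetitions(raw_element: str) -> List[str]:
    """Split one escaped element on '^' and decode each repetition."""
    return [unescape(p) for p in split_unescaped(raw_element, "^")]
```

Splitting on unescaped delimiters before decoding is what keeps a literal `?*` inside a value from being mistaken for an element boundary.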

## 17.2 Canonical Emission Recommendations

Senders SHOULD:

- emit `QUERY`, `RESULT`, `DEFER`, `ERROR`, or `ACK` in uppercase;
- emit `FXH` immediately after the atomic word;
- use newline framing by default unless a profile specifies otherwise;
- avoid unnecessary empty trailing elements;
- emit `FXT` with an accurate segment count.

## 17.3 Pretty vs Dense Forms

Implementations MAY provide both:

- a dense canonical wire form; and
- a pretty diagnostic display form.

However, they SHOULD define clearly which form is signed, checksummed, or transmitted.

## 17.4 JSON Bridging

A bridge between JSON and AXF SHOULD be schema-aware.

Blind automatic conversion without a schema usually yields poor results and often degenerates into stuffing JSON blobs into AXF elements, which is legal but unimpressive.
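
A schema-aware bridge, by contrast, knows the element order and emits values positionally. The sketch below converts a JSON object into a `CAL` segment using the element order from the schema sketch in Appendix B. The function names, the `weather.forecast` operation, and the mapping of JSON booleans to `1`/`0` are assumptions.

```python
import json

CAL_ORDER = ["operation", "requestId", "stream", "units"]  # order per Appendix B


def escape(value: str) -> str:
    """Escape the '?' release character first, then delimiters and newlines."""
    value = value.replace("?", "??")
    for ch in "*:^~":
        value = value.replace(ch, "?" + ch)
    return value.replace("\n", "?n")


def to_text(value) -> str:
    """Render a JSON value as element text (booleans become 1 or 0)."""
    if value is None:
        return ""
    if isinstance(value, bool):
        return "1" if value else "0"
    return str(value)


def json_to_cal(document: str) -> str:
    data = json.loads(document)
    elements = [escape(to_text(data.get(name))) for name in CAL_ORDER]
    while elements and elements[-1] == "":
        elements.pop()  # avoid unnecessary empty trailing elements
    return "*".join(["CAL"] + elements)


print(json_to_cal('{"requestId": "req-77", "operation": "weather.forecast", '
                  '"stream": false, "units": "metric"}'))
```

The key names never reach the wire. Only the schema-defined positions do, which is where the density gain comes from.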

---

# 18. IANA and Registration Considerations

This document proposes the following media type for future registration:

```text
application/axf
```

It also proposes the conventional file extension:

```text
.axf
```

Schema resources MAY use a future media type such as:

```text
application/axf-schema
```

Formal registration details are out of scope for v0.1 but SHOULD be addressed before standardization.

---

# 19. Open Questions and Future Work

v0.1 intentionally leaves several issues open.

## 19.1 Binary Mode

Should AXF define a binary companion encoding that preserves the same logical model but uses numeric segment IDs or packed delimiters for even lower overhead?

Possible direction:

- text mode for debugging and LLM interoperability;
- binary mode for high-throughput service links;
- identical schema model across both encodings;
- a deterministic round-trip between text and binary forms.

## 19.2 Compression Negotiation

Should transports be able to negotiate compression at the AXF layer, or should compression remain entirely a concern of the underlying transport such as HTTP or WebSocket extensions?

A future profile might define:

- `gzip` or `zstd` content encoding recommendations;
- per-message compression hints;
- rules for checksumming compressed vs uncompressed bytes.

## 19.3 Authentication and Signing

v0.1 includes an `auth-slot` but leaves the semantics open.

Future versions may define:

- bearer token conventions;
- detached signatures;
- segment-level signatures for partial verification;
- sender identity binding to schema trust roots.

## 19.4 Canonicalization

Checksum and signature interoperability would benefit from a stronger canonicalization model.

Open questions include:

- whether newline and tilde modes should canonicalize to a single logical form;
- whether trailing empty elements should be normalized;
- whether Unicode normalization should be mandatory for signed content.

## 19.5 Schema Language

The schema discovery mechanism is specified in v0.1, but the schema document language itself is not fully standardized.

Future work should define a canonical `.axf-schema` format with:

- segment grammars;
- cardinality rules;
- types and enums;
- extension points;
- examples;
- compatibility policy.

## 19.6 Message Envelopes and Sessions

Some deployments may want additional session concepts such as:

- conversation IDs;
- causality chains;
- resumable streams;
- batched sub-messages;
- multiplexed channels over one connection.

These may be better expressed as profile-level conventions than as core syntax.

## 19.7 Rich Result Streaming

Agent systems often emit token streams, intermediate reasoning artifacts, tool events, and partial tables.

Future schemas or core extensions may define standard segment families for:

- token deltas;
- chunked text output;
- progress updates;
- citations;
- artifact references.

## 19.8 Registry and Governance

A healthy ecosystem may eventually need:

- a public registry for common schema IDs;
- reserved segment namespaces;
- profile registries for signing, transport, and compatibility;
- governance for future atomic words.

---

# 20. Example Canonical Messages

## 20.1 Basic Query

```text
QUERY
FXH*0.1.0*agent://orchestrator*tool://calendar*calendar-slot-v1*
CAL*calendar.findOpenings*req-77*0*iso8601
RNG*2026-04-25T09:00:00Z*2026-04-25T17:00:00Z
ATT*alice@example.com^bob@example.com
DUR*30
FXT*5*none
```

## 20.2 Deferred Response

```text
DEFER
FXH*0.1.0*tool://research*agent://planner*research-v1*
REF*job-991
ETA*2026-04-24T18:05:00Z
STS*accepted*queued
FXT*4*none
```

## 20.3 Error Response

```text
ERROR
FXH*0.1.0*tool://calendar*agent://orchestrator*calendar-slot-v1*
ERR*AUTH*Missing capability token
REF*req-77
FXT*4*none
```

---

# 21. Comparison Summary

Compared with JSON, AXF makes a different set of tradeoffs.

## 21.1 Advantages

- lower token cost for repetitive structured traffic;
- better fit for schema-governed machine interchange;
- straightforward stream parsing;
- easy terminal debugging;
- explicit top-level intent with atomic words.

## 21.2 Costs

- lower self-descriptiveness without schema access;
- stronger discipline required for versioning;
- more up-front schema design work;
- less ad hoc readability than verbose JSON for unfamiliar domains.

## 21.3 Best Fit

AXF is a strong fit when:

- the same message shape repeats often;
- token cost matters;
- schemas are stable or discoverable;
- incremental parsing matters;
- agents and tools exchange compact structured messages at scale.

It is a weaker fit when:

- payloads are mostly one-off human-authored documents;
- structure changes constantly without schema governance;
- the ecosystem already depends on generic JSON tooling more than density matters.

---

# 22. Conclusion

AXF v0.1 proposes a deliberately small core: atomic intent words, compact positional segments, a fixed header and trailer, schema discovery, and streaming-friendly framing.

Its central bet is simple: for agent systems, repeated keys and nested punctuation are often expensive noise. By reviving the dense segment discipline proven by X12 and similar systems—while modernizing it for UTF-8, HTTP-era schema discovery, and LLM token economics—AXF offers a practical candidate wire format for the next layer of machine-to-machine communication.

Whether it succeeds will depend less on syntax than on schema quality, implementation discipline, and interop testing. But as a v0.1 foundation, the shape is intentionally clear: compact, readable, streamable, and unapologetically schema-first.

---

# Appendix A. ABNF-Inspired Sketch

This appendix is illustrative and non-normative.

```abnf
message        = atomic-frame frame-sep header-segment frame-sep *(body-segment frame-sep) trailer-segment [frame-sep]
atomic-frame   = atomic-word
atomic-word    = "QUERY" / "RESULT" / "DEFER" / "ERROR" / "ACK" / custom-atomic
custom-atomic  = 1*(ALPHA / DIGIT / "-" / "_")

header-segment = "FXH" element-sep version element-sep value element-sep value element-sep value element-sep [value]
trailer-segment= "FXT" element-sep 1*DIGIT element-sep checksum
body-segment   = segment-id *(element-sep [element])
segment-id     = 2*6(UPALPHA / DIGIT)
element        = *char
subelement     = *char
repeat-value   = *char
value          = *char
checksum       = "none" / ("crc32:" 8HEXDIG) / ("sha256:" 64HEXDIG)

frame-sep      = LF / "~"
element-sep    = "*"
subelement-sep = ":"
repeat-sep     = "^"
escape         = "?"
char           = escaped / safe-char
escaped        = "?*" / "?:" / "?^" / "?~" / "??" / "?n"
safe-char      = %x20-7E / UTF8-non-ascii
```

---

# Appendix B. Minimal Schema Sketch

This appendix is illustrative and non-normative.

```yaml
id: tool-call-v1
axfVersion: ">=0.1.0 <1.0.0"
atomicWords:
  - QUERY
  - RESULT
  - ERROR
segments:
  CAL:
    repeat: 1
    elements:
      - name: operation
        type: string
        required: true
      - name: requestId
        type: string
        required: true
      - name: stream
        type: booleanish
        required: true
      - name: units
        type: enum
        values: [metric, imperial]
        required: false
  LOC:
    repeat: 0..1
    elements:
      - name: location
        type: string
        required: true
  DAY:
    repeat: 0..1
    elements:
      - name: days
        type: integer
        required: true
  FLD:
    repeat: 0..1
    elements:
      - name: fields
        type: repetition<string>
        required: true
  OPT:
    repeat: 0..1
    elements:
      - name: lang
        type: string
        required: false
      - name: cache
        type: enum
        values: [none, avoid, prefer, require]
        required: false
```

---

# Appendix C. Naming Guidance for Schema Authors

This appendix is non-normative.

Schema authors should:

- keep frequently repeated segment identifiers short;
- reserve the shortest identifiers for the hottest paths;
- use consistent segment families across related schemas;
- avoid mixing free text into high-frequency segments unless necessary;
- prefer positional stability over clever packing tricks.

A good AXF schema should feel boring in the best way: regular, dense, and easy to implement.

---

# Acknowledgments

AXF stands on the shoulders of EDI X12, EDIFACT, dense binary interchange formats, and modern agent protocol work. The central insight is old and still excellent: when machines already know the schema, repeating the same labels over and over is mostly decorative overhead.
