ARC-ADR-003 — Certified-Module Auto-Discovery & Deploy
- Status: Proposed (number to be reconciled with the hub ADR index)
- Date: 2026-05-28
- Deciders: middle-core maintainers
- Consulted: generator team (external repo), backend-core
- Informed: AgentArmy hub, frontend-core
- Extends: ARC-ADR-001 (pub/sub broker — NATS JetStream + CloudEvents)
Context and Problem Statement
The model→code generator (modelgen) has been split into its own repository. It
remains the factory that turns model.yaml into certified code modules — the
generated C# contracts (templates/middle-core/generated/*.g.cs), the compiled LangGraph
graph, typed clients, etc. Until now those modules were produced in-tree; now they are
produced elsewhere and must be delivered to middle-core.
middle-core needs a way to:
1. Auto-discover when the external generator has certified a new module (or a new version of one), without middle-core polling the generator or the generator enumerating consumers.
2. Deploy that module into the running middle-core so the platform can, in effect, rebuild itself from freshly certified output.
3. Do (1) and (2) over the messaging fabric that already exists (ARC-ADR-001), with the smallest possible new surface; speed of delivery is the primary driver for v1.
Decision
A certified module is announced as a CloudEvent on NATS JetStream and pulled by middle-core over the same connection, then staged and activated at runtime. Every trust/transport/deploy concern is behind a narrow seam so the fast v1 can be hardened later without touching callers.
1. Transport — reuse the secure channel (fastest)
The generator publishes the module bytes into a NATS JetStream Object Store bucket
(aax-certified-modules) and emits a CloudEvent announcing it on subject
aax.fleet.module.certified.v1 (the aax.fleet.* convention from the NATS smoke test).
middle-core is already connected to NATS (ARC-ADR-001), so there is no new registry, no
new credential, no new network path — the artifact rides the channel that is already
trusted and wired. The fetch is abstracted behind a Fetcher protocol, so an OCI registry
(ORAS) or a blob bucket can replace the Object Store later with no change to the
orchestrator.
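The Fetcher seam described above can be sketched as a small Python protocol. This is a minimal illustration, not the skeleton's actual code: the ArtifactRef fields mirror the event's artifact object, while InMemoryFetcher and fetch_module are hypothetical names used only to show that callers depend on the protocol rather than on a concrete store.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ArtifactRef:
    """Where the bytes live; fields mirror the event's `artifact` object."""
    store: str
    bucket: str
    object: str


class Fetcher(Protocol):
    """The seam: the orchestrator only ever calls fetch()."""

    def fetch(self, ref: ArtifactRef) -> bytes: ...


class InMemoryFetcher:
    """Hypothetical stand-in for an Object Store backed fetcher (tests, local runs)."""

    def __init__(self, objects: dict[tuple[str, str], bytes]) -> None:
        self._objects = objects

    def fetch(self, ref: ArtifactRef) -> bytes:
        try:
            return self._objects[(ref.bucket, ref.object)]
        except KeyError:
            raise FileNotFoundError(f"{ref.bucket}/{ref.object}") from None


def fetch_module(fetcher: Fetcher, ref: ArtifactRef) -> bytes:
    """Caller code is typed against the protocol, so the store can be swapped."""
    return fetcher.fetch(ref)
```

An ORAS or blob-bucket fetcher would implement the same single method, which is why the orchestrator would not need to change.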
2. Deploy — runtime staging + in-process registry ("the system builds itself")
On discovery, middle-core writes the verified bytes to a content-addressed staging
directory and registers/activates the module in an in-process ModuleRegistry. This
is the no-git path: the running service adopts the new certified module without a commit
or redeploy. A GitVendorSink stub is kept as a pluggable alternative for when an audited
PR trail is wanted (it would open a vendor PR and let existing CI drift-gates take over) —
but that is explicitly not the v1 default.
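A minimal sketch of content-addressed staging plus an in-process registry, under stated assumptions: the directory layout (sha256/<hex>), the stage function, and the ModuleRegistry method names are illustrative and not taken from the skeleton. The point it shows is that the staged path is derived from the bytes themselves, so staging the same module twice lands on the same file.

```python
import hashlib
from pathlib import Path
from typing import Optional


def stage(root: Path, data: bytes) -> Path:
    """Write bytes under a path derived from their SHA-256; re-staging is a no-op."""
    digest = hashlib.sha256(data).hexdigest()
    target = root / "sha256" / digest
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        tmp = target.parent / (digest + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(target)  # rename into place so readers never see a partial file
    return target


class ModuleRegistry:
    """Hypothetical registry: module id -> (active version, staged path)."""

    def __init__(self) -> None:
        self._active: dict[str, tuple[str, Path]] = {}

    def activate(self, module_id: str, version: str, path: Path) -> None:
        self._active[module_id] = (version, path)

    def active(self, module_id: str) -> Optional[tuple[str, Path]]:
        return self._active.get(module_id)
```

How an activated module is actually loaded into the running service is outside this sketch.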
3. Trust — checksum now, pluggable to signatures later
v1 verification is a SHA-256 digest match: the CloudEvent carries artifact.digest
(sha256:…); middle-core hashes the fetched bytes and rejects on mismatch before
anything is staged. Verification sits behind a Verifier protocol, so cosign/sigstore
signatures or a PROV-O evidence-pack attestation (ADR-0002 lineage) can be layered in later
without changing the discovery flow. Nothing unverified is ever staged or activated.
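The v1 digest check could look roughly like the following. It is a sketch under assumptions: the Verifier method signature, the Sha256Verifier class, and the VerificationError exception are hypothetical names, and only the sha256:<hex> digest format comes from the event payload.

```python
import hashlib
import hmac
from typing import Protocol


class VerificationError(Exception):
    """Raised when fetched bytes do not match what the event announced."""


class Verifier(Protocol):
    """The seam: a signature or attestation verifier would offer the same method."""

    def verify(self, data: bytes, expected_digest: str) -> None: ...


class Sha256Verifier:
    """v1 check: recompute SHA-256 and compare with the `sha256:<hex>` digest."""

    def verify(self, data: bytes, expected_digest: str) -> None:
        algo, sep, expected_hex = expected_digest.partition(":")
        if algo != "sha256" or not sep or not expected_hex:
            raise VerificationError(f"unsupported digest format: {expected_digest!r}")
        actual_hex = hashlib.sha256(data).hexdigest()
        # Constant-time comparison on bytes; also tolerates non-ASCII input.
        if not hmac.compare_digest(
            actual_hex.encode("utf-8"), expected_hex.lower().encode("utf-8")
        ):
            raise VerificationError("digest mismatch")
```

Raising on failure, rather than returning a flag, makes it harder for a caller to stage bytes by forgetting to check a return value.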
4. Messaging shape
| Concern | Choice |
|---|---|
| Transport | NATS JetStream (ARC-ADR-001) |
| Announce subject | aax.fleet.module.certified.v1 |
| Envelope | CloudEvents v1.0 JSON, type = com.agentarmy.module.certified.v1 |
| Artifact store | NATS JetStream Object Store bucket aax-certified-modules |
| Consumer | durable JetStream consumer middle-core-module-autodiscovery (at-least-once; ack after stage) |
| Contract | contracts/module-certified.cloudevent.schema.json |
The event data payload (see the schema for the normative definition):
    {
      "moduleId": "middle-core/data-platform-contracts",
      "version": "1.4.0",
      "target": "csharp",
      "artifact": {
        "store": "nats-object",
        "bucket": "aax-certified-modules",
        "object": "middle-core/data-platform-contracts/1.4.0.tar.zst",
        "contentType": "application/zstd",
        "sizeBytes": 12345,
        "digest": "sha256:…"
      },
      "provenance": { "modelSnapshot": "sha256:…", "generatorCommit": "…" },
      "issuedAt": "2026-05-28T22:00:00Z"
    }
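As an illustration of the "parse + validate envelope" step, the sketch below reads a structured-mode CloudEvents JSON message and extracts the fields the later steps need. The ModuleCertified dataclass and parse_module_certified function are hypothetical names; the JSON schema in contracts/ remains the normative definition, and this sketch checks only the event type and the presence of the fields it reads.

```python
import json
from dataclasses import dataclass

EVENT_TYPE = "com.agentarmy.module.certified.v1"


@dataclass(frozen=True)
class ModuleCertified:
    module_id: str
    version: str
    target: str
    bucket: str
    object_name: str
    digest: str


def parse_module_certified(raw: bytes) -> ModuleCertified:
    """Parse a CloudEvents JSON envelope; raises ValueError if it is malformed."""
    try:
        envelope = json.loads(raw)
        if envelope["type"] != EVENT_TYPE:
            raise ValueError(f"unexpected event type: {envelope['type']!r}")
        data = envelope["data"]
        artifact = data["artifact"]
        return ModuleCertified(
            module_id=data["moduleId"],
            version=data["version"],
            target=data["target"],
            bucket=artifact["bucket"],
            object_name=artifact["object"],
            digest=artifact["digest"],
        )
    except (KeyError, TypeError, json.JSONDecodeError) as exc:
        # Missing fields and wrong shapes all surface as one error type.
        raise ValueError(f"malformed module-certified event: {exc!r}") from exc
```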
Flow
    generator (external repo)                 middle-core (agent_runtime/modules)
    ─────────────────────────                 ───────────────────────────────────
    certify module
    put bytes → JetStream Object Store
    publish CloudEvent ─────────────────────▶ durable consumer on
      aax.fleet.module.certified.v1           aax.fleet.module.certified.v1
                                                │
                                                ├─ parse + validate envelope   (events.py)
                                                ├─ fetch bytes by ArtifactRef  (fetch.py: Fetcher)
                                                ├─ verify sha256 digest        (verify.py: Verifier)
                                                ├─ stage content-addressed     (sink.py: ModuleSink)
                                                └─ register + activate         (registry.py)
                                              ack ──────────────────────────▶ (re-delivered if not acked)
A failed verification or a malformed event produces a rejected DiscoveryResult. Such a
message is not left to be redelivered indefinitely (a poison loop); it is recorded and
dead-lettered (DLQ wiring is a follow-up; see Open questions).
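The flow above can be sketched as a single handler with every step injected as a callable. This is an assumption-heavy illustration: the handle signature, the DiscoveryResult fields, and the dead_letter callable are hypothetical, and the policy that any returned result settles the message while a raised exception leaves it unacked for redelivery is an assumption, not something this ADR specifies.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DiscoveryResult:
    status: str        # "activated" or "rejected"
    reason: str = ""


def handle(
    expected_digest: str,
    fetch: Callable[[], bytes],
    verify: Callable[[bytes, str], bool],
    stage_and_activate: Callable[[bytes], None],
    dead_letter: Callable[[str], None],
) -> DiscoveryResult:
    """Run fetch -> verify -> stage/activate; nothing unverified is staged.

    Assumed contract with the consumer loop: a returned result means the
    message can be settled; an exception means it stays unacked.
    """
    data = fetch()
    if not verify(data, expected_digest):
        # Record the failure instead of letting the message redeliver forever.
        dead_letter(f"digest mismatch, expected {expected_digest}")
        return DiscoveryResult("rejected", "digest mismatch")
    stage_and_activate(data)
    return DiscoveryResult("activated")
```

Because verification runs before stage_and_activate is ever called, a rejected module leaves no trace in the staging directory or the registry.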
Consequences
Positive
- Fastest possible v1: no new infra, rides the existing trusted channel.
- The platform can adopt freshly certified modules at runtime (the self-build loop).
- Every hard decision (transport, trust, deploy target) is a swappable seam, so hardening is additive, not a rewrite.
Negative / risks
- Object Store is fine for the module sizes we ship today; very large artifacts may later
warrant OCI/ORAS (the Fetcher seam covers this).
- Runtime hot-activation without a git trail trades auditability for speed; GitVendorSink
exists for when that trade is wrong.
- At-least-once delivery means handle() must be idempotent (it is: staging is
content-addressed and re-stage is a no-op).
Open questions (follow-ups, out of scope for v1)
- Dead-letter subject + retry policy for repeatedly-failing modules.
- Signature/attestation verifier (cosign / PROV-O evidence-pack): the Verifier seam.
- Emitting a com.agentarmy.module.deployed.v1 acknowledgement event back to the fleet.
- Rollback / pinning a known-good version when a newly activated module misbehaves.
References
- ARC-ADR-001: Pub/Sub broker selection (NATS JetStream + CloudEvents)
- contracts/module-certified.cloudevent.schema.json: the announce envelope
- agent_runtime/modules/: the consumer skeleton implementing this ADR
- scripts/middle-core/nats-smoke.sh: the aax.fleet.* CloudEvents round-trip precedent