MCR-F4: Middle-core data-platform contract (consumer binding for the UDA)
Context and Problem Statement
The Universal Data Adapter (ADR 0001, epic #13) is meant to consume middle-core's
data-platform objects + projection interfaces — the contract the hub registry calls
MCR-F4 (DataPlatformContracts.g.cs). In the registry MCR-F4 is producer:
middle-core (RT7) → consumer: backend-core UDA (RT6), and its status is PENDING —
middle-core #11 (draft). backend-core #40 ("Bind UDA to MCR-F4") is therefore BLOCKED.
Two problems block a binding even once #11 ships:
- No language-neutral artifact. MCR-F4 is generated C#. backend-core's UDA is Python (FastAPI) with a Rust serving path — neither can bind to C# types. A neutral on-the-wire schema is required so both sides generate from one source of truth.
- No agreed type system / versioning across the hop. The UDA's own type system is Apache Arrow (ADR 0001: `introspect_schema` → `pyarrow.Schema`). The contract must map cleanly to Arrow and to C#, and be versioned + drift-guarded (the stack already uses Pact-style provider/consumer tests + a CI drift gate, ARC-ADR-005).
This ADR proposes the neutral wire contract for MCR-F4 so middle-core can ratify it and #11 can publish authoritative records/projections against it. It adds no consumer code — only the contract artifact and this rationale.
Decision Drivers
- Language-neutral: one schema; C# (producer) and Python/Rust (consumer) types are both generated from it, so no language is privileged.
- Arrow- and CDM-aligned: types map 1:1 to Arrow (UDA introspection) and to a CDM semantic vocabulary for cross-source alignment.
- Contract-first + drift-guarded: versioned (`schemaVersion`), pinned by the consumer, enforced in CI (consistent with ARC-ADR-005).
- Read-only projections: projection interfaces are views; they never mutate (the UDA maps each to a connector read).
- Minimal + extensible: scalars + array/map/record cover v1; additive evolution is non-breaking.
Considered Options
- Schema-first neutral contract: a JSON-Schema meta-model is the source of truth; middle-core's C# (`DataPlatformContracts.g.cs`) and backend-core's consumer types are both generated from it. (recommended)
- C#-first: keep `DataPlatformContracts.g.cs` authoritative; generate the neutral schema from the C#. Workable, but privileges one language and couples the cross-layer contract to a C# toolchain.
- Share C# types directly: rejected, because there is no viable Python/Rust binding and it violates the neutral-contract principle.
- OpenAPI-only or AsyncAPI-only: premature until we know whether the data platform is request/response (serve) or push (event stream). See open questions.
Decision Outcome
Proposed: Option 1, a schema-first, language-neutral MCR-F4 contract. The artifact is `contracts/proposed/mcr-f4.data-platform.schema.json` (JSON Schema 2020-12), with a worked `mcr-f4.example.json` that validates against it and doubles as the future consumer-test fixture.
Shape
- Envelope: `schemaVersion` (semver). Additive (new optional field / new record / new projection) = MINOR; breaking (remove/rename/retype required field, change cardinality) = MAJOR. Consumer pins MAJOR; CI drift gate compares pinned vs published.
- DataRecord: `name`, `fields[]` (`name`, `type`, `nullable`, `semanticType`), `primaryKey[]`.
- ProjectionInterface: `name`, `source` record, `cardinality` (one|many), `readOnly:true`, `fields[]` (subset/renamed view), `parameters[]` (filters the consumer supplies).
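The MINOR/MAJOR rule in the envelope can be sketched as a small diff over two revisions of a DataRecord. This is only an illustration, not the drift gate itself: `classify_change` is a hypothetical helper, `nullable` is used as a stand-in for "optional", and treating a removed field or a new non-nullable field as breaking is a conservative assumption that the ADR does not spell out. Changes to `primaryKey[]` and to projections are not covered.

```python
def classify_change(old: dict, new: dict) -> str:
    """Return "NONE", "MINOR" or "MAJOR" for a revision of one DataRecord."""
    old_fields = {f["name"]: f for f in old["fields"]}
    new_fields = {f["name"]: f for f in new["fields"]}

    # A removed field (a rename shows up as remove + add) or a retyped
    # field is treated as breaking.
    for name, field in old_fields.items():
        if name not in new_fields:
            return "MAJOR"
        if new_fields[name]["type"] != field["type"]:
            return "MAJOR"

    added = [f for name, f in new_fields.items() if name not in old_fields]
    # Assumption: only nullable additions count as "new optional field".
    if any(not f.get("nullable", False) for f in added):
        return "MAJOR"
    return "MINOR" if added else "NONE"


# Hypothetical record revisions, shaped after the DataRecord field list.
customer_v1 = {
    "name": "Customer",
    "fields": [{"name": "id", "type": "uuid", "nullable": False}],
    "primaryKey": ["id"],
}
customer_v2 = {
    "name": "Customer",
    "fields": [
        {"name": "id", "type": "uuid", "nullable": False},
        {"name": "email", "type": "string", "nullable": True},
    ],
    "primaryKey": ["id"],
}

print(classify_change(customer_v1, customer_v2))
```

A real gate would also need to diff projections and cardinality, which the rule above names as breaking changes.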
Cross-language type mapping (normative)

| Neutral | Arrow (UDA) | C# (MCR-F4) | JSON wire |
|---|---|---|---|
| `string` | `large_utf8` | `string` | string |
| `int32` | `int32` | `int` | number |
| `int64` | `int64` | `long` | number |
| `float32` | `float32` | `float` | number |
| `float64` | `float64` | `double` | number |
| `bool` | `bool_` | `bool` | boolean |
| `timestamp` | `timestamp[us, UTC]` | `DateTimeOffset` | RFC 3339 string |
| `date` | `date32` | `DateOnly` | ISO date string |
| `time` | `time64[us]` | `TimeOnly` | ISO time string |
| `bytes` | `large_binary` | `byte[]` | base64 string |
| `decimal` | `decimal128` | `decimal` | string |
| `uuid` | `large_utf8` (logical) | `Guid` | string |
| `json` | `large_utf8` | `JsonElement` | any |
| `array<T>` | `list<T>` | `IReadOnlyList<T>` | array |
| `map<string,T>` | `map<utf8,T>` | `IReadOnlyDictionary<string,T>` | object |
| `record<R>` | `struct` | nested type | object |
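
To make the JSON wire column concrete, the following is a minimal sketch of how a Python consumer or test fixture might encode scalar values, using only the standard library. `to_wire` is a hypothetical helper, not part of the contract, and the container types (`array`, `map`, `record`) are left out for brevity. Requiring timezone-aware timestamps is an assumption made here so the RFC 3339 output is unambiguous.

```python
import base64
import uuid
from datetime import date, datetime, time, timezone
from decimal import Decimal

# Neutral scalars whose Python value is already a valid JSON value.
PASS_THROUGH = {"string", "int32", "int64", "float32", "float64", "bool", "json"}


def to_wire(neutral_type: str, value):
    """Encode one Python value as the JSON wire form of a neutral scalar type."""
    if value is None:
        return None  # nullable fields travel as JSON null
    if neutral_type in PASS_THROUGH:
        return value
    if neutral_type == "timestamp":
        if value.tzinfo is None:
            raise ValueError("timestamp values must be timezone-aware")
        # Normalise to UTC; isoformat() yields an RFC 3339 compatible string.
        return value.astimezone(timezone.utc).isoformat()
    if neutral_type in ("date", "time"):
        return value.isoformat()
    if neutral_type == "bytes":
        return base64.b64encode(value).decode("ascii")
    if neutral_type in ("decimal", "uuid"):
        # Decimals travel as strings so no precision is lost to binary floats.
        return str(value)
    raise ValueError(f"unsupported neutral type: {neutral_type}")


print(to_wire("decimal", Decimal("12.50")))
print(to_wire("timestamp", datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)))
```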
Transport binding (proposed)
Sync request/response over HTTP/JSON for v1: each ProjectionInterface is invoked with its
parameters; the UDA materialises results as Arrow. (If middle-core's platform pushes
records, this becomes an AsyncAPI channel instead — see open questions.)
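
As an illustration of "invoked with its parameters", the sketch below builds a request body for one projection and rejects parameters the projection does not declare. Everything about the request shape is an assumption: the ADR does not define an endpoint or body format, `build_request` is a hypothetical helper, and each `parameters[]` entry is assumed to carry at least a `name`.

```python
import json


def build_request(projection: dict, supplied: dict) -> str:
    """Serialise a hypothetical request body for one ProjectionInterface call."""
    declared = {p["name"] for p in projection.get("parameters", [])}
    unknown = sorted(set(supplied) - declared)
    if unknown:
        raise ValueError(
            f"unknown parameters for {projection['name']}: {unknown}"
        )
    # Hypothetical body shape; the real binding is an open question.
    body = {"projection": projection["name"], "parameters": supplied}
    return json.dumps(body, sort_keys=True)


# Hypothetical projection, shaped after the ProjectionInterface field list.
active_customers = {
    "name": "ActiveCustomers",
    "source": "Customer",
    "cardinality": "many",
    "readOnly": True,
    "parameters": [{"name": "region", "type": "string"}],
}

print(build_request(active_customers, {"region": "EU"}))
```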
Consumer obligations (backend-core UDA, when ratified; not in this ADR)
- Pin `schemaVersion` (MAJOR) and validate incoming records/projections against the schema.
- Map each projection → a connector read; return Arrow/JSON.
- Add a Pact-style consumer contract test, drift-guarded in CI, using `mcr-f4.example.json` as the seed fixture; register it in the hub contracts registry.
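
The MAJOR pin in the first obligation could be checked along these lines. This is a sketch under assumptions: `parse_semver` and `check_pin` are hypothetical names, and the consumer is assumed to read the published `schemaVersion` from the contract document before accepting it.

```python
def parse_semver(version: str) -> tuple:
    """Split "MAJOR.MINOR.PATCH" into integers, ignoring pre-release/build tags."""
    core = version.split("+")[0].split("-")[0]
    major, minor, patch = core.split(".")
    return int(major), int(minor), int(patch)


def check_pin(pinned_major: int, published_version: str) -> None:
    """Raise if the published contract is outside the consumer's pinned MAJOR."""
    major, _, _ = parse_semver(published_version)
    if major != pinned_major:
        raise RuntimeError(
            f"contract drift: pinned MAJOR {pinned_major}, "
            f"published {published_version}"
        )


# A MINOR bump stays inside the pin; a MAJOR bump would raise.
check_pin(1, "1.4.2")
print(parse_semver("1.4.2"))
```

Because additive changes are MINOR, a consumer pinned to MAJOR 1 would keep accepting 1.x releases and fail only when a breaking release is published.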
Open questions (for hub / middle-core ratification)
- Source of truth: schema-first vs C#-first (Option 1 vs 2). This is the key governance call.
- Serve vs push: request/response (OpenAPI) or event stream (AsyncAPI)? Decides the transport binding above.
- CDM vocabulary: adopt a shared `semanticType` term set (ties to the BigQuery `introspect_schema` CDM work in ADR 0001).
- Artifact home: promote this from `contracts/proposed/` to the hub contracts registry once ratified, so middle-core and backend-core consume one copy.
Consequences
- Positive — unblocks a concrete, drift-safe path for #40; gives middle-core #11 a schema to publish against; no language is privileged; aligns with Arrow + ARC-ADR-005.
- Negative / risk — adds a schema-generation step to middle-core's build if Option 1 is chosen; the neutral type set may need extension as real records land.
- #40 stays BLOCKED until (a) this contract is ratified at the hub and (b) middle-core #11 publishes authoritative records/projections; only then is the consumer adapter + test built.
Links
- backend-core #40 (UDA ↔ MCR-F4 binding), epic #13, ADR 0001 (UDA)
- Hub: ARC-ADR-002 (JWT forwarding, accepted), ARC-ADR-005 (provider/consumer contracts), inter-layer contracts registry (`contracts_url` in `.agent/hub.json`)
- Artifacts: `contracts/proposed/mcr-f4.data-platform.schema.json`, `contracts/proposed/mcr-f4.example.json`