
Middle-Core Knowledge Loop (Loop 5)

Status: design note + deterministic skeleton

Why

Today middle-core is an open-loop compiler:

model.yaml -> generator -> contracts -> runtime -> evidence

Evidence is terminal. Knowledge landed by the knowledge-drop scenario becomes searchable knowledge-chunks and a knowledge-graph-snapshot, and then nothing flows back. The Labs north-star — "One Model, Many Projections" — implies the inverse arc too: what the platform ingests should be able to reshape the platform's own governed model. That closes the loop:

knowledge -> (extract) -> observations -> (propose) -> model delta -> (govern) -> model.yaml -> regenerate -> contracts -> ...

This note defines that loop and ships the deterministic middle arc (propose) as a skeleton. The non-deterministic arc (extract) is an explicit, pluggable boundary so the rest of the system keeps its deterministic, gate-testable guarantees.

The loop, arc by arc

| Arc | Input → Output | Determinism | Owner |
| --- | --- | --- | --- |
| 1. Ingest (exists) | knowledge-source → knowledge-chunks, knowledge-graph-snapshot, evidence-pack | deterministic | runtime KnowledgeDropScenarioRunner |
| 2. Extract (boundary, new) | chunks → observations.json (candidate concepts / object types / relationships) | non-deterministic (NLP/LLM), pluggable | nlp-engineer / knowledge-engineer |
| 3. Propose (skeleton, new) | observations.json + model.yaml → proposal.json (model delta) | deterministic | tools/modelgen/propose_model_evolution.py |
| 4. Govern (vocabulary exists) | proposal.json → decision-record (proposed → accepted) | human/agent gated | ontologist-ufo / knowledge-engineer + reviewer |
| 5. Apply + regenerate (exists) | accepted delta merged into model.yaml → regenerate | deterministic | generator + drift gate |

The determinism boundary

The only non-deterministic step is Extract. Everything downstream of it is a pure function of structured data, so the existing gates (model validation, drift, SHACL, OWL) still hold. propose never sees free text — it consumes a structured observations.json that an extractor (or a human, or an ontology agent) produced. This keeps the loop honest: a model change is only ever proposed by deterministic, reviewable diffing, and only ever applied through governance.
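One way to picture the boundary is as a small interface that propose depends on, with every extractor hidden behind it. The sketch below is illustrative only: the Extractor protocol, HandCuratedExtractor, run_propose, and the object_types / id field layout are assumptions made for this example, not the shipped observations.json contract.

```python
from typing import Protocol


class Extractor(Protocol):
    """Hypothetical boundary: anything that turns chunks into structured observations."""

    def extract(self, chunks: list[dict]) -> dict: ...


class HandCuratedExtractor:
    """A trivially deterministic stand-in: a human-authored observation set."""

    def __init__(self, observations: dict) -> None:
        self._observations = observations

    def extract(self, chunks: list[dict]) -> dict:
        # The chunks are ignored; the observations were written by hand.
        return self._observations


def run_propose(extractor: Extractor, chunks: list[dict], model_ids: set[str]) -> list[str]:
    """Downstream of the boundary: a pure function of structured data only."""
    observations = extractor.extract(chunks)
    # Assumed layout: a list of {"id": ...} entries under "object_types".
    observed = {item["id"] for item in observations.get("object_types", [])}
    return sorted(observed - model_ids)
```

An NLP or LLM extractor would implement the same extract method, so swapping it in would not change anything on the deterministic side.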

Governed, never auto-applied

propose does not mutate model.yaml. It emits a proposal.json shaped like the input to a decision-record, carrying a PROV-O-aligned provenance header (agent_id, activity_id, schema_version, recorded_at) that mirrors ProvenanceStamp. A reviewer (or an evidence gate; see Loop 1) must accept it before any model edit. This reuses the governance primitives the model already defines (decision-record, evidence-pack, capability-exercise) rather than inventing a side channel.
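The provenance header could be assembled roughly as follows. This is a minimal sketch: the four field names come from the note, while build_provenance, utc_now, the clock parameter, and the "1.0.0" version string are hypothetical names and values chosen for illustration.

```python
from datetime import datetime, timezone
from typing import Callable


def utc_now() -> str:
    """Default clock: the current UTC time as an ISO-8601 string."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_provenance(
    agent_id: str,
    activity_id: str,
    schema_version: str,
    clock: Callable[[], str] = utc_now,
) -> dict:
    """Build the header; the clock is injected so tests can pin recorded_at."""
    return {
        "agent_id": agent_id,
        "activity_id": activity_id,
        "schema_version": schema_version,
        "recorded_at": clock(),
    }


# Usage with a pinned clock, mirroring the --recorded-at flag.
header = build_provenance(
    "knowledge-engineer",
    "knowledge-drop",
    "1.0.0",  # placeholder; the real value is read from the model
    clock=lambda: "2026-05-25T00:00:00Z",
)
```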

The skeleton: propose_model_evolution.py

A deterministic "lift" that diffs structured observations against the current model and proposes only the genuinely new elements.

python tools/modelgen/propose_model_evolution.py \
  --model model/middle-core/model.yaml \
  --observations model/middle-core/examples/knowledge-loop-observations.example.json \
  --agent knowledge-engineer \
  --activity knowledge-drop \
  --recorded-at 2026-05-25T00:00:00Z      # optional; omit to stamp "now"
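For orientation, the flags above could be declared with argparse along these lines. This is a sketch rather than the script's actual parser: build_parser is a hypothetical name, and treating every flag except --recorded-at as required is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="propose_model_evolution.py")
    parser.add_argument("--model", required=True, help="path to model.yaml")
    parser.add_argument("--observations", required=True, help="path to observations.json")
    parser.add_argument("--agent", required=True, help="agent_id for the provenance header")
    parser.add_argument("--activity", required=True, help="activity_id for the provenance header")
    # Optional: when omitted, the caller stamps the current time instead.
    parser.add_argument("--recorded-at", default=None, help="fixed ISO-8601 timestamp")
    return parser
```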

Output (proposal.json):

  • provenance: agent_id, activity_id, schema_version (read from the model), recorded_at (injected clock, like ISerializationClock, so output is reproducible).
  • proposed_additions: ontology_concepts, object_types, relationship_types that are not already in the model (sorted by id).
  • already_present: observed ids the model already has (the proposal is idempotent: observations ⊆ model ⇒ empty proposal).
  • conflicts: observed elements that cannot be added consistently (e.g. a relationship whose from/to is neither in the model nor in the same proposal, or an object type referencing an unknown ontology concept).
  • status: always proposed.
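The diff behind these fields can be sketched in a few lines. The code below is an approximation under stated assumptions, not the script itself: the propose function, the "concept" key on object types, and the shape of conflict entries (id plus reason) are invented for illustration; only the section names and the from/to keys come from the description above.

```python
KINDS = ("ontology_concepts", "object_types", "relationship_types")


def _ids(section: dict, kind: str) -> set[str]:
    return {item["id"] for item in section.get(kind, [])}


def propose(model: dict, observations: dict) -> dict:
    additions = {kind: [] for kind in KINDS}
    already_present, conflicts = [], []

    def split(kind: str):
        """Separate observed items into ids the model has and fresh candidates."""
        known, fresh = _ids(model, kind), []
        for item in observations.get(kind, []):
            if item["id"] in known:
                already_present.append(item["id"])
            else:
                fresh.append(item)
        return known, fresh

    known_concepts, additions["ontology_concepts"] = split("ontology_concepts")
    concepts = known_concepts | {c["id"] for c in additions["ontology_concepts"]}

    known_objects, new_objects = split("object_types")
    for obj in new_objects:
        # Assumed key: an object type names its ontology concept under "concept".
        if obj.get("concept") in concepts:
            additions["object_types"].append(obj)
        else:
            conflicts.append({"id": obj["id"], "reason": "unknown ontology concept"})
    objects = known_objects | {o["id"] for o in additions["object_types"]}

    _, new_rels = split("relationship_types")
    for rel in new_rels:
        # Endpoints must resolve in the model or in this same proposal.
        if rel.get("from") in objects and rel.get("to") in objects:
            additions["relationship_types"].append(rel)
        else:
            conflicts.append({"id": rel["id"], "reason": "unresolved from/to"})

    def by_id(item: dict) -> str:
        return item["id"]

    return {
        "proposed_additions": {k: sorted(v, key=by_id) for k, v in additions.items()},
        "already_present": sorted(already_present),
        "conflicts": sorted(conflicts, key=by_id),
        "status": "proposed",
    }
```

Feeding the model back in as its own observations yields empty additions and no conflicts, which is the idempotence property stated in the list.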

Determinism: ids are validated (kebab-case for object and relationship ids, PascalCase for concepts), every list is sorted, and the result is serialized with json.dumps(..., sort_keys=True, indent=2) plus a trailing newline. Re-running with the same inputs, including the same --recorded-at value, yields byte-identical output.
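The validation and serialization rules might look like this in practice. The json.dumps call is the one named above; the two regular expressions and the helper names validate_id and serialize are assumptions about what "kebab-case" and "PascalCase" mean here, so the real script may be stricter or looser.

```python
import json
import re

# Assumed patterns; the script's exact rules are not specified in this note.
KEBAB = re.compile(r"[a-z][a-z0-9]*(-[a-z0-9]+)*")
PASCAL = re.compile(r"[A-Z][A-Za-z0-9]*")


def validate_id(identifier: str, *, concept: bool) -> None:
    """Raise ValueError unless the id matches the casing rule for its kind."""
    pattern = PASCAL if concept else KEBAB
    if not pattern.fullmatch(identifier):
        expected = "PascalCase" if concept else "kebab-case"
        raise ValueError(f"{identifier!r} is not {expected}")


def serialize(proposal: dict) -> str:
    """Sorted keys, fixed indent, trailing newline: key order cannot leak into the bytes."""
    return json.dumps(proposal, sort_keys=True, indent=2) + "\n"
```

Because keys are sorted at serialization time and lists are sorted before it, two runs over the same inputs cannot differ by dictionary or iteration order.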

How this composes with the other loops

  • Loop 1 (evidence-gated promotion) is the natural Govern gate: a proposal can be required to carry a passing capability-exercise + complete evidence-pack before a decision-record may move proposed → accepted.
  • Loop 4a (single-source gUFO stereotypes) is what lets a proposed ontology_concept carry a real stereotype (Kind/SubKind/EventType/Relator), so a lifted concept lands as a first-class gUFO commitment, not a bare string.
  • Agent visibility (Loop 6, delivered) — the model now declares an Actor gUFO Kind with agent as its SubKind, plus the provenance links agent --performs--> work-packet and evidence-pack --attributed-to--> agent (PROV-O wasAttributedTo). The proposal's agent_id and ProvenanceStamp.AgentId now correspond to a first-class ontology node. Remaining: wiring the runtime to attach an agent node and an attributed-to edge to the pinned scenario graph (changes the UI node/edge counts, so it ships with the UI update).
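The Loop 1 gate from the first bullet can be read as a pure predicate over the proposal's supporting evidence. The sketch below assumes a GateInputs record and a next_status function, both hypothetical; how a capability-exercise result and evidence-pack completeness are actually represented is not defined in this note, and a human reviewer may still sit in front of or behind such a gate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateInputs:
    status: str                       # current decision-record status
    capability_exercise_passed: bool  # assumed boolean summary of the exercise
    evidence_pack_complete: bool      # assumed boolean summary of the pack


def next_status(inputs: GateInputs) -> str:
    """Move proposed -> accepted only when both evidence conditions hold."""
    if inputs.status != "proposed":
        return inputs.status  # the gate only acts on proposals
    if inputs.capability_exercise_passed and inputs.evidence_pack_complete:
        return "accepted"
    return "proposed"
```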

What is intentionally not here

  • Extraction. No NLP/LLM. The extractor is a boundary; observations.json is its contract. A reference extractor can be added later behind that contract without touching the deterministic core.
  • Auto-apply. Merging an accepted delta into model.yaml is deliberately left to the governance step (human/agent + the existing drift gate), not automated by the skeleton.

Future arcs

  • A reference extractor (extract_observations.py) over knowledge-chunk excerpts.
  • Round-trip lift from external OWL/LinkML contributions into observations.json (the LinkML projection is already the structural IR).
  • An apply_proposal.py that merges an accepted proposal and runs the generator + gates in one governed step, emitting the decision-record and evidence-pack as it goes.