
The Generative Pipeline

How untool.ai turns an ontology into running platform code — a tour of the ontology-driven generative programming pipeline, its F# compiler core, the sift/sort authoring loop, the forge codegen container, and the MBSE + PPM framing that makes the platform self-building.


1. The thesis — ontology-driven generative programming

The untool.ai platform is not hand-written. It is projected from an ontology by a deterministic pipeline. You author meaning; the pipeline emits the code (C#, Rust, JavaScript, Zig, C++, SQL) — byte-for-byte the same on every run, on every machine, for every reviewer.

This is generative programming in the sense Czarnecki and Eisenecker defined it: a paradigm in which the application is generated on demand from a high-level specification (the domain model, here an ontology) by a configuration knowledge transformer (here the F# compiler core). The goal is not to "save typing" — it is to make the meaning the single source of truth, and the code a reproducible artifact of it.

Three foundational sources frame the approach:

  • Czarnecki & Eisenecker, Generative Programming: Methods, Tools, and Applications (Addison-Wesley, 2000). Established the vocabulary of domain engineering, feature modeling, and configuration knowledge. The core insight we keep: separate problem space (what the user means) from solution space (which language/runtime/library), with an explicit mapping in between. The ontology is our problem space; the projection layer (UFO/BFO/SQL/OAS) is the mapping; the generators are the solution space.
  • Cleaveland, Program Generators with XML and Java (Prentice Hall, 2001). A practical playbook for template-driven code generation that takes a structured source (XML, in 2001 — RDF/OWL/Turtle for us) and emits multiple targets from a single specification. The discipline we inherit: the generator is a function, not a script. Same input, same output, always.
  • Kelly & Tolvanen, Domain-Specific Modeling: Enabling Full Code Generation (Wiley/IEEE, 2008). Argues — with twenty industrial case studies — that full code generation from a domain model is achievable when (a) the metamodel is rigorous and (b) the generator is owned by the same team that owns the language. Our metamodel is the platform self-model (ontology/platform-self-model/); our generators live in the forge container; both move together.

The thesis, in one sentence: the ontology is the program, and the compiler is the platform.

Why this beats "just write the code"

Hand-written platform code suffers from three failures the fleet has felt:

  1. Drift between meaning and implementation. A C# class drifts from the business concept it represents because two people changed each independently. The ontology projection makes drift a compile error.
  2. Multiplication of effort across runtimes. Re-writing the same DataConnector in C# (for the BFF), Rust (for the broker), and SQL (for the warehouse) is mechanical labor that should be a function call.
  3. Loss of provenance. Hand-written code carries no trace back to the decision that produced it. Generated code carries the generator version, the ontology hash, and the projection profile in its header.

The generative pipeline solves all three by construction.


2. The F# compiler core (ADR-033)

The compiler that walks the ontology and emits target code is written in F#. The choice is deliberate. See ARC-ADR-033-fsharp-ontology-compiler-core for the full decision; the short form is below.

Why F#

| Property | Why it matters for an ontology compiler |
| --- | --- |
| Algebraic data types (discriminated unions) | The ontology AST is a tree of Class \| Property \| Restriction \| Axiom \| …. DUs encode this directly, no visitor boilerplate. |
| Exhaustive pattern matching | The compiler fails to build if a new ontology construct is introduced and any generator forgets to handle it. The type system enforces projection completeness. |
| Computation expressions | The sift loop and the projection pipeline are monadic by nature (Section 3, Section 7). F# computation expressions give us result { … }, async { … }, and a custom projection { … } builder with minimal ceremony. |
| Low ceremony for tree transforms | A catamorphism over the ontology AST is a fold: three lines of F#, no framework. The same operation in Java/C# is a Visitor hierarchy. |
| .NET interop | The compiler binds to dotNetRDF, RDFSharp, and the existing C# spoke runtime with zero shim. Cross-platform via the .NET 9 runtime: runs the same on Windows, Linux, and the forge container. |
| REPL (FSI) | Ontology authors can dotnet fsi against a live AST. The sift loop's "explain this projection" affordance falls out of FSI for free. |
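To make the first two rows concrete, here is a minimal sketch. It is written in Python purely for illustration (the real core is F#, where the completeness check happens at compile time rather than at startup). The node types, the `describe_*` handlers, and `missing_handlers` are hypothetical names, not part of the actual compiler.

```python
from dataclasses import dataclass

# Hypothetical node types standing in for the ontology AST.
@dataclass(frozen=True)
class Class:
    name: str

@dataclass(frozen=True)
class Property:
    name: str
    domain: str
    target: str

@dataclass(frozen=True)
class Restriction:
    on_property: str
    min_cardinality: int

# The closed set of node types: a stand-in for an F# discriminated union.
NODE_TYPES = (Class, Property, Restriction)

def describe_class(node):
    return "class " + node.name

def describe_property(node):
    return "property {}: {} -> {}".format(node.name, node.domain, node.target)

def describe_restriction(node):
    return "restriction on {} (min {})".format(node.on_property, node.min_cardinality)

# A generator is a table with one handler per node type.
HANDLERS = {
    Class: describe_class,
    Property: describe_property,
    Restriction: describe_restriction,
}

def missing_handlers(handlers):
    """Names of node types the generator forgot; an empty list means complete."""
    return [t.__name__ for t in NODE_TYPES if t not in handlers]

def describe(node):
    return HANDLERS[type(node)](node)
```

Calling `missing_handlers` before any emission approximates what the F# compiler gives for free: a generator that skips a construct is rejected before it produces output.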

F# is not the only language that would work — Haskell, OCaml, Scala 3, and Rust were all candidates. F# won on the strength of .NET interop (the fleet runs on .NET 9 already) and familiarity (the team has C# fluency, and F#'s syntax is close enough to read without a tutorial).

Category theory framing

The operator's standing instruction is that category theory is kept top-of-mind for the F# compiler core. Four constructions matter, and the F# code is written so they are visible:

Projections are functors. A projection (e.g. UFO → OntoUML-C#, or BFO → OBO-OWL) is a structure-preserving map from the ontology category Ont to a target category Tgt. It sends objects (concepts) to objects (types) and morphisms (relations) to morphisms (associations), and it preserves composition and identity. Concretely, this means a projection is a congruence: equal concepts project to equal types. The property test that exercises this is the byte-identical emission gate in the forge container (Section 4); a test checks the property on the inputs it sees, it does not prove it in general.
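The two functor laws can be checked mechanically on a toy category. The sketch below (Python, for illustration only) models morphisms as paths of relation labels and a projection as a pair of lookup tables. The concept names, type names, and `make_projection` are assumptions made up for this example.

```python
# A morphism is (source, target, path of relation labels).

def identity(obj):
    return (obj, obj, ())

def compose(g, f):
    """g after f; only defined when f's target is g's source."""
    if f[1] != g[0]:
        raise ValueError("morphisms do not compose")
    return (f[0], g[1], f[2] + g[2])

def make_projection(object_map, label_map):
    """Build the object part and the morphism part of a projection."""
    def on_object(obj):
        return object_map[obj]

    def on_morphism(m):
        src, dst, path = m
        return (object_map[src], object_map[dst],
                tuple(label_map[label] for label in path))

    return on_object, on_morphism

# Hypothetical concept-to-type and relation-to-association tables.
OBJECTS = {
    "Person": "PersonEntity",
    "Organization": "OrganizationEntity",
    "Address": "AddressValue",
}
LABELS = {"worksFor": "Employer", "locatedAt": "HeadOffice"}

on_object, on_morphism = make_projection(OBJECTS, LABELS)

works_for = ("Person", "Organization", ("worksFor",))
located_at = ("Organization", "Address", ("locatedAt",))

# Functor laws on this sample: composition and identity are preserved.
composition_preserved = (
    on_morphism(compose(located_at, works_for))
    == compose(on_morphism(located_at), on_morphism(works_for))
)
identity_preserved = on_morphism(identity("Person")) == identity(on_object("Person"))
```

A property test would run the same two comparisons over generated morphisms rather than one hand-picked pair.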

Generators are catamorphisms. A generator (e.g. the C# emitter) is a fold over the ontology AST that collapses it to a string of code. In Bird & de Moor's algebra-of-programming sense, a catamorphism is the unique homomorphism out of an initial algebra. The F# fold over the AST is exactly that. Two consequences fall out: (a) the generator is total: every AST node has a clause; (b) the generator is fusable: two generators run side by side can be tupled into a single catamorphism (the banana-split law), which is how we get "generate the C# entity and the SQL DDL in one pass."
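A small sketch of the fold-with-an-algebra idea follows, again in Python for illustration only. The three-node AST, the type maps, and the `CSHARP`, `SQL`, and `both` names are hypothetical; the point is that each generator is only a table of combining functions, and that two such tables can be paired so one traversal yields both outputs.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Prop:
    name: str
    datatype: str

@dataclass(frozen=True)
class Cls:
    name: str
    props: Tuple[Prop, ...]

@dataclass(frozen=True)
class Onto:
    classes: Tuple[Cls, ...]

def fold(algebra, node):
    """Bottom-up fold: children are collapsed first, then combined."""
    if isinstance(node, Prop):
        return algebra["prop"](node.name, node.datatype)
    if isinstance(node, Cls):
        return algebra["cls"](node.name, [fold(algebra, p) for p in node.props])
    if isinstance(node, Onto):
        return algebra["onto"]([fold(algebra, c) for c in node.classes])
    raise TypeError("unknown AST node: {!r}".format(node))

# Assumed datatype mappings, for the example only.
CS_TYPES = {"string": "string", "integer": "int"}
SQL_TYPES = {"string": "TEXT", "integer": "INTEGER"}

CSHARP = {
    "prop": lambda name, dt: "    public {} {} {{ get; set; }}".format(CS_TYPES[dt], name),
    "cls": lambda name, props: "public class {}\n{{\n{}\n}}".format(name, "\n".join(props)),
    "onto": lambda classes: "\n\n".join(classes),
}

SQL = {
    "prop": lambda name, dt: "    {} {}".format(name, SQL_TYPES[dt]),
    "cls": lambda name, props: "CREATE TABLE {} (\n{}\n);".format(name, ",\n".join(props)),
    "onto": lambda tables: "\n\n".join(tables),
}

def both(a, b):
    """Pair two algebras so a single fold yields (result_a, result_b)."""
    return {
        "prop": lambda name, dt: (a["prop"](name, dt), b["prop"](name, dt)),
        "cls": lambda name, kids: (a["cls"](name, [k[0] for k in kids]),
                                   b["cls"](name, [k[1] for k in kids])),
        "onto": lambda kids: (a["onto"]([k[0] for k in kids]),
                              b["onto"]([k[1] for k in kids])),
    }

model = Onto((Cls("Person", (Prop("name", "string"), Prop("age", "integer"))),))
csharp_text, sql_text = fold(both(CSHARP, SQL), model)
```

The paired fold visits each node once, and its two components equal what the separate folds would have produced.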

gUFO ⟷ BFO is a natural transformation. Both UFO (as gUFO) and BFO (as the BFO 2020 OWL) project from the same source ontology — they are two functors F, G : Ont → OWL. The mapping that turns a UFO-shaped entity into a BFO-shaped one (or back) is a natural transformation η : F ⇒ G. The square commutes: project-then-map = map-then-project. This is the formal underpinning of ADR-039 ("foundations as perspectives"): UFO and BFO are not rivals to choose between, they are two functors with a natural transformation between them.
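The commuting square can be illustrated on a deliberately tiny model. In the Python sketch below, a projected view is a set of (concept, upper-level class) pairs, a source-level change is a concept renaming, and η is a class-to-class lookup. The `ETA` table is a toy placeholder, not the platform's actual gUFO-to-BFO mapping, and every name here is an assumption for the example.

```python
# Toy placeholder for the component of eta; NOT the real gUFO/BFO mapping.
ETA = {
    "ufo:Endurant": "bfo:Continuant",
    "ufo:Event": "bfo:Occurrent",
}

def eta(ufo_view):
    """Turn a UFO-shaped view into a BFO-shaped one, pair by pair."""
    return {(concept, ETA[cls]) for concept, cls in ufo_view}

def rename(view, renaming):
    """Action of a source-level renaming on either projected view."""
    return {(renaming.get(concept, concept), cls) for concept, cls in view}

ufo_view = {("Customer", "ufo:Endurant"), ("Purchase", "ufo:Event")}
renaming = {"Customer": "Client"}

# Naturality: changing the source then mapping equals mapping then changing.
change_then_map = eta(rename(ufo_view, renaming))
map_then_change = rename(eta(ufo_view), renaming)
```

Both paths around the square give the same BFO-shaped view, which is the shape of check a property test could repeat over generated renamings.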

The sift loop is monadic. Each refinement step takes an ontology and returns either a refined ontology or a request for human ratification, threaded through state and IO. That is the Result + State + IO monad stack, which in F# is a computation expression. The "and then" of the sift loop is bind.

References for the framing:

  • Awodey, Category Theory, 2nd ed. (Oxford, 2010). Standard reference for functors, natural transformations, and the universal properties used above. Chapters 1–4 are the working vocabulary.
  • Bird & de Moor, Algebra of Programming (Prentice Hall, 1997). The catamorphism-as-generator framing comes directly from chapters 3 and 6. Their "banana brackets" ⦇ f ⦈ notation is the conceptual ancestor of the F# fold we use.
  • Pierce, Basic Category Theory for Computer Scientists (MIT, 1991). Shorter introduction; useful when onboarding a new compiler-team member.

These are load-bearing citations, not decoration. The compiler's type signatures use the same words.


3. The sift / sort authoring loop (ADR-032)

You do not write the ontology straight into final form. You sift it.

See ARC-ADR-032-ontology-sift-sort-authoring-loop for the decision; the surface where authoring happens is the Crucible (see the Platform Self-Model lexicon — Crucible is the space, forge is the verb that materializes ontology into the Object Model).

What it is

The sift/sort loop is LLM-assisted ontology refinement with human-in-the-loop ratification. The cadence:

  1. Propose. A human (or another agent) drops a draft concept, relation, or refinement into the Crucible.
  2. Sift. An LLM-driven sifter expands the draft into candidate axiomatizations: "did you mean a kind, a role, or a phase? Here are three OntoUML stereotypings with rationale." This is cheap, parallel, and throwaway — the sifter's outputs are options, not decisions.
  3. Sort. A second pass — also LLM, also throwaway — ranks the sifter's candidates against the existing ontology's structure: which option preserves the most existing constraints? which introduces the fewest new competency-question regressions? The output is a ranked shortlist with diff visualizations.
  4. Ratify. A human (the modeler, or — for high-stakes changes — hitl-coordinator) picks. The pick is recorded as an evidence node on the board with full provenance back to the sifter prompts.
  5. Commit. The chosen refinement is written into the ontology source (the model/ YAML or the Turtle store) and the generator pipeline is re-run.

The loop is monadic (Section 2): each step takes the current ontology state and returns either a refined state or a request for human input, threaded with effects (LLM calls, board writes).
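The cadence above can be sketched as a two-case result type with a bind. This is Python for illustration (the real loop would be an F# computation expression), with LLM calls and board writes replaced by pure stand-ins. `sift`, `sort_candidates`, `propose`, and `ratify` are hypothetical names, and the ranking rule is a toy heuristic.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Refined:
    ontology: Tuple[str, ...]          # toy ontology: a tuple of axioms

@dataclass(frozen=True)
class NeedsRatification:
    ontology: Tuple[str, ...]
    shortlist: Tuple[str, ...]         # ranked candidates awaiting a human pick

def bind(result, step):
    """Run the next step only when the previous one yielded a refined ontology."""
    if isinstance(result, Refined):
        return step(result.ontology)
    return result                      # a pending request short-circuits the chain

def sift(draft):
    """Stand-in for the LLM sifter: expand a draft into candidate stereotypings."""
    return tuple("{} is a {}".format(draft, s) for s in ("kind", "role", "phase"))

def sort_candidates(candidates, ontology):
    """Stand-in for the sorter: prefer stereotypes the ontology already uses."""
    def score(candidate):
        stereotype = candidate.rsplit(" ", 1)[-1]
        return sum(1 for axiom in ontology if axiom.endswith(" " + stereotype))
    return tuple(sorted(candidates, key=lambda c: (-score(c), c)))

def propose(draft):
    def step(ontology):
        return NeedsRatification(ontology, sort_candidates(sift(draft), ontology))
    return step

def ratify(pending, choice):
    """The human pick: only a shortlisted candidate can be committed."""
    if choice not in pending.shortlist:
        raise ValueError("choice was not on the shortlist")
    return Refined(pending.ontology + (choice,))

start = Refined(("Person is a kind", "Employee is a role", "Student is a role"))
pending = bind(bind(start, propose("Customer")), propose("Supplier"))
committed = ratify(pending, "Customer is a role")
```

The second `propose` never runs until the first request is ratified, which is the "and then" behavior the loop relies on.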

Why sift before commit

Three failure modes the sift loop prevents:

  • The "vibe ontology." Without sift, you get an ontology that looks right because one person wrote it, but is internally inconsistent because no one checked it against existing axioms. Sift forces a tournament.
  • Premature commitment. Without sift, the first plausible stereotyping wins, and the alternative is never seen. Sift makes the alternatives visible before any of them is committed.
  • Loss of rationale. Without sift, the why of a modeling choice evaporates the moment the YAML is saved. Sift records the shortlist, the ranking, and the ratification — the rationale is queryable forever.

The cost — running an LLM tournament before every commit — is paid back the first time you have to explain a model choice to a downstream consumer.

The Crucible surface

The Crucible is the surface (the space) where the sift loop runs. It exposes the sifter, the sorter, the diff visualizer, and the ratification button. The Crucible's capability is forge: it materializes ontology into the Object Model. Do not conflate the two — forge is the verb the Crucible exposes (and other surfaces could, in principle, expose forge as well).


4. The forge codegen container (ADR-029)

See ARC-ADR-029-agentarmy-forge-codegen-container. The forge is a containerized image (Function-tier per ADR-023) that takes a ratified ontology and emits a pull request against a target spoke.

What the image manifest proves

The forge container ships an image.json manifest (per the Image Standard referenced in CLAUDE.md) and a doctor script. The doctor proves seven capabilities on every build and every CI run:

| Capability | What it asserts | Why we care |
| --- | --- | --- |
| ontology-fetch-http | The container can fetch an ontology over HTTPS from a known endpoint. | The forge is reachable from arbitrary CI contexts (spoke runners, hub, cloud routines). |
| ontology-fetch-file | The container can load an ontology from a mounted volume. | Local development and offline reproducibility. |
| ontology-fetch-blob | The container can fetch from object storage (Azure Blob / R2 / S3). | The HVFS track (see the project_hvfs_track memory) stores ontology snapshots in object storage. |
| v0-csharp-emit-byte-identical | Two runs over the same ontology produce byte-identical C# output. | Reproducibility. |
| rust-emit-byte-identical | Same property for the Rust generator. | Reproducibility across language targets. |
| smoke-compile-passes | The emitted C# compiles and the emitted Rust passes cargo check. | Generated code is runnable, not just well-formed text. |
| pr-opener-opens-against-dummy | The forge can open a PR against a dummy repository with the generated diff. | The delivery path (Section 8) is exercised, not assumed. |

Full list and pass criteria: docs/forge-uda-generation-profile.md and docs/generator-platform-tests.md.

Why byte-identical emission matters

This is non-negotiable, and it is worth being explicit about why:

  • Reproducibility. A reviewer can re-run the forge against the same ontology hash and verify the diff. If the output is byte-identical, the forge has no hidden state — no timestamps, no random ordering, no environment leakage. The PR is trustable without re-deriving it.
  • Diffability. Two ontology revisions produce two emissions; the diff between those emissions is the true diff of meaning. If the generator is non-deterministic, the diff contains noise, and reviewers stop trusting it. Once reviewers stop trusting diffs, the generative pipeline is dead.
  • Supply chain. A reproducible build is a defensible build. Given the ontology, the generator version, and the projection profile, anyone can re-derive the artifact and check the signature. This is the kind of independently verifiable provenance that SLSA aims for, applied to the generative pipeline.

The byte-identical property is enforced by the doctor and by the CI property tests in docs/generator-platform-tests.md.
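As a rough illustration of what "no hidden state" asks of an emitter, the Python sketch below sorts every iteration, takes nothing from the clock or the environment, and compares SHA-256 digests of two runs. The `emit` function and its input shape are hypothetical stand-ins, not the forge's generators or its doctor script.

```python
import hashlib

def emit(classes):
    """Emit stub C# from {class name: [property names]} in sorted order.

    The output depends only on the argument: no timestamps, no random
    ordering, no environment lookups.
    """
    lines = []
    for name in sorted(classes):
        lines.append("public class " + name)
        lines.append("{")
        for prop in sorted(classes[name]):
            lines.append("    public string " + prop + " { get; set; }")
        lines.append("}")
    return "\n".join(lines) + "\n"

def digest(text):
    """SHA-256 over the UTF-8 bytes, so the comparison is byte-level."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# The same model, written down in two different orders.
first = {"Person": ["name", "email"], "Order": ["id"]}
second = {"Order": ["id"], "Person": ["email", "name"]}

byte_identical = digest(emit(first)) == digest(emit(second))
```

If the `sorted` calls were dropped, the two digests would differ even though the model is the same, which is exactly the noise the gate exists to catch.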

Why open a PR, not push

The forge opens a pull request; it does not push to the target branch. This is deliberate.

  • Generated code is still code. The PR is the review surface — reviewers can run the spoke's CI, look at the diff, and approve or reject. The fact that a human did not write it does not mean a human should not see it.
  • The PR carries provenance. The PR body includes the ontology hash, the generator version, the projection profile, and a link back to the ratification decision on the board. Six months later, when someone asks "why does this method exist?", the answer is two clicks away.
  • CI is the integration gate. The spoke's existing CI (lint, test, contract checks) runs against the PR. If the generator produces code that breaks the spoke's tests, the PR fails and the forge — not the spoke — is at fault. The split is clean.
  • It composes with the review loop. A review-loop-labeled forge-opened PR will be reviewed by @claude / @copilot / @codex just like any human-authored PR (see CLAUDE.md → "Autonomous review loop"). The generative pipeline plugs into the existing fleet workflow with no special case.

A git push would skip all of the above. We don't.
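One way to keep the provenance machine-readable is to render it as fixed key-value lines in the PR body. The Python sketch below assumes a four-field layout; the field names, the line format, and the example values are all hypothetical, not the forge's actual PR template.

```python
# Assumed field names, in the fixed order they are written.
PROVENANCE_FIELDS = (
    "ontology-hash",
    "generator-version",
    "projection-profile",
    "ratification",
)

def render_pr_body(provenance):
    """Render the provenance block; refuse to render an incomplete one."""
    missing = [key for key in PROVENANCE_FIELDS if key not in provenance]
    if missing:
        raise ValueError("missing provenance fields: " + ", ".join(missing))
    return "\n".join(
        "{}: {}".format(key, provenance[key]) for key in PROVENANCE_FIELDS
    ) + "\n"

def parse_pr_body(body):
    """Recover the provenance fields from a rendered PR body."""
    fields = {}
    for line in body.splitlines():
        key, sep, value = line.partition(": ")
        if sep and key in PROVENANCE_FIELDS:
            fields[key] = value
    return fields

# Placeholder values for the example only.
example = {
    "ontology-hash": "sha256:0f3a",
    "generator-version": "0.0.0-example",
    "projection-profile": "example-profile",
    "ratification": "https://example.invalid/board/decision/1",
}
body = render_pr_body(example)
```

Because the block round-trips through `parse_pr_body`, a later tool can answer "which ontology revision produced this PR?" without a human reading the description.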


5. MBSE & PPM (ADR-057)

See ARC-ADR-057-mbse-ppm-self-building-platform.

Model-Based Systems Engineering

MBSE is the systems-engineering discipline of treating a model as the authoritative source of truth for a system — its structure, behavior, and requirements — rather than a pile of documents that may or may not agree.

Authoritative references:

  • INCOSE Systems Engineering Handbook, 5th ed. (Wiley, 2023). The canonical practitioner reference. The handbook's framing of model-based systems engineering as a transition from "document-centric" to "model-centric" engineering is exactly the transition the generative pipeline performs for platform code.
  • Friedenthal, Moore & Steiner, A Practical Guide to SysML, 3rd ed. (Morgan Kaufmann, 2014). The reference for SysML as the language of MBSE. We do not use SysML directly — our model is an OWL/OntoUML ontology — but the partition of structure / behavior / requirements / parametrics that SysML enforces is the partition our ontology must also carry. Treat this book as the structural checklist.
  • OMG SysML 1.7 specification (2023). The normative spec for SysML, for the moments when "what does SysML mean by block?" matters.

The MBSE framing tells us: every artifact in the platform must have a model, and the artifact must be derivable from the model. If you find an artifact that is not derivable from a model, that is a defect in the pipeline — either the model is missing the concept (sift it in) or the generator is missing a clause (add it).

Project Portfolio Management

PPM is the management discipline of treating a portfolio of projects as a single coordinated investment. The relevant body of practice is PMI's Standard for Portfolio Management (4th ed., 2017) and the SAFe portfolio level (referenced indirectly via the SAFe workflow in CLAUDE.md).

For the platform, PPM means: every model is on the board. The ontology is not a hidden file on a developer's laptop; it is a tracked artifact on the GitHub Projects board, with Type, PI, Status, and a Decision Artifact trail. Changes to the ontology flow through the board the same way feature work does.

Why MBSE + PPM makes the platform self-building

The combination is what unlocks self-building: the platform builds itself, and the platform is also the work product on the board.

  • Every artifact has a model (MBSE). So every artifact is generable from a model — not by accident, but by construction.
  • Every model is on the board (PPM). So every model is visible, tracked, and prioritized against every other piece of work.

A change to the platform — even a structural one, like adding a new entity type — starts as a board issue, becomes a sift-loop run in the Crucible, ratifies as an ontology change, regenerates code via the forge, lands as a forge-opened PR, and ships through the spoke's normal CI. The work and the artifact are the same thing.

Without MBSE, the platform is a collection of hand-written code that has to be re-written for every architectural change. Without PPM, the modeling work is invisible and unprioritized. Together, the platform builds itself against the same board that tracks every other piece of fleet work.

This is the operative meaning of "self-building": the platform is a deliverable on its own board.


6. Universal Data Adapter (UDA)

The Universal Data Adapter is the first major generated subsystem of the platform — it is the worked example that proves the pipeline.

See docs/uda-data-connector-object-model.md for the full object model and docs/forge-uda-generation-profile.md for the generation profile.

The DataConnector object model

The UDA is built around a DataConnector object — a typed, validated, ontology-projected representation of "a way to talk to a data source." A DataConnector has:

  • Identity — name, version, owning namespace.
  • Authentication mode — OAuth2 (with grant types), API key, mTLS, none. These are ontology concepts; the projection knows how to emit each in C# (HttpClient handlers), Rust (reqwest middleware), and OAS (security schemes).
  • Schema — the entity shapes the adapter exposes, projected from the ontology. The DataConnector shape itself is also generated — turtles all the way down.
  • Capability flags — read / write / streaming / batch. These drive generator branches.

Because the DataConnector model is in the ontology, the adapter scaffolds for every data source (Salesforce, Snowflake, ArcadeDB, the fleet's own holon store) come out of the same generator pipeline. Adding a new source is an ontology authoring task, not a coding task.
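The four facets listed above can be pictured as a small typed record. The Python sketch below is an illustration only: the field names, the `oauth2_grants` rule, and the capability-to-member table are assumptions, and the authoritative shape lives in docs/uda-data-connector-object-model.md.

```python
from dataclasses import dataclass
from enum import Enum, Flag, auto
from typing import Tuple

class AuthMode(Enum):
    OAUTH2 = "oauth2"
    API_KEY = "api_key"
    MTLS = "mtls"
    NONE = "none"

class Capability(Flag):
    READ = auto()
    WRITE = auto()
    STREAMING = auto()
    BATCH = auto()

@dataclass(frozen=True)
class DataConnector:
    name: str                          # identity
    version: str
    namespace: str
    auth: AuthMode                     # authentication mode
    entities: Tuple[str, ...]          # schema: entity shapes exposed
    capabilities: Capability           # capability flags
    oauth2_grants: Tuple[str, ...] = ()

    def __post_init__(self):
        # Assumed validation rule: OAuth2 must declare its grant types.
        if self.auth is AuthMode.OAUTH2 and not self.oauth2_grants:
            raise ValueError("an OAuth2 connector needs at least one grant type")

# Hypothetical mapping from capability flag to an emitted member name.
MEMBER_FOR = (
    (Capability.READ, "Read"),
    (Capability.WRITE, "Write"),
    (Capability.STREAMING, "Subscribe"),
    (Capability.BATCH, "ReadBatch"),
)

def planned_members(connector):
    """Capability flags drive which generator branches fire."""
    return [name for flag, name in MEMBER_FOR if flag in connector.capabilities]

salesforce = DataConnector(
    name="salesforce",
    version="1.0.0",
    namespace="example.connectors",
    auth=AuthMode.OAUTH2,
    entities=("Account", "Contact"),
    capabilities=Capability.READ | Capability.BATCH,
    oauth2_grants=("client_credentials",),
)
```

Adding a new source then amounts to constructing another record like `salesforce`; the generator branches follow from its flags.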

Vector manifest byte-identical proof

The UDA generation profile includes a vector manifest — a JSON file listing every generated artifact (C# class, Rust struct, OAS path) with its SHA-256. The proof obligation is straightforward: two runs of the forge against the same ontology produce the same vector manifest, byte for byte. The doctor script asserts this on every build (Section 4); the CI property tests in docs/generator-platform-tests.md assert it on every PR.

The vector manifest is the contract between the ontology and the platform. If the vector manifest changes, something meaningful changed. If the manifest is byte-identical, nothing meaningful changed. There is no ambiguous middle.
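A manifest with this property can be sketched in a few lines. The Python below assumes a JSON layout with an `artifacts` list of `path` and `sha256` entries, serialized canonically (sorted paths, sorted keys, fixed separators). The layout and the function names are hypothetical, not the profile's actual schema.

```python
import hashlib
import json

def build_manifest(artifacts):
    """Canonical JSON manifest from {path: emitted text}."""
    entries = [
        {
            "path": path,
            "sha256": hashlib.sha256(artifacts[path].encode("utf-8")).hexdigest(),
        }
        for path in sorted(artifacts)
    ]
    # sort_keys and fixed separators keep the serialization itself deterministic.
    return json.dumps({"artifacts": entries}, sort_keys=True, separators=(",", ":"))

def changed_paths(old_manifest, new_manifest):
    """Paths whose hash differs, or that exist in only one manifest."""
    old = {e["path"]: e["sha256"] for e in json.loads(old_manifest)["artifacts"]}
    new = {e["path"]: e["sha256"] for e in json.loads(new_manifest)["artifacts"]}
    return sorted(p for p in set(old) | set(new) if old.get(p) != new.get(p))

run_one = build_manifest({"Person.cs": "class Person {}", "person.rs": "struct Person;"})
run_two = build_manifest({"person.rs": "struct Person;", "Person.cs": "class Person {}"})
revised = build_manifest({"Person.cs": "class Person { int Age; }", "person.rs": "struct Person;"})
```

Comparing `run_one` and `run_two` as plain strings is the byte-for-byte check; `changed_paths` narrows a real change down to the artifacts that moved.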


7. Foundations as perspectives (ADR-039)

See ARC-ADR-039-foundations-as-perspectives.

The principle

A common error in ontology engineering is to treat foundational ontologies (UFO, BFO, DOLCE, GFO, …) as rivals — to pick one and dismiss the rest. The platform deliberately does not.

Projections are perspectives, not picks.

The same source ontology projects through both UFO and BFO. The two projections are different views of the same underlying meaning, useful for different purposes:

  • UFO → OntoUML → C# entity classes with stereotypes. UFO is the primary authoring discipline. Its stereotypes (kind, subkind, phase, role, relator, …) carry rich ontological commitments (rigidity, sortality, identity) that map cleanly to OOP type structures. This is the projection that drives the application layer.
  • BFO → OBO-style OWL. BFO is the interoperability projection. The OBO Foundry, Common Core Ontologies (CCO), and IAO/RO use BFO as their upper ontology. Emitting a BFO-shaped OWL view of our ontology lets us hand a serialized model to a BFO-trained consumer (a bio/health ontology, a defense/intel CCO consumer) without a translation step.

Diagram of the projection

flowchart TD
    Src["Source ontology<br/>(model/instances YAML + Turtle)"]
    UFO["UFO projection<br/>(OntoUML stereotypes)"]
    BFO["BFO projection<br/>(BFO 2020 OWL)"]
    CSH["C# entity classes<br/>(stereotype attrs)"]
    OWL["OBO-style OWL<br/>(CCO-compatible)"]
    Src --> UFO --> CSH
    Src --> BFO --> OWL
    UFO <-. "η: natural transformation<br/>(gUFO ⟷ BFO mapping)" .-> BFO

The dotted arrow η is the natural transformation of Section 2: the square commutes, and that commutativity is enforced by property tests in the compiler core.

Why this matters

Picking one foundation locks the platform into one community's vocabulary forever. Holding both as projections lets the platform be authored in the foundation that best serves the modeler (UFO, design-oriented) while remaining legible to the foundation that best serves the consumer (BFO, realist, ISO/IEC 21838-2 standardized). The cost is a second projection; the benefit is interoperability with the entire OBO Foundry without giving up OntoUML's expressive power.


8. Abstraction validation & distribution (ADR-036)

See ARC-ADR-036-abstraction-validation-distribution-service.

Every abstraction the pipeline produces passes through a validation gate before it is distributed.

The validation gate

Three independent checks must pass:

  1. SHACL shapes. The projected RDF/OWL passes a SHACL shape graph that encodes the ontology's invariants. If a generated class violates a shape (wrong cardinality, missing property, wrong datatype), the gate fails. SHACL is the W3C standard for RDF constraint validation (recommendation, 2017).
  2. Competency questions. The ontology carries a suite of competency questions (Grüninger & Fox, 1995 — "What is the parent kind of this role?", "Which connectors support OAuth2?"). The validation gate runs the suite against the generated artifacts. A regression in any answer fails the gate.
  3. Smoke compile. The emitted C# / Rust / SQL must compile and pass a minimal runtime smoke test. This is the same smoke-compile-passes capability the forge doctor proves (Section 4), applied per artifact rather than per build.

Only artifacts that pass all three checks are distributed.
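The all-three-must-pass rule can be sketched as a list of named predicates. In the Python below each check is a deliberately simple stand-in: a hand-written cardinality test in place of a SHACL engine, a set comprehension in place of a query language, and a brace counter in place of a compiler. The artifact layout and every name are assumptions for the example.

```python
def shape_check(artifact):
    """Stand-in for SHACL: every DataConnector has exactly one hasName."""
    triples = artifact["triples"]
    connectors = {s for s, p, o in triples if p == "type" and o == "DataConnector"}
    return all(
        sum(1 for s, p, _ in triples if s == c and p == "hasName") == 1
        for c in connectors
    )

def competency_check(artifact):
    """'Which connectors support OAuth2?' must keep its recorded answer."""
    answer = {s for s, p, o in artifact["triples"]
              if p == "supportsAuth" and o == "OAuth2"}
    return answer == artifact["expected_oauth2_connectors"]

def smoke_check(artifact):
    """Stand-in for a real compile: braces in the emitted source balance."""
    depth = 0
    for ch in artifact["source"]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

CHECKS = (
    ("shacl", shape_check),
    ("competency", competency_check),
    ("smoke-compile", smoke_check),
)

def run_gate(artifact, checks=CHECKS):
    """Return (passed, names of failed checks); every check always runs."""
    failures = [name for name, check in checks if not check(artifact)]
    return (not failures, failures)

good = {
    "triples": {
        ("SalesforceConnector", "type", "DataConnector"),
        ("SalesforceConnector", "hasName", "salesforce"),
        ("SalesforceConnector", "supportsAuth", "OAuth2"),
    },
    "expected_oauth2_connectors": {"SalesforceConnector"},
    "source": "public class SalesforceConnector { }",
}
```

Running every check instead of stopping at the first failure means a rejected artifact reports all of its problems in one pass.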

The distribution surface

Distribution is the moment a validated abstraction reaches a spoke. The platform uses two distribution surfaces in tandem:

  • Postman mocks — the contract-first surface. The validated OAS for a generated adapter is published as a Postman spec, and a mock is created in the AgentArmy workspace (per CLAUDE.md → "Contract-first & mock-first (always)"). Consumers can hit the mock immediately, in parallel with real producer build-out.
  • Generated client packages — the typed surface. The C# client, the Rust client, and the TypeScript client are published as language packages (NuGet / crates.io-private / npm-private) and pulled by spokes via normal dependency management.

A spoke consumes a generated abstraction by either mocking against the Postman URL or depending on the typed client package. Both surfaces are authoritative; both are kept in sync because both are emitted from the same ontology by the same forge build.

The board-level evidence for distribution is the docs/contracts.md registry — every generated contract has a row, every row has a mock URL and a client package version.


9. What we deliberately avoided

The generative-programming landscape is wide. We considered and passed over several well-known toolchains. The reasons matter — both because they are real tradeoffs and because they explain why the F#-and-container path is not the only way to build such a system, just the best fit for this fleet.

EMF / Sirius

The Eclipse Modeling Framework (EMF) and its diagramming companion Sirius are the canonical "model your domain and generate code from it" stack on the JVM. Industrial adoption is real. We passed for three reasons:

  • JVM ceiling. EMF's runtime and its tooling assume a JVM. The fleet runs on .NET 9 (and Rust, and Node). Carrying a JVM dependency for the modeling toolchain alone is a heavy lift.
  • Eclipse heritage. The IDE coupling (Eclipse RCP) makes the modeling tooling hard to integrate with the agent-driven authoring loop (Section 3). The Crucible surface is web-and-agent-first; EMF's centre of gravity is the Eclipse workbench.
  • Modeling lock-in. EMF's metamodel (Ecore) is its own thing — it is not OWL, not RDF, not OntoUML. Going EMF means re-platforming the ontology into Ecore, which loses the OBO Foundry / CCO interoperability story (Section 7).

EMF would work. It would just not work with the rest of the fleet.

Xtext

Xtext is the Eclipse stack's DSL workbench — write a grammar, get a parser, generator hooks, IDE support, and a typed AST. It is genuinely powerful.

  • Heavy. Xtext drags in Xtend, Guice, ANTLR, and the Eclipse runtime. For a DSL workbench this is justified; for our generator core it is not — we have an ontology, not a textual DSL.
  • Grammar-first. Xtext's premise is that the DSL is defined by its grammar. Our DSL is defined by its ontology — RDF/OWL/OntoUML is the surface, not a hand-written grammar. The two paradigms do not compose comfortably.

If we ever build a textual DSL on top of the ontology, Xtext re-enters the conversation as a candidate for the surface syntax. Today, we do not need a DSL workbench.

JetBrains MPS

MPS is the canonical projectional editor — you do not edit text, you edit the AST directly, and the rendering is projected. The tooling is mature and JetBrains' commercial track record is strong.

  • Learning curve. Projectional editing is a paradigm shift for contributors. Onboarding a new modeler to MPS is materially harder than onboarding them to "edit a YAML file in the Crucible web UI."
  • IDE lock-in. MPS is itself an IDE. Like EMF, this clashes with the agent-driven authoring loop — agents do not run inside MPS.
  • Less mature ontology integration. MPS's metamodel is its own; OWL/RDF integration is bolted on, not native.

MPS is technically excellent and culturally a mismatch.

Pure LLM "vibe-code" generation

The most fashionable 2026 alternative: give an LLM the ontology and ask it to write the platform. This is exactly what we do not do.

  • No determinism guarantee. LLM output is not reproducible in practice: sampling, model updates, and serving-side variation all change the text from run to run. The byte-identical emission property (Section 4) cannot be guaranteed, so reviewers cannot trust diffs and the platform becomes un-reproducible.
  • No total coverage. An LLM "might" handle every ontology construct. The F# compiler's exhaustive pattern match will. The compiler fails to build if it would miss a case; an LLM silently emits something plausible-looking and wrong.
  • No provenance. LLM-generated code carries no trace back to the ontology hash. Six months on, you cannot tell which ontology revision produced which line of platform code.

LLMs are enormously useful in the sift loop (Section 3), where their output is throwaway and human-ratified, and as code reviewers (the @claude / @copilot / @codex review-loop). They are not used as the deterministic core of the pipeline. The split — LLMs at the soft edges, deterministic compiler at the hard core — is the design.

What we kept

  • The F# compiler core (deterministic, exhaustive, fast to write).
  • The forge container (reproducible, byte-identical, PR-opening).
  • The Crucible surface for sift-loop authoring (LLM-assisted, human-ratified).
  • The board as the PPM surface (every model tracked, every change visible).

The rest is variations on the same theme.


10. The pipeline end-to-end

flowchart LR
    Src["Ontology source<br/>(YAML + Turtle)"] --> Sift["Sift / sort loop<br/>(Crucible surface)"]
    Sift --> Comp["F# compiler core<br/>(AST + catamorphisms)"]
    Comp --> ProjU["UFO projection<br/>(OntoUML)"]
    Comp --> ProjB["BFO projection<br/>(OBO/OWL)"]
    ProjU --> GenCS["C# generator"]
    ProjU --> GenJS["JS/TS generator"]
    ProjU --> GenRS["Rust generator"]
    ProjU --> GenZIG["Zig generator"]
    ProjU --> GenCPP["C++ generator"]
    ProjU --> GenSQL["SQL generator"]
    ProjB --> EmitOWL["OWL emit<br/>(byte-identical)"]
    GenCS --> Emit["Byte-identical emission"]
    GenJS --> Emit
    GenRS --> Emit
    GenZIG --> Emit
    GenCPP --> Emit
    GenSQL --> Emit
    EmitOWL --> Emit
    Emit --> Forge["Forge container<br/>(opens PR)"]
    Forge --> Spoke["Spoke pickup<br/>(CI + review + merge)"]
    Spoke -.-> Src

Read the arrows as functorial composition. The dotted feedback edge — spoke pickup back to ontology source — is how use informs meaning: a spoke discovering a missing concept files a board issue, which seeds the next sift-loop iteration. The pipeline is a loop, not a line.


11. References

Primary citations (load-bearing)

  • Czarnecki, K. & Eisenecker, U. Generative Programming: Methods, Tools, and Applications. Addison-Wesley, 2000. The vocabulary of domain engineering and configuration knowledge — Section 1.
  • Cleaveland, J. C. Program Generators with XML and Java. Prentice Hall, 2001. The discipline of the deterministic, template-driven generator — Section 1.
  • Kelly, S. & Tolvanen, J.-P. Domain-Specific Modeling: Enabling Full Code Generation. Wiley/IEEE, 2008. The case for full generation, with twenty industrial case studies — Section 1.
  • Awodey, S. Category Theory, 2nd ed. Oxford University Press, 2010. Functors, natural transformations, and the universal properties — Section 2.
  • Bird, R. & de Moor, O. Algebra of Programming. Prentice Hall, 1997. Catamorphisms as the canonical form of a generator — Section 2.
  • Pierce, B. C. Basic Category Theory for Computer Scientists. MIT Press, 1991. Onboarding-friendly category-theory reference — Section 2.
  • INCOSE Systems Engineering Handbook, 5th ed. Wiley, 2023. The practitioner reference for model-based systems engineering — Section 5.
  • Friedenthal, S., Moore, A. & Steiner, R. A Practical Guide to SysML, 3rd ed. Morgan Kaufmann, 2014. The structural partition of SysML, used as a checklist — Section 5.
  • OMG SysML 1.7 specification. Object Management Group, 2023. Normative spec for SysML — Section 5.
  • Grüninger, M. & Fox, M. S. Methodology for the Design and Evaluation of Ontologies. IJCAI Workshop on Basic Ontological Issues in Knowledge Sharing, 1995. The origin of competency questions — Section 8.
  • W3C SHACL recommendation. Shapes Constraint Language, W3C Recommendation, 20 July 2017 — Section 8.

Prior art for generative pipelines

  • OBO ROBOT release tool — the OBO Foundry's standard tool for releasing OWL ontologies with reasoner checks and report generation. Spiritual ancestor of the forge container's release discipline.
  • ATL (Atlas Transformation Language) and OMG QVT-O: model-to-model transformation languages from the MDE community. Conceptual ancestor of the projection layer.
  • Apache Velocity — the classical template-driven code generator. Heritage for the "generator is a function from model to text" stance.
  • GraalVM Truffle / SOM — modern DSL implementation infrastructure. Reference point for "small DSL on a fast host," the path we did not take but considered for surface syntax.

Foundational ontology references

  • Guizzardi, G. Ontological Foundations for Structural Conceptual Models. PhD thesis, University of Twente, 2005. The origin of UFO.
  • gUFO — gentle UFO, the OWL implementation of UFO. The lightweight on-ramp for OWL-tooling consumers.
  • BFO 2020 / ISO/IEC 21838-2 — Basic Formal Ontology, the realist upper ontology used by the OBO Foundry. The interoperability projection's target.

Operational

  • INCOSE SE Handbook 5th ed., Friedenthal/Moore/Steiner Practical Guide to SysML 3rd ed., and the OMG SysML 1.7 spec are the three references you reach for when an MBSE question lands on the board.
  • The PMI Standard for Portfolio Management, 4th ed. (2017), is the reference for the PPM half of Section 5.

See also: Ontology Foundations · Ontology Stack · Coordination & VFS