ARC-ADR-057 - MBSE & PPM Self-Building Platform

One line: How we integrate Model-Based Systems Engineering (MBSE: SysML blocks, ArchiMate components) and Project Portfolio Management (PPM: Mission → Goal → Task hierarchy) to drive self-building agent loops, steered through Human-in-the-Loop (HITL) hooks by Nicky as Creative Director.

Context and Problem Statement

As untool.ai evolves towards a fully self-building agent platform, we face a structural alignment gap between architectural models (how the system is designed) and execution roadmaps (what the agent army builds).

We have two distinct modeling domains that must be unified:

1. Model-Based Systems Engineering (MBSE): SysML blocks, ArchiMate components, flow ports, and interface specifications defining the functional and structural architecture of the platform (repos, containers, contracts, APIs).
2. Project Portfolio Management (PPM): A strategic backlog hierarchy (Missions → Goals → Tasks) defining the execution logic and business justification (why we are building, what capability is unlocked, and who steers it).

Currently, these domains are decoupled. Runtimes like Antigravity, Codex, and Claude Code operate on flat issue lists without architectural context. They do not know if the code changes they propose conform to ArchiMate/SysML design constraints, nor do they have a model-driven loop to verify that implemented code actually realizes a strategic mission.

Furthermore, as Creative Director, Nicky Clarke needs non-intrusive, high-level control vectors, namely Human-in-the-Loop (HITL) steering hooks, to guide the design philosophy and behavior of the self-building swarms without writing code or manually editing individual agent prompts.

We require a model-driven integration that reifies both MBSE structural elements and PPM strategic hierarchies in the platform's self-model ontology, closing the loop from strategic mission to verified code.

Decision Drivers

Model-Driven Codegen: Agent swarms must only build elements that realize recognized system components or capability tracks in the ontology.
Strategic Traceability: Every line of code changed in a spoke repository must trace back to a Goal that realizes a Mission.
Automated Verification: The build-and-test loop must collect machine-readable evidence proving compliance with both functional contracts and architectural constraints.
Creative Director Steering (HITL): Nicky Clarke needs explicit hooks in the model (Vision Prompts, Hysteresis, Gate Signoffs) to direct the swarm's focus, quality limits, and risk thresholds.
Substrate Neutrality: Runtimes must read/write to this plane via the untool MCP and filesystem projections.
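
The Automated Verification driver asks for machine-readable evidence. As a rough illustration only, the sketch below wraps a result record in a signed envelope. The ADR does not fix a signing scheme, so HMAC-SHA256, the envelope shape, the field names, and the helpers `sign_evidence` and `verify_evidence` are all assumptions.

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    """Serialize with sorted keys so the same record always signs the same way."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_evidence(record: dict, key: bytes) -> dict:
    """Wrap a record in an envelope carrying an HMAC-SHA256 signature (assumed scheme)."""
    signature = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify_evidence(envelope: dict, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = hmac.new(key, _canonical(envelope["record"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["signature"])


# Hypothetical evidence record; the field names are illustrative.
envelope = sign_evidence({"goal": "G-1", "tests_passed": True}, key=b"demo-key")
print(verify_evidence(envelope, key=b"demo-key"))
```

A real deployment would more likely use asymmetric signatures so that verifiers never hold the signing key; the shared-key HMAC here only keeps the sketch within the standard library.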

Proposed Decisions

We choose to integrate MBSE and PPM into a unified, ontology-driven Self-Building Execution Loop reified in model.yaml.

graph TD
    Mission["PPM: Mission (Kind)"]
    Goal["PPM: Goal (Kind)"]
    Capability["System: Capability (Kind)"]
    SysMLBlock["MBSE: SysML Block / ArchiMate Component"]
    SpokeRepo["Implementation: Spoke Repository Code"]
    HITL["HITL: Creative Director Steering (Nicky)"]

    Mission -- "mission-realization" --> Goal
    Goal -- "capability-realization" --> Capability
    Capability -- "realized-by" --> SysMLBlock
    SysMLBlock -- "implemented-in" --> SpokeRepo
    HITL -- "hitl-steering" --> Goal

1. SysML/ArchiMate to Ontology Mapping Conventions

We map physical, functional, and structural systems engineering elements to the platform ontology types as follows:

| Systems Engineering Concept | ArchiMate Profile | SysML Profile | Platform Ontology Mapping |
| --- | --- | --- | --- |
| System Boundary / Space | Application Collaboration | System Context | `Surface` (e.g., Crucible, Workspace) |
| Functional Component | Application Component | Block | `Repository` or `Container` |
| Infrastructure Node | Technology Node | Node / Execution Env | `Platform` (e.g., Azure Container Apps) |
| Interface / Flow Port | Application Interface | Flow Port / Proxy Port | `Contract` or `SystemMessageSubject` |
| Data Structure / Payload | Data Object | Value Type / Signal | `MessageFamily` or Shared Schema |
| Composition / Part Assembly | Composition Relation | Directed Composition | `subComponentOf` / `parent` Edge |
| Constraint / Rule Block | Application Policy | Constraint Block | `SieveValidation` / SHACL Rule |
| Dynamic Execution Flow | Application Process | Activity / Interaction | `TeamingPattern` / DBOS Workflow |

These alignments are asserted via the mbse-alignment relator, which maps model elements to their respective profiles and verification guidelines.
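
To make the mapping concrete, the sketch below encodes the table rows as plain Python data and looks up ontology types by SysML element name. It is not the `mbse-alignment` relator itself; `Alignment`, `MBSE_ALIGNMENT`, and `ontology_types_for` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Tuple


class Alignment(NamedTuple):
    """One row of the mapping table (hypothetical record shape)."""
    archimate: str
    sysml: str
    ontology: Tuple[str, ...]


# Rows copied from the mapping table; the list itself is illustrative only.
MBSE_ALIGNMENT = [
    Alignment("Application Collaboration", "System Context", ("Surface",)),
    Alignment("Application Component", "Block", ("Repository", "Container")),
    Alignment("Technology Node", "Node / Execution Env", ("Platform",)),
    Alignment("Application Interface", "Flow Port / Proxy Port",
              ("Contract", "SystemMessageSubject")),
    Alignment("Data Object", "Value Type / Signal", ("MessageFamily", "Shared Schema")),
    Alignment("Composition Relation", "Directed Composition", ("subComponentOf", "parent")),
    Alignment("Application Policy", "Constraint Block", ("SieveValidation", "SHACL Rule")),
    Alignment("Application Process", "Activity / Interaction",
              ("TeamingPattern", "DBOS Workflow")),
]


def ontology_types_for(sysml_element: str) -> Tuple[str, ...]:
    """Return the candidate ontology types for an exact SysML profile name."""
    needle = sysml_element.strip().lower()
    for row in MBSE_ALIGNMENT:
        # A cell such as "Flow Port / Proxy Port" lists alternative names.
        names = [name.strip().lower() for name in row.sysml.split("/")]
        if needle in names:
            return row.ontology
    raise KeyError(f"no ontology mapping for SysML element {sysml_element!r}")


print(ontology_types_for("Block"))
```

Matching is exact per name, so "Block" resolves to the Functional Component row and does not collide with "Constraint Block".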

2. PPM Reified Hierarchy

To represent the strategic portfolio in the graph, we introduce the following ontology elements:

* Mission (Kind): Represents a high-level strategic theme or objective (e.g., "Transition to holographic source-control substrate").
* Goal (Kind): A concrete, measurable target representing a capability milestone (e.g., "Implement lakeFS UDA provider in backend-core").
* Task (Kind): A specific, atomic action item tracking code modifications (mapped dynamically to a local VFS task shard or optionally synced).
* mission-realization (Relator): Reifies the link between a Mission and the Goals that fulfill it.
* goal-decomposition (Relator): Maps a Goal to its child Task nodes.
* hitl-steering (Relator): Connects a Goal or Mission to Nicky's steering inputs.
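
The hierarchy can be pictured as nested records, with each relator shown as a containment list. This is a minimal sketch under that simplification; the actual schema lives in model.yaml and is not reproduced here, and the ids, the example task title, and `trace_task` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Task:
    id: str
    title: str


@dataclass
class Goal:
    id: str
    title: str
    tasks: List[Task] = field(default_factory=list)  # stands in for goal-decomposition


@dataclass
class Mission:
    id: str
    title: str
    goals: List[Goal] = field(default_factory=list)  # stands in for mission-realization


def trace_task(missions: List[Mission], task_id: str) -> Optional[Tuple[str, str, str]]:
    """Walk Mission -> Goal -> Task and return the ids along the path, if found."""
    for mission in missions:
        for goal in mission.goals:
            for task in goal.tasks:
                if task.id == task_id:
                    return (mission.id, goal.id, task.id)
    return None  # an untraceable task has no strategic justification


# Mission and Goal titles come from the examples in the text; ids are made up.
goal = Goal("G-1", "Implement lakeFS UDA provider in backend-core")
goal.tasks.append(Task("T-1", "Add provider skeleton"))
mission = Mission("M-1", "Transition to holographic source-control substrate", [goal])

print(trace_task([mission], "T-1"))
```

A lookup that returns `None` is the case the Strategic Traceability driver is meant to rule out: a task with no Goal and Mission behind it.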

3. The Self-Building Execution Loop

The execution loop operates as a state machine that is managed by the Release Train Engineer (RTE) and executed by specialist builder agents:

[Decompose Goal] ──> [Claim & Implement Spoke Code] ──> [Run Forge Codegen] ──> [Verify Evidence] ──> [Close & Promote]

Decompose: The RTE decomposes an active Goal into repository-specific issues and tasks within the target spoke repos (e.g., backend-core, frontend-core).
Implement: Specialist coding agents (e.g., backend-developer, frontend-developer) pull tasks from the local-fleet MCP queue, claim the relevant file-ownership territories, and implement changes in isolated git worktrees.
Compile & Forge: The local build-gate triggers the agentarmy-forge codegen tool. It materializes ontology changes into typed C#, TypeScript, and Python Object Models, ensuring code representations remain in sync with the model.
Verify Evidence: The loop runs local test suites, contract validation checks, and L1/L4 ontology provers (SHACL, SMT). Verification evidence (test logs, schema conformity proofs, PR status) is captured in signed JSON files under evidence/.
Close & Promote: Once the audit checks confirm that all adoption tracks for the capability are satisfied and green, the RTE merges the code, closes the tracking issues, and marks the Goal as realized.
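
The five steps above can be sketched as a small state machine. The stage names follow the list; the failure routing is an assumption, since the ADR only describes the happy path. `Stage`, `advance`, and `run` are hypothetical names.

```python
from enum import Enum, auto
from typing import Iterable


class Stage(Enum):
    DECOMPOSE = auto()
    IMPLEMENT = auto()
    FORGE = auto()
    VERIFY = auto()
    PROMOTE = auto()
    REALIZED = auto()  # terminal: the Goal has been marked as realized


# Linear happy path taken from the loop diagram.
NEXT_STAGE = {
    Stage.DECOMPOSE: Stage.IMPLEMENT,
    Stage.IMPLEMENT: Stage.FORGE,
    Stage.FORGE: Stage.VERIFY,
    Stage.VERIFY: Stage.PROMOTE,
    Stage.PROMOTE: Stage.REALIZED,
}


def advance(stage: Stage, step_succeeded: bool) -> Stage:
    """Return the stage that follows the outcome of the current step.

    Assumption: a failed FORGE or VERIFY step sends the work back to
    IMPLEMENT; any other failed step is retried in place.
    """
    if stage is Stage.REALIZED:
        return stage
    if step_succeeded:
        return NEXT_STAGE[stage]
    if stage in (Stage.FORGE, Stage.VERIFY):
        return Stage.IMPLEMENT
    return stage


def run(outcomes: Iterable[bool]) -> Stage:
    """Replay a sequence of step outcomes starting from DECOMPOSE."""
    stage = Stage.DECOMPOSE
    for ok in outcomes:
        stage = advance(stage, ok)
    return stage


print(run([True, True, True, True, True]))
```

Keeping the transitions in a lookup table makes the happy path easy to audit against the diagram, and the failure policy stays isolated in `advance` where it can be changed once the real routing is decided.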

4. Human-in-the-Loop (HITL) Steering Hooks

Nicky Clarke guides the creative and quality direction of the swarm through three reified parameters:

Vision Prompts (Creative Focus): Strategic guidance injected at the Mission or Goal level. This prompt is appended to the system instructions of all builder agents working on that target, setting stylistic, architectural, or performance rules (e.g., "Optimize for zero allocation and maximum read concurrency").
Hysteresis Adjustments ($\eta \in [0.0, 1.0]$): An operating tolerance parameter governing convergence constraints:
$\eta = 0.0$ (Strict Verification Gate): No warnings or skipped tests are allowed. Any compiler/lint warning or untested route fails the build.
$\eta \in (0.0, 0.5]$ (Incremental Guard): Warnings are allowed, but contract interface schema checks (sieve) are strictly enforced.
$\eta \in (0.5, 1.0)$ (Spike Mode): Speculative compilation is enabled. Soft contract errors are logged, but the build passes so that structural logic can be tested.
$\eta = 1.0$ (Wild Swarm): Completely unrestricted execution inside the MicroVM sandbox for exploratory agent spikes.
Gate Signoffs (Manual Milestones): Mandatory manual approvals triggered when the loop attempts risky actions. These are captured as hitl-steering events where Nicky approves transitions like merging database schema evolutions, promoting code to production, or departing from a previously accepted ADR.
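
The hysteresis bands and the Vision Prompt rule can be expressed in a few lines. The sketch below follows the interval boundaries stated above; the mode labels, the function names, and the reduction of a build to two counters (`warnings`, `contract_errors`) are simplifying assumptions.

```python
def gate_mode(eta: float) -> str:
    """Classify a hysteresis value into one of the four documented bands."""
    if not 0.0 <= eta <= 1.0:  # also rejects NaN
        raise ValueError(f"eta must lie in [0.0, 1.0], got {eta!r}")
    if eta == 0.0:
        return "strict"       # Strict Verification Gate
    if eta <= 0.5:
        return "incremental"  # Incremental Guard
    if eta < 1.0:
        return "spike"        # Spike Mode
    return "wild"             # Wild Swarm


def build_passes(eta: float, warnings: int, contract_errors: int) -> bool:
    """Decide whether the gate lets a build through (simplified model)."""
    mode = gate_mode(eta)
    if mode == "strict":
        # Assumption: contract errors fail a strict build just as warnings do.
        return warnings == 0 and contract_errors == 0
    if mode == "incremental":
        return contract_errors == 0  # warnings tolerated, contracts enforced
    return True  # spike and wild modes log problems but do not block


def with_vision_prompt(system_instructions: str, vision_prompt: str) -> str:
    """Append the Vision Prompt to an agent's system instructions."""
    return f"{system_instructions.rstrip()}\n\n{vision_prompt.strip()}"


print(gate_mode(0.5), build_passes(0.5, warnings=3, contract_errors=0))
```

Note that the boundary value 0.5 belongs to the Incremental Guard band because the interval $(0.0, 0.5]$ is closed on the right.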

Finalized Architectural Decisions

Per Nicky Clarke's signoff on 2026-05-31:

1. SysML & ArchiMate Authoring Tooling

Decision: Both (Incremental Easiest-to-Hardest Progression)

* Markdown-First (YAML models and inline Mermaid graphs) is adopted as the primary source of truth for all systems modeling.
* Standard OMG-compliant XML/XMI export stubs will be developed progressively as projection targets, enabling external MBSE integrations (Cameo/Enterprise Architect) without git thrashing.

2. Hysteresis Control Scope

Decision: Goal-Level Granularity

* The hysteresis parameter $\eta$ is configured directly on individual strategic Goal nodes in the ontology. This allows strict validation gates for database/kernel layers while maintaining exploratory agility for front-end prototyping.

3. PPM Task Board Sync Seam

Decision: Self-Contained Local Board with Optional GitHub Sync

* The Holonic Unified Board operates natively on the virtual database layer (ArcadeDB/vfs_server.db) to enable full offline execution. GitHub integration is treated as a pluggable, bi-directional sync adapter rather than a required execution substrate.
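
One way to picture the sync seam is a board that always works locally and delegates to an optional adapter. The sketch below makes no network calls and does not model ArcadeDB or the GitHub API; `BoardSyncAdapter`, `NullSync`, `InMemoryRemote`, `LocalBoard`, and the "local state wins" conflict rule are all assumptions.

```python
from typing import Dict, Optional, Protocol


class BoardSyncAdapter(Protocol):
    """Hypothetical seam: anything that can push and pull task statuses."""

    def push(self, tasks: Dict[str, str]) -> None: ...

    def pull(self) -> Dict[str, str]: ...


class NullSync:
    """Offline default: nothing leaves or enters the local board."""

    def push(self, tasks: Dict[str, str]) -> None:
        return None

    def pull(self) -> Dict[str, str]:
        return {}


class InMemoryRemote:
    """Stand-in for a remote tracker such as GitHub, kept in memory."""

    def __init__(self) -> None:
        self.issues: Dict[str, str] = {}

    def push(self, tasks: Dict[str, str]) -> None:
        self.issues.update(tasks)

    def pull(self) -> Dict[str, str]:
        return dict(self.issues)


class LocalBoard:
    def __init__(self, adapter: Optional[BoardSyncAdapter] = None) -> None:
        self.tasks: Dict[str, str] = {}  # task id -> status
        self.adapter = adapter if adapter is not None else NullSync()

    def sync(self) -> None:
        """Bi-directional sync; on conflict the local status is kept (assumed policy)."""
        self.adapter.push(dict(self.tasks))
        for task_id, status in self.adapter.pull().items():
            self.tasks.setdefault(task_id, status)


remote = InMemoryRemote()
remote.issues["T-2"] = "open"
board = LocalBoard(remote)
board.tasks["T-1"] = "in-progress"
board.sync()
print(sorted(board.tasks), sorted(remote.issues))
```

Because `LocalBoard` falls back to `NullSync`, every board operation works with no adapter configured, which matches the intent that GitHub is optional rather than a required substrate.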

Consequences

Positive

No Code Sprawl: Builder agents cannot write arbitrary code; all modifications must be traced back to an architectural element and a strategic goal.
Non-Intrusive Steering: Nicky can shift the design system, speed, or focus of multiple parallel runtimes simultaneously by modifying a single model node (e.g., updating a Vision Prompt or Hysteresis setting).
Self-Validating: Code generation and compilation are tied directly to the systems model, failing early if an agent breaks an interface contract.

Negative / Costs

Increased Latency: Running forge codegen and validation provers on every execution step increases the local test cycle time.
Model Overhead: Developers (and agents) must maintain and update the ontology schema alongside codebase edits.

Verification / Spikes

Decomposition Spike: Trigger the decomposition of a dummy PPM Goal into a spoke issue, claim it with an agent, and verify that the resulting code changes map back to the target SysML block.
Hysteresis Spike: Vary the hysteresis parameter on a test run. Verify that low hysteresis triggers validation failures on draft commits, while high hysteresis permits them but flags them for final signoff.