ARC-ADR-054 — Holographic Virtual Filesystem (HVFS)

One line: A database-backed, content-addressable virtual filesystem for agent swarms and human pairs that mirrors Git syntax command-for-command but operates at database transaction speeds.

Context and Problem Statement

Our agent armies (running Codex, Antigravity, and Claude Code) operate in isolated Git worktrees. This asynchronous developer model creates significant friction:
  • High Process Overhead: Repeatedly spawning git commit, git add, and git diff shell processes throttles the agent execution loops and wastes tokens and CPU cycles.
  • Checkout Latency: Checking out branches physically copies thousands of files on disk, causing high disk I/O wait times.
  • Lack of Real-Time Collaboration: Standard Git does not coordinate concurrent writes inside a single branch; multiple writers run into index locks or merge collisions.
  • Asymmetric Integration: Human operators and agent swarms cannot seamlessly pair-program on a shared workspace in real time.

We need a virtual filesystem layer that behaves like Git (content-addressable, DAG of commits, branching/merging) but runs at database transaction speeds and supports real-time multi-agent collaboration.

Decision Drivers

  • Zero Friction Migration: The VFS interface must be immediately usable by existing human teams and pre-trained LLM code agents without modifying their core instructions.
  • Performance & Scale: Branching, commits, diffing, and merging must be metadata-only operations that complete in under 100 ms.
  • Concurrency & Collision Safety: The filesystem must coordinate parallel writers inside a shared workspace.
  • High Availability (HA): Storage must be durable, distributed, and highly available across all cloud spokes.

Considered Options

Option A — Standard Git via In-Memory Git Libraries (libgit2 / Go-Git)

Run standard Git operations but bypass shell execution by using in-memory Git bindings directly inside our Python and C# runtimes.

Pros:
  • Standard Git compatibility.
  • No new database dependencies.

Cons:
  • Still suffers from file index locks during concurrent commits.
  • No native path-level locking or real-time event sync.
  • High memory usage when handling large repository checkouts in memory.

Option B — Holographic Virtual Filesystem (HVFS) via lakeFS Integration (Chosen)

Separate file content (staged locally in SQLite write-buffers and committed to cloud object storage) from the version history DAG (managed by lakeFS metadata).

Pros:
  • Zero Checkout Copying: Switching branches updates metadata pointers in under a millisecond, without physical disk writes.
  • High Concurrency: Supports multiple parallel writers by isolating staged changes in local client transaction buffers.
  • Git CLI Command Mirroring: Mirrors standard Git syntax (status, add, commit, checkout, merge) command-for-command, ensuring immediate compatibility.
  • Durable HA Base: Physical storage is delegated to distributed object storage (MinIO/Azure Blob), inheriting the durability of the underlying object store.

Cons:
  • Requires maintaining a lakeFS metadata service instance.
  • Requires local staging DB management on the agent runner.


Decision

Adopt Option B (Holographic Virtual Filesystem). Implement the VFS metadata and file storage layer using lakeFS over object storage, wrapped in a Git-mirroring CLI and MCP tools client inside commons-core.

1. Git Command Mirroring Invariant

The ut vfs CLI and Model Context Protocol (MCP) tools match standard Git syntax command-for-command to ensure human and agent compatibility:

| Command | Action | Under-the-Hood Operation |
| --- | --- | --- |
| ut vfs checkout <branch> | Branch switch | Updates local VFS client branch pointers. |
| ut vfs status | View staged changes | Queries the local SQLite staging database. |
| ut vfs add <file> | Stage modification | Records the path and file hash in staging metadata. |
| ut vfs commit -m "<msg>" | Create commit | Uploads staged blobs in parallel and appends a DAG node. |
| ut vfs diff <branch> | Compare states | Diffs the path-to-hash maps of the two states. |
| ut vfs merge <source> | Merge branches | Performs metadata-only fast-forward/three-way merges. |
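
To make the "hash map diff" behind ut vfs diff concrete, here is a minimal sketch. It assumes each branch state can be represented as a mapping from path to content hash; the Manifest type and the diff_manifests function are hypothetical names for illustration, not part of the ut vfs client or of lakeFS.

```python
from typing import Dict, List

# Hypothetical shape: one branch state as {path: content hash}.
Manifest = Dict[str, str]


def diff_manifests(base: Manifest, other: Manifest) -> Dict[str, List[str]]:
    """Compare two manifests by hash only; file contents are never read."""
    added = sorted(p for p in other if p not in base)
    removed = sorted(p for p in base if p not in other)
    modified = sorted(p for p in base if p in other and base[p] != other[p])
    return {"added": added, "removed": removed, "modified": modified}


main = {"app/a.py": "h1", "app/b.py": "h2", "README.md": "h3"}
feature = {"app/a.py": "h1", "app/b.py": "h9", "app/c.py": "h4"}
result = diff_manifests(main, feature)
```

Because content is addressed by hash, equal hashes imply equal content, so the comparison touches only metadata and scales with the number of paths rather than the size of the files.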

2. High-Performance Local Staging Buffer

The client avoids uploading files individually during agent coding loops:
  1. Mutations (writes/deletions) are hashed (SHA-256) and staged in a local SQLite database (.vfs_staging.db) in the agent workspace.
  2. Only when commit is invoked are the new blobs uploaded in a parallel batch to the object store, followed by a single commit transaction to lakeFS.
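
The two steps above can be sketched as follows, using only the Python standard library. This is an illustrative sketch under stated assumptions, not the commons-core implementation: the StagingBuffer class, its table schema, and the upload_blob and record_commit callbacks are hypothetical stand-ins for the object-store upload and the lakeFS commit call, whose real client APIs are not shown in this ADR.

```python
import hashlib
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional, Tuple


class StagingBuffer:
    """Hypothetical local staging buffer backed by SQLite."""

    def __init__(self, db_path: str = ":memory:") -> None:
        # A real client would pass the workspace path to .vfs_staging.db.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS staged ("
            "path TEXT PRIMARY KEY, sha256 TEXT, content BLOB)"
        )

    def stage_write(self, path: str, content: bytes) -> str:
        """Hash the content and record it locally; nothing is uploaded yet."""
        digest = hashlib.sha256(content).hexdigest()
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO staged VALUES (?, ?, ?)",
                (path, digest, content),
            )
        return digest

    def stage_delete(self, path: str) -> None:
        """Record a deletion as a row with no hash and no content."""
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO staged VALUES (?, NULL, NULL)", (path,)
            )

    def status(self) -> List[Tuple[str, Optional[str]]]:
        return self.conn.execute(
            "SELECT path, sha256 FROM staged ORDER BY path"
        ).fetchall()

    def commit(
        self,
        upload_blob: Callable[[str, bytes], None],
        record_commit: Callable[[str, Dict[str, Optional[str]]], str],
        message: str,
    ) -> str:
        rows = self.conn.execute(
            "SELECT path, sha256, content FROM staged"
        ).fetchall()
        # Deduplicate by hash so identical content is uploaded once.
        blobs = {sha: content for _, sha, content in rows if sha is not None}
        with ThreadPoolExecutor(max_workers=8) as pool:
            # list() forces completion and re-raises any upload error.
            list(pool.map(lambda item: upload_blob(*item), blobs.items()))
        changes = {path: sha for path, sha, _ in rows}
        commit_id = record_commit(message, changes)  # single metadata call
        with self.conn:
            self.conn.execute("DELETE FROM staged")
        return commit_id
```

In this sketch the staging rows are cleared only after record_commit returns, so a failed upload or commit leaves the staged changes intact for a retry.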

3. Dynamic Teaming Extensions

We extend standard Git semantics to support live collaboration:
  • Exclusive Path Locks (ut vfs lock <path>): Agents assert exclusive locks on specific subdirectories (e.g., locking app/api/admin/ while writing a BFF route) to prevent write collisions. Locks are stored in ArcadeDB with an auto-expiring TTL.
  • Live Sync (NATS-driven pairing): VFS clients can join live branches via ut vfs join <branch> --session=<id>. Changes are broadcast as patch events over NATS topics (fleet.vfs.mutations), updating the memory caches of sibling agents and human pairs in real time.
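
The exclusive path lock semantics can be illustrated with a small in-memory sketch. It assumes that a lock on a directory also covers everything beneath it and that expired locks are simply ignored; the PathLockTable class and its methods are hypothetical, and the real locks live in ArcadeDB rather than in a Python dictionary.

```python
import time
from typing import Callable, Dict, Tuple


class PathLockTable:
    """Hypothetical in-memory stand-in for the TTL lock store."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # locked path -> (owner, expiry timestamp)
        self._locks: Dict[str, Tuple[str, float]] = {}

    @staticmethod
    def _overlaps(a: str, b: str) -> bool:
        # Two paths conflict when one is the other or lies beneath it.
        a, b = a.rstrip("/") + "/", b.rstrip("/") + "/"
        return a.startswith(b) or b.startswith(a)

    def acquire(self, path: str, owner: str, ttl_seconds: float) -> bool:
        now = self._clock()
        # Drop locks whose TTL has elapsed before checking for conflicts.
        self._locks = {p: v for p, v in self._locks.items() if v[1] > now}
        for held_path, (held_owner, _) in self._locks.items():
            if held_owner != owner and self._overlaps(path, held_path):
                return False
        self._locks[path] = (owner, now + ttl_seconds)
        return True

    def release(self, path: str, owner: str) -> bool:
        held = self._locks.get(path)
        if held is not None and held[0] == owner:
            del self._locks[path]
            return True
        return False


# Usage with a fake clock so expiry is deterministic.
now = [0.0]
table = PathLockTable(clock=lambda: now[0])
first = table.acquire("app/api/admin/", "agent-a", ttl_seconds=30)
blocked = table.acquire("app/api/admin/users.py", "agent-b", ttl_seconds=30)
now[0] = 31.0  # agent-a's lock has expired
after_expiry = table.acquire("app/api/admin/users.py", "agent-b", ttl_seconds=30)
```

The TTL means a crashed agent cannot hold a subtree indefinitely; a production version would need the conflict check and the write to happen atomically in the lock store.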


Consequences

  • + Zero-Copy Branching: Creating or switching a branch is a metadata write, so it is near-instant and copies no files.
  • + Agent Muscle Memory: Existing agents immediately know how to use ut vfs because the verbs map directly to standard Git CLI training.
  • + Lock-Free Workspaces: Local buffers prevent concurrent writers from colliding or locking the index file.
  • − Additional Network Hop: Commit operations require network requests to lakeFS and object storage (mitigated by batch staging).
  • − Metadata Sync Complexity: Requires configuring ArcadeDB coordination locks and NATS event streams for live collaboration sessions.

Alternatives Considered

  • Git over NFS/DFS: Rejected due to high network filesystem latency, lock-file corruption risks, and lack of content-addressable deduplication.
  • Dolt Relational Versioning: Rejected for general file systems (Dolt requires structured tables/schemas) but retained as a specialized candidate for structured data.