
Intelligent Spigot: LiteLLM Beneath, Hypergraph Beside, Untool Above

Context and Problem Statement

backend-core already exposes an OpenAI-compatible LLM Gateway (/v1) that routes logical model ids through LiteLLM. The platform is also adding a hosted OpenAI-compatible API endpoint for the hypergraph object model, including a hyperbolic embedding space that can be queried directly.

The risk is flattening all intelligence into "model routing." LiteLLM is useful provider plumbing, but the hypergraph is not merely another chat provider. It is the platform's structured/geometric knowledge substrate: object identities, relations, neighborhoods, paths, and embedding-space retrieval.

How should the gateway architecture support OpenAI/Anthropic SDK compatibility, LiteLLM provider reach, and the hypergraph/object-model substrate without turning our differentiator into a generic provider string?

Decision Drivers

  • SDK compatibility — external clients should speak familiar OpenAI and, where valuable, Anthropic-compatible wire shapes.
  • Spigot intelligence — policy, intent classification, budget, guardrails, provenance, and retrieval happen before model execution.
  • Separation of concerns — LiteLLM normalizes model providers; the hypergraph serves structured/geometric knowledge; the Agent Gateway fronts MCP/A2A tools.
  • Moat protection — the Object Model, UDA, ontology, and hyperbolic embedding space are platform capabilities, not LiteLLM extensions.
  • Optional federation — tenants may use our native routing, a customer LiteLLM proxy, LiteLLM Enterprise, direct providers, OpenRouter, or local inference without changing the public client contract.
  • Supply-chain prudence — LiteLLM is high-value dependency code in the credential path, so keep it contained behind our boundary and pinned/audited.

Considered Options

  1. Untool intelligent spigot above multiple provider families (chosen) — keep one OpenAI/Anthropic-compatible external surface, then route internally across LiteLLM model providers, Hypergraph/Object Model providers, Agent Gateway MCP/A2A tools, and UDA data tools.
  2. Make the hypergraph a LiteLLM provider — convenient provider-string symmetry, but loses the distinction between "generate text" and "query the platform substrate."
  3. Fork LiteLLM and add Untool-native intelligence inside it — maximum control, but turns provider compatibility and security maintenance into our burden.
  4. Use LiteLLM proxy as the entire gateway — fast operationally, but makes the platform's policy graph, provenance, and object-model routing secondary to a third-party control plane.

Decision Outcome

Chosen: Option 1 — Untool owns the intelligent spigot. LiteLLM sits beneath it as a model adapter. The hypergraph/object-model endpoint sits beside LiteLLM as a first-class knowledge provider.

The public contract stays boring and compatible:

OpenAI / Anthropic compatible SDK
        |
Untool hosted API endpoint
        |
Intelligent Spigot
  - auth / tenant / policy
  - budget / FinOps
  - intent classification
  - guardrails
  - provenance / evidence requirements
        |
Router
  |              |                 |
LiteLLM          Hypergraph API     Agent Gateway
LLM providers    Object/Embedding   MCP/A2A tools
                 query provider
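
The diagram's ordering (spigot checks first, routing second) can be sketched as a chain of stages that each inspect a request and may reject it before anything reaches the Router. This is a minimal sketch, not the platform's implementation: `RequestContext`, `Rejected`, `check_budget`, `classify_intent`, and `run_spigot` are hypothetical names, and the keyword rule stands in for a real intent classifier.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class Rejected(Exception):
    """Raised by a stage to stop the request before anything is routed."""


@dataclass
class RequestContext:
    tenant: str
    prompt: str
    budget_remaining: float                         # hypothetical unit
    intent: Optional[str] = None
    trail: List[str] = field(default_factory=list)  # stages that passed, in order


Stage = Callable[[RequestContext], None]


def check_budget(ctx: RequestContext) -> None:
    if ctx.budget_remaining <= 0:
        raise Rejected("budget exhausted for tenant " + ctx.tenant)


def classify_intent(ctx: RequestContext) -> None:
    # Placeholder keyword rule standing in for a real classifier.
    ctx.intent = "object_query" if "object" in ctx.prompt.lower() else "chat"


def run_spigot(ctx: RequestContext, stages: List[Stage]) -> RequestContext:
    """Run each stage in order; record it on the trail only after it passes."""
    for stage in stages:
        stage(ctx)
        ctx.trail.append(stage.__name__)
    return ctx


ctx = run_spigot(
    RequestContext(tenant="acme", prompt="Show object 42", budget_remaining=5.0),
    [check_budget, classify_intent],
)
print(ctx.intent, ctx.trail)
```

Recording the trail only after a stage passes means a rejected request carries an exact record of which checks it cleared, which is one way the provenance requirement could be met.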

Architecture Rules

  • LiteLLM answers: which LLM should generate, reason, embed, or chat?
  • The hypergraph answers: which object, relation, neighborhood, path, or embedding-space region should ground the request?
  • The Agent Gateway answers: which external MCP/A2A capability should be invoked?
  • The UDA/Object Model answers: which typed data object should be hydrated from which backend?
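
One way to keep these four questions from blurring is to make the owning provider family an explicit lookup rather than a provider string. The sketch below is illustrative only; the step kinds (`generate`, `ground`, `invoke_tool`, `hydrate_object`) and the `family_for` helper are hypothetical names, not part of any existing router.

```python
from enum import Enum


class ProviderFamily(Enum):
    MODEL = "litellm"         # which LLM should generate, reason, or chat
    KNOWLEDGE = "hypergraph"  # which object, relation, path, or region grounds the request
    TOOLS = "agent_gateway"   # which external MCP/A2A capability to invoke
    DATA = "uda"              # which typed object to hydrate from which backend


# Hypothetical step kinds a router might emit while planning a request.
STEP_FAMILY = {
    "generate": ProviderFamily.MODEL,
    "ground": ProviderFamily.KNOWLEDGE,
    "invoke_tool": ProviderFamily.TOOLS,
    "hydrate_object": ProviderFamily.DATA,
}


def family_for(step_kind: str) -> ProviderFamily:
    """Return the family that owns a step kind; fail loudly on unknown kinds."""
    try:
        return STEP_FAMILY[step_kind]
    except KeyError:
        raise ValueError(f"no provider family owns step kind {step_kind!r}") from None


print(family_for("ground"))
```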

Do not force hypergraph semantics through LiteLLM as a fake chat model. Expose hypergraph access in three forms:

  1. OpenAI-compatible tool surface
       • hypergraph.query
       • hypergraph.neighbors
       • hypergraph.embed
       • hypergraph.path
       • hypergraph.explain
  2. Native object/query API
       • /v1/hypergraph/query
       • /v1/objects/{type}/{id}
       • /v1/embeddings/search
  3. Router-native retrieval provider
       1. classify request
       2. query hypergraph / UDA / Consensus / other evidence sources
       3. assemble context and provenance
       4. route model execution through LiteLLM or another model backend
       5. validate output
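
For the tool surface, a definition in the Chat Completions `tools` shape might look like the sketch below. Two caveats: the parameter schema (`object_id`, `limit`) is an assumption made for illustration, and OpenAI function names are restricted to letters, digits, underscores, and dashes (up to 64 characters), so a dotted primitive such as hypergraph.neighbors would need a wire name like `hypergraph_neighbors`. The `wire_name` and `tool_definition` helpers are hypothetical.

```python
import json
import re

# Pattern for OpenAI function names: letters, digits, "_" and "-", max 64 chars.
NAME_RE = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")


def wire_name(primitive: str) -> str:
    """Map a dotted primitive name to a name the tool-calling API accepts."""
    name = primitive.replace(".", "_")
    if not NAME_RE.fullmatch(name):
        raise ValueError(f"cannot derive a valid function name from {primitive!r}")
    return name


def tool_definition(primitive: str, description: str,
                    properties: dict, required: list) -> dict:
    """Build a Chat Completions style function tool definition."""
    return {
        "type": "function",
        "function": {
            "name": wire_name(primitive),
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


# Assumed schema: the real parameters of hypergraph.neighbors are not specified here.
NEIGHBORS_TOOL = tool_definition(
    "hypergraph.neighbors",
    "Return objects adjacent to a given object in the hypergraph.",
    {
        "object_id": {"type": "string"},
        "limit": {"type": "integer", "minimum": 1},
    },
    ["object_id"],
)

print(json.dumps(NEIGHBORS_TOOL, indent=2))
```

The Responses API uses a flatter tool shape, so the choice between /v1/responses and /v1/chat/completions would affect this builder.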

Provider Shape

The hypergraph should be represented as a sibling provider family, not as a LiteLLM provider:

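from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class KnowledgeTarget:
    slug: str
    kind: Literal["hypergraph", "vector", "object_model"]
    base_url: str
    capabilities: tuple[str, ...]
    secret_ref: str  # a reference to a secret, never the secret value
    cost_unit: str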
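
A registry over these targets could be as small as the sketch below. It repeats the class so the block runs on its own. `KnowledgeRegistry` and its methods are hypothetical, the frozen dataclass is one possible representation, and the slug, URL, secret reference, and capability names in the example entry are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Literal, Tuple


@dataclass(frozen=True)
class KnowledgeTarget:
    slug: str
    kind: Literal["hypergraph", "vector", "object_model"]
    base_url: str
    capabilities: Tuple[str, ...]
    secret_ref: str
    cost_unit: str


class KnowledgeRegistry:
    """In-memory registry keyed by slug (hypothetical helper)."""

    def __init__(self) -> None:
        self._targets: Dict[str, KnowledgeTarget] = {}

    def register(self, target: KnowledgeTarget) -> None:
        # Reject duplicates so a config override cannot silently shadow a target.
        if target.slug in self._targets:
            raise ValueError(f"duplicate knowledge target slug: {target.slug}")
        self._targets[target.slug] = target

    def get(self, slug: str) -> KnowledgeTarget:
        return self._targets[slug]

    def with_capability(self, capability: str) -> List[KnowledgeTarget]:
        return [t for t in self._targets.values() if capability in t.capabilities]


registry = KnowledgeRegistry()
registry.register(
    KnowledgeTarget(
        slug="hypergraph-main",
        kind="hypergraph",
        base_url="https://hypergraph.example.invalid/v1",  # placeholder URL
        capabilities=("query", "neighbors", "path"),       # placeholder names
        secret_ref="secrets/hypergraph-main",              # placeholder reference
        cost_unit="query",
    )
)
print([t.slug for t in registry.with_capability("path")])
```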

Routing policy should be able to name both retrieval and model requirements:

{
  "intent": "research_or_object_query",
  "requires": ["grounding", "citations"],
  "retrieval": ["hypergraph-main", "consensus-mcp"],
  "model_route": "best_reasoning"
}
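
A policy document like this is only useful if it is checked against what is actually registered. The sketch below parses the JSON, verifies the four keys shown above, and confirms that every retrieval slug and the model route are known. `RoutePolicy` and `parse_route_policy` are hypothetical names, and the rule that "grounding" needs at least one retrieval target is an assumption rather than a stated requirement.

```python
import json
from dataclasses import dataclass
from typing import Set, Tuple

REQUIRED_KEYS = {"intent", "requires", "retrieval", "model_route"}


@dataclass(frozen=True)
class RoutePolicy:
    intent: str
    requires: Tuple[str, ...]
    retrieval: Tuple[str, ...]
    model_route: str


def parse_route_policy(raw: str, known_targets: Set[str],
                       known_routes: Set[str]) -> RoutePolicy:
    """Parse a route policy and reject references to unknown targets or routes."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("route policy must be a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"route policy is missing keys: {sorted(missing)}")
    unknown = [slug for slug in data["retrieval"] if slug not in known_targets]
    if unknown:
        raise ValueError(f"unknown retrieval targets: {unknown}")
    if data["model_route"] not in known_routes:
        raise ValueError(f"unknown model route: {data['model_route']}")
    # Assumed rule: a policy that requires grounding must name a retrieval target.
    if "grounding" in data["requires"] and not data["retrieval"]:
        raise ValueError("'grounding' requires at least one retrieval target")
    return RoutePolicy(
        intent=data["intent"],
        requires=tuple(data["requires"]),
        retrieval=tuple(data["retrieval"]),
        model_route=data["model_route"],
    )


EXAMPLE = """{
  "intent": "research_or_object_query",
  "requires": ["grounding", "citations"],
  "retrieval": ["hypergraph-main", "consensus-mcp"],
  "model_route": "best_reasoning"
}"""

policy = parse_route_policy(
    EXAMPLE,
    known_targets={"hypergraph-main", "consensus-mcp"},
    known_routes={"best_reasoning"},
)
print(policy)
```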

Consequences

Positive

  • We keep OpenAI/Anthropic SDK adoption while owning policy and routing.
  • LiteLLM can be upgraded, replaced, federated, or isolated without redesigning the platform's knowledge layer.
  • Hypergraph/object-model retrieval becomes a first-class substrate capability, not a provider hack.
  • The platform can be "OpenRouter for work," routing among models, tools, object memory, graph structure, and embedding geometry.

Negative

  • We must build and maintain a small router-policy/control-plane layer.
  • We need capability metadata and tests across more than one provider family.
  • We cannot rely on LiteLLM's admin UI alone for platform-level provenance, ontology, UDA, or hypergraph routing.

Mitigations

  • Keep the external API OpenAI-compatible by default.
  • Add Anthropic-compatible pass-through only where the OpenAI shape loses important native semantics.
  • Pin/audit LiteLLM; do not fork unless a specific security, licensing, or protocol requirement forces it.
  • Keep Hypergraph/Object Model contracts explicit and generated where possible.

Follow-ups

  • Define KnowledgeTarget registry and config override shape.
  • Add route-policy schemas for retrieval + model selection.
  • Add OpenAI tool definitions for hypergraph query primitives.
  • Decide whether /v1/responses or /v1/chat/completions is the primary tool-calling surface for SDK users.
  • Add software composition analysis (SCA) gates before adopting the LiteLLM proxy as an internal managed service.