Intelligent Spigot: LiteLLM Beneath, Hypergraph Beside, Untool Above
Context and Problem Statement
backend-core already exposes an OpenAI-compatible LLM Gateway (/v1) that
routes logical model ids through LiteLLM. The platform is also adding a hosted
OpenAI-compatible API endpoint for the hypergraph object model, including a
hyperbolic embedding space that can be queried directly.
The risk is flattening all intelligence into "model routing." LiteLLM is useful provider plumbing, but the hypergraph is not merely another chat provider. It is the platform's structured/geometric knowledge substrate: object identities, relations, neighborhoods, paths, and embedding-space retrieval.
How should the gateway architecture support OpenAI/Anthropic SDK compatibility, LiteLLM provider reach, and the hypergraph/object-model substrate without turning our differentiator into a generic provider string?
Decision Drivers
- SDK compatibility — external clients should speak familiar OpenAI and, where valuable, Anthropic-compatible wire shapes.
- Spigot intelligence — policy, intent classification, budget, guardrails, provenance, and retrieval happen before model execution.
- Separation of concerns — LiteLLM normalizes model providers; the hypergraph serves structured/geometric knowledge; the Agent Gateway fronts MCP/A2A tools.
- Moat protection — the Object Model, UDA, ontology, and hyperbolic embedding space are platform capabilities, not LiteLLM extensions.
- Optional federation — tenants may use our native routing, a customer LiteLLM proxy, LiteLLM Enterprise, direct providers, OpenRouter, or local inference without changing the public client contract.
- Supply-chain prudence — LiteLLM is high-value dependency code in the credential path, so keep it contained behind our boundary and pinned/audited.
Considered Options
- Untool intelligent spigot above multiple provider families (chosen) — keep one OpenAI/Anthropic-compatible external surface, then route internally across LiteLLM model providers, Hypergraph/Object Model providers, Agent Gateway MCP/A2A tools, and UDA data tools.
- Make the hypergraph a LiteLLM provider — convenient provider-string symmetry, but loses the distinction between "generate text" and "query the platform substrate."
- Fork LiteLLM and add Untool-native intelligence inside it — maximum control, but turns provider compatibility and security maintenance into our burden.
- Use LiteLLM proxy as the entire gateway — fast operationally, but makes the platform's policy graph, provenance, and object-model routing secondary to a third-party control plane.
Decision Outcome
Chosen: Option 1 — Untool owns the intelligent spigot. LiteLLM sits beneath it as a model adapter. The hypergraph/object-model endpoint sits beside LiteLLM as a first-class knowledge provider.
The public contract stays boring and compatible:
           OpenAI / Anthropic compatible SDK
                          |
              Untool hosted API endpoint
                          |
                  Intelligent Spigot
            - auth / tenant / policy
            - budget / FinOps
            - intent classification
            - guardrails
            - provenance / evidence requirements
                          |
                        Router
          |               |                  |
    LiteLLM          Hypergraph API     Agent Gateway
    LLM providers    Object/Embedding   MCP/A2A tools
                     query provider
Architecture Rules
- LiteLLM answers: which LLM should generate, reason, embed, or chat?
- The hypergraph answers: which object, relation, neighborhood, path, or embedding-space region should ground the request?
- The Agent Gateway answers: which external MCP/A2A capability should be invoked?
- The UDA/Object Model answers: which typed data object should be hydrated from which backend?
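The four questions above amount to a dispatch table from what a request needs to which provider family serves it. A minimal sketch, assuming hypothetical need labels (`generate`, `ground`, `invoke_tool`, `hydrate_object`) that this ADR does not define:

```python
from enum import Enum


class ProviderFamily(Enum):
    """The four sibling families named by the architecture rules."""
    LITELLM = "litellm"              # which LLM generates, reasons, embeds, or chats
    HYPERGRAPH = "hypergraph"        # which object, relation, path, or region grounds
    AGENT_GATEWAY = "agent_gateway"  # which external MCP/A2A capability is invoked
    UDA = "uda"                      # which typed data object is hydrated


# Hypothetical need labels, chosen only for illustration.
NEED_TO_FAMILY = {
    "generate": ProviderFamily.LITELLM,
    "ground": ProviderFamily.HYPERGRAPH,
    "invoke_tool": ProviderFamily.AGENT_GATEWAY,
    "hydrate_object": ProviderFamily.UDA,
}


def families_for(needs: list[str]) -> list[ProviderFamily]:
    """Return the provider families a request touches, in the order given."""
    unknown = [n for n in needs if n not in NEED_TO_FAMILY]
    if unknown:
        raise ValueError(f"unknown needs: {unknown}")
    return [NEED_TO_FAMILY[n] for n in needs]
```

A grounded chat request would resolve to the hypergraph first and LiteLLM second via `families_for(["ground", "generate"])`; nothing in the table lets the hypergraph masquerade as a model provider.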
Do not force hypergraph semantics through LiteLLM as a fake chat model. Expose hypergraph access in three forms:
- OpenAI-compatible tool surface
  - hypergraph.query
  - hypergraph.neighbors
  - hypergraph.embed
  - hypergraph.path
  - hypergraph.explain
- Native object/query API
  - /v1/hypergraph/query
  - /v1/objects/{type}/{id}
  - /v1/embeddings/search
- Router-native retrieval provider
  - classify request
  - query hypergraph / UDA / Consensus / other evidence sources
  - assemble context and provenance
  - route model execution through LiteLLM or another model backend
  - validate output
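The five router-native retrieval steps can be sketched as one function with injected stages. This is an illustration under assumptions, not the router's actual interface: `Evidence`, `Answer`, `run_retrieval_route`, and the callable signatures are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Evidence:
    source: str  # slug of the evidence source, e.g. "hypergraph-main"
    text: str    # retrieved content to place in the model context


@dataclass
class Answer:
    text: str
    intent: str
    provenance: list[str] = field(default_factory=list)


def run_retrieval_route(
    prompt: str,
    classify: Callable[[str], str],
    retrievers: dict[str, Callable[[str], list[Evidence]]],
    generate: Callable[[str], str],
    validate: Callable[[str, list[Evidence]], bool],
) -> Answer:
    # 1. Classify the request.
    intent = classify(prompt)
    # 2. Query every configured evidence source (hypergraph, UDA, ...).
    evidence: list[Evidence] = []
    for retrieve in retrievers.values():
        evidence.extend(retrieve(prompt))
    # 3. Assemble context, tagging each passage with its source.
    context = "\n".join(f"[{e.source}] {e.text}" for e in evidence)
    # 4. Route model execution; `generate` stands in for LiteLLM or another backend.
    text = generate(f"intent: {intent}\n{context}\n\n{prompt}")
    # 5. Validate the output against the evidence before returning it.
    if not validate(text, evidence):
        raise ValueError("model output failed validation")
    return Answer(text=text, intent=intent, provenance=[e.source for e in evidence])
```

Passing the model call in as `generate` keeps LiteLLM beneath the pipeline: swapping the backend changes one argument, not the retrieval or provenance logic.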
Provider Shape
The hypergraph should be represented as a sibling provider family, not as a LiteLLM provider:
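    from dataclasses import dataclass
    from typing import Literal


    @dataclass(frozen=True)  # assumption: an immutable record; only the fields are specified here
    class KnowledgeTarget:
        slug: str  # identifier that routing policy refers to, e.g. "hypergraph-main"
        kind: Literal["hypergraph", "vector", "object_model"]
        base_url: str
        capabilities: tuple[str, ...]
        secret_ref: str
        cost_unit: str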
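A registry over this shape might look like the sketch below. The class is repeated so the block runs on its own; `KnowledgeRegistry`, its methods, the base URL, the secret reference, and the capability names are placeholders for illustration, not decided values.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class KnowledgeTarget:
    slug: str
    kind: Literal["hypergraph", "vector", "object_model"]
    base_url: str
    capabilities: tuple[str, ...]
    secret_ref: str
    cost_unit: str


class KnowledgeRegistry:
    """Hypothetical in-memory registry keyed by slug."""

    def __init__(self) -> None:
        self._targets: dict[str, KnowledgeTarget] = {}

    def register(self, target: KnowledgeTarget) -> None:
        # Reject duplicates so a config override has to be explicit.
        if target.slug in self._targets:
            raise ValueError(f"duplicate knowledge target: {target.slug}")
        self._targets[target.slug] = target

    def get(self, slug: str) -> KnowledgeTarget:
        return self._targets[slug]

    def with_capability(self, capability: str) -> list[KnowledgeTarget]:
        # Targets are returned in registration order.
        return [t for t in self._targets.values() if capability in t.capabilities]


registry = KnowledgeRegistry()
registry.register(
    KnowledgeTarget(
        slug="hypergraph-main",
        kind="hypergraph",
        base_url="https://hypergraph.example.internal/v1",  # placeholder URL
        capabilities=("query", "neighbors", "embed", "path", "explain"),
        secret_ref="secrets/hypergraph-main",  # placeholder reference
        cost_unit="query",  # placeholder unit
    )
)
```

The router can then ask `registry.with_capability("path")` instead of parsing provider strings, which is the practical difference between a sibling provider family and a LiteLLM model id.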
Routing policy should be able to name both retrieval and model requirements:
    {
      "intent": "research_or_object_query",
      "requires": ["grounding", "citations"],
      "retrieval": ["hypergraph-main", "consensus-mcp"],
      "model_route": "best_reasoning"
    }
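A policy like this is only useful if every name in it resolves. The sketch below parses the example and checks its retrieval slugs and model route against known sets; `RoutePolicy`, `parse_policy`, and the two `known_*` parameters are hypothetical, and a real schema may well carry more fields.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutePolicy:
    intent: str
    requires: tuple[str, ...]
    retrieval: tuple[str, ...]
    model_route: str


REQUIRED_KEYS = {"intent", "requires", "retrieval", "model_route"}


def parse_policy(raw: str, known_targets: set[str], known_routes: set[str]) -> RoutePolicy:
    """Parse a JSON routing policy and reject names that do not resolve."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"policy missing keys: {sorted(missing)}")
    unknown = [slug for slug in data["retrieval"] if slug not in known_targets]
    if unknown:
        raise ValueError(f"unknown retrieval targets: {unknown}")
    if data["model_route"] not in known_routes:
        raise ValueError(f"unknown model route: {data['model_route']}")
    return RoutePolicy(
        intent=data["intent"],
        requires=tuple(data["requires"]),
        retrieval=tuple(data["retrieval"]),
        model_route=data["model_route"],
    )


POLICY = """
{
  "intent": "research_or_object_query",
  "requires": ["grounding", "citations"],
  "retrieval": ["hypergraph-main", "consensus-mcp"],
  "model_route": "best_reasoning"
}
"""

policy = parse_policy(
    POLICY,
    known_targets={"hypergraph-main", "consensus-mcp"},
    known_routes={"best_reasoning"},
)
```

Validating at load time means a typo in a retrieval slug fails when the policy is deployed rather than when a tenant's request is mid-flight.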
Consequences
Positive
- We keep OpenAI/Anthropic SDK adoption while owning policy and routing.
- LiteLLM can be upgraded, replaced, federated, or isolated without redesigning the platform's knowledge layer.
- Hypergraph/object-model retrieval becomes a first-class substrate capability, not a provider hack.
- The platform can be "OpenRouter for work," routing among models, tools, object memory, graph structure, and embedding geometry.
Negative
- We must build and maintain a small router-policy/control-plane layer.
- We need capability metadata and tests across more than one provider family.
- We cannot rely on LiteLLM's admin UI alone for platform-level provenance, ontology, UDA, or hypergraph routing.
Mitigations
- Keep the external API OpenAI-compatible by default.
- Add Anthropic-compatible pass-through only where OpenAI shape loses important native semantics.
- Pin/audit LiteLLM; do not fork unless a specific security, licensing, or protocol requirement forces it.
- Keep Hypergraph/Object Model contracts explicit and generated where possible.
Follow-ups
- Define KnowledgeTarget registry and config override shape.
- Add route-policy schemas for retrieval + model selection.
- Add OpenAI tool definitions for hypergraph query primitives.
- Decide whether /v1/responses or /v1/chat/completions is the primary tool-calling surface for SDK users.
- Add SCA gates before adopting LiteLLM proxy as an internal managed service.
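As a starting point for the tool-definition follow-up, the sketch below builds one definition in the Chat Completions `tools` shape. Two assumptions to confirm: the parameter names (`object_id`, `limit`) and the description are invented for illustration, and the dotted names are mapped to underscores because, as far as we know, OpenAI function names accept only letters, digits, underscores, and hyphens.

```python
import json

# The five primitives listed under the OpenAI-compatible tool surface.
PRIMITIVES = ("query", "neighbors", "embed", "path", "explain")


def tool_name(primitive: str) -> str:
    """Map a primitive such as "neighbors" to a wire name such as "hypergraph_neighbors"."""
    if primitive not in PRIMITIVES:
        raise ValueError(f"unknown hypergraph primitive: {primitive}")
    return f"hypergraph_{primitive}"


def tool_definition(primitive: str, description: str, parameters: dict) -> dict:
    """Build a function tool in the Chat Completions `tools` shape."""
    return {
        "type": "function",
        "function": {
            "name": tool_name(primitive),
            "description": description,
            "parameters": parameters,  # a JSON Schema object
        },
    }


# Hypothetical parameters; the hypergraph API contract is not defined in this ADR.
NEIGHBORS_TOOL = tool_definition(
    "neighbors",
    "Return objects adjacent to a given object in the hypergraph.",
    {
        "type": "object",
        "properties": {
            "object_id": {"type": "string"},
            "limit": {"type": "integer", "minimum": 1},
        },
        "required": ["object_id"],
    },
)

print(json.dumps(NEIGHBORS_TOOL, indent=2))
```

If /v1/responses becomes the primary surface, the definition would need reshaping, since that API describes function tools with a flatter structure than the nested `function` object used here.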