Local deploy for middle-core consumption¶
backend-core runs as application-tier containers in docker-compose.yml
(ARC-ADR-023 / docs/adr/0004-container-tiering.md). The ArcadeDB platform
service is hub-owned — bring the hub's templates/local-stack/ up first
(it runs ArcadeDB on the host) and then start backend-core; they reach each
other via host.docker.internal:2480.
One-time setup¶
cp .env.example .env
cp .secrets/arcadedb_password.txt.example .secrets/arcadedb_password.txt
In .env, set the provider keys for whichever paths you'll exercise:
| Variable | Used by | Models it unlocks |
|---|---|---|
CEREBRAS_API_KEY |
LLM gateway (/v1) |
llama-3.3-70b, llama3.1-8b, … |
ANTHROPIC_API_KEY |
LLM gateway (/v1) |
claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5 |
AZURE_EMBED_API_KEY |
UDA embeddings + /v1/embeddings |
embed-v-4-0 (Cohere Embed v4 on Foundry) |
AUTH_JWT_SECRET (defaults to dev-only-change-me) is what the dev console
mints local tokens against; for production deployments switch to OIDC/JWKS
(AUTH_JWKS_URL + AUTH_ISSUER + AUTH_AUDIENCE) and set
APP_ENV=production.
Up¶
First bring the hub's local-stack up (gives you ArcadeDB on
host:2480), then this repo's compose:
# in the AgentArmy hub clone:
( cd templates/local-stack && docker compose up -d )
# in this repo:
docker compose up --build
This repo's compose brings up two application-tier containers:
| Service | Port | Surface |
|---|---|---|
backend-core |
8000 |
FastAPI: /v1/* (LLM gateway, ARC-ADR-021) + /api/v1/* (Universal Data Adapter, ADR 0001) + /console, /health/* |
backend-core-rust-v2 |
8080 |
Fast /api/v2 track (Axum) — in-memory today |
backend-core reaches ArcadeDB at ARCADEDB_URL (default
http://host.docker.internal:2480 — the hub local-stack on the host). The
LLM gateway routes do not depend on ArcadeDB, so if you only want the
model surface you can skip the hub local-stack and ignore the 503 on
/health/ready.
Verify¶
Visual — open http://localhost:8000/console in a browser, click Mint dev token, then List models, then send a streaming chat.
Programmatic — run the included smoke script (stdlib only, no install needed):
python scripts/smoke_gateway.py
# or:
python scripts/smoke_gateway.py --chat-model claude-sonnet-4-6
python scripts/smoke_gateway.py --base https://backend-core.<...>.azurecontainerapps.io
The script hits /health/live + /health/ready, mints a dev JWT via
POST /api/v1/_dev/token (404 in production), lists /v1/models, and
optionally calls /v1/chat/completions if --chat-model is passed.
What middle-core points at¶
| Path | Used by middle-core for |
|---|---|
http://localhost:8000/v1 |
OpenAI-compatible LLM gateway. Wire LangGraph's ChatOpenAI(base_url=…) or CopilotKit's openaiApiBase here. |
http://localhost:8000/v1/embeddings |
Cohere Embed v4 (and any other registered embedding model) on the OpenAI shape. |
http://localhost:8000/api/v1 |
Universal Data Adapter: connections, queries, ingestion, search. |
http://localhost:8000/console |
The browser test UI; handy for verifying changes end-to-end without spinning up middle-core. |
Auth — forward the user JWT (ADR-002) on every request. For local dev,
POST /api/v1/_dev/token returns an HS256 token signed with AUTH_JWT_SECRET
(the endpoint hard-disables itself when APP_ENV=production).
Compose networking¶
If middle-core also runs in docker compose, the simplest path is to add it to the same Compose project (or join its network) and call backend-core by service name:
# middle-core's compose.yml (sketch)
services:
middle-core:
# ...
environment:
OPENAI_API_BASE: http://backend-core:8000/v1
OPENAI_API_KEY: ${BACKEND_CORE_JWT} # JWT from /api/v1/_dev/token
UDA_BASE: http://backend-core:8000/api/v1
networks: [default, backend-core_default]
networks:
backend-core_default:
external: true
Running middle-core on the host instead works the same way — just use
http://localhost:8000 since port 8000 is published.
Tearing down¶
docker compose down -v # also drops the ArcadeDB volume