ARC-ADR-006: Human-in-the-loop confirmation for destructive operations

Metadata

Field	Value
ID	ARC-ADR-006
Status	Proposed
Date	2026-05-25
Deciders	Architecture Review
Supersedes	—
Superseded by	—
Tags	hitl, copilotkit, destructive-ops, rbac, safety, generative-ui

Spoke-authored draft. Epic #12 and issue #17 reference this ADR as proposed at the hub path docs/decisions/ARC-ADR-006-hitl-destructive-ops.md, but it has not yet been published there. This file mirrors that ID and filename so the issue links resolve once it is upstreamed.

Context and Problem Statement

Phase 3 (issue #17) lets users manage sources through the copilot, including deleting a source. Delete is destructive and, from the user's point of view, irreversible. It is also triggered through a natural-language interface where the agent infers intent, a setting prone to misinterpretation (for example, "clean up these" deleting more than the user intended). We need a pattern that prevents an LLM-mediated request from destroying data without an explicit, informed human action, while keeping authorization authoritative.

Two distinct concerns must not be conflated:

Authorization (may this user delete at all?) — owned by backend-core RBAC (admin role), reached via the forwarded JWT (ARC-ADR-002). This is the security boundary.
Confirmation (does the user really want this specific deletion now?) — a UX safety gate at the moment of action. This ADR is primarily about the second, layered on the first.

Note that this is a different kind of HITL from hub ARC-ADR-001 (HITL decision-point pattern), which covers board-level governance for agents escalating decisions to humans across sessions. ARC-ADR-006 is in-app, end-user HITL: an interactive confirmation card in the chat stream.

Decision Drivers

#	Driver
D1	A destructive op must never execute on inferred intent alone — it requires an explicit, deliberate human confirmation.
D2	Authorization stays authoritative in backend-core (admin role); the UX gate must not become the security boundary.
D3	The confirmation must clearly state what will be deleted before the user consents (informed consent).
D4	Non-admins must be stopped before being offered a confirmation they cannot fulfill (clear permission-denied, not a dead-end button).
D5	On rejection, the data is preserved and the user is told; on approval, the delete proceeds.
D6	Use a supported CopilotKit primitive so the pause/confirm/resume integrates with agent state.

Considered Options

1. renderAndWaitForResponse HITL confirmation card, layered on backend RBAC (chosen): the delete tool's useCopilotAction renders a confirmation card and pauses the agent until the user explicitly confirms or rejects; backend-core still enforces admin.
2. Backend RBAC only, no UX gate: rely solely on backend-core's admin check; the agent deletes immediately if authorized.
3. Client-side guard only: a frontend confirmation/window.confirm, with no agent-state pause and trusting the client for gating.
4. Type-to-confirm modal outside the agent flow: a standard React modal (type the source name) decoupled from CopilotKit's action/state lifecycle.

Decision Outcome

Option 1 (renderAndWaitForResponse confirmation card, layered on backend RBAC) is adopted.

The delete source flow:

1. Pre-check role (D4). Before offering deletion, the UI checks the user's role (from the session/JWT). Non-admins get a clear permission-denied message instead of a confirmation card.
2. Render + wait (D1, D3). The delete_source action is registered with useCopilotAction({ renderAndWaitForResponse }). When the agent proposes a delete, the chat stream renders a confirmation card naming exactly what will be deleted, with an explicit Confirm button (and Cancel). The agent pauses: respond() is not called until the user acts.
3. Resolve (D5). On Confirm, respond() is called with an approval, the agent proceeds, and the delete tool runs in middle-core, which calls backend-core. On Cancel, respond() is called with a rejection, the source is preserved, and the user is notified.
4. Authoritative enforcement (D2). Independent of the card, backend-core's require_principal admin gate is the real authorization check (reached via the forwarded JWT, ARC-ADR-002). The card is a safety gate, not the security boundary: even if the card were bypassed, backend-core still denies non-admins.
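
The client-side part of this flow (pre-check, render and wait, resolve) can be sketched as a small state machine. This is a minimal model in plain TypeScript, not CopilotKit's API: `proposeDelete`, `FlowState`, the `"member"` role value, and the `runDelete` callback (a stand-in for the delete tool in middle-core) are hypothetical names introduced for illustration. What it aims to show is that the delete callback is unreachable until `respond` is invoked with an approval. Backend enforcement is deliberately left out, since it lives in backend-core and does not depend on this flow.

```typescript
// Hypothetical model of the delete flow; none of these names come from CopilotKit.
type Role = "admin" | "member"; // "member" is an assumed non-admin role
type Decision = "approved" | "rejected";

interface DeleteProposal {
  sourceId: string;
  sourceName: string;
}

type FlowState =
  | { kind: "denied"; message: string }
  | { kind: "awaiting"; cardText: string; respond: (decision: Decision) => FlowState }
  | { kind: "deleted"; sourceId: string }
  | { kind: "preserved"; sourceId: string; message: string };

function proposeDelete(
  role: Role,
  proposal: DeleteProposal,
  runDelete: (sourceId: string) => void, // stand-in for the middle-core delete tool
): FlowState {
  // Pre-check role. A UX hint only; it is not the authorization decision.
  if (role !== "admin") {
    return { kind: "denied", message: "Deleting sources requires the admin role." };
  }

  // Render + wait. Nothing below runs until respond() is called.
  let settled = false;
  return {
    kind: "awaiting",
    cardText: `Permanently delete "${proposal.sourceName}"? This cannot be undone.`,
    respond: (decision: Decision): FlowState => {
      if (settled) {
        throw new Error("respond() may only be called once per proposal");
      }
      settled = true;

      // Resolve. Only an explicit approval reaches the delete call.
      if (decision === "approved") {
        runDelete(proposal.sourceId);
        return { kind: "deleted", sourceId: proposal.sourceId };
      }
      return {
        kind: "preserved",
        sourceId: proposal.sourceId,
        message: `"${proposal.sourceName}" was not deleted.`,
      };
    },
  };
}
```

Modeling the pending state as a value that carries its own `respond` function keeps the delete call structurally behind the user's decision, which is the property D1 asks for.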

Confirmation Criteria

Selecting a source and asking to delete it renders the HITL confirmation card via renderAndWaitForResponse; the card states what will be deleted and requires an explicit Confirm press.
The agent does not proceed until the user responds; no delete request reaches middle-core before confirmation.
A non-admin sees a permission-denied message and is not shown a functional confirmation card.
On approval the delete proceeds; on rejection the source still exists and the user is notified.
A forged/bypassed client request to delete as a non-admin is still rejected by backend-core (the UX gate is not the security boundary) — verifiable once backend-core is reachable.
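
The last criterion depends on backend-core, which may not be reachable from this repository. As a rough sketch of the property under test, the stand-in below bases its decision only on the authenticated principal's roles and ignores any client-supplied confirmation flag. `FakeBackendCore`, `Principal`, `DeleteRequest`, the `confirmedInUi` field, and the result shape are all hypothetical; this is not backend-core's actual require_principal implementation.

```typescript
// Hypothetical in-memory stand-in for backend-core; for illustration only.
interface Principal {
  subject: string;
  roles: string[]; // assumed to come from the verified, forwarded JWT
}

interface DeleteRequest {
  sourceId: string;
  confirmedInUi?: boolean; // client-supplied, therefore untrusted
}

type DeleteResult =
  | { ok: true }
  | { ok: false; reason: "forbidden" | "not_found" };

class FakeBackendCore {
  private sources: string[];

  constructor(sourceIds: string[]) {
    this.sources = sourceIds.slice();
  }

  has(sourceId: string): boolean {
    return this.sources.indexOf(sourceId) !== -1;
  }

  deleteSource(principal: Principal, request: DeleteRequest): DeleteResult {
    // Authorization looks only at the principal; confirmedInUi is never consulted.
    const isAdmin = principal.roles.some((role) => role === "admin");
    if (!isAdmin) {
      return { ok: false, reason: "forbidden" };
    }
    if (!this.has(request.sourceId)) {
      return { ok: false, reason: "not_found" };
    }
    this.sources = this.sources.filter((id) => id !== request.sourceId);
    return { ok: true };
  }
}
```

A forged request that sets `confirmedInUi: true` with a non-admin principal still comes back as `forbidden`, and the source remains in place.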

Pros and Cons

Option 1 — renderAndWaitForResponse + backend RBAC (chosen)

Pros:

Defense in depth: explicit human consent (D1/D3) combined with authoritative backend authorization (D2); neither is sufficient on its own.
renderAndWaitForResponse is CopilotKit's first-party HITL primitive; the pause/resume is wired into agent state, so the agent genuinely waits (D6).
Role pre-check gives non-admins a clean message rather than a button that errors (D4).

Cons:

More implementation than a plain modal: action wiring, agent-state pause, and the role pre-check must all be correct.
The role used for the pre-check is a UX hint; it must never be treated as the authorization decision (that stays in backend-core) — a discipline the team must hold.

Option 2 — Backend RBAC only, no UX gate

Pros:

Simplest; backend already enforces admin.

Cons:

Violates D1: an admin's misinterpreted natural-language request deletes data with no confirmation — exactly the irreversible-mistake risk of an LLM interface.
No informed-consent step (D3); poor, dangerous UX for destructive actions.

Option 3 — Client-side guard only

Pros:

Easy; a window.confirm or local modal blocks accidental clicks.

Cons:

If treated as the gate, violates D2 (client-trusted authorization is no authorization).
Not integrated with agent state (D6): the agent may continue or the tool may fire regardless of the dialog; fragile in a streaming agent flow.

Option 4 — Type-to-confirm modal outside the agent flow

Pros:

Strong friction (typing the name) reduces accidental deletes.

Cons:

Decoupled from CopilotKit's action/respond() lifecycle (D6): the agent isn't actually paused/resumed by the modal, so the conversational flow and the modal can desync.
Reinvents a primitive CopilotKit already provides via renderAndWaitForResponse.

Positive Consequences

Destructive operations get explicit, informed, human approval at the point of action while backend-core remains the single authoritative authorizer — satisfying both safety and security goals.
Establishes a reusable HITL pattern for any future destructive tool (not just delete source).

Negative Consequences

Adds interaction friction to deletes (intended), and adds implementation complexity to the Phase 3 action wiring.
Tempting failure mode: treating the client-side role pre-check as the authorization decision. This must be documented and checked in review so that backend-core stays authoritative.

Implementation Notes

Register delete_source with useCopilotAction({ renderAndWaitForResponse }); the render function receives args (what will be deleted), status, and respond; do not call respond() until the user clicks Confirm/Cancel.
Gate the offer of deletion on role (from session/JWT) for UX (D4); gate the execution on backend-core admin RBAC (D2). These are two separate checks with two purposes.
Card copy must name the specific source(s) and be unambiguous about irreversibility (D3).
Relates to but is distinct from hub ARC-ADR-001: 001 = agent→human board governance HITL; 006 = end-user in-app destructive-op confirmation. Cross-link them to avoid confusion.
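
As a rough illustration of the notes on the render function and the card copy, the render logic can be factored into a plain function that maps the render props to a card description. This is a sketch under assumptions: `buildConfirmationCard`, `CardModel`, `DeleteSourceArgs`, and the `{ approved }` payload are hypothetical, and the local `RenderProps` type only approximates what CopilotKit passes to renderAndWaitForResponse. In particular, the status values and the optional `respond` are assumptions to verify against the CopilotKit version in use. In a real component, the returned model would be rendered as JSX inside the render callback.

```typescript
// Local approximation of the render props; not imported from CopilotKit.
type ActionStatus = "inProgress" | "executing" | "complete"; // assumed values

interface DeleteSourceArgs {
  sourceNames: string[];
}

interface RenderProps {
  args: Partial<DeleteSourceArgs>; // may be incomplete while arguments stream in
  status: ActionStatus;
  respond?: (result: { approved: boolean }) => void;
}

interface CardModel {
  message: string;
  buttonsEnabled: boolean;
  onConfirm: () => void;
  onCancel: () => void;
}

function buildConfirmationCard({ args, status, respond }: RenderProps): CardModel {
  const names = args.sourceNames ?? [];

  // Buttons stay disabled until the action is waiting on the user and the
  // card can name exactly what would be deleted.
  const buttonsEnabled =
    status === "executing" && respond !== undefined && names.length > 0;

  const quoted = names.map((n) => `"${n}"`).join(", ");
  const message =
    names.length === 0
      ? "Preparing delete request..."
      : `Permanently delete ${quoted}? This cannot be undone.`;

  // respond() is only ever reached from an explicit button handler.
  const answer = (approved: boolean): void => {
    if (buttonsEnabled) {
      respond?.({ approved });
    }
  };

  return {
    message,
    buttonsEnabled,
    onConfirm: () => answer(true),
    onCancel: () => answer(false),
  };
}
```

Keeping this logic free of JSX makes it straightforward to check against mocks that nothing calls `respond` before the user presses a button.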

Related Decisions

Depends on: ARC-ADR-007 (React/CopilotKit app required for renderAndWaitForResponse); ARC-ADR-002 (forwarded JWT carries the role backend-core enforces).
Distinct from: hub ARC-ADR-001 (HITL decision-point pattern — agent/board governance).
Relates to: epic #12, issue #17 (Phase 3 source management + HITL delete); hub plan docs/plans/copilotkit-generative-ui.md ("deletes need admin + HITL confirmation").

Caveats

The authoritative backend denial path can only be fully tested with backend-core + middle-core running (private repos). frontend-core verifies the card/pause/resume behavior and the non-admin pre-check against mocks.

Revision History

Version	Date	Author	Change
0.1	2026-05-25	Architecture Review	Initial proposal (spoke draft)
