ARC-ADR-006: Human-in-the-loop confirmation for destructive operations
Metadata
| Field | Value |
|---|---|
| ID | ARC-ADR-006 |
| Status | Proposed |
| Date | 2026-05-25 |
| Deciders | Architecture Review |
| Supersedes | — |
| Superseded by | — |
| Tags | hitl, copilotkit, destructive-ops, rbac, safety, generative-ui |
Spoke-authored draft. Epic #12 / issue #17 reference this ADR as proposed at the hub path docs/decisions/ARC-ADR-006-hitl-destructive-ops.md, but it is not yet published there. This file mirrors that ID and filename so the issue links resolve once it is upstreamed.
Context and Problem Statement
Phase 3 (issue #17) lets users manage sources through the copilot, including deleting a source. Delete is destructive and irreversible from the user's point of view, and it is being triggered through a natural-language interface where the agent infers intent — a setting prone to misinterpretation ("clean up these" deleting more than intended). We need a pattern that prevents an LLM-mediated request from destroying data without an explicit, informed human action, while keeping authorization authoritative.
Two distinct concerns must not be conflated:
- Authorization (may this user delete at all?): owned by backend-core RBAC (admin role), reached via the forwarded JWT (ARC-ADR-002). This is the security boundary.
- Confirmation (does the user really want this specific deletion now?): a UX safety gate at the moment of action.
This ADR is primarily about the second, layered on the first.
Note this is a different HITL than hub ARC-ADR-001 (HITL decision-point pattern), which is board-level governance for agents escalating decisions to humans across sessions. ARC-ADR-006 is in-app, end-user HITL: an interactive confirmation card in the chat stream.
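The separation between authorization and confirmation can be modeled as two independent predicates. The sketch below is only a truth-table illustration: the names (Role, DeleteRequest, mayDelete) are hypothetical, and in the real system the two checks run in different places (backend-core and the chat UI), not in one function.

```typescript
// Hypothetical model; none of these names come from backend-core or CopilotKit.
type Role = "admin" | "member";

interface DeleteRequest {
  role: Role;             // role carried by the forwarded JWT
  userConfirmed: boolean; // outcome of the in-chat confirmation card
}

// Authorization: the security boundary, owned by backend-core.
function isAuthorized(req: DeleteRequest): boolean {
  return req.role === "admin";
}

// Confirmation: the UX safety gate at the moment of action.
function isConfirmed(req: DeleteRequest): boolean {
  return req.userConfirmed;
}

// A delete may run only when both independent checks pass.
function mayDelete(req: DeleteRequest): boolean {
  return isAuthorized(req) && isConfirmed(req);
}
```

Neither predicate substitutes for the other: a confirmed request from a non-admin is still refused, and an admin request without confirmation is still held back.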
Decision Drivers
| # | Driver |
|---|---|
| D1 | A destructive op must never execute on inferred intent alone — it requires an explicit, deliberate human confirmation. |
| D2 | Authorization stays authoritative in backend-core (admin role); the UX gate must not become the security boundary. |
| D3 | The confirmation must clearly state what will be deleted before the user consents (informed consent). |
| D4 | Non-admins must be stopped before being offered a confirmation they cannot fulfill (clear permission-denied, not a dead-end button). |
| D5 | On rejection, the data is preserved and the user is told; on approval, the delete proceeds. |
| D6 | Use a supported CopilotKit primitive so the pause/confirm/resume integrates with agent state. |
Considered Options
1. renderAndWaitForResponse HITL confirmation card, layered on backend RBAC (chosen): the delete tool's useCopilotAction renders a confirmation card and pauses the agent until the user explicitly confirms or rejects; backend-core still enforces admin.
2. Backend RBAC only, no UX gate: rely solely on backend-core's admin check; the agent deletes immediately if authorized.
3. Client-side guard only: a frontend confirmation/window.confirm, with no agent-state pause and trusting the client for gating.
4. Type-to-confirm modal outside the agent flow: a standard React modal (type the source name) decoupled from CopilotKit's action/state lifecycle.
Decision Outcome
Option 1, the renderAndWaitForResponse confirmation card layered on backend RBAC, is adopted.
The delete source flow:
- Pre-check role (D4). Before offering deletion, the UI checks the user's role (from the session/JWT). Non-admins get a clear permission-denied message instead of a confirmation card.
- Render + wait (D1, D3). The delete_source action is registered with useCopilotAction({ renderAndWaitForResponse }). When the agent proposes a delete, the chat stream renders a confirmation card naming exactly what will be deleted, with an explicit Confirm button (and Cancel). The agent pauses: respond() is not called until the user acts.
- Resolve (D5). Confirm → respond() returns approval, the agent proceeds, and the delete tool runs in middle-core, which calls backend-core. Cancel → respond() returns rejection, the source is preserved, and the user is notified.
- Authoritative enforcement (D2). Independent of the card, backend-core's require_principal admin gate is the real authorization check (reached via the forwarded JWT, ARC-ADR-002). The card is a safety gate, not the security boundary: even if the card were bypassed, backend-core still denies non-admins.
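The first three steps of the flow can be read as a small state machine. The sketch below is a framework-free model under stated assumptions: FlowState, proposeDelete, and resolveConfirmation are hypothetical names, and runDelete stands in for the middle-core call that backend-core still authorizes independently.

```typescript
// Hypothetical state machine; it models the flow, not CopilotKit's internals.
type Role = "admin" | "member";

type FlowState =
  | { kind: "permission-denied" }
  | { kind: "awaiting-confirmation"; sourceName: string }
  | { kind: "deleted"; sourceName: string }
  | { kind: "preserved"; sourceName: string };

// Step 1 (D4): the role pre-check is a UX hint, not the authorization decision.
function proposeDelete(role: Role, sourceName: string): FlowState {
  if (role !== "admin") {
    return { kind: "permission-denied" };
  }
  // Step 2 (D1, D3): the flow pauses here until the user acts.
  return { kind: "awaiting-confirmation", sourceName };
}

// Step 3 (D5): only an explicit approval triggers the delete call.
function resolveConfirmation(
  state: FlowState,
  approved: boolean,
  runDelete: (sourceName: string) => void // stand-in for the middle-core call
): FlowState {
  if (state.kind !== "awaiting-confirmation") {
    return state; // nothing is pending, so nothing can be deleted
  }
  if (!approved) {
    return { kind: "preserved", sourceName: state.sourceName };
  }
  runDelete(state.sourceName);
  return { kind: "deleted", sourceName: state.sourceName };
}
```

In this model the only path to runDelete passes through the awaiting-confirmation state with an explicit approval, which is the property D1 asks for.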
Confirmation Criteria
- Selecting a source and asking to delete it renders the HITL confirmation card via renderAndWaitForResponse; the card states what will be deleted and requires an explicit Confirm press.
- The agent does not proceed until the user responds; no delete request reaches middle-core before confirmation.
- A non-admin sees a permission-denied message and is not shown a functional confirmation card.
- On approval the delete proceeds; on rejection the source still exists and the user is notified.
- A forged/bypassed client request to delete as a non-admin is still rejected by backend-core (the UX gate is not the security boundary) — verifiable once backend-core is reachable.
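The last criterion, that a forged request is still rejected server-side, can be illustrated with an in-memory stand-in. This is a hedged sketch only: backend-core's actual require_principal gate is not shown in this ADR, so Principal, handleDelete, and the 204/403/404 status codes are all assumptions made for illustration.

```typescript
// Hypothetical stand-in for the backend delete endpoint; not backend-core code.
interface Principal {
  sub: string;
  roles: string[]; // roles taken from the verified, forwarded JWT
}

interface DeleteResult {
  status: 204 | 403 | 404; // assumed status codes
}

function handleDelete(
  principal: Principal,
  sourceId: string,
  sources: string[] // in-memory stand-in for the source store
): DeleteResult {
  // The admin check runs first, regardless of what the client UI showed.
  if (principal.roles.indexOf("admin") === -1) {
    return { status: 403 };
  }
  const index = sources.indexOf(sourceId);
  if (index === -1) {
    return { status: 404 };
  }
  sources.splice(index, 1); // remove the source from the store
  return { status: 204 };
}
```

A request that skips the confirmation card entirely reaches the same admin check, so bypassing the UX gate gains a non-admin nothing.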
Pros and Cons
Option 1: renderAndWaitForResponse + backend RBAC (chosen)
Pros:
- Defense in depth: explicit human consent (D1/D3) and authoritative backend authorization (D2) — neither alone, both together.
- renderAndWaitForResponse is CopilotKit's first-party HITL primitive; the pause/resume is wired into agent state, so the agent genuinely waits (D6).
- Role pre-check gives non-admins a clean message rather than a button that errors (D4).
Cons:
- More implementation than a plain modal: action wiring, agent-state pause, and the role pre-check must all be correct.
- The role used for the pre-check is a UX hint; it must never be treated as the authorization decision (that stays in backend-core) — a discipline the team must hold.
Option 2: Backend RBAC only, no UX gate
Pros:
- Simplest; backend already enforces admin.
Cons:
- Violates D1: an admin's misinterpreted natural-language request deletes data with no confirmation — exactly the irreversible-mistake risk of an LLM interface.
- No informed-consent step (D3); poor, dangerous UX for destructive actions.
Option 3: Client-side guard only
Pros:
- Easy; a window.confirm or local modal blocks accidental clicks.
Cons:
- If treated as the gate, violates D2 (client-trusted authorization is no authorization).
- Not integrated with agent state (D6): the agent may continue or the tool may fire regardless of the dialog; fragile in a streaming agent flow.
Option 4: Type-to-confirm modal outside the agent flow
Pros:
- Strong friction (typing the name) reduces accidental deletes.
Cons:
- Decoupled from CopilotKit's action/respond() lifecycle (D6): the agent isn't actually paused/resumed by the modal, so the conversational flow and the modal can desync.
- Reinvents a primitive CopilotKit already provides via renderAndWaitForResponse.
Positive Consequences
- Destructive operations get explicit, informed, human approval at the point of action while backend-core remains the single authoritative authorizer — satisfying both safety and security goals.
- Establishes a reusable HITL pattern for any future destructive tool (not just delete source).
Negative Consequences
- Adds interaction friction to deletes (intended), and adds implementation complexity to the Phase 3 action wiring.
- Tempting failure mode: treating the client-side role pre-check as the authorization decision. Must be documented and reviewed to keep backend-core authoritative.
Implementation Notes
- Register delete_source with useCopilotAction({ renderAndWaitForResponse }); the render function receives args (what will be deleted), status, and respond; do not call respond() until the user clicks Confirm/Cancel.
- Gate the offer of deletion on role (from session/JWT) for UX (D4); gate the execution on backend-core admin RBAC (D2). These are two separate checks with two purposes.
- Card copy must name the specific source(s) and be unambiguous about irreversibility (D3).
- Relates to but is distinct from hub ARC-ADR-001: 001 = agent→human board governance HITL; 006 = end-user in-app destructive-op confirmation. Cross-link them to avoid confusion.
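The render-function and card-copy notes above can be sketched without any framework. The model below only mirrors the args/status/respond shape listed in the notes: the status values, the respond payload, and every name (RenderProps, CardView, cardMessage, renderDeleteCard) are assumptions, and the actual registration through useCopilotAction and the React markup are deliberately left out.

```typescript
// Framework-free, hypothetical model of the confirmation card.
type Status = "inProgress" | "executing" | "complete"; // assumed status values

interface DeleteArgs {
  sourceNames: string[]; // what the agent proposes to delete
}

interface RenderProps {
  args: DeleteArgs;
  status: Status;
  respond?: (result: { approved: boolean }) => void; // assumed payload shape
}

interface CardView {
  message: string;
  onConfirm?: () => void; // undefined while the card is not actionable
  onCancel?: () => void;
}

// Card copy names every source and states that the delete is irreversible (D3).
function cardMessage(sourceNames: string[]): string {
  const noun = sourceNames.length === 1 ? "source" : "sources";
  return `Permanently delete ${sourceNames.length} ${noun}: ${sourceNames.join(", ")}? This cannot be undone.`;
}

// respond() is only ever invoked from a click handler, never during render.
function renderDeleteCard(props: RenderProps): CardView {
  const message = cardMessage(props.args.sourceNames);
  const respond = props.respond;
  if (props.status !== "executing" || respond === undefined) {
    return { message }; // no buttons wired while the action is not waiting
  }
  return {
    message,
    onConfirm: () => respond({ approved: true }),
    onCancel: () => respond({ approved: false }),
  };
}
```

Keeping the handlers undefined outside the waiting state means a stale or still-streaming card cannot resolve the action by accident.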
Related Decisions
- Depends on: ARC-ADR-007 (React/CopilotKit app required for renderAndWaitForResponse); ARC-ADR-002 (forwarded JWT carries the role backend-core enforces).
- Distinct from: hub ARC-ADR-001 (HITL decision-point pattern: agent/board governance).
- Relates to: epic #12, issue #17 (Phase 3 source management + HITL delete); hub plan docs/plans/copilotkit-generative-ui.md ("deletes need admin + HITL confirmation").
Caveats
- The authoritative backend denial path can only be fully tested with backend-core + middle-core running (private repos). frontend-core verifies the card/pause/resume behavior and the non-admin pre-check against mocks.
Revision History
| Version | Date | Author | Change |
|---|---|---|---|
| 0.1 | 2026-05-25 | Architecture Review | Initial proposal (spoke draft) |