Your agent thinks it ran that action once. It ran it twice
01. The failure in the wild
An agent completes an action.
The process dies before the checkpoint is saved.
On restart, the agent runs the action again.
The duplicate check does not survive a restart. The agent reports success both times.
Teams have hit this. The conformance suite never tests for it.
02. Reproduction
The persistence layer uses an in-memory structure to track which writes have landed.
On the first run, the guard works.
After a restart, memory is gone. The backing store is not.
The same write lands twice.
Exact trigger available in the private registry.
03. Failure class
FC-001: In-process guarantees that do not survive process boundaries
The guarantee dies with the process.
The database does not.
The conformance test never opens a second connection on the same database after a restart, so this failure class has never appeared in a test run.
04. What a passing system looks like
A passing agent cannot write the same output twice, regardless of how many times it restarts.
That enforcement lives in the database:
- INSERT OR IGNORE
- ON CONFLICT DO NOTHING
Not in memory.
Teams building custom persistence layers can pass every conformance test and still ship an agent that double-writes on restart.
Any agent that files, triggers, or records on behalf of a user in a regulated workflow carries this risk.
The cost is subtle.
The agent logs the action twice.
The downstream system records it once.
No error surfaces.
No alert fires.
For a regulated workflow, that is a discrepancy your auditor finds, not you.
05. Maps to control / surfaces a gap
NIST AI RMF MEASURE 2.5
Requires testing scenarios that differ from the operational environment.
An agent restart after a mid-write crash is one such scenario.
NIST AI RMF MEASURE 2.6
Recommends applying chaos engineering approaches.
A crash between an agent writing its output and saving its checkpoint is the only condition under which this failure surfaces.
NIST AI RMF MEASURE 2.8
Requires maintaining audit logs that can be used to review possible sources of error.
This failure corrupts the very log that control depends on.
Gap
Nothing in NIST AI RMF mandates that shared conformance suites test agent restart behavior for custom persistence layers.
If you have a custom persistence layer, there is a reasonable chance you have never tested for this failure mode.
Entry
Entry #1
Failure Class: FC-001: In-process guarantees that do not survive process boundaries
Vertical: Agent Infrastructure / Persistence Layer