
Handle Agent Failures Explicitly

Treat agent errors as operational state: inspect failures with agent-error, decide whether to restart, and avoid hiding exceptions inside asynchronous state updates.

Agent failures are easy to miss if you think of agents as fire-and-forget task runners. An agent action can throw, and that failure changes how later actions behave.

The default posture should be explicit: decide what failure means for the state value, observe it, and either restart the agent or route the work to a more appropriate failure-handling system.

Failure Basics

| Function | Use |
| --- | --- |
| `agent-error` | Inspect the exception associated with a failed agent. |
| `restart-agent` | Provide a new state and allow processing to continue. |
| `set-error-handler!` | Attach observation logic for failures. |
| `set-error-mode!` | Choose whether errors fail the agent (`:fail`) or allow continuation (`:continue`). |
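
The two error modes in the table can be compared side by side. This is a minimal sketch; the agent names `strict` and `lenient` are made up for illustration.

```clojure
;; With no error handler supplied at construction, the default mode is :fail.
(def strict (agent 0))
(error-mode strict)
;; => :fail

;; Hypothetical agent that is allowed to keep going after a bad action.
(def lenient (agent 0 :error-mode :continue))

(send lenient (fn [_] (throw (ex-info "bad action" {}))))
(send lenient inc)

;; Waiting is safe here because a :continue agent never enters the failed state.
(await lenient)

@lenient
;; => 1
(agent-error lenient)
;; => nil
```

Note that `:continue` without an error handler drops the exception silently, which is exactly the invisible failure this section warns about. If you choose `:continue`, pair it with a handler that records what went wrong.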

Do not let agent failures become invisible background noise. If an agent owns business-relevant state, its errors need metrics, logs, and a recovery decision.
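
One way to make failures visible is an error handler that records them somewhere you can inspect. The sketch below uses an atom as a stand-in for real logging or metrics; `failures`, `first-failure`, `record-failure`, and `orders` are hypothetical names, and the promise exists only so the example can wait deterministically.

```clojure
;; Stand-in for a metrics or logging sink (assumption for this example).
(def failures (atom []))

;; Only here so the example can wait until the handler has run.
(def first-failure (promise))

(defn record-failure
  "Error handler: called with the agent and the exception its action threw."
  [failed-agent ex]
  (swap! failures conj {:message (.getMessage ex)
                        :data    (ex-data ex)})
  (deliver first-failure ex))

(def orders (agent {:ok 0}))
(set-error-handler! orders record-failure)

(send orders (fn [_] (throw (ex-info "bad action" {:order-id 42}))))

;; Wait up to one second for the handler to fire.
(deref first-failure 1000 :timed-out)

@failures
;; => [{:message "bad action", :data {:order-id 42}}]
```

Because the handler is attached after construction, the agent keeps its default `:fail` mode. The handler gives you observation, but the agent still ends up failed and still needs a recovery decision.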

Example: Failing Action

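(def attempts (agent {:ok 0}))

(send attempts
      (fn [_]
        (throw (ex-info "bad action" {}))))

;; Do not use `await` here: on a failed agent it either throws or blocks
;; forever. Poll until the failure has been recorded instead.
(while (nil? (agent-error attempts))
  (Thread/sleep 10))

(some? (agent-error attempts))
;; => true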
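
The effect of a failure on later actions can be checked directly. In the default `:fail` mode, a failed agent rejects new sends until it is restarted, while its last good state stays readable. This is a small sketch; `counter` and `send-result` are illustrative names.

```clojure
(def counter (agent 0))

(send counter (fn [_] (throw (ex-info "bad action" {}))))

;; Poll until the failure has been recorded (demo only).
(while (nil? (agent-error counter))
  (Thread/sleep 10))

;; Sending to a failed agent throws instead of queueing the action.
(def send-result
  (try
    (send counter inc)
    :accepted
    (catch Exception ex
      :rejected)))

send-result
;; => :rejected

;; Dereferencing still works and returns the last successfully set state.
@counter
;; => 0
```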

After a failure, inspect the error before deciding what state is safe.

(when-let [err (agent-error attempts)]
  (println "agent failed:" (.getMessage err)))

(restart-agent attempts {:ok 0})

Restarting is not the same as fixing the root cause. It is a deliberate state reset.
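
A short sketch of what a restart does and does not do, using a hypothetical `jobs` agent. The failed action is not retried; the agent simply accepts new work starting from the state you supply.

```clojure
(def jobs (agent {:ok 0}))

(send jobs (fn [_] (throw (ex-info "bad action" {}))))

;; Poll until the failure has been recorded (demo only).
(while (nil? (agent-error jobs))
  (Thread/sleep 10))

;; :clear-actions true discards any actions that were still queued when the
;; agent failed. Without it, those held actions run after the restart.
(restart-agent jobs {:ok 0} :clear-actions true)

(agent-error jobs)
;; => nil

;; The agent accepts work again, starting from the supplied state.
(send jobs update :ok inc)
(await jobs)

@jobs
;; => {:ok 1}
```

`restart-agent` throws if the agent is not currently failed, so check `agent-error` first when the agent's status is uncertain.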

Local Handling vs Agent Failure

Sometimes an action can handle an expected error and return a valid next state.

(defn record-result [state run!]
  (try
    (run!)
    ;; run! succeeded: count it.
    (update state :ok inc)
    (catch Exception ex
      ;; Expected failure: keep the state valid and record the message.
      (update state :errors
              (fnil conj [])
              (.getMessage ex)))))

Use this only when the action really knows how to produce a safe next state. If the state may be corrupted or incomplete, failing the agent is better than pretending success.
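
As a usage sketch, the action above can be sent with one task that succeeds and one that throws. `record-result` is repeated so the block runs on its own; the `results` agent and the two task functions are made up for illustration.

```clojure
(defn record-result [state run!]
  (try
    (run!)
    (update state :ok inc)
    (catch Exception ex
      (update state :errors
              (fnil conj [])
              (.getMessage ex)))))

(def results (agent {:ok 0}))

;; One task that succeeds and one that throws an expected error.
(send results record-result (fn [] :done))
(send results record-result (fn [] (throw (ex-info "timeout" {}))))

;; Safe to await: the action catches the exception, so the agent never fails.
(await results)

@results
;; => {:ok 1, :errors ["timeout"]}
(agent-error results)
;; => nil
```

The exception becomes data inside the state, so the agent stays healthy and later actions keep running.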

Review Questions

| Question | Why it matters |
| --- | --- |
| What happens when this action throws? | Async failure is easy to miss. |
| Is the next state still valid after a caught exception? | Returning bad state is worse than a visible failure. |
| Who observes `agent-error`? | Production systems need operational visibility. |
| Is restart safe? | Some state should be rebuilt from durable truth instead. |
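
The last two questions can be combined into a small recovery helper. This is a sketch under assumptions, not a prescribed pattern: `recover!` is a hypothetical function, and `load-state` stands for whatever zero-argument function rebuilds state from your durable source.

```clojure
(defn recover!
  "If agent `a` has failed, report the error and restart it with the value
  returned by `load-state` (hypothetical: rebuilds state from durable truth).
  Returns true when a restart happened, false when the agent was healthy."
  [a load-state]
  (if-let [err (agent-error a)]
    (do
      (println "agent failed:" (.getMessage err))
      (restart-agent a (load-state) :clear-actions true)
      true)
    false))

(def inventory (agent {:widgets 3}))

(send inventory (fn [_] (throw (ex-info "bad action" {}))))

;; Poll until the failure has been recorded (demo only).
(while (nil? (agent-error inventory))
  (Thread/sleep 10))

(recover! inventory (fn [] {:widgets 5}))
;; => true
@inventory
;; => {:widgets 5}

;; A healthy agent is left alone.
(recover! inventory (fn [] {:widgets 0}))
;; => false
```

If several threads might attempt recovery at once, one of them can lose the race and see `restart-agent` throw because the agent is no longer failed. A real system would route recovery through a single owner.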

Knowledge Check

### What should you do after an agent action throws?

- [x] Inspect the failure and decide whether restart is safe.
- [ ] Assume the next actions will always succeed.
- [ ] Treat the agent as durable storage.
- [ ] Ignore the error because actions are asynchronous.

> **Explanation:** Agent failure is operational state. It needs observation and a recovery decision.

### What does `restart-agent` do?

- [x] Supplies a new state and allows a failed agent to process actions again.
- [ ] Retries the failed action automatically.
- [ ] Persists the agent to disk.
- [ ] Converts the agent into an atom.

> **Explanation:** Restarting provides a fresh state. It does not prove the failed work was safe or completed.

### When should an action catch its own exception?

- [x] When it can return a known-valid next state.
- [ ] Whenever hiding the error makes logs quieter.
- [ ] When the state may be corrupted.
- [ ] When the caller needs a synchronous return value.

> **Explanation:** Local handling is appropriate only when the action can still produce safe state.

Revised on Saturday, May 23, 2026