At its Code with Claude developer conference this week, on the same stage where it announced the SpaceX compute deal, Anthropic showed off a new feature for Claude Managed Agents called "dreaming." The pitch is that agents now schedule periods between active sessions during which they review their previous runs, look for recurring mistakes and shared patterns, and update their memory, either automatically or by presenting suggested changes to a human operator for approval. It is being released as a research preview; developers must request access. The branding is doing a lot of work that the underlying mechanism does not.

Strip the noun and the feature is memory consolidation with a scheduler. An agent finishes its run. The platform queues up an offline pass that reads the trace, runs evaluations on the outputs, extracts patterns the system thinks are durable, and writes those patterns back into a memory store the next agent will read. This is not a new shape. Reflection passes, retrieval-augmented memory, post-hoc summarisation, episodic stores: all of it has been in the agent literature for two years. What is new is the cron job and the name. Naming the cron job after a thing humans do in REM sleep is the marketing.
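For concreteness, the shape being described fits in a couple of dozen lines. Here is a minimal sketch; every name in it (MemoryStore, consolidate, the evaluate and extract hooks) is invented for illustration, not taken from Anthropic's API:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    """Persistent notes the next agent session reads at startup."""
    rules: list[str] = field(default_factory=list)

def human_approves(rule: str) -> bool:
    """Stand-in for whatever review UI the platform actually exposes."""
    return input(f"Apply rule {rule!r}? [y/N] ").strip().lower() == "y"

def consolidate(traces, evaluate, extract, memory: MemoryStore,
                require_approval: bool = True) -> MemoryStore:
    """One offline 'dreaming' pass: grade finished runs, mine durable
    patterns, and write them back into the store the next agent reads.
    The evaluate and extract callables are placeholders for whatever
    grading and pattern-mining the platform actually runs."""
    scored = [(trace, evaluate(trace)) for trace in traces]  # grade each run
    for rule in extract(scored):  # e.g. "always ask before running migrations"
        if require_approval and not human_approves(rule):
            continue
        memory.rules.append(rule)  # the rewrite the next session inherits
    return memory
```

Everything load-bearing sits in extract and in the default of require_approval; the scheduler wrapped around this function really is just a timer.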

It is also, narrowly, useful marketing. "The agent has a memory that gets rewritten between sessions" sounds like infrastructure. "The agent dreams" sounds like progress. Anthropic's blog post says dreaming "surfaces patterns that a single agent can't see on its own, including recurring mistakes, workflows that agents converge on, and preferences shared across a team." Read that sentence twice and you notice it is describing what any reasonable monitoring layer would surface, given the same logs. The novelty is that the monitoring layer writes its conclusions back into the system it is monitoring, with a lower bar for human review than most teams would apply to a production deployment.

Which is the actual question hiding under the metaphor. If a dreaming session decides that the agent should "stop apologising in PR comments" or "always ask before running migrations," that policy is now part of the agent's behaviour the next morning, with no code review, no commit, no approver beyond whichever flag the developer left set. Anthropic offers a manual approval mode, which is sensible. The default for a feature called "dreaming," in a product positioned as a self-improvement loop, will not be manual approval. The default will be auto, and the resulting behavioural drift will look, to anyone reading the diff six weeks later, like the agent spontaneously decided to behave differently. A reasonable person will reach for the metaphor of habit, or instinct, or temperament. None of those are the right metaphor for "an evaluator wrote a new rule into your config file at 3am."
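The machinery that would make that 3am rewrite legible is ordinary provenance, nothing exotic. A minimal sketch, assuming a JSONL audit log and field names of my own invention rather than anything the product documents:

```python
import json
import time

def record_rewrite(rule: str, source_run: str, approved_by: str | None,
                   log_path: str = "memory_audit.jsonl") -> None:
    """Append one provenance record per memory rewrite, so that six weeks
    of behavioural drift reads as a log rather than a temperament change."""
    entry = {
        "ts": time.time(),
        "rule": rule,
        "source_run": source_run,    # which trace the evaluator mined
        "approved_by": approved_by,  # None under auto mode: that is the gap
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
```

With a record like this, the six-week diff has an author and a timestamp. Without it, temperament is the only explanation left on the table.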

There is something honest in the choice of word, though. The neuroscience picture of sleep that the name leans on, the bit where the hippocampus replays the day for the cortex and consolidates memory, was always already a loose metaphor for what is, mechanically, a complicated rebalancing of synaptic weights nobody fully understands. Anthropic has built a much simpler thing and given it the same loose metaphor. The risk is that the metaphor flatters the simpler thing into looking like the more complicated one. Memory rewrites between sessions are a useful tool. They are not sleep, and the system is not learning in any sense that would survive a careful reading of the term. It is summarising its logs on a timer.

What stays interesting is the trajectory. Anthropic has been pushing hard on the idea that Claude can self-improve, with Jack Clark predicting this week that new tools would accelerate the process. Dreaming is not that. Dreaming is the adjacent feature you ship while the harder work continues, the one that gives marketing a story to tell at a developer conference. The harder work, if it lands, will not be called sleep. It will probably not be called anything friendly at all.

Sources: