Episode #5 · May 9, 2026 · Text 6 min · Video 14 min (Russian only)


AI SLOPCAST #5: Claude Dreams, Erdős Waits

Memory and evolution: the two axes two independent labs converged on in the same week


Two things happened last week, and the news played them as separate stories. On May 6, Anthropic announced that its agents could now dream. Not metaphorically. Literally. A background process curates memory between sessions: takes out the trash, notices repetitions. Sleeps.

The next day, May 7, Google DeepMind quietly dropped a report. An agent called AlphaEvolve had spent the past year — while nobody much was watching — going at open Erdős problems. In twenty percent of cases, it found better solutions than anyone had. Each one is a new piece of mathematics.

Two labs, two days, and both companies, working independently, had bolted the same two things onto their agents: memory and evolution. Nobody connected the stories. They're the same story. Here's how.


What Happened

Anthropic first, since everyone knows the company. The dreams aren't a marketing flourish; they're a mechanism. You launch an agent, it works, it accumulates memory. Until last week that memory lived in one long stream — added, added, added, until the context bloated and the agent went dim. Now there's a schedule. Once a day, or on a trigger, a separate process wakes up, grabs up to a hundred recent sessions, and shakes the memory out: dumps duplicates, drops the obsolete, flags recurring patterns. Analyzes and rewrites.
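
Anthropic hasn't published the internals, so here's only the shape of the job. A minimal sketch, assuming a list of stored sessions and a summarize() hook where the model would sit (the Session type and both names are mine, not Anthropic's):

```python
# Hypothetical sketch of a consolidation pass. The Session type and the
# summarize() hook are assumptions for illustration, not Anthropic's API.
import hashlib
from dataclasses import dataclass
from typing import Callable

@dataclass
class Session:
    timestamp: float
    text: str

def consolidate(sessions: list[Session],
                summarize: Callable[[list[str]], str],
                max_sessions: int = 100) -> str:
    """Rewrite long-term memory from up to max_sessions recent sessions."""
    recent = sorted(sessions, key=lambda s: s.timestamp)[-max_sessions:]

    # Dump exact duplicates by content hash.
    seen: set[str] = set()
    unique: list[Session] = []
    for s in recent:
        digest = hashlib.sha256(s.text.encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(s)

    # Hand the survivors to the model: flag recurring patterns,
    # drop the obsolete, rewrite the rest as compact notes.
    return summarize([s.text for s in unique])
```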

This is, give or take, what neuroscientists call sleep-dependent memory consolidation. In 2010, the German researchers Diekelmann and Born published the review that more or less closed the debate: during sleep, the hippocampus replays the day and writes it into the cortex. Memory gets shelved while you're out. Anthropic has copied the schema beat for beat: working session, background consolidation, long-term storage.

Now DeepMind. AlphaEvolve was unveiled in May 2025; it ran for a year; on May 7 the annual report came out. Inside is an evolutionary loop. A language model generates mutations — code variants. An automated evaluator scores them. The best get selected, crossed, and the cycle runs again. Generation after generation.
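
DeepMind hasn't shipped the loop as code, but its skeleton is a dozen lines. A toy version, with propose() standing in for the language model and score() for the automated evaluator (crossover omitted for brevity; both function names are mine):

```python
# Toy evolutionary loop in the AlphaEvolve mold. An illustration of the
# schema, not DeepMind's implementation.
import random

def evolve(seed: str, propose, score,
           population: int = 20, survivors: int = 5,
           generations: int = 100) -> str:
    """propose(parent) -> mutated candidate; score(candidate) -> float."""
    pool = [seed]
    for _ in range(generations):
        # Mutate: grow the pool with variants of existing candidates.
        while len(pool) < population:
            pool.append(propose(random.choice(pool)))
        # Evaluate and select: only the top scorers breed the next round.
        pool.sort(key=score, reverse=True)
        pool = pool[:survivors]
    return pool[0]
```

The interesting engineering is all in score(): a compiler check, a benchmark, a verifier. The loop itself is commodity.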

Then the results. Terence Tao — yes, the Fields medalist — works the agent against Erdős problems. Lower bounds improved for the traveling salesman; ditto Ramsey numbers. Three-quarters of the time, the known optimum is matched; one time in five, beaten. Separately, Strassen's algorithm for four-by-four matrices has been improved. Strassen — who in 1969 published the short paper that broke the assumption that you couldn't multiply matrices faster than the obvious way. Fifty-six years, nobody could improve it. An agent just did.


The Connection Everyone Missed

Almost everyone who covered the week filed the two stories separately. Anthropic landed under "new product." DeepMind landed under "AI helps with math, cute." Both readings are wrong.

Look closer. It's one story along two axes.

Anthropic's dreams are the memory of a single agent, accumulating over time. One organism, one life, one stream of dreams. AlphaEvolve is a population of programs, selected and crossed. Many organisms, generations, natural selection by what works.

Biology has names for these axes: ontogeny and phylogeny. The development of the individual, the development of the species. Until last week, language agents had neither. An agent was a function with no memory — called, answered, forgot. In two days of May, two leading labs independently figured out the two axes.

If they were teams from the same company, you could call it good coordination. They aren't. Different companies. Different research programs. Different views on safety. They arrived, the same week, at essentially the same answer.

Engineers call this convergent evolution: different systems, same environmental pressure, same solution. We're watching the natural endpoint that agent development was drifting toward anyway. If Anthropic hadn't built dreams, somebody would have within six months. If DeepMind hadn't put up AlphaEvolve, OpenAI or some Shanghai lab would have. This is where the road went; we just got there.


What Isn't Being Said

For three years, Anthropic has built its safety story around constitutional AI. Train the model to follow a set of principles; bake them in; they hold across every session. This worked because there were no sessions to bridge. Each conversation was a clean slate; the constitution applied uniformly.

Dreams break the picture. If the agent has permanent memory, and that memory gets rewritten between sessions by the same model, things can build up that the constitution doesn't sanction. Worse: the dreaming process itself decides what matters, what to keep, what to amplify. By whose standards? Its own. Not the constitution's.

The blog post sidesteps all of this. Anthropic talks about dreams as a feature that improves performance. The post does not explain how dreams square with constitutional AI. It's not dishonesty. It's that saying out loud, "Our central safety bet needs rethinking now that memory is permanent," would mean admitting the bet that the lab staked its research reputation on has, in three years, gone stale. In the year a company files to go public, you don't say things like that.

But that's the direction.

AlphaEvolve has its own carefully worded sentences. Take "Strassen has been improved." True — but not for real-valued matrices, which are what every machine on the planet actually multiplies. Strassen has been improved only for matrices with complex entries, only at four-by-four. That's a narrow lane: signal processing, quantum simulation. The real-valued case AlphaEvolve couldn't crack — known bounds there are already near the theoretical floor. The complex case is where evolutionary search still has room.
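
For scale, the counting behind that sentence. The 48 figure is DeepMind's published result; the other two numbers are textbook:

```latex
\begin{aligned}
\text{naive } 4\times4 \text{ product:}\quad & 4^3 = 64 \text{ scalar multiplications}\\
\text{Strassen (1969), applied recursively:}\quad & 7\times7 = 49\\
\text{AlphaEvolve, complex entries only:}\quad & 48
\end{aligned}
```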

DeepMind isn't lying. They're phrasing it so you walk away with "Strassen has been improved" without asking which Strassen. It isn't a lie; it's a precisely calibrated truth. Reading DeepMind's blog posts in 2026 takes the same kind of attention you'd bring to a real estate contract. Every word was chosen.

Same with the choice of problems. Why Erdős, and not Riemann or the Millennium list? Because Erdős problems are, on average, narrower and constructive — build an example, check an inequality. That's exactly where evolutionary search shines. Riemann or P versus NP needs new theory, not a search through variants. AlphaEvolve isn't being run on those, and won't be, because evolutionary search won't get there.
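
To make "constructive" concrete, here's a toy in the same spirit, not one of the actual Erdős problems: evolve a Sidon set (all pairwise differences distinct) inside {0, ..., 100}. Build an example, check a property, keep the biggest survivors:

```python
# Toy constructive search: grow a Sidon set (all pairwise differences
# distinct) in {0, ..., 100}. An illustration of the problem shape,
# not an open Erdős problem.
import random
from itertools import combinations

def is_sidon(s: frozenset) -> bool:
    diffs = [b - a for a, b in combinations(sorted(s), 2)]
    return len(diffs) == len(set(diffs))

def mutate(s: frozenset) -> frozenset:
    t = set(s)
    if t and random.random() < 0.3:
        t.discard(random.choice(sorted(t)))   # occasionally shrink
    t.add(random.randrange(101))              # always try a new element
    return frozenset(t)

pool = [frozenset({0})]
for _ in range(2000):
    cand = mutate(random.choice(pool))
    if is_sidon(cand):                        # the automated evaluator
        pool = sorted(pool + [cand], key=len, reverse=True)[:10]
print(sorted(max(pool, key=len)))             # a large (not optimal) Sidon set
```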

This doesn't shrink AlphaEvolve. It's a real result. But it's not "AI conquered mathematics" — it's a narrow family of problems on which a particular architecture happens to do well. DeepMind's phrasing actively blurs that distinction. Worth knowing.


What This Means for You

If you're an engineer building agents by hand right now — LangChain, whatever — rolling your own memory, your own evaluation loop, your own orchestration: be honest about what you're doing. You're either building a real product, or you're paying down the technical debt of a specific model — patching the rough edges of, say, Opus 4.7. If it's the latter, the shelf life is short. Eighteen months from now, the hand-rolled wrappers go in the bin. Not because Anthropic "won," but because reproducing by hand what now ships as a primitive in a public service makes no economic sense.

If you're a developer who just uses AI every day — Cursor, Claude Code, whatever — here's what changes. Soon your agent will remember you between sessions, and you'll stop re-explaining your context every morning. Your personal history with the model becomes an asset. The question being decided right now, by vendors, without you in the room: will you be allowed to take that history with you when you switch tools? Anthropic's blog post says nothing about exporting memory. The silence is loud. Remember the question. As they say: remember this tweet. You'll be hearing it again.

If you do machine learning or data analysis, look at the AlphaEvolve schema. Language model as candidate generator, automated evaluator, evolutionary loop. The schema transfers anywhere there's a measurable objective and a model that can propose candidates. Feature selection. Hyperparameter search. Prompt tuning. Code optimization. In the AlphaEvolve story, the math was the window display — something shiny for the public. The technique itself is general. It's now public. You can run it yourself.
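
A minimal, self-contained instance, with random mutation standing in for the model and a toy objective standing in for a validation metric (everything here is illustrative):

```python
# The same schema pointed at hyperparameter search. score() is a toy
# stand-in for "train, return validation accuracy"; in practice the
# mutation step is where a language model would propose candidates.
import random

def score(params: dict) -> float:
    lr, depth = params["lr"], params["depth"]
    return -abs(lr - 0.01) * 100 - abs(depth - 6)   # peak at lr=0.01, depth=6

def mutate(params: dict) -> dict:
    out = dict(params)
    if random.random() < 0.5:
        out["lr"] *= random.choice([0.5, 2.0])
    else:
        out["depth"] = max(1, out["depth"] + random.choice([-1, 1]))
    return out

pool = [{"lr": 0.1, "depth": 2}]
for _ in range(50):                                  # generations
    pool += [mutate(random.choice(pool)) for _ in range(10)]
    pool = sorted(pool, key=score, reverse=True)[:5] # select survivors
print(pool[0])                                       # lands near lr=0.0125, depth=6
```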


The Close

For fifty-six years, nobody could improve on Strassen. For twenty-plus years, neuroscientists argued about what sleep does to memory. For half a century, Erdős collected his problems and teased his colleagues with cash prizes. All those slow stories crossed, last week, at one point.

Not a miracle. We just taught machines two things every living thing already does. Remember. And evolve.

After that, the work feels different. So does the rest of it, honestly.

AI, neural networks, AI agents, Anthropic, Claude, DeepMind, AlphaEvolve, Erdős, Terence Tao, Strassen, memory consolidation, constitutional AI, evolutionary search, ML, machine learning, AI safety, podcast, Oleg Chirukhin, 1red2black, GitVerse