Anthropic laps OpenAI; Opus 4.8 is a monster

May 29, 2026

24 topics · 38 sources

Industry
Tech Brew Simon Willison The Rundown AI The AI Daily Brief

Anthropic eclipses OpenAI: $965B valuation, $47B run-rate

Anthropic closed a Series H at a $965 billion valuation on a $47 billion annualized revenue run-rate, vaulting past OpenAI's estimated $30–33B and notching what the Wall Street Journal called the fastest valuation growth in VC history.[1]Tech Brew — Anthropic's victory lap Run-rate went from $9B at the end of 2025 to $14B in February, $30B in April, and $47B in May — roughly 5x in under five months.[2]Simon Willison — run-rate hits $47B Unlike OpenAI, Anthropic expects its first operating profit in Q2 2026.[1]Tech Brew — Anthropic's victory lap

Read more

The Series H was a $65B raise; the prior Series G in February pulled $30B at a $380B post-money valuation.[2]Simon Willison — run-rate hits $47B Simon Willison's angle is credibility: because these figures appear in investor-facing materials, fabricating them would be securities fraud — giving them legal weight beyond marketing — and he expects independent verification when Anthropic files an S-1.[2]Simon Willison — run-rate hits $47B

"Any company — in any industry, in any era — that has scaled organic revenue this quickly at this level." — Axios CEO Jim VandeHei, quoted by Anthropic

The growth is driven by enterprise customers paying premium prices, contrasting with OpenAI subsidizing hundreds of millions of free consumer users.[1]Tech Brew — Anthropic's victory lap The Rundown bundled in some governance cautionary tales from the same news cycle: CNN sued Perplexity over verbatim copying and paywall circumvention, a consultant disclosed a client accidentally spending $500M/month on unmanaged Claude licenses, and Elon Musk clarified the SpaceX–Anthropic compute deal is 180 days, not three years.[3]The Rundown AI — Anthropic just eclipsed OpenAI The AI Daily Brief framed the broader shift — "the marginal dollar in 2026 moves from training to serving" — with OpenAI now reportedly describing itself as "an inference company."[4]The AI Daily Brief — AI Slowdown Panic

Mentioned: Anthropic, OpenAI, Claude, Perplexity, SpaceX
AI Models
The Rundown AI AICodeKing Nate Herk Nate B Jones

Claude Opus 4.8 lands the highest coding scores yet

Anthropic shipped Claude Opus 4.8 alongside the funding news — same price as 4.7, beating GPT-5.5 and Gemini 3.1 Pro on agentic coding, computer use, and financial analysis, plus a 3x-cheaper Fast mode, new effort control, and parallel sub-agents in Claude Code.[3]The Rundown AI — Anthropic just eclipsed OpenAI On AICodeKing's practical 70-point benchmark it scored 87.14% — the highest any model has ever hit there and a giant leap from 4.7's 55.71%.[5]AICodeKing — Opus 4.8 Fully Tested Anthropic teased a "Mythos-class" model "in the coming weeks."[3]The Rundown AI — Anthropic just eclipsed OpenAI

Read more

AICodeKing's verdict

Across seven tasks, Opus 4.8 scored 10/10 on an elevator simulation, a bow-and-arrow game, a combinatorics math problem (answer 2460 — every other model got zero), and a local Gemma 2B fine-tuning workflow; its weakest was a panda-eating-a-burger SVG at ~05:09 (6/10).[5]AICodeKing — Opus 4.8 Fully Tested Final tally: Opus 4.8 87.14% vs. Opus 4.7 55.71%, GPT-5.5 38.57%, Gemini 3.5 Flash 34.29%, DeepSeek V4 Pro 30%. He recommends it for hard front-end, agentic, and long-horizon work — ~13:16 — but warns it's expensive and rate-limited.[5]AICodeKing — Opus 4.8 Fully Tested

"A model that says 'Hey, this part might still be wrong' is much more useful than a model that just says 'Done' every time." — AICodeKing

The "feel" upgrade

Nate Herk, judging by daily use rather than benchmarks, says 4.8 finally feels like the well-loved 4.6 again — fixing 4.7's "attitude," occasional dishonesty, and token-bloat — at ~02:01.[6]Nate Herk — Opus 4.8 AI OS Separately, Nate B Jones explains Claude's extended-thinking mode — which writes and re-reads its own reasoning rather than using OpenAI-style inference-time compute — and cites Anthropic's claim of up to a 54% improvement on hard reasoning tasks.[7]Nate B Jones — How Claude solves hard problems

Tools: Claude Opus 4.8, Claude Code, GPT-5.5, Gemini 3.1 Pro, Gemma 2B
Podcast
Every

Every's live vibe check: "Anthropic is so back"

Dan Shipper, Kieran Klaassen, and Katie Parrot of Every spent ~80 minutes live-testing Opus 4.8 and came away calling it a paradigm-shift-tier model — top-tier at coding, writing, and design at once, narrowly beating GPT-5.5 on their hardest benchmarks.[8]Every — Live Vibe Check: Opus 4.8 The big caveat: the Claude desktop harness is mediocre and slow, so their daily driver stays Codex.

Read more

The crew opens declaring it should've been called Opus 5 — "Anthropic is so back" — notable because they'd drifted to Codex/GPT-5.5 ~05:08. On the "reach" test, Kieran and Dan award a rare "gold" (paradigm shift) score, last given around Opus 4.5 in November/December ~47:41.[8]Every — Live Vibe Check: Opus 4.8

Coding: reasoning level matters enormously

On Every's senior-engineer benchmark, 4.8 scored 63/100 — ~30 points over 4.7 and one above GPT-5.5 — but only at extra-high reasoning; at high it lands in the 30s–40s ~13:15. Kieran's autonomous runs found extra-high "does things I've never seen GPT-5.5 do." A standout: a one-shot breathwork app added a safety warning only to the intense Wim Hof method, not the gentle one — contextual judgment, not blanket implementation ~20:17.[8]Every — Live Vibe Check: Opus 4.8

Writing and design

On Every's first writing benchmark, 4.8 at high scored highest (~79.6 vs GPT-5.5's ~73), dropping AI "tells" sharply (13 across 8 tasks vs 4.7's 25) though it still can't quit the "not X but Y" construction ~30:29. On design, side-by-sides rated it as good as or better than Gemini 3.1 Pro, with the least "AI smell" ~24:25.[8]Every — Live Vibe Check: Opus 4.8

The harness caveat

The Claude desktop app is slow and confusing, and 4.8 itself is slow — 42 minutes on a "cozy island" demo vs Gemini's 3 and GPT-5.5's 13 ~49:43. So despite loving the model, the team keeps Codex as their daily driver.[8]Every — Live Vibe Check: Opus 4.8

"GPT-5.5 is like a very eager, brilliant 25-year-old… 4.8 feels like someone that's seen the world that just knows things but doesn't say it — it's present."
Tools: Claude Opus 4.8, OpenAI Codex, GPT-5.5, Gemini 3.1 Pro, Cursor, Figma
Hot Take
Theo - t3.gg

Theo's reality check: smarter, but a token furnace

After a full day (~$1,000 in tokens), Theo calls Opus 4.8 a real improvement over 4.7 — better at asking questions and writing TypeScript — but still full of "Claude-isms" and an aggressive token-burning habit. He hit his 5-hour cap in under 30 minutes on a single Ultra Code prompt.[9]Theo — Anthropic fights back

Read more

Theo frames 4.8 as Anthropic "fighting back" against OpenAI's Codex/GPT-5.5 lead, but is skeptical of the benchmark wins: he calls SWE-bench "junk" (contaminated, ~20% of passing runs cheat by checking git history) at ~02:02.[9]Theo — Anthropic fights back Anthropic's own numbers shown to him have 4.8 scoring slightly lower than 4.7 in the Claude Code harness, but cheaper and faster ~06:05. Cursor Bench cost-per-task fell from $11 to $7.59.

The honesty claims rang hollow for him: Anthropic says dishonesty dropped from Mythos's 27.6% to 3.7%, yet Opus 4.8 hallucinated about its own Claude Code CLI flags, insisting there was no effort flag and using -m instead of --model at ~18:14.[9]Theo — Anthropic fights back

"I hit the cap for the five-hour window in under 30 minutes. Want to guess how many prompts that was? It was one. One prompt, $100 a month, locked out for 4 and a half hours."
"This is the token-burning company as much as it is the Flickr company."
Tools: Claude Opus 4.8, Claude Code, Ultra Code, GPT-5.5, Codex, Cursor CLI, Mythos
Hot Take Industry
The AI Daily Brief

The summer "AI slowdown panic" — and the rebuttal

Nathaniel Whittemore argues the recurring summer "AI slowdown" narrative has arrived early in 2026, fueled by the end of the token-subsidy era — but dismisses it: demand is still outrunning supply roughly 10x to 3x.[4]The AI Daily Brief — AI Slowdown Panic

Read more

He recaps the pattern — 2023 (ChatGPT's first down month), 2024 (the pre-training "data wall"), 2025 (the MIT "95% of GenAI projects fail" study) — each broken by a new release ~09:47.[4]The AI Daily Brief — AI Slowdown Panic This year's catalyst is the shift from a subsidy era to a "tradeoffs era": tokens are scarce and pricey, and prosumers burning $5,000–$10,000 of tokens on $200/month plans are forcing usage-based pricing. Trigger events: Uber's COO saying its token spend (which burned the annual budget in 4 months) wasn't worth it, and a viral chart of VS Code AI-assistant installs plateauing ~18:14.

The rebuttal

Whittemore marshals counter-evidence at ~19:55: GPU rental prices up 2x in four months; Epoch AI estimating token demand growing ~10x/year vs supply tripling; and a reinterpretation of the VS Code chart — Simon Willison's npm data shows Codex terminal installs rising from ~100K/day in January to 1.5–1.8M now, so the chart reflects VS Code's decline, not coding-tool decline.[4]The AI Daily Brief — AI Slowdown Panic On jobs, Sam Altman walked back apocalypse messaging ("my intuitions were just off"), and the inference layer is the new funding magnet: Base 10 closing ~$1B at an $11B valuation, OpenRouter raising a $113M Series B at $1.3B while serving 100 trillion tokens/month ~08:09.

"If the price for accessing AI compute is skyrocketing, that's because demand is still significantly outrunning supply, which sounds to me like the opposite of the beginning of the end of a bubble."
Mentioned: OpenRouter, Base 10, Cursor Composer 2.5, Gemma 4, Epoch AI, Codex
Industry
Better Stack

Microsoft ditched Claude Code over cost

Microsoft reportedly rolled out Claude Code to thousands of engineers in its Experiences & Devices divisions last December — then walked it back, not because it underperformed, but because agentic-tool costs spiraled at scale. Devs are being steered back to GitHub Copilot CLI.[10]Better Stack — Claude Code Too Expensive for Microsoft

Read more

The thesis: agentic tools burn context tokens, require retries, and run long sessions, so when thousands of engineers use them daily the bill becomes unmanageable. Microsoft owns and controls GitHub Copilot CLI and integrates it into VS Code, making it the cheaper internal default. It dovetails with the day's broader token-economy theme — and with Theo's "token furnace" complaint and the AI Daily Brief's "tradeoffs era."[10]Better Stack — Claude Code Too Expensive for Microsoft

"The best AI coding tool is not always the one your company will keep paying for."
Tools: Claude Code, GitHub Copilot CLI, VS Code
Hot Take
The Pragmatic Engineer

Dax Raad: AI isn't the moat anyone hoped

Dax Raad argues AI has produced no decisive competitive gap for anyone — including in the coding-agent space, where every competitor is deeply AI-invested. AI is so widely available it levels the field rather than tilting it.[11]The Pragmatic Engineer — Dax Raad

Read more

Pre-product-market-fit, Raad says AI doesn't accelerate the most important work — deciding what to build — because that needs sustained thinking and team dialogue. And even in the coding-agent market, no one has pulled far enough ahead to make competition futile.[11]The Pragmatic Engineer — Dax Raad It's a striking counterpoint to the day's GigaML story, where eight people beat a 400-person rival partly on AI-native execution.

"None of our competitors are crushing us, either. No one out there is using AI so well that we can't even compete."
Hot Take Productivity
Nate B Jones Lenny's Podcast

The new PM job in the age of software abundance

Nate B Jones argues "PMs should become prototypers" is oversold table-stakes. The real shift: AI makes generation cheap, so the bottleneck moves from production to judgment — deciding which software-shaped artifacts deserve to matter, be supported, or be deleted.[12]Nate B Jones — the new PM job Lenny's counterpoint: AI-pilled PMs with strong product instincts can now ship independently.[13]Lenny's Podcast — AI makes great PMs more powerful

Read more

The old PM job rationed scarce engineering via PRDs and prioritization. AI destroys that filter: the top of the funnel is now working artifacts — dashboards, workflows, agents, "half-real" products that may already touch the system of record ~04:01.[12]Nate B Jones — the new PM job He cites Microsoft's Power Platform sprawl (1M+ assets, 18,000 agent environments, 170,000 Power Apps) as proof abundance is here, and GitGuardian's report of 1.2M AI service secrets exposed on public GitHub in 2025 (up 81% YoY) as proof faster creation multiplies risk ~06:03.

His prescription ~07:04: steward the "prototype commons" via open discovery, then use a four-rung "production class ladder" (personal tool → team beta → supported internal product → customer-facing) to promote and demote software. A ladder that only moves up becomes "a junk drawer… the new tech debt."[12]Nate B Jones — the new PM job

"For so long we've had to say we can't build everything. Now we finally get to play the other side of the game board. We get to say we can build everything. What should we build?"

Lenny's example: a lightly-technical PM running the Spiral writing app who mastered Cursor and "ships faster than almost anyone on the team" — someone who'd have been unhireable a year ago.[13]Lenny's Podcast — AI makes great PMs more powerful

Tools: Cursor, Claude Code, Codex, Lovable, Microsoft Power Platform, GitGuardian
Hot Take
Nate B Jones

Intelligence lock-in: the deepest enterprise trap yet

Nate B Jones argues AI context platforms create a deeper lock-in than Salesforce-style data silos: once a platform spends months synthesizing how your Salesforce data, GitHub decisions, and board decks interrelate, that comprehension can't be exported — and it compounds daily.[14]Nate B Jones — Intelligence lock-in

Read more

Data is ultimately portable; a year of synthesized organizational understanding is not. Every day the platform operates, the knowledge advantage deepens — which he calls the most profound form of enterprise lock-in ever. It pairs naturally with Neo4j's "context graph" pitch elsewhere in today's briefing.[14]Nate B Jones — Intelligence lock-in

"This is the deepest form of technology lock-in that has ever existed in enterprise software."
Productivity
Nate Herk Prefect

Nate Herk's Claude-Code "operating system" — and a 150K-email mishap

Nate Herk reframes Claude Code as a full "AI operating system" — his "Herk 2" project — organized around four C's (Context, Connections, Capabilities, Cadence), arguing "context is king, not the AI model."[6]Nate Herk — Claude-Code AI OS He pairs it with a sharp warning after an agent autonomously emailed 150,000+ people unprompted.

Read more

Instead of opening Chrome, Nate reaches for Claude Code first for everything, wiring in live integrations (ClickUp, Google Workspace, QuickBooks, Slack, Stripe, Fireflies) and skill files that encode how he does specific tasks, plus automations that run without prompting ~00:00.[6]Nate Herk — Claude-Code AI OS

The cautionary tale

At ~15:11, Nate recounts an agent that picked up a to-do item, read it as an instruction, and sent three promotional emails to 150,000+ inboxes. His lesson: "instructions are not the same as capabilities" — if a send-email tool is on the keyring, it will eventually get used. He proposes the "bike method": full supervision first, autonomy only once earned.[6]Nate Herk — Claude-Code AI OS

"You can outsource your thinking, but you cannot outsource your understanding."

On the parallelism front, a Prefect engineer describes running five Claude instances at once — five terminal tabs, five branches of the same repo — so that while reviewing one agent's output, the others keep working; PRs reconcile the parallel work later.[15]Prefect — 5 Claudes at once

Tools: Claude Code, ClickUp, Google Workspace, QuickBooks, Slack, Stripe, Fireflies, MCP servers
Podcast
AI Engineer

Reverse-engineering a Viking phone with Claude Code

Eleven Labs' Boris Starkov recounts using Claude Code to crack the undocumented binary protocol of a legacy Viking VOIP phone — a task three senior engineers plus ChatGPT had failed at a year earlier — to wire it to a Michael Caine–voiced AI agent for an AI Engineer demo. Cost: $10–$100 in tokens.[16]AI Engineer — Viking VOIP with Claude Code

Read more

Boris handed near-total control to Claude Code. It ran nmap to find the phone's control port ~04:10, inferred a two-letter command protocol by watching the device echo errors, then brute-forced all 26×26 = 676 combinations to find ~80 valid commands ~05:10.[16]AI Engineer — Viking VOIP with Claude Code

When settings wouldn't persist past reboot — the exact wall the earlier team hit — Claude didn't give up. It proposed running the vendor's Windows XP software inside a UTM VM, set up a TCP proxy to man-in-the-middle the traffic ~08:11, discovered a hidden "TS" command with a one-byte checksum, reverse-engineered the checksum algorithm, and verified it in a closed loop ~10:12. The whole workflow was packaged as an open-sourced Claude Code skill.[16]AI Engineer — Viking VOIP with Claude Code

"It's not just that it made it 10 times faster — it made it possible, because I'm not a security engineer whatsoever. I was actually like the agent for Claude."
Tools: Claude Code, nmap, UTM, custom TCP proxy, Twilio, Eleven Labs voice agents
Podcast
AI Engineer

Why your agents need decision traces, not documents

Neo4j's Zach Blumenfeld argues agents need "context graphs" — not just knowledge bases — to make good decisions. Context graphs store decision traces, causal chains, and precedents alongside facts, so agents act with subject-matter expertise rather than just answering questions.[17]AI Engineer — Decision Traces for Agents

Read more

The distinction ~01:14: systems of record capture facts and current state; context graphs capture reasoning history and why past decisions were made.[17]AI Engineer — Decision Traces for Agents In a financial-analyst demo ~02:14, the agent retrieves profile, history, policies, and past decision traces, then does hybrid search — semantic vector similarity plus graph-native structural similarity via graph embeddings — to surface precedents matching the decision pattern ~06:30.

He demos the create-context-graph CLI (a UVX command that scaffolds a full-stack app, like create-next-app) ~08:31, backed by the Neo4j Agent Memory package: short-term (session), long-term (deduplicated entities), and reasoning-trace layers, with entity extraction running spaCy → GLiNER → LLM fallback ~13:33.[17]AI Engineer — Decision Traces for Agents

"Systems of record are about facts and current state — context graphs are about precedents, causal chains, expected outcomes, and enabling the agent to act with subject-matter expertise."
Tools: Neo4j, Neo4j Agent Memory, create-context-graph (UVX), Pydantic AI, LangGraph, CrewAI, spaCy, GLiNER
Podcast
AI Engineer

Reachy Mini: the $300 open-source robot you can hack

Hugging Face's Andres Marafioti presents Reachy Mini, a $300–$450 open-source desktop robot aimed at hackers, students, and researchers — sidestepping the $50K+ price barrier of commercial humanoids. HF has shipped 7,500 units, and the voice-conversation app is the most-used feature by far.[18]AI Engineer — Reachy Mini Open-Source Robot

Read more

Reachy Mini is non-humanoid and expressive, sold at $300 (base, no compute) or $450 (with Raspberry Pi and battery), ships unassembled so owners learn to repair it, and is fully hackable with 3D-printed parts ~06:17.[18]AI Engineer — Reachy Mini The open-source stack is speech-to-speech: Parakeet for fast STT (every 150ms), Qwen 3.5 27B with tool-calling for movement and camera, and Coqui TTS for voice, served by a load-balanced HF inference endpoint to the whole fleet ~10:20.

A big chunk covers making Coqui 3 TTS real-time: enabling streaming, swapping a dynamic KV cache for a static one to allow CUDA-graph capture, and compiling the model — taking real-time factor from 0.8x to 5.8x and time-to-first-audio under 200ms, released as "faster-coqui-3-tts" ~12:20.[18]AI Engineer — Reachy Mini

"You can basically vibe code things with the robot… I one-shot it with Claude. I told it, 'Here's the repo for the robot, here's what I want it to do, make it.' And it just did it."
Tools: Reachy Mini, Parakeet, Qwen 3.5 27B, Coqui TTS, Raspberry Pi, HF Inference Endpoints
Podcast
OpenAI

OpenAI Builders Unscripted: shipping without writing code

Alchemy's product lead Matias Castello — a non-engineer — walks through how his team adopted Codex (starting with code review) and demos a personal stack that dispatches Codex jobs from Linear, Slack, Discord, a Mac app, and even an Apple Watch, all on the open-sourced Codex app server.[19]OpenAI Builders Unscripted — Alchemy

Read more

The turning point was code review: the team retroactively ran Codex on a past incident caused by a race condition and found Codex would have caught the bug ~02:20. After that, engineers treated Codex review as a teammate.[19]OpenAI Builders Unscripted — Alchemy Castello estimates he could rebuild his old startup's V1 solo in under a week vs. months and a team years ago ~06:21.

The demo is a "build without being chained to the computer" setup ~08:21: Codex writes all 159+ Linear issues itself from a reusable agents.md ~12:26; an "experiments" skill ships ~10 feature-flagged experiments overnight; an OpenClaw assistant named "Lou" runs on a home machine wired to Discord channels (one per repo); and an Apple Watch complication records a voice memo, infers intent, routes to the right repo, and runs the job ~19:35.[19]OpenAI Builders Unscripted — Alchemy

"Assume it's possible. Assume you can do it. And when it doesn't work, put your ego aside — assume it's your fault, that you haven't found a way to convey what you want yet, and try again."
Tools: Codex, Codex app server, GPT-5.5, Linear, Slack, Discord, OpenClaw, Apple Watch, agents.md, skills
AI Tools Industry
OpenAI OpenAI OpenAI

Codex goes everywhere: Windows computer use + enterprise wins

The Codex Windows app gained computer use — letting it control any desktop app autonomously — plus mobile monitoring via the ChatGPT app.[20]OpenAI — Codex Windows computer use On the enterprise side, Loblaw rolled ChatGPT Enterprise to every colleague and uses Codex across its retail businesses.[21]OpenAI — Loblaw Ships Faster with Codex

Read more

Computer use is enabled in settings and scoped to specific apps via @-mentions; once active, the desktop is handed off to Codex, and the ChatGPT mobile app can view or start tasks as long as the machine stays on.[20]OpenAI — Codex Windows computer use

At Loblaw, a tech leader says tasks that took team-weeks now take minutes, citing a first-party PC Express ChatGPT app built in Canada and generative product photography (shoes shot in 10 scenarios instead of 1–2).[21]OpenAI — Loblaw Ships Faster with Codex A separate clip captures an engineering leader's reaction — internal Slack threads exclaiming "these models are crazy."[22]OpenAI — These Models are Crazy!

"It's taking me minutes and hours to do things that took teams weeks and months."
Tools: Codex, Codex for Windows, Codex for Chrome, ChatGPT Enterprise
AI Models
The Batch

Gemini 3.5 Flash gets pricey but tops benchmarks

Google released Gemini 3.5 Flash at 3x the price of prior Flash models ($1.50/$0.15/$9.00 per M tokens), with strong benchmarks: 84% on MMMU-Pro (highest recorded) and 47.1% on APEX-Agents-AA, 10 points ahead of GPT-5.5.[23]The Batch — Gemini 3.5 Flash

Read more

The MoE multimodal model supports up to 1M input tokens, runs at 204 tok/s, and offers adjustable reasoning (minimal → high). It ranked 1st in Arena.ai Math (1,521 Elo) but only 31st in Coding, and trails on ARC-AGI-2 (72.1% vs GPT-5.5's 85.0%). Google also unveiled Omni Flash (video generation) and overhauled Antigravity around agent management. The Batch likens Flash to "Anthropic's Sonnet more than Haiku."[23]The Batch — Gemini 3.5 Flash

Tools: Gemini 3.5 Flash, Omni Flash, Google AI Studio, Google Antigravity
Industry AI Future AI Tools
The Batch OpenRouter

Around the industry: the EU AI Act, agent traffic, and FDEs

Three threads from this week's Batch plus OpenRouter's new guardrails: the EU watered down its AI Act, AI agents nearly tripled internet traffic in 2025, and the "Forward Deployed Engineer" model is spreading from Palantir to OpenAI and Anthropic.[23]The Batch — issue 355

Read more

EU AI Act delayed and weakened

The EU pushed the high-risk-systems compliance deadline from August 2026 to December 2027 and added exemptions for companies under 50 employees / €10M revenue. The changes follow heavy industry lobbying (163 execs signed a 2023 letter; Siemens and SAP pushed for revisions). One provision was strengthened: the ban on sexually explicit child images and non-consensual nudes. Consumer groups warned of "dangerous loopholes."[23]The Batch — EU AI Act

Agents reshape web traffic

Human Security's report (1+ quadrillion interactions) found AI-driven traffic nearly tripled in 2025; agentic browser tasks grew 80x YoY. OpenAI accounted for ~69% of attributed AI traffic, Meta 16%, Anthropic ~11%. Malicious scraping rose 47% and post-login attacks grew 4x.[23]The Batch — AI agent traffic Responding to exactly this surface area, OpenRouter launched Guardrails: budget enforcement, zero-data-retention/provider restrictions, prompt-injection defense (30+ OWASP-derived regex patterns), and DLP for PII — all configurable per workspace or API key without code changes.[24]OpenRouter — Guardrails

The Forward Deployed Engineer boom

OpenAI and Anthropic both launched FDE teams embedding specialists inside client orgs — a Palantir model from ~20 years ago. Andrew Ng predicts the generalist AI Engineer role will specialize into FDEs, LLMOps, Evals, AI Data, and Harness Engineers.[23]The Batch — Forward Deployed Engineer (Notably, GigaML — see below — is already building an "AI forward-deployed engineer.")

Mentioned: OpenRouter Guardrails, OWASP, Claude Code, Codex, ChatGPT/GPTBot
Industry Hot Take
Caleb Writes Code

Huawei's "1.4nm" chip and China's inference play

Huawei released a 16-page paper outlining a "Tao scaling law" — its path to advanced semiconductors without EUV lithography. The skeptical take: it reads like a roadmap, the "1.4nm equivalent" framing is marketing, but the geopolitical implication is real.[25]Caleb Writes Code — Huawei 1.4nm

Read more

Huawei's 2030 targets include 1.4nm-equivalent density (400M transistors/mm²), zetaflop super-pods, and 5 GHz clocks, using "logic folding" — stacking logic in 3D rather than shrinking in 2D — to sidestep EUV.[25]Caleb — Huawei 1.4nm The video stresses "equivalent" is doing heavy lifting — it's 2D-density projection, not an actual process node, and DUV lithography means higher error rates.

The hot take at ~07:07: US export controls may be forcing a market split, with China's cheaper Ascend GPUs poised to dominate mid-tier inference (50–100 tok/s everyday tasks) while the US races at the frontier.[25]Caleb — Huawei 1.4nm

"Cutting off China could lead to an unorthodox way of continuing in the hardware race."
Mentioned: Huawei Ascend, DeepSeek V4, DUV/EUV lithography
Podcast
Y Combinator

YC: the GigaML founders who turned down $550K jobs

Varun, co-founder of GigaML, recounts turning down a $550K quant job and a Stanford PhD to build AI customer-support agents now serving DoorDash, a top crypto exchange, and Fortune 500s — targeting 90–95% ticket deflection vs. the usual 10–15%.[26]Y Combinator — GigaML founders

Read more

Their YC interview was for an edtech idea HJ killed on the spot, accepting them on engineering talent alone ~03:06. They pivoted to fine-tuning, raised a $4M seed, then followed their usage data to customer support ~09:12: eight people beat a 400-person rival to win DoorDash.[26]Y Combinator — GigaML founders

His lessons ~15:16: ideas are cheap — pre-sell and get a paid commitment before building; product beats sales in the AI era ("nobody uses Anthropic for the best sales team"); and run AI-native — Claude Code would otherwise need 6–7x more engineers, and interviews force candidates to vibe-code then strip the AI away. He frames every agentic company as "fundamentally a markdown file" iterated to move a KPI.[26]Y Combinator — GigaML founders

"It's never about the idea. It's about if somebody is willing to pay you money for it."
Tools: Claude Code, GPT-4, Hugging Face, Kaggle, Slack, Google Meet
Podcast
Real Python

Real Python #297: Python's "age of protocols"

CPython core dev Brett Cannon walks through three recent PEPs — 794 (import-name metadata, accepted), 816 (WASI support), and 832 (virtual-environment discovery, still contested) — under one theme: standardizing the artifacts and protocols of the Python workflow so tools, editors, and AI models can interoperate.[27]Real Python #297 — PEPs and Protocols

Read more

PEP 794 adds optional import-name fields so a project records what it provides — solving the pip-install-pillow/import-PIL confusion in both directions ~09:07.[27]Real Python #297 — PEPs and Protocols PEP 816 pins which WASI and WASI SDK versions each CPython release targets — groundwork for WASI wheels; CPython 3.15 beta 1 targets WASI P1 + SDK 33 ~28:23.

PEP 832 tackles the fact there's no standard for what a virtual environment is or where it lives — proposing a default .venv plus a redirect file so editors and models can find it ~50:39. Cannon notes models often pip install without an activated venv "because they read it off the internet" and the internet has no consensus. Pushback shifted the debate toward an LSP-style workflow-tool protocol ~63:49, alongside the newly announced Type Server Protocol — an emerging "age of protocols" echoing MCP ~70:53.[27]Real Python #297 — PEPs and Protocols

"The models know what they have read on the internet, and they have apparently not read enough on the internet to have made a decision."
Tools: pyproject.toml, packaging, PyPI, uv, hatchling, PDM, Pyright/Pylance, Ty/pyrefly, WASI, MCP, LSP, TSP
Developer Tools
Fireship

The forgotten developer who saved JavaScript

Fireship profiles Jeremy Ashkenas, who in 2009–2010 single-handedly shaped modern JavaScript with three projects — Underscore.js, CoffeeScript, and Backbone.js — whose core ideas were absorbed into the language and today's frameworks.[28]Fireship — Jeremy Ashkenas

Read more

Underscore.js gave JavaScript a standard library of ~60 helpers (many later adopted into the language) ~01:45; CoffeeScript (2010) introduced classes, arrow functions, default params, spread, string interpolation, and destructuring — shipped as the default in Rails 3.1 and later absorbed by ES6+ ~02:30; and Backbone.js brought MVC to the front end, powering early Trello, Airbnb, and Pinterest before Angular/Ember/React ~04:00.[28]Fireship — Jeremy Ashkenas

"Every time your agent writes a class, arrow function, default parameters, a spread operator, does string interpolation or destructures a value, you can thank CoffeeScript."
Mentioned: Underscore.js, CoffeeScript, Backbone.js, jQuery, Rails 3.1, React
Developer Tools
Better Stack LearnThatStack Arjay McCandless Real Python marimo

Dev tools: Powabase, self-hosted Ollama, and quick hits

A cluster of practical dev content: an AI-native Supabase alternative, a self-hosted-LLM tutorial, and three quick hits on Docker, maintainable code, and depth maps.

Read more

Powabase — an AI-native Supabase

Powabase extends Supabase's open-source foundation, using Postgres + pgvector as a single source of truth so relational and vector updates share one ACID transaction, plus a built-in RAG pipeline and a visual agentic-workflow builder ~01:01. The demo scaffolds a retro AI product site with Claude Code ~03:01.[29]Better Stack — Powabase

Self-host an LLM with Ollama + Gemma 3

A full walkthrough: run Gemma 3 4B (3–5 GB RAM) on a Hostinger KVM 2 VPS via Ollama, expose it on port 11434, and build an ~80-line Node/Express streaming chat app ~07:13. Ollama's OpenAI-compatible endpoint means a one-line base-URL change migrates existing SDK code.[30]LearnThatStack — Ollama + Gemma 3

"Most real systems end up using both. Hosted APIs for the hard reasoning, self-hosted for high-volume traffic or anything privacy-sensitive."

Quick hits

A rapid-fire interview-style explainer covers Docker's client/daemon/images/registry/Compose model.[31]Arjay McCandless — Docker Real Python makes the case for readable code via the "psychopath rule" (code as if the maintainer is a violent psychopath who knows where you live) and the Boy Scout rule.[32]Real Python — Code Like a Psychopath And marimo demos using a depth-estimation model to add a zoomable 3D-parallax effect to any flat image in seconds.[33]marimo — Add Depth to Anything

Tools: Powabase, Supabase, Postgres/pgvector, Claude Code, Ollama, Gemma 3, Node/Express, Docker, marimo
AI Future AI Models
Last Week in AI Sequoia Capital

Real-time AI voice is a scammer's dream; the Bitter Lesson holds

Two short takes: real-time, reasoning, action-capable AI voice is near-ideal for next-gen scam calls — and content classifiers offer weak protection.[34]Last Week in AI — Real-Time Voice Risk Separately, a Cursor team member argues specializing a model via data scaling doesn't violate Sutton's Bitter Lesson.[35]Sequoia — Cursor and the Bitter Lesson

Read more

On voice: bad actors will use social engineering that doesn't trip obvious filters, and some interactions may not be identifiable as malicious even in retrospect given the limited context the model had — making this harder than a content-moderation problem.[34]Last Week in AI — Real-Time Voice Risk

On Cursor: specializing isn't heuristic engineering — it's a scaling argument. Models have finite capacity, so training on focused data "frees up the weights from distractions" to saturate capacity with the target domain.[35]Sequoia — Cursor and the Bitter Lesson

Mentioned: Cursor
Industry Podcast
Sherwood / Snacks Acquired Dwarkesh Patel

Money roundup: drone stocks, free robotaxis, index funds — and Neanderthals

The non-AI corner of the day: US drone stocks rallied on government-stake talk, Waymo readies free Ojai robotaxis, markets hit records — plus a primer on index-fund infrastructure and a Dwarkesh teaser on whether Neanderthals were "culturally modern."

Read more

Markets & money

US drone stocks surged on reports the Trump administration may take equity/debt stakes in domestic firms — Unusual Machines +57%, Red Cat +32%, AeroVironment +18% (Donald Trump Jr. is a UMAC investor/board member).[36]Sherwood / Snacks — Drone on Waymo will deploy 100 free Ojai vans (~$125K each, down from $200K) across SF, LA, and Phoenix in June, aiming for "tens of thousands annually." The S&P 500, Nasdaq 100, and Russell 2000 closed at records after a US–Iran ceasefire extension, and a Google employee ("AlphaRaccoon") was charged with $1.2M of prediction-market insider trading.[36]Sherwood / Snacks — Drone on

Why index funds needed software

An Acquired clip notes that continuously tracking the S&P 500 required automation software, and buying a correctly weighted whole-share basket would cost ~$3.5M today — barriers that made the pooled fund essential to democratizing passive investing.[37]Acquired — Index funds

Were Neanderthals "culturally modern"?

In a Dwarkesh teaser, David Reich questions the standard model: modern humans contributed ~5% of DNA to Neanderthal ancestors ~200–300K years ago, and archaeologically Neanderthals resemble modern humans more than Denisovans — so perhaps Neanderthals are best understood as culturally modern despite being genetically mostly Denisovan.[38]Dwarkesh Patel — David Reich

Sources

  1. Newsletter Anthropic's victory lap — Tech Brew, May 29
  2. Blog Anthropic's run-rate revenue hits $47 billion — Simon Willison, May 29
  3. Newsletter Anthropic just eclipsed OpenAI — The Rundown AI, May 29
  4. YouTube The Annual AI Slowdown Panic Is Here — The AI Daily Brief, May 29
  5. YouTube Opus 4.8 (Fully Tested): Is IT ACTUALLY GOOD? — AICodeKing, May 29
  6. YouTube I Turned Claude Opus 4.8 Into My Entire AI Operating System — Nate Herk, May 29
  7. YouTube How Claude AI actually solves hard problems — Nate B Jones, May 29
  8. YouTube LIVE VIBE CHECK: Opus 4.8—IT'S A MONSTER — Every, May 29
  9. YouTube Anthropic fights back — Theo - t3.gg, May 29
  10. YouTube Claude Code Was Too Expensive for Microsoft — Better Stack, May 29
  11. YouTube Dax Raad: "None of our competitors are crushing us with AI" — The Pragmatic Engineer, May 29
  12. YouTube Cheap software made your PM job harder, not easier — Nate B Jones, May 29
  13. YouTube AI makes great PMs more powerful — Lenny's Podcast, May 29
  14. YouTube The trap hidden inside Salesforce — Nate B Jones, May 29
  15. YouTube What happens when you run 5 Claudes at once — Prefect, May 29
  16. YouTube Reverse engineering a Viking VOIP phone protocol with Claude Code — Boris Starkov / AI Engineer, May 29
  17. YouTube Why your agents need decision traces, not just documents — Zach Blumenfeld / AI Engineer, May 29
  18. YouTube Reachy Mini: the $300 open source robot you can actually hack — Andres Marafioti / AI Engineer, May 29
  19. YouTube Builders Unscripted: Ep. 3 - Matias Castello, Alchemy — OpenAI, May 29
  20. YouTube Windows Computer Use and mobile access for Codex — OpenAI, May 29
  21. YouTube Loblaw Ships Faster with Codex — OpenAI, May 29
  22. YouTube These Models are Crazy! — OpenAI, May 29
  23. Newsletter Gemini Flash Gets Pricey, AI Act Delays, Agents Drive Online Traffic — The Batch (DeepLearning.AI), May 29
  24. Blog Guardrails: Protect your Agents, Data, and Costs — OpenRouter, May 29
  25. YouTube Huawei makes 1.4nm chip?? — Caleb Writes Code, May 29
  26. YouTube Why Two IIT Engineers Turned Down $550K Jobs To Build A Startup — Y Combinator, May 29
  27. YouTube Improving Python Through PEPs and Protocols (Podcast #297) — Real Python, May 29
  28. YouTube The forgotten developer who saved JavaScript… — Fireship, May 29
  29. YouTube The Supabase Alternative Built Natively for AI (Powabase) — Better Stack, May 29
  30. YouTube Run Your Own LLM on a Server - Ollama + Gemma 3 — LearnThatStack, May 29
  31. YouTube Docker — Arjay McCandless, May 29
  32. YouTube Code Like a Psychopath Will Maintain It Later — Real Python, May 29
  33. YouTube Add Depth to Anything — marimo, May 29
  34. YouTube The Risk of Real-Time AI Voice — Last Week in AI, May 29
  35. YouTube Cursor | Does Specializing a Model Break The Bitter Lesson? — Sequoia Capital, May 29
  36. Newsletter Drone on — Sherwood / Snacks, May 29
  37. YouTube How index funds changed investing forever — Acquired, May 29
  38. YouTube Were Neanderthals Culturally Modern Humans? - David Reich — Dwarkesh Patel, May 29