Claude dethroned OpenAI. Then it cut your stipend.

May 13, 2026

22 topics · 40 sources

Industry Hot Take AI Tools
Nate Herk Matt Pocock Anthropic News AI Daily Brief

Anthropic crosses OpenAI in business — and immediately cuts your AFK stipend

For the first time, Anthropic has overtaken OpenAI in enterprise adoption — 34.4% vs 32.3% in April[1]Nate Herk — Anthropic Just Dethroned OpenAI. Within 45 minutes of the data dropping, Sam Altman offered Claude Code switchers two months of free Codex; Anthropic counter-punched with a 50% raise to Claude Code weekly limits through July 13[1]Nate Herk. But under the celebratory headline is a stealth squeeze: Anthropic also announced that starting June 15, programmatic and "AFK" workloads (Agent SDK, claude -p, GitHub Actions) will draw from a separate dedicated credit — $200/mo on Max 20x, which third parties peg as a ~25x reduction in AFK capacity for power users[2]Matt Pocock — Anthropic's "dedicated monthly credit" is actually a huge cut. Meanwhile Anthropic is courting the next adoption frontier with Claude for Small Business — 15 ready-made agentic workflows wired into QuickBooks, PayPal, HubSpot, Canva, DocuSign and Google/MS Workspace[3]Anthropic — Introducing Claude for Small Business. On top of that, both labs voided unauthorized SPVs and tokenized stock instruments, crashing Anthropic's gray-market price ~50%[4]AI Daily Brief — secondary-market crackdown.

The adoption flip and the 45-minute price war

Anthropic rose 3.8 points in April to 34.4% business penetration; OpenAI dropped 2.9 points to 32.3%[1]Nate Herk. Altman's tweet offered two months of free Codex to Claude Code switchers; Anthropic's response landed in under an hour ~00:00. Nate Herk's frame: this is the free-sample phase.

You are not the customer. You are the training data. You're not paying 200 bucks a month for AI. You're paying 200 bucks for 12 to 24 month exemption from real prices.

The AFK credit is a 25x rate cut

Starting June 15, Pro plans get $20/mo, Max 5x gets $100/mo, and Max 20x gets $200/mo in dedicated programmatic credits, covering the Agent SDK, claude -p, and Claude GitHub Actions[2]Matt Pocock. Matt Pocock points out that the Max 20x plan previously delivered roughly $5,000/mo of API-equivalent AFK throughput; the new $200 credit is ~1/25th of that. Human-in-the-loop usage (Claude.ai, terminal/IDE Claude Code, Co-work) is unchanged. He's now buying an OpenAI subscription to run AFK workloads on Codex, where there is no split.
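The 25x figure follows directly from the two numbers quoted above. A quick check, taking Pocock's roughly $5,000/mo estimate as given (it is a third-party estimate, not an Anthropic figure; the variable names are only for illustration):

```python
# Pocock's estimate of prior Max 20x AFK throughput, in API-equivalent dollars.
# This is a third-party estimate quoted in the text, not an official number.
PRIOR_API_EQUIVALENT_USD = 5_000

# Dedicated programmatic credit per plan from June 15, as listed above.
NEW_CREDIT_USD = {"Pro": 20, "Max 5x": 100, "Max 20x": 200}


def reduction_factor(prior_usd: float, new_usd: float) -> float:
    """How many times smaller the new budget is than the old one."""
    return prior_usd / new_usd


factor = reduction_factor(PRIOR_API_EQUIVALENT_USD, NEW_CREDIT_USD["Max 20x"])
print(f"Max 20x AFK capacity shrinks by about {factor:.0f}x")  # 5000 / 200 = 25
```

The result is only as good as the $5,000 estimate; a lower prior throughput would give a proportionally smaller cut.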

Small Business is the next adoption land grab

Anthropic's SMB push includes 15 pre-built workflows — payroll planning, monthly close with QuickBooks/PayPal, lead triage, campaign attribution, HR and customer service — plus a free AI Fluency course co-developed with PayPal and a 10-city Claude SMB Tour starting May 14[3]Anthropic — Claude for Small Business.

Claude helps take the late-night work off their plates. — Daniela Amodei

Secondary-market crackdown

Anthropic and OpenAI both updated docs to explicitly void SPV-based and tokenized transfers of their stock, naming offending firms[4]AI Daily Brief. Anthropic's gray-market price fell roughly 50% on the news, exposing a wider problem: as labs stay private longer, layered SPVs and tokenized receipts have proliferated and retail buyers may be left holding worthless paper.

Tools: Claude Code, Codex, Claude Agent SDK, claude -p, Claude GitHub Actions, QuickBooks, PayPal, HubSpot, Canva, DocuSign, Google Workspace, Microsoft 365.
Industry
AI Daily Brief

OpenAI Deployment Company: a $4B consulting JV at a $10B valuation

OpenAI officially spun up the OpenAI Deployment Company ("Deploy Co.") — a consulting JV with 19 partners, $4B raised at a $10B pre-money valuation, led by TPG with Advent, Bain Capital, and Brookfield as co-lead founding partners[4]AI Daily Brief — Deploy Co. launch. The vehicle is built around acquiring engineering firm Tomorrow (~150 staff). Goldman Sachs has backed both Deploy Co. and Anthropic's still-unnamed equivalent — a tell that no single lab can absorb enterprise demand alone.

The episode framed Deploy Co. ~01:00 as OpenAI's answer to the gap between model capability and enterprise readiness — partners include Advent International, Bain Capital, Brookfield, plus consulting and private equity firms. The consensus on the call: customers need an entire support stack (org change, data engineering, ops re-design) to make agentic AI actually land, and OpenAI doesn't want to build that linearly inside its own walls.

AI Future
AI Daily Brief

Thinking Machines Lab ships its "interaction model"

Mira Murati's Thinking Machines Lab released a new model architecture aimed at real-time human-AI collaboration: 200ms micro-turns, parallel input/output streams, paired with a background reasoning model for agentic tasks[4]AI Daily Brief — Interaction Models. The Daily Brief framed it as the "GUI moment" for AI — moving past the request-response chat box to a model that can interrupt, listen, and act continuously.

The headline change is architectural ~11:05: most LLMs are call-and-response, but TML's interaction model fires every ~200ms whether or not the user has said anything, with separate parallel streams for incoming audio/video and outgoing speech/tools. A background "thinker" handles long-horizon reasoning. The bet: continuous interaction unlocks UX patterns chat never could — interrupting, anticipating, course-correcting mid-action.
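The difference from call-and-response can be sketched as a fixed-rate loop that runs on every tick, whether or not input arrived. This is a toy simulation only: `InputEvent`, `run_micro_turns`, and the "respond"/"listen" policy are hypothetical names, and nothing here reflects how TML's model is actually built (the parallel streams and background thinker are not modeled).

```python
from dataclasses import dataclass


@dataclass
class InputEvent:
    at_ms: int      # when the user input arrived
    payload: str    # e.g. a chunk of transcribed audio


def run_micro_turns(events, duration_ms, tick_ms=200):
    """Step a fixed-rate loop and record what the model sees on each tick.

    Unlike request-response, the loop runs on every tick, including ticks
    where no new input arrived, so the model always has a chance to act.
    """
    pending = sorted(events, key=lambda e: e.at_ms)
    turns = []
    for tick_start in range(0, duration_ms, tick_ms):
        tick_end = tick_start + tick_ms
        # Collect input that arrived during this tick window.
        seen = [e.payload for e in pending if tick_start <= e.at_ms < tick_end]
        # Placeholder policy: respond to input, otherwise stay attentive.
        action = "respond" if seen else "listen"
        turns.append((tick_start, seen, action))
    return turns


turns = run_micro_turns(
    [InputEvent(50, "hey"), InputEvent(450, "stop")], duration_ms=1000
)
for tick_start, seen, action in turns:
    print(tick_start, seen, action)
```

One second of simulated time yields five turns, three of which have no input at all. Those empty turns are where a continuous model could choose to interrupt or anticipate.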

Developer Tools
OpenAI Blog

Codex on Windows: building a sandbox with no help from the OS

OpenAI engineered a custom multi-binary sandbox for Codex on Windows after determining that AppContainer, Windows Sandbox, and MIC labeling were all unsuitable for open-ended agentic developer workflows[5]OpenAI Blog — Codex Windows sandbox. The final design dedicates two local Windows users (online/offline), enforces network restrictions via Windows Firewall, and uses write-restricted tokens with synthetic SIDs to bound filesystem damage.

Iteration one — the "unelevated sandbox" — ran commands as the real user with a write-restricted token, synthetic SIDs, and proxy-env tricks (GIT_SSH_COMMAND, PATH stub binaries) to suppress network. It was too slow to set up dynamically and network blocking was advisory only — a child process opening sockets directly bypassed it.

Iteration two — the elevated sandbox — splits the system into four binaries: codex.exe, a one-time elevated setup binary (codex-windows-sandbox-setup.exe) that provisions CodexSandboxOffline/CodexSandboxOnline users with DPAPI-encrypted credentials and Windows Firewall rules, a privilege-bridge runner (codex-command-runner.exe) that bridges CreateProcessWithLogonW and CreateProcessAsUserW, and the actual child. Each layer exists because Windows lacks a single primitive that maps to "safe autonomous coding agent"[5]OpenAI Blog.

Windows did not hand us one primitive that cleanly maps to 'safe autonomous coding agent.' We composed several tools and concepts to build something coherent.
Industry Developer Tools
OpenAI Blog

TanStack npm attack hits two OpenAI devices

On May 11, attackers compromised TanStack (a widely-used JS library) as part of the "Mini Shai-Hulud" supply-chain wave. Two OpenAI employee devices ran the malicious package; investigation found limited credential exfiltration from a narrow slice of internal repos, including iOS/macOS/Windows code-signing certificates — but no customer data, production access, or IP touched[6]OpenAI Blog — TanStack response.

OpenAI is rotating all signing certs. macOS users must update ChatGPT Desktop, Codex App, Codex CLI, and Atlas before June 12, 2026 — after which builds signed with the old cert will be blocked by macOS notarization. Windows and iOS users need no action. The kicker: the attack landed during a phased rollout of supply-chain hardening controls (package-manager minimumReleaseAge, CI/CD credential hardening) — the two affected devices hadn't received the new configs yet.
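The idea behind a minimum-release-age control is a cooldown: a freshly published version is not installable until it has been public long enough for a compromise to be noticed. A minimal sketch of that rule, assuming a one-day threshold and an invented version history (this illustrates the logic only and is not any package manager's implementation or configuration syntax):

```python
from datetime import datetime, timedelta, timezone


def old_enough(published_at, now, minimum_release_age):
    """True if a version has been public for at least the cooldown period."""
    return now - published_at >= minimum_release_age


def installable_versions(versions, now, minimum_release_age):
    """Filter a {version: publish_time} mapping down to versions past cooldown."""
    return [
        version
        for version, published_at in versions.items()
        if old_enough(published_at, now, minimum_release_age)
    ]


now = datetime(2026, 5, 11, 12, 0, tzinfo=timezone.utc)
# Hypothetical package history: the newest release is two hours old.
versions = {
    "1.4.0": now - timedelta(days=30),
    "1.4.1": now - timedelta(days=3),
    "1.4.2": now - timedelta(hours=2),
}
safe = installable_versions(versions, now, timedelta(days=1))
print(safe)
```

Under these assumptions the two-hour-old release is skipped, which is the window in which a hijacked publish is most likely to be live and undetected.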

This incident reflects a broader shift in the threat landscape: attackers are increasingly targeting shared software dependencies and development tooling rather than any single company.
Podcast AI Models
OpenAI Build Hour

OpenAI Build Hour: GPT-Realtime-2 ships with GPT-5-class reasoning

OpenAI's Terry walked through three voice releases in one drop: a real-time translation model (70+ input / 13 output languages), GPT-realtime-whisper (sub-200ms streaming STT across 80 languages), and GPT-Realtime-2 — billed as their smartest voice model with GPT-5-class reasoning, 4x larger context (128k / ~1 hour), parallel tool calls, dynamic voice cloning, and controllable expressiveness[7]OpenAI Build Hour: GPT-Realtime-2.

Three architectural patterns enabled by the new models ~00:00: voice-to-action (hands-free voice apps), systems-to-voice (a voice chief-of-staff that turns dashboard data into speech), and voice-to-voice (T-Mobile-style global customer service).

What's new in Realtime-2

  • Preambles — model can verbally say "let me check on that" before reasoning, more like a human.
  • 128k context window (4x prior, ~1 hour of conversation) reduces truncation and improves instruction-following.
  • Parallel tool calls remove the waterfall; better domain vocab for healthcare/AI verticals.
  • Dynamic voice cloning supports multiple distinguishable speakers; dynamic tone matching adapts to the user.
  • Controllable expressiveness via prompt — whisper, excited, jealous, etc.
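The "waterfall" that parallel tool calls remove can be illustrated with plain asyncio. The sketch below does not use the Realtime API; `call_tool` and the tool names are hypothetical stand-ins, and the sleep simulates latency.

```python
import asyncio


async def call_tool(name, delay_s, log):
    """Stand-in for a tool call; the sleep simulates network latency."""
    log.append(f"start {name}")
    await asyncio.sleep(delay_s)
    log.append(f"end {name}")
    return f"{name}: ok"


async def waterfall(tools, log):
    # Each call waits for the previous one to finish.
    return [await call_tool(name, delay, log) for name, delay in tools]


async def parallel(tools, log):
    # All calls are started together and awaited as a group.
    return await asyncio.gather(*(call_tool(n, d, log) for n, d in tools))


tools = [("lookup_order", 0.02), ("check_inventory", 0.02)]
waterfall_log, parallel_log = [], []
asyncio.run(waterfall(tools, waterfall_log))
results = asyncio.run(parallel(tools, parallel_log))
print(waterfall_log)
print(parallel_log)
```

In the waterfall log each call ends before the next starts; in the parallel log both calls start before either ends, so total latency approaches the slowest call rather than the sum.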
Tools: GPT-Realtime-2, GPT-realtime-whisper, real-time translate model, GPT-5.
AI Models
Two Minute Papers

NVIDIA's 30B multimodal model goes 10x real-time

NVIDIA dropped an open 30B-parameter multimodal model that handles images, video, and audio with linear (not quadratic) context scaling — processing nearly 10 hours of video per hour, roughly 3x faster than Qwen3 Omni and up to 7x faster on documents[8]Two Minute Papers — NVIDIA New AI Is An Efficiency Monster. Needs ~25GB VRAM; license allows commercial use and derivatives but isn't Apache 2.0.

The architectural moves: (1) Mamba layers that scale linearly with context length, so the advantage grows with input size; (2) direct audio tokenization that preserves emotion/tone without a Whisper-like front end; (3) 3D convolutions over blocks of video frames for compression; (4) one small encoder distilling three CLIP-style models; (5) duplicate-frame discard. Not top-tier for pure text reasoning or coding, but for video and audio understanding, the throughput is the story.

AI Tools
AICodeKing

Hermes Agent 3.0 "Tenacity": reliability over features

Hermes Agent 0.13 — codename "Tenacity" — focuses entirely on long-running reliability: durable multi-agent Kanban with heartbeats and zombie detection, a hallucination gate that catches agents falsely claiming task completion, persistent /goal to prevent drift, Checkpoints V2 with auto-resume, and eight P0 security fixes[9]AICodeKing — Hermes Agent 3.0.
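Heartbeat-based zombie detection is a simple idea: a task that claims to be running but has not checked in recently is presumed dead. A minimal sketch, with the caveat that the `Task` fields, the 60-second timeout, and the task names are assumptions for illustration and not Hermes Agent's actual data model:

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    status: str            # "running" or "done"
    last_heartbeat: float  # seconds since some shared clock started


def find_zombies(tasks, now, timeout_s=60.0):
    """Return running tasks whose last heartbeat is older than the timeout."""
    return [
        t.task_id
        for t in tasks
        if t.status == "running" and now - t.last_heartbeat > timeout_s
    ]


board = [
    Task("build-index", "running", last_heartbeat=995.0),
    Task("write-report", "running", last_heartbeat=900.0),
    Task("fetch-data", "done", last_heartbeat=100.0),
]
print(find_zombies(board, now=1000.0))
```

Finished tasks are ignored even if their last heartbeat is old, so only the stalled "write-report" task is flagged for reassignment or resume.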

Beyond reliability, the release adds Google Chat / Slack / Telegram / Mattermost / Matrix / DingTalk / IRC / Teams hooks; a pluggable provider directory (DeepSeek V4 Pro, xAI Grok 4.3, OpenRouter Owl Alpha free route, Tencent HY3 Preview); MCP improvements (SSE transport, OAuth forwarding, stale-pipe retries); a video analyze tool for Gemini; xAI custom TTS with voice cloning; i18n for Chinese, Japanese, German, Spanish, French, Ukrainian, and Turkish; and IDE integrations for Zed, VS Code, and JetBrains. Cron jobs can run as script-only watchdogs without invoking a model, saving cost.

The main theme is simple: reliability. This is not just about more features. It is about making the agent keep going when real workflows get messy.
AI Future Developer Tools
Nate B Jones Data Science Weekly

The new retrieval layer: Pinecone NoQL, knowledge bundles, and "boring beats agentic"

Pinecone shipped Nexus + NoQL, a vector database company effectively conceding that vector search alone is insufficient. It joins a chorus of infrastructure moves admitting the same thing: SAP's 1B+ euro acquisitions of Dremio and Prior Labs, Cloudflare's agent memory, Microsoft GraphRAG, and Google's Cloud Next knowledge architecture[10]Nate B Jones — Pinecone Just Demoted Vector Search. Meanwhile Data Science Weekly argues that for most production data-science work, boring single-LLM-call automations still outperform fully autonomous agents[11]Data Science Weekly — Automations vs Agents.

The four shapes of retrieval

Nate B Jones' synthesis ~01:00: agents on real tasks can burn up to 85% of compute on rediscovery — re-fetching documents the system already summarized — because classic RAG was built for chatbots (one question → three chunks → paragraph). Agents need assembled bundles: customer record + policy + entitlement + ticket history. Four infrastructure responses define 2026:

  • Pinecone Nexus + NoQL ~05:00. Retrieval contracts carry intent, filters, access policy, provenance, response shape, confidence, and budget, not just similarity.
  • PageIndex. For structured documents (contracts, filings) the retrieval unit is a hierarchical document tree, not chunks. 98.7% accuracy on FinanceBench, no embeddings.
  • SAP's Dremio + Prior Labs ~10:00. Enterprise knowledge lives in governed tables, not prose. Tabular foundation models (TabPFN) reason over tables as tables.
  • Microsoft GraphRAG ~13:00. Relational knowledge (supplier-shipment links, incident root cause) needs a graph neighborhood, not a chunk.

The durable principle: the retrieval unit must match the work. A chunk fits a FAQ. A section fits a filing. A table fits financial analysis. A graph neighborhood fits dependency reasoning. Pick the wrong shape and you force the model to compensate.
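One way to picture a retrieval contract is as a typed request object whose response shape is chosen from the kind of work. The sketch below borrows its field names from the list above, but it is purely illustrative: `RetrievalContract`, `contract_for`, and the defaults are hypothetical and are not NoQL syntax or any vendor's API.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievalContract:
    intent: str                      # what the agent is trying to do
    filters: dict = field(default_factory=dict)
    access_policy: str = "caller"    # whose permissions apply
    require_provenance: bool = True  # results must cite their source
    response_shape: str = "chunk"    # "chunk", "section", "table", "graph"
    min_confidence: float = 0.0
    budget_tokens: int = 2_000


# Mapping taken from the principle above: the unit should match the work.
SHAPE_FOR_WORK = {
    "faq": "chunk",
    "filing": "section",
    "financial_analysis": "table",
    "dependency_reasoning": "graph",
}


def contract_for(work_type, intent, **overrides):
    """Build a contract whose response shape matches the kind of work."""
    shape = SHAPE_FOR_WORK[work_type]
    return RetrievalContract(intent=intent, response_shape=shape, **overrides)


contract = contract_for(
    "financial_analysis",
    intent="compare Q1 and Q2 gross margin",
    filters={"fiscal_year": 2026},
    budget_tokens=4_000,
)
print(contract)
```

Making the shape an explicit field means a mismatch (asking for chunks when the task needs a table) is visible in the request rather than something the model has to compensate for downstream.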

Boring beats autonomous (Data Science Weekly)

The Data Science Weekly thesis (subtitle): "Why boring workflows with one LLM call often outperform fully autonomous systems." For most data-science workflows, deterministic pipelines with a fixed LLM call are more reliable, auditable, and maintainable than multi-step autonomous agents.

AI Tools AI Future
Nate B Jones

Intent engineering sits above context engineering

Nate B Jones argues the differentiator between agents that work and agents that embarrass you isn't context engineering (what they know) but intent engineering: encoding organizational goals, values, trade-off hierarchies, and decision boundaries into the infrastructure that agents act against[12]Nate B Jones — I Built 2 AI Agents. Klarna is the cautionary tale: 2.3M customer conversations resolved in month one, optimized for resolution speed rather than satisfaction. They had to rehire human agents.

Context engineering tells agents what to know. Intent engineering tells agents what to want.

Context engineering loads agents with project files, conventions, and constraints so simple prompts do complex work. Intent engineering sits above this — it constrains which outcomes the agent optimizes for. You can have perfect context and terrible intent alignment, but good intent alignment is impossible without good context (the agent needs information to act on intent).

AI Future
Google DeepMind

DeepMind reimagines the mouse pointer with Gemini behind it

DeepMind researcher Adrian demoed an experimental pointer with Gemini embedded: it interprets deictic language ("this", "that", "here") plus voice and visual context, fusing the cursor position, the underlying data layer, and your speech into a prompt on the fly[13]Google DeepMind — Reimagining the mouse pointer.

The demo shows head-tracking-driven pointing used to compose multimodal prompts — take the style from one image, the content from a restaurant menu, and generate a new image. The pointer knows what's behind the element it hovers (the data layer), not just the visual element, which lets it act semantically.

I imagine a new type of operating system, AI showing me content I might find useful, me pointing back at the content, sharing attention, and sharing the canvas like if I was working with another person.
Podcast AI Tools
AI Engineer

Merve Noyan at AI Engineer: agents that train models on Hugging Face

Hugging Face's Merve Noyan walked through the open-agent stack on HF Hub: ~3M models, agentic-model filters, an inference-providers service with a tool-use column, a new traces dataset type that hosts Codex/Claude Code/Pi traces and lets you train on your own work, and HF skills that let coding agents train models, run jobs, and end-to-end OCR 30,000 papers via natural-language prompts[14]AI Engineer — Merve Noyan.

Noyan opens ~00:15 by distinguishing open-weight (non-commercial) from open-source (MIT/Apache, e.g. DeepSeek) from "everything open" releases where harness and code ship too. She uses the recent Claude regression incident as motivation ~01:16: closed models silently regress; open weights let you quantize, fine-tune, or go on-device with privacy guarantees. On the Artificial Analysis Index, green (open) models are catching black (closed). GLM 5.1 is "crushing it"; she uses it in her own setup.

On the Hub ~02:17: the agentic-models filter splits VLMs (computer-use agents that work over screenshots and know where to click) from plain LLMs. Trend: top labs ship vision day-zero (Gemma 4 omni, Qwen 3.5, Kimi K2.5), and you can serve them locally with a one-liner using vLLM, llama.cpp, or llama-server.

New surfaces: benchmark datasets ranking open models on SWE-bench Pro, Humanity's Last Exam, AIME (GLM 5.1 currently tops SWE-bench) ~04:19; inference providers (Groq, Cerebras) with a tool-use column ~05:21; traces dataset repo type ~08:25 — hosts Codex/Claude Code/Pi traces with a dataset viewer, lets you train on your own runs. Local coding agent recommendations: Pi (easy), llama-agent (binary, just pass an HF Hub ID), and her favorite — Hermes Agent.

Podcast AI Tools
AI Engineer

Anant Dole & Asbjorn Steinskog at AI Engineer: making LLMs explain chess

Play Magnus (Magnus Carlsen's company) ships an AI Game Review that annotates moves "brilliant"/"blunder," draws threat arrows, and explains why each move was strong or weak[15]AI Engineer — Building a Chess Coach. The trick: LLMs hallucinate at chess, so Play Magnus separates analysis from explanation — Stockfish + tactical detectors + Maia (a human-rating-aware neural engine) feed structured context to the LLM, whose only job is translate-to-English.

Why LLMs are bad at chess

The talk traces Claude Shannon's 1949 Type A/Type B distinction → Deep Blue → AlphaZero. LLMs themselves hallucinate moves; the speakers played a clip of Grok losing badly in a Kaggle LLM chess tournament that Magnus Carlsen commentated from Play Magnus's Oslo office.

The Stockfish + detectors + Maia + LLM pipeline

Stockfish identifies best moves. A detector bank extracts forks, pins, skewers, doubled pawns, structural themes, threats, and plans. Maia (University of Toronto) predicts what a human of a given rating would actually play — so commentary can say not just "this is best" but "this is hard to find at your level." The LLM only converts structured info to prose, which keeps hallucinations bounded.
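The separation of analysis from explanation can be sketched as a prompt builder that receives only precomputed facts. Everything below is hypothetical: `MoveAnalysis`, the centipawn thresholds, and the sample data are invented for illustration, and no engine is actually called. In the real pipeline these fields would come from Stockfish, the detector bank, and Maia.

```python
from dataclasses import dataclass, field


@dataclass
class MoveAnalysis:
    played: str                 # move the user played, in SAN
    best: str                   # engine's preferred move
    eval_loss_cp: int           # centipawns lost versus the best move
    themes: list = field(default_factory=list)  # detector output
    human_find_rate: float = 0.0  # share of similar-rated players finding `best`


def classify(analysis):
    """Label a move from engine numbers alone; thresholds are illustrative."""
    if analysis.eval_loss_cp >= 200:
        return "blunder"
    if analysis.eval_loss_cp >= 50:
        return "inaccuracy"
    return "good"


def build_prompt(analysis):
    """Pack precomputed facts into a prompt so the LLM only has to phrase them."""
    find_rate = f"{analysis.human_find_rate:.0%}"
    facts = [
        f"Played: {analysis.played}",
        f"Best: {analysis.best}",
        f"Label: {classify(analysis)}",
        f"Themes: {', '.join(analysis.themes) or 'none'}",
        f"Players at this rating find the best move {find_rate} of the time",
    ]
    return (
        "Explain this move to the player using ONLY the facts below. "
        "Do not suggest moves that are not listed.\n" + "\n".join(facts)
    )


analysis = MoveAnalysis("Qxb7", "Nf3", 310, ["hanging queen", "pin"], 0.62)
print(build_prompt(analysis))
```

Because the label and the candidate moves are decided before the LLM sees anything, a hallucinated move would contradict the prompt rather than slip through as analysis.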

Closing the loop with Claude Code as autonomous QA

When a user taps "bad commentary," the report posts to Slack and is pushed into a running Claude Code session via Channel (a research-preview MCP feature). A custom "commentary triage" skill investigates, modifies prompts, adds/changes detectors, regenerates the commentary, and verifies its own work before opening a PR.

Podcast AI Future
AI Engineer

Hugo Santos & Madison Faulkner at AI Engineer: CI/CD is dead

Madison Faulkner (NEA partner, ex-Meta AI) and Hugo Santos (CEO of Namespace, ex-Google) argue agentic software is fragmenting from monolithic LLM workflows into microservice-style agent swarms — and the build/test/deploy lifecycle CI/CD was designed for can't survive the transition[16]AI Engineer — CI/CD Is Dead. Today's pipelines assume 1–2 PRs per developer per week; agents generate N PRs across N repos with thousands of short-lived branches.

Opening framing ~01:07: software has shifted from monolithic agents using an LLM as one engine to microservice-style agent architectures. The dev lifecycle now spans traditional CI/CD, AI IDEs, and autonomous agentic engineering — with DevOps in the middle expected to innovate hardest in the next year.

Why it breaks ~02:07: human-scale CI was built for one diff at a time with slow human review. Agents stress it with N parallel PRs across N repos. Verification still takes the same time unless you stack review bots — and you end up with thousands of branches pulling the codebase apart until merging is impossible. GitHub commit volume and lines-added data already show the spike ~04:08.

The proposed escape hatch ~04:08: treat the cache as the orchestration layer. Overlay acceleration on top of existing GitHub Actions, route work to the right hardware via hardware/software co-design, shape ingress, and stop treating build/test/deploy as a serial path. CI/CD doesn't die in name; it dies in shape.

Podcast
The Pragmatic Engineer

Gergely Orosz interviews Anders Hejlsberg: Turbo Pascal → C# → TypeScript

Anders Hejlsberg walks Gergely Orosz through 40+ years of language design: Turbo Pascal in a 12K ROM, the IDE concept from day one, Delphi competing with Visual Basic, the Sun-vs-Microsoft Java lawsuit that birthed C#/.NET, async/await as a compiler-rewritten state machine, and the path to TypeScript[17]Pragmatic Engineer — Hejlsberg interview.

From HP 2100 to Turbo Pascal

Late 1970s Copenhagen high school: HP 2100 with 32K ferrite core memory, paper-tape bootloader. Fortran, ALGOL, slow BASIC. Hejlsberg co-founded Copenhagen's first computer store, wrote a Pascal compiler that fit in 12K ROM you could swap in for the Microsoft BASIC ROM, evolved it to full CP/M-80 Pascal — Borland licensed it via royalty and shipped Turbo Pascal in 1983.

Why Turbo Pascal and Delphi mattered

Named after the Audi Quattro/Turbo era: fast and interactive. "10x better at a tenth of the price" of $500 compilers: it sold for $49.95, with manuals worth the money on their own (suppressing piracy). IDE concept from day one: compiler, editor, debugger, and runtime library as a single cycle. Delphi extended this to Windows GUI/client-server, competing with Visual Basic by adding a real compiler, classes, and the VCL. Skype was built in Delphi and ran in production until roughly a year ago.

How C# and .NET were born

Hejlsberg joined Microsoft in 1996 to architect Visual J++. The Sun lawsuit killed J++ as a platform bet. The gap between Visual Basic (easy but slow) and C++/MFC (powerful but hard) led to C# + .NET designed in parallel: managed/bytecode, GC, exceptions, unified self-describing object system, properties/methods/events as first-class component primitives, standardized language. Six-to-seven language designers met three times a week for two hours; Hejlsberg wrote the spec while a separate C++ team implemented the compiler. Self-hosting came much later via Project Roslyn, which finally unified the command-line compiler and the IDE language service.

async/await and function coloring

async/await is compiler-rewritten code: the await keyword marks yield points; the compiler emits a switch-based state-machine continuation. The "function coloring" critique (async functions can only be called by async functions) is real but Hejlsberg argues the alternative (implicit suspension everywhere) is worse for reasoning.
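To make the rewrite concrete, here is a hand-written state machine for a two-await function. It is a Python analogy, not what the C# compiler emits: the `Future` class, `AddTwoStateMachine`, and the state numbering are all invented for illustration. The shape is the same idea, though: locals become fields, each await becomes a numbered resume point, and a dispatch on `state` picks up where execution left off.

```python
class Future:
    """Minimal awaitable: holds a result and runs callbacks once it is set."""

    def __init__(self):
        self.done = False
        self.result = None
        self._callbacks = []

    def on_complete(self, callback):
        if self.done:
            callback()
        else:
            self._callbacks.append(callback)

    def set_result(self, value):
        self.done, self.result = True, value
        callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:
            callback()


class AddTwoStateMachine:
    """Hand-written equivalent of:

        async def add_two(first, second):
            x = await first
            y = await second
            return x + y
    """

    def __init__(self, first, second):
        self.first, self.second = first, second
        self.state = 0          # which await point to resume from
        self.x = None           # locals become fields so they survive suspension
        self.output = Future()

    def move_next(self):
        if self.state == 0:     # start: suspend until `first` completes
            self.state = 1
            self.first.on_complete(self.move_next)
        elif self.state == 1:   # resumed after the first await
            self.x = self.first.result
            self.state = 2
            self.second.on_complete(self.move_next)
        elif self.state == 2:   # resumed after the second await
            self.state = -1
            self.output.set_result(self.x + self.second.result)


a, b = Future(), Future()
machine = AddTwoStateMachine(a, b)
machine.move_next()      # runs up to the first await, then returns
a.set_result(2)          # resumes the machine; it suspends on `b`
b.set_result(3)          # resumes again; the machine finishes
print(machine.output.result)
```

The coloring problem is visible here too: a caller can only get the sum by registering on `machine.output`, so it has to become callback-driven (that is, async) itself.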

Podcast
Y Combinator

Paul Graham at YC Stockholm: go to the Valley, then come back

Paul Graham's two-pronged thesis live in Stockholm: ambitious founders should spend a stint in Silicon Valley for the same reason painters went to Paris in 1870 — and Stockholm could plausibly become Europe's Valley because no city currently holds that title[18]YC — Paul Graham, Live from Stockholm.

Why ambitious people move to the global center

Paris for painting in 1870, Göttingen for math in 1900, Hollywood for movies in 1950. The reasoning, Graham says, "doesn't even know the dotted line [national border] is there." Talent expands in two dimensions: better people, more of them, clustered densely enough to feel intoxicating.

The outsized value of serendipitous meetings

Unplanned meetings appear disproportionately valuable in biographies of great work. Possible reasons: more of them; planned meetings are too conservative and lop off outliers; unplanned conversations select themselves (you decide in the first few sentences whether to continue). Big centers produce more collisions.

Why things move faster in the Valley

Better people are more confident and decisive. Valley investors decide faster than European ones — not just from confidence but from competition: the more right an investor is, the less time they have before someone else moves. Graham cites Yuri Sagalov investing in Max on first meeting as the characteristic story. Despite high valuations and rushed decisions, Valley investors empirically outperform European ones.

Going abroad raises status at home

Outside the Valley, local investors implicitly assume local startups are second-rate — a universal "no prophet in their own country" rule, not Sweden-specific. Going abroad and coming back resets your status at home.

Podcast AI Models
Sequoia Capital Sequoia (clip)

Sequoia interviews Mikey Shulman (Suno): throw away music theory

Suno's Mikey Shulman tells Sequoia his core technical bet was to give the model no musical priors — no 12 Western tones, no instrument categories. The audio model treats everything as raw float32 waveform sampled 48,000x/sec, which is what makes genuinely novel results possible[19]Sequoia — Mikey Shulman, Suno. He rejects the "AI Spotify" framing — music is uniquely social and tied to identity, and the future is active creation, not passive consumption.

From physics PhD to music AI

Harvard physics PhD (quantum computing, solid-state spins) → Kensho (met Harrison Chase, very early Discord user) → basement jam sessions that became Suno. The team originally believed good music generation was orders of magnitude out of reach and started by trying to make sense of audio first.

The modeling breakthrough — no musical priors

You won't get the next Skrillex using Suno if you tell the model what tones and instruments exist.

Telling a model there are 12 Western tones or a fixed set of instruments permanently caps the output. Suno models raw sound as continuous float32 at 48kHz. Breakthroughs in efficient audio compression let them sidestep the compute requirements. Prompts flow through LLMs that draft lyrics and expand style cues, which the audio model turns into sound[20]Sequoia clip — Suno's modeling breakthrough.
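To give a sense of why raw audio is expensive to model, the sketch below builds one second of float32 samples at 48 kHz and computes the uncompressed size of a longer clip. The sine tone is only a stand-in for real audio, and the mono default and three-minute length are assumptions; nothing here reflects Suno's actual representation or compression.

```python
import math
from array import array

SAMPLE_RATE_HZ = 48_000   # samples per second, as stated above
BYTES_PER_SAMPLE = 4      # one float32


def sine_wave(freq_hz, seconds):
    """Mono waveform as raw float32 samples in the range [-1.0, 1.0]."""
    n = int(SAMPLE_RATE_HZ * seconds)
    return array(
        "f", (math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ) for i in range(n))
    )


def raw_size_bytes(seconds, channels=1):
    """Uncompressed size of float32 audio at 48 kHz."""
    return int(SAMPLE_RATE_HZ * seconds) * BYTES_PER_SAMPLE * channels


one_second = sine_wave(440.0, 1.0)
print(len(one_second))            # number of samples in one second
print(raw_size_bytes(180))        # bytes for a three-minute mono track
```

A three-minute mono track is 8.64 million samples, which is the sequence length a model would face without a compression step in front of it.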

Music isn't a pure scaling problem

Music models stay deliberately small — there are no benchmarks with right answers, scale is a less efficient lever than for language, and smaller models serve UX (faster generation). Progress comes mostly from research plus enormous amounts of human preference data. Sycophancy isn't a concern the way it is for text, so preference signal can be used more aggressively via RL. V5/V6/V7 step changes are nonlinear and don't strongly correlate with measured preference deltas.

Music as active entertainment

"AI Spotify" is the wrong framing — music is social, tied to identity. Everyone has opinions about music in a way they don't about film or literature. The future is active creation tooling, not better passive recommendation.

Podcast
Dwarkesh Patel

Dwarkesh interviews David Reich: why selection hasn't eliminated schizophrenia

Harvard geneticist David Reich tells Dwarkesh Patel that ancient DNA shows strong, ongoing natural selection against schizophrenia and bipolar risk alleles over the past 10,000 years — yet the conditions persist. His proposed resolution: the milder, subclinical end of the same trait spectrum (imagination, neuroticism, vision-seeing) was advantageous in shamanistic and religious communities[21]Dwarkesh Patel — David Reich.

Reich opens ~00:00 with the empirical claim from ancient DNA: "very, very strong natural selection" against schizophrenia/bipolar risk alleles over the last 10,000 years, measurable in the modern human genome. Dwarkesh presses on the apparent contradiction — if selection is this strong, why do we still see the full spectrum from mental illness to creative genius?

Reich's answer is balancing selection: the alleles producing clinical illness in modern contexts produced socially-valued traits in ancestral or alternative cultural environments. Subclinical versions of the same loading show up as anxiety, imagination, neuroticism — adaptive in niches that prize visions and unconventional thinking. He notes ~01:00 that this valuation isn't purely historical — religious communities today still esteem people who report communicating with God.

Note: the available transcript was a short clip (~1.4KB); the full long-form interview likely covers more ground than represented here.

Podcast Industry
Lenny's Podcast

Lenny's Podcast: mission protection is always too early until it's too late

A short Lenny's Podcast clip argues that mission-protective provisions (founder voting structures, board controls) get treated as premature by every stakeholder at every stage — lawyers at incorporation, VCs after fundraising, CFOs before IPO — until they're impossible to put in place[22]Lenny's Podcast — Mission protection can't wait. Only 20% of founders are still CEO three years post-IPO.

The recurring pushback pattern: at incorporation, lawyers say "get PMF first — success is your leverage." After raising, VCs say "we believe in you, bundle it with the IPO." By IPO, the CFO says "oh, you were serious? Too late." Each stage defers until the leverage to enforce it has evaporated. The speaker says he's personally seen the pattern hundreds of times.

It is always too early until it's too late.
Podcast Productivity
Every

Every interviews Noah Brier: Claude Code as an Obsidian second brain

Noah Brier runs Claude Code directly on top of his Obsidian vault (~1500 markdown files, git-synced) and uses it as a thinking partner first, code tool a distant second[23]Every — Claude Code Can Be Your Second Brain. His biggest fight with the model: every LLM wants to immediately produce an artifact even when you only want to think. His prompt fix is blunt: "Take this literally. Do not create outlines, drafts, or any versions of talks/writing. Only gather and organize the requested materials."

Why Obsidian works for this

Brier abandoned Evernote for Obsidian specifically because Obsidian is a folder of markdown files ~02:01, which makes it git-syncable and Claude Code-readable. He organizes with PARA ~11:07. He starts Claude Code at the vault root (not in a project subfolder) so it can search across all notes. He also added a package.json at the vault root to back custom slash commands with code ~18:13.

Thinking mode vs writing mode

Brier's biggest complaint about all current models ~15:11: they jump to artifact production even when you only want a thinking partner. He keeps front-matter instructions stored in note headers and a dedicated "thinking partner" sub-agent in Claude Code. For a current conference talk project ("Transformers are eating the world"), the project folder has subfolders for chats/, daily-progress/, research/, plus a conclusions note — full chat transcripts get pulled in via Obsidian Web Clipper, PDFs into research/, and Claude writes up what he learned each day to push the talk along ~14:11.

Probably my number one Claude Code use is using it as a tool to interact with my notes.
Podcast Developer Tools
Matt Williams

Matt & Ryan chat: M5 Max specs, Qwen 3.6 40B, OMLX, and keeping agents off your work laptop

Matt Williams and Ryan riff on what hardware actually serves serious local-AI work right now: M5 Max / 128GB / 4TB / 16-inch is the new floor, OWC and CalDigit are the only docks that survive dual monitor + 10GbE, Qwen 3.6 40B is unusually good at English creative writing, and OMLX finally delivers on MLX's speed promise. They also debate whether to give agents access to bash/rm/find vs an allow-listed shell[24]Matt Williams — Matt and Ryan chat on May 12, 2026.

  • Hardware — Ryan is replacing his M1 Max with M5 Max / 128GB / 4TB / 16-inch / nano-texture. Matt is on an M3 Ultra Studio from 2023 and wishes he had more local storage (1TB too tight for VM-heavy workloads). Beyond Ultra: cloud GPU rental or an Ampere server.
  • Docks/cables — Long tangent on Thunderbolt cables coming loose vs micro-USB's friction ramps. OWC's screw-in dock retainers solve it. CalDigit TS4/TS5 Plus and OWC are the only docks that handle dual monitor + 10GbE without overheating; Dell and StarTech dismissed.
  • Work vs personal laptops — Ryan keeps them separate specifically because AI agents (he calls out OpenClaw as eye-opening) are powerful enough that he doesn't want them on his work laptop or near personal data.
  • Qwen 3.6 40B — ~44GB VRAM, ~60 tok/s, unusually good at English creative writing where prior Qwen drifted into other languages.
  • OMLX vs MLX — Finally delivers on MLX's speed promise for larger models; recommended for Apple Silicon inference.
  • Apple Intelligence in Xcode — Feels three years old, no web research. Pointing Xcode at Ollama Cloud works decently. Ryan bought an iPhone 12 Mini as a dev target, then discovered iOS disables JIT compiling, blocking the emulator path.
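
The allow-listed shell from the bash/rm/find debate can be sketched in a few lines. This is a minimal illustration under stated assumptions, not something shown in the chat: the ALLOWED set and the parse_allowed helper are hypothetical names, and checking only the program name is the simplest possible policy.

```python
import shlex

# Hypothetical allow-list: read-only commands an agent may run.
ALLOWED = frozenset({"ls", "cat", "grep"})


def parse_allowed(command_line: str, allowed: frozenset = ALLOWED) -> list:
    """Split a command line into argv and reject programs not on the allow-list.

    No shell ever interprets the string, so ';', '&&' and '|' have no
    special meaning; they simply end up inside ordinary arguments.
    """
    argv = shlex.split(command_line)
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in allowed:
        raise PermissionError(f"command not on allow-list: {argv[0]}")
    return argv
```

The returned list is meant to be handed to subprocess.run(argv) without shell=True. A name-only check would not be enough for a tool like find, which can delete files or run other programs through its own flags (-delete, -exec), so a stricter policy would also have to inspect arguments.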
Hot Take Developer Tools
Theo - t3.gg

Stop letting your agents write Markdown — Theo on Thoric's HTML thesis

Theo reacts to Thoric's (Claude Code team) "Unreasonable Effectiveness of HTML" plus Karpathy's supporting post: Markdown has become a restrictive output format for agents, while HTML supports tables, CSS, SVG, JS interactivity, spatial layouts, and images, all of which agents are now ready to use[25]Theo — Stop letting your agents write Markdown. Theo agrees on the interactivity and engagement points but argues that much of HTML's appeal is a novelty premium.

Thoric's thesis

Markdown is hard to read past 100 lines, lacks color/diagrams/interactivity ~03:01. Use cases Thoric demonstrates ~06:01: (1) exploration — fan out three implementation options side-by-side; (2) implementation plans with embedded mockups/diagrams; (3) code review with rendered diffs and inline annotations; (4) reports synthesizing Slack/git/Jira into one HTML page; (5) throwaway editors — one-off custom UIs with import/export buttons.
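
As a rough illustration of use case (1), here is a minimal sketch of a helper an agent could use to write a side-by-side comparison page instead of a Markdown list. The function name render_options, the page layout, and the CSS are all assumptions made for this example; the video does not show code.

```python
from html import escape


def render_options(title: str, options: dict) -> str:
    """Return a standalone HTML page with one column per implementation option."""
    columns = "\n".join(
        f"<section><h2>{escape(name)}</h2><p>{escape(body)}</p></section>"
        for name, body in options.items()
    )
    # Plain (non f-string) literal so the CSS braces need no escaping.
    style = (
        "<style>main{display:grid;"
        "grid-template-columns:repeat(auto-fit,minmax(16rem,1fr));gap:1rem}"
        "section{border:1px solid #ccc;padding:1rem}</style>"
    )
    return (
        "<!doctype html>\n"
        f"<html><head><meta charset='utf-8'><title>{escape(title)}</title>\n"
        f"{style}</head>\n"
        f"<body><h1>{escape(title)}</h1><main>\n{columns}\n</main></body></html>"
    )
```

Writing the returned string to a .html file and opening it in a browser gives the three options as equal-width cards, which is the kind of spatial layout Markdown cannot express.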

Karpathy's framing

~31:17 Vision is the 10-lane superhighway into the human brain. Progression: raw text → Markdown → HTML (current default) → interactive neural video. Audio is the preferred input to AI; rich visual output is preferred from AI. Hot tip: just ask your LLM to structure responses as HTML.

Theo's skepticism

Much of HTML's current readability advantage is a novelty premium ~11:06. He agrees that the "70%+ throwaway code mindset" (write custom one-off tools freely because they're cheap) is correct and under-internalized. Net: agree on interactivity and engagement, but don't go all-in.

Developer Tools Hot Take
Simon Willison

Simon Willison: Datasette blog launches, a CSP allow-list experiment, and "11 agents" is meaningless

Three Simon Willison posts in one day — Datasette gets an official blog (built with OpenAI Codex desktop, with a Markdown session-transcript export Willison "always wanted")[26]Simon Willison — Welcome to the Datasette blog; a CSP allow-list experiment that intercepts blocked fetches in a sandboxed iframe and prompts the user to authorize the domain[27]Simon Willison — CSP Allow-list Experiment; and a Boris Mann quote: "'11 AI agents' is meaningless as a phrase. If I said 'I have 11 spreadsheets' or 'I have 11 browser tabs' to do my work, it means about the same thing."[28]Simon Willison — Quoting Boris Mann

CSP allow-list — dynamic domain auth in a sandboxed iframe

The experiment runs apps inside a CSP-protected sandboxed iframe with default-src 'none' and script-src 'unsafe-inline', then implements a custom fetch() interceptor that catches policy violations. The error gets bubbled up to the parent window, which prompts the user to approve the domain, and the page refreshes with the new allow-list. Hosted at tools.simonwillison.net/csp-allow; built with GPT-5.5 xhigh in the Codex desktop app.
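
The policy side of that loop can be sketched as follows. This is a minimal illustration, not Willison's code: the helper names are hypothetical, and using connect-src to carry the approved origins is an assumption (it is the CSP directive that governs fetch(), but the post summary does not say which directive the experiment rewrites).

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Reduce a URL to scheme://host[:port], the form CSP source lists accept."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}"


def build_csp(allowed_origins: list) -> str:
    """Build a policy that blocks everything except inline scripts and
    fetches to explicitly approved origins."""
    directives = ["default-src 'none'", "script-src 'unsafe-inline'"]
    if allowed_origins:
        # Assumption: approved domains are expressed through connect-src.
        directives.append("connect-src " + " ".join(sorted(set(allowed_origins))))
    return "; ".join(directives)


def approve(allowed_origins: list, blocked_url: str) -> list:
    """Return a new allow-list that includes the origin of the blocked request."""
    return sorted(set(allowed_origins) | {origin_of(blocked_url)})
```

In this sketch the parent window would call approve() once the user accepts the prompt, then reload the iframe with the policy returned by build_csp().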

Boris Mann on agent counting

The throughline across both quotes today: agent counts are vanity. What matters is what they accomplish, not how many of them exist. Tagged under agent-definitions in Simon's archive.

Developer Tools
Better Stack marimo

Local-LLM plumbing: Llama-Swap + marimo's OpenRouter image gallery

Two practical local-AI tools dropped today. Llama-Swap is a Go-based proxy that exposes a single OpenAI-compatible endpoint and routes by model field to the right backend (llama.cpp, vLLM, Tabby), auto-starts on first request, and auto-stops idle models to free VRAM[29]Better Stack — Llama-Swap. marimo shows off OpenRouter's new image-model support with a notebook that sends a rough sketch to multiple image models in parallel and lets you composite winners back into the next iteration[30]marimo — OpenRouter image gallery.

Llama-Swap is YAML-configured, supports OpenAI and Anthropic API shapes, and is server-first: no model gallery, no GUI. Aimed at power users running Cursor, Continue, custom agents, or home-lab servers who want precise control over flags, GPU layers, and context sizes — at the cost of a more complex setup than Ollama or LM Studio.
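
The routing behavior described above (pick a backend from the request's model field, start it on first use, stop it after an idle period) can be modeled in a few lines. This is a toy Python model for illustration only: Llama-Swap itself is written in Go, and the Backend and ModelRouter classes, the ttl_seconds field, and the injected clock are hypothetical names, not its API or config schema.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Backend:
    name: str
    ttl_seconds: float        # hypothetical idle timeout before shutdown
    running: bool = False
    last_used: float = 0.0


@dataclass
class ModelRouter:
    backends: dict
    clock: Callable[[], float] = time.monotonic

    def route(self, request: dict) -> Backend:
        """Choose the backend named by the request's "model" field,
        starting it if this is the first request for that model."""
        model = request.get("model")
        if model not in self.backends:
            raise KeyError(f"unknown model: {model!r}")
        backend = self.backends[model]
        if not backend.running:
            backend.running = True  # stand-in for launching the server process
        backend.last_used = self.clock()
        return backend

    def stop_idle(self) -> list:
        """Stop every running backend that has been idle for at least its TTL."""
        now = self.clock()
        stopped = []
        for backend in self.backends.values():
            if backend.running and now - backend.last_used >= backend.ttl_seconds:
                backend.running = False  # stand-in for killing the process
                stopped.append(backend.name)
        return stopped
```

A real proxy would also forward the HTTP request to the chosen backend and wait for it to become healthy; the sketch keeps only the bookkeeping that frees VRAM.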

Tools: Llama-Swap, llama.cpp, vLLM, Tabby API, marimo, OpenRouter, Cursor, Continue, Ollama, LM Studio.
Industry Hot Take
Better Stack Low Level

Two security wake-ups: $10K Apple Pay transit exploit + Palo Alto firewall RCE

Two unrelated but pointed security stories. A man-in-the-middle attack can drain up to $10,000 from an iPhone without unlocking it by abusing Apple Pay's Express Transit mode and Visa's lack of cryptographic signing — Apple and Visa are blaming each other and have not patched[31]Better Stack — Apple and Visa Don't Want To Patch This. Separately, CVE-2026-0300 in PAN-OS captive portal is being actively exploited by a likely Chinese state-sponsored actor (CLSGA1132, Volt/Salt Typhoon family) to gain initial access via the firewall itself[32]Low Level — PAN-OS CVE-2026-0300.

Apple Pay Express Transit exploit

Three bit-flips via a Proxmark device: (1) fake a transit terminal to trigger Face-ID-skip, (2) disguise a large charge as a small transit tap, (3) forge user verification. Works only on Visa: Mastercard requires asymmetric RSA signature checks that block the spoof. Mitigation: disable the Express Transit card in Apple Pay settings.

PAN-OS CVE-2026-0300

Stack-based buffer overflow in PAN-OS captive portal, likely in a custom Nginx module handling the X-Visitor-Name header. Post-exploitation: ptrace injection for stealth, AD enumeration using service account creds stored on the firewall, lateral movement via open-source tools (Earthworm, reverse SOCKS5) to evade signature-based detection. Single-digit known victims so far. Palo Alto is considered relatively secure vs Fortinet/Ivanti — but the firewall being the foothold turns the defender's perimeter into the attacker's launchpad.

Developer Tools
Better Stack LearnThatStack

Engineering deep-dives: time.gov atomic consensus + fonts are programs

Two satisfying systems explainers. time.gov isn't backed by one clock — it's a weighted ensemble of dozens of atomic clocks (cesium fountains like NIST-F2, hydrogen masers) averaged daily into a "paper clock" more stable than any single device. Even general relativity is corrected for — NIST Boulder sits higher than the Naval Observatory in DC, so clocks tick faster and must be mathematically steered[33]Better Stack — How time.gov works. Font files are not shape databases — TrueType bytecode runs on a VM inside your OS every time text renders, with hinting programs that snap stems to whole pixels at small sizes[34]LearnThatStack — Your Font File Is Secretly a Program.

time.gov, briefly

  • NIST-F2 cesium fountain: lasers cool and launch the atoms, and microwaves tuned to the cesium transition at exactly 9,192,631,770 Hz realize the definition of 1 second.
  • Weighted ensemble of hydrogen masers + cesium beam tubes = software-defined "paper clock" more stable than any hardware clock.
  • time.gov's website measures client network latency via burst HTTP requests and subtracts it from the server timestamp — a browser-based NTP approximation.
  • NIST Boulder (higher elevation) ticks faster than DC because it sits higher in Earth's gravitational field (gravitational time dilation); the difference is corrected to nanosecond sync.
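
The browser-side latency correction in the third bullet follows the same idea as an NTP offset estimate. The sketch below is an approximation under stated assumptions, not time.gov's actual code: the function and parameter names are hypothetical, the network path is assumed symmetric (one-way delay is half the round trip), and the median over a short burst is one plausible way to damp jitter.

```python
import statistics
from typing import Callable


def estimate_offset(
    fetch_server_time: Callable[[], float],
    local_clock: Callable[[], float],
    samples: int = 5,
) -> float:
    """Estimate (server time - local time) in seconds from a burst of requests."""
    offsets = []
    for _ in range(samples):
        sent = local_clock()
        server_time = fetch_server_time()   # timestamp taken on the server
        received = local_clock()
        one_way = (received - sent) / 2     # assumes a symmetric path
        # The server stamped its reply one_way seconds before we received it.
        offsets.append(server_time - (received - one_way))
    return statistics.median(offsets)
```

A positive result means the local clock is behind the server. In a real page, fetch_server_time would wrap an HTTP request and local_clock a high-resolution timer.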

Fonts are programs

  • Glyphs are Bezier outlines (not pixels) — that's why they scale infinitely.
  • The cmap table maps Unicode code points to glyphs.
  • The em-box (not the letter height) is what CSS font-size controls, which is why two fonts set to 16px can look like different sizes.
  • Rendering: shaping (HarfBuzz/CoreText, ligatures, script substitutions) → rasterization → anti-aliasing.
  • Hinting bytecode runs on every OS's font VM (FreeType, DirectWrite, CoreText).
  • Variable fonts interpolate between control point sets along design axes — one file replaces many.
  • Practical: use CSS metric override descriptors (size-adjust, ascent-override) to prevent CLS during font swap; subset fonts to only used glyphs (a full CJK font is tens of MB, English subset is 1–2 orders smaller).
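
The variable-font bullet can be made concrete with a two-master sketch. This is a deliberate simplification: real variable fonts store per-point deltas relative to a default master and can combine several axes, whereas the example below interpolates linearly between just two outlines. The function names and the 400/700 weight range are illustrative assumptions.

```python
def normalize(value: float, low: float, high: float) -> float:
    """Map an axis value (e.g. weight 400..700) onto the range 0..1."""
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the axis range {low}..{high}")
    return (value - low) / (high - low)


def interpolate_outline(light: list, bold: list, t: float) -> list:
    """Blend two compatible outlines point by point; t=0 is light, t=1 is bold."""
    if len(light) != len(bold):
        raise ValueError("masters must have the same number of control points")
    return [
        (lx + t * (bx - lx), ly + t * (by - ly))
        for (lx, ly), (bx, by) in zip(light, bold)
    ]
```

For example, interpolate_outline(light, bold, normalize(550, 400, 700)) yields the outline halfway between the two masters, which is how one file can stand in for many static weights.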
Industry
Tech Brew Sherwood Snacks Sequoia

Industry shorts: Altman testifies, Hims abandons compounded GLP-1s, Bezos vs Dorsey on focus

Altman trial: Sam Altman testified that Musk would only accept OpenAI's for-profit conversion if he kept total control, including a proposal to fold OpenAI into Tesla and to "pass control to his children"[35]Tech Brew — Altman claims Musk wanted total control. Hims pivots: Hims & Hers posted its first quarterly loss in 5+ years ($33M Q1) after abandoning compounded GLP-1s and pivoting to distribute Novo Nordisk and Eli Lilly branded treatments[36]Sherwood Snacks — Hims' strategic pivot. Bezos vs Dorsey: when Jack Dorsey pitched him Steve Jobs' "say no to everything" focus doctrine, Bezos rejected it on the spot: "I like to do everything. My team has to talk me out of stuff"[38]Sequoia clip — Bezos vs Dorsey on focus.

Altman vs Musk trial

Cross-examination by Musk's attorney Steven Molo cited testimony from former OpenAI execs Mira Murati and Tasha McCauley questioning Altman's trustworthiness. Molo also pressed Altman on his 2023 Senate testimony where he claimed no equity stake in OpenAI — Altman subsequently acknowledged an indirect stake through Y Combinator. The trial has triggered a House Oversight investigation and Republican AG calls for an SEC review, with the IPO looming.

Hims unwinds GLP-1 distribution

The $33M Q1 loss reflects write-downs on compounded GLP-1 supply-chain inventory now at risk of obsolescence. Novo Nordisk sued for patent infringement after Hims copied its Wegovy pill (launched Jan 2026), then dropped the suit in exchange for Hims becoming a distribution channel. Going forward, Hims is focusing on hormone treatments, lab blood tests, and international expansion; CEO Andrew Dudum also teased a proprietary wearable and peptide sales pending FDA clearance.

Bezos disagrees with Steve Jobs

A 2011 board-recruitment meeting: Jack Dorsey pitched Bezos on Steve Jobs' editor-in-chief / "say no to everything" model. Bezos laughed it off.

I like to do everything. My team has to talk me out of stuff. There's lots of ways to be successful. There's many ways to climb the mountain.

Industry Productivity AI Models
Acquired DeepLearningAI Real Python

Shorts: AI sycophancy, three bad bosses, and the Ford-Ferrari deal that wasn't

Why AI keeps lying to you: DeepLearningAI on sycophancy — models are RLHF-trained to flatter, fix it with neutral prompting and factual context[39]DeepLearningAI — Why AI keeps lying to you. Three bad bosses: Real Python on the Artist (cares about craft, not people), the Dictator (one-on-ones as monologues), and the Knife (literally pulled out a knife mid-meeting) — a manager tells you where you are; a leader tells you where you're going[40]Real Python — 3 Bad Bosses. Ford-Ferrari: Enzo blew up the deal when he realized Ford-controlled budget meant Ford-controlled racing — and Ford had no racing capability. Henry Ford II responded by vowing to beat Ferrari at Le Mans[37]Acquired — Ford tried to buy Ferrari.

Manager's job is to tell you where you are. The leader's job is to tell you where you are going.

On the Ford clip: the draft agreement split Ferrari into two entities — Ford Ferrari (90% Ford-owned) for road cars, Ferrari Ford (90% Ferrari-owned) for racing. Enzo realized Ford-controlled budget = Ford-controlled racing decisions, and Ford had no racing program of its own. He killed the deal at the last moment. An enraged Henry Ford II launched Ford's racing program; Enzo reportedly viewed the Americans as naive throughout.

Sources

  1. YouTube Anthropic Just Dethroned OpenAI. Here's What Happens Next. — Nate Herk | AI Automation, May 13
  2. YouTube Anthropic's "dedicated monthly credit" is actually a huge cut — Matt Pocock, May 13
  3. Blog Introducing Claude for Small Business — Anthropic News, May 13
  4. YouTube Towards AI That Can Actually Interact — The AI Daily Brief, May 13
  5. Blog Building a safe, effective sandbox to enable Codex on Windows — OpenAI Blog, May 13
  6. Blog Our response to the TanStack npm supply chain attack — OpenAI Blog, May 13
  7. YouTube Build Hour: GPT-Realtime-2 — OpenAI, May 13
  8. YouTube NVIDIA New AI Is An Efficiency Monster — Two Minute Papers, May 13
  9. YouTube Hermes Agent 3.0 (Crazy Upgrades): HERMES Agent is TOO GOOD NOW! — AICodeKing, May 13
  10. YouTube Pinecone Just Demoted Vector Search. Here's the Knowledge Layer. — AI News & Strategy Daily | Nate B Jones, May 13
  11. Newsletter AI Agents for Data Scientists: Automations vs Agents — Data Science Weekly, May 13
  12. YouTube I Built 2 AI Agents. One Had This. Total Game Changer — AI News & Strategy Daily | Nate B Jones, May 13
  13. YouTube Reimagining a 50-year-old interface (the mouse pointer) with AI — Google DeepMind, May 13
  14. YouTube Your Agent Can Now Train Models — Merve Noyan, Hugging Face — AI Engineer, May 13
  15. YouTube Building a Chess Coach — Anant Dole and Asbjorn Steinskog, Take Take Take — AI Engineer, May 13
  16. YouTube CI/CD Is Dead, Agents Need Continuous Compute and Computers — Hugo Santos and Madison Faulkner — AI Engineer, May 13
  17. YouTube TypeScript, C# and Turbo Pascal with Anders Hejlsberg — The Pragmatic Engineer, May 13
  18. YouTube Paul Graham, Founder of Y Combinator, Live from Stockholm — Y Combinator, May 13
  19. YouTube Suno's Mikey Shulman: Everyone Can Make Music Now — Sequoia Capital, May 13
  20. YouTube "Let's throw away everything we know about music" Suno founder Mikey Schulman — Sequoia Capital, May 13
  21. YouTube Why Hasn't Evolution Eliminated Schizophrenia? - David Reich — Dwarkesh Patel, May 13
  22. YouTube Mission protection can't wait — Lenny's Podcast, May 13
  23. YouTube Claude Code Can Be Your Second Brain — Every, May 13
  24. YouTube Matt and Ryan have a chat on May 12, 2026 — Matt Williams, May 13
  25. YouTube Stop letting your agents write Markdown. — Theo - t3.gg, May 13
  26. Blog Welcome to the Datasette blog — Simon Willison's Weblog, May 13
  27. Blog CSP Allow-list Experiment — Simon Willison's Weblog, May 13
  28. Blog Quoting Boris Mann — Simon Willison's Weblog, May 13
  29. YouTube Llama-Swap: This Fixes The Most Annoying Local LLM Problem — Better Stack, May 13
  30. YouTube OpenRouter made a new toy! — marimo, May 13
  31. YouTube Apple and Visa Don't Want To Patch This Security Exploit — Better Stack, May 13
  32. YouTube This affects so many companies.. (PAN-OS CVE-2026-0300) — Low Level, May 13
  33. YouTube How time.gov Maintains a 100% Uptime Clock with Atomic Consensus — Better Stack, May 13
  34. YouTube Your Font File Is Secretly a Program — LearnThatStack, May 13
  35. Newsletter Altman claims Musk wanted "total control" — Tech Brew, May 13
  36. Newsletter Hims' "strategic pivot" — Sherwood Snacks, May 13
  37. YouTube Ford tried to buy Ferrari and Enzo blew up the deal at the finish line — Acquired, May 13
  38. YouTube Jack Dorsey told Bezos great CEOs say no to everything. Bezos disagreed. — Sequoia Capital, May 13
  39. YouTube Why AI keeps lying to you — DeepLearningAI, May 13
  40. YouTube The Artist, the Dictator, and the Knife: 3 Bad Bosses — Real Python, May 13