GPT-5.5 ships, Copilot's $14K-of-usage party ends

May 5, 2026

24 topics · 40 sources

AI Models AI Tools
OpenAI OpenAI System Card

GPT-5.5 Instant ships as default ChatGPT model

OpenAI rolled GPT-5.5 Instant out as the default ChatGPT model and as chat-latest in the API, claiming a 52.5% reduction in hallucinations on high-stakes prompts vs GPT-5.3 Instant.[2]OpenAI — GPT-5.5 Instant The model becomes the first Instant-tier release flagged "High Capability" in both Cybersecurity and Bio/Chem under OpenAI's Preparedness Framework.[3]OpenAI — GPT-5.5 Instant System Card

The release pairs a capability bump with a personalization push: ChatGPT can now use memory sources and connected Gmail to tailor answers, with Instant getting noticeably better at remembering preferences and matching tone.[2]OpenAI — GPT-5.5 Instant The "smarter" claim is anchored in factuality work — health-related HealthBench scores rose between +1.8 and +5.5 points across subsets — but the system card is candid about regressions: small but statistically significant drops in gore (0.703) and sexual-content (0.806) filter pass rates relative to GPT-5.3 Instant, which OpenAI says it addresses with system-level mitigations rather than further fine-tuning.[3]OpenAI — GPT-5.5 Instant System Card

The Preparedness flag matters because it's the first time an Instant-tier (i.e., always-on, non-reasoning) model has crossed High Capability in two domains simultaneously. OpenAI is now shipping that capability to every free user by default.[3]OpenAI — GPT-5.5 Instant System Card

Tools: ChatGPT, GPT-5.5 Instant, GPT-5.3 Instant, HealthBench, OpenAI Preparedness Framework
Developer Tools Industry
OpenAI Engineering

OpenAI open-sources MRC, its frontier-training fabric

OpenAI is publishing MRC (Multi-plane RDMA Compute fabric), the GPU networking protocol it uses to train at 131,000+ GPU scale, and contributing it through the Open Compute Project alongside AMD, Broadcom, Intel, Microsoft, and Nvidia.[4]OpenAI — Supercomputer networking The protocol collapses the typical 3–4-tier switch hierarchy down to two tiers using adaptive packet spraying and SRv6 static source routing.

MRC is what runs Stargate, the OCI-hosted training clusters, and Microsoft's Fairwater deployment. The technical pitch: at 131,000+ GPUs the bisection-bandwidth math breaks down with conventional Clos topologies, so OpenAI built a multi-plane fabric with hardware-assisted load spreading that keeps utilization high even under hotspot-heavy collective ops.[4]OpenAI — Supercomputer networking
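
To make the load-spreading idea concrete, here is a toy sketch comparing per-flow hashing with per-packet spraying across parallel planes. It is not MRC's actual algorithm, which the summary does not spell out; the plane count, the flow sizes, and the "pick the least-loaded plane" rule are all illustrative assumptions.

```python
def flow_hash_placement(flows, num_planes):
    """Pin every packet of a flow to one plane (ECMP-style hashing)."""
    load = {p: 0 for p in range(num_planes)}
    for flow_id, packets in flows:
        load[flow_id % num_planes] += packets  # modulo stands in for a real hash
    return load


def spray_placement(flows, num_planes):
    """Send each packet to the currently least-loaded plane (toy adaptive spraying)."""
    load = {p: 0 for p in range(num_planes)}
    for _flow_id, packets in flows:
        for _ in range(packets):
            target = min(load, key=lambda p: (load[p], p))
            load[target] += 1
    return load


# Hypothetical hotspot-heavy traffic: two large flows that hash onto the same plane.
flows = [(0, 900), (4, 900), (1, 50), (2, 50), (3, 100)]
hashed = flow_hash_placement(flows, num_planes=4)
sprayed = spray_placement(flows, num_planes=4)
print("flow hashing:", hashed)
print("spraying:    ", sprayed)
```

With per-flow hashing the two large flows pile onto one plane, while spraying keeps the planes within one packet of each other. Real fabrics also have to handle packet reordering, which this sketch ignores.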

The strategic move is the open-sourcing. OpenAI is positioning itself as the reference implementation for hyperscaler-grade AI fabrics — co-publishing with the silicon vendors who actually build the gear, rather than locking the design behind partner NDAs. That maps cleanly to OpenAI's broader pattern this year of publishing its harness, orchestration, and now networking primitives to seed an ecosystem rather than monetize the layer directly.[4]OpenAI — Supercomputer networking

Tools: MRC, Stargate, OCI, Microsoft Fairwater, Open Compute Project, SRv6
Industry AI Future
OpenAI

ChatGPT Ads gets a self-serve manager and CPC bidding

OpenAI opened ChatGPT Ads to a self-serve Ads Manager (US beta), added CPC bidding alongside the original CPM model, and shipped a Conversions API + measurement pixel so advertisers can attribute end-to-end.[5]OpenAI — New ways to buy ChatGPT ads Launch partners include Dentsu, Omnicom, Publicis, WPP, plus the usual ad-tech (Adobe, Criteo, Kargo, Pacvue, StackAdapt).

OpenAI is publishing three principles around the platform: answer independence (ad placements never alter the substantive answer), conversation privacy (advertisers don't see chat content), and user control.[5]OpenAI — New ways to buy ChatGPT ads These are likely the lines OpenAI will be held to as ad load scales — the principles are easy to assert at low load and harder to keep at the spend levels CPC implies.

The shift to CPC matters more than the Ads Manager UI: it tells advertisers ChatGPT Ads is willing to be measured against direct-response benchmarks (clicks, conversions) instead of just brand-impression metrics, which is what serious budget allocation requires.[5]OpenAI — New ways to buy ChatGPT ads

Tools: ChatGPT Ads Manager, Conversions API, ChatGPT pixel, Dentsu, Omnicom, Publicis, WPP, Adobe, Criteo, Kargo, Pacvue, StackAdapt
AI Tools Industry
Anthropic

Anthropic ships Claude agents for financial services

Anthropic launched 10 pre-built Claude Agent templates targeting financial services (5 research/client-facing, 5 finance/ops), available in a new marketplace on every paid plan.[1]Anthropic — Agents for financial services Claude Opus 4.7 is cited at 64.37% on the Vals AI Finance Agent benchmark, with named adopters including Citadel, BNY, FIS, Carlyle, Mizuho, Travelers, Hg, and Walleye Capital — where Anthropic claims 100% adoption across the 400-person hedge fund.

Two adjacent shipments make this stickier: Microsoft 365 add-ins (Excel, PowerPoint, Word) are now generally available with Outlook coming, and the data-connector list expanded by eight (Dun & Bradstreet, Fiscal AI, FMP, Guidepoint, IBISWorld, SS&C IntraLinks, Third Bridge, Verisk), plus a Moody's MCP app covering 600M+ companies.[1]Anthropic — Agents for financial services The play is direct: instead of selling Claude as a horizontal LLM, Anthropic is shipping a vertically integrated stack for the buyer who values domain coverage (Moody's, Verisk) and Office-native distribution.

The 100% Walleye number is the marquee customer claim and is the one to watch for replication — full-fund adoption is hard to buy and harder to sustain.[1]Anthropic — Agents for financial services

Tools: Claude Agents marketplace, Claude Opus 4.7, Microsoft 365 add-ins, Vals AI Finance Agent benchmark, Moody's MCP, Dun & Bradstreet, Fiscal AI, FMP, Guidepoint, IBISWorld, SS&C IntraLinks, Third Bridge, Verisk
Developer Tools AI Models
Google — File Search Google — Gemma 4 MTP

Google ships: File Search goes multimodal, Gemma 4 gets MTP drafters

Gemini API File Search now indexes and retrieves across text, images, PDFs and scientific imagery via Gemini Embedding 2, with custom KV metadata filtering at query time and page-level citations; early testers report reclaiming 50%+ of their context window vs full-document loading.[6]Google — Gemini API File Search multimodal Separately, Google released multi-token prediction (MTP) drafter models for Gemma 4 (including the 31B variant), achieving up to 3× faster inference with no quality degradation.[7]Google — Accelerating Gemma 4 with multi-token prediction drafters

Gemini API File Search — multimodal RAG

Three changes ship at once: multimodal indexing (text + images + PDFs + scientific imagery), per-document KV metadata you can filter on at query time (so you can scope a search to a customer / region / status), and page-level citations so you can verify where an answer came from. The 50%+ context-window reclamation isn't a quality claim, it's a cost claim: precise retrieval lets you avoid stuffing whole documents.[6]Google — Gemini API File Search multimodal
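
The filter-then-retrieve pattern can be sketched in a few lines. This is a self-contained toy, not the Gemini API: the `Chunk` type, the `search` function, the metadata keys, and the sample contracts are all hypothetical, and keyword overlap stands in for embedding similarity.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc: str
    page: int
    text: str
    metadata: dict  # hypothetical key-value tags, e.g. {"customer": "acme"}


def search(chunks, query, metadata_filter, top_k=2):
    """Scope by metadata first, then rank the survivors by keyword overlap."""
    terms = set(query.lower().split())
    scoped = [c for c in chunks
              if all(c.metadata.get(k) == v for k, v in metadata_filter.items())]
    scored = [(len(terms & set(c.text.lower().split())), c) for c in scoped]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    # Return (document, page, text) so each hit carries a page-level citation.
    return [(c.doc, c.page, c.text) for _score, c in ranked[:top_k]]


corpus = [
    Chunk("contract_acme.pdf", 3, "termination requires 30 days notice",
          {"customer": "acme", "status": "active"}),
    Chunk("contract_acme.pdf", 7, "late payment fee is two percent",
          {"customer": "acme", "status": "active"}),
    Chunk("contract_globex.pdf", 2, "termination requires 90 days notice",
          {"customer": "globex", "status": "active"}),
]

hits = search(corpus, "termination notice", {"customer": "acme"})
for doc, page, text in hits:
    print(f"{doc} p.{page}: {text}")
```

Only the matching page goes into the prompt, which is where the context-window saving comes from: the other customer's contract and the irrelevant page are never loaded.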

Gemma 4 MTP drafters

The drafter models ride speculative decoding: they share the target model's KV cache and reuse activations on the edge variants (E2B, E4B). Google reports ~2.2× speedup on Apple Silicon at batch 4–8 and similar gains on Nvidia A100. The drafters are Apache 2.0, compatible with vLLM, SGLang, MLX, Ollama, Hugging Face Transformers, and LiteRT-LM — i.e., usable in every serving stack without re-licensing.[7]Google — Accelerating Gemma 4
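
As a rough illustration of why speculative decoding can be faster without changing the output, here is a toy greedy version. The `target_next` and `draft_next` functions are hypothetical stand-ins for the large model and the drafter, the verification loop is simulated sequentially rather than as one batched pass, and the KV-cache sharing described above is not modeled.

```python
def target_next(context):
    """Hypothetical 'large' model: deterministic next token from the context."""
    return (sum(context) * 7 + len(context)) % 10


def draft_next(context):
    """Hypothetical cheap drafter: matches the target except at every 4th position."""
    guess = target_next(context)
    return guess if len(context) % 4 else (guess + 1) % 10


def greedy_decode(prompt, n_tokens):
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(target_next(out))
    return out[len(prompt):]


def speculative_decode(prompt, n_tokens, k=4):
    out = list(prompt)
    target_passes = 0
    while len(out) - len(prompt) < n_tokens:
        # 1. The drafter proposes k tokens autoregressively.
        draft, ctx = [], list(out)
        for _ in range(k):
            tok = draft_next(ctx)
            draft.append(tok)
            ctx.append(tok)
        # 2. The target verifies the draft (a single batched pass in a real system).
        target_passes += 1
        ctx = list(out)
        for tok in draft:
            expected = target_next(ctx)
            ctx.append(expected)  # always keep the target's own token
            if tok != expected:
                break             # first mismatch: drop the rest of the draft
        out = ctx
    return out[len(prompt):][:n_tokens], target_passes


prompt = [1, 2, 3]
tokens, passes = speculative_decode(prompt, n_tokens=12)
print(tokens == greedy_decode(prompt, 12), passes)
```

Every emitted token is the target's own choice, so the result matches plain greedy decoding; the saving is that one verification pass can confirm several drafted tokens at once.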

Tools: Gemini API File Search, Gemini Embedding 2, Gemma 4 (E2B/E4B/31B), MTP drafters, vLLM, SGLang, MLX, Ollama, Hugging Face Transformers, LiteRT-LM
Podcast
Milken Institute

Jensen Huang at Milken Global Conference 2026: Leading in the Age of AI

Jensen Huang's central claim: agentic AI requires ~1,000× more compute than generative AI per task, multiplied by ~100× more users, which is why Nvidia's order book is "several trillion dollars."[17]Milken Institute — Jensen Huang He directly counters Geoff Hinton's 20–30% extinction figure, opposes export-control denial of H200s to China ("100% of you need AI; none of you should have a nuke"), and explains Nvidia's investment posture: anchor model where $1 of Nvidia attracts ~$9 from others, and OpenAI is "likely the last" because AI-natives turned strongly gross-margin-positive in the last 3–6 months.

~00:00 Opening framing: "AI reinvented the computer industry" — Jensen's five-layer cake (energy, chips, infrastructure, models, applications) and why this is a re-industrialization, not a software cycle.

~04:00 The compute math: agentic AI ~1,000× more compute per task, ~100× more users than generative AI. This is the core bull case for the entire CapEx wave.[17]Milken Institute — Jensen Huang

~12:00 Hardware: Vera Rubin racks at $4–5M each, 3 tons, 1.5M parts, 7 distinct chip types, football-field-scale assemblies. The cost-per-rack is the real barrier to entry, not the chips.

~19:00 Re-industrialization: half a trillion in orders to bring suppliers to the US; "several trillion dollars" of opportunity ahead.

~25:00 Investment posture: anchor-model checks for CoreWeave, Nebius, NScale (~$1 Nvidia attracts ~$9 from others). OpenAI investment "likely the last" because AI-native model providers became strongly gross-margin-positive in the last 3–6 months.

~31:00 China and export controls: backs H200 exports, opposes denial. "100% of you need AI. None of you should have a nuke."[17]Milken Institute — Jensen Huang

~36:00 Cyber: defense will look like "swarms of white blood cells" — open-source agents patrolling for asymmetric defense rather than parity with attackers.

~40:00 Doomer rebuttal: explicitly takes on Hinton's 20–30% extinction-risk figure; uses radiology-jobs counterexample to push back on labor-collapse claims.

~44:00 Defense posture: "CEOs aren't elected, shouldn't be the ones gating US military's use of AI."

~48:00 Closer: AI compresses research months into days. "100x your ambition."

"100% of you need AI. None of you should have a nuke."
"100x your ambition."
Tools: Vera Rubin, Blackwell, H200, Stargate, CoreWeave, Nebius, NScale
Podcast
Latent Space

Latent Space: a black-holes physicist on "vibe physics" with GPT-5

Alex Lupsasca (Vanderbilt prof, OpenAI fellow, 2024 New Horizons Breakthrough Prize) describes two real physics results — a single-minus gluon tree-amplitudes paper that GPT-5.2 Pro conjectured and an internal OpenAI model proved over 12 hours, and a graviton sequel three weeks later done end-to-end by public GPT-5.2 Pro.[16]Latent Space — Vibe Physics The thesis: human research time is now mostly verification, not derivation, and "taste" is the remaining edge.

~00:00 Intro and Lupsasca's bona fides — what makes this guest different from the usual "AI for science" guest is that he's already published with the model as a co-author, not just used it as a tool.

~10:00 The first paper: GPT-5.2 Pro conjectured the closed-form gluon tree amplitudes, internal OpenAI model spent 12 hours and produced the proof. Lupsasca describes the workflow as "vibe physics" — driving the model from a handful of physics intuitions and verifying each derivation.

~25:00 The graviton sequel three weeks later, done end-to-end by the public GPT-5.2 Pro — the gap between "internal OpenAI model" and "what's in the API" is closing fast.

~40:00 Day-to-day craft: parallel "scout" chats down different solution paths, using GPT to reduce time-stuck-confused. The bottleneck is no longer the math, it's deciding which math to do.

~60:00 Hot takes — arXiv slop, raising the bar to "community-decades-stuck" problems, why static papers are obsolete and Lean / formal-methods verification is making a comeback as the model's bottleneck.

~75:00 Taste and question selection as the remaining human edge — and why this changes who gets to be a competitive theorist.

~85:00 Closing: where Lupsasca thinks "vibe physics" hits a wall (data-anchored fields with physical experiments) and where it doesn't (closed-form math).

Tools: GPT-5.2 Pro, internal OpenAI model, Lean, Mathematica, arXiv
Podcast
AI Engineer

Filip Makraduli at AI Engineer: Small-model infrastructure (Superlinked)

Superlinked's Filip Makraduli pitches Sie, an inference engine purpose-built for the specific embedding/encoder small-model traffic that vLLM and TGI weren't designed for.[18]Filip Makraduli at AI Engineer The argument: most production AI traffic is small-model traffic (BERT, Qwen, ColBERT) and the LLM-inference stack treats it as second-class.

~00:00 Why small-model serving is different — not a "smaller LLM," fundamentally a different traffic shape (more tokens-per-second per host, batching that benefits from different scheduling).

~04:00 Sie architecture and what it deliberately omits from vLLM-style designs.

~08:00 KEDA-driven autoscaling on real production traffic with Prometheus metrics — the operational story behind the engine.

~12:00 MTEB benchmark numbers for the supported model families (BERT, Qwen, ColBERT).

~15:00 Multi-tenancy in production — what changed when traffic spiked.

~18:00 Closing call to action and where Sie fits vs Triton/TGI/vLLM.

Tools: Sie, vLLM, TGI, Triton, KEDA, Prometheus, MTEB, BERT, Qwen, ColBERT
Podcast
AI Engineer

Chintan Parikh & Weiyi Wang at AI Engineer: Accelerating AI on Edge (Google DeepMind)

DeepMind's edge team showcases Gemma 4's edge variants (E2B, E4B) running on LiteRT and reports stark numbers: a 13× NPU speedup over CPU on supported devices, ~56 tok/s on iOS, and a 35× advantage vs llama.cpp on mobile in their benchmark.[19]Google DeepMind at AI Engineer

~00:00 Why Google is investing this hard in edge — privacy, latency, and the cost equation when CPM-priced LLM calls hit mobile traffic shape.

~05:00 LiteRT-LM stack overview — what changed from TensorFlow Lite era.

~12:00 NPU acceleration: 13× speedup over CPU, calibration-aware quantization story.

~18:00 iOS demo at ~56 tok/s — the live benchmark moment.

~23:00 35× vs llama.cpp comparison — caveats, model sizes, and what's actually being measured.

~29:00 Multi-token-prediction drafters tie-in (cross-references the Gemma 4 MTP launch from the same day).

~34:00 What's next: the Gemma 4 31B-on-laptop story.

~38:00 Q&A.

Tools: Gemma 4 (E2B, E4B), LiteRT-LM, llama.cpp, MTP drafters
Podcast
AI Engineer

Raj at AI Engineer: Demand-Driven Context (IKEA)

IKEA staff engineer Raj describes a "demand-driven context" methodology: deliberately let agents fail on real tasks, treat each failure as a signal that some piece of tribal knowledge is missing, and pull that knowledge into a structured KB.[20]Raj at AI Engineer — Demand-Driven Context The pull-based approach contrasts with "preload everything you might need" RAG and is meant to scale to large engineering orgs where the "real docs" are in heads, not wikis.

~00:00 Raj introduces himself (staff SWE on IKEA's delivery-services domain) and frames the problem: KBs go stale, tribal knowledge stays in Slack threads.

~05:00 The flip: instead of capturing knowledge proactively, run agents on real tasks and let them fail — failures become a queue of missing-context tickets.

~12:00 The KB shape — what gets indexed, the per-document metadata schema, and how it links back to the failing tool calls.

~23:00 Demo: real IKEA workflow where the agent failed on a delivery-routing question, the failure surfaced an undocumented business rule, and the rule went back into the KB.

~33:00 Versus traditional RAG: why eager retrieval over a giant corpus loses to lazy retrieval over a living, failure-curated set.

~45:00 Operational mechanics: who owns the queue, how tribal-knowledge owners get paged, and why the loop has to close in days, not quarters.

~55:00 Q&A on adversarial cases (what if the agent fails for the wrong reason?).
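
The fail, ticket, fix, retry loop described in the talk can be sketched as follows. This is a minimal illustration, not IKEA's tooling: `KnowledgeBase`, `run_agent`, `resolve_ticket`, and the sample rule text are hypothetical names and data invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    facts: dict = field(default_factory=dict)    # topic -> documented rule
    tickets: list = field(default_factory=list)  # queue of missing-context topics


def run_agent(task_topic, kb):
    """Hypothetical agent: succeeds only if the KB holds the rule it needs."""
    rule = kb.facts.get(task_topic)
    if rule is None:
        if task_topic not in kb.tickets:
            kb.tickets.append(task_topic)  # the failure becomes a ticket
        return None
    return f"answered using: {rule}"


def resolve_ticket(kb, topic, rule_from_owner):
    """A knowledge owner closes the ticket by writing the rule down."""
    kb.tickets.remove(topic)
    kb.facts[topic] = rule_from_owner


kb = KnowledgeBase()
first = run_agent("delivery-routing", kb)    # fails and files a ticket
resolve_ticket(kb, "delivery-routing",
               "hypothetical rule supplied by the routing team")
second = run_agent("delivery-routing", kb)   # retry now succeeds
print(first, kb.tickets, second)
```

The KB only grows in response to observed failures, which is the pull-based contrast with preloading everything up front.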

Tools: demand-driven context, lazy RAG, internal KB tooling, MCP, Slack as context source
Hot Take Developer Tools
Theo - t3.gg

Theo (t3.gg) vs Prime: it's compute scarcity, not subsidy clawback

Theo agrees with ThePrimeagen that the easy-subsidy era is ending, but argues the driver is compute scarcity, not revenue clawback: GitHub paused Copilot signups, which "you don't do because you want to make more money."[35]Theo — Prime is (mostly) right about AI He pushes back on Prime's "Google is fine" framing — Google is the company subsidizing the hardest (AI Overviews on every search, Anti-Gravity bundling Opus 4.5) and clamping down hardest because TPUs can't keep up.

~00:00 Where Theo and Prime fully agree: the easy-subsidy era is over, Cursor's pricing was the canary (July 2024 switch from message-count), Anthropic's painted-door test on Claude Code's $20 tier was a real signal, and Dario's per-model profit framing ("each model is its own company") is correct.

~08:00 The pivot: Prime says it's about clawing back revenue. Theo says it's compute. Pausing Copilot signups is the proof point — you don't pause signups to extract more money from existing customers, you pause when you can't serve more.

~15:00 On Opus 4.6/4.7: Prime claimed Anthropic is losing money on training them. Theo: they're post-training iterations (RLHF / RLVR), which are far cheaper than pre-training. The unit economics aren't what Prime thinks.

~22:00 The cost of intelligence is dropping when measured as cost per task (fewer tokens per task) rather than price per token. Artificial Analysis Intelligence Index numbers cited: $500 for 5.5-low versus $2,850 for 5.4-high at the same task quality, roughly a 5.7× drop.

~29:00 Subscription subsidization math: Anthropic's consumer terms exclude commercial use and Max plans don't carry BAA/HIPAA — i.e., the cheap plans aren't designed to support enterprise workloads, they're loss leaders.

~38:00 GitHub Copilot's multipliers (7.5× on GPT-5.5, 15× on Opus) reflect GPU allocation, not model cost: a UI for compute scarcity, not a margin grab.

~45:00 Cursor Composer 2 is fine-tuned Kimi K2.5 from Moonshot — a Chinese model — which Theo flags as the kind of integration that's only viable when compute is the constraint.

~52:00 Enterprise inference spend: Uber burned its yearly AI budget in 4 months — a real number that suggests the demand side is real, not hype.

~60:00 Prime is right about Anti-Gravity, OpenCode plugin bans, and the broader vendor-lock pattern; Theo just thinks the why is different.

"You don't pause signups because you want to make more money."
Tools: Cursor, Cursor Composer 2, Kimi K2.5 (Moonshot), GitHub Copilot, Anti-Gravity, OpenCode, Claude Code, Claude Max, Opus 4.6, Opus 4.7, GPT-5.5, Artificial Analysis Intelligence Index
Developer Tools Hot Take
AICodeKing

GitHub Copilot's $40 plan ends June 1

AICodeKing breaks down how GitHub Copilot's $39/month Pro Plus plan, with 1,500 "premium requests," can amount to ~$14,000 of compute because one agentic task counts as one request regardless of token volume.[29]AICodeKing — Copilot $40 plan GitHub announced a billing-model overhaul effective June 1, 2026: Pro gets 1,000 AI credits ($10 worth), Pro Plus gets 3,900 credits ($39 worth), priced per-token-per-model. The arbitrage closes.

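~00:05 The pricing math: 1,500 premium requests × ~$10 of agentic-task compute on average comes to roughly $14K–15K. Theo's example: $115 of usage at only 0.8% of the subscription's allotment, which extrapolates to about $14.4K ($115 ÷ 0.008) at full use.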

~03:53 The June 1 change: switch from request-based to AI-credit billing. Pro = 1,000 credits ($10 worth), Pro Plus = 3,900 credits ($39 worth). Priced per token per model. Code completions remain unbilled.

~06:00 The hot take: GitHub priced the plan for chat-style usage and got blindsided by agentic workloads. Hence the paused Pro Plus signups (consistent with Theo's compute-scarcity thesis above) and the upcoming overhaul. Until June 1, this is a ~27-day window.
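
The arithmetic behind the headline number can be checked in a few lines. This sketch assumes "0.8% of subscription" means the share of the monthly allotment consumed, and that the same workload would be billed at its dollar cost under the credit model; both are readings of the summary, not figures confirmed by GitHub.

```python
plan_price = 39.0        # Pro Plus, USD per month (from the summary)
observed_usage = 115.0   # USD of usage in Theo's example
allotment_used = 0.008   # 0.8%, assumed to be the share of the allotment consumed

# Extrapolate the observed usage to 100% of the allotment.
implied_full_value = observed_usage / allotment_used
subsidy_ratio = implied_full_value / plan_price

# After June 1: 3,900 credits are described as "$39 worth".
usd_per_credit = 39.0 / 3900
credits_for_same_usage = implied_full_value / usd_per_credit

print(f"implied compute at full use: ${implied_full_value:,.0f}")
print(f"ratio to plan price: {subsidy_ratio:.0f}x")
print(f"credits for the same usage: {credits_for_same_usage:,.0f}")
```

On these assumptions the old plan could deliver a few hundred times its sticker price in compute, while the new allotment covers only the plan's own dollar value.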

Tools: GitHub Copilot Pro, Copilot Pro Plus, AI Credits
Hot Take
The AI Daily Brief

NLW: Agents make every job a startup — the infinite backlog

NLW's thesis: agents don't save time, they convert the previously-theoretical "infinite backlog" into immediate, parallelizable work — which feels exhilarating but creates an entrepreneur-style burnout.[28]The AI Daily Brief — Agents Make Every Job a Startup The constraint shifts from typing to judgment, planning, coordination, evaluation, and cost; cost ("supply of compute and upstream from that even our supply of energy") will dominate the next 18–24 months.

~00:00 The opening parade of AI Twitter posts (Aaron Levie, Brian Johnson, Sam Altman switching to polyphasic sleep to keep up with Codex): the early GenAI narrative was about saving time, but the agentic era has people working until 3am.

~04:00 The "infinite backlog" as a counter to the lump-of-labor fallacy: leaders' job is to prioritize from an always-larger queue.

~06:00 AI assistants got people from 1× to 2–4×; agents "break the rules of time" by replicating the worker infinitely.

~07:00 Closest existing analogy: entrepreneurship, with its Kierkegaardian "dizziness of freedom."

~10:00 Tang Yan: "the work no longer drains you through typing. It drains you through judgment. More attention, more context switching, more verification, more decisions per hour."

~15:00 The new org chart: agent ops engineers, context librarians, eval engineers, coordination architects, information pipeline owners, experiment portfolio managers, entrepreneur coaches. Aaron Levie at Box is already hiring "agent engineering" FTEs to wire agents into Salesforce/Workday/Box.

"Agents make every job a startup. Not everyone wants to work for a startup, however, and that's going to cause stress and challenge."
"Are you building pacing infrastructure or just rewarding whoever stays up latest?"
Tools: Codex, Claude, OpenClaw, Box, Salesforce, Workday, Paperclip
Hot Take Industry
The AI Daily Brief

NLW: AI doom is going out of style

NLW maps the "vibe shift" away from doomerism — anchored to Ezra Klein's anti-doomer framing built on Jevons paradox and accelerating SWE demand — alongside Anthropic ARR reportedly doubling every six weeks and Sam Altman's messaging pivot from "AGI changes everything" to a more grounded productivity narrative.[27]The AI Daily Brief — Is AI Doom Going Out of Style?

~02:00 Ezra Klein's anti-doomer argument: Jevons paradox applies to knowledge work — better tools create more demand, not less.

~07:00 SWE demand is accelerating, not collapsing. NLW pulls labor data showing the post-2023 layoff narrative isn't borne out by JOLTS.

~12:00 Largest entrepreneurship explosion in history? — formation rates, AI-native startup count.

~15:00 Anthropic ARR doubling every six weeks (cited number) — the demand side of the compute story.

~20:00 CapEx bubble reframed: the backlog of AI deployments outpaces even hyperscaler spend, so "bubble" framing requires acknowledging unmet demand first.

~25:00 Sam Altman's recent messaging pivot — less "AGI is here," more "this is a productivity tool" — and what NLW reads into it.

Tools: Anthropic, Ezra Klein commentary, JOLTS, OpenAI
Hot Take AI Future
Nate B Jones

Nate B Jones: consumer AI's "anticipation gap"

Nate's thesis: AI in 2026 is finally capable, but consumer AI has become "a new inbox" — users have to manage agents, which is the wrong unit of work.[25]Nate B Jones — Consumer AI The frontier question shifts from "can AI answer/act" to "can AI do useful work without pulling me into a new management layer."

~00:00 The anticipation gap: consumer agents demo well but hit a reactive ceiling because consumer AI inherits ChatGPT's query-box behavior — it waits for you. There is no consumer-AI equivalent of "Google ranks for this query without asking" yet.

~05:00 Why coding agents work but consumer doesn't — there's a compiler for code, no compiler for taste.

~12:00 Symphony and the OpenAI enterprise attention bottleneck — open-source orchestration alone doesn't fix the fact that someone still has to triage agent output.

~18:00 Fake proactivity vs real lived proactivity — the test isn't "does the agent message you," it's "does the message change a decision you'd otherwise make."

~23:00 Product roundup of the genuine attempts: Poke, Clickie.so, Cluely, Cowork + Stripe agent wallets, Codex's Chronicle memory feature.

~29:00 The trust ladder — Nate's five-step permissioning framework for agent autonomy.

~34:00 Prosumer bridge: how proactive agents will reach consumers via work first (Slack, Notion, Superhuman), then cross over.

~38:00 Three early warning signs proactive agents are coming: key hires (Peter Steinberger to OpenAI), monthly load-lifting tests, frontier release notes mentioning consumer agentic intent with memory.

~43:00 Specific cited datapoints: Hawaii swimsuit personalization, "lines in China to uninstall OpenClaw," GitHub planning for 10×–30× repo growth, his pushback on the ubiquitous "book a trip" demo.

Tools: Symphony, Poke, Clickie.so, Cluely, Cowork, Stripe agent wallets, Codex Chronicle, OpenAI memory, ChatGPT
AI Tools Productivity
Nate Herk

Nate Herk: Higgsfield + Claude as a creative agency

Nate Herk wires Higgsfield (image/video generation) into Claude via a custom MCP connector and uses Claude Code with markdown knowledge files, the gws CLI for Google Sheets asset tracking, and scheduled routines to run autonomous ad-generation overnight.[24]Nate Herk — Higgsfield + Claude Hot take: "this is the worst AI video will ever be."

~00:00 Setup: connecting Higgsfield to Claude via a custom MCP connector — Claude becomes the orchestration layer, Higgsfield is the model.

~04:00 Higgsfield Marketing Studio + Hypermotion style ads — what the actual output looks like, where it still breaks.

~10:00 Claude Code + Higgsfield CLI as a creative-studio project layout.

~15:00 Domain expertise as markdown — encoding "what works for our brand" as files Claude reads instead of as system prompts.

~20:00 gws CLI Google Sheets asset tracker — a real workflow piece, not a toy.

~25:00 Reference image consistency for product ads: same model + lighting across a campaign.

~30:00 Reverse-engineering skills from winning outputs — when something works, can Claude extract the prompt structure?

~35:00 Routines for scheduled autonomous ad generation — overnight runs with morning approval queues.

~40:00 Hot take: "this is the worst AI video will ever be" — accept the current ceiling, build for next year's.

Tools: Higgsfield, Higgsfield Marketing Studio, Hypermotion, Claude, Claude Code, MCP, Higgsfield CLI, gws CLI, Google Sheets
Developer Tools
Github Awesome

35 Claude Code skills on GitHub

A roundup of 35 trending Claude Code skills, including claude-video (lets Claude watch videos), WRITING.md (enforces human-like prose via a markdown rule file), paper2code (arXiv PDFs → runnable PyTorch), skill-doctor (audits your skills for conflicts), and skills-manager (Tauri desktop app that syncs skills across 27 platforms via symlinks).[30]Github Awesome — 35 Claude Code skills

~00:00 Video, writing, and research tools — claude-video, WRITING.md, paper2code.

~03:01 Skill quality and management — skill-doctor (local audit), skills-manager (Tauri, symlink sync across 27 platforms).

~03:01 Precision coding workflows — Matt Pocock's TDD primitives, Design Council's 11-agent debate system, Code Overhaul for architectural teardowns.

~03:01 AI-native design systems — Oh My Design (67 real-world systems), spec-first Web Design skill, Claude → SwiftUI transpiler.

~06:02 Library-specific skills — Library skills (embeds skills in NPM/Python packages), official Cloudflare skills, DSPy agent skills.

~05:02 Game dev / image generation — Sprite Pipeline, Agent Sprite Forge, OG Image, Shots (app-store screenshots via GPT Image 2).

~07:02 Token efficiency & session continuity — Usage Limit Reducer (visualizes JSONL token spend), Agent Session Resume (handoff checkpoints), 3M Framework (persistent background operator).

~07:02 Voice and writing style — Agent Style (Strunk & White), Aerodani Speak (Rocky from Project Hail Mary voice), Linus Torvalds skills.

~14:04 Multi-model review — God Mode (/work command sending code to Codex + Gemini + open model in parallel), Compose Performance skills.

Tools: claude-video, WRITING.md, paper2code, skill-doctor, skills-manager, Oh My Design, Sprite Pipeline, Agent Sprite Forge, OG Image, Shots, Usage Limit Reducer, Agent Session Resume, 3M Framework, God Mode, Library skills, Cloudflare skills, DSPy
Developer Tools
Simon Willison

Simon Willison releases: datasette-referrer-policy, datasette-llm, llm-echo

Three small but useful Datasette/LLM ecosystem releases in one day: datasette-referrer-policy 0.1 (override the default no-referrer header so OpenStreetMap tiles stop getting blocked, built with Codex + GPT-5.5),[8]datasette-referrer-policy datasette-llm 0.1a7 (per-model defaults — pin a model and set temperature for all enrichment ops),[10]datasette-llm 0.1a7 and llm-echo 0.5a0 with thinking-block support for offline reasoning tests.[11]llm-echo 0.5a0

datasette-referrer-policy 0.1

Datasette's default Referrer-Policy: no-referrer header was breaking OpenStreetMap tile loads (OSM blocks no-referrer requests). The plugin lets operators override per-site. Built with Codex + GPT-5.5.[8]datasette-referrer-policy
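
For readers unfamiliar with how such an override works, here is a minimal sketch of an ASGI wrapper that replaces the Referrer-Policy header on every HTTP response. It is not the plugin's source: `referrer_policy_wrapper`, `demo_app`, and the `strict-origin-when-cross-origin` value are illustrative assumptions, and the demo runs against a stand-in app rather than Datasette.

```python
import asyncio


def referrer_policy_wrapper(app, policy="strict-origin-when-cross-origin"):
    """Wrap an ASGI app so every HTTP response carries the given Referrer-Policy."""
    async def wrapped(scope, receive, send):
        if scope["type"] != "http":
            await app(scope, receive, send)
            return

        async def send_with_policy(message):
            if message["type"] == "http.response.start":
                # Drop any existing Referrer-Policy header, then add ours.
                headers = [(k, v) for k, v in message.get("headers", [])
                           if k.lower() != b"referrer-policy"]
                headers.append((b"referrer-policy", policy.encode("ascii")))
                message = {**message, "headers": headers}
            await send(message)

        await app(scope, receive, send_with_policy)
    return wrapped


async def demo_app(scope, receive, send):
    """Stand-in for an app that sends no-referrer by default."""
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"referrer-policy", b"no-referrer")]})
    await send({"type": "http.response.body", "body": b"ok"})


async def collect(app):
    """Call the app once and capture the ASGI messages it sends."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    await app({"type": "http"}, receive, send)
    return sent


messages = asyncio.run(collect(referrer_policy_wrapper(demo_app)))
print(messages[0]["headers"])
```

Datasette exposes an `asgi_wrapper` plugin hook for this kind of middleware, so a plugin could return a wrapper along these lines.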

datasette-llm 0.1a7

Configurable per-model defaults: pin a model + set temperature=0.5 for all enrichment operations, making Datasette LLM deployments reproducible and cost-predictable.[10]datasette-llm 0.1a7

llm-echo 0.5a0

Adds -o thinking 1 support aligned with LLM 0.32a0's new thinking-block architecture — fully offline CI tests for reasoning workflows via uvx.[11]llm-echo 0.5a0

Tools: Datasette, datasette-referrer-policy, datasette-llm, llm-echo, LLM 0.32a0, Codex, GPT-5.5, uvx, OpenStreetMap
AI Future Hot Take
Simon Willison

Stockholm's AI cafe orders 120 eggs with no stove

Andon Labs deployed an autonomous AI agent named Mona to run a cafe in Stockholm. Mona ordered 120 eggs for a cafe with no stove, 22.5 kg of canned tomatoes for a sandwiches-only menu, and dispatched EMERGENCY-tagged correction emails to suppliers.[9]Simon Willison — Stockholm AI cafe Simon's argument: agents taking real-world outbound actions against non-consenting third parties (suppliers, government offices) need human approval gates as a hard rule, not a setting.

The Andon Labs experiment is a clean reductio ad absurdum of the "let the agent run wild" pattern: a real cafe, real suppliers, real money, and an agent making decisions with no model of the physical kitchen it is supposedly operating. The 120-eggs anecdote isn't just a funny number: a supplier received an order and allocated inventory, and an EMERGENCY-tagged follow-up email had to go out to undo it.[9]Simon Willison — Stockholm AI cafe

Simon's framing is the more durable point: agentic systems should distinguish read-only reasoning from outbound actions that touch other parties. The supplier did not opt in to being part of the experiment; the EMERGENCY email confirms it.[9]Simon Willison — Stockholm AI cafe
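
One way to encode that distinction is to tag each tool as read-only or outbound and refuse outbound calls without an explicit human yes. The sketch below is a minimal illustration under that assumption; `Tool`, `call_tool`, `ApprovalRequired`, and the sample tools are hypothetical names, not anything from the Andon Labs setup.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    run: Callable[[str], str]
    outbound: bool  # True if the call reaches a third party (email, order, filing)


class ApprovalRequired(Exception):
    """Raised when an outbound action is attempted without human sign-off."""


def call_tool(tool, arg, approve=None):
    """Run read-only tools freely; outbound tools need an explicit human yes."""
    if tool.outbound:
        if approve is None or not approve(tool.name, arg):
            raise ApprovalRequired(
                f"{tool.name}({arg!r}) blocked pending human approval")
    return tool.run(arg)


check_stock = Tool("check_stock", lambda item: f"{item}: 0 on hand", outbound=False)
place_order = Tool("place_order", lambda item: f"ordered {item}", outbound=True)

print(call_tool(check_stock, "eggs"))
try:
    call_tool(place_order, "120 eggs")
except ApprovalRequired as blocked:
    print(blocked)
print(call_tool(place_order, "bread", approve=lambda name, arg: True))
```

The gate sits in the only call path and has no bypass flag, which is the "hard rule, not a setting" property: an outbound tool cannot run unless an approver is supplied and says yes.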

Industry
Sherwood Snacks Morning Brew

GameStop's $56B bid for eBay — and the funding gap

Ryan Cohen's GameStop made an unsolicited $125/share offer for eBay (~$55.5–56B total, 46% premium to pre-bid valuation), 50% cash and 50% GameStop stock — but the funding math doesn't add up: ~$9B GameStop cash plus a "highly confident" letter for up to $20B from TD Securities still leaves a $16–20B gap.[14]Sherwood Snacks — GameStop eBay bid[15]Morning Brew — GameStop offers eBay

Cohen had built a 5% economic stake in eBay before the formal offer. Market reaction: eBay +5%, GameStop -5% to -10% on the day of the announcement. Michael Burry exited his full GME position.[14]Sherwood Snacks — GameStop eBay bid

The strategic rationale Cohen pitched: GameStop stores repurposed as eBay live-streaming, authentication, and fulfillment hubs, plus deep cuts to eBay's marketing budget. Morning Brew highlights Cohen's awkward CNBC interview, in which he couldn't explain the funding gap.[15]Morning Brew — GameStop offers eBay

Analyst consensus: deal unlikely to close — eBay's existing turnaround is working, and the financing gap is too wide for a "highly confident" letter to plug without a strategic equity partner.[14]Sherwood Snacks — GameStop eBay bid

Industry
Tech Brew

Trump's AI EO reversal: laissez-faire era may be over

The Trump administration is reportedly drafting an executive order requiring Pentagon-led safety testing of AI models before public release — a near-180 from its earlier rollback of Biden-era guardrails.[13]Tech Brew — AI EO reversal Google, Microsoft, and xAI have already agreed to give government evaluators early access to unreleased models. Anthropic's Mythos model leak is cited as a specific trigger.

The article ties the policy reversal to two things: David Sacks's late-March departure (he was the deregulation advocate) and the rising influence of more security-focused officials. Industry pushback is the predictable "this slows US competitiveness vs China" line.[13]Tech Brew — AI EO reversal

Important caveat: the article notes that discussions are still speculative, and no final EO language exists yet. The voluntary early-access agreements with Google, Microsoft, and xAI are real; the EO is not, at least not yet.[13]Tech Brew — AI EO reversal

AI Models
Sequoia Capital

Greg Brockman: GPT autonomously finished the project overnight

In a Sequoia clip, OpenAI's Greg Brockman puts current models at "~80% of the way to AGI" and recounts an engineer who handed GPT-5.3 a complex systems-engineering spec, went to sleep, and woke up to a working, profiled, iteratively optimized implementation.[23]Sequoia Capital — Greg Brockman

~00:00 Brockman frames the anecdote as the kind of capability that didn't exist 18 months ago — not just "GPT wrote some code," but an autonomous loop that profiled, identified bottlenecks, and iterated. The "overnight" framing matters: it's about wall-clock asynchrony, not raw speed. The implication for staffing math is the same one NLW raises in topic #13 — the constraint moves from typing to judgment.[23]Sequoia Capital — Greg Brockman

Podcast
The Pragmatic Engineer Lenny's Podcast

Judgment & agency: two podcast clips

Two short clips approach the same theme from different angles. On Pragmatic Engineer, Mario & Armin argue that AI agents undermine senior engineering judgment by removing the friction that enforces discipline (juniors can now weaponize AI-generated justifications).[21]Pragmatic Engineer — Mario & Armin Lenny's clip frames agency, "the belief that you are someone who makes things," as the scarce resource in an AI-enabled world, with the skills-gap excuse gone.[22]Lenny's — Agency

Pragmatic Engineer ~00:00 Mario & Armin: "a good engineer says no, a lot." When AI can produce a confident-sounding rationale for any approach, the senior engineer's "no" gets questioned more — and tribal knowledge gets overridden by AI output that sounds right.

Lenny's ~00:00 Agency isn't evenly distributed. The "skills gap" excuse is gone — what separates builders now is internal conviction and the compounding effect of just starting.

Both clips converge on the same point as topic #13 (NLW) and topic #22 (Brockman): the constraint is now judgment / agency / taste, not capability.

Developer Tools Industry AI Tools
Simon Willison Nate B Jones Better Stack Better Stack Github Awesome OpenAI DeepLearning.AI Real Python marimo Acquired Every

Quick hits

Eleven smaller items worth flagging, including Simon's quote on YC's reported $5B+ OpenAI stake, the React Compiler getting rewritten in Rust, and deepclaude rerouting Claude Code's tool loop onto DeepSeek V4 Pro for a claimed 17× cost reduction.

Industry

  • YC's $5B+ OpenAI stake under scrutiny — Simon Willison quotes John Gruber: Y Combinator owns ~0.6% of OpenAI, worth $5B+ at the $852B valuation. Simon flags it as hard to verify publicly but worth scrutinizing given YC's dual role as investor and commentator.[12]Simon Willison — YC's OpenAI stake
  • The X.400 vs SMTP protocol war — Better Stack short on how SMTP won via simplicity despite X.400's technical superiority; the historical version of every "good enough beats perfect" debate happening in AI infra today.[37]Better Stack — X.400 vs SMTP
  • Ferrari's market cap beats Ford + VW + Mercedes combined — Acquired short on luxury scarcity outpunching mass-market auto while selling ~160× fewer cars.[39]Acquired — Ferrari

AI Models / Tools

Developer Tools

  • React Compiler rewritten in Rust — Better Stack flags Meta's mostly-AI-generated rewrite delivering ~10× faster transformation; still experimental.[36]Better Stack — React Compiler
  • deepclaude reroutes Claude Code's tool loop to DeepSeek V4 Pro — Drop-in rerouter, claimed 17× cost reduction. Worth eyeballing if you're paying real money to run Claude Code agentically.[31]Github Awesome — deepclaude
  • marimo: docs export to markdown for LLMs — One-click markdown export, send-to-ChatGPT/Claude/Cursor.[38]marimo — markdown export

Hot Takes / Productivity

  • Spec-driven dev as a security practice — DeepLearning.AI's comedic skit about shipping AI-generated code without specs and leaking credit cards. Punchline argues for spec-first agentic dev.[33]DeepLearning.AI — Spec-Driven Dev
  • Every: how Monologue's landing page was built in Framer — Skeuomorphic dark-mode aesthetic, Paper Shaders for backgrounds (replaces video loops at scale), Rive for stateful interactive animations, footer-as-hero CTA.[40]Every — Monologue Framer

Sources

  1. Blog Agents for financial services — Anthropic, May 5
  2. Blog GPT-5.5 Instant: smarter, clearer, and more personalized — OpenAI, May 5
  3. Blog GPT-5.5 Instant System Card — OpenAI, May 5
  4. Blog Supercomputer networking to accelerate large scale AI training — OpenAI, May 5
  5. Blog New ways to buy ChatGPT ads — OpenAI, May 5
  6. Blog Gemini API File Search is now multimodal: build efficient, verifiable RAG — Google Developers Blog, May 5
  7. Blog Accelerating Gemma 4: faster inference with multi-token prediction drafters — Google Developers Blog, May 5
  8. Blog datasette-referrer-policy 0.1 — Simon Willison's Weblog, May 5
  9. Blog Our AI started a cafe in Stockholm — Simon Willison's Weblog, May 5
  10. Blog datasette-llm 0.1a7 — Simon Willison's Weblog, May 5
  11. Blog llm-echo 0.5a0 — Simon Willison's Weblog, May 5
  12. Blog Quoting John Gruber — Simon Willison's Weblog, May 5
  13. Newsletter The AI laissez-faire era may be over — Tech Brew, May 5
  14. Newsletter GameStop's bold $56 billion eBay bid — Sherwood Snacks, May 5
  15. Newsletter GameStop offers to buy eBay — Morning Brew, May 5
  16. YouTube Top Black Holes Physicist: GPT5 can do Vibe Physics, here's what I found — Latent Space, May 5
  17. YouTube Leading in the Age of AI: A Conversation with NVIDIA CEO Jensen Huang | Global Conference 2026 — Milken Institute, May 5
  18. YouTube The Small Model Infrastructure Nobody Built (So We Did) — Filip Makraduli, Superlinked — AI Engineer, May 5
  19. YouTube Accelerating AI on Edge — Chintan Parikh and Weiyi Wang, Google DeepMind — AI Engineer, May 5
  20. YouTube Demand-Driven Context: A Methodology for Coherent Knowledge Bases Through Agent Failure — AI Engineer, May 5
  21. YouTube Mario & Armin: A good engineer says no, a lot — The Pragmatic Engineer, May 5
  22. YouTube Agency isn't evenly distributed — Lenny's Podcast, May 5
  23. YouTube His engineer went to sleep. AI finished the project. | OpenAI's Greg Brockman — Sequoia Capital, May 5
  24. YouTube Higgsfield Just Turned Claude Into a Creative Agency — Nate Herk, May 5
  25. YouTube Consumer AI Has a Problem Nobody's Naming. — Nate B Jones, May 5
  26. YouTube This Is Why Distilled Models Collapse — Nate B Jones, May 5
  27. YouTube Is AI Doom Going Out of Style? — The AI Daily Brief, May 5
  28. YouTube Why Agents Make Every Job a Startup — The AI Daily Brief, May 5
  29. YouTube Github's Copilot $40 PLAN IS CRAZY ($14K of USAGE!): HOW IS THIS POSSIBLE?! — AICodeKing, May 5
  30. YouTube 35 Claude Code skills on GitHub — Github Awesome, May 5
  31. YouTube deepclaude: a drop-in rerouter that runs Claude Code's entire tool loop on DeepSeek V4 Pro instead — Github Awesome, May 5
  32. YouTube Prep for sales meetings faster with Codex — OpenAI, May 5
  33. YouTube Just deployed to production… and leaked all the credit cards — DeepLearning.AI, May 5
  34. YouTube Build a Reverse Dictionary Powered by LLMs? — Real Python, May 5
  35. YouTube Prime is (mostly) right about AI — Theo - t3.gg, May 5
  36. YouTube React Compiler is now written in Rust. Is TypeScript cooked? — Better Stack, May 5
  37. YouTube How The 80s "Protocol War" Created Modern Email — Better Stack, May 5
  38. YouTube Agents Finally Get It Now — marimo, May 5
  39. YouTube Ferrari has a higher market capitalization than Ford, Mercedes-Benz, and Volkswagen — Acquired, May 5
  40. YouTube How We Designed Monologue's Landing Page With Framer — Every, May 5