OpenAI proved Erdős wrong. Anthropic banned the rest.

May 21, 2026

18 topics · 25 sources

AI Future Industry
The Rundown AI

OpenAI disproves an 80-year Erdős conjecture (and calls it Level 4)

OpenAI says an internal general-purpose reasoning model autonomously disproved Erdős' 1946 unit-distance conjecture — using algebraic number theory — and that the result has been independently verified by Tim Gowers, Noga Alon, and Thomas Bloom.[1] Sam Altman called it "kinda big." OpenAI is framing it as the first AI-driven discovery of genuinely novel mathematics — its working definition of "Level 4" AI: systems that can make original contributions across disciplines rather than just synthesizing existing knowledge.

Erdős' 1946 conjecture concerns the maximum number of pairs of points at exactly unit distance that a set of n points in the plane can have — a stubbornly open problem in combinatorial geometry. The Rundown reports the model's argument leans on algebraic number theory, an unusual route for a problem that historically attracted combinatorial approaches.[1]

The Rundown adds a useful caveat: OpenAI previously overclaimed a GPT-5 result in 2025 that turned out to be a literature finding rather than a true discovery. This one, however, has the names attached — Gowers, Alon, Bloom — so it carries more weight. The strategic angle is the "Level 4" branding: OpenAI is trying to put a stake in the ground for "AI as autonomous researcher," distinct from the model-router and agent-harness conversations of the past month.

"kinda big" — Sam Altman
AI Models Developer Tools
Artificial Analysis

Cursor Composer 2.5: third on the coding index, 10–60× cheaper

Artificial Analysis benchmarked Cursor's new Composer 2.5 model and put it third on its Coding Agent Index with a score of 62 — behind Claude Opus 4.7 (66) and GPT-5.5 (65) — at $0.07/task standard vs. roughly $4.10–$4.82/task for the top two.[2] The "Fast" variant costs $0.44/task and completes work in 6.7 minutes (third fastest), but that is roughly 6× the standard-mode cost.

Composer 2.5 is built on continued training of Moonshot AI's open-weights Kimi K2.5 model, with Cursor reporting that ~85% of total compute came from its own additional training and reinforcement learning.[2] Benchmarks span SWE-Bench-Pro-Hard-AA (+35 points over base K2.5), Terminal-Bench v2, and SWE-Atlas-QnA.

The cost story is the headline. Independent of how the benchmark numbers shake out, paying $0.07 per coding task instead of $4+ is a structural shift — Cursor is effectively buying market share from the frontier-priced incumbents while still landing on the same podium. If the quality holds in real-world use, this is the first big crack in Opus 4.7's pricing moat since launch.
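The "10–60×" headline can be checked against the per-task prices quoted above. This is a minimal sketch of that arithmetic only; the constant and function names are made up for illustration, and the inputs are the figures reported in this item.

```python
# Per-task prices as quoted in the Artificial Analysis write-up (USD).
COMPOSER_STANDARD = 0.07
COMPOSER_FAST = 0.44
FRONTIER_LOW, FRONTIER_HIGH = 4.10, 4.82  # reported range for the top two models


def ratio_range(cheap: float, low: float, high: float) -> tuple[float, float]:
    """Return how many times more expensive the frontier range is than `cheap`."""
    return (low / cheap, high / cheap)


standard_vs_frontier = ratio_range(COMPOSER_STANDARD, FRONTIER_LOW, FRONTIER_HIGH)
fast_vs_frontier = ratio_range(COMPOSER_FAST, FRONTIER_LOW, FRONTIER_HIGH)
fast_vs_standard = COMPOSER_FAST / COMPOSER_STANDARD

print(f"standard: {standard_vs_frontier[0]:.0f}x to {standard_vs_frontier[1]:.0f}x cheaper")
print(f"fast:     {fast_vs_frontier[0]:.0f}x to {fast_vs_frontier[1]:.0f}x cheaper")
print(f"fast vs standard: {fast_vs_standard:.1f}x the cost")
```

On these numbers the standard mode comes out roughly 59× to 69× cheaper than the top two, and the Fast variant roughly 9× to 11× cheaper, which is where a "10–60×" range plausibly comes from.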

AI Models
Artificial Analysis

Cohere ships Command A+ open-weights — first in non-hallucination

Cohere released Command A+ as open weights, a year after the original Command A. It scores 37 on the Artificial Analysis Intelligence Index — the same tier as Claude 4.5 Haiku — but ranks first on AA-Omniscience Non-Hallucination at 86%, and runs at 281 output tokens/sec on Cohere's API.[3]Artificial Analysis — Command A+

Other reported benchmarks: ~11% HLE, ~76% GPQA Diamond, ~25% Terminal-Bench Hard, ~38% SciCode, 63% MMMU-Pro.[3] The headline is the non-hallucination result — Command A+ tops the leaderboard there, which makes it interesting as a retrieval-grounded model for enterprise RAG even when the raw intelligence number is mid-pack.

Artificial Analysis didn't publish parameter count, license terms, or pricing in this write-up — they link out to artificialanalysis.ai/models/command-a-plus for that. Year-over-year, Cohere's open-weights cadence is well behind the open-weights leaders (Kimi, DeepSeek, GLM), but a clear specialization in factuality may be a more defensible position than chasing the frontier.

Industry
Sherwood Snacks Tech Brew

Nvidia: another quarterly beat and the Vera CPU era

Nvidia posted its 15th consecutive top-line beat and 14th straight EPS beat, raised its dividend from $0.01 to $0.25 per quarter, and authorized an $80B buyback expansion.[4] Tech Brew's frame is bigger: with the first Vera CPUs shipping to Anthropic, OpenAI, xAI, and Oracle, Nvidia is now opening a second front against Intel and AMD — Jensen Huang pegs it as a "$200 billion" market.[5]

Earnings

Sherwood doesn't publish exact revenue/EPS but frames the print as broadly strong with well-received Q2 guidance.[4] Analyst attention is on Blackwell and the upcoming Rubin generation, with sales projected to top $1T through 2027. Nvidia entered earnings up ~20% in 2026, the second-best Magnificent 7 performer — but trailing semiconductor peers concentrated in memory, networking, and CPUs. The newsletter flags a recurring pop-then-fade pattern post-earnings.

"I want the business." — Jensen Huang, on Vera CPUs

Vera CPUs and the CPU wars

Tech Brew reports Vera was announced in March and started shipping within two months to Anthropic, OpenAI, xAI, and Oracle.[5] Nvidia projects $20B in CPU sales for 2026 alone. The framing: agentic AI and physical robotics are CPU-intensive workloads, not pure GPU ones, so Nvidia is filling the gap before Intel and AMD can. Both incumbents are responding — Intel is tightening quality, AMD is building rival chips — while Meta and Amazon continue their homegrown alternatives. Tech Brew calls it "the beginning of the CPU wars."

Tools: Nvidia, Vera, Blackwell, Rubin, Groq
AI Tools Developer Tools
Simon Willison's Weblog

Datasette Agent ships: natural-language SQL, plugin ecosystem, live demo

Simon Willison shipped the first alpha of Datasette Agent — an extensible, plugin-based AI assistant that wraps Datasette and translates natural-language questions into SQL. A live demo runs at agent.datasette.io on Gemini 3.1 Flash-Lite; local models like gemma-4-26b-a4b via LM Studio are supported too.[6]Simon Willison — Datasette Agent

Datasette Agent combines Willison's three-year-old LLM Python library with Datasette so users can ask questions against any database — the agent generates SQL, runs it, and shows results.[6] The same day, he shipped three companion releases:[7][8][9]

  • datasette-agent-sprites 0.1a0
  • datasette-agent-charts 0.1a2
  • datasette-agent 0.1a3

Worth watching as a reference implementation: the architecture leans into Willison's existing LLM ecosystem, so plugins compose cleanly with whatever model backend you bring. The "View SQL query" UX pattern is a small but pointed bet on transparency over magic — every chart and table answers "how did you get this?"
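The generate-SQL, run-it, show-the-query pattern described above can be sketched with nothing but the standard library. This is not Datasette Agent's code and uses neither the Datasette nor the LLM library APIs: `fake_model`, `ask`, and `Answer` are hypothetical names, and the model call is replaced by a canned query so the example runs offline.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Answer:
    question: str
    sql: str    # kept so a UI can answer "how did you get this?"
    rows: list


def fake_model(question: str, schema: str) -> str:
    """Stand-in for an LLM call; a real agent would prompt a model with the schema."""
    # Hypothetical canned translation, only valid for the demo question below.
    return "SELECT name, stars FROM repos ORDER BY stars DESC LIMIT 2"


def ask(conn: sqlite3.Connection, question: str) -> Answer:
    # Collect CREATE TABLE statements so the model could see the schema.
    schema = "\n".join(
        row[0] for row in conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    )
    sql = fake_model(question, schema)
    # Guard: only run statements that begin with SELECT.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError(f"refusing to run non-SELECT statement: {sql!r}")
    rows = conn.execute(sql).fetchall()
    return Answer(question=question, sql=sql, rows=rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repos (name TEXT, stars INTEGER)")
conn.executemany(
    "INSERT INTO repos VALUES (?, ?)",
    [("alpha", 120), ("beta", 950), ("gamma", 40)],
)

answer = ask(conn, "Which two repos have the most stars?")
print("SQL:", answer.sql)
print("Rows:", answer.rows)
```

Returning the SQL alongside the rows is the design point: the caller can always display the exact query that produced a result.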

Developer Tools Industry
Google Blog

Google Play at I/O: Play Shorts, Ask Play, Gemini as a distribution surface

Google rolled out a clutch of Play Store changes at I/O 2026 — short-form app previews ("Play Shorts"), a conversational search ("Ask Play"), and a new distribution surface inside the Gemini assistant on Android and web that bypasses the Play Store entirely.[10]Google Blog — Google Play I/O 2026

The post is light on developer APIs and heavy on discovery surfaces. The five highlights:[10]

  • Play Shorts — short-form video previews of app functionality in the Play Store, beyond screenshots.
  • Ask Play — natural-language app search; metadata and feature descriptions become discovery signals rather than just keywords.
  • Gemini App Distribution Surface — apps surface directly inside the Gemini assistant on Android and web, a new distribution channel that sidesteps the storefront.
  • Engage SDK Expansion — Engage-driven recommendations now extend to more Google surfaces.
  • Play Games Sidebar — an in-game overlay for tips, rewards, and social updates without leaving the game.

The pattern: Google is pushing developers toward a multi-surface distribution model with Gemini as the connective layer. For ASO-heavy teams, "Ask Play" matters most — the SEO playbook for the Play Store is about to change.

Industry
Data Science Weekly

Data Science Weekly #652: measurement, transformers, randomization

Issue 652's picks lean toward fundamentals over fashion: a critique of A/B-test randomization errors, an argument that measurement (not modeling) is still the core of data science in the AI era, and a hands-on transformer-from-scratch tutorial.[11]Data Science Weekly — Issue 652

Top picks from the issue:[11]

  • What's going on in computational neuroscience nowadays? — field notes from the Cosyne conference.
  • Is logistic regression regression? — short essay on whether the name fits the algorithm.
  • What Every Experimenter Must Know About Randomization — randomization errors that quietly invalidate A/B tests and clinical research.
  • What data science is actually about in the age of AI — measurement remains the core mission; modeling is downstream.
  • Transformer From Scratch — implement a transformer from first principles.
Industry
The Rundown AI

Rundown bullets: Google Co-Scientist, Claude vs Grok crime sim, Intuit cuts 17%

Beyond the Erdős story, The Rundown's secondary bullets are unusually dense — a Gemini-powered Co-Scientist hit 91% scarring reduction in a Stanford liver-fibrosis study, Emergence's town sim showed wildly different agent behaviors (Claude: 0 crimes; Grok: 200+, all agents dead by day 4), and Intuit announced 17% workforce cuts on an AI pivot.[1]The Rundown AI

  • Google Co-Scientist — Gemini-powered tool using "idea tournaments" for biology hypothesis generation; 91% scarring reduction in a Stanford liver-fibrosis test.
  • Emergence town sim — Claude registered 0 crimes; Grok racked up 200+, with all agents dead by day 4. A more striking alignment-by-behavior result than most paper benchmarks.
  • Claude context audit guide — new training resource for inspecting Claude's context window.
  • Amazon space data centers — Bezos calls them realistic but says the 2–3-year timeline some are pushing is too aggressive given energy and cost constraints.
  • OpenAI Guaranteed Capacity — enterprise compute reservations with 1–3 year terms and tiered pricing.
  • GitHub security incident — malicious VS Code extension affected ~4,000 internal projects; no customer data compromised.
  • Intuit layoffs — 17% workforce cut attributed to an AI-focused strategic pivot.
Podcast
Sequoia Capital Sequoia Capital (clip)

Sequoia × Notion's Ivan Zhao: The Refounder

Notion's Ivan Zhao sits with HubSpot's Dharmesh Shah at Sequoia and walks through twice "refounding" the company — first in Kyoto pre-PMF, then in Cancun after GPT-4. The argument: building with LLMs is "brewing beer, not engineering bridges," so Notion now hires barbell teams (very junior + very senior), killed the CMO org, replaces SaaS planning with weekly product "jazz," and treats the company as a "jazz band, not a marching band."[12]Sequoia × Notion's Ivan Zhao

~00:00 Jazz band, not marching band. Zhao opens with the core operating metaphor: classic SaaS hierarchies are marching bands optimized for predictability; building with LLMs requires the opposite — small, taste-driven units improvising against ambiguous outputs.[12]

~03:02 Five years of pre-PMF despair. Zhao recounts the years before Notion clicked — building, scrapping, rebuilding. The throughline: most founders quit before the second refounding.

~06:03 Building with LLMs is brewing beer. The pivotal frame: deterministic engineering doesn't apply to language models. You tune the yeast, you don't tell it where to go.

"Building classic software is like engineering a bridge… building with language models is like brewing beer — you cannot tell the yeast, 'hey, go toward that flavor profile more.'"

~10:04 Hierarchy, flatter orgs, and barbell hiring. Zhao argues that since LLMs normalize capability, the differentiated inputs are taste and agency — which favor very junior people (high agency, low cost) and very senior people (high taste). Mid-level layers become hard to justify.

"Talent equals capability times taste times agency. Language models normalize capability, so we optimize for the latter."

~15:07 Killing the CMO org; rebuilding sales hiring. Notion dissolved its CMO function and rebuilt marketing as product-led. In the companion clip, Zhao discusses where they stumbled in enterprise sales: their first hire was a "systems thinker" suited to PLG order-taking, not outbound enterprise selling, and PLG masks the absence of real outbound capability because customers already want to buy.[13]

~24:16 Refounding #1: Kyoto and craft culture. Zhao describes leaving SF for Kyoto pre-PMF, absorbing Japanese craft culture, and rebuilding Notion's design philosophy around it.

~33:22 Refounding #2: GPT-4 in Cancun. When GPT-4 dropped, Zhao went to Cancun to think it through and came back having rewritten Notion's AI strategy from the ground up.

"GPT-4 was a full-body religious experience. Like, holy shit — anything you do, if you don't do this, it will be meaningless."

~38:30 Decalcifying via acqui-hired founders. Bringing in founders from acquired startups as a deliberate counter to the calcification that hits mid-stage companies.

~47:33 Personal operating system and AMAs. How Zhao runs his own week — Notion-as-second-brain, regular internal AMAs, transparency rituals.

~52:35 Enterprise sales: stop reinventing the wheel. Classic enterprise sales hasn't changed since the 1990s; modern internal tooling can improve efficiency but the motion itself is well-known. Don't try to innovate everywhere — concentrate the innovation budget on the few things that truly matter.

~56:39 Company as religion, culture as cult. The closing frame: durable companies look more like religions than corporations — long-lived, ritual-driven, with great founders and great heads of sales.

"The Catholic Church is one of the most successful companies of all time — 2,000 years. Great founder in Jesus and a great head of sales in Paul."
"You have to feel the AGI. You can't read about it, you can't watch YouTube — you have to build."
Tools: Notion, GPT-4, GPT-3, Claude, Opus, OpenAI fine-tuning, Slack, GitHub, HubSpot
Podcast Hot Take
Y Combinator

YC: build the self-improving company, not the Roman legion

A YC partner argues the Roman-legion org chart — humans relaying information up and down a hierarchy — is broken by AI. The replacement: recursive self-improving AI loops as the organizational substrate, not a productivity add-on. YC's own internal example: a monitoring agent watches employee queries, detects failures, writes the fix, opens a PR, has another agent review and deploy it — all overnight. YC's demo-day companies now show ~5× revenue per employee vs. 18 months ago.[14]YC — Self-Improving Company

~00:00 The Roman Legion problem. Most companies are organized like Roman legions. Jack Dorsey's framing: this assumption is broken by AI.[14]

~01:00 Beyond co-pilots. "AI as productivity booster" is the old frame. The new frame: companies as recursive self-improving AI loops, not engineers with a bolt-on tool.

~03:02 The 5-layer loop. Sensor (inputs from the world) → policy (rules + permissions) → tool layer (deterministic APIs) → quality gate (evals/human review) → learning mechanism that feeds failures back in. The loop must run with minimal human intervention to compound overnight.
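The five layers can be sketched as a toy loop. This is an illustration of the structure only, not YC's system: `SelfImprovingLoop` and its methods are hypothetical names, the tool layer is a plain dictionary lookup, and the "fix" is supplied by hand where the talk describes an agent writing it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SelfImprovingLoop:
    knowledge: dict[str, str]                      # what the tool layer can answer
    allowed_topics: set[str]                       # policy: permitted query topics
    failures: list[str] = field(default_factory=list)

    def tool(self, query: str) -> Optional[str]:
        """Tool layer: a deterministic lookup standing in for real APIs."""
        return self.knowledge.get(query)

    def quality_gate(self, answer: Optional[str]) -> bool:
        """Quality gate: a trivial eval that rejects missing or empty answers."""
        return bool(answer)

    def handle(self, topic: str, query: str) -> Optional[str]:
        # Sensor: (topic, query) is the input observed from the world.
        if topic not in self.allowed_topics:       # policy check
            return None
        answer = self.tool(query)
        if not self.quality_gate(answer):
            self.failures.append(query)            # record for the learning step
            return None
        return answer

    def learn(self, fixes: dict[str, str]) -> int:
        """Learning mechanism: apply fixes for recorded failures, return count applied."""
        applied = 0
        for query in list(self.failures):
            if query in fixes:
                self.knowledge[query] = fixes[query]
                self.failures.remove(query)
                applied += 1
        return applied


loop = SelfImprovingLoop(knowledge={"office hours": "Tuesdays"}, allowed_topics={"faq"})
first = loop.handle("faq", "demo day date")          # fails: not in knowledge yet
loop.learn({"demo day date": "see batch calendar"})  # the overnight fix, supplied by hand here
second = loop.handle("faq", "demo day date")         # the same query now succeeds
print(first, second)
```

The point of the sketch is the last three lines: a failure is recorded, a fix is fed back in, and the identical query succeeds on the next pass.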

~04:02 Live YC example. A monitoring agent watches all YC employee queries, identifies failures, writes fix code, opens a PR, has an agent review/merge/deploy. The same failing query succeeds the next morning. The speaker calls this the "holy sh** moment."

"For me, that was like the holy [bleep] moment. That's not just AI making you 20 or 30% more valuable. It is the AI going through this loop to figure out how to self-improve." ~05:02

~07:03 Tokens > headcount. YC demo-day companies average ~5× more revenue per employee than 18 months ago. Measure token usage directionally. Middle management is replaced by AI coordination; every role becomes an IC with a single DRI.

"Burn tokens, not headcount. We are seeing companies get to demo day with about 5× more revenue per employee than they did 18 months ago."

~08:03 Make the org legible to AI. Record everything — emails, Slack, DMs, office hours. "If it is not recorded, it does not exist to the AI." YC regenerated its entire 150-page user manual from 2,000 hours of recorded office hours in one weekend; it now self-updates monthly. Business context is the durable asset; software is ephemeral.

Tools: RAG, Codex (one-shotting internal dashboards), agent-driven CI/merge pipelines, monitoring agents
AI Tools Developer Tools Industry
OpenAI OpenAI (plugins) OpenAI (Appshots) Y Combinator

OpenAI's Codex day: Goals, plugin sharing, Appshots, and $2M for every YC company

OpenAI dropped a coordinated salvo of Codex features and a high-profile YC tie-in. Goals (via /goal) supports long-running tasks across app/IDE/CLI with the goal itself serving as both task prompt and completion criteria; Codex plugin sharing ships with a Shared-with-you tab and deep-link share URLs; and Appshots arrives as a new in-Codex artifact.[15][16][17] Separately, OpenAI announced $2M of tokens (via an uncapped SAFE at Series A valuation) to every YC company in the Spring and Summer 2026 batches.[18]

Codex Goals

~00:00 — The new /goal command activates a long-running task mode across the Codex app, IDE, and CLI.[15] The goal itself is both the task prompt and the completion criteria. Codex can help users author goals via plan mode or an interview flow. Running goals support steering messages, non-interrupting side chats, and pause/resume. OpenAI cites a 100-hour single-goal run in the outro — a clear bid for the long-horizon territory that Anthropic's "Claude routines" has been targeting.

Plugin sharing

~00:00 — Outbound sharing to specific teammates or the whole workspace via a modal and share link, a "Shared with you" tab for inbound discovery, and deep-link share URLs for the curated plugin directory (demoed with a Slack plugin).[16]OpenAI — Plugin sharing

Appshots

Codex now produces Appshots — a new artifact for shareable in-Codex snapshots.[17]OpenAI — Appshots

$2M tokens for every YC company

OpenAI is committing $2M in tokens (not cash) to every YC Spring 2026 and Summer 2026 batch company via an uncapped SAFE at Series A valuation.[18] The deal targets founders running agent-heavy "token-maxing" workflows. A special application window closes May 25, 2026, with decisions by June 5. Read in combination with the Anthropic ban story below, the timing is striking: OpenAI is buying ecosystem mind-share at exactly the moment Anthropic is restricting it.

Industry Hot Take
Better Stack

Anthropic banned every third-party Claude tool. OpenAI is tearing the walls down.

Better Stack walks through Anthropic's multi-stage crackdown on third-party Claude harnesses — silent token blocking in January, a February ToS update, April enforcement that included scanning Git history for keywords like "Open Claude" and "Hermes" — and the new "programmatic credits" system that replaces it.[19] Credits are billed at full API rates, don't roll over, and require opt-in by June 15. The host's take: classic vendor lock-in, mirrored by OpenAI taking the opposite tack — including Codex in every ChatGPT subscription and offering enterprise Claude switchers two free months of Codex.

~00:00 The programmatic credits system

Subscription tiers come with API credits worth the subscription price — $20 Pro, $100 Max 5×, $200 Max 20× — but billed at full API rates, with no rollover.[19] The host estimates $20 of Opus 4.7 usage burns in ~2 days; $200 may not last a heavy user a week. Opt-in required by June 15.

"Your $200 max subscription gives you $200 of API credits per month, which, if you're a heavy user, could be gone in a single afternoon."
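The tier amounts and the host's two-day estimate above imply a simple runway calculation. This is a rough sketch that assumes a constant daily spend and extrapolates the host's $20 estimate linearly to the larger tiers, which is an assumption of this example rather than something the video states; the names are illustrative.

```python
# Monthly credit amounts per tier, as listed above (USD).
TIERS = {"Pro": 20, "Max 5x": 100, "Max 20x": 200}

# The host's estimate: $20 of Opus 4.7 usage lasts about 2 days.
HOST_DAILY_SPEND = 20 / 2  # USD per day


def runway_days(credits: float, daily_spend: float) -> float:
    """Days until credits run out at a constant daily spend."""
    return credits / daily_spend


for tier, credits in TIERS.items():
    print(f"{tier}: {runway_days(credits, HOST_DAILY_SPEND):.0f} days at ${HOST_DAILY_SPEND:.0f}/day")

# Daily spend at which the $200 tier is exhausted in exactly one week.
week_burn_rate = TIERS["Max 20x"] / 7
print(f"$200 lasts under a week above ${week_burn_rate:.2f}/day")
```

At the host's implied $10/day, $200 would last about 20 days, so the "may not last a week" and "single afternoon" claims describe a heavier user spending upward of roughly $29/day.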

~01:00 Timeline of the crackdown

  • Jan 9: Silent block of subscription tokens outside official apps; no announcement; some developer accounts banned.
  • February: ToS updated to formally prohibit third-party harnesses.
  • April: Active enforcement — apps like Open Code blocked; Claude Code's system prompt scans Git status for keywords like "Open Claude" and "Hermes" and flags accounts even when the tools aren't actively in use.
"No announcement, no warning, just a silent update that broke workflows overnight."

~02:00 Three theories

  1. Compute efficiency — claimed third-party tools weren't using prompt caching well. Undermined by Anthropic's May 6 SpaceX deal for 220,000+ GPUs.
  2. Telemetry gaps — third-party traffic patterns are hard to debug. Counter: SDK-level telemetry would address this without a ban.
  3. Vendor lock-in — every restriction has pushed users toward official Anthropic products (Claude routines, managed agents, remote control). The host finds this the most credible.
"On the 6th of May, Anthropic signed a deal with SpaceX for over 220,000 GPUs. So, if compute was the problem, they just solved it."

~05:01 Anthropic vs. OpenAI ecosystem strategies

OpenAI includes Codex in every ChatGPT subscription with no credit system, allows subscription use in third-party tools, opened its platform to Open Claude (3M users), and is offering enterprise Claude switchers two free months of Codex.[19] The host's frame: Anthropic is putting up walls; OpenAI is tearing them down.

"While Anthropic is putting up walls, OpenAI is tearing them down."
"Anthropic is making up weird rules that is giving them free customers."
"The question now is whether the Claude models are still good enough to justify paying more to use them. Right now for me, the answer is yes, but the gap is closing very quickly."
Tools: Agent SDK, claude -p, Open Claude, Conductor, Hermes, Sandcastle, T3 Code, Zed, Open Code, Codex, ChatGPT, Claude routines, Nano Claude
AI Tools Developer Tools
Better Stack

Routa: an AI coding tool built as a Kanban delivery team

Better Stack reviews Routa, a free, open-source, local-first AI coding tool that treats AI-assisted development as a Kanban-based delivery pipeline — backlog → dev → review → evidence → done — rather than a chat session.[20] It is model-agnostic via your own API key (the host used Claude), supports MCP and ACP, and is self-hostable via Docker Compose. The thesis: "the next step is not just smarter models, it's better coordination, better traces, better gates."

Routa targets three problems the host calls "chat hell" — context trapped in conversations, no traceability for AI decisions, and no quality gates (tests, diffs, acceptance criteria) enforced before merge.[20] Stages are explicit: backlog → dev → review → evidence → done. Each work item has its own card with context, plan, diffs, and evidence — closer to how a delivery team operates than a chat thread.
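The staged pipeline with gates described above can be sketched as a small state machine. This is not Routa's implementation: the `Card` fields and the `GATES` rules are hypothetical, chosen only to show a card that cannot leave a stage until that stage's requirement is met, with every transition recorded as a trace.

```python
from dataclasses import dataclass, field

STAGES = ["backlog", "dev", "review", "evidence", "done"]


@dataclass
class Card:
    title: str
    stage: str = "backlog"
    plan: str = ""
    diff: str = ""
    tests_passed: bool = False
    history: list[str] = field(default_factory=list)  # trace of stage transitions


# Hypothetical gates: what a card must carry before it may leave each stage.
GATES = {
    "backlog": lambda c: bool(c.plan),
    "dev": lambda c: bool(c.diff),
    "review": lambda c: c.tests_passed,
    "evidence": lambda c: bool(c.history),
}


def advance(card: Card) -> bool:
    """Move the card one stage forward if its gate passes; return whether it moved."""
    if card.stage == "done":
        return False
    if not GATES[card.stage](card):
        return False
    next_stage = STAGES[STAGES.index(card.stage) + 1]
    card.history.append(f"{card.stage} -> {next_stage}")
    card.stage = next_stage
    return True


card = Card(title="Add login rate limit")
blocked = advance(card)          # no plan yet, so the card stays in backlog
card.plan = "Throttle by IP"
card.diff = "+ limiter.check(ip)"
card.tests_passed = True
while advance(card):
    pass
print(blocked, card.stage, card.history)
```

Keeping the gate checks in data rather than in conversation is what separates this shape from a chat thread: the card either satisfies the gate or it does not move.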

It's model-agnostic via bring-your-own-API key, supports MCP and ACP agent protocols, and ships both a desktop app and a Docker Compose self-host. Comparisons the host makes:

  • Cursor / Claude Code — great code assistants but chat-first; conversation-centric.
  • CrewAI / LangGraph — flexible agent frameworks but you still have to build the workflow yourself.
  • Routa — opinionated workflow, fewer ready-made agents, slightly rougher UX than Cursor, but no mandatory subscription and code stays local.
"The next step is not just smarter models — it's better coordination, better traces, better gates."
AI Tools Developer Tools
Sequoia Capital

Serval: describe the workflow, the code appears

Sequoia clips a 45-second pitch from Jake Stauch on Serval: keep traditional workflow and database primitives, but generate and update them via natural language. Describe a workflow (steps, permissions, approvals, logic) and the code is generated instantly; same pattern for data sources.[21]Sequoia — Serval

Stauch's pitch is that the building blocks (workflows, databases, permissions) are exactly the same as the past 20 years of internal-tool stacks — what changes is the authoring layer.[21] Instead of clicking through a low-code UI, you describe the workflow in words and the code is generated and maintained for you. Same with data ingestion: describe the sources you want and Serval generates the fetch code and keeps it up to date.

Worth filing next to the Routa write-up above as another bet on opinionated, structured AI-assisted development — pulling away from chat-first interfaces.

Hot Take Productivity
Nate B Jones

Nate B Jones: prompt engineering is dead — meet the AI Question Method

Nate B Jones argues that traditional prompt engineering is now table stakes and no longer sufficient for Opus 4.7-class agents. The replacement: the "AI Question Method" — three principles for working with senior-partner-grade AI rather than over-instructing it like a junior hire.[22]Nate B Jones — Prompt Engineering Is Dead

~00:00 The reframe. Stop treating AI like a junior employee that needs precise instructions; start treating it like a senior partner that benefits from open questions and directional intent.[22]

~04:00 Principle 1: Flashlight Intent. Frame questions with a central thesis (the beam center) plus explicit edges/exclusions. Avoids both over-open and over-prescriptive prompts.

~09:00 Principle 2: Invite Synthesis. For complex creative outputs, pose multiple intersecting open-ended questions and let the AI synthesize across them — rather than writing rigid evals that constrain the answer. He claims this capability is meaningfully better in Opus 4.7 and GPT-5.5.

~14:00 Principle 3: Data + Opinion. When pointing the AI at a folder of files, explicitly name data artifacts alongside your thesis-as-question so it ranges across all sources rather than drilling into one.

~22:00 Closing. Nate calls this a real inflection point and pushes for a full vocabulary shift from "prompt engineering" to "AI questioning."

"The words were never the things that mattered the most in prompt engineering. The intent was always what mattered."
Tools: Claude Opus 4.7, OpenAI 5.5, Claude Code, Codex, Co-work
AI Tools Developer Tools
AICodeKing

KingGravity: fixing Antigravity 2.0 with Anthropic's design skill

AICodeKing publishes a tips guide for using Google's Antigravity 2.0 despite the host's stated dislike of it. The fixes — labeled "KingGravity" — center on plugging in Anthropic's front-end design skill plus awesome-design.md for UI quality, tuning the King mode prompt with "don't plan unless asked," enabling browser tools, and using the Karpathy skill.[23]AICodeKing — KingGravity

Four threads from the video:[23]

  • Pricing. Free tier advice and how to avoid blowing through credits on Antigravity 2.0 / Antigravity IDE / Antigravity CLI.
  • Antigravity IDE vs Antigravity 2.0. Where each is stronger; why the host prefers the IDE for now.
  • Fixing Gemini 3.5 Flash regressions. Use Anthropic's front-end design skill + awesome-design.md for UI quality; add "don't plan unless asked" to the King mode prompt to keep responses structured.
  • Browser tools + Karpathy skill. Enable browser actuation rules; layer in the Karpathy skill for richer reasoning.
Tools: Antigravity, Antigravity IDE, Antigravity 2.0, Antigravity CLI, Gemini 3.5 Flash, Anthropic front-end design skill, awesome-design.md, King mode prompt, VS Code, Karpathy skill
Hot Take
Nate B Jones

Hot take: AI detection in schools is mathematically impossible

A ~90-second short in which Nate B Jones — citing Andrej Karpathy — argues that AI writing detection in schools is mathematically impossible to implement correctly, and is already harming students.[24]Nate B Jones — AI Detection

The core claim: there is no reliable way to distinguish AI-generated prose from human prose at a per-document level, and false-positive rates are high enough that the policy generates real harm — students getting flagged and penalized for writing that they actually wrote.[24]Nate B Jones
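One way to see why false positives matter at classroom scale is a base-rate calculation. This sketch is not the argument made in the video, and every input below is a made-up illustrative value rather than a measured detector rate; `flagged_breakdown` is a hypothetical helper.

```python
def flagged_breakdown(essays: int, ai_share: float, sensitivity: float, false_positive_rate: float):
    """Return (true_flags, false_flags, share_of_flags_that_are_wrong)."""
    ai_written = essays * ai_share
    human_written = essays - ai_written
    true_flags = ai_written * sensitivity
    false_flags = human_written * false_positive_rate
    wrong_share = false_flags / (true_flags + false_flags)
    return true_flags, false_flags, wrong_share


# All four inputs are made-up illustrative values, not measurements.
true_flags, false_flags, wrong_share = flagged_breakdown(
    essays=1000, ai_share=0.10, sensitivity=0.90, false_positive_rate=0.05
)
print(f"{true_flags:.0f} correct flags, {false_flags:.0f} false flags, "
      f"{wrong_share:.0%} of flagged essays were human-written")
```

With these hypothetical inputs, a detector that sounds accurate still flags 45 human-written essays alongside 90 AI-written ones, so a third of the accusations land on students who wrote their own work.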

Developer Tools
The Pragmatic Engineer

Alice Ryhl: the pitch for Rust on the back end

A short clip from The Pragmatic Engineer: Alice Ryhl pitches Rust to TypeScript developers as a back-end language, emphasizing reliability and bug reduction for server-side workloads like API servers.[25]Pragmatic Engineer — Alice Ryhl

Ryhl frames Rust's appeal for TypeScript devs not in terms of performance but in terms of reliability — fewer production bugs on the server side, with the type system catching what JavaScript's wouldn't.[25]Pragmatic Engineer

Sources

  1. Newsletter OpenAI cracks an 80-year math belief — The Rundown AI, May 21
  2. Blog Cursor's Composer 2.5: third on the Coding Agent Index and ~10-60x lower cost than rivals — Artificial Analysis, May 21
  3. Blog Cohere launches open weights model Command A+ — Artificial Analysis, May 21
  4. Newsletter Nvidia beats again — Sherwood Snacks, May 21
  5. Newsletter Nvidia's CPU era has arrived — Tech Brew, May 21
  6. Blog Datasette Agent — Simon Willison's Weblog, May 21
  7. Blog datasette-agent-sprites 0.1a0 — Simon Willison's Weblog, May 21
  8. Blog datasette-agent-charts 0.1a2 — Simon Willison's Weblog, May 21
  9. Blog datasette-agent 0.1a3 — Simon Willison's Weblog, May 21
  10. Blog Here's what developers can do with the latest Google Play updates — Google Blog, May 21
  11. Newsletter Data Science Weekly — Issue 652 — Data Science Weekly, May 21
  12. YouTube Notion's Ivan Zhao: The Refounder — Sequoia Capital, May 21
  13. YouTube Notion's CEO on where he stumbled with sales — Sequoia Capital, May 21
  14. YouTube How to Build a Self-Improving Company with AI — Y Combinator, May 21
  15. YouTube Run long tasks in Codex using goals — OpenAI, May 21
  16. YouTube Share Codex plugins with your team — OpenAI, May 21
  17. YouTube Introducing Appshots in Codex — OpenAI, May 21
  18. YouTube OpenAI: $2M in tokens to every YC company in the spring and summer batches. — Y Combinator, May 21
  19. YouTube Why Anthropic Banned Every Third-Party Claude Tool — Better Stack, May 21
  20. YouTube I Tried an AI Coding Tool Built Like a Delivery Team (Routa) — Better Stack, May 21
  21. YouTube Describe the workflow. The code appears. | Jake Stauch, Serval — Sequoia Capital, May 21
  22. YouTube Opus 4.7 and OpenAI 5.5 Made Your Prompting Style Obsolete. — AI News & Strategy Daily | Nate B Jones, May 21
  23. YouTube KingGravity (I Fixed Antigravity 2.0): SIMPLEST AND EASIEST WAY to CORRECTLY USE ANTIGRAVITY! — AICodeKing, May 21
  24. YouTube Cognitive Architecture Beats AI Detection Every Time — AI News & Strategy Daily | Nate B Jones, May 21
  25. YouTube Alice Ryhl: The pitch for Rust — The Pragmatic Engineer, May 21