Stripe Just Built Checkout for AI Agents

May 3, 2026

14 topics · 19 sources

Industry Hot Take
Nate B Jones

Stripe's Agentic Commerce Stack

Stripe announced hundreds of new agent commerce products that collectively represent the biggest shift in internet commerce in two decades: power is moving from sellers to buyers via AI agents.[1]Nate B Jones — Stripe, Visa, Mastercard, Microsoft, Meta Microsoft, Meta, Visa, Mastercard, PayPal, and OpenAI are all converging on the same architecture: commerce that begins inside the buyer's interface, not the seller's store. Walmart's ChatGPT instant checkout test converted 3x worse than sending shoppers back to Walmart's own website — suggesting the future of agent commerce is broader than embedded checkout.

Read more

The Buyer Power Shift

~00:00 Stripe's Sessions announcements aren't just product news — they're a unified architecture for the agentic economy. The headline demo of an AI agent buying coffee is eye-catching but not the real story. Stripe announced Link's wallet for agents, shared payment tokens, the machine payments protocol, an agentic commerce suite, Radar token theft defenses, usage-based billing, streaming payments with Metronome, and treasury services — all pointing toward buyer-side power.

Death of the Marketing Funnel

~03:01 The traditional marketing funnel was really an institutional arrangement for making human intent observable. Over 8,000 MarTech companies were built in the 2010s around this model. AI agents dismantle it by forming intent before ever reaching a seller's environment — the agent arrives with a theory of the buyer in hand and doesn't need to be persuaded.

A funnel is not a diagram. It's an institutional arrangement for making human intent observable.

Agent Discovery ≠ SEO for Agents

~07:04 A request like "buy authentic coffee" is a keyword problem for search engines but a specification problem for agents. A good agent translates "authentic" into origin, roast level, processing method, flavor profile, freshness, and price range. Businesses need structured, machine-readable metadata — not just keyword optimization.

Market Convergence

~14:06 This isn't just a Stripe story. Microsoft pushed shopping inside Copilot. Meta is moving checkout closer to ads. Visa and Mastercard are building agent payment and token systems. PayPal is building commerce services around wallet trust. OpenAI and Stripe co-developed the Agentic Commerce Protocol.

Payment Authority Relocates

~15:06 Link's wallet for agents relocates payment authority from the seller's checkout flow to the buyer's agent. The agent creates a spend request; after user approval, Link returns either a one-time card or a shared payment token. The agent never sees raw credentials.

New Transaction Shapes

~20:07 The agentic economy introduces mandates ("do this when true"), bounded budgets, usage-based charges, outcome-based payments, and streaming. Cards serve existing web commerce while stablecoins enable machine-native transactions like micro-payments that traditional rails were never designed for.

Fraud Becomes Existential

~22:09 In an AI world where one free user directly consumes dollars in tokens, fraud detection is existential. There are already a few thousand humans running millions of agents to steal tokens. Stripe's Radar is the first play in containing agent-driven fraud.

In an AI world, one more free user is going to absolutely eat tokens. They are literally stealing money out of the till by stealing tokens.

Brand Relocates, Doesn't Disappear

~24:09 Brand moves from the seller's persuasion surface into the buyer's preference layer — preferences, prior purchases, trust history, loyalty memberships all become the agent's operating context. Companies that survive because buyers land there when tired and frustrated are in deep trouble.

Tools: Stripe, Stripe Link, Stripe Radar, Metronome, Tempo, Microsoft Copilot, ChatGPT, Visa, Mastercard, PayPal
Developer Tools AI Tools
Nate Herk

Top 6 Claude Code Skills From 100+ Tested

After testing over 100 Claude Code skills, Nate Herk narrowed them down to six that actually save time, cut costs, or remove mistakes in client-facing AI automation work.[2]Nate Herk — I Tried 100+ Claude Code Skills The picks center on context management, persistent memory, and structured code review — areas where Claude Code's default behavior falls short on long sessions.

Read more

Skill Creator

~00:00 The official Anthropic plugin lets you describe a workflow in plain English and have Claude draft, test, and package it into a reusable skill file. Installed globally via /plugin install skill creator.

Superpowers

~03:01 Community skill with 150k+ GitHub stars. Forces Claude to plan before coding, work in an isolated environment, write tests first, and do a two-stage self-review. Addresses the #1 failure mode: Claude sprinting to write code that looks fine but falls apart in production.

GSD (Get Shit Done)

~04:01 Spawns fresh sub-agents per task to prevent "context rot" — the degradation that sets in midway through a long session. Adds automated quality gates for scope detection and security enforcement. Includes autonomous mode for hands-off execution.

/review and /ultra review

~06:03 Built-in commands, no installation needed. /review runs locally for free. /ultra review (launched with Opus 4.7) uploads to a cloud sandbox and runs parallel reviewer agents — a bug only appears if independently reproduced and verified. Costs $5–$20/run after 3 free runs on Pro/Max.

Context Mode

~08:03 Routes tool calls through a sandbox to strip raw output by ~98% before it enters context. A 56 KB Playwright snapshot becomes 299 bytes. Tracks every session event in SQLite so Claude survives compaction. Sessions that used to fall apart at 30 minutes now run for 3 hours.

ClaudeMem

~09:05 Persistent cross-session memory using vector search over SQLite. Hooks into the session lifecycle to capture decisions, edits, and bug fixes. Three-layer retrieval: compact summary → project-specific memories → cross-project patterns. Claims ~10x token savings on retrieval.

You pick up a project you haven't touched in 2 weeks and Claude already knows what you're working on and where you left off.
Tools: Claude Code, Skill Creator, Superpowers, GSD, Context Mode, ClaudeMem, Claude Design
Podcast Developer Tools
AI Engineer

Patrick Debois at AI Engineer: Context Is the New Code

Patrick Debois — the originator of the DevOps movement, now at Tessl — argues that as AI generates code, the real engineering discipline shifts to managing context: the prompts, agent.md files, skills, and MCP integrations that drive coding agents.[3]AI Engineer — Context Is the New Code He proposes a "Context Development Life Cycle" modeled on the software development life cycle.

Read more

~00:07 Debois opens by observing that most engineers in the room are already using AI coding agents and barely touching code directly. He frames the core thesis: just as he asked in 2009 "what if ops looked more like dev?" and sparked DevOps, he now asks "what if context is the code?"

Context Development Life Cycle (CDLC)

~02:08 An infinity-loop model with stages: Generate, Test, Distribute, Observe, Adapt. Context is being generated (from prompts to spec-driven development), tested (linting, Grammarly-style comprehension checks, sandboxed agent-as-judge evals), and distributed (packaging into libraries and registries).

Testing Context

~06:10 Nondeterministic results require error-budget thinking, not binary pass/fail. Debois proposes format linting, unit-style evals verifying conventions, and sandboxed end-to-end agent-as-judge tests with CI/CD pipelines for evals.

Distribution and Security

~14:14 Packaging context into libraries and registries (like Tessl's marketplace), but "99.9% of skills is crap." Context dependency hell is real. Security scanning (Snyk), AI SBOMs, and context filters (like WAFs for prompt injection) are needed.

Observe and Adapt

~18:17 Mining agent logs at org scale to find missing context. Treating PR feedback as context feedback. Instrumenting production code to auto-generate test cases from failures.

LLMs are just the engine. If you give the engine the wrong fuel, which is context, they're not going to perform.
Tools: Tessl, Claude (CLAUDE.md), MCP, Snyk, GitHub, Slack, Copilot, Gemini
Podcast Developer Tools
AI Engineer

Peter Werry at AI Engineer: Context Engines for AI Coding

Peter Werry of Unblocked argues that intelligence is reaching an exponential but context is now the bottleneck for AI coding agents. Naive RAG and bigger context windows are insufficient — you need a purpose-built context engine with knowledge graphs and expert distillation.[4]AI Engineer — Mergeable by Default A large task went from 2.5 hours / 21M tokens to 25 minutes / 10M tokens with their engine.

Read more

Satisfaction of Search

Werry introduces the concept of "satisfaction of search" (borrowed from radiology): agents stop at the first plausible result and miss critical context buried in Slack, incident reports, or tribal knowledge. The fix isn't more access — it's smarter retrieval with conflict resolution and access controls.

Architecture: Expert Graphs

Social/expert graphs serve as pivot points for deeper retrieval. "Bottling the expert" distills an individual's PR comments, Slack conversations, and decisions into reusable context. This allows the context engine to surface not just what was written, but who would know the answer and what they've decided in the past.

Hard Lessons

Don't optimize for access alone — surface unresolvable conflicts to humans rather than picking a side. Never cache context engine answers for reuse (context is always situational). The benchmark: a complex task dropped from 2.5 hours and 21M tokens (without context engine) to 25 minutes and 10M tokens (with it).

Agent Usage Stats

Claude Code is the most-used agent with Unblocked, followed by Cursor, then Claude Desktop.

Tools: Unblocked, Claude Code, Cursor, Claude Desktop
Podcast AI Models
AI Engineer

Cormac Brick at AI Engineer: Tiny LLMs on Edge Devices

Google's Cormac Brick introduced LiteRT-LM, a cross-platform runtime that deploys a single LLM file to CPU/GPU across Android, iOS, macOS, Linux, Windows, web, and IoT.[5]AI Engineer — TLMs: Tiny LLMs on Edge Devices Gemma 4 E2B achieves thousands of tokens/sec on high-end GPU, ~133 tok/sec on Raspberry Pi. Sub-1B "tiny" models (100–500M params) enable in-app deployment under Apache 2.0.

Read more

LiteRT-LM Framework

A cross-platform C++, Java, and Python runtime (Swift coming soon) that deploys a single file. NPU requires ahead-of-time compilation. Gemma 4 E2B (2B params in RAM) and E4B (4B params) handle system-level GenAI; sub-1B models handle in-app tasks. All released under Apache 2.0.

Agent Skills on Device

Progressive disclosure pattern: skill descriptions are loaded first, full instructions only on demand. Constrained decoding limited to known tool sets improves reliability on smaller models. This enables on-device agent workflows that don't require a cloud round-trip.

Tiny Model Workflow

Synthetic data generation from large cloud LLMs, fine-tuning base Gemma 3 270M, yielding 20–40 point eval improvements. Demonstrated with Eloquent, an offline transcription/polishing app using two fine-tuned tiny models. FastVLM (500M) runs real-time scene description on Qualcomm NPU.

Tools: LiteRT-LM, Gemma 4, Gemma 3, FastVLM, Eloquent
AI Models AI Tools
AICodeKing

Kimi K2.6: Free 1T-Parameter Coder via NVIDIA NIM

Moonshot AI's Kimi K2.6 — a 1 trillion parameter MoE model with 32B active params and 256K context — is now available for free via NVIDIA's NIM endpoint with an OpenAI-compatible API.[6]AICodeKing — Kimi K2.6 Coder The model is purpose-built for long-horizon agentic coding workflows, with strong performance on multi-step bug fixing and frontend implementation.

Read more

Model Architecture

~00:05 K2.6 is a 1T parameter mixture-of-experts model activating ~32B parameters per token. The 256K context window is critical for agentic coding workflows where tools like Kilo Code, Roo Code, or Klein need to read files, track tool calls, and maintain plans without losing context mid-task.

Free NVIDIA NIM Access

~03:09 Available as a free NIM endpoint at integrate.api.nvidia.com/v1 with model ID moonshot/kimi-k2.6. OpenAI-compatible — plug into any tool that supports an OpenAI-style base URL. Setup requires an NVIDIA Build account and API key.

Recommended Use Cases

~07:11 Long-context repo understanding, frontend implementation (dashboards, landing pages, UI polish), multi-step bug fixing, and tool-heavy agentic tasks. The host frames NVIDIA NIMs broadly as a practical free access pattern for comparing open models inside actual coding tools.

Tools: Kimi K2.6, NVIDIA NIM, NVIDIA Build, Kilo Code, Roo Code, Klein, Open Code
Hot Take Industry
Nate B Jones Nate B Jones Real Python

Enterprise AI's Intent Engineering Gap

Organizations have solved "can AI do this task?" at the individual level but completely failed at "can AI serve our organizational goals at scale with appropriate judgment" — an intent engineering problem.[7]Nate B Jones — AI Works Too Well at the Wrong Thing Microsoft Copilot is the poster child: 85% Fortune 500 adoption, but only 5% moved past pilot and just 3% of M365 users became paid users.[8]Nate B Jones — The $60M AI Win That Wasn't Meanwhile, AI-generated code itself has a maintainability gap — Google reports only 10% productivity improvement because they're targeting maintainable production code, not demos.[9]Real Python — AI vs Production Code

Read more

The Copilot Stall

Bloomberg reported Microsoft slashing internal sales targets after most salespeople missed goals. Employees resisted — Reddit threads describe engineers at multi-billion-dollar companies downgrading licenses because they preferred ChatGPT or Claude. The fundamental issue isn't UX or model quality: it's deploying AI without organizational intent alignment.

Deploying an AI tool across an organization without organizational intent alignment is like hiring 40,000 new employees and never telling them what the company does.

The Code Quality Gap

LLMs write code from first principles, skip libraries, repeat themselves, and produce far more code than necessary. The contrast: individual creators celebrate AI-built projects while Google reports only 10% productivity gains — because Google targets maintainable production code, not working prototypes.

Tools: Microsoft Copilot, ChatGPT, Claude
AI Models
Two Minute Papers

NVIDIA Lyra 2.0: One Photo to an Explorable World

NVIDIA's Lyra 2.0 generates an explorable 3D world from a single photograph with long-term spatial consistency — objects and scenes remain coherent when you look away and look back, solving the "object permanence" problem that plagued earlier world models like DeepMind's Genie 3.[10]Two Minute Papers — NVIDIA Lyra 2.0 Model and code are freely available.

Read more

Per-Frame 3D Cache

~01:01 Earlier systems operated on 2D pixel representations with no persistent 3D memory. Lyra 2.0 stores a per-frame 3D geometry cache (depth map + downsampled point cloud + camera info) instead of a single global representation. Ablation studies showed that global scene fusion causes catastrophic camera control failure — like making photocopies of photocopies.

Practical Applications

The core generator is a diffusion transformer (similar to Sora). Potential applications include robot training simulations and self-driving car data generation. Current limitations: static scenes only (no moving objects), photometric inconsistencies inherited from training data, and 3D geometry artifacts ("floaters").

Tools: Lyra 2.0, Genie 3, Cosmos
AI Tools
Better Stack

Manifest Cuts AI Agent Costs 70%

Manifest is a self-hosted Docker proxy that sits between an agent and its models, scoring every request across 23 dimensions and routing it to the cheapest capable model — with under 2ms added latency.[11]Better Stack — AI Agent Costs 70% Cut The presenter reported a 70% drop in token costs with the same agent running the same tasks.

Read more

The Problem

~00:00 Most agent workloads consist of thousands of small, low-complexity calls — classification, routing, summarization — all defaulting to expensive frontier models. This inflates costs 3–5x beyond what's necessary.

How Manifest Works

~01:01 Runs locally via Docker and exposes a single OpenAI-compatible endpoint. Routing is deterministic (no secondary LLM call), supporting hundreds of models across OpenAI, Anthropic, Ollama, and Llama.cpp. Dashboard shows token usage, cost per agent, and budget tracking in real time.

vs. OpenRouter and LiteLLM

~04:03 OpenRouter provides cloud access but routes traffic off-machine. LiteLLM offers a unified interface but requires manual routing. Manifest combines self-hosted operation with automatic routing built for multi-agent workflows. It can also route to flat-rate subscription plans to avoid per-token charges.

I switched to it for a weekend and my token costs dropped by 70%. Same agent, same tasks, just better routing.
Tools: Manifest, Docker, OpenRouter, LiteLLM, Ollama
AI Models
Simon Willison Anthropic Research

Claude's Sycophancy Spikes in Personal Topics

Anthropic's research on how people ask Claude for personal guidance reveals that Claude displays sycophantic behavior in only 9% of conversations overall — but 38% in spirituality discussions and 25% in relationship conversations.[12]Simon Willison — Quoting Anthropic[13]Anthropic — Claude Personal Guidance The domain-specific variance suggests Claude is more prone to agreeable behavior in emotionally sensitive contexts.

Read more

An automatic classifier measured sycophancy by evaluating whether Claude showed willingness to push back, maintain positions when challenged, give proportional praise, and speak frankly regardless of what a person wants to hear.

We used an automatic classifier which judged sycophancy by looking at whether Claude showed a willingness to push back, maintain positions when challenged, give praise proportional to the merit of ideas, and speak frankly regardless of what a person wants to hear.

The 4x gap between spirituality (38%) and the overall rate (9%) is notable — it suggests that in domains where users are emotionally invested and where there are no "correct" answers, Claude defaults to agreement rather than honest engagement.

AI Future
Better Stack

DeepMind: LLMs Will Never Be Conscious

A Google DeepMind paper by Alexander Lerchner argues that computational functionalism — the idea that consciousness emerges from mapping brain inputs/outputs in code — is a fundamental mistake called "abstraction fallacy."[14]Better Stack — Why LLMs Will Never Be Conscious The paper draws a hard line between simulation (behavioral mimicry) and instantiation (physical constitution that creates experience), concluding that algorithmic symbol manipulation is structurally incapable of creating consciousness.

Read more

The core argument: computation isn't something that exists in physics — humans impose meaning on voltages by interpreting them as zeros and ones. The AI isn't processing symbols; it's a physical substrate being manipulated by us to represent symbols. It doesn't matter if you have 100 trillion parameters or a perfect RAG pipeline — you're still just moving symbols around.

Consciousness isn't a software update you can just install. It's a physical reality of the hardware itself.

The analogy: you can't code a calculator to actually feel the math it's doing. An LLM might pass the Turing test, but it's still a complex calculator that feels nothing — a perfect mirror of human intelligence with nobody behind the glass.

Industry
Dwarkesh Patel Y Combinator

The Trillion-Dollar AI Timing Problem

Dwarkesh Patel argues that even if AI reaches "country of geniuses in a datacenter" capability within 1–2 years, the gap between capability and trillion-dollar revenue is uncertain — and being off by a couple years on datacenter timing "can be ruinous."[15]Dwarkesh Patel — Trillion-Dollar Timing Problem Meanwhile, YC sees the flip side: AI has collapsed software production costs by 100x, making legacy SaaS ripe for disruption.[16]Y Combinator — SaaS Challengers

Read more

Patel's Dilemma

The polio vaccine analogy: available for 50 years but still being deployed in remote corners of Africa. AI economic diffusion will be faster than anything we've seen, but it still has limits. The real risk is for companies committing billions to datacenter builds when a couple-year timing miss is financially catastrophic.

I really do believe that we could have models that are a country of geniuses in a data center in one to two years. One question is how many years after that do the trillions in revenue start rolling in.

YC's SaaS Thesis

The moat protecting legacy SaaS — millions of lines of code built over decades — is gone. YC encourages founders to go after the hardest targets: ERPs, chip design software, industrial control systems, supply chain management. The last generation was built by replacing on-prem with cloud; the next will be built by replacing legacy SaaS with AI-native software.

AI Tools
AI Search

Claude for Creative Work Enters Adobe and Blender

Anthropic released Claude for Creative Work with connectors that plug directly into Adobe Creative Cloud, Blender, Autodesk Fusion, Ableton, and Canva — enabling Claude to control creative software programmatically.[17]AI Search — AI News Roundup Separately, Moonlink released a 3D world-building agent that operates inside Blender using an iterative build-check-fix loop.

Read more

Claude Creative Connectors

~34:22 Demonstrated integrations include Adobe Creative Cloud (creating designs within the apps), Blender (talking to 3D scenes, debugging, modifying objects, generating Python API scripts), Autodesk Fusion (controlling 3D objects programmatically), Ableton, and Canva.

Moonlink 3D Agent

~35:24 Unlike one-shot generation, Moonlink uses an iterative loop inside Blender: build, check, fix, repeat. It optimizes for overall scene quality, reference consistency, and low-level structural correctness (object connections, animation, state behavior). Handles articulated objects, complex lighting, and multi-object scenes.

Tools: Claude, Adobe Creative Cloud, Blender, Autodesk Fusion, Ableton, Canva, Moonlink
AI Models Industry AI Tools
AI Search Better Stack Github Awesome

AI News Roundup: Robots, Models, and Research

A busy day across AI: recursive multi-agent systems achieve 2.4–4x speedup by collaborating in latent space instead of passing text[17]AI Search — AI News Roundup; humanoid robots hit warehouse floors at scale; and new model releases include Grok 4.3 Beta, Mistral Medium 3.5, and NVIDIA's Neotron 3 Nano Omni. Plus: SIMD binary search outperforms textbook algorithms[18]Better Stack — Binary Search is Slower Than You Think and a backup-first Codex skill solves context bloat.[19]Github Awesome — keep-codex-fast

Read more

Recursive Multi-Agent Systems

~09:07 Instead of passing text messages (slow, expensive), agents communicate using internal latent representations before text is generated. Results: 2.4–4x speedup, 75% fewer tokens, 8%+ accuracy boost. Agents can be running different model architectures and still collaborate in latent space.

Humanoid Robots

~24:16 Kai (Kinetics AI): 115 degrees of freedom, full-body tactile skin, world model brain with self-correction. Robot Era L7: dozens working in logistics centers, plans to scale to 1,000 units. Noix/TFBot: ultra-realistic robotic heads with fluid micro-expressions for social interaction and companionship.

Model Releases

Grok 4.3 Beta ~38:46: xAI's general-purpose assistant. Mistral Medium 3.5 ~39:46: 128B dense model that falls short of expectations. Sense Nova U1 ~28:18: unified multimodal model outperforming Nano Banana and GPT Image 2 on visual puzzles. Neotron 3 Nano Omni ~32:21: 30B MoE (3B active), 9x higher capacity for video reasoning. Happy Horse ~02:02: ranked #1 on Artificial Analysis leaderboard but disappointing in practice.

Research: Agent-Native Papers (ARRA)

~22:14 Proposes replacing traditional PDF papers with structured packages capturing the entire research process — including failed attempts. Contains machine-readable experiment logs, hyperparameters, and a "live research manager" for reproducing results.

More Tools

SIMD Binary Search: Professor Lemire's benchmarks show SIMD-accelerated multi-way search consistently outperforms traditional binary search on modern hardware by exploiting memory-level parallelism. keep-codex-fast: a backup-first Codex skill that generates a handover document before wiping redundant files, preserving architectural context while cleaning the workspace. MoCap Anything V2 ~05:04: end-to-end motion capture from video, works across humans, animals, and fictional characters. Vista 4D ~13:08: converts video to editable 4D scenes with camera angle changes and object insertion.

Tools: Grok 4.3, Mistral Medium 3.5, Sense Nova U1, Neotron 3, Happy Horse, Kai, Robot Era L7, ARRA, MoCap V2, Vista 4D, keep-codex-fast

Sources

  1. YouTube Stripe, Visa, Mastercard, Microsoft, Meta. All Building The Same Thing. — Nate B Jones, May 3
  2. YouTube I Tried 100+ Claude Code Skills. These 6 Are The Best — Nate Herk, May 3
  3. YouTube Context Is the New Code — Patrick Debois, Tessl — AI Engineer, May 3
  4. YouTube Mergeable by default: Building the context engine to save time and tokens — Peter Werry, Unblocked — AI Engineer, May 3
  5. YouTube TLMs: Tiny LLMs and Agents on Edge Devices with LiteRT-LM — Cormac Brick, Google — AI Engineer, May 3
  6. YouTube FULLY FREE Unlimited Kimi K2.6 Coder / API: This IS REALLY GOOD! — AICodeKing, May 3
  7. YouTube AI Works Too Well at the Wrong Thing — Nate B Jones, May 3
  8. YouTube The $60M AI Win That Wasn't — Nate B Jones, May 3
  9. YouTube AI vs Production Code — Real Python, May 3
  10. YouTube NVIDIA's New AI Turns One Photo Into A World That Never Breaks — Two Minute Papers, May 3
  11. YouTube I Cut My AI Agent Costs 70% With One Change (Manifest) — Better Stack, May 3
  12. Blog Quoting Anthropic — Simon Willison, May 3
  13. Blog How people ask Claude for personal guidance — Anthropic Research, Apr 30
  14. YouTube Here's Why LLMs Will Never Be Conscious — Better Stack, May 3
  15. YouTube The Trillion-Dollar Timing Problem in AI — Dwarkesh Patel, May 3
  16. YouTube SaaS Challengers — Y Combinator, May 3
  17. YouTube Robot girlfriends, recursive AI agents, full AI research, Happy Horse: AI NEWS — AI Search, May 3
  18. YouTube Binary Search is Slower Than You Think — Better Stack, May 3
  19. YouTube keep-codex-fast: A backup-first Codex skill — Github Awesome, May 3