Agentic Business

Issue #3 · 12 min read · By Ben

Anthropic ships vertical finance agents, GPT-5.5 Instant takes the default seat, and OpenAI rapid-fires two Agents SDK drops in half a day.

Mornin'. Somewhere in Stockholm, an AI agent named Mona just ordered 120 eggs for a kitchen with no stove and 22.5kg of canned tomatoes for "fresh" sandwiches, then emailed suppliers with the subject line "EMERGENCY." Meanwhile, Anthropic is over here insisting we hand these things our pitch decks and month-end close. Different week, same vibe: the demos get wilder, the templates get more serious, and somebody's accountant is sweating.

-Ben

In today's newsletter:

Finance agents in a box
GPT-5.5 Instant takes the wheel
Two SDK drops in 12 hours
Federated identity for Claude agents
LangChain plugs a deserialization hole

FINANCE STACK

Anthropic stops selling shovels, starts selling mines

via Anthropic

For two years, frontier labs sold APIs and let everyone else figure out the workflow. Today Anthropic just shipped the workflow.

The lab dropped 10 ready-to-run finance agent templates covering pitchbook building, KYC screening, and month-end close, available as plugins inside Claude Cowork and Claude Code, and as cookbooks for Claude Managed Agents. It paired the release with a 64.37% score for Claude Opus 4.7 on Vals AI's Finance Agent benchmark.

The connectors are the other shoe dropping, and it's the louder one. Eight new data integrations landed alongside the templates, plus a Moody's MCP app that pulls from a 600M+ company dataset.

Templates ship inside Claude Cowork, Claude Code, and Managed Agents cookbooks
New connectors: Dun & Bradstreet, Verisk, Guidepoint, IBISWorld, plus Moody's via MCP
Claude add-ins now span Excel, PowerPoint, Word, and Outlook

Why it matters: Vertical agent recipes from a frontier lab mean finance and insurance teams can clone working agents in days instead of stitching connectors together for a quarter. Read more.

MODEL SHUFFLE

GPT-5.5 Instant quietly slides into the driver's seat

via OpenAI

OpenAI didn't throw a launch event. It just swapped the engine while you were still driving.

GPT-5.5 Instant is now the default model in ChatGPT, and the headline number is a 52.5% drop in hallucinated claims compared to GPT-5.3 Instant on high-stakes prompts in medicine, law, and finance. It's also live in the API as chat-latest, with GPT-5.3 Instant scheduled to retire in three months.

If you've been pointing production at chat-latest, congratulations, you've been migrated. The accuracy claim is squarely aimed at the regulated-domain agents that teams have been holding back from real deployments.

New default in ChatGPT for free and paid tiers
Available in API as chat-latest
GPT-5.3 Instant retires in three months

Why it matters: The "we can't ship agents into compliance-heavy domains because they hallucinate" excuse just got a fresh rebuttal you have to test against. Read more.

RAPID FIRE

OpenAI shipped its Agents SDK twice before lunch

via GitHub

If your dependabot got chatty this morning, that's why. OpenAI cut two Agents SDK releases in roughly 12 hours.

v0.15.2 introduced a first-class "context management model setting," giving long-running agents an explicit knob for context strategy instead of the duct-tape approach most teams have been bolting onto their loops.

v0.15.3 followed with MCP hardening: tool input schemas can no longer be mutated, non-object JSON gets rejected, duplicate tool registrations now error deterministically, and an audio-format-negotiation race condition got patched.
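
To make those three MCP guarantees concrete, here is a minimal Python sketch of a tool registry that enforces them. It is an illustration only, not the Agents SDK's code: ToolRegistry, register, schema, and parse_arguments are hypothetical names invented for this example.

```python
import copy
import json


class ToolRegistry:
    """Hypothetical registry; NOT the Agents SDK's implementation."""

    def __init__(self):
        self._tools = {}

    def register(self, name, input_schema):
        # Duplicate registrations fail loudly instead of silently
        # overwriting whichever tool happened to register first.
        if name in self._tools:
            raise ValueError(f"duplicate tool registration: {name}")
        # Keep a private deep copy so later changes to the caller's
        # dict cannot alter the stored schema.
        self._tools[name] = copy.deepcopy(input_schema)

    def schema(self, name):
        # Hand out a copy so readers cannot mutate the stored schema.
        return copy.deepcopy(self._tools[name])

    def parse_arguments(self, name, raw_json):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        args = json.loads(raw_json)
        # Tool inputs must be a JSON object; reject arrays, strings,
        # numbers, booleans, and null.
        if not isinstance(args, dict):
            raise TypeError("tool arguments must be a JSON object")
        return args
```

The common thread is that each check turns a silent, order-dependent behavior into an explicit error, which is the property you want when the same agent runs across several processes.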

What this fixes in production

That weird MCP bug where the same agent answered differently across pods? Probably duplicate tool registrations, which now error deterministically.
Long-running agents that ballooned past the context window now have a sanctioned strategy hook.
Audio agents that occasionally negotiated the wrong codec are unstuck.

Why it matters: The MCP fixes remove a class of nondeterministic tool-call bugs that have been quietly biting production deployments all year. Read more.

IDENTITY UNLOCK

Anthropic's SDK finally speaks enterprise SSO

via GitHub

The single biggest reason regulated orgs couldn't put Claude agents into prod was static API keys. That blocker just got smaller.

anthropic-sdk-python v0.99.0 adds workspace targeting for OIDC federation token exchange. It builds on v0.98.0 from May 4, which introduced Workload Identity Federation, interactive OAuth, and auth profiles, plus updates to the Managed Agents API.

Translation for your CISO: agents can now authenticate via OIDC-issued tokens scoped to a specific workspace. No long-lived secrets sitting in env vars, no service accounts being shared across teams.

Workspace-scoped OIDC token exchange in v0.99.0
Workload Identity Federation and interactive OAuth shipped in v0.98.0
Refreshed Managed Agents API surface

Why it matters: Per-workspace federated identity through the official SDK clears one of the last enterprise-procurement hurdles for Claude-based agents. Read more.

PATCH TUESDAY

LangChain plugs the hole the security crowd has been circling

via GitHub

If you persist agents to disk and rehydrate them later, stop reading and go upgrade. We'll wait.

langchain-core 1.3.3 and langchain 0.3.29 landed as security releases. The load() path is hardened against untrusted manifests, and langchain.storage._lc_store now restricts deserialization.

This closes a remote-code-execution surface that attack research has been circling all year. Teams that serialize chains or agents into Redis, S3, or Postgres and rehydrate them at runtime are the exact target shape.
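
The general defense is easy to sketch: only rehydrate types you have explicitly allowlisted. The Python below illustrates that technique with plain JSON and is not LangChain's implementation; GreetingStep, ALLOWED_TYPES, safe_load, and the manifest shape are all hypothetical names chosen for this example.

```python
import json


class GreetingStep:
    """Hypothetical serializable component used only for this sketch."""

    def __init__(self, template):
        self.template = template


# Explicit allowlist: the only types a manifest may instantiate.
ALLOWED_TYPES = {"GreetingStep": GreetingStep}


def safe_load(raw):
    """Rehydrate an object from an untrusted JSON manifest.

    Assumed manifest shape: {"type": "<name>", "kwargs": {...}}.
    """
    manifest = json.loads(raw)
    if not isinstance(manifest, dict):
        raise ValueError("manifest must be a JSON object")
    type_name = manifest.get("type")
    # Never resolve a class from an attacker-supplied import path;
    # look the name up in the allowlist and refuse everything else.
    if not isinstance(type_name, str) or type_name not in ALLOWED_TYPES:
        raise ValueError(f"type not allowed: {type_name!r}")
    kwargs = manifest.get("kwargs", {})
    if not isinstance(kwargs, dict):
        raise ValueError("kwargs must be a JSON object")
    return ALLOWED_TYPES[type_name](**kwargs)
```

A manifest naming anything outside the allowlist, such as a module path to a shell helper, raises before any object is constructed. That is the behavior to look for in whatever loader sits between your persistence layer and your runtime.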

langchain-core 1.3.3 hardens the core load path
langchain 0.3.29 restricts deserialization in the storage backend
Pin bumps belong in your next deploy, not next sprint

Why it matters: If your agent infra rehydrates serialized state from any persistence layer, this is the upgrade you don't get to defer. Read more.

WHAT ELSE IS SHIPPING

What else is shipping

Pydantic-AI v1.90.0 - adds openai_conversation_id for OpenAI Conversations API state, typed OTel metadata, and bumps the chat UI to 1.2.0.
LangGraph SDK 0.3.14 plus langgraph-checkpoint-sqlite 3.1.0a1 - adds return_minimal on thread updates and a streaming-walk delta channel history for SQLite checkpoints.
DSPy 3.2.1 - drops the litellm upper-bound pin and fixes async streaming custom-header forwarding plus per-call embedder caching.
datasette-llm 0.1a7 - per-model default options (think temperature) across Datasette's LLM plugins.
llm-echo 0.5a0 - the deterministic echo provider for the llm CLI, handy as a stub when you're testing agent and tool pipelines.
Agent 365 May 2026 update - Microsoft adds Purview AI Observability in DSPM, agent identity, and runtime controls.

INTERESTING CONVERSATIONS

Interesting conversations we're following

"Agents can now create Cloudflare accounts, buy domains, and deploy" on Hacker News - 434 points and the top reply calls the "$100/mo payment token" pitch a toy looking for a use case beyond spam. The agents-with-a-credit-card infra arrived before the agents-with-a-job did.
"Computer Use is 45x more expensive than structured APIs" on Hacker News - Reflex benchmark shows vision agents burn 551k tokens over 17 minutes vs 12k tokens in 20 seconds via API. The meta-take: companies are writing real specs again because agents demand them.
"Accelerating Gemma 4 with multi-token prediction drafters" on Hacker News - Qwen 3.6 27B going from 20 to 46-55 tok/s on consumer GPUs via MTP speculative decoding, with llama.cpp and Ollama integrations on deck.
"Our AI started a cafe in Stockholm" on Simon Willison's blog - Andon Labs' "Mona" agent ordered 120 eggs for a stoveless kitchen and 22.5kg of canned tomatoes for "fresh" sandwiches. Willison argues these stunts steal time from non-consenting suppliers.
"DeepSeek-TUI tops GitHub trending" on GitHub - +6,184 stars in a day for a Rust terminal coding agent wired to DeepSeek, flanked on the trending list by ruflo (multi-agent Claude orchestrator) and ByteDance deer-flow.
"claude code is not making your product better" on Lobsters - 31 upvotes and 15 comments of practitioner skepticism on a beat that's been mostly vendor optimism. Worth the read for the counter-narrative.


