Agentic QA
2026-05-08
Quiet day on the beat, but Jest v30.4.0 is the most substantive runner release of the week.
Mornin'. If you have ever shipped a library that is dual-published as CJS and ESM, you know the special hell of test files that import one shape and runtime code that exports the other. Jest v30.4.0 quietly fixed it yesterday: a runtime rewrite that lets you require() ES modules on Node 24.9+, plus fake timers for the new Temporal API. One thing to try this week: pull the version, delete one of your ESM shims, and see what falls out. My bet is less than you fear.
-Ben
In today's newsletter:
- Jest can finally require() ES modules
- Storybook's agent setup grows up
- Claude Code gets clean worktrees
RUNNER REWRITE
Jest v30.4.0 finally lets you require() ES modules, plus Temporal fake timers
via GitHub
The CJS/ESM cold war just got a peace treaty, and Jest is the country that signed it.
Yesterday's v30.4.0 release ships what the team calls a "major runtime rewrite for ESM stabilization," with the headline that you can finally require() ES modules on Node v24.9 and up. That is the long-standing pain point for hybrid CJS/ESM test suites, the one that has been generating cursed moduleNameMapper snippets in every Node monorepo for the last three years.
It is not just the require interop. The same minor adds fake-timer support for the Temporal proposal, React 19 in pretty-format, a --collect-tests discovery flag, per-project verbose and silent, and jest.config.mts support. A same-week v30.4.1 patch follows up with tuple-form custom runner config and a CJS-from-ESM default-export fix that closes a real footgun for libraries dual-published in both shapes.
Why your test box cares
- Native ESM require() on Node 24.9+ means most of your ESM workarounds can come out.
- Fake timers for Temporal.Duration, Temporal.Instant, Temporal.ZonedDateTime, and Temporal.Now.* make Jest the first major runner with first-class fake-time for Temporal.
- The v30.4.1 default-export fix matters specifically for dual-published libraries where Jest was picking the wrong shape.
Why it matters: If your team has been pinning Node or shimming around ESM in tests, this is the upgrade that lets you delete code, not add it.
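Before deleting shims, it can help to gate the cleanup on the Node version your CI actually runs. The sketch below is a hypothetical helper, not part of Jest: the function name and the constant are illustrative, and the 24.9 threshold is simply the one reported above.

```typescript
// Minimum Node major.minor for require()-ing ES modules under Jest v30.4.0,
// as reported in this issue (an assumption; it is not read from Jest itself).
const MIN_REQUIRE_ESM: readonly [number, number] = [24, 9];

// Returns true when a Node version string such as "24.9.0" or "v25.1.0"
// meets the minimum major.minor threshold.
function supportsRequireEsm(nodeVersion: string): boolean {
  const match = /^v?(\d+)\.(\d+)/.exec(nodeVersion.trim());
  if (match === null) {
    return false; // Unparseable version: keep the shims to be safe.
  }
  const major = Number(match[1]);
  const minor = Number(match[2]);
  const [minMajor, minMinor] = MIN_REQUIRE_ESM;
  return major > minMajor || (major === minMajor && minor >= minMinor);
}
```

In a real setup you would likely pass in process.versions.node and only drop a shim when the check returns true on every CI image.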
AGENT ONBOARDING
Storybook 10.4 alpha hardens its agentic setup for partial failures
via GitHub
Most agent demos are a clean repo and a single happy path. Real codebases are neither, and Storybook just admitted it.
Storybook's 10.4 alpha.18 ships a one-line headline with a much bigger meaning: "Agentic Setup: Allow failed stories to persist." The agent-driven onboarding flow no longer bails out the moment a single story errors. It keeps going, finishes the install, and leaves you with a working setup plus a list of broken stories to fix.
The same alpha also adds a react-vite to tanstack-react automigration, barrel-aware named-import resolution for change detection, and a High Contrast Mode a11y fix in ArgsTable. But the partial-failure tolerance is the one to watch.
What this signals
- Agent-run scaffolding is becoming a first-class install path next to the human CLI flow.
- Tolerating partial failure is the line between an agent that completes setup on a real repo and one that ragequits on the first broken story.
- Expect more QA tooling to follow the same pattern: agent flows that degrade gracefully, not ones that demand a pristine tree.
Why it matters: If you are evaluating AI-native test tools, "what happens when one thing fails" is now a more useful question than "what does the demo look like."
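The partial-failure pattern described above is easy to sketch in general terms. This is not Storybook's implementation: Step, runTolerant, and failedNames are hypothetical names, and the steps are synchronous to keep the example small.

```typescript
// Outcome of one setup step (for example, rendering a single story).
type StepResult =
  | { name: string; ok: true }
  | { name: string; ok: false; error: string };

// A hypothetical setup step: a name plus a function that may throw.
interface Step {
  name: string;
  run: () => void;
}

// Runs every step, recording failures instead of aborting on the first one.
function runTolerant(steps: Step[]): StepResult[] {
  return steps.map((step): StepResult => {
    try {
      step.run();
      return { name: step.name, ok: true };
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      return { name: step.name, ok: false, error };
    }
  });
}

// Lists the names of the steps that failed, for a follow-up fix list.
function failedNames(results: StepResult[]): string[] {
  return results.filter((r) => !r.ok).map((r) => r.name);
}
```

The design choice is that a failure becomes data rather than control flow: the run always completes, and the caller decides what to do with the list of broken items.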
CI FOOTGUN PATCH
Claude Code v2.1.133 gives agents a clean worktree base and effort-aware hooks
via GitHub
Three small switches in a patch release, and one of them is the kind of default you wish you had set before the post-mortem.
Yesterday's Claude Code v2.1.133 adds a worktree.baseRef config with two values: fresh (branch from origin/<default>) or head (branch from local HEAD, the old behavior). It also exposes $CLAUDE_EFFORT to hook scripts so CI can branch on the effort level the user requested, and finally fixes a bug where subagents were not discovering project, user, or plugin skills.
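To make the difference between the two values concrete, here is a rough sketch of the equivalent git operation for each. It is an illustration only, not how Claude Code implements the setting: worktreeAddArgs is a hypothetical helper, and the "main" and "origin" defaults are assumptions.

```typescript
// The two worktree.baseRef values named in the release summary above.
type BaseRef = "fresh" | "head";

// Builds the argument list for a `git worktree add` call that matches the
// described behavior. `defaultBranch` and `remote` are illustrative parameters.
function worktreeAddArgs(
  baseRef: BaseRef,
  branch: string,
  path: string,
  defaultBranch = "main",
  remote = "origin",
): string[] {
  // "fresh" starts from the remote default branch; "head" starts from local HEAD.
  const startPoint = baseRef === "fresh" ? `${remote}/${defaultBranch}` : "HEAD";
  return ["worktree", "add", "-b", branch, path, startPoint];
}
```

For example, worktreeAddArgs("fresh", "agent-1", "../wt-1") yields the arguments for git worktree add -b agent-1 ../wt-1 origin/main. A fresh base presumably also wants a git fetch beforehand so the remote ref is current.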
It is a smaller release than v2.1.132 the day prior, but the worktree default is the one that earns its keep at 2 a.m.
Set this before the next incident
- worktree.baseRef: fresh kills the "agent inherited my dirty branch" class of bug for teams running parallel agents in CI.
- $CLAUDE_EFFORT in hooks lets test and lint pipelines scale their work to the mode the user picked, instead of running the same suite for every effort level.
- The subagent skills fix is the kind of thing that quietly unblocks a lot of half-working internal tooling.
Why it matters: If you have parallel Claude Code agents anywhere near your CI, worktree.baseRef: fresh is a one-line config change with an outsized blast radius reduction.
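A hook that scales its work to the requested effort could look roughly like the sketch below. The effort values ("low", "medium") and the npm commands are assumptions chosen for illustration; check the release notes for the values $CLAUDE_EFFORT actually takes. commandsForEffort is a hypothetical helper.

```typescript
// Maps an effort level to the commands a hook might run.
// The level names and commands here are illustrative assumptions.
function commandsForEffort(effort: string | undefined): string[] {
  switch ((effort ?? "").toLowerCase()) {
    case "low":
      // Cheapest path: lint only.
      return ["npm run lint"];
    case "medium":
      // Lint plus only the tests related to changed files.
      return ["npm run lint", "npm test -- --onlyChanged"];
    default:
      // Any other value, or an unset variable, falls back to the full suite.
      return ["npm run lint", "npm test"];
  }
}
```

A hook script would read the variable from its environment and pass it in. Falling back to the full suite on unknown values keeps a typo or a newly added effort level from silently skipping tests.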
THE BIG IDEA
Agent-aware test infrastructure is graduating from demos to defaults. The test stack is quietly being reshaped around the assumption that agents, not just humans, are the ones running, writing, and debugging the suite.
Today alone gave us two concrete signals: Storybook's agentic setup learned to tolerate partial failure, and Claude Code added worktree.baseRef so parallel agents can branch from a clean origin instead of a developer's dirty tree. Earlier this week Vitest 4.1 shipped a reporter built for AI agents and Playwright 1.59 handed agents the browser keys. The runner and the browser-driver layers are both growing first-class agent surfaces.
The skeptical voice today is Resurf's Show HN, which argues none of this matters until agent E2E tests are actually reproducible. The surrounding tests-as-observability projects (FaultSense, Incidentary) hint that the community is treating tests as production probes precisely because it does not trust them as gates. The infrastructure is graduating. The determinism story still has homework to do.
WHAT ELSE IS SHIPPING
- Jest v30.4.1 - same-week patch with tuple-form custom runner config and a CJS-from-ESM default-export fix.
INTERESTING CONVERSATIONS
Interesting conversations we're following
- Show HN: Resurf, realistic, reproducible test framework for AI browser agents on Hacker News - pitches a deterministic environment for testing browser-using agents, exactly the gap QA orgs hit gating Claude or Codex changes in CI.
- Show HN: Turn E2E tests into observability signals (FaultSense) on Hacker News - reframes E2E suites as continuous production probes rather than gate-only artifacts.
- Show HN: An OTel exporter that posts the cause to your incident channel (Incidentary) on Hacker News - summarizes root cause from OTel and pushes it to the team's incident channel, adjacent to flake-triage workflows.
- Show HN: Kill-The-Backlog, self-hosted background agents on Hacker News - self-hosted background agents closing the loop between prompting, testing, and deploying with preview-env validation.
- Formatting an entire 25M-line codebase overnight: the rubyfmt story on Lobsters - how Stripe landed a sweeping repo-wide change without nuking everyone's tests, useful reference for CI-touching migrations.
- jj v0.41.0 released on Lobsters - relevant context for teams structuring branch-per-test-fix workflows around jj.
Also from TinyIdeas Media
- Agentic Business (for operators): What’s shipping in agentic AI, decoded for operators. Adoptable today vs. demoware.
- Agentic Builders (for engineers): Frameworks, OSS, MCP servers. Concrete releases, not press releases.
- Agentic Quality (for QA teams): AI-native testing tools, evals, reliability patterns. No benchmark vibes.