Agentic QA

Issue #5 · 12 min read · By Ben

Quiet day on the beat, but Jest v30.4.0 is the most substantive runner release of the week.

Mornin'. If you have ever shipped a library that is dual-published as CJS and ESM, you know the special hell of test files that import one shape and runtime code that exports the other. Jest v30.4.0 quietly fixed it yesterday: a runtime rewrite that lets you require() ES modules on Node 24.9+, plus fake timers for the new Temporal API. One thing to try this week: pull the version, delete one of your ESM shims, and see what falls out. My bet is less than you fear.

-Ben

In today's newsletter:

Jest finally requires ES modules
Storybook's agent setup grows up
Claude Code gets clean worktrees

RUNNER REWRITE

Jest v30.4.0 finally lets you require() ES modules, plus Temporal fake timers

via GitHub

The CJS/ESM cold war just got a peace treaty, and Jest is the country that signed it.

Yesterday's v30.4.0 release ships what the team calls a "major runtime rewrite for ESM stabilization," with the headline that you can finally require() ES modules on Node 24.9 and up. The missing interop has been the long-standing pain point for hybrid CJS/ESM test suites, the one that has been generating cursed moduleNameMapper snippets in every Node monorepo for the last three years.

It is not just the require interop. The same minor adds fake-timer support for the Temporal proposal, React 19 support in pretty-format, a --collect-tests discovery flag, per-project verbose and silent settings, and jest.config.mts support. A same-week v30.4.1 patch follows up with tuple-form custom runner config and a CJS-from-ESM default-export fix that closes a real footgun for libraries published in both shapes.

Why your test box cares

Native ESM require() on Node 24.9+ means most of your ESM workarounds can come out.
Fake timers for Temporal.Duration, Temporal.Instant, Temporal.ZonedDateTime, and Temporal.Now.* make Jest the first major runner with first-class fake-time for Temporal.
The v30.4.1 default-export fix matters specifically for dual-published libraries where Jest was picking the wrong shape.
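If "picking the wrong shape" sounds abstract, here is a minimal sketch of the footgun itself. It is not Jest's or Node's internal code, and every name in it (Greeter, Loaded, resolveGreeter) is hypothetical. The assumption it illustrates: a default export can reach a consumer either bare (CJS style) or tucked under a default key (namespace style), and a test that expects one while receiving the other blows up.

```typescript
// Hypothetical names throughout; this is not Jest's or Node's internal code.
type Greeter = (name: string) => string;

// A consumer can receive either the bare export or a namespace-style wrapper.
type Loaded = Greeter | { default: Greeter };

// Normalize both shapes to the callable the test actually wants.
function resolveGreeter(loaded: Loaded): Greeter {
  return typeof loaded === "function" ? loaded : loaded.default;
}

const greet: Greeter = (name) => `hello ${name}`;

// Like `module.exports = greet` in a CJS build.
const cjsShape: Loaded = greet;
// Like `export default greet` when the consumer sees the module namespace.
const esmShape: Loaded = { default: greet };

console.log(resolveGreeter(cjsShape)("qa")); // hello qa
console.log(resolveGreeter(esmShape)("qa")); // hello qa
```

Shims like resolveGreeter are the sort of code this upgrade may let you delete, once the runner hands your tests a consistent shape.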

Why it matters: If your team has been pinning Node or shimming around ESM in tests, this is the upgrade that lets you delete code, not add it.

AGENT ONBOARDING

Storybook 10.4 alpha hardens its agentic setup for partial failures

via GitHub

Most agent demos are a clean repo and a single happy path. Real codebases are neither, and Storybook just admitted it.

Storybook's 10.4 alpha.18 ships a one-line headline with a much bigger meaning: "Agentic Setup: Allow failed stories to persist." The agent-driven onboarding flow no longer bails out the moment a single story errors. It keeps going, finishes the install, and leaves you with a working setup plus a list of broken stories to fix.

The same alpha also adds a react-vite to tanstack-react automigration, barrel-aware named-import resolution for change detection, and a High Contrast Mode a11y fix in ArgsTable. But the partial-failure tolerance is the one to watch.

What this signals

Agent-run scaffolding is becoming a first-class install path next to the human CLI flow.
Tolerating partial failure is the line between an agent that completes setup on a real repo and one that ragequits on the first broken story.
Expect more QA tooling to follow the same pattern: agent flows that degrade gracefully, not ones that demand a pristine tree.
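The pattern in the bullets above is easy to sketch. What follows is not Storybook's implementation, just a minimal illustration of "collect failures and keep going" versus "bail on the first error." The Story type, the setupStories function, and the sample stories are all hypothetical.

```typescript
// Hypothetical sketch of failure-tolerant setup; not Storybook's actual code.
type Story = { id: string; render: () => void };
type Failure = { id: string; error: string };
type SetupReport = { passed: string[]; failed: Failure[] };

// Try every story, record the ones that throw, and never abort the loop.
function setupStories(stories: Story[]): SetupReport {
  const passed: string[] = [];
  const failed: Failure[] = [];
  for (const story of stories) {
    try {
      story.render();
      passed.push(story.id);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failed.push({ id: story.id, error: message });
    }
  }
  return { passed, failed };
}

// Sample input: one broken story between two healthy ones.
const sampleStories: Story[] = [
  { id: "button", render: () => {} },
  { id: "modal", render: () => { throw new Error("missing provider"); } },
  { id: "card", render: () => {} },
];

const report = setupStories(sampleStories);
console.log(report.passed); // [ 'button', 'card' ]
console.log(report.failed); // [ { id: 'modal', error: 'missing provider' } ]
```

The design choice worth copying is the return value: setup succeeds, and the broken items come back as a to-do list instead of an exit code.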

Why it matters: If you are evaluating AI-native test tools, "what happens when one thing fails" is now a more useful question than "what does the demo look like."

CI FOOTGUN PATCH

Claude Code v2.1.133 gives agents a clean worktree base and effort-aware hooks

via GitHub

Three small switches in a patch release, and one of them is the kind of default you wish you had set before the post-mortem.

Yesterday's Claude Code v2.1.133 adds a worktree.baseRef config with two values: fresh (branch from origin/<default>) or head (branch from local HEAD, the old behavior). It also exposes $CLAUDE_EFFORT to hook scripts so CI can branch on the effort level the user requested, and finally fixes a bug where subagents were not discovering project, user, or plugin skills.

It is a smaller release than v2.1.132 the day prior, but the worktree default is the one that earns its keep at 2 a.m.

Set this before the next incident

worktree.baseRef: fresh kills the "agent inherited my dirty branch" class of bug for teams running parallel agents in CI.
$CLAUDE_EFFORT in hooks lets test and lint pipelines scale their work to the mode the user picked, instead of running the same suite for every effort level.
The subagent skills fix is the kind of thing that quietly unblocks a lot of half-working internal tooling.
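Here is one way a hook could scale its work to the requested effort. Treat it as a sketch under assumptions: the release notes only tell us the variable exists, so the "low" and "high" values, the Suite names, and the pickSuite function are all hypothetical. Check the actual values your version emits before wiring this into CI.

```typescript
// Hypothetical mapping from effort level to test scope.
// The "low" and "high" strings are assumed values, not confirmed ones.
type Suite = "smoke" | "unit" | "full";

function pickSuite(effort: string | undefined): Suite {
  switch ((effort ?? "").trim().toLowerCase()) {
    case "low":
      return "smoke"; // quick sanity checks only
    case "high":
      return "full"; // everything, including slow end-to-end tests
    default:
      return "unit"; // unset or unrecognized values get the middle tier
  }
}

// In a real hook script you would pass process.env.CLAUDE_EFFORT here.
console.log(pickSuite("low")); // smoke
console.log(pickSuite("HIGH")); // full
console.log(pickSuite(undefined)); // unit
```

Falling back to the middle tier on unknown input is deliberate: a hook that fails closed on a new effort value is its own 2 a.m. incident.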

Why it matters: If you have parallel Claude Code agents anywhere near your CI, worktree.baseRef: fresh is a one-line config change that shrinks the blast radius far more than its size suggests.

The big idea

Agent-aware test infrastructure is graduating from demos to defaults. The test stack is quietly being reshaped around the assumption that agents, not just humans, are the ones running, writing, and debugging the suite.

Today alone gave us two concrete signals: Storybook's agentic setup learned to tolerate partial failure, and Claude Code added worktree.baseRef so parallel agents can branch from a clean origin instead of a developer's dirty tree. Earlier this week Vitest 4.1 shipped a reporter built for AI agents and Playwright 1.59 handed agents the browser keys. The runner and the browser-driver layers are both growing first-class agent surfaces.

The skeptical voice today is Resurf's Show HN, which argues none of this matters until agent E2E tests are actually reproducible. The surrounding tests-as-observability projects (FaultSense, Incidentary) hint that the community is treating tests as production probes precisely because it does not trust them as gates. The infrastructure is graduating. The determinism story still has homework to do.

What else is shipping

Jest v30.4.1 - same-week patch with tuple-form custom runner config and a CJS-from-ESM default-export fix.

INTERESTING CONVERSATIONS

Interesting conversations we're following

Show HN: Resurf, realistic, reproducible test framework for AI browser agents on Hacker News - pitches a deterministic environment for testing browser-using agents, exactly the gap QA orgs hit gating Claude or Codex changes in CI.
Show HN: Turn E2E tests into observability signals (FaultSense) on Hacker News - reframes E2E suites as continuous production probes rather than gate-only artifacts.
Show HN: An OTel exporter that posts the cause to your incident channel (Incidentary) on Hacker News - summarizes root cause from OTel and pushes it to the team's incident channel, adjacent to flake-triage workflows.
Show HN: Kill-The-Backlog, self-hosted background agents on Hacker News - self-hosted background agents closing the loop between prompting, testing, and deploying with preview-env validation.
Formatting an entire 25M-line codebase overnight: the rubyfmt story on Lobsters - how Stripe landed a sweeping repo-wide change without nuking everyone's tests, useful reference for CI-touching migrations.
jj v0.41.0 released on Lobsters - relevant context for teams structuring branch-per-test-fix workflows around jj.

Also from TinyIdeas Media

Agentic Business

Agentic Builders