A/B Testing

Design experiments with honest statistics.

Test report

Verdict: Tested · Works
Score: 8.4/10
Tested: Jun 15, 2026
Environment: Claude Code 2.x

Designed an experiment with correct sample-size math and honest run-time estimates; flagged our proposed test as underpowered, which was right.

Scored on four weighted criteria: install, triggering, output vs. baseline, and docs.

  • Installs cleanly 5/5
  • Triggers reliably 4/5
  • Output vs. baseline 8/10
  • Docs & honesty 4/5

What it does

Covers experiment design, hypothesis framing, and sample-size and duration math. Notably, it refuses to bless underpowered tests, which is the most common real-world A/B testing mistake.
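To make the sample-size and duration math concrete, here is a minimal sketch of the kind of calculation involved. It assumes a two-sided, two-proportion z-test with the usual normal approximation; the function names, the default alpha and power, and the traffic figures are all illustrative assumptions, not the skill's actual implementation or output.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, relative_lift, alpha=0.05, power=0.80):
    """Visitors needed per arm to detect a relative lift in a conversion rate.

    Hypothetical helper: uses the standard normal-approximation formula for
    a two-sided test comparing two proportions.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def run_time_days(n_per_arm, daily_visitors, arms=2):
    """Days needed to fill every arm, assuming traffic is split evenly."""
    return math.ceil(n_per_arm * arms / daily_visitors)


def is_underpowered(planned_per_arm, required_per_arm):
    """True when the planned sample falls short of the required sample."""
    return planned_per_arm < required_per_arm


if __name__ == "__main__":
    # Illustrative inputs (assumptions): 5% baseline conversion,
    # 10% relative lift, 2,000 eligible visitors per day.
    required = sample_size_per_arm(baseline=0.05, relative_lift=0.10)
    print("required per arm:", required)
    print("estimated run time (days):", run_time_days(required, daily_visitors=2000))
    print("5,000 per arm underpowered?", is_underpowered(5000, required))
```

With these illustrative inputs, a small relative lift on a low baseline rate needs tens of thousands of visitors per arm, so a test planned for a few thousand visitors would be flagged as underpowered.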

Install

Copy the skill folder into ~/.claude/skills/ (global) or .claude/skills/ (per-project), then restart Claude Code.

Once the skill is loaded, describe the change you want to test.
