Feb 13, 2026 · 6 min read

What is a Pivot Score? How AI startup-idea grading works

A plain-English explainer of how PivotProof turns a startup idea into a single 0-100 number — and why the number is more useful than any narrative critique.


A Pivot Score is a single number from 0 to 100 that quantifies how survivable a startup idea looks when stress-tested against five hostile expert lenses simultaneously. It's the headline output of every PivotProof report — and it's the artifact that most founders find genuinely decision-changing.

This post explains what the score means, how it's calculated, what its limits are, and why we believe quantifying ideas — even imperfectly — is more useful to founders than any amount of unstructured AI feedback.

What the score actually measures

The Pivot Score is an estimate of how likely an idea is to survive the most common early-stage failure modes, as judged through five expert lenses: a hostile VC, a skeptical customer, a competitor founder, a domain expert, and a devil's advocate. Each persona issues a verdict (REJECT / REVISIT / BUY) and a 0-100 sub-score, and the final Pivot Score is a weighted aggregate of the five results.

Roughly speaking:

  • 0-34 (BURNED): The idea has multiple disqualifiers that historically kill startups: no clear ICP, commoditized category, unwinnable competitive dynamics, or a market that isn't actually large. Examples we've seen score in this range: most "AI for [generic vertical]" ideas, most no-code clones, most consumer subscription products in saturated categories.
  • 35-69 (ON THE FENCE): The idea has one or two real defensibility issues but isn't terminally broken. With the right pivot or a sharper wedge, this becomes a viable starting point. Most ideas that founders are seriously considering land here.
  • 70-100 (SURVIVOR): The idea passes most of the standard hostile filters. There's a clear painful problem, a credible wedge, a defensible angle, and a path to scale. Survivor ideas still fail — most ideas do — but they fail for execution reasons, not idea reasons.
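
For readers who think in code, the three bands above can be written as a small lookup. This is a minimal sketch: the function name `score_band` is made up for illustration, and only the cutoffs and band names come from this post.

```python
def score_band(score: int) -> str:
    """Map a 0-100 Pivot Score to the band names used in this post."""
    if not 0 <= score <= 100:
        raise ValueError("a Pivot Score must be between 0 and 100")
    if score <= 34:
        return "BURNED"
    if score <= 69:
        return "ON THE FENCE"
    return "SURVIVOR"


# Scores borrowed from examples elsewhere in this post.
for s in (22, 52, 70):
    print(s, score_band(s))
```

Note that the boundaries are hard cutoffs: a 69 and a 70 land in different bands even though the ideas behind them are probably very similar.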

How the score is computed

The mechanism, at a high level:

  1. Each persona receives the idea description and a hostile-by-default system prompt, then returns a verdict (REJECT/REVISIT/BUY) and a 0-100 sub-score.
  2. A synthesis pass aggregates the five results, weighting each by the persona's relevance to the idea (e.g., the Competitor persona's view weighs more heavily for ideas in crowded markets; the Domain Expert's weighs more heavily in regulated industries).
  3. A final calibration step normalizes the score against a corpus of historical reports so "52" means the same thing today as it did last month.
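
The aggregation step above can be sketched in Python. This is an illustration of a relevance-weighted average only, not PivotProof's actual implementation: the `PersonaResult` type, the `relevance` weights, and the sample numbers are all assumptions made for the example, and the calibration step is left out because this post does not describe how it works.

```python
from dataclasses import dataclass


@dataclass
class PersonaResult:
    persona: str
    verdict: str      # "REJECT", "REVISIT", or "BUY"
    sub_score: float  # 0-100
    relevance: float  # hypothetical weight; higher means more relevant to this idea


def aggregate_pivot_score(results: list[PersonaResult]) -> float:
    """Relevance-weighted mean of the persona sub-scores (uncalibrated)."""
    total_weight = sum(r.relevance for r in results)
    if total_weight <= 0:
        raise ValueError("at least one persona needs a positive relevance weight")
    return sum(r.sub_score * r.relevance for r in results) / total_weight


# Made-up results for an idea in a crowded market, where the
# Competitor persona is assumed to count double.
results = [
    PersonaResult("Hostile VC", "REJECT", 20, 1.0),
    PersonaResult("Skeptical Customer", "REVISIT", 40, 1.0),
    PersonaResult("Competitor Founder", "REVISIT", 60, 2.0),
    PersonaResult("Domain Expert", "REVISIT", 50, 1.0),
    PersonaResult("Devil's Advocate", "REJECT", 30, 1.0),
]

print(round(aggregate_pivot_score(results), 1))  # 43.3
```

With equal weights the same sub-scores would average to 40; doubling the Competitor persona's weight pulls the result toward its sub-score of 60.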

The score is generated by Anthropic's Claude with carefully tuned prompts designed to override the model's default agreeableness. We've written more about why hostile prompting matters.

Why a number, not a paragraph?

Because numbers are comparable across ideas and across time, and paragraphs aren't. If you run three variations of your idea through PivotProof and see scores of 18, 41, and 67, you instantly know which direction to push. If instead you get three paragraphs of nuanced critique, you'll subconsciously prefer the variation that flatters you the most.

The score is also easy to pass around: it can be pasted into a Slack channel, an investor email, or a co-founder argument. "The Pivot Score on your current idea is 22" is harder to wave away than "I have some concerns."

What the score doesn't tell you

It doesn't tell you whether you are the right founder for the idea. It doesn't measure execution speed, team strength, or fundraising ability. It can't reliably grade ideas in highly specialized or heavily regulated domains (deeptech bio, frontier physics, certain regulated fintech wedges) where the differentiating insight requires expertise outside the model's training data.

It is also only as good as its input: if you write a vague idea description, you'll get a vague score. The system rewards specificity. The most useful Pivot Scores come from idea submissions that include a concrete ICP, a stated wedge, and an honest description of the existing solution your customers use today.

The most useful way to use it

Don't run an idea once and treat the score as a verdict. Run the idea, read the critiques, write a sharper version, run it again. The delta between iterations is more useful than any single score. Founders who use PivotProof well typically run 3-5 variations of their idea over a week, with the score climbing as the wedge sharpens and the kill criteria tighten.
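
Tracking the delta takes very little tooling. The sketch below assumes you simply note down the score after each run; the helper name `score_deltas` and the run history are hypothetical, with the numbers borrowed from the three-variation example earlier in this post.

```python
def score_deltas(scores: list[int]) -> list[int]:
    """Change in Pivot Score between each consecutive pair of runs."""
    return [later - earlier for earlier, later in zip(scores, scores[1:])]


# Hypothetical history: one score per rewritten version of the idea.
history = [18, 41, 67]
print(score_deltas(history))  # [23, 26]
```

A run of positive deltas suggests the rewrites are addressing the critiques; a flat or negative delta is a cue to reread the persona feedback before rewriting again.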

Get your first Pivot Score — it's free to try one idea, and you'll see exactly what the number represents in your own context.

