Sandbox

Price the shape of a pull request.

This browser sandbox is a marketing demo of the Class1 method. The production path runs the Python engine in CI.

Scenario controls

How to read the sandbox

The browser demo teaches the decision shape; CI runs the real estimator.

The sliders expose the levers that usually move AI cost: call volume, output length, retry tail, added input context, and the budget threshold. Increasing any one of them can look manageable in isolation, but the budget-case risk appears when they compound.
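As a rough illustration of why the levers compound, the sketch below uses a simplified daily cost formula. It is not the sandbox's formula or the Class1 estimator; the `Scenario` fields, the prices, and the 30% increases are all assumptions chosen for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scenario:
    calls_per_day: float          # call volume
    input_tokens: float           # input context per call
    output_tokens: float          # output length per call
    retry_rate: float             # extra attempts per call (0.05 = 5%)
    input_price_per_mtok: float   # assumed USD per million input tokens
    output_price_per_mtok: float  # assumed USD per million output tokens


def daily_cost(s: Scenario) -> float:
    """Simplified cost: calls, inflated by retries, times per-call token cost."""
    per_call = (
        s.input_tokens * s.input_price_per_mtok
        + s.output_tokens * s.output_price_per_mtok
    ) / 1_000_000
    return s.calls_per_day * (1 + s.retry_rate) * per_call


# Hypothetical baseline; none of these numbers come from the Class1 engine.
base = Scenario(
    calls_per_day=10_000,
    input_tokens=2_000,
    output_tokens=500,
    retry_rate=0.05,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
)

# Raise each lever by 30% on its own, then all four together.
single = {
    "calls": replace(base, calls_per_day=base.calls_per_day * 1.3),
    "output": replace(base, output_tokens=base.output_tokens * 1.3),
    "retries": replace(base, retry_rate=base.retry_rate * 1.3),
    "context": replace(base, input_tokens=base.input_tokens * 1.3),
}
combined = replace(
    base,
    calls_per_day=base.calls_per_day * 1.3,
    output_tokens=base.output_tokens * 1.3,
    retry_rate=base.retry_rate * 1.3,
    input_tokens=base.input_tokens * 1.3,
)

for name, scenario in single.items():
    print(f"{name:8s} +30%: {daily_cost(scenario) / daily_cost(base):.2f}x")
print(f"all four +30%: {daily_cost(combined) / daily_cost(base):.2f}x")
```

With these assumed numbers, no single lever pushes cost past 1.3 times the baseline, while moving all four together lands at roughly 1.7 times.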

The real Class1 path does not rely on these sliders. In CI, takeoff.estimate_pr reads the actual file versions, scans the diff, builds baseline and PR scenarios, runs the Python Monte Carlo engine, and writes the policy payload. The browser page is deliberately labelled as illustrative so the site stays honest.

For sales, the sandbox is still useful. It lets a buyer feel the approval problem in thirty seconds: the same feature can be acceptable at expected cost and unacceptable at P90 after retries or output growth. That is the moment the blocking gate becomes intuitive.

Paste-a-diff product note

The real web demo is still an open item in GOAL.md.

This page sells the workflow and approximates the economics in the browser. The repo's remaining L1 item is a true paste-a-diff web demo over estimate_pr(file_versions).

python -m takeoff.estimate_pr \
  --base origin/main \
  --head HEAD \
  --json gate.json
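A CI step could then read gate.json and turn it into a pass or fail exit code. The payload schema is not documented on this page, so the `verdict` field and the `"pass"` value below are hypothetical placeholders, as are the helper names; treat this as a sketch of the pattern, not the engine's actual contract.

```python
import json
from pathlib import Path


def load_gate(path: str = "gate.json") -> dict:
    """Read the gate payload written by the --json flag."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def gate_exit_code(payload: dict) -> int:
    """Map a gate payload to a process exit code.

    ASSUMPTION: the payload carries a top-level "verdict" string and
    "pass" means the PR is within budget. Check the real schema first.
    Anything else, including a missing field, fails closed.
    """
    return 0 if payload.get("verdict") == "pass" else 1
```

In a pipeline, the step would call `sys.exit(gate_exit_code(load_gate()))`. Failing closed on a missing or unknown verdict keeps a malformed payload from silently approving a pull request.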

What the production demo should add next

This is the remaining L1 web surface item.

Paste a real diff: Let a visitor paste a unified diff and route it through estimate_pr(file_versions), instead of approximating with sliders.
Show the generated PR comment: Render the exact text that would be posted to GitHub, including estimate class, policy verdict, BOE, and footprint block.
Export gate JSON: Give technical evaluators the machine-readable output so they can imagine CI and dashboard integration.
Keep local privacy: A local/browser-first version can demonstrate sensitive code handling without requiring upload to a hosted service.
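One possible first step for the paste-a-diff item is to find which files a pasted unified diff touches before anything is handed to the estimator. The sketch below uses only the standard library; `changed_paths` is a hypothetical helper, and how its output would be turned into the `file_versions` argument is not specified here.

```python
import re

# Hunk header, e.g. "@@ -1,2 +1,3 @@"; a missing count means 1 line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def changed_paths(diff_text: str) -> list[str]:
    """Return target paths from the '+++' headers of a unified diff.

    Hunk bodies are skipped by counting lines, so an added line whose
    content happens to start with '++ ' is not mistaken for a header.
    Deleted files (target /dev/null) are omitted.
    """
    paths: list[str] = []
    old_left = new_left = 0
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:  # still inside a hunk body
            if line.startswith("+"):
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif not line.startswith("\\"):  # skip "\ No newline at end of file"
                old_left -= 1
                new_left -= 1
            continue
        match = HUNK_HEADER.match(line)
        if match:
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
            continue
        if line.startswith("+++ "):
            target = line[4:].split("\t", 1)[0].strip()
            if target == "/dev/null":
                continue
            if target.startswith("b/"):  # git-style prefix
                target = target[2:]
            if target not in paths:
                paths.append(target)
    return paths
```

Because this parsing needs no network access, it could run entirely on the visitor's machine, which fits the local-privacy item above.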

Sandbox questions

How evaluators should interpret the browser numbers.

Why does the sandbox use sliders instead of reading my repository?

This static page is built to sell and explain the workflow without uploading code. The production path reads real file versions through the Python takeoff engine.

Why is P90 usually much higher than P50?

The tail compounds systemic factors: output growth, retry pressure, context expansion, fallback rate, and demand spikes. Averages hide that approval risk.
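A small Monte Carlo sketch can make the gap concrete. This is not the Class1 engine: the distributions, their parameters, and the prices are assumptions picked for illustration, and fallback rate is left out to keep the example short.

```python
import math
import random


def simulate_daily_costs(n_trials: int = 10_000, seed: int = 0) -> list[float]:
    """Draw daily costs where every driver varies at once (assumed distributions)."""
    rng = random.Random(seed)
    costs = []
    for _ in range(n_trials):
        calls = 10_000 * rng.lognormvariate(0.0, 0.25)         # demand spikes
        output_tokens = 500 * rng.lognormvariate(0.0, 0.35)    # output growth
        input_tokens = 2_000 * rng.lognormvariate(0.0, 0.20)   # context expansion
        retry_rate = rng.betavariate(2, 18)                    # retry pressure
        # Assumed prices: 3.0 and 15.0 USD per million input/output tokens.
        per_call = (input_tokens * 3.0 + output_tokens * 15.0) / 1_000_000
        costs.append(calls * (1 + retry_rate) * per_call)
    return costs


def percentile(values: list[float], q: int) -> float:
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


costs = simulate_daily_costs()
p50 = percentile(costs, 50)
p90 = percentile(costs, 90)
print(f"P50 {p50:.2f}  P90 {p90:.2f}  ratio {p90 / p50:.2f}")
```

Because the drivers multiply, their individual spreads stack up in the upper tail, so a budget set near the median can look safe while the P90 case breaches it.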

What should I do when the sandbox fails the budget?

Treat it like the real gate: reduce output, cap retries, shrink context, lazy-load tools, choose a better-fit model, or bring in the budget owner.

What makes the real CI run stronger?

CI uses the actual diff, the frozen price basis, the model-fit recommendation, the BOE, the footprint block, and the machine-readable gate payload instead of this simplified browser formula.