The 3-Step AI Value Test

Value Test Builder

Organisation name

Step 1: Choose One Workflow That Matters

Focus. Not "Where can we use AI?" but "Where does performance matter most?"

Workflow Name

Why this workflow matters (cost, risk, speed, quality)

Current measurable output (e.g. turnaround time, error rate, cost per case)

Baseline performance today

Step 2A: Break the Workflow Into Steps

List the process in 3–7 discrete steps.

Step 1

Step 2

Step 3

Step 4

Step 5

Step 2B: Test Each Step

For each step, ask:

Can AI perform it independently?

What would "good enough" reliability look like?

What is the actual reliability when tested?

Step	AI-Tested? (Y/N)	Required Reliability	Observed Reliability	Human Oversight?
1
2
3

Where reliability is below threshold, redesign or constrain the task — don't scale it yet.
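The threshold rule above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the test itself: the `StepResult` class, the step names, and the sample outcomes are all hypothetical, and observed reliability is assumed here to mean the share of sampled AI outputs that a human reviewer accepted.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    """One row of the Step 2B table (hypothetical structure)."""
    name: str
    required: float        # required reliability as a fraction (0.95 = 95%)
    outcomes: list[bool]   # True = reviewer accepted the AI output

    @property
    def observed(self) -> float:
        # Assumption: observed reliability = accepted outputs / outputs tested.
        if not self.outcomes:
            raise ValueError(f"no test outcomes recorded for {self.name!r}")
        return sum(self.outcomes) / len(self.outcomes)

    @property
    def meets_threshold(self) -> bool:
        return self.observed >= self.required


def recommend(step: StepResult) -> str:
    """Apply the rule: below threshold means redesign, not scale."""
    if step.meets_threshold:
        return "candidate for AI handling"
    return "redesign or constrain; do not scale yet"


# Illustrative data only; these are not real test results.
steps = [
    StepResult("Extract fields", required=0.95, outcomes=[True] * 19 + [False]),
    StepResult("Draft response", required=0.90, outcomes=[True] * 16 + [False] * 4),
]

for s in steps:
    print(f"{s.name}: observed {s.observed:.0%}, "
          f"required {s.required:.0%} -> {recommend(s)}")
```

In this made-up sample the first step just meets its threshold and the second falls short, so only the first would move forward to the human–AI split in Step 3A.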

Step 3A: Define the New Human–AI Split

Base the split on the Step 2B test results.

AI will reliably handle

Humans will retain responsibility for

Escalation trigger (when AI output needs review)

Step 3B: Define Measurable Impact

If this redesign works, what improves?

Primary metric

Baseline

90-day target

Accountable owner

Review date
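One way to compare the baseline, the 90-day target, and the figure measured at the review date is to express progress as the fraction of the gap closed. The sketch below assumes a single numeric primary metric; the function name `gap_closed` and all figures are hypothetical.

```python
def gap_closed(baseline: float, target: float, current: float) -> float:
    """Fraction of the baseline-to-target gap closed so far.

    Works whether the metric should fall (turnaround time, error rate)
    or rise, because the sign of the gap cancels out.
    """
    gap = target - baseline
    if gap == 0:
        raise ValueError("target must differ from baseline")
    return (current - baseline) / gap


# Illustrative figures only: turnaround time in hours.
baseline_hours = 48.0
target_hours = 24.0
at_review_hours = 30.0

progress = gap_closed(baseline_hours, target_hours, at_review_hours)
print(f"{progress:.0%} of the way to the 90-day target")
# prints "75% of the way to the 90-day target"
```

A value above 1 would mean the target was exceeded, and a negative value would mean the metric moved in the wrong direction since the baseline was taken.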

Leadership Commitment

"We will test and redesign ... because improving it will directly impact ...."

Plan Preview

Caversham House

Step 1: Choose One Workflow

Workflow: ...

Why it matters: ...

Current output: ...

Baseline today: ...

Step 2A: Workflow Steps

1. ...

2. ...

3. ...

4. ...

5. ...

Step 2B: Test Results

Step	AI-Tested?	Required	Observed	Human Oversight?
1	...	...	...	...
2	...	...	...	...
3	...	...	...	...

Step 3A: Human–AI Split

AI will handle: ...

Humans retain: ...

Escalation trigger: ...

Step 3B: Measurable Impact

Primary metric: ...

Baseline: ... → 90-day target: ...

Owner: ... | Review date: ...

Leadership Commitment

"We will test and redesign ... because improving it will directly impact ...."