What is the best way to measure GaaS ROI?

Measure the change in verified business outcomes against the full implementation and operating cost. Include human review, corrections, exceptions, reliability, adoption, and risk—not only model or task activity.

What baseline is needed before a GaaS pilot?

Record current task volume, labour time, elapsed time, handoffs, error and rework rates, missed work, operating cost, quality, exceptions, and customer or employee experience.

What is an accepted outcome?

An accepted outcome is a completed result that meets predefined source, accuracy, completeness, validation, destination, and reviewer requirements without substantial rework.

Should saved employee time be counted as financial benefit?

Count it carefully. Saved time creates financial value only when it reduces paid effort, avoids hiring or outsourcing, increases useful capacity, or is reassigned to measurable higher-value work.

How should GaaS risk affect ROI?

Include expected correction, incident, downtime, compliance, oversight, and provider-dependence costs. Use conservative assumptions rather than claiming the full value of unlikely avoided losses.

When should a GaaS pilot be scaled?

Scale only after it meets agreed thresholds for accepted outcomes, quality, cost per result, reviewer burden, reliability, risk, activity evidence, and user value.

How to Measure GaaS ROI

Measuring the return on Agentic as a Service requires more than counting model calls, completed runs, or hours that appear to be automated.

A GaaS service creates value only when it improves a real process.

That improvement may come from:

completing work faster;
reducing the cost to deliver an accepted outcome;
improving quality or consistency;
increasing capacity;
reducing missed work;
shortening waiting time;
enabling a capability that did not exist before; or
lowering operational risk.

ROI should therefore be measured at the level of the verified business outcome.

An agent that runs frequently but produces corrections, exceptions, duplicates, or unusable output may generate activity without generating value.

Start with the decision the measurement must support

Before choosing metrics, decide what decision the ROI analysis should inform.

Common decisions include:

whether to continue a pilot;
whether to move into production;
whether to expand to more users;
whether to add write access;
whether to replace an existing process;
whether to renew a provider;
whether to build internally;
whether to change models or tools; or
whether to retire the service.

Different decisions require different evidence.

A short pilot may focus on task quality and review effort.

A renewal decision should include full operating cost, reliability, support, adoption, and switching risk.

Define the unit of value

Select one primary unit that represents useful completed work.

Examples include:

accepted report;
correctly routed request;
approved customer-response draft;
resolved service case;
verified document package;
completed research brief;
approved record update;
detected project risk;
processed supplier review; or
completed recurring workflow.

The unit should be specific enough to count consistently.

“Agent run” is usually not a good value unit because one run may fail, repeat, or produce an outcome that a person rejects.

Distinguish activity, output, and outcome

These three measurement levels should not be confused.

Level	Example	What it shows
Activity	Model calls, tool calls, workflow runs	How much the system did
Output	Draft, summary, classification, proposed update	What the system produced
Outcome	Accepted report, correct routing, confirmed update	Whether the work created usable value

Activity is useful for cost and diagnosis.

Output is useful for quality review.

Outcome is the strongest basis for ROI.

Establish the current-process baseline

Measure the existing process before introducing GaaS.

A baseline should cover a representative period and include:

task volume;
employee time;
waiting time;
total completion time;
number of handoffs;
error rate;
rework;
missed work;
exception rate;
operating cost;
technology cost;
customer or employee experience; and
risk events.

Without a baseline, later improvement claims become assumptions.

Decompose the process

Break the task into stages.

For example:

Stage	Current effort
Collect project updates	45 minutes
Identify missing responses	20 minutes
Review deadlines and blockers	35 minutes
Prepare report	40 minutes
Correct and distribute	20 minutes

This reveals where GaaS creates value and where human effort remains.

It also prevents the organisation from attributing every process improvement to the agentic service.
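
As a rough illustration, the stage durations in the example table can be totalled to show how much of the current effort each stage represents. This is only a sketch: the function name `stage_shares` is an assumption made for the example, and the minutes are the ones listed in the table.

```python
# Stage durations in minutes, copied from the example table.
stages = {
    "Collect project updates": 45,
    "Identify missing responses": 20,
    "Review deadlines and blockers": 35,
    "Prepare report": 40,
    "Correct and distribute": 20,
}

total_minutes = sum(stages.values())


def stage_shares(stage_minutes):
    """Return each stage's share of the total effort (0.0 to 1.0)."""
    total = sum(stage_minutes.values())
    if total <= 0:
        raise ValueError("total effort must be positive")
    return {name: minutes / total for name, minutes in stage_minutes.items()}


for name, share in stage_shares(stages).items():
    print(f"{name}: {share:.0%}")
```

In this example the five stages total 160 minutes, so a service that only automates collection addresses roughly 28 percent of the current effort.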

Measure elapsed time and labour separately

Labour time is the active human effort required.

Elapsed time is the total duration from request to accepted outcome.

GaaS may reduce one without reducing the other.

For example, an agent may prepare a draft in five minutes but leave it in an approval queue for two days.

Measure:

human handling time;
agent processing time;
queue time;
approval time;
correction time;
exception resolution time; and
total completion time.

This helps identify whether the service removes work or only moves the delay.
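
The difference between labour and elapsed time can be sketched with a small timeline. The example below assumes the steps happen one after another and uses the five-minute draft and two-day queue from the text; the 15-minute review and all names (`Segment`, `labour_minutes`, `elapsed_minutes`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    minutes: float
    human: bool  # True when a person is actively working during this segment


def labour_minutes(segments):
    """Active human effort only."""
    return sum(s.minutes for s in segments if s.human)


def elapsed_minutes(segments):
    """Total duration, assuming the segments run sequentially."""
    return sum(s.minutes for s in segments)


segments = [
    Segment("agent processing", 5, human=False),
    Segment("approval queue", 2 * 24 * 60, human=False),  # two days of waiting
    Segment("human review", 15, human=True),  # hypothetical review effort
]

print("labour minutes:", labour_minutes(segments))
print("elapsed minutes:", elapsed_minutes(segments))
```

Here labour is only 15 minutes while elapsed time is 2,900 minutes, which shows that the queue, not the agent, determines speed to outcome.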

Calculate the full GaaS cost

Total cost includes more than the provider invoice.

Include:

subscriptions;
task or usage charges;
model usage;
tool and API charges;
storage;
integrations;
implementation;
workflow design;
data preparation;
testing;
security and privacy review;
employee training;
human review;
exception handling;
monitoring;
support;
incident response;
maintenance;
change management; and
migration or exit costs.

An apparently inexpensive agent may become costly when review and correction effort are included.

Separate one-time and recurring costs

One-time costs may include:

provider selection;
implementation;
integration;
initial data preparation;
workflow design;
pilot testing;
governance setup; and
training.

Recurring costs may include:

provider charges;
model and tool usage;
monitoring;
reviews;
corrections;
support;
access audits;
ongoing evaluation;
updates; and
incident handling.

Keep the categories separate so the organisation can understand payback time and steady-state economics.

Use an accepted-outcome rate

The accepted-outcome rate measures how often GaaS produces a result that can be used without substantial rework.

A practical calculation is:

Accepted-outcome rate = Accepted outcomes ÷ Total completed attempts

Define “accepted” before measurement begins.

It may require:

correct source use;
complete required fields;
valid formatting;
no unsupported claims;
correct destination;
successful validation; and
reviewer approval.

A lenient definition can make performance look better than it is.
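
A minimal sketch of this calculation follows. It assumes each completed attempt is recorded as a dictionary of pass/fail checks mirroring the criteria listed above; the field names and the sample data are hypothetical.

```python
# Hypothetical field names, one per acceptance criterion listed in the text.
ACCEPTANCE_CRITERIA = (
    "correct_source_use",
    "required_fields_complete",
    "valid_formatting",
    "no_unsupported_claims",
    "correct_destination",
    "validation_passed",
    "reviewer_approved",
)


def is_accepted(result):
    """A result counts as accepted only if every criterion is explicitly True."""
    return all(result.get(name) is True for name in ACCEPTANCE_CRITERIA)


def accepted_outcome_rate(results):
    """Accepted outcomes divided by total completed attempts."""
    if not results:
        raise ValueError("no completed attempts to measure")
    accepted = sum(1 for r in results if is_accepted(r))
    return accepted / len(results)


# Hypothetical pilot data: 17 results pass every check, 3 fail reviewer approval.
passing = {name: True for name in ACCEPTANCE_CRITERIA}
failing = {**passing, "reviewer_approved": False}
results = [passing] * 17 + [failing] * 3

rate = accepted_outcome_rate(results)
print(f"accepted-outcome rate: {rate:.0%}")
```

Treating a missing check as a failure is a deliberate choice in this sketch: it keeps an incomplete record from inflating the rate.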

Measure human correction

Track how much human effort is needed after the service completes its part.

Record:

minor edits;
factual corrections;
missing information;
wrong classifications;
source changes;
destination corrections;
complete rework; and
rejected output.

Useful measures include:

percentage requiring correction;
average correction time;
correction cost per outcome; and
reasons for correction.

The service creates limited efficiency when employees spend as long fixing the output as they previously spent doing the work.
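
The correction measures above could be derived from a simple review log. The sketch below assumes one entry per outcome giving the minutes spent correcting it (zero when none was needed); the log values and the hourly rate are hypothetical.

```python
from statistics import mean


def correction_summary(minutes_per_outcome, hourly_rate):
    """Summarise correction effort across all outcomes in the log."""
    total = len(minutes_per_outcome)
    if total == 0:
        raise ValueError("the review log is empty")
    corrected = [m for m in minutes_per_outcome if m > 0]
    return {
        "share_requiring_correction": len(corrected) / total,
        "avg_correction_minutes": mean(corrected) if corrected else 0.0,
        # Correction cost spread over every outcome, corrected or not.
        "correction_cost_per_outcome": sum(corrected) / 60 * hourly_rate / total,
    }


# Hypothetical log for ten outcomes and an assumed loaded labour rate.
correction_minutes = [0, 0, 12, 0, 30, 0, 0, 8, 0, 0]
summary = correction_summary(correction_minutes, hourly_rate=80.0)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Reasons for correction are not modelled here; in practice they would be recorded alongside each entry so recurring causes can be counted.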

Measure exception handling

Agentic services should route uncertain or unsupported cases to people.

Exceptions are not automatically failures.

They may show that the service stops safely.

Measure:

exception rate;
correct escalation rate;
false escalation rate;
missed escalation rate;
time to resolve;
reviewer effort; and
recurring exception causes.

A low exception rate is not always better: it may mean the agent is continuing through cases where it should stop.
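
One possible way to compute the escalation rates is sketched below. It assumes a reviewer has labelled each case with whether it should have been escalated, and the denominators chosen for each rate are an assumption of this example rather than a standard definition.

```python
def _ratio(numerator, denominator):
    # Returns 0.0 for an empty denominator; a simplification for this sketch.
    return numerator / denominator if denominator else 0.0


def escalation_metrics(cases):
    """cases: list of (was_escalated, should_have_been_escalated) booleans."""
    total = len(cases)
    escalated = sum(1 for was, _ in cases if was)
    needed = sum(1 for _, should in cases if should)
    correct = sum(1 for was, should in cases if was and should)
    return {
        "exception_rate": _ratio(escalated, total),
        # Share of escalations that were justified.
        "correct_escalation_rate": _ratio(correct, escalated),
        # Share of escalations that were unnecessary.
        "false_escalation_rate": _ratio(escalated - correct, escalated),
        # Share of cases needing escalation that the agent continued through.
        "missed_escalation_rate": _ratio(needed - correct, needed),
    }


# Hypothetical sample: 3 correct, 1 false, 1 missed, 5 handled normally.
cases = (
    [(True, True)] * 3
    + [(True, False)]
    + [(False, True)]
    + [(False, False)] * 5
)
metrics = escalation_metrics(cases)
print(metrics)
```

The missed-escalation rate is the one that a low overall exception rate can hide, which is why it is tracked separately.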

Measure quality

Quality metrics depend on the use case.

They may include:

factual accuracy;
source accuracy;
field accuracy;
completeness;
correct classification;
correct routing;
valid calculations;
policy compliance;
consistent formatting;
correct destination;
duplicate prevention; and
reviewer acceptance.

Use representative samples and clear scoring criteria.

Avoid relying only on general impressions such as “the output looks good.”

Measure speed to outcome

Speed to outcome compares the total time needed to reach an accepted result.

Measure:

baseline completion time;
GaaS completion time;
review time;
correction time;
waiting time; and
exception time.

Faster execution is valuable only when quality remains acceptable.

A service that produces a wrong result immediately is not more efficient.

Measure cost to serve

Cost to serve represents the total cost of delivering one accepted outcome.

Include:

human labour;
provider charges;
model usage;
tool usage;
review;
correction;
monitoring;
support; and
allocated operating costs.

A practical comparison is:

Cost to serve = Total process cost ÷ Accepted outcomes

Compare the baseline and GaaS-enabled process using the same definition.
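
The comparison can be sketched as follows. All figures are illustrative assumptions, including the idea that 18 of 20 GaaS-assisted results are accepted; the point is only that both processes are divided by accepted outcomes, not by attempts.

```python
def cost_per_accepted_outcome(total_process_cost, accepted_outcomes):
    """Total process cost divided by the number of accepted outcomes."""
    if accepted_outcomes <= 0:
        raise ValueError("at least one accepted outcome is required")
    return total_process_cost / accepted_outcomes


# Hypothetical monthly figures in euros.
baseline_cost_to_serve = cost_per_accepted_outcome(3200.0, accepted_outcomes=20)
gaas_cost_to_serve = cost_per_accepted_outcome(1620.0, accepted_outcomes=18)

print(f"baseline: €{baseline_cost_to_serve:.2f} per accepted outcome")
print(f"GaaS:     €{gaas_cost_to_serve:.2f} per accepted outcome")
```

Dividing by accepted outcomes means that rejected results raise the unit cost instead of disappearing from the comparison.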

Measure capacity

GaaS may create value by increasing the amount of work a team can handle.

Measure:

tasks completed per day or week;
backlog size;
waiting time;
cases per employee;
work completed outside peak hours;
percentage handled without manual preparation; and
ability to absorb demand spikes.

Higher volume should not hide lower quality.

Track capacity together with acceptance and correction rates.

Measure new capability

Some value does not come from reducing an existing cost.

GaaS may enable work that was previously impractical, such as:

daily monitoring instead of monthly review;
source-based briefings for every account;
faster after-hours intake;
wider document review;
earlier risk detection;
multilingual preparation;
continuous exception tracking; or
service for low-volume cases that previously received no support.

Define the business value of the new capability.

Possible measures include:

risks detected earlier;
opportunities identified;
additional customers served;
additional projects reviewed;
prevented delays;
improved conversion;
reduced backlog; or
avoided external cost.

Measure risk and avoided loss carefully

GaaS can reduce or increase risk.

Potential positive measures include:

fewer missed checks;
fewer duplicate actions;
earlier escalation;
more complete records;
stronger evidence retention;
reduced access scope; and
faster incident detection.

Potential negative measures include:

wrong writes;
privacy incidents;
unsupported claims;
excessive permissions;
outages;
duplicate actions;
missed escalations; and
provider failures.

Avoid claiming the full value of a hypothetical avoided loss without a defensible method.

Use historical frequency, probability ranges, and conservative assumptions.

Calculate risk-adjusted value

A simple business case may overstate returns when it ignores failures.

Risk-adjusted value should subtract:

expected correction cost;
expected exception cost;
expected incident cost;
expected downtime impact;
compliance overhead;
additional review cost; and
provider concentration risk.

The purpose is not to assign perfect monetary values to every risk.

It is to prevent the analysis from treating agentic execution as consequence-free.
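
A simple way to approximate this is to weight each risk by an estimated probability and subtract the result from the gross benefit. The sketch below is one such approach under stated assumptions: every probability, impact, and overhead figure is hypothetical, and the function name is illustrative.

```python
def risk_adjusted_value(gross_benefit, risk_items, fixed_overheads=0.0):
    """Subtract expected risk costs and fixed overheads from gross benefit.

    risk_items maps a risk name to (monthly probability, cost if it occurs).
    """
    expected_risk_cost = sum(
        probability * impact for probability, impact in risk_items.values()
    )
    return gross_benefit - expected_risk_cost - fixed_overheads


# Hypothetical monthly figures in euros.
risk_items = {
    "incident": (0.05, 2000.0),  # expected cost 100
    "downtime": (0.10, 500.0),   # expected cost 50
}
value = risk_adjusted_value(
    gross_benefit=1580.0,
    risk_items=risk_items,
    fixed_overheads=120.0,  # assumed compliance and extra review overhead
)
print(f"risk-adjusted monthly value: €{value:.2f}")
```

Using ranges instead of single probabilities would be closer to the conservative approach described above; single values are used here only to keep the arithmetic visible.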

Measure adoption

A technically capable service creates little value when people do not use it correctly.

Track:

eligible users;
active users;
repeat users;
tasks submitted;
tasks abandoned;
approval response time;
manual workarounds;
user-reported trust;
user satisfaction; and
reasons for non-use.

Low adoption may result from poor training, weak task fit, slow performance, unclear responsibility, or lack of trust.

Measure reviewer burden

Human oversight has a cost.

Track:

number of approvals;
average review time;
rejected proposals;
corrections per approval;
reviewer backlog;
missed approval windows;
approval fatigue; and
escalation quality.

A service that saves ten minutes of preparation but adds fifteen minutes of review does not create a positive labour return.

Review may still be worthwhile for quality or risk reasons, but that value should be stated honestly.

Measure reliability

GaaS ROI depends on whether the service remains available and predictable.

Track:

successful runs;
failed runs;
tool failures;
source failures;
model failures;
authentication failures;
schedule failures;
partial completions;
recovery time;
repeated incidents; and
tasks requiring manual restart.

Reliability problems create hidden labour and delay.

Measure automation quality, not automation percentage

A high automation rate can be misleading.

An organisation might automate many low-value steps while leaving the expensive review and exception work unchanged.

Instead of asking only:

What percentage of tasks was automated?

Also ask:

How many accepted outcomes were completed?
How much human effort remained?
Which steps still caused delay?
How often did the service need correction?
Did the service reduce or create risk?
Did customers or employees receive better results?

Use a practical ROI formula

A simple ROI formula is:

ROI (%) = (Total measurable benefit − Total GaaS cost) ÷ Total GaaS cost × 100

Total measurable benefit may include:

labour savings;
reduced external spend;
increased capacity;
reduced delay;
reduced rework;
increased revenue;
avoided loss; and
valued new capability.

Total GaaS cost should include implementation and ongoing operation.

Use conservative, documented assumptions.
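
The ROI formula in this section can be expressed as a short function. This is a minimal sketch: the function name and the annual figures are assumptions chosen to make the arithmetic easy to follow.

```python
def roi_percent(total_benefit, total_cost):
    """(Total measurable benefit - total GaaS cost) / total GaaS cost x 100."""
    if total_cost <= 0:
        raise ValueError("total GaaS cost must be positive")
    return (total_benefit - total_cost) / total_cost * 100


# Hypothetical annual figures in euros, including implementation and operation.
roi = roi_percent(total_benefit=30000.0, total_cost=24000.0)
print(f"ROI: {roi:.1f}%")
```

A benefit equal to the cost gives an ROI of zero, and a benefit below the cost gives a negative figure, which is a useful sanity check on the inputs.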

Calculate payback period

The payback period shows how long it takes for benefits to recover the initial investment.

A practical calculation is:

Payback period (months) = Initial implementation cost ÷ Net monthly benefit

This is useful when comparing pilots or deciding whether a use case is worth scaling.

A positive long-term ROI may still be unattractive when the payback period is too long for the organisation's needs.
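
The payback calculation can be sketched as below. The figures are hypothetical, and returning infinity when the net monthly benefit is zero or negative is a design choice of this example to signal that the investment is never recovered.

```python
import math


def payback_months(initial_cost, net_monthly_benefit):
    """Initial implementation cost divided by net monthly benefit."""
    if initial_cost < 0:
        raise ValueError("initial cost cannot be negative")
    if net_monthly_benefit <= 0:
        return math.inf  # benefits never recover the investment
    return initial_cost / net_monthly_benefit


# Hypothetical figures in euros.
months = payback_months(initial_cost=6000.0, net_monthly_benefit=1500.0)
print(f"payback period: {months:.1f} months")
```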

Use scenarios instead of one confident estimate

Create at least three cases:

conservative;
expected; and
optimistic.

Vary assumptions such as:

task volume;
acceptance rate;
correction time;
provider cost;
model usage;
reviewer effort;
adoption;
reliability; and
value per outcome.

Scenario analysis shows which assumptions have the greatest effect on the result.
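
A scenario comparison might look like the sketch below. It uses a deliberately simple model (volume × acceptance rate × value per outcome, minus monthly cost), and every number is a hypothetical assumption rather than a benchmark.

```python
def monthly_net_value(volume, acceptance_rate, value_per_outcome, monthly_cost):
    """Value of accepted outcomes minus the monthly cost of the service."""
    return volume * acceptance_rate * value_per_outcome - monthly_cost


# Hypothetical assumptions for each case (euros per month).
scenarios = {
    "conservative": dict(volume=15, acceptance_rate=0.70,
                         value_per_outcome=150.0, monthly_cost=1800.0),
    "expected": dict(volume=20, acceptance_rate=0.85,
                     value_per_outcome=160.0, monthly_cost=1620.0),
    "optimistic": dict(volume=25, acceptance_rate=0.95,
                       value_per_outcome=170.0, monthly_cost=1500.0),
}

net_values = {name: monthly_net_value(**a) for name, a in scenarios.items()}

for name, net in net_values.items():
    print(f"{name}: €{net:,.2f} per month")
```

With these inputs the conservative case is negative while the other two are positive, which is the kind of spread that shows how sensitive the result is to acceptance rate and volume.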

Compare with the true alternative

The alternative is not always the original manual process.

Compare GaaS with realistic options such as:

improving the current workflow;
using a SaaS feature;
building deterministic automation;
using one AI capability;
building an internal agent;
outsourcing the work;
changing the process; or
doing nothing.

This avoids choosing GaaS merely because it performs better than an inefficient baseline.

Design the pilot for measurement

A measurable pilot should define:

baseline period;
pilot period;
eligible task volume;
control or comparison group when possible;
accepted-outcome definition;
quality scoring;
reviewers;
cost categories;
success thresholds;
risk limits;
stop conditions; and
final decision date.

Use representative cases rather than only easy examples.

Create a compact scorecard

A practical GaaS scorecard may include:

Area	Example metric
Outcome	Accepted outcomes per week
Quality	Percentage passing review without material correction
Speed	Median time to accepted outcome
Cost	Total cost per accepted outcome
Capacity	Additional tasks completed
Oversight	Average review time
Exceptions	Correctly escalated cases
Reliability	Successful scheduled runs
Risk	Wrong or duplicate external actions
Adoption	Repeat use by eligible users

Keep the scorecard focused.

Too many metrics can hide the measures that determine the business decision.

Set scale-up thresholds

Define what must be true before expanding.

Example thresholds might include:

accepted-outcome rate above the agreed level;
material-correction rate below the agreed level;
no unresolved high-impact incident;
cost per accepted outcome below the baseline;
reviewer burden within available capacity;
reliable activity evidence;
stable tool and schedule performance; and
positive user feedback.

Scale should be earned through evidence.
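
Agreed thresholds can be written down as explicit checks so the scale-up decision is reproducible. The sketch below covers only the quantitative thresholds; the metric names, limits, and measured values are all hypothetical.

```python
import operator

# Hypothetical thresholds: metric name -> (comparison, agreed limit).
THRESHOLDS = {
    "accepted_outcome_rate": (operator.ge, 0.90),
    "material_correction_rate": (operator.le, 0.10),
    "unresolved_high_impact_incidents": (operator.eq, 0),
    "cost_per_accepted_outcome": (operator.lt, 160.0),  # assumed baseline cost
}


def failed_thresholds(measured, thresholds):
    """Return the names of thresholds that are missing or not met."""
    failures = []
    for name, (compare, limit) in thresholds.items():
        if name not in measured or not compare(measured[name], limit):
            failures.append(name)
    return failures


# Hypothetical pilot results.
measured = {
    "accepted_outcome_rate": 0.92,
    "material_correction_rate": 0.15,
    "unresolved_high_impact_incidents": 0,
    "cost_per_accepted_outcome": 81.0,
}

failures = failed_thresholds(measured, THRESHOLDS)
print("ready to scale" if not failures else f"not ready: {failures}")
```

A missing measurement is treated as a failure here, so a pilot cannot pass simply because a metric was never collected.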

Set pause and retirement thresholds

Define when the service should be paused or retired.

Examples include:

repeated wrong actions;
duplicate writes;
loss of source reliability;
uncontrolled cost;
falling acceptance;
rising correction time;
reviewer overload;
unresolved incidents;
low adoption;
provider instability; or
a simpler alternative becoming available.

ROI measurement should support stopping decisions, not only expansion.

Review ROI by stakeholder

Different groups may experience different costs and benefits.

Employees may gain:

less repetitive work;
faster context gathering;
better-prepared drafts; and
more time for judgement.

Managers may gain:

better visibility;
shorter backlogs;
more consistent processes; and
measurable capacity.

Customers may gain:

faster responses;
clearer status;
fewer missed handoffs; and
more consistent service.

Security, privacy, and operations teams may incur:

access reviews;
monitoring;
incident work;
policy development; and
provider oversight.

A complete ROI analysis includes value and workload across stakeholders.

Review ROI over time

GaaS economics change after launch.

Costs may fall as the workflow improves.

They may also rise because of:

increasing volume;
more expensive models;
extra tools;
longer context;
broader use;
added monitoring;
new review requirements; or
provider price changes.

Review ROI:

after the pilot;
after production launch;
after major workflow changes;
after model or provider changes;
after incidents;
before renewal; and
at a regular operating interval.

Avoid common ROI mistakes

Common mistakes include:

measuring model output instead of accepted outcomes;
having no baseline;
ignoring implementation cost;
treating reassigned time as automatic cash savings;
excluding human review;
ignoring correction and exception work;
counting every agent run as success;
assuming maximum adoption;
monetising speculative benefits aggressively;
ignoring risk and incidents;
using only ideal test cases; and
continuing a weak pilot because the technology appears promising.

How Feluda can support ROI measurement

Feluda can help users observe and compare controlled agentic workflows.

Workbench Activity can support review of interactive tool calls and results.

Studio can make process steps, branches, validation, and error paths visible.

RunFlows can show execution progress, intermediate output, warnings, and errors.

Schedule Manager can provide recurring-run history and support pausing or resuming automation.

Local and cloud provider options can be compared for task quality, runtime, privacy requirements, and operating cost.

These capabilities can help teams connect workflow behaviour with outcome, quality, reliability, and cost measures.

A practical GaaS ROI example

Consider a weekly project-reporting process.

Baseline

20 reports per month;
120 minutes of employee time per report;
10 percent require rework;
average elapsed time of two days;
monthly labour cost of €3,200.

GaaS pilot

20 reports per month;
25 minutes of human review per report;
15 percent require material correction;
average elapsed time of four hours;
provider, model, and tool cost of €700;
monitoring and support cost of €250;
remaining labour cost for human review and correction of €670.

Monthly comparison

The pilot's direct monthly operating cost is:

€700 provider and usage
+ €250 monitoring and support
+ €670 human review and correction
= €1,620

The direct monthly difference from the baseline is:

€3,200 − €1,620 = €1,580

Before claiming ROI, the organisation should also include implementation cost, validate the report quality, and confirm that the freed employee time creates useful capacity.
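
The arithmetic in this example can be checked with a short script. The monthly figures come from the example itself; the implementation cost is a hypothetical value added only to show how it changes the first-year picture.

```python
# Figures from the example (euros per month unless stated otherwise).
reports_per_month = 20
baseline_minutes_per_report = 120
baseline_labour_cost = 3200.0

provider_and_usage = 700.0
monitoring_and_support = 250.0
review_and_correction = 670.0

pilot_direct_cost = provider_and_usage + monitoring_and_support + review_and_correction
direct_difference = baseline_labour_cost - pilot_direct_cost

# Hourly labour rate implied by the baseline figures.
baseline_hours = reports_per_month * baseline_minutes_per_report / 60
implied_hourly_rate = baseline_labour_cost / baseline_hours

# Hypothetical one-time implementation cost, not given in the example.
assumed_implementation_cost = 6000.0
first_year_net = 12 * direct_difference - assumed_implementation_cost

print(f"pilot direct cost:   €{pilot_direct_cost:,.0f}")
print(f"direct difference:   €{direct_difference:,.0f}")
print(f"implied hourly rate: €{implied_hourly_rate:,.0f}")
print(f"first-year net:      €{first_year_net:,.0f}")
```

Under that assumed implementation cost, the first-year net difference is smaller than twelve times the monthly figure, and it still counts all freed labour time as a cash benefit, which the guidance above warns against doing without verification.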

Final ROI checklist

Before presenting a GaaS business case, confirm that:

the decision is clear;
the unit of value is defined;
the baseline is measured;
labour and elapsed time are separate;
one-time costs are included;
recurring costs are included;
human review is included;
correction effort is included;
accepted outcomes are counted;
quality is measured;
exceptions are measured;
reliability is measured;
adoption is measured;
risk and incidents are considered;
new capability is valued conservatively;
realistic alternatives are compared;
scenarios are documented;
scale-up thresholds are defined;
pause thresholds are defined; and
ROI will be reviewed after changes.

The practical conclusion

GaaS ROI is not the number of tasks an agent starts.

It is the measurable value of accepted outcomes after every cost, correction, exception, delay, and risk control is considered.

Begin with a credible baseline.

Measure the complete process.

Count human oversight.

Use conservative assumptions.

Separate efficiency from new capability.

Expand only when quality, cost, reliability, and risk evidence support the decision.

The strongest GaaS business case is not the one with the largest theoretical saving.

It is the one that can be measured repeatedly and still demonstrates useful value under normal operating conditions.