How to Measure GaaS ROI
Measuring the return on Agentic as a Service (GaaS) requires more than counting model calls, completed runs, or hours that appear to be automated.
A GaaS service creates value only when it improves a real process.
That improvement may come from:
- completing work faster;
- reducing the cost to deliver an accepted outcome;
- improving quality or consistency;
- increasing capacity;
- reducing missed work;
- shortening waiting time;
- enabling a capability that did not exist before; or
- lowering operational risk.
ROI should therefore be measured at the level of the verified business outcome.
An agent that runs frequently but mostly produces output that needs correction, avoidable exceptions, duplicates, or unusable results may generate activity without generating value.
Start with the decision the measurement must support
Before choosing metrics, decide what decision the ROI analysis should inform.
Common decisions include:
- whether to continue a pilot;
- whether to move into production;
- whether to expand to more users;
- whether to add write access;
- whether to replace an existing process;
- whether to renew a provider;
- whether to build internally;
- whether to change models or tools; or
- whether to retire the service.
Different decisions require different evidence.
A short pilot may focus on task quality and review effort.
A renewal decision should include full operating cost, reliability, support, adoption, and switching risk.
Define the unit of value
Select one primary unit that represents useful completed work.
Examples include:
- accepted report;
- correctly routed request;
- approved customer-response draft;
- resolved service case;
- verified document package;
- completed research brief;
- approved record update;
- detected project risk;
- processed supplier review; or
- completed recurring workflow.
The unit should be specific enough to count consistently.
“Agent run” is usually not a good value unit because one run may fail, repeat, or produce an outcome that a person rejects.
Distinguish activity, output, and outcome
These three measurement levels should not be confused.
| Level | Example | What it shows |
|---|---|---|
| Activity | Model calls, tool calls, workflow runs | How much the system did |
| Output | Draft, summary, classification, proposed update | What the system produced |
| Outcome | Accepted report, correct routing, confirmed update | Whether the work created usable value |
Activity is useful for cost and diagnosis.
Output is useful for quality review.
Outcome is the strongest basis for ROI.
Establish the current-process baseline
Measure the existing process before introducing GaaS.
A baseline should cover a representative period and include:
- task volume;
- employee time;
- waiting time;
- total completion time;
- number of handoffs;
- error rate;
- rework;
- missed work;
- exception rate;
- operating cost;
- technology cost;
- customer or employee experience; and
- risk events.
Without a baseline, later improvement claims become assumptions.
Decompose the process
Break the task into stages.
For example:
| Stage | Current effort |
|---|---|
| Collect project updates | 45 minutes |
| Identify missing responses | 20 minutes |
| Review deadlines and blockers | 35 minutes |
| Prepare report | 40 minutes |
| Correct and distribute | 20 minutes |
This reveals where GaaS creates value and where human effort remains.
It also prevents the organisation from attributing every process improvement to the agentic service.
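The table above can be turned into a quick effort breakdown. The following Python sketch uses the stage durations from the table; the variable and function names are illustrative assumptions, not part of any GaaS product.

```python
# Current effort per stage in minutes, taken from the table above.
stage_minutes = {
    "Collect project updates": 45,
    "Identify missing responses": 20,
    "Review deadlines and blockers": 35,
    "Prepare report": 40,
    "Correct and distribute": 20,
}


def effort_shares(stages: dict[str, float]) -> dict[str, float]:
    """Return each stage's share of total effort as a fraction between 0 and 1."""
    total = sum(stages.values())
    if total <= 0:
        raise ValueError("total effort must be positive")
    return {name: minutes / total for name, minutes in stages.items()}


total_minutes = sum(stage_minutes.values())
shares = effort_shares(stage_minutes)

print(f"Total effort: {total_minutes} minutes")
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Repeating the same breakdown after the pilot shows which stages actually shrank and which still depend on people.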
Measure elapsed time and labour separately
Labour time is the active human effort required.
Elapsed time is the total duration from request to accepted outcome.
GaaS may reduce one without reducing the other.
For example, an agent may prepare a draft in five minutes but leave it in an approval queue for two days.
Measure:
- human handling time;
- agent processing time;
- queue time;
- approval time;
- correction time;
- exception resolution time; and
- total completion time.
This helps identify whether the service removes work or only moves the delay.
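A minimal Python sketch of this split, assuming each stage is logged with a hypothetical `kind` label and a duration in minutes. The five-minute draft and two-day queue come from the example above; the ten-minute approval is an illustrative assumption.

```python
from dataclasses import dataclass

# Stage kinds that consume active human effort (an assumption of this sketch).
HUMAN_KINDS = {"human_handling", "approval", "correction", "exception"}


@dataclass
class Stage:
    name: str
    kind: str       # for example "agent", "queue", or "approval"
    minutes: float


def labour_minutes(stages: list[Stage]) -> float:
    """Active human effort only."""
    return sum(s.minutes for s in stages if s.kind in HUMAN_KINDS)


def elapsed_minutes(stages: list[Stage]) -> float:
    """Total duration from request to accepted outcome, assuming stages run in sequence."""
    return sum(s.minutes for s in stages)


stages = [
    Stage("agent drafts report", "agent", 5),
    Stage("draft waits in approval queue", "queue", 2 * 24 * 60),
    Stage("reviewer approves", "approval", 10),
]

print(f"Labour: {labour_minutes(stages)} min, elapsed: {elapsed_minutes(stages)} min")
```

In this illustration nearly all of the elapsed time is queue time, so speeding up the agent would not shorten delivery.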
Calculate the full GaaS cost
Total cost includes more than the provider invoice.
Include:
- subscriptions;
- task or usage charges;
- model usage;
- tool and API charges;
- storage;
- integrations;
- implementation;
- workflow design;
- data preparation;
- testing;
- security and privacy review;
- employee training;
- human review;
- exception handling;
- monitoring;
- support;
- incident response;
- maintenance;
- change management; and
- migration or exit costs.
An apparently inexpensive agent may become costly when review and correction effort are included.
Separate one-time and recurring costs
One-time costs may include:
- provider selection;
- implementation;
- integration;
- initial data preparation;
- workflow design;
- pilot testing;
- governance setup; and
- training.
Recurring costs may include:
- provider charges;
- model and tool usage;
- monitoring;
- reviews;
- corrections;
- support;
- access audits;
- ongoing evaluation;
- updates; and
- incident handling.
Keep the categories separate so the organisation can understand payback time and steady-state economics.
Use an accepted-outcome rate
The accepted-outcome rate measures how often GaaS produces a result that can be used without substantial rework.
A practical calculation is:
Accepted outcomes
÷
Total completed attempts
Define “accepted” before measurement begins.
It may require:
- correct source use;
- complete required fields;
- valid formatting;
- no unsupported claims;
- correct destination;
- successful validation; and
- reviewer approval.
A lenient definition can make performance look better than it is.
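One way to keep the definition strict is to encode it before the pilot starts. The Python sketch below assumes each completed attempt is recorded as a dictionary of boolean checks; the field names are hypothetical and should be replaced with the organisation's own acceptance criteria.

```python
# Hypothetical acceptance criteria, mirroring the list above.
CRITERIA = (
    "correct_source_use",
    "required_fields_complete",
    "valid_formatting",
    "no_unsupported_claims",
    "correct_destination",
    "validation_passed",
    "reviewer_approved",
)


def is_accepted(attempt: dict[str, bool]) -> bool:
    """An attempt counts as accepted only when every criterion is met."""
    return all(attempt.get(criterion, False) for criterion in CRITERIA)


def accepted_outcome_rate(attempts: list[dict[str, bool]]) -> float:
    """Accepted outcomes divided by total completed attempts."""
    if not attempts:
        raise ValueError("at least one completed attempt is required")
    return sum(is_accepted(a) for a in attempts) / len(attempts)


passing = {criterion: True for criterion in CRITERIA}
rejected = {**passing, "reviewer_approved": False}

rate = accepted_outcome_rate([passing, passing, passing, rejected])
print(f"Accepted-outcome rate: {rate:.0%}")
```

A missing check is treated as a failure here, which is the conservative choice.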
Measure human correction
Track how much human effort is needed after the service completes its part.
Record:
- minor edits;
- factual corrections;
- missing information;
- wrong classifications;
- source changes;
- destination corrections;
- complete rework; and
- rejected output.
Useful measures include:
- percentage requiring correction;
- average correction time;
- correction cost per outcome; and
- reasons for correction.
The service creates little or no labour saving when employees spend as long fixing the output as they previously spent doing the work.
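The correction measures above can be derived from a simple log. This Python sketch assumes one correction-time entry per completed outcome (zero when no correction was needed) and a single hourly labour rate; both the data and the rate are illustrative.

```python
def correction_summary(correction_minutes: list[float], hourly_rate: float) -> dict[str, float]:
    """Summarise correction effort across all completed outcomes."""
    if not correction_minutes:
        raise ValueError("at least one outcome is required")
    corrected = [m for m in correction_minutes if m > 0]
    total_cost = sum(corrected) / 60 * hourly_rate
    return {
        "share_requiring_correction": len(corrected) / len(correction_minutes),
        "average_correction_minutes": sum(corrected) / len(corrected) if corrected else 0.0,
        "correction_cost_per_outcome": total_cost / len(correction_minutes),
    }


# Five outcomes; two of them needed 10 and 20 minutes of correction.
summary = correction_summary([0, 0, 10, 0, 20], hourly_rate=60)
for measure, value in summary.items():
    print(f"{measure}: {value:.2f}")
```

The cost is spread over all outcomes, not only the corrected ones, so it can be added directly to the cost per outcome.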
Measure exception handling
Agentic services should route uncertain or unsupported cases to people.
Exceptions are not automatically failures.
They may show that the service stops safely.
Measure:
- exception rate;
- correct escalation rate;
- false escalation rate;
- missed escalation rate;
- time to resolve;
- reviewer effort; and
- recurring exception causes.
A low exception rate is not necessarily good: it may mean the agent is continuing through cases where it should stop.
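The escalation rates can be computed once a reviewer has judged, after the fact, whether each case should have been escalated. The Python sketch below is one possible set of definitions; the denominators chosen here are an assumption and should be agreed before measurement starts.

```python
def escalation_metrics(cases: list[tuple[bool, bool]]) -> dict[str, float]:
    """cases holds (escalated, should_escalate) pairs, one per handled case.

    should_escalate is the reviewer's after-the-fact judgement.
    """
    total = len(cases)
    escalated = sum(1 for e, _ in cases if e)
    needed = sum(1 for _, s in cases if s)
    correct = sum(1 for e, s in cases if e and s)
    false = sum(1 for e, s in cases if e and not s)
    missed = sum(1 for e, s in cases if s and not e)

    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else 0.0

    return {
        "exception_rate": ratio(escalated, total),            # share of all cases
        "correct_escalation_rate": ratio(correct, escalated),  # share of escalations
        "false_escalation_rate": ratio(false, escalated),      # share of escalations
        "missed_escalation_rate": ratio(missed, needed),       # share of cases needing escalation
    }


# Illustrative data: 3 correct escalations, 1 false, 1 missed, 5 handled correctly.
cases = [(True, True)] * 3 + [(True, False)] + [(False, True)] + [(False, False)] * 5
metrics = escalation_metrics(cases)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Reading the exception rate together with the missed-escalation rate shows whether a low exception rate reflects good performance or unsafe continuation.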
Measure quality
Quality metrics depend on the use case.
They may include:
- factual accuracy;
- source accuracy;
- field accuracy;
- completeness;
- correct classification;
- correct routing;
- valid calculations;
- policy compliance;
- consistent formatting;
- correct destination;
- duplicate prevention; and
- reviewer acceptance.
Use representative samples and clear scoring criteria.
Avoid relying only on general impressions such as “the output looks good.”
Measure speed to outcome
Speed to outcome compares the total time needed to reach an accepted result in the baseline process and in the GaaS-enabled process.
Measure:
- baseline completion time;
- GaaS completion time;
- review time;
- correction time;
- waiting time; and
- exception time.
Faster execution is valuable only when quality remains acceptable.
A service that produces a wrong result immediately is not more efficient.
Measure cost to serve
Cost to serve represents the total cost of delivering one accepted outcome.
Include:
- human labour;
- provider charges;
- model usage;
- tool usage;
- review;
- correction;
- monitoring;
- support; and
- allocated operating costs.
A practical comparison is:
Total process cost
÷
Accepted outcomes
Compare the baseline and GaaS-enabled process using the same definition.
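As a rough illustration, the comparison can be written as a single function applied to both processes. All cost figures and outcome counts in this Python sketch are hypothetical.

```python
def cost_per_accepted_outcome(costs: dict[str, float], accepted_outcomes: int) -> float:
    """Total process cost divided by the number of accepted outcomes."""
    if accepted_outcomes <= 0:
        raise ValueError("accepted_outcomes must be positive")
    return sum(costs.values()) / accepted_outcomes


# Hypothetical monthly figures for the same process, before and after GaaS.
baseline_costs = {"human_labour": 2700, "allocated_operating": 300}
gaas_costs = {
    "provider_charges": 400,
    "model_and_tool_usage": 100,
    "review_and_correction": 700,
    "monitoring_and_support": 300,
}

baseline = cost_per_accepted_outcome(baseline_costs, accepted_outcomes=100)
gaas = cost_per_accepted_outcome(gaas_costs, accepted_outcomes=75)
print(f"Baseline: {baseline:.2f} per accepted outcome, GaaS: {gaas:.2f}")
```

Dividing by accepted outcomes rather than attempts means a falling acceptance rate raises the cost to serve even when the invoice stays flat.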
Measure capacity
GaaS may create value by increasing the amount of work a team can handle.
Measure:
- tasks completed per day or week;
- backlog size;
- waiting time;
- cases per employee;
- work completed outside peak hours;
- percentage handled without manual preparation; and
- ability to absorb demand spikes.
Higher volume should not hide lower quality.
Track capacity together with acceptance and correction rates.
Measure new capability
Some value does not come from reducing an existing cost.
GaaS may enable work that was previously impractical, such as:
- daily monitoring instead of monthly review;
- source-based briefings for every account;
- faster after-hours intake;
- wider document review;
- earlier risk detection;
- multilingual preparation;
- continuous exception tracking; or
- service for low-volume cases that previously received no support.
Define the business value of the new capability.
Possible measures include:
- risks detected earlier;
- opportunities identified;
- additional customers served;
- additional projects reviewed;
- prevented delays;
- improved conversion;
- reduced backlog; or
- avoided external cost.
Measure risk and avoided loss carefully
GaaS can reduce or increase risk.
Potential positive measures include:
- fewer missed checks;
- fewer duplicate actions;
- earlier escalation;
- more complete records;
- stronger evidence retention;
- reduced access scope; and
- faster incident detection.
Potential negative measures include:
- wrong writes;
- privacy incidents;
- unsupported claims;
- excessive permissions;
- outages;
- duplicate actions;
- missed escalations; and
- provider failures.
Avoid claiming the full value of a hypothetical avoided loss without a defensible method.
Use historical frequency, probability ranges, and conservative assumptions.
Calculate risk-adjusted value
A simple business case may overstate returns when it ignores failures.
Risk-adjusted value should subtract:
- expected correction cost;
- expected exception cost;
- expected incident cost;
- expected downtime impact;
- compliance overhead;
- additional review cost; and
- provider concentration risk.
The purpose is not to assign perfect monetary values to every risk.
It is to prevent the analysis from treating agentic execution as consequence-free.
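A simple way to apply this is to estimate each deduction as expected frequency multiplied by cost per event. The Python sketch below assumes monthly figures; every number in it is a hypothetical placeholder that should come from historical data or a conservative range.

```python
def risk_adjusted_value(gross_benefit: float, expected_costs: dict[str, tuple[float, float]]) -> float:
    """Subtract expected risk costs from the gross monthly benefit.

    expected_costs maps a risk name to (expected events per month, cost per event).
    """
    deduction = sum(events * cost for events, cost in expected_costs.values())
    return gross_benefit - deduction


# Hypothetical monthly estimates.
expected_costs = {
    "correction": (10, 15.0),     # 10 corrections at 15 each
    "exception": (4, 25.0),       # 4 escalated cases at 25 each
    "incident": (0.1, 1000.0),    # roughly one incident every ten months
    "downtime": (0.5, 200.0),     # roughly one outage every two months
}

adjusted = risk_adjusted_value(2000.0, expected_costs)
print(f"Risk-adjusted monthly value: {adjusted:.2f}")
```

Items that cannot be costed credibly, such as provider concentration risk, can be listed alongside the result instead of being forced into the sum.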
Measure adoption
A technically capable service creates little value when people do not use it correctly.
Track:
- eligible users;
- active users;
- repeat users;
- tasks submitted;
- tasks abandoned;
- approval response time;
- manual workarounds;
- user-reported trust;
- user satisfaction; and
- reasons for non-use.
Low adoption may result from poor training, weak task fit, slow performance, unclear responsibility, or lack of trust.
Measure reviewer burden
Human oversight has a cost.
Track:
- number of approvals;
- average review time;
- rejected proposals;
- corrections per approval;
- reviewer backlog;
- missed approval windows;
- approval fatigue; and
- escalation quality.
A service that saves ten minutes of preparation but adds fifteen minutes of review does not create a positive labour return.
Review may still be worthwhile for quality or risk reasons, but that value should be stated honestly.
Measure reliability
GaaS ROI depends on whether the service remains available and predictable.
Track:
- successful runs;
- failed runs;
- tool failures;
- source failures;
- model failures;
- authentication failures;
- schedule failures;
- partial completions;
- recovery time;
- repeated incidents; and
- tasks requiring manual restart.
Reliability problems create hidden labour and delay.
Measure automation quality, not automation percentage
A high automation rate can be misleading.
An organisation might automate many low-value steps while leaving the expensive review and exception work unchanged.
Instead of asking only:
What percentage of tasks was automated?
Also ask:
- How many accepted outcomes were completed?
- How much human effort remained?
- Which steps still caused delay?
- How often did the service need correction?
- Did the service reduce or create risk?
- Did customers or employees receive better results?
Use a practical ROI formula
A simple ROI formula is:
(Total measurable benefit − Total GaaS cost)
÷
Total GaaS cost
× 100
Total measurable benefit may include:
- labour savings;
- reduced external spend;
- increased capacity;
- reduced delay;
- reduced rework;
- increased revenue;
- avoided loss; and
- valued new capability.
Total GaaS cost should include implementation and ongoing operation.
Use conservative, documented assumptions.
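The ROI calculation itself is short. This Python sketch expresses it as a function, with benefit and cost figures that are purely illustrative.

```python
def roi_percent(total_benefit: float, total_gaas_cost: float) -> float:
    """ROI as a percentage: (benefit - cost) / cost * 100."""
    if total_gaas_cost <= 0:
        raise ValueError("total_gaas_cost must be positive")
    return (total_benefit - total_gaas_cost) / total_gaas_cost * 100


# Illustrative figures for the same period; cost includes implementation and operation.
roi = roi_percent(total_benefit=3000.0, total_gaas_cost=2000.0)
print(f"ROI: {roi:.0f}%")
```

Both inputs must cover the same period and the same scope, otherwise the percentage is not comparable between use cases.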
Calculate payback period
The payback period shows how long it takes for benefits to recover the initial investment.
A practical calculation is:
Initial implementation cost
÷
Net monthly benefit
The result is a number of months and is meaningful only when the net monthly benefit is positive.
It is useful when comparing pilots or deciding whether a use case is worth scaling.
A positive long-term ROI may still be unattractive when the payback period is too long for the organisation's needs.
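A minimal Python sketch of the payback calculation, assuming a constant net monthly benefit. The figures are illustrative, and returning `None` when the benefit is not positive is a design choice of this sketch.

```python
from typing import Optional


def payback_months(initial_cost: float, net_monthly_benefit: float) -> Optional[float]:
    """Months needed to recover the initial cost, or None if it is never recovered."""
    if net_monthly_benefit <= 0:
        return None
    return initial_cost / net_monthly_benefit


months = payback_months(initial_cost=6000.0, net_monthly_benefit=1500.0)
print(f"Payback period: {months} months")

# A service whose recurring cost exceeds its benefit never pays back.
print(payback_months(initial_cost=6000.0, net_monthly_benefit=-200.0))
```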
Use scenarios instead of one confident estimate
Create at least three cases:
- conservative;
- expected; and
- optimistic.
Vary assumptions such as:
- task volume;
- acceptance rate;
- correction time;
- provider cost;
- model usage;
- reviewer effort;
- adoption;
- reliability; and
- value per outcome.
Scenario analysis shows which assumptions have the greatest effect on the result.
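The three cases can be compared with a small model. The Python sketch below uses a deliberately simple net-value formula (accepted outcomes create value, rejected attempts incur a correction cost, and the provider cost is fixed); the formula and all figures are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    monthly_volume: int
    acceptance_rate: float     # fraction of attempts accepted
    value_per_outcome: float   # value of one accepted outcome
    correction_cost: float     # cost of fixing one non-accepted attempt
    provider_cost: float       # fixed monthly provider cost


def monthly_net_value(s: Scenario) -> float:
    """Value of accepted outcomes minus correction and provider costs."""
    accepted = s.monthly_volume * s.acceptance_rate
    rejected = s.monthly_volume - accepted
    return accepted * s.value_per_outcome - rejected * s.correction_cost - s.provider_cost


scenarios = [
    Scenario("conservative", 80, 0.70, 40.0, 20.0, 1000.0),
    Scenario("expected", 100, 0.85, 40.0, 15.0, 1000.0),
    Scenario("optimistic", 120, 0.95, 40.0, 10.0, 1000.0),
]

results = {s.name: round(monthly_net_value(s), 2) for s in scenarios}
for name, value in results.items():
    print(f"{name}: {value}")
```

Changing one input at a time while holding the others fixed shows which assumption moves the result most.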
Compare with the true alternative
The alternative is not always the original manual process.
Compare GaaS with realistic options such as:
- improving the current workflow;
- using a SaaS feature;
- building deterministic automation;
- using a single AI capability rather than a full agentic service;
- building an internal agent;
- outsourcing the work;
- changing the process; or
- doing nothing.
This avoids choosing GaaS merely because it performs better than an inefficient baseline.
Design the pilot for measurement
A measurable pilot should define:
- baseline period;
- pilot period;
- eligible task volume;
- control or comparison group when possible;
- accepted-outcome definition;
- quality scoring;
- reviewers;
- cost categories;
- success thresholds;
- risk limits;
- stop conditions; and
- final decision date.
Use representative cases rather than only easy examples.
Create a compact scorecard
A practical GaaS scorecard may include:
| Area | Example metric |
|---|---|
| Outcome | Accepted outcomes per week |
| Quality | Percentage passing review without material correction |
| Speed | Median time to accepted outcome |
| Cost | Total cost per accepted outcome |
| Capacity | Additional tasks completed |
| Oversight | Average review time |
| Exceptions | Correctly escalated cases |
| Reliability | Successful scheduled runs |
| Risk | Wrong or duplicate external actions |
| Adoption | Repeat use by eligible users |
Keep the scorecard focused.
Too many metrics can hide the measures that determine the business decision.
Set scale-up thresholds
Define what must be true before expanding.
Example thresholds might include:
- accepted-outcome rate above the agreed level;
- material-correction rate below the agreed level;
- no unresolved high-impact incident;
- cost per accepted outcome below the baseline;
- reviewer burden within available capacity;
- reliable activity evidence;
- stable tool and schedule performance; and
- positive user feedback.
Scale should be earned through evidence.
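Thresholds are easier to enforce when they are written down as explicit checks. This Python sketch covers the quantitative thresholds from the list above; the metric names and the agreed levels are hypothetical placeholders.

```python
def unmet_scale_up_conditions(metrics: dict, thresholds: dict) -> list[str]:
    """Return the names of scale-up conditions that the pilot has not met."""
    checks = {
        "accepted-outcome rate above the agreed level":
            metrics["accepted_rate"] > thresholds["min_accepted_rate"],
        "material-correction rate below the agreed level":
            metrics["correction_rate"] < thresholds["max_correction_rate"],
        "no unresolved high-impact incident":
            metrics["open_high_impact_incidents"] == 0,
        "cost per accepted outcome below the baseline":
            metrics["cost_per_outcome"] < metrics["baseline_cost_per_outcome"],
        "reviewer burden within available capacity":
            metrics["review_hours"] <= thresholds["reviewer_capacity_hours"],
    }
    return [name for name, passed in checks.items() if not passed]


# Hypothetical pilot results and agreed levels.
pilot_metrics = {
    "accepted_rate": 0.90,
    "correction_rate": 0.12,
    "open_high_impact_incidents": 0,
    "cost_per_outcome": 18.0,
    "baseline_cost_per_outcome": 30.0,
    "review_hours": 12.0,
}
agreed = {"min_accepted_rate": 0.85, "max_correction_rate": 0.10, "reviewer_capacity_hours": 15.0}

unmet = unmet_scale_up_conditions(pilot_metrics, agreed)
print("Ready to scale" if not unmet else f"Not ready: {unmet}")
```

Qualitative conditions, such as positive user feedback, still need a human judgement alongside these checks.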
Set pause and retirement thresholds
Define when the service should be paused or retired.
Examples include:
- repeated wrong actions;
- duplicate writes;
- loss of source reliability;
- uncontrolled cost;
- falling acceptance;
- rising correction time;
- reviewer overload;
- unresolved incidents;
- low adoption;
- provider instability; or
- a simpler alternative becoming available.
ROI measurement should support stopping decisions, not only expansion.
Review ROI by stakeholder
Different groups may experience different costs and benefits.
Employees may gain:
- less repetitive work;
- faster context gathering;
- better-prepared drafts; and
- more time for judgement.
Managers may gain:
- better visibility;
- shorter backlogs;
- more consistent processes; and
- measurable capacity.
Customers may gain:
- faster responses;
- clearer status;
- fewer missed handoffs; and
- more consistent service.
Security, privacy, and operations teams may incur:
- access reviews;
- monitoring;
- incident work;
- policy development; and
- provider oversight.
A complete ROI analysis includes value and workload across stakeholders.
Review ROI over time
GaaS economics change after launch.
Costs may fall as the workflow improves.
They may also rise because of:
- increasing volume;
- more expensive models;
- extra tools;
- longer context;
- broader use;
- added monitoring;
- new review requirements; or
- provider price changes.
Review ROI:
- after the pilot;
- after production launch;
- after major workflow changes;
- after model or provider changes;
- after incidents;
- before renewal; and
- at a regular operating interval.
Avoid common ROI mistakes
Common mistakes include:
- measuring model output instead of accepted outcomes;
- having no baseline;
- ignoring implementation cost;
- treating reassigned time as automatic cash savings;
- excluding human review;
- ignoring correction and exception work;
- counting every agent run as success;
- assuming maximum adoption;
- monetising speculative benefits aggressively;
- ignoring risk and incidents;
- using only ideal test cases; and
- continuing a weak pilot because the technology appears promising.
How Feluda can support ROI measurement
Feluda can help users observe and compare controlled agentic workflows.
Workbench Activity can support review of interactive tool calls and results.
Studio can make process steps, branches, validation, and error paths visible.
RunFlows can show execution progress, intermediate output, warnings, and errors.
Schedule Manager can provide recurring-run history and support pausing or resuming automation.
Local and cloud provider options can be compared for task quality, runtime, privacy requirements, and operating cost.
These capabilities can help teams connect workflow behaviour with outcome, quality, reliability, and cost measures.
A practical GaaS ROI example
Consider a weekly project-reporting process.
Baseline
- 20 reports per month;
- 120 minutes of employee time per report;
- 10 percent require rework;
- average elapsed time of two days; and
- monthly labour cost of €3,200.
GaaS pilot
- 20 reports per month;
- 25 minutes of human review per report;
- 15 percent require material correction;
- average elapsed time of four hours;
- provider, model, and tool cost of €700;
- monitoring and support cost of €250; and
- remaining labour cost of €670 for human review and correction.
Monthly comparison
The pilot's direct monthly operating cost is:
€700 provider and usage
+ €250 monitoring and support
+ €670 human review and correction
= €1,620
The direct monthly difference from the baseline is:
€3,200 − €1,620 = €1,580
Before claiming ROI, the organisation should also include implementation cost, validate the report quality, and confirm that the remaining employee time creates useful capacity.
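The comparison above can be reproduced in a few lines of Python. The monthly figures come from the example; the implementation cost is a hypothetical value added only to show how payback would be derived, since the example does not state one.

```python
REPORTS_PER_MONTH = 20
baseline_monthly_cost = 3200  # euros, from the baseline above

pilot_costs = {
    "provider_and_usage": 700,
    "monitoring_and_support": 250,
    "human_review_and_correction": 670,
}

pilot_monthly_cost = sum(pilot_costs.values())                 # 1620
monthly_difference = baseline_monthly_cost - pilot_monthly_cost  # 1580

# Cost per report produced, before checking how many reports are accepted.
baseline_cost_per_report = baseline_monthly_cost / REPORTS_PER_MONTH  # 160.0
pilot_cost_per_report = pilot_monthly_cost / REPORTS_PER_MONTH        # 81.0

# Hypothetical one-time implementation cost (not given in the example).
HYPOTHETICAL_IMPLEMENTATION_COST = 4740
payback = HYPOTHETICAL_IMPLEMENTATION_COST / monthly_difference       # 3.0 months

print(f"Pilot monthly cost: €{pilot_monthly_cost}")
print(f"Monthly difference: €{monthly_difference}")
print(f"Cost per report: €{baseline_cost_per_report:.0f} baseline, €{pilot_cost_per_report:.0f} pilot")
print(f"Payback at the assumed implementation cost: {payback:.1f} months")
```

The pilot's 15 percent material-correction rate is higher than the baseline's 10 percent rework rate, so the cost-per-report figures should not be treated as cost per accepted outcome until report quality has been validated.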
Final ROI checklist
Before presenting a GaaS business case, confirm that:
- the decision is clear;
- the unit of value is defined;
- the baseline is measured;
- labour and elapsed time are separate;
- one-time costs are included;
- recurring costs are included;
- human review is included;
- correction effort is included;
- accepted outcomes are counted;
- quality is measured;
- exceptions are measured;
- reliability is measured;
- adoption is measured;
- risk and incidents are considered;
- new capability is valued conservatively;
- realistic alternatives are compared;
- scenarios are documented;
- scale-up thresholds are defined;
- pause thresholds are defined; and
- ROI will be reviewed after changes.
The practical conclusion
GaaS ROI is not the number of tasks an agent starts.
It is the measurable value of accepted outcomes after every cost, correction, exception, delay, and risk control is considered.
Begin with a credible baseline.
Measure the complete process.
Count human oversight.
Use conservative assumptions.
Separate efficiency from new capability.
Expand only when quality, cost, reliability, and risk evidence support the decision.
The strongest GaaS business case is not the one with the largest theoretical saving.
It is the one that can be measured repeatedly and still demonstrates useful value under normal operating conditions.