How do I confirm that an MCP server is truly available?

Check the connection state, confirm the expected tools appear, run a known-good read-only tool test, and review the returned result in Workbench Activity or RunFlows.

How often should I check MCP server availability?

Choose a frequency based on importance. Critical production or scheduled services need more frequent checks, while low-use test servers may need only periodic review.

When should I pause schedules that depend on an MCP server?

Pause them when authentication fails, tools disappear, repeated timeouts occur, results become unreliable, destinations are uncertain, or the server cannot be trusted.

What should I do after an MCP server recovers?

Run the known-good test, review Activity, test affected workflows in RunFlows, verify required write destinations, use a one-time scheduled test, and resume recurring schedules gradually.

Monitor MCP Server Availability | Feluda.ai Documentation

An MCP server must be available when Feluda needs its tools.

A server can appear configured while still failing because of:

a stopped local process;
an unavailable remote service;
an incorrect endpoint;
expired authentication;
a network problem;
a blocked port;
missing permissions;
an unavailable connected source; or
a provider outage.

Availability monitoring helps you detect these problems before they disrupt important Workbench tasks, workflows, or schedules.

What availability means

A server is available when Feluda can:

reach the configured endpoint;
complete authentication;
discover the expected tools;
call a tool successfully;
receive a usable result; and
complete the task within an acceptable time.

A visible connection alone does not prove that every tool works.

Use both connection checks and real tool tests.

Monitor the complete path

An MCP task may depend on several parts:

Feluda
→ MCP Server
→ Connected Source or Service
→ Tool Result
→ Workflow Output

A failure can occur at any point.

For example, Feluda may reach the MCP server while the server cannot reach its connected database.

Monitor the complete path, not only the first connection.

Identify important servers

Not every server needs the same monitoring level.

Give more attention to servers that:

support scheduled workflows;
perform write actions;
provide customer or business records;
support several workflows;
are used by several people;
connect to production systems;
process time-sensitive tasks; or
have a history of outages.

Low-impact test servers may need less frequent review.

Assign an owner

Every important MCP server should have a responsible owner.

The owner should know:

why the server is connected;
who operates it;
which tools it provides;
which workflows depend on it;
which schedules use it;
where authentication is managed;
how to test it;
how to pause dependent automation; and
who to contact during an outage.

Unowned connections can fail without anyone taking responsibility.

Record the server baseline

Keep a simple record for each important server.

Include:

server name;
local or remote location;
endpoint;
authentication method;
expected tools;
important sources;
important destinations;
normal runtime;
dependent workflows;
dependent schedules;
known-good test;
service owner;
Feluda owner; and
last successful check.

Do not record raw credentials.
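
The baseline can be kept in any format. As one possible sketch, the record below uses a Python dataclass. The class name, field names, and sample values are illustrative assumptions and not a Feluda format. It stores only the authentication method, never the credential itself.

```python
from dataclasses import asdict, dataclass, field
import json


@dataclass
class ServerBaseline:
    """Illustrative baseline record; every field name here is an assumption."""
    server_name: str
    location: str                      # "local" or "remote"
    endpoint: str
    auth_method: str                   # the method only, never the secret itself
    expected_tools: list[str] = field(default_factory=list)
    important_sources: list[str] = field(default_factory=list)
    important_destinations: list[str] = field(default_factory=list)
    normal_runtime_seconds: tuple[float, float] = (0.0, 0.0)
    dependent_workflows: list[str] = field(default_factory=list)
    dependent_schedules: list[str] = field(default_factory=list)
    known_good_test: str = ""
    service_owner: str = ""
    feluda_owner: str = ""
    last_successful_check: str = ""    # ISO 8601 timestamp


# Placeholder values for illustration only.
baseline = ServerBaseline(
    server_name="internal-knowledge",
    location="remote",
    endpoint="https://mcp.example.internal:8443/mcp",
    auth_method="bearer token (value kept in protected settings)",
    expected_tools=["Internal Knowledge Search"],
    normal_runtime_seconds=(5.0, 15.0),
    known_good_test='Search for "MCP availability test"',
    service_owner="platform team",
    feluda_owner="workflow owner",
    last_successful_check="2024-01-01T09:00:00Z",
)

print(json.dumps(asdict(baseline), indent=2))
```

A structured record like this makes later comparisons (tool list, runtime, endpoint) easier than free-form notes.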

Open MCP Servers

Select MCP Servers from the Feluda sidebar.

Review the connection state.

Check:

server name;
endpoint;
visible status;
authentication state;
discovered tools;
warnings; and
errors.

Read the full message before changing any settings.

Check the endpoint

Confirm that the endpoint still matches the server owner's current details.

Review:

protocol;
host;
port;
path;
spelling;
capitalisation where relevant;
local or remote address; and
recent server changes.

Do not replace the endpoint with a guessed value.
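
When comparing the configured endpoint with the owner's current details, it can help to split both values into the parts listed above. The following is a minimal sketch using Python's standard `urllib.parse`; the function names and example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def endpoint_parts(url: str) -> dict:
    """Split an endpoint into the parts worth reviewing."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,   # lower-cased, because host names ignore case
        "port": parts.port,
        "path": parts.path,       # left as-is, because paths can be case-sensitive
    }


def endpoint_differences(recorded: str, current: str) -> dict:
    """Return only the parts that differ, as (recorded, current) pairs."""
    old, new = endpoint_parts(recorded), endpoint_parts(current)
    return {name: (old[name], new[name]) for name in old if old[name] != new[name]}


# Hypothetical values: the port and the path capitalisation have changed.
print(endpoint_differences(
    "https://mcp.example.com:8443/mcp",
    "https://MCP.example.com:8444/Mcp",
))
```

The comparison only reports differences. Which value is correct still has to come from the server owner.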

Check authentication

Confirm that:

the credential belongs to the correct server;
the account remains active;
the credential has not expired;
required permissions remain available;
the authentication method has not changed; and
the value remains stored in protected settings.

A server can become unavailable because of expired or changed authentication even when the endpoint is correct.

Check expected tools

Compare the discovered tools with the expected baseline.

Look for:

missing tools;
renamed tools;
new tools;
duplicated tools;
changed descriptions;
changed read or write behaviour; and
changed input or output.

A changed tool list may indicate a server update or an incomplete connection.
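
The comparison with the baseline can be sketched as a set difference. This example assumes you have copied the tool names and descriptions from the MCP Servers view into two dictionaries; the function name and sample tools are hypothetical.

```python
def compare_tools(expected: dict[str, str], discovered: dict[str, str]) -> dict:
    """Compare two mappings of tool name -> description."""
    expected_names, discovered_names = set(expected), set(discovered)
    shared = expected_names & discovered_names
    return {
        "missing": sorted(expected_names - discovered_names),
        "new": sorted(discovered_names - expected_names),
        "changed_description": sorted(
            name for name in shared if expected[name] != discovered[name]
        ),
    }


# Hypothetical tool lists for illustration.
expected = {"search": "Search records", "fetch": "Fetch one record"}
discovered = {"search": "Search and update records", "lookup": "Fetch one record"}

print(compare_tools(expected, discovered))
```

A renamed tool appears here as one missing tool plus one new tool, so review those two lists together. A changed description, such as a read tool that now mentions updates, is a prompt to recheck read or write behaviour.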

Use a known-good read test

Keep one safe read-only test for every important server.

For example:

Use only the enabled Internal Knowledge Search tool.

Search for "MCP availability test".

Return:
1. result title;
2. source identifier;
3. returned summary;
4. last updated date; and
5. any warning.

Do not create or change anything.

The expected result should be stable and easy to recognise.

Why a read test is useful

A successful read test confirms more than endpoint reachability.

It checks:

server availability;
authentication;
tool discovery;
model tool use;
connected source access;
returned result structure; and
normal runtime.

Use the same test repeatedly so changes are easier to detect.

Review Workbench Activity

After the test, open the Activity drawer.

Confirm:

the expected tool was called;
the expected server provided it;
the input was correct;
the server returned a result;
the result matched the expected source;
warnings were visible;
errors were understood;
no repeated call occurred; and
runtime was reasonable.

Do not rely only on the model's final sentence.

Record the result

For important checks, record:

date and time;
server;
tool;
test input;
success or failure;
runtime;
warning;
error;
source availability;
person reviewing; and
action taken.

This helps identify recurring patterns.
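
One way to keep these records consistent is an append-only CSV log. The sketch below uses only Python's standard library; the column names and the `append_check` helper are assumptions that mirror the list above, not a Feluda feature.

```python
import csv
from pathlib import Path

# Column names are illustrative and follow the list above.
FIELDS = [
    "checked_at", "server", "tool", "test_input", "outcome",
    "runtime_seconds", "warning", "error", "source_available",
    "reviewer", "action_taken",
]


def append_check(log_path: Path, row: dict) -> None:
    """Append one check result; write the header when the file is new."""
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        # Missing keys are written as empty cells; unknown keys raise ValueError.
        writer.writerow(row)
```

Because unknown column names raise an error, a typo in a field name is caught at once instead of silently creating an inconsistent log. Keep credentials out of every column.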

Monitor local MCP servers

A local MCP server depends on the local environment.

Check:

the computer is on;
the server process is running;
the local application is open when required;
the port is available;
the firewall permits access;
the source files are mounted;
the local database is running;
enough memory is available; and
the service starts automatically after a restart.

Local availability often depends on device state.

Monitor local endpoints

Common local endpoint problems include:

the process stopped;
the port changed;
another application uses the port;
the service starts only after sign-in;
the computer restarted;
the laptop is asleep;
the local firewall changed; or
the endpoint was edited incorrectly.

Test the local service directly when possible.
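
A direct test can be as simple as checking whether anything accepts a TCP connection on the expected port. The sketch below uses Python's standard `socket` module; the function name, host, and port are placeholders.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True when a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder address; use the host and port from your own endpoint.
    print(port_is_open("127.0.0.1", 3000))
```

A successful connection shows only that some process is listening. It does not confirm that the process is the MCP server, that authentication works, or that any tool returns a usable result, so follow it with the known-good read test.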

Monitor remote MCP servers

A remote MCP server depends on:

internet or network access;
provider uptime;
DNS;
certificates;
VPN;
proxy settings;
authentication;
remote account state; and
the connected service.

Review the provider's service status when available.

Monitor network requirements

Confirm whether the server requires:

public internet;
private network;
VPN;
proxy;
internal DNS;
a specific IP route; or
a specific firewall rule.

A server may work from one network and fail from another.

Monitor the connected source

The server may be available while the source is not.

Check whether the tool depends on:

local files;
a database;
cloud storage;
a business application;
a search index;
an internal service;
a messaging platform; or
another external provider.

Test the source separately when possible.

Monitor the destination

Write tools may remain visible while their destination is unavailable.

Confirm whether the tool can still reach:

the correct workspace;
the correct account;
the correct folder;
the correct project;
the correct record system;
the correct message destination; or
the correct local path.

A write test should use a safe destination.

Define normal runtime

Record how long a normal tool call usually takes.

For example:

local lookup: 2 to 5 seconds;
remote record search: 5 to 15 seconds;
large document retrieval: 20 to 40 seconds.

Use your own measured baseline.

A sudden increase can indicate:

server load;
network delay;
source delay;
model delay;
local hardware pressure; or
a changed tool.
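
If the known-good test can be triggered from a script, a measured baseline might be collected as follows. This is a sketch under that assumption: `call` stands for whatever function runs your test, and both helper names are hypothetical.

```python
import statistics
import time
from typing import Callable


def measure_runtime(call: Callable[[], object], repeats: int = 5) -> dict:
    """Run the call several times and summarise the elapsed seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def above_baseline(seconds: float, normal_range: tuple[float, float]) -> bool:
    """True when a runtime exceeds the upper end of the recorded normal range."""
    _, upper = normal_range
    return seconds > upper


# Example with the illustrative "remote record search" range of 5 to 15 seconds.
print(above_baseline(25.0, (5.0, 15.0)))
```

The median is used because one unusually slow or fast call should not redefine what is normal.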

Watch for slow degradation

Availability is not only success or failure.

A server may remain technically available while becoming too slow for normal use.

Watch for:

increasing runtime;
repeated timeouts;
partial results;
delayed write confirmation;
repeated tool calls;
growing error frequency; or
missed scheduled completion.

Investigate trends before complete failure.
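
A simple way to spot increasing runtime is to compare the average of the most recent checks with the average of the checks before them. The sketch below assumes the runtimes come from your own check log; the function name and window size are arbitrary choices.

```python
from statistics import mean
from typing import Optional


def runtime_trend(runtimes: list[float], window: int = 5) -> Optional[float]:
    """Ratio of the recent average runtime to the earlier average.

    Returns None when there are too few samples to compare.
    """
    if len(runtimes) < 2 * window:
        return None
    earlier = mean(runtimes[-2 * window:-window])
    recent = mean(runtimes[-window:])
    if earlier == 0:
        return None
    return recent / earlier


# Hypothetical log: five checks near 10 seconds, then five near 15 seconds.
history = [10.0] * 5 + [15.0] * 5
print(runtime_trend(history))
```

A ratio well above 1 suggests that the server is slowing down even though every individual check still succeeds. The threshold that should trigger a review is a local decision.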

Monitor Workbench use

For interactive tasks, review:

whether the tool appears;
whether the model calls it;
whether the correct result returns;
whether the same request now takes longer;
whether warnings have appeared;
whether repeated calls occur; and
whether users report inconsistent results.

Start a new conversation for controlled tests.

Monitor workflow use

For Studio workflows, confirm:

the tool remains selected;
the model still supports it;
input mapping is correct;
result fields remain available;
no-result handling still works;
error paths remain connected;
permissions remain valid; and
the final output remains accurate.

A server may be available while a dependent flow is broken.

Review RunFlows output

RunFlows can show:

starting input;
tool calls;
tool input;
tool output;
intermediate values;
warnings;
errors;
branch results; and
final output.

Use it to confirm that the saved workflow reaches the server and completes the task.

Use Emit blocks

Add an Emit block when you need to inspect intermediate values.

For example:

Input
→ MCP Search
→ Emit Raw Tool Result
→ Prepare Summary
→ Output

This helps distinguish server failure from later model or workflow failure.

Monitor scheduled workflows

Open Schedule Manager to review:

upcoming runs;
recent history;
paused schedules;
failed runs;
missed runs;
conflict warnings; and
repeated failures.

A schedule may fail even when manual Workbench tests succeed.

Why scheduled checks can fail

At the scheduled time, one or more of these may be unavailable:

Feluda;
the computer;
the local model runner;
the local MCP server;
the network;
VPN;
authentication;
the connected source;
the destination; or
required hardware resources.

Compare scheduled conditions with manual test conditions.

Use a one-time scheduled availability test

Before relying on a recurring workflow:

create a one-time schedule;
use a safe read-only flow;
choose a stable test source;
wait for the scheduled time;
review Schedule Manager;
review RunFlows;
inspect tool activity; and
confirm the result.

This tests real scheduled availability.

Monitor conflict warnings

Schedule Manager can show conflict warnings.

Review whether:

two runs may overlap;
one workflow takes longer than expected;
several local models compete for resources;
the same server receives too many calls;
two write workflows target the same destination; or
duplicate actions could occur.

Increase spacing or pause one schedule when needed.
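
To reason about overlap outside Schedule Manager, you can compare planned start times with expected durations. The sketch below is not how Schedule Manager detects conflicts; it is a hypothetical planning aid with made-up schedule names and times.

```python
from datetime import datetime, timedelta


def find_overlaps(runs: list[tuple[str, datetime, timedelta]]) -> list[tuple[str, str]]:
    """Return pairs of run names whose expected time windows overlap.

    Each run is (name, start, expected_duration).
    """
    ordered = sorted(runs, key=lambda run: run[1])
    overlaps = []
    for index, (name_a, start_a, duration_a) in enumerate(ordered):
        end_a = start_a + duration_a
        for name_b, start_b, _ in ordered[index + 1:]:
            if start_b < end_a:
                overlaps.append((name_a, name_b))
            else:
                break  # later runs start even later, so they cannot overlap
    return overlaps


start = datetime(2024, 1, 1, 9, 0)
planned = [
    ("report", start, timedelta(minutes=10)),
    ("sync", start + timedelta(minutes=5), timedelta(minutes=5)),
    ("digest", start + timedelta(minutes=30), timedelta(minutes=5)),
]
print(find_overlaps(planned))
```

Use measured runtimes rather than optimistic estimates for the durations, because a workflow that takes longer than expected is one of the conflict causes listed above.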

Set a monitoring frequency

Choose a review frequency based on risk.

Examples:

daily for critical production write tools;
weekly for important read services;
monthly for low-use local tools;
after every server update;
after every authentication change;
after every network change;
after repeated errors; and
before important scheduled periods.

Avoid unnecessarily frequent tests that add load or trigger external actions.

Use read-only health checks

Availability checks should normally use read-only tools.

A health check should not:

create records;
send messages;
update status;
write files;
change settings; or
remove information.

Use a write test only when write availability must be confirmed and a safe test destination exists.

Test write availability separately

A successful read test does not prove that write permissions work.

For important write tools, use:

a test account;
a test workspace;
a temporary record;
a reversible action;
explicit approval; and
destination verification.

Run write tests less frequently than read-only health checks.

Confirm external write results

After a write test, inspect the destination.

Confirm that:

the correct item changed;
only approved fields changed;
the correct account was used;
no duplicate appeared;
the timestamp is correct; and
the test can be reversed.

A tool call marked successful is not enough.

Monitor authentication expiry

Record when credentials are expected to expire when that information is available.

Review:

token expiry;
account status;
permission scope;
password changes;
organisational account changes;
service-owner changes; and
credential rotation.

Renew access before important scheduled runs.
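
Where an expiry date is known, a small check can flag credentials that need renewal soon. The sketch below is illustrative: the function name, the seven-day warning window, and the dates are assumptions, and only the expiry date is handled, never the credential.

```python
from datetime import datetime, timedelta, timezone


def expiry_status(
    expires_at: datetime,
    now: datetime,
    warn_within: timedelta = timedelta(days=7),
) -> str:
    """Classify a credential by the time remaining before it expires."""
    remaining = expires_at - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= warn_within:
        return "renew soon"
    return "ok"


# Placeholder dates for illustration.
today = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(expiry_status(today + timedelta(days=3), today))
```

Choose a warning window that is longer than the time it normally takes to obtain a renewed credential from the service owner.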

Protect credentials during monitoring

Never place:

passwords;
API keys;
access tokens;
private headers;
client secrets; or
connection values

inside test prompts, workflow output, Journal entries, or monitoring notes.

Store them only in protected settings.
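
As a safety net, text can be scanned for obvious secrets before it is pasted into a monitoring note. The sketch below uses a simple regular expression; the pattern and the `redact` helper are assumptions, the pattern is not exhaustive, and it does not replace keeping secrets out of notes in the first place.

```python
import re

# Matches labels such as "password", "api_key", "access-token", or
# "client_secret" followed by ":" or "=" and a value without spaces.
SECRET_PATTERN = re.compile(
    r"\b(password|api[_-]?key|(?:access[_-]?)?token|(?:client[_-]?)?secret)\b"
    r"(\s*[:=]\s*)(\S+)",
    re.IGNORECASE,
)


def redact(text: str) -> str:
    """Replace the value after a recognised secret label with a marker."""
    return SECRET_PATTERN.sub(
        lambda match: f"{match.group(1)}{match.group(2)}[REDACTED]", text
    )


print(redact("endpoint ok, api_key=abc123 token: xyz"))
```

Values that contain spaces, or secrets that appear without a recognisable label, will not be caught, so a manual review is still needed.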

Monitor permission failures

A server can be reachable while tools fail because of permissions.

Review:

read access;
write access;
account scope;
project scope;
record scope;
URL rules;
IP rules;
path rules;
port rules; and
destination access.

Apply only the narrowest approved change.

Monitor result quality

Availability also includes receiving usable results.

Watch for:

empty results;
partial results;
outdated data;
malformed fields;
missing identifiers;
changed dates;
duplicate records;
changed result structure; and
unexpected warnings.

A server that returns unusable results is not fully operational for the task.
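
Several of these signs can be checked mechanically once a result has been copied out of Activity or RunFlows. The sketch below assumes each returned item is a dictionary with the fields requested by the known-good test; the field names and the `result_problems` helper are hypothetical.

```python
# Hypothetical field names based on the known-good read test.
REQUIRED_FIELDS = ("title", "source_id", "summary", "last_updated")


def result_problems(results: list[dict], required: tuple = REQUIRED_FIELDS) -> list[str]:
    """List obvious quality problems in a tool result."""
    if not results:
        return ["empty result"]
    problems = []
    seen_ids = set()
    for index, item in enumerate(results):
        for name in required:
            if not item.get(name):
                problems.append(f"result {index}: missing {name}")
        identifier = item.get("source_id")
        if identifier:
            if identifier in seen_ids:
                problems.append(f"result {index}: duplicate source_id {identifier}")
            seen_ids.add(identifier)
    return problems
```

An empty list means none of these specific problems were found. It does not prove that the data is current or correct, so still compare the content with the last known-good result.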

Compare with the baseline

Compare each test with the last known-good result.

Review:

tool name;
input;
result count;
required fields;
timestamps;
runtime;
warnings;
errors; and
final answer.

Meaningful differences may indicate an update or service problem.

Define warning conditions

A warning condition may include:

one failed test;
runtime above the normal range;
one missing non-critical field;
temporary source delay;
one schedule conflict; or
an authentication expiry notice.

The owner should review the condition before it becomes a larger outage.

Define outage conditions

Treat the server as unavailable when:

Feluda cannot reach it;
authentication fails;
required tools disappear;
all known-good tests fail;
the source cannot be reached;
required write destinations cannot be reached;
repeated timeouts occur;
scheduled runs fail repeatedly; or
results cannot be trusted.

Pause dependent automation when the impact is unclear.
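
The difference between a warning and an outage can be written down as explicit rules so that every reviewer applies them the same way. The sketch below encodes a simplified subset of the conditions above; the function name, the inputs, and the choice of two consecutive failures as the outage threshold are assumptions to adapt.

```python
def server_state(
    reachable: bool,
    auth_ok: bool,
    required_tools_present: bool,
    recent_tests: list[bool],
) -> str:
    """Classify a server as "healthy", "warning", or "outage".

    recent_tests holds known-good test outcomes, oldest first.
    """
    if not (reachable and auth_ok and required_tools_present):
        return "outage"
    if len(recent_tests) >= 2 and not any(recent_tests):
        return "outage"      # every recent known-good test failed
    if not all(recent_tests):
        return "warning"     # at least one failure, but not all
    return "healthy"


print(server_state(True, True, True, [True, True, False]))
```

Rules like these do not replace judgement. When the impact of a state is unclear, pausing dependent automation remains the safer choice.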

Pause dependent schedules

Pause schedules when:

the server is unavailable;
authentication fails;
tools disappear;
result structure changes;
write destinations are uncertain;
repeated timeouts occur;
duplicate actions appear;
permission failures repeat; or
the server is no longer trusted.

Resume only after successful manual and scheduled tests.

Use clear outage messages

A workflow should not return a normal-looking result after the MCP tool fails.

Use a message such as:

The connected service is currently unavailable.
No result was produced.
Review the MCP server connection and try again later.

Keep no-match messages separate from outage messages.

Distinguish no result from outage

No result:

The tool worked but found nothing.

Outage:

The tool could not complete the request.

This distinction helps users choose the correct next step.
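
In code terms, the distinction is between a call that completed with zero items and a call that did not complete. The sketch below illustrates this with a hypothetical `ToolOutcome` type; it is not a Feluda data structure, and the no-match wording is only an example.

```python
from dataclasses import dataclass, field


@dataclass
class ToolOutcome:
    """Hypothetical summary of one tool call."""
    completed: bool                       # False when the call failed or timed out
    items: list = field(default_factory=list)
    error: str = ""


def user_message(outcome: ToolOutcome) -> str:
    """Choose a message that keeps outages and empty results apart."""
    if not outcome.completed:
        return (
            "The connected service is currently unavailable.\n"
            "No result was produced.\n"
            "Review the MCP server connection and try again later."
        )
    if not outcome.items:
        return "The search completed but found no matching results."
    return f"{len(outcome.items)} result(s) found."


print(user_message(ToolOutcome(completed=True)))
print(user_message(ToolOutcome(completed=False, error="timeout")))
```

Checking `completed` before checking `items` matters: a failed call also has no items, and testing the list first would report an outage as a normal empty result.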

Create a recovery checklist

When a server becomes unavailable:

identify the affected server;
review MCP Servers;
check endpoint;
check authentication;
check local process or remote status;
check network or VPN;
check the connected source;
check permissions;
pause dependent schedules;
run the known-good test;
review Activity;
test affected RunFlows;
verify write destinations;
use a one-time scheduled test; and
resume automation gradually.

Change one thing at a time so that you can tell which change resolved the problem.

Recover a local server

For a local server, check:

process status;
application status;
endpoint;
port;
firewall;
local files;
local database;
computer memory;
service startup; and
recent updates.

Restart only the required service when possible.

Recover a remote server

For a remote server, check:

provider status;
endpoint;
DNS;
network;
VPN;
proxy;
certificate;
authentication;
account status; and
remote source availability.

Contact the server owner when the issue is outside Feluda.

Verify recovery

Do not consider the server recovered on the basis of a single successful connection message.

Confirm:

expected tools reappear;
the known-good read test succeeds;
raw output is correct;
runtime is acceptable;
affected workflows succeed;
write tools work when required;
destinations are correct;
a one-time schedule succeeds; and
no repeated error remains.

Resume automation gradually

Resume one important schedule at a time.

Review its first runs.

Check:

server availability;
tool calls;
runtime;
warnings;
errors;
final output;
external destinations; and
duplicates.

Pause again when unexpected behaviour appears.

Escalate when needed

Escalate to the server owner or provider when:

the endpoint is correct but unreachable;
provider status shows an outage;
authentication fails after confirmed renewal;
required tools are missing after an update;
result structure changed without guidance;
repeated timeouts continue;
write destinations behave incorrectly; or
the service cannot be verified.

Provide the error, time, tool name, and safe test details.

Do not include credentials.

Keep an incident record

For important outages, record:

date and time;
affected server;
affected tools;
affected workflows;
affected schedules;
visible error;
first failed test;
action taken;
person contacted;
recovery time;
final verification; and
follow-up action.

This helps prevent repeated problems.

Review recurring incidents

Look for patterns such as:

failures after restart;
failures after token expiry;
failures during VPN disconnect;
failures during heavy local workload;
failures after server updates;
failures at one time of day;
repeated port conflicts; or
repeated schedule overlap.

Fix the underlying cause instead of repeating the same recovery.
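
If incident records include a short cause label, repeated causes can be counted directly. The sketch below assumes each incident is a dictionary with a `cause` key; that key, the labels, and the `recurring_causes` helper are hypothetical.

```python
from collections import Counter


def recurring_causes(incidents: list[dict], minimum: int = 2) -> list[tuple[str, int]]:
    """Return causes seen at least `minimum` times, most frequent first."""
    counts = Counter(incident["cause"] for incident in incidents)
    return [(cause, count) for cause, count in counts.most_common() if count >= minimum]


# Made-up incident labels for illustration.
incidents = [
    {"cause": "token expiry"},
    {"cause": "port conflict"},
    {"cause": "token expiry"},
    {"cause": "vpn disconnect"},
    {"cause": "token expiry"},
    {"cause": "port conflict"},
]
print(recurring_causes(incidents))
```

Consistent cause labels matter more than the counting itself, so agree on a short fixed list of labels when you start the incident record.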

Review after updates

Recheck availability after changes to:

Feluda;
operating system;
local model runner;
AI model;
MCP server;
endpoint;
authentication;
network;
VPN;
firewall;
source system;
destination; or
workflow.

Use the same known-good test each time.

Review after server replacement

After replacing a server, monitor:

connection stability;
tool list;
runtime;
result completeness;
warnings;
errors;
scheduled history;
write destinations;
duplicate actions; and
user reports.

Keep the rollback plan until the replacement is stable.

Review privacy during monitoring

Health checks should use minimal information.

Avoid sending:

real customer data;
confidential documents;
private messages;
credentials;
unnecessary identifiers; or
sensitive file contents.

Use a dedicated non-sensitive test record when possible.

Avoid monitoring that creates risk

Do not use frequent write actions as a basic availability check.

Repeated test writes may create:

duplicate records;
unnecessary messages;
extra files;
misleading tasks;
audit noise; or
unintended external activity.

Prefer a read-only test.

Final availability checklist

For every important MCP server, confirm that:

an owner is assigned;
the server baseline is recorded;
the endpoint is current;
authentication is valid;
expected tools appear;
a known-good read test exists;
normal runtime is known;
Workbench Activity is reviewed;
RunFlows is checked;
scheduled history is reviewed;
conflict warnings are understood;
write tests use safe destinations;
pause conditions are defined;
recovery steps are documented;
escalation contacts are known; and
incidents are recorded.

Availability monitoring is most useful when it checks the real task, not only whether the connection exists.

