Test MCP Tools Before Regular Use

Test every new MCP tool before using it with important information or regular workflows.

A successful connection only confirms that Feluda can reach the server.

It does not confirm that:

  • the correct tool appears;
  • the model uses it correctly;
  • the tool receives the right input;
  • the returned data is complete;
  • permission limits behave as intended;
  • write actions reach the correct destination; or
  • scheduled runs will remain dependable.

Use a staged test process.

Begin with a simple read-only task and non-sensitive sample data.

Why testing matters

An MCP tool can connect Feluda to:

  • files;
  • internal services;
  • business records;
  • websites;
  • databases;
  • messages;
  • tasks;
  • external applications; or
  • other connected systems.

A small configuration error can lead to:

  • missing results;
  • wrong records;
  • excessive data sharing;
  • permission errors;
  • duplicate actions;
  • incorrect destinations; or
  • unreliable workflows.

Testing helps you understand the tool before it becomes part of regular work.

Use a staged testing process

Test in this order:

  1. Confirm the MCP server connection.
  2. Confirm the tool appears.
  3. Test one read-only action.
  4. Review Activity.
  5. Test a valid result.
  6. Test an empty result.
  7. Test an error.
  8. Test permission boundaries.
  9. Test write actions separately.
  10. Confirm the destination.
  11. Test the tool in Studio.
  12. Test the saved flow in RunFlows.
  13. Test scheduled use last.

Do not begin with a high-impact write action.

Before you begin

Confirm that:

  • the server owner is known;
  • the endpoint is correct;
  • authentication is complete;
  • the tool description is available;
  • you know whether the tool reads or writes;
  • a safe test source exists;
  • a safe test destination exists when needed;
  • the selected model supports tool use; and
  • no important schedule depends on the untested connection.

Use a test account or environment when one is available.

Choose safe sample data

Good sample data is:

  • non-sensitive;
  • easy to recognise;
  • small;
  • disposable;
  • clearly marked as a test; and
  • unrelated to real customers or confidential work.

Examples include:

  • a test record;
  • a temporary file;
  • a sample project;
  • a fake reference number;
  • a test Journal entry;
  • a draft message addressed to yourself; or
  • a temporary task.

Do not use real private data just because the test seems simple.

Record the expected result

Write down what should happen before running the test.

For example:

Tool: Internal Knowledge Search

Input:
"MCP connection test"

Expected result:
One test document with title "MCP Test Record"

No write action should occur.

This gives you a clear comparison point.
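The expectation can also be kept in a small structure so the comparison is mechanical rather than remembered. The following is a minimal sketch; `ExpectedResult` and `compare` are hypothetical names for illustration and are not part of Feluda. You copy the actual titles and the number of observed write actions in by hand after the run.

```python
from dataclasses import dataclass


@dataclass
class ExpectedResult:
    """What a test should produce, written down before the run (hypothetical structure)."""
    tool: str
    tool_input: str
    expected_titles: list[str]
    write_allowed: bool = False


def compare(expected: ExpectedResult, actual_titles: list[str], writes_observed: int) -> list[str]:
    """Return human-readable mismatches; an empty list means the run matched the expectation."""
    problems = []
    if sorted(actual_titles) != sorted(expected.expected_titles):
        problems.append(
            f"titles differ: expected {expected.expected_titles}, got {actual_titles}"
        )
    if writes_observed and not expected.write_allowed:
        problems.append(f"{writes_observed} write action(s) occurred but none were expected")
    return problems


# The expectation from the example above, recorded before the test runs.
expectation = ExpectedResult(
    tool="Internal Knowledge Search",
    tool_input="MCP connection test",
    expected_titles=["MCP Test Record"],
)
```

Calling `compare(expectation, ["MCP Test Record"], 0)` returns an empty list. Any other title, or any observed write, produces a message to review.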

Step 1: Confirm the server connection

Open MCP Servers from the Feluda sidebar.

Find the server.

Check:

  • connection name;
  • endpoint;
  • connection state;
  • authentication;
  • available tools;
  • warnings; and
  • errors.

Do not continue until the server is available.

Step 2: Confirm the tool appears

Open Workbench.

Select a provider and model that support tool use.

Open Tools.

Confirm that the expected tool appears.

Review:

  • its name;
  • the server that provides it;
  • its description;
  • expected input;
  • returned output;
  • read or write behaviour; and
  • any stated limits.

If the tool does not appear, return to the MCP Servers page.

Step 3: Enable one tool

Enable only the tool you are testing.

Leave unrelated tools disabled.

This helps you understand:

  • whether the model selected the intended tool;
  • what input it prepared;
  • what output returned;
  • whether permissions were used correctly; and
  • where an error occurred.

Add more tools only after the first tool works reliably.

Step 4: Test one read-only request

Begin with a read-only task.

For example:

Use only the enabled Internal Knowledge Search tool.

Search for "MCP connection test".

Return:
1. the title;
2. the source identifier;
3. the returned summary; and
4. any warning.

Do not create, update, send, delete, or change anything.

This test checks the connection without intentionally changing an external destination.

Step 5: Review Workbench Activity

After the response, open Activity.

Confirm:

  • the expected tool was called;
  • no unrelated tool was called;
  • the input matches the request;
  • the server returned a result;
  • warnings were visible;
  • errors were understood; and
  • the final answer matches the tool result.

Do not rely only on the conversational response.
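If you copy the tool calls out of Activity into a simple list, the first two checks can be automated. This is a sketch under stated assumptions: the list-of-dicts shape and the `tool` key are hypothetical and do not represent a Feluda export format.

```python
from collections import Counter


def review_activity(calls: list[dict], expected_tool: str) -> list[str]:
    """Return problems found in a list of tool calls; an empty list means nothing was flagged."""
    problems = []
    names = [call["tool"] for call in calls]
    if expected_tool not in names:
        problems.append(f"expected tool was not called: {expected_tool}")
    # Any tool other than the one under test is reported as unrelated.
    for name in sorted(set(names) - {expected_tool}):
        problems.append(f"unrelated tool was called: {name}")
    # Repeated calls are flagged for review; they are not always wrong.
    for name, count in sorted(Counter(names).items()):
        if count > 1:
            problems.append(f"{name} was called {count} times")
    return problems
```

A single call to the expected tool yields no problems. The remaining checks in the list (warnings, errors, and whether the final answer matches) still need a human reading.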

Compare the raw result with the answer

Check that the answer preserves the exact:

  • names;
  • dates;
  • amounts;
  • statuses;
  • identifiers;
  • source references;
  • missing fields; and
  • warnings.

A tool can return correct data while the model summarises it incorrectly.
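One rough way to spot a lossy summary is to check whether each raw value appears verbatim in the answer. The sketch below assumes the raw result has been copied into a flat dictionary; `missing_values` is a hypothetical helper and the field names are illustrative only.

```python
def missing_values(raw_result: dict, answer: str) -> list[str]:
    """Return the raw field values that do not appear verbatim in the answer text."""
    missing = []
    for key, value in raw_result.items():
        if value is None:
            continue  # an absent field has no value to look for
        if str(value) not in answer:
            missing.append(f"{key}={value}")
    return missing


# Illustrative test data, not real records.
raw = {"title": "MCP Test Record", "id": "TEST-001", "date": "2024-01-15"}
answer = "Found 'MCP Test Record' (TEST-001), dated 15 January 2024."
print(missing_values(raw, answer))  # the reformatted date is reported for review
```

A reported value is not always an error. A reformatted date may be acceptable, but it shows where the model changed what the tool returned.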

Step 6: Test a known valid result

Use sample input that should return a predictable result.

Confirm that:

  • the correct record or file is found;
  • the right source is used;
  • required fields are returned;
  • the model preserves exact values;
  • the source identifier is visible; and
  • no unsupported details are added.

Repeat the test once to check consistency.

Step 7: Test an empty result

Use a safe input that should not match anything.

For example:

Search for record "TEST-NO-MATCH-99999".

Confirm that:

  • the tool completes;
  • no matching result is returned;
  • the model states that no result was found;
  • no record is invented;
  • no unrelated tool is called; and
  • the result is different from an error.

Empty-result handling is essential for reliable workflows.

Step 8: Test invalid input

Use a harmless invalid value.

Examples include:

  • an incomplete identifier;
  • an invalid date;
  • a missing filename;
  • unsupported formatting;
  • an empty required field; or
  • an unknown source.

Confirm that the model or tool returns a clear explanation.

It should not perform a different action to compensate.

Step 9: Test server unavailability

Use only a safe and approved method.

Confirm that the workflow or conversation handles an unavailable server by:

  • showing a clear error;
  • avoiding invented results;
  • avoiding write actions;
  • stopping repeated calls; and
  • explaining what needs review.

Restore the server after testing.

Step 10: Test authentication failure

Use an approved test method.

Confirm that:

  • the error is visible;
  • the credential is not exposed;
  • the model does not claim success;
  • repeated calls do not continue; and
  • the task follows an error or review path.

Never place a private credential in the prompt.
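To check that a credential is not exposed, you can search copied Activity text for the value of a disposable test credential. This is a minimal sketch; the function name and the dummy token are hypothetical, and it should only ever be run with credentials created for testing.

```python
def exposed_credentials(activity_text: str, test_credentials: dict[str, str]) -> list[str]:
    """Return the labels of test credentials whose values appear verbatim in the text."""
    return sorted(
        label
        for label, value in test_credentials.items()
        if value and value in activity_text
    )


# A dummy value created only for this test, never a real secret.
creds = {"search-token": "dummy-token-12345"}
```

Only labels are returned, so the check itself does not repeat the value in its output.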

Step 11: Test permission boundaries

Confirm that approved access works and unapproved access remains blocked.

Depending on the tool, review:

  • read and write access;
  • account scope;
  • URL rules;
  • IP rules;
  • path rules;
  • port rules;
  • project scope; and
  • destination scope.

Use the narrowest permission that still supports the approved task.

Test an allowed source

Use one approved source.

Confirm that:

  • the tool can access it;
  • the expected result returns;
  • the correct account is used;
  • the Activity log is clear; and
  • no broader source is accessed.

Test a blocked source

When safe and appropriate, try one clearly unapproved source.

Confirm that:

  • access is denied;
  • the denial is visible;
  • the model does not use another source without permission;
  • no data is returned; and
  • the final answer explains the permission problem.

Test file-path limits

For a file tool, use:

  • one allowed file or folder; and
  • one path outside the approved area.

Confirm that the approved path works and the other path is blocked.

Do not grant access to the full filesystem for a narrow file task.
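When choosing the two test paths, it helps to confirm in advance which side of the boundary each one falls on. The sketch below is a purely lexical check using POSIX-style paths. It does not resolve symbolic links and does not reflect how any particular MCP file server enforces its limits. The root `/srv/mcp-test` is a hypothetical example.

```python
import posixpath


def is_within(approved_root: str, requested: str) -> bool:
    """Return True when the requested path stays inside the approved root (lexical check only)."""
    root = posixpath.normpath(approved_root)
    # join() keeps an absolute request as-is; normpath() collapses ".." segments.
    target = posixpath.normpath(posixpath.join(root, requested))
    return posixpath.commonpath([root, target]) == root


print(is_within("/srv/mcp-test", "notes/sample.txt"))   # inside the approved area
print(is_within("/srv/mcp-test", "../secrets.txt"))     # escapes via ".."
print(is_within("/srv/mcp-test", "/etc/passwd"))        # absolute path elsewhere
```

The real test is still whether the tool itself blocks the second kind of path. This helper only confirms that your test cases are the ones you intended.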

Test network limits

For a network tool, confirm the approved:

  • URL;
  • domain;
  • IP address;
  • host; and
  • port.

Check that unrelated destinations remain unavailable.
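A short allowlist check can help prepare the approved and unrelated test destinations. The following is a sketch only. The `APPROVED` set and the host `kb.example.test` are hypothetical, and the real enforcement belongs to the tool's own permission settings.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of (scheme, host, port) combinations.
APPROVED = {("https", "kb.example.test", 443)}

DEFAULT_PORTS = {"http": 80, "https": 443}


def is_approved(url: str) -> bool:
    """Return True when the URL's scheme, host, and port all match an approved entry."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port) in APPROVED
```

Comparing the parsed host exactly, rather than checking whether the URL merely starts with an approved string, avoids treating a look-alike such as `kb.example.test.other.example` as approved.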

Test with the intended model

Different models may use the same tool differently.

Test the model that will be used in regular work.

Review whether it:

  • selects the correct tool;
  • supplies required input;
  • follows read-only limits;
  • handles no matches;
  • reports errors;
  • stops after success; and
  • preserves returned values.

Compare two models when needed

To compare models fairly:

  1. start a new conversation for each model;
  2. enable the same tool;
  3. use the same input;
  4. use the same instruction;
  5. review each Activity log; and
  6. compare tool choice, input, output interpretation, and errors.

Choose the model that performs the complete task most reliably.

Test repeated calls

Run the same read-only request more than once.

Confirm whether:

  • the result remains consistent;
  • the same tool is selected;
  • input remains stable;
  • warnings repeat;
  • the source has changed; and
  • the model adds new unsupported details.

Consistency matters for repeatable workflows.

Test several enabled tools

Only after one-tool testing succeeds, try the real multi-tool setup.

Confirm that:

  • the model selects the correct tool;
  • unrelated tools remain unused;
  • tool order makes sense;
  • no write tool is called during a read-only task;
  • returned data stays associated with the correct source; and
  • repeated calls do not appear.

Reduce the enabled set when the model becomes confused.

Test write actions separately

A successful read test does not prove that a write tool is safe.

Test write actions only after:

  • the connection is stable;
  • the model uses read tools reliably;
  • permissions are understood;
  • a safe test destination exists; and
  • the action can be reviewed.

Use a test account or reversible action when possible.

Use a draft-first write test

First ask the model to prepare the action.

For example:

Prepare a test task.

Show:
* the destination;
* the task title;
* the description;
* the assigned owner; and
* the deadline.

Do not create the task yet.

Review the draft before authorising the write.

Approve one exact action

The approval should specify:

  • tool;
  • account;
  • environment;
  • destination;
  • record or item;
  • fields;
  • values; and
  • action type.

If the proposal changes, request approval again.

Confirm the write destination

After the tool runs, inspect the external service.

Check:

  • the correct item was created or updated;
  • the intended fields changed;
  • no extra field changed;
  • the correct account was used;
  • the correct environment was used;
  • the timestamp is correct; and
  • no duplicate appeared.

Activity confirms the tool call.

The destination confirms the real outcome.

Test write failure

Use an approved safe method to test a failed write.

Confirm that:

  • the error is visible;
  • no false success message appears;
  • the destination remains unchanged;
  • the workflow does not retry indefinitely;
  • partial writes are visible; and
  • the user receives a clear review message.

Test partial writes

When the service can update several fields, confirm what happens if only some values are accepted.

Review:

  • which fields changed;
  • which fields failed;
  • the returned warning;
  • the external record; and
  • whether a retry would repeat successful fields.

Do not repeat the complete write until the current state is known.
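Establishing the current state can be as simple as comparing the requested values with what the external record now shows. This is a minimal sketch; `field_report` is a hypothetical helper and both dictionaries are filled in by hand from the proposal and the destination.

```python
def field_report(requested: dict, current: dict) -> dict:
    """Split requested fields into those already applied and those still pending."""
    applied = sorted(k for k, v in requested.items() if k in current and current[k] == v)
    pending = sorted(k for k in requested if k not in applied)
    return {"applied": applied, "pending": pending}


requested = {"title": "MCP test task", "status": "open", "owner": "test-user"}
current = {"title": "MCP test task", "status": "draft"}
print(field_report(requested, current))
```

A follow-up write can then be limited to the pending fields instead of repeating values that were already accepted.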

Test timeouts

A timeout may occur before or after an external action completes.

After a timeout, the test process should:

  1. review Activity or RunFlows output;
  2. inspect the destination;
  3. compare timestamps;
  4. confirm whether the action completed; and
  5. retry only when the first action did not complete.

This prevents duplicate writes.
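The decision in the last step can be written as a small rule. This sketch assumes each test write carries a reference you can look for in the destination. The `reference` key and the list-of-dicts shape are hypothetical.

```python
def should_retry(timed_out: bool, destination_items: list[dict], reference: str) -> bool:
    """Retry only when the call timed out and the destination shows no item with this reference."""
    if not timed_out:
        return False
    already_written = any(item.get("reference") == reference for item in destination_items)
    return not already_written
```

If the item is already present after a timeout, the write completed and a retry would create a duplicate.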

Test duplicate prevention

Repeat the same safe create request.

Confirm whether the tool or workflow:

  • detects an existing item;
  • uses a unique reference;
  • creates a duplicate;
  • overwrites an existing item;
  • returns a warning; or
  • requires a manual check.

Add a duplicate-prevention step when needed.
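One common form of duplicate prevention is to derive a stable reference from the request and refuse to create a second item with the same reference. The sketch below illustrates the idea with an in-memory stand-in. `FakeDestination` is hypothetical and does not represent any real service or MCP tool.

```python
import hashlib
import json


def request_key(action: str, fields: dict) -> str:
    """Derive a stable reference from the action and its fields, independent of key order."""
    payload = json.dumps({"action": action, "fields": fields}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


class FakeDestination:
    """In-memory stand-in for an external service, for illustration only."""

    def __init__(self) -> None:
        self.items: dict[str, dict] = {}

    def create_once(self, action: str, fields: dict) -> str:
        key = request_key(action, fields)
        if key in self.items:
            return "duplicate detected"
        self.items[key] = fields
        return "created"


dest = FakeDestination()
print(dest.create_once("create_task", {"title": "MCP test task"}))  # created
print(dest.create_once("create_task", {"title": "MCP test task"}))  # duplicate detected
```

Whether a real destination supports such a reference depends on the service, which is why the repeated-create test above is worth running first.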

Test reversibility

Confirm whether a test action can be undone.

For example:

  • can the task be deleted;
  • can the status be restored;
  • can the file be removed;
  • can the record be corrected;
  • can the message be recalled; or
  • can the change be versioned?

Use stronger approval for irreversible actions.

Test in Studio

After Workbench tests succeed, add the tool to a small Studio workflow.

A basic read flow may be:

Input
→ LLM with MCP Read Tool
→ Output

A controlled write flow may be:

Input
→ Retrieve Current State
→ Prepare Proposed Change
→ Emit Proposal
→ Approval Decision
→ MCP Write Tool
→ Output

Keep the first flow simple.

Add Emit blocks

Use Emit blocks to show:

  • raw tool output;
  • extracted values;
  • classification;
  • proposed writes;
  • approval state; and
  • external confirmation.

This helps locate the first incorrect step.

Add no-result paths

A read workflow should distinguish:

  • valid result;
  • no match; and
  • error.

For example:

MCP Search
→ Match Found → Continue
→ No Match → Return No Match
→ Error → Return Review Message

Test every path.
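The three paths can be sketched as a small classifier. The response shape used here (an `error` key and a `results` list) is an assumption for illustration, not a schema defined by MCP or Feluda.

```python
from enum import Enum


class Outcome(Enum):
    MATCH = "match"
    NO_MATCH = "no_match"
    ERROR = "error"


def classify(tool_response: dict) -> Outcome:
    """Classify a tool response using the assumed 'error' and 'results' keys."""
    if tool_response.get("error"):
        return Outcome.ERROR
    if "results" not in tool_response:
        return Outcome.ERROR  # an unexpected shape is treated as an error, not as "no match"
    return Outcome.MATCH if tool_response["results"] else Outcome.NO_MATCH


def route(tool_response: dict) -> str:
    """Map each outcome to the next step in the flow."""
    outcome = classify(tool_response)
    if outcome is Outcome.MATCH:
        return "continue"
    if outcome is Outcome.NO_MATCH:
        return "return no match"
    return "return review message"
```

Treating a malformed response as an error, rather than as an empty result, keeps a broken tool from looking like a legitimate "no match".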

Add visible error paths

A useful error output may say:

The connected service could not complete the request.
Review the MCP server connection, authentication, and permissions.

Do not hide a failed tool call behind a normal-looking answer.

Test the saved flow in RunFlows

After saving:

  1. open RunFlows;
  2. select the saved flow;
  3. enter sample input;
  4. start the run;
  5. inspect intermediate output;
  6. review tool calls;
  7. review warnings and errors;
  8. compare the final answer with the tool result; and
  9. inspect any external destination.

This confirms that the saved workflow works outside Studio.

Test realistic input

After sample tests pass, use realistic but non-sensitive input.

Confirm that:

  • the tool receives the expected format;
  • response time remains acceptable;
  • result size is manageable;
  • required fields are preserved;
  • the model follows the instruction;
  • permissions remain narrow; and
  • no new error appears.

Test long input

When the real task includes larger content, test an appropriate sample.

Review:

  • input size;
  • model context limits;
  • tool limits;
  • timeout behaviour;
  • returned result size;
  • omitted fields; and
  • final answer quality.

Do not assume a tool that works with one sentence will work with a long document.

Test unusual input

Include safe examples with:

  • unexpected wording;
  • missing fields;
  • extra fields;
  • unusual characters;
  • multiple matches;
  • duplicate records;
  • empty content; or
  • unsupported requests.

Confirm that the tool and model respond clearly.

Test privacy boundaries

Review exactly what appears in Activity or RunFlows output.

Confirm whether the tool receives:

  • only necessary text;
  • unnecessary personal information;
  • complete files;
  • private identifiers;
  • internal paths;
  • confidential notes; or
  • unrelated conversation content.

Reduce the input before regular use.

Test local and remote behaviour

Confirm whether the server is:

  • local;
  • internal;
  • remote; or
  • connected to another external service.

Test network loss when remote access is expected.

Test offline use only when every required component is local.

Test scheduled use last

Schedule the workflow only after:

  • Workbench tests pass;
  • Studio tests pass;
  • RunFlows tests pass;
  • error paths work;
  • permissions are confirmed;
  • write destinations are verified;
  • duplicate prevention is in place; and
  • someone is assigned to review results.

Use a one-time schedule for the first test.

Test a one-time schedule

Schedule the flow for a near-future time.

Confirm that:

  • Feluda is available;
  • the computer is awake;
  • the model is available;
  • the MCP server is available;
  • authentication remains valid;
  • the expected tool is called;
  • the result appears;
  • the destination is correct; and
  • Schedule Manager records the run.

Review the complete result before creating a recurring schedule.

Test overlapping conditions

When recurring runs may overlap, test whether:

  • one run finishes before the next begins;
  • the server handles simultaneous calls;
  • the local computer has enough resources;
  • write actions use unique identifiers;
  • shared files remain safe; and
  • duplicate items are prevented.

Increase the spacing between runs when needed.

Record the test results

For important tools, record:

  • server;
  • tool;
  • model;
  • test date;
  • sample input;
  • expected result;
  • actual result;
  • permissions;
  • warnings;
  • errors;
  • destination;
  • reviewer; and
  • final decision.

This makes future changes easier to evaluate.
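A record with these fields can be kept as one JSON line per test, which makes later comparison straightforward. This is a minimal sketch; `ToolTestRecord` is a hypothetical structure, and the server name, model name, and date below are placeholder values.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToolTestRecord:
    server: str
    tool: str
    model: str
    test_date: str
    sample_input: str
    expected_result: str
    actual_result: str
    permissions: str
    destination: str
    reviewer: str
    final_decision: str
    warnings: list[str] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)


def to_json_line(record: ToolTestRecord) -> str:
    """Serialise one test record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = ToolTestRecord(
    server="example-kb-server",        # placeholder
    tool="Internal Knowledge Search",
    model="example-model",             # placeholder
    test_date="2024-01-15",            # placeholder
    sample_input="MCP connection test",
    expected_result="One document titled 'MCP Test Record'",
    actual_result="One document titled 'MCP Test Record'",
    permissions="read-only",
    destination="none",
    reviewer="test-reviewer",
    final_decision="approved for read-only use",
)
print(to_json_line(record))
```

Appending each line to a plain text file gives a history that can be checked whenever the server, model, or permissions change.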

Re-test after changes

Repeat testing when:

  • the endpoint changes;
  • authentication changes;
  • the server updates;
  • the tool name changes;
  • the output format changes;
  • permissions change;
  • the model changes;
  • the workflow changes;
  • the destination changes; or
  • repeated errors appear.

Do not rely on an old successful test after an important change.

Stop testing when the risk is unclear

Pause and review when:

  • the server owner is unknown;
  • the destination is unclear;
  • credentials appear in Activity;
  • permissions are broader than expected;
  • write actions cannot be reversed;
  • repeated calls appear;
  • the wrong environment is used;
  • sensitive data is exposed; or
  • the result cannot be verified.

Use an approved alternative only after testing it separately.

Final MCP tool test checklist

Before regular use, confirm that:

  • the server is trusted;
  • the endpoint is correct;
  • authentication is protected;
  • the tool appears;
  • the selected model uses it correctly;
  • read-only tests succeed;
  • no-match handling works;
  • invalid input is handled;
  • server and authentication errors are visible;
  • permission boundaries work;
  • write actions are reviewed;
  • destinations are verified;
  • timeouts do not create duplicates;
  • Studio and RunFlows tests succeed;
  • scheduled use has been tested separately; and
  • a reviewer is responsible for important results.

A tool is ready for regular use only when its normal, empty, error, and high-impact behaviours are understood.

Frequently Asked Questions

What should I test first with a new MCP tool?
Start with one read-only request using non-sensitive sample data, then review the Workbench Activity drawer to confirm the tool, input, output, warnings, and errors.

Why should I test an empty result?
It confirms that the tool and model can report no match without inventing information or treating the result as a system error.

How should I test an MCP write tool?
Use a draft-first process, approve one exact action, use a safe destination, inspect the external result, and test timeouts and duplicate prevention.

When is an MCP tool ready for scheduled use?
Only after Workbench, Studio, and RunFlows tests pass, error and no-result paths work, permissions are confirmed, destinations are verified, and a reviewer is assigned.