Qdrant MCP Server – Semantic Memory Search
Qdrant's official MCP server acts as a semantic memory layer on top of the Qdrant vector search engine. Use it to let AI agents store information with optional metadata and retrieve relevant memories or snippets using semantic search.
Overview
Qdrant's official MCP server connects compatible AI clients to the Qdrant
vector search engine. It is designed as a semantic memory layer: an agent can
store useful information and later retrieve relevant entries by meaning rather
than by exact keyword matching.
What the MCP server enables
The server exposes two core tools:
- `qdrant-store` stores information in a Qdrant collection with optional JSON
  metadata.
- `qdrant-find` retrieves relevant stored information from Qdrant using a
  natural-language query.
A default COLLECTION_NAME can be configured. When no default collection is
set, the collection name must be supplied with each tool call. The server
automatically creates the configured collection when needed. Embeddings are
generated with FastEmbed, which is currently the only supported provider, and
sentence-transformers/all-MiniLM-L6-v2 is the default embedding model.
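To make the two tools concrete, the sketch below builds the JSON-RPC bodies an MCP client might send for each one. The envelope follows MCP's `tools/call` method. The argument names (`information`, `metadata`, `query`, `collection_name`) and the helper function names are assumptions that this page does not state; check them against the schema the server reports via `tools/list` before relying on them.

```python
import json
from typing import Any, Optional


def tool_call(request_id: int, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build an MCP `tools/call` JSON-RPC 2.0 request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def store_request(
    request_id: int,
    information: str,
    metadata: Optional[dict[str, Any]] = None,
    collection_name: Optional[str] = None,
) -> dict[str, Any]:
    """Request body for qdrant-store (argument names are assumed, see above)."""
    arguments: dict[str, Any] = {"information": information}
    if metadata is not None:
        arguments["metadata"] = metadata
    # Only needed when the server has no default COLLECTION_NAME configured.
    if collection_name is not None:
        arguments["collection_name"] = collection_name
    return tool_call(request_id, "qdrant-store", arguments)


def find_request(
    request_id: int, query: str, collection_name: Optional[str] = None
) -> dict[str, Any]:
    """Request body for qdrant-find (argument names are assumed, see above)."""
    arguments: dict[str, Any] = {"query": query}
    if collection_name is not None:
        arguments["collection_name"] = collection_name
    return tool_call(request_id, "qdrant-find", arguments)


if __name__ == "__main__":
    print(json.dumps(store_request(1, "Use ruff for linting", {"topic": "tooling"}), indent=2))
    print(json.dumps(find_request(2, "which linter do we use?"), indent=2))
```

In practice an MCP client library constructs these envelopes for you; the sketch only shows what ends up on the wire and where the optional collection name fits.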
When to use it
Use Qdrant MCP when an AI agent needs durable semantic memory or reusable
retrieval over a curated knowledge base. Practical uses include storing project
decisions, code snippets, implementation notes, documentation fragments, support
answers, research findings, or team conventions, then searching them later by
intent. The tool descriptions can be customized to specialize the server for a
codebase, documentation collection, or workflow-specific memory store.
Connection and authentication
The standard local command is uvx mcp-server-qdrant. The default transport is
stdio. The same command can start SSE or Streamable HTTP by adding
--transport sse or --transport streamable-http. Server host and port use
FastMCP settings, with port 8000 as the documented default for network
transports.
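The transport choices above can be sketched as a small helper that assembles the launch command and, for network transports, the URL a client would connect to. The function names are hypothetical, and the `/sse` and `/mcp` paths are taken from the URLs listed under Supported Transports on this page; the host and port defaults assume an unmodified local setup.

```python
from typing import Optional

TRANSPORTS = ("stdio", "sse", "streamable-http")

# Endpoint paths as listed under "Supported Transports" on this page.
ENDPOINT_PATHS = {"sse": "/sse", "streamable-http": "/mcp"}


def server_command(transport: str = "stdio") -> list[str]:
    """Return the argv for starting the server with the given transport."""
    if transport not in TRANSPORTS:
        raise ValueError(f"unknown transport: {transport!r}")
    command = ["uvx", "mcp-server-qdrant"]
    # stdio is the default, so the flag is only added for network transports.
    if transport != "stdio":
        command += ["--transport", transport]
    return command


def endpoint_url(
    transport: str, host: str = "localhost", port: int = 8000
) -> Optional[str]:
    """Return the client URL for a network transport, or None for stdio."""
    path = ENDPOINT_PATHS.get(transport)
    if path is None:
        return None
    return f"http://{host}:{port}{path}"


if __name__ == "__main__":
    for name in TRANSPORTS:
        print(" ".join(server_command(name)), "->", endpoint_url(name))
```

The returned list can be passed to `subprocess.Popen` by a launcher script, which avoids shell quoting issues compared with building a single command string.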
Configure either QDRANT_URL for a running Qdrant server or
QDRANT_LOCAL_PATH for a local Qdrant database path, but not both. Use
QDRANT_API_KEY when connecting to a Qdrant server that requires API-key
authentication.
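A launcher script can enforce the "one or the other, but not both" rule before the server starts. The following is a minimal sketch with a hypothetical helper name; rejecting an API key in local mode is this sketch's own choice rather than documented server behavior.

```python
from typing import Optional


def qdrant_env(
    url: Optional[str] = None,
    local_path: Optional[str] = None,
    api_key: Optional[str] = None,
    collection_name: Optional[str] = None,
) -> dict[str, str]:
    """Build the environment variables for the server process."""
    # Exactly one of QDRANT_URL and QDRANT_LOCAL_PATH must be provided.
    if (url is None) == (local_path is None):
        raise ValueError("set exactly one of QDRANT_URL or QDRANT_LOCAL_PATH")
    # Sketch-level choice: an API key only makes sense for a server connection.
    if api_key is not None and url is None:
        raise ValueError("QDRANT_API_KEY applies only when QDRANT_URL is set")

    env: dict[str, str] = {}
    if url is not None:
        env["QDRANT_URL"] = url
    if local_path is not None:
        env["QDRANT_LOCAL_PATH"] = local_path
    if api_key is not None:
        env["QDRANT_API_KEY"] = api_key
    if collection_name is not None:
        env["COLLECTION_NAME"] = collection_name
    return env


if __name__ == "__main__":
    # The path and collection name are placeholder values for illustration.
    print(qdrant_env(local_path="/tmp/qdrant-data", collection_name="notes"))
```

The resulting dictionary can be merged into `os.environ` and passed as the `env` argument when spawning the server, so the API key never appears on the command line.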
Key considerations
The store tool writes data into Qdrant. Set QDRANT_READ_ONLY=true when an
agent should only retrieve existing memories. The server focuses on semantic
store and find workflows rather than full database administration, and only
FastEmbed models are supported for embeddings at this time. Keep API keys in a
protected environment, avoid storing secrets as memories, and restrict network
transports to trusted clients and networks. For Docker deployments, bind the
server only to the interfaces that need it and publish only the required port.
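As an illustration of two of these points, a client-side wrapper could refuse to call the store tool when read-only mode is configured or when the text looks like a credential. This is a hypothetical guard, not part of the server: the regular expressions are rough heuristics, and treating only the literal value `true` as read-only is an assumption about how the flag is set.

```python
import re

# Rough heuristics only; real secret scanning needs a dedicated tool.
SECRET_PATTERNS = [
    re.compile(r"(api[_-]?key|secret|password|token)\s*[:=]\s*\S+", re.IGNORECASE),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def looks_like_secret(text: str) -> bool:
    """Return True if the text matches any of the credential heuristics."""
    return any(pattern.search(text) for pattern in SECRET_PATTERNS)


def check_store_allowed(information: str, env: dict[str, str]) -> None:
    """Raise before a store call that should not be sent."""
    # Assumption: the launcher sets QDRANT_READ_ONLY to the literal "true".
    if env.get("QDRANT_READ_ONLY", "").strip().lower() == "true":
        raise PermissionError("server is configured read-only; store is disabled")
    if looks_like_secret(information):
        raise ValueError("refusing to store text that looks like a credential")


if __name__ == "__main__":
    check_store_allowed("We chose token bucket rate limiting.", {})
    print("ok to store")
```

A guard like this complements, rather than replaces, setting QDRANT_READ_ONLY on the server itself, since a client-side check is easy to bypass.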
Supported Transports
- stdio
  Command: uvx
  Args: mcp-server-qdrant
- sse
  URL: http://localhost:8000/sse
- streamable_http
  URL: http://localhost:8000/mcp
Frequently Asked Questions
- When should an AI agent use the Qdrant MCP server?
- Use it when an agent needs durable semantic memory or retrieval over stored context, such as project notes, code snippets, documentation fragments, implementation details, support answers, or reusable workflow knowledge.
- What does the Qdrant MCP server add to an AI agent's capabilities?
- It lets the agent store information in Qdrant and retrieve relevant entries by semantic similarity, giving the model access to persistent external memory instead of relying only on conversation history or static knowledge.
- What can an AI agent access or manage through Qdrant MCP?
- The official server exposes `qdrant-store` for writing information with optional metadata and `qdrant-find` for semantic retrieval. It is not a full Qdrant administration server and focuses on memory-style storage and search.
- How is authentication configured for the Qdrant MCP server?
- Configure the server with QDRANT_URL for a remote Qdrant instance or QDRANT_LOCAL_PATH for local mode. Set QDRANT_API_KEY when the target Qdrant server requires it. Store API keys securely and do not put secrets directly into stored memories.
- Which transport should be used for the Qdrant MCP server?
- Use stdio for local desktop or IDE clients. Use Streamable HTTP or SSE only when a network transport is required, and protect the endpoint because the server can retrieve and, unless read-only mode is enabled, store information in the configured Qdrant collection.