feat: add Qdrant local mode support with in-memory and persistent storage

Adds flexible Qdrant deployment modes to reduce infrastructure requirements
for local development and smaller deployments:

**Configuration Changes:**
- Add QDRANT_LOCATION environment variable (mutually exclusive with QDRANT_URL)
- Three modes: network (URL), in-memory (:memory:, default), persistent (file path)
- Settings dataclass validation via __post_init__ ensures mutual exclusivity
- Log a warning when QDRANT_API_KEY is set in local mode (the key is ignored; it applies only to network mode)

**Client Initialization:**
- Auto-detect mode: network (url + api_key) vs local (:memory: or path=)
- In-memory: AsyncQdrantClient(":memory:") - zero config default
- Persistent: AsyncQdrantClient(path="/app/data/qdrant") - file storage
- Network: AsyncQdrantClient(url, api_key) - production mode

**Docker Compose Updates:**
- Qdrant service moved to optional profile (--profile qdrant)
- MCP service uses QDRANT_LOCATION=:memory: by default
- Added mcp-data volume for persistent storage (/app/data)
- No hard dependency on qdrant service

**Documentation:**
- Comprehensive configuration guide in docs/configuration.md
- All three modes documented with pros/cons
- Docker Compose examples for each mode
- Environment variable reference table

**Tests:**
- 13 new config validation tests (mutual exclusivity, defaults, warnings)
- Persistent mode integration test (create, close, reopen, verify persistence)
- All 82 unit tests + 5 smoke tests pass

**Breaking Change:**
- Default changed from QDRANT_URL=http://qdrant:6333 to QDRANT_LOCATION=:memory:
- Simplifies local development (no external service needed)
- Production deployments that relied on the old default must now set QDRANT_URL explicitly (or QDRANT_LOCATION for local storage)

Related: ADR-007 background vector sync implementation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Chris Coutinho
2025-11-09 07:07:07 +01:00
parent 72232f937a
commit 857d8f2152
6 changed files with 465 additions and 17 deletions
+12 -5
@@ -74,10 +74,10 @@ services:
depends_on:
app:
condition: service_healthy
qdrant:
condition: service_healthy
ports:
- 127.0.0.1:8000:8000
volumes:
- mcp-data:/app/data
environment:
- NEXTCLOUD_HOST=http://app:80
- NEXTCLOUD_USERNAME=admin
@@ -88,9 +88,13 @@ services:
- VECTOR_SYNC_SCAN_INTERVAL=10
- VECTOR_SYNC_PROCESSOR_WORKERS=1
# Qdrant configuration
- QDRANT_URL=http://qdrant:6333
- QDRANT_API_KEY=${QDRANT_API_KEY:-my_secret_api_key}
# Qdrant configuration (three modes):
# 1. Network mode: Set QDRANT_URL=http://qdrant:6333 (requires qdrant service)
# 2. In-memory mode: Set QDRANT_LOCATION=:memory: (default if nothing set)
# 3. Persistent local: Set QDRANT_LOCATION=/app/data/qdrant (stored in mcp-data volume)
- QDRANT_LOCATION=:memory:
# - QDRANT_URL=http://qdrant:6333 # Uncomment for network mode
# - QDRANT_API_KEY=${QDRANT_API_KEY:-my_secret_api_key} # Only for network mode
- QDRANT_COLLECTION=nextcloud_content
# Ollama configuration (optional - uses SimpleEmbeddingProvider if not set)
@@ -215,6 +219,8 @@ services:
interval: 10s
timeout: 5s
retries: 10
profiles:
- qdrant
volumes:
nextcloud:
@@ -224,3 +230,4 @@ volumes:
keycloak-tokens:
keycloak-oauth-storage:
qdrant-data:
mcp-data:
+152
@@ -108,6 +108,158 @@ NEXTCLOUD_PASSWORD=your_app_password_or_password
---
## Semantic Search Configuration (Optional)
The MCP server includes semantic search capabilities powered by vector embeddings. This feature requires a vector database (Qdrant) and an embedding service.
### Qdrant Vector Database Modes
The server supports three Qdrant deployment modes:
1. **In-Memory Mode** (Default) - Simplest for development and testing
2. **Persistent Local Mode** - For single-instance deployments with persistence
3. **Network Mode** - For production with dedicated Qdrant service
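The selection rules for these three modes can be sketched as a small pure function. This is an illustrative sketch rather than the server's actual code: the name `resolve_qdrant_mode` and the returned mode labels are hypothetical, and an empty value is assumed to count as unset.

```python
from typing import Mapping, Optional


def resolve_qdrant_mode(env: Mapping[str, str]) -> str:
    """Return "network", "memory", or "persistent" for the given environment."""
    # Assumption: an empty string is treated the same as an unset variable.
    url: Optional[str] = env.get("QDRANT_URL") or None
    location: Optional[str] = env.get("QDRANT_LOCATION") or None

    # The two variables are mutually exclusive.
    if url and location:
        raise ValueError("Cannot set both QDRANT_URL and QDRANT_LOCATION.")
    if url:
        return "network"
    # Neither variable set, or an explicit :memory:, selects in-memory mode.
    if location is None or location == ":memory:":
        return "memory"
    # Any other location is treated as a filesystem path.
    return "persistent"


if __name__ == "__main__":
    print(resolve_qdrant_mode({}))
    print(resolve_qdrant_mode({"QDRANT_LOCATION": "/app/data/qdrant"}))
    print(resolve_qdrant_mode({"QDRANT_URL": "http://qdrant:6333"}))
```

Keeping the decision in one function makes the precedence easy to test in isolation, independent of any Qdrant client.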
#### 1. In-Memory Mode (Default)
No configuration needed! If neither `QDRANT_URL` nor `QDRANT_LOCATION` is set, the server defaults to in-memory mode:
```dotenv
# No Qdrant configuration needed - defaults to :memory:
VECTOR_SYNC_ENABLED=true
```
**Pros:**
- Zero configuration
- Fast startup
- Perfect for testing
**Cons:**
- Data lost on restart
- Limited to available RAM
#### 2. Persistent Local Mode
For single-instance deployments that need persistence without a separate Qdrant service:
```dotenv
# Local persistent storage
QDRANT_LOCATION=/app/data/qdrant # Or any writable path
VECTOR_SYNC_ENABLED=true
```
**Pros:**
- Data persists across restarts
- No separate service needed
- Suitable for small/medium deployments
**Cons:**
- Limited to single instance
- Shares resources with MCP server
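Because persistent mode depends on a writable path, a startup check along the following lines can surface permission problems early. This is a minimal sketch using only the standard library; `ensure_writable_dir` is a hypothetical helper and not part of the server.

```python
import tempfile
from pathlib import Path


def ensure_writable_dir(path: str) -> Path:
    """Create `path` if needed and confirm a file can be written inside it."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    # Creating and removing a temporary file exercises real write access,
    # which is more reliable than inspecting permission bits.
    with tempfile.NamedTemporaryFile(dir=directory):
        pass
    return directory


if __name__ == "__main__":
    # Demonstrated against a throwaway directory instead of /app/data/qdrant.
    with tempfile.TemporaryDirectory() as tmp:
        print(ensure_writable_dir(str(Path(tmp) / "qdrant")))
```

If the directory cannot be created or written, the function raises an `OSError` subclass such as `PermissionError`, which a caller could report before the vector sync starts.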
#### 3. Network Mode
For production deployments with a dedicated Qdrant service:
```dotenv
# Network mode configuration
QDRANT_URL=http://qdrant:6333
QDRANT_API_KEY=your-secret-api-key # Optional
QDRANT_COLLECTION=nextcloud_content # Optional
VECTOR_SYNC_ENABLED=true
```
**Pros:**
- Scalable and performant
- Can be shared across multiple MCP instances
- Supports clustering and replication
**Cons:**
- Requires separate Qdrant service
- More complex deployment
### Vector Sync Configuration
Control background indexing behavior:
```dotenv
# Vector sync settings (ADR-007)
VECTOR_SYNC_ENABLED=true # Enable background indexing
VECTOR_SYNC_SCAN_INTERVAL=300 # Scan interval in seconds (default: 5 minutes)
VECTOR_SYNC_PROCESSOR_WORKERS=3 # Concurrent indexing workers (default: 3)
VECTOR_SYNC_QUEUE_MAX_SIZE=10000 # Max queued documents (default: 10000)
```
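As a rough illustration of how these variables and their documented defaults map to typed values, consider the sketch below. `VectorSyncConfig` and `load_vector_sync_config` are hypothetical names, and the boolean parsing (only the string `true`, case-insensitive, enables the feature) is an assumption, not the server's confirmed behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class VectorSyncConfig:
    enabled: bool = False
    scan_interval: int = 300        # seconds
    processor_workers: int = 3
    queue_max_size: int = 10000


def load_vector_sync_config(env: Mapping[str, str] = os.environ) -> VectorSyncConfig:
    """Read the VECTOR_SYNC_* variables, falling back to the documented defaults."""
    return VectorSyncConfig(
        # Assumption: only "true" (any letter case) turns the feature on.
        enabled=env.get("VECTOR_SYNC_ENABLED", "false").strip().lower() == "true",
        scan_interval=int(env.get("VECTOR_SYNC_SCAN_INTERVAL", "300")),
        processor_workers=int(env.get("VECTOR_SYNC_PROCESSOR_WORKERS", "3")),
        queue_max_size=int(env.get("VECTOR_SYNC_QUEUE_MAX_SIZE", "10000")),
    )


if __name__ == "__main__":
    print(load_vector_sync_config({"VECTOR_SYNC_ENABLED": "true"}))
```

A non-numeric value such as `VECTOR_SYNC_SCAN_INTERVAL=fast` raises `ValueError` in this sketch, so misconfiguration fails at load time.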
### Embedding Service Configuration
The server uses an embedding service to generate vector representations. Two options are available:
#### Ollama (Recommended)
Use a local Ollama instance for embeddings:
```dotenv
OLLAMA_BASE_URL=http://ollama:11434
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Default model
OLLAMA_VERIFY_SSL=true # Verify SSL certificates
```
#### Simple Embedding Provider (Fallback)
If `OLLAMA_BASE_URL` is not set, the server uses a simple random embedding provider for testing. This is **not suitable for production** as it generates random embeddings with no semantic meaning.
### Environment Variables Reference
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `QDRANT_URL` | ⚠️ Optional | - | Qdrant service URL (network mode) - mutually exclusive with `QDRANT_LOCATION` |
| `QDRANT_LOCATION` | ⚠️ Optional | `:memory:` (when `QDRANT_URL` is unset) | Local Qdrant storage: `:memory:` or a filesystem path such as `/path/to/data`; mutually exclusive with `QDRANT_URL` |
| `QDRANT_API_KEY` | ⚠️ Optional | - | Qdrant API key (network mode only) |
| `QDRANT_COLLECTION` | ⚠️ Optional | `nextcloud_content` | Qdrant collection name |
| `VECTOR_SYNC_ENABLED` | ⚠️ Optional | `false` | Enable background vector indexing |
| `VECTOR_SYNC_SCAN_INTERVAL` | ⚠️ Optional | `300` | Document scan interval (seconds) |
| `VECTOR_SYNC_PROCESSOR_WORKERS` | ⚠️ Optional | `3` | Concurrent indexing workers |
| `VECTOR_SYNC_QUEUE_MAX_SIZE` | ⚠️ Optional | `10000` | Max queued documents |
| `OLLAMA_BASE_URL` | ⚠️ Optional | - | Ollama API endpoint for embeddings |
| `OLLAMA_EMBEDDING_MODEL` | ⚠️ Optional | `nomic-embed-text` | Embedding model to use |
| `OLLAMA_VERIFY_SSL` | ⚠️ Optional | `true` | Verify SSL certificates |
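The table notes that `QDRANT_API_KEY` applies to network mode only. A pre-start check for that case might look like the sketch below. The function name `qdrant_env_warnings` is hypothetical, an empty value is assumed to count as unset, and the sketch deliberately does not check the mutual exclusivity of `QDRANT_URL` and `QDRANT_LOCATION`.

```python
from typing import Mapping


def qdrant_env_warnings(env: Mapping[str, str]) -> list[str]:
    """Return human-readable warnings for a Qdrant-related environment."""
    warnings: list[str] = []
    # Network mode is selected by a non-empty QDRANT_URL.
    in_network_mode = bool(env.get("QDRANT_URL"))
    if env.get("QDRANT_API_KEY") and not in_network_mode:
        warnings.append(
            "QDRANT_API_KEY is set but will be ignored outside network mode."
        )
    return warnings


if __name__ == "__main__":
    for message in qdrant_env_warnings({"QDRANT_API_KEY": "example-key"}):
        print(message)
```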
### Docker Compose Example
To run Qdrant in network mode with Docker Compose, point the MCP service at the optional `qdrant` service:
```yaml
services:
mcp:
environment:
- QDRANT_URL=http://qdrant:6333
- VECTOR_SYNC_ENABLED=true
qdrant:
image: qdrant/qdrant:latest
ports:
- 127.0.0.1:6333:6333
volumes:
- qdrant-data:/qdrant/storage
profiles:
- qdrant # Optional service
volumes:
qdrant-data:
```
Start with Qdrant service:
```bash
docker-compose --profile qdrant up
```
Or use default in-memory mode (no `--profile` needed):
```bash
docker-compose up
```
---
## Loading Environment Variables
After creating your `.env` file, load the environment variables:
+29 -3
@@ -1,3 +1,4 @@
import logging
import logging.config
import os
from dataclasses import dataclass
@@ -162,8 +163,9 @@ class Settings:
vector_sync_processor_workers: int = 3
vector_sync_queue_max_size: int = 10000
# Qdrant settings
qdrant_url: str = "http://qdrant:6333"
# Qdrant settings (mutually exclusive modes)
qdrant_url: Optional[str] = None # Network mode: http://qdrant:6333
qdrant_location: Optional[str] = None # Local mode: :memory: or /path/to/data
qdrant_api_key: Optional[str] = None
qdrant_collection: str = "nextcloud_content"
@@ -172,6 +174,29 @@ class Settings:
ollama_embedding_model: str = "nomic-embed-text"
ollama_verify_ssl: bool = True
def __post_init__(self):
"""Validate Qdrant configuration and set defaults."""
logger = logging.getLogger(__name__)
# Ensure mutual exclusivity
if self.qdrant_url and self.qdrant_location:
raise ValueError(
"Cannot set both QDRANT_URL and QDRANT_LOCATION. "
"Use QDRANT_URL for network mode or QDRANT_LOCATION for local mode."
)
# Default to :memory: if neither set
if not self.qdrant_url and not self.qdrant_location:
self.qdrant_location = ":memory:"
logger.info("Using default Qdrant mode: in-memory (:memory:)")
# Warn if API key set in local mode
if self.qdrant_location and self.qdrant_api_key:
logger.warning(
"QDRANT_API_KEY is set but QDRANT_LOCATION is used (local mode). "
"API key is only relevant for network mode and will be ignored."
)
def get_settings() -> Settings:
"""Get application settings from environment variables.
@@ -220,7 +245,8 @@ def get_settings() -> Settings:
os.getenv("VECTOR_SYNC_QUEUE_MAX_SIZE", "10000")
),
# Qdrant settings
qdrant_url=os.getenv("QDRANT_URL", "http://qdrant:6333"),
qdrant_url=os.getenv("QDRANT_URL"),
qdrant_location=os.getenv("QDRANT_LOCATION"),
qdrant_api_key=os.getenv("QDRANT_API_KEY"),
qdrant_collection=os.getenv("QDRANT_COLLECTION", "nextcloud_content"),
# Ollama settings
+31 -9
@@ -1,11 +1,12 @@
"""Qdrant client wrapper."""
import logging
import os
from qdrant_client import AsyncQdrantClient
from qdrant_client.models import Distance, VectorParams
from nextcloud_mcp_server.config import get_settings
logger = logging.getLogger(__name__)
@@ -19,6 +20,11 @@ async def get_qdrant_client() -> AsyncQdrantClient:
Automatically creates collection on first use if it doesn't exist.
Supports three Qdrant modes:
- Network mode: QDRANT_URL set (e.g., http://qdrant:6333)
- In-memory mode: QDRANT_LOCATION=:memory: (default if nothing configured)
- Persistent local mode: QDRANT_LOCATION=/path/to/data
Returns:
Configured AsyncQdrantClient instance
@@ -28,17 +34,33 @@ async def get_qdrant_client() -> AsyncQdrantClient:
global _qdrant_client
if _qdrant_client is None:
url = os.getenv("QDRANT_URL", "http://qdrant:6333")
api_key = os.getenv("QDRANT_API_KEY")
settings = get_settings()
_qdrant_client = AsyncQdrantClient(
url=url,
api_key=api_key,
timeout=30,
)
# Detect mode and initialize client accordingly
if settings.qdrant_url:
# Network mode
logger.info(f"Using Qdrant network mode: {settings.qdrant_url}")
_qdrant_client = AsyncQdrantClient(
url=settings.qdrant_url,
api_key=settings.qdrant_api_key,
timeout=30,
)
elif settings.qdrant_location:
# Local mode (either :memory: or persistent path)
if settings.qdrant_location == ":memory:":
logger.info("Using Qdrant in-memory mode: :memory:")
_qdrant_client = AsyncQdrantClient(":memory:")
else:
# Persistent local mode - use path parameter
logger.info(f"Using Qdrant persistent mode: {settings.qdrant_location}")
_qdrant_client = AsyncQdrantClient(path=settings.qdrant_location)
else:
# Should not happen due to __post_init__ validation, but handle gracefully
logger.warning("No Qdrant mode configured, defaulting to :memory:")
_qdrant_client = AsyncQdrantClient(":memory:")
# Ensure collection exists
collection_name = os.getenv("QDRANT_COLLECTION", "nextcloud_content")
collection_name = settings.qdrant_collection
# Import here to avoid circular dependency
from nextcloud_mcp_server.embedding import get_embedding_service
+88
@@ -10,6 +10,9 @@ Uses SimpleEmbeddingProvider for deterministic, in-process embeddings
without requiring external services like Ollama.
"""
import tempfile
from pathlib import Path
import pytest
from qdrant_client import AsyncQdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
@@ -342,3 +345,88 @@ async def test_batch_embedding(simple_embedding_provider: SimpleEmbeddingProvide
for emb in embeddings:
norm = math.sqrt(sum(x * x for x in emb))
assert abs(norm - 1.0) < 1e-6
async def test_qdrant_persistent_mode(
simple_embedding_provider: SimpleEmbeddingProvider,
sample_notes: list[dict],
):
"""Test Qdrant in persistent local mode with file storage."""
with tempfile.TemporaryDirectory() as tmpdir:
storage_path = Path(tmpdir) / "qdrant_data"
# Create first client with persistent storage using path parameter
client1 = AsyncQdrantClient(path=str(storage_path))
try:
collection_name = "test_persistent"
# Create collection and index notes
await client1.create_collection(
collection_name=collection_name,
vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)
# Index sample notes
points = []
for note in sample_notes:
content = f"{note['title']}\n\n{note['content']}"
embedding = await simple_embedding_provider.embed(content)
points.append(
PointStruct(
id=note["id"],
vector=embedding,
payload={
"note_id": note["id"],
"title": note["title"],
"category": note["category"],
},
)
)
await client1.upsert(
collection_name=collection_name, points=points, wait=True
)
# Verify data was written
count_result = await client1.count(collection_name=collection_name)
assert count_result.count == len(sample_notes)
# Close first client
await client1.close()
# Create new client with same storage path
client2 = AsyncQdrantClient(path=str(storage_path))
try:
# Data should persist - verify collection exists
collections = await client2.get_collections()
collection_names = [c.name for c in collections.collections]
assert collection_name in collection_names
# Verify indexed data persisted
count_result = await client2.count(collection_name=collection_name)
assert count_result.count == len(sample_notes)
# Verify search still works
query = "Python programming"
query_embedding = await simple_embedding_provider.embed(query)
response = await client2.query_points(
collection_name=collection_name,
query=query_embedding,
limit=3,
)
# Should find Python note as top result
assert len(response.points) > 0
assert response.points[0].payload["note_id"] == 1
finally:
await client2.close()
finally:
# Cleanup
await client1.close()
+153
@@ -0,0 +1,153 @@
"""Tests for configuration validation."""
import os
from unittest.mock import patch
import pytest
from nextcloud_mcp_server.config import Settings, get_settings
class TestQdrantConfigValidation:
"""Test Qdrant configuration validation."""
def test_mutually_exclusive_url_and_location(self):
"""Test that setting both QDRANT_URL and QDRANT_LOCATION raises ValueError."""
with pytest.raises(
ValueError,
match="Cannot set both QDRANT_URL and QDRANT_LOCATION",
):
Settings(
qdrant_url="http://qdrant:6333",
qdrant_location="/app/data/qdrant",
)
def test_default_to_memory_mode(self):
"""Test that :memory: is used when neither URL nor location is set."""
settings = Settings()
assert settings.qdrant_location == ":memory:"
assert settings.qdrant_url is None
def test_network_mode_only(self):
"""Test network mode with only URL set."""
settings = Settings(qdrant_url="http://qdrant:6333")
assert settings.qdrant_url == "http://qdrant:6333"
assert settings.qdrant_location is None
def test_local_mode_only(self):
"""Test local mode with only location set."""
settings = Settings(qdrant_location="/app/data/qdrant")
assert settings.qdrant_location == "/app/data/qdrant"
assert settings.qdrant_url is None
def test_in_memory_mode_explicit(self):
"""Test explicit in-memory mode."""
settings = Settings(qdrant_location=":memory:")
assert settings.qdrant_location == ":memory:"
assert settings.qdrant_url is None
def test_api_key_warning_in_local_mode(self, caplog):
"""Test that API key in local mode triggers warning."""
import logging
caplog.set_level(logging.WARNING, logger="nextcloud_mcp_server.config")
Settings(
qdrant_location=":memory:",
qdrant_api_key="test-api-key",
)
assert "API key is only relevant for network mode" in caplog.text
def test_api_key_no_warning_in_network_mode(self, caplog):
"""Test that API key in network mode doesn't trigger warning."""
import logging
caplog.set_level(logging.WARNING, logger="nextcloud_mcp_server.config")
Settings(
qdrant_url="http://qdrant:6333",
qdrant_api_key="test-api-key",
)
assert "API key is only relevant for network mode" not in caplog.text
class TestGetSettings:
"""Test get_settings() function with environment variables."""
@patch.dict(os.environ, {}, clear=True)
def test_get_settings_defaults_to_memory(self):
"""Test get_settings() defaults to :memory: when no env vars set."""
settings = get_settings()
assert settings.qdrant_location == ":memory:"
assert settings.qdrant_url is None
@patch.dict(
os.environ,
{
"QDRANT_URL": "http://qdrant:6333",
"QDRANT_API_KEY": "test-key",
},
clear=True,
)
def test_get_settings_network_mode(self):
"""Test get_settings() with network mode env vars."""
settings = get_settings()
assert settings.qdrant_url == "http://qdrant:6333"
assert settings.qdrant_api_key == "test-key"
assert settings.qdrant_location is None
@patch.dict(
os.environ,
{"QDRANT_LOCATION": "/app/data/qdrant"},
clear=True,
)
def test_get_settings_persistent_mode(self):
"""Test get_settings() with persistent local mode env vars."""
settings = get_settings()
assert settings.qdrant_location == "/app/data/qdrant"
assert settings.qdrant_url is None
@patch.dict(
os.environ,
{"QDRANT_LOCATION": ":memory:"},
clear=True,
)
def test_get_settings_explicit_memory(self):
"""Test get_settings() with explicit :memory: env var."""
settings = get_settings()
assert settings.qdrant_location == ":memory:"
assert settings.qdrant_url is None
@patch.dict(
os.environ,
{
"QDRANT_URL": "http://qdrant:6333",
"QDRANT_LOCATION": "/app/data/qdrant",
},
clear=True,
)
def test_get_settings_mutual_exclusion_error(self):
"""Test get_settings() raises error when both URL and location set."""
with pytest.raises(
ValueError,
match="Cannot set both QDRANT_URL and QDRANT_LOCATION",
):
get_settings()
@patch.dict(
os.environ,
{
"QDRANT_COLLECTION": "test_collection",
"VECTOR_SYNC_ENABLED": "true",
"VECTOR_SYNC_SCAN_INTERVAL": "600",
"VECTOR_SYNC_PROCESSOR_WORKERS": "5",
"VECTOR_SYNC_QUEUE_MAX_SIZE": "5000",
},
clear=True,
)
def test_get_settings_vector_sync_config(self):
"""Test get_settings() with vector sync configuration."""
settings = get_settings()
assert settings.qdrant_collection == "test_collection"
assert settings.vector_sync_enabled is True
assert settings.vector_sync_scan_interval == 600
assert settings.vector_sync_processor_workers == 5
assert settings.vector_sync_queue_max_size == 5000