From 857d8f21528a00ac221ed2b60530027668cfbad8 Mon Sep 17 00:00:00 2001
From: Chris Coutinho <chris@coutinho.io>
Date: Sun, 9 Nov 2025 07:07:07 +0100
Subject: [PATCH] feat: add Qdrant local mode support with in-memory and
 persistent storage
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds flexible Qdrant deployment modes to reduce infrastructure requirements
for local development and smaller deployments:

**Configuration Changes:**
- Add QDRANT_LOCATION environment variable (mutually exclusive with QDRANT_URL)
- Three modes: network (URL), in-memory (:memory:, default), persistent (file path)
- Settings dataclass validation via __post_init__ raises ValueError if both are set
- Logs a warning if QDRANT_API_KEY is set in local mode (the key is ignored; it applies only to network mode)
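
The selection rules listed above can be sketched as a standalone function. This is
an illustration of the intended behavior only, not the code in config.py:
`resolve_qdrant_mode` is a hypothetical helper name, and it reads a plain mapping
instead of the Settings dataclass.

```python
from typing import Mapping, Optional, Tuple


def resolve_qdrant_mode(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (mode, target) for an environment mapping.

    Hypothetical helper for illustration; it mirrors the rules in this
    commit message rather than the actual Settings implementation.
    """
    url = env.get("QDRANT_URL")
    location = env.get("QDRANT_LOCATION")

    # The two variables are mutually exclusive.
    if url and location:
        raise ValueError("Cannot set both QDRANT_URL and QDRANT_LOCATION.")

    if url:
        return ("network", url)

    # Neither variable set, or an explicit :memory:, means in-memory mode.
    if not location or location == ":memory:":
        return ("in-memory", ":memory:")

    # Any other location value is treated as a filesystem path.
    return ("persistent", location)
```

Passing `os.environ` as `env` would report the mode a given deployment resolves to.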

**Client Initialization:**
- Auto-detect mode: network (url + api_key) vs local (:memory: or path=)
- In-memory: AsyncQdrantClient(":memory:") - zero config default
- Persistent: AsyncQdrantClient(path="/app/data/qdrant") - file storage
- Network: AsyncQdrantClient(url, api_key) - production mode
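
As a rough sketch of the mapping above, the three modes differ only in which
keyword arguments reach the client constructor. `qdrant_client_kwargs` is a
hypothetical helper, not part of this patch, and `location=":memory:"` is assumed
to be equivalent to the positional form `AsyncQdrantClient(":memory:")` used here.

```python
from typing import Any, Dict, Optional


def qdrant_client_kwargs(
    url: Optional[str],
    location: Optional[str],
    api_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for AsyncQdrantClient (hypothetical helper)."""
    if url:
        # Network mode: the API key is only passed along here.
        return {"url": url, "api_key": api_key, "timeout": 30}
    if location and location != ":memory:":
        # Persistent local mode: data is stored under this path.
        return {"path": location}
    # In-memory mode, also the fallback when nothing is configured.
    return {"location": ":memory:"}
```

A caller could then write `AsyncQdrantClient(**qdrant_client_kwargs(url, location, api_key))`,
assuming mutual exclusivity of `url` and `location` has already been validated.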

**Docker Compose Updates:**
- Qdrant service moved to optional profile (--profile qdrant)
- MCP service uses QDRANT_LOCATION=:memory: by default
- Added mcp-data volume for persistent storage (/app/data)
- No hard dependency on qdrant service

**Documentation:**
- Comprehensive configuration guide in docs/configuration.md
- All three modes documented with pros/cons
- Docker Compose examples for each mode
- Environment variable reference table

**Tests:**
- 13 new config validation tests (mutual exclusivity, defaults, warnings)
- Persistent mode integration test (create, close, reopen, verify persistence)
- All 82 unit tests + 5 smoke tests pass

**Breaking Change:**
- Default changed from QDRANT_URL=http://qdrant:6333 to QDRANT_LOCATION=:memory:
- Simplifies local development (no external service needed)
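- Production deployments must explicitly set QDRANT_URL (network mode) or a file path in QDRANT_LOCATION (persistent mode); otherwise vectors are kept in memory and lost on restart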

Related: ADR-007 background vector sync implementation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
---
 docker-compose.yml                           |  17 ++-
 docs/configuration.md                        | 152 ++++++++++++++++++
 nextcloud_mcp_server/config.py               |  32 +++-
 nextcloud_mcp_server/vector/qdrant_client.py |  40 +++--
 tests/integration/test_semantic_search.py    |  88 +++++++++++
 tests/unit/test_config.py                    | 153 +++++++++++++++++++
 6 files changed, 465 insertions(+), 17 deletions(-)
 create mode 100644 tests/unit/test_config.py

diff --git a/docker-compose.yml b/docker-compose.yml
index 9b62183..6db717e 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -74,10 +74,10 @@ services:
     depends_on:
       app:
         condition: service_healthy
-      qdrant:
-        condition: service_healthy
     ports:
       - 127.0.0.1:8000:8000
+    volumes:
+      - mcp-data:/app/data
     environment:
       - NEXTCLOUD_HOST=http://app:80
       - NEXTCLOUD_USERNAME=admin
@@ -88,9 +88,13 @@ services:
       - VECTOR_SYNC_SCAN_INTERVAL=10
       - VECTOR_SYNC_PROCESSOR_WORKERS=1
 
-      # Qdrant configuration
-      - QDRANT_URL=http://qdrant:6333
-      - QDRANT_API_KEY=${QDRANT_API_KEY:-my_secret_api_key}
+      # Qdrant configuration (three modes):
+      # 1. Network mode: Set QDRANT_URL=http://qdrant:6333 (requires qdrant service)
+      # 2. In-memory mode: Set QDRANT_LOCATION=:memory: (default if nothing set)
+      # 3. Persistent local: Set QDRANT_LOCATION=/app/data/qdrant (stored in mcp-data volume)
+      - QDRANT_LOCATION=:memory:
+      # - QDRANT_URL=http://qdrant:6333  # Uncomment for network mode
+      # - QDRANT_API_KEY=${QDRANT_API_KEY:-my_secret_api_key}  # Only for network mode
       - QDRANT_COLLECTION=nextcloud_content
 
       # Ollama configuration (optional - uses SimpleEmbeddingProvider if not set)
@@ -215,6 +219,8 @@ services:
       interval: 10s
       timeout: 5s
       retries: 10
+    profiles:
+      - qdrant
 
 volumes:
   nextcloud:
@@ -224,3 +230,4 @@ volumes:
   keycloak-tokens:
   keycloak-oauth-storage:
   qdrant-data:
+  mcp-data:
diff --git a/docs/configuration.md b/docs/configuration.md
index 72100e8..8ae452f 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -108,6 +108,158 @@ NEXTCLOUD_PASSWORD=your_app_password_or_password
 
 ---
 
+## Semantic Search Configuration (Optional)
+
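+The MCP server includes semantic search capabilities powered by vector embeddings. This feature uses Qdrant as its vector database, either embedded in the server process (local modes) or as a separate service (network mode), together with an embedding provider.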
+
+### Qdrant Vector Database Modes
+
+The server supports three Qdrant deployment modes:
+
+1. **In-Memory Mode** (Default) - Simplest for development and testing
+2. **Persistent Local Mode** - For single-instance deployments with persistence
+3. **Network Mode** - For production with dedicated Qdrant service
+
+#### 1. In-Memory Mode (Default)
+
+No configuration needed! If neither `QDRANT_URL` nor `QDRANT_LOCATION` is set, the server defaults to in-memory mode:
+
+```dotenv
+# No Qdrant configuration needed - defaults to :memory:
+VECTOR_SYNC_ENABLED=true
+```
+
+**Pros:**
+- Zero configuration
+- Fast startup
+- Perfect for testing
+
+**Cons:**
+- Data lost on restart
+- Limited to available RAM
+
+#### 2. Persistent Local Mode
+
+For single-instance deployments that need persistence without a separate Qdrant service:
+
+```dotenv
+# Local persistent storage
+QDRANT_LOCATION=/app/data/qdrant  # Or any writable path
+VECTOR_SYNC_ENABLED=true
+```
+
+**Pros:**
+- Data persists across restarts
+- No separate service needed
+- Suitable for small/medium deployments
+
+**Cons:**
+- Limited to single instance
+- Shares resources with MCP server
+
+#### 3. Network Mode
+
+For production deployments with a dedicated Qdrant service:
+
+```dotenv
+# Network mode configuration
+QDRANT_URL=http://qdrant:6333
+QDRANT_API_KEY=your-secret-api-key  # Optional
+QDRANT_COLLECTION=nextcloud_content  # Optional
+VECTOR_SYNC_ENABLED=true
+```
+
+**Pros:**
+- Scalable and performant
+- Can be shared across multiple MCP instances
+- Supports clustering and replication
+
+**Cons:**
+- Requires separate Qdrant service
+- More complex deployment
+
+### Vector Sync Configuration
+
+Control background indexing behavior:
+
+```dotenv
+# Vector sync settings (ADR-007)
+VECTOR_SYNC_ENABLED=true              # Enable background indexing
+VECTOR_SYNC_SCAN_INTERVAL=300         # Scan interval in seconds (default: 5 minutes)
+VECTOR_SYNC_PROCESSOR_WORKERS=3       # Concurrent indexing workers (default: 3)
+VECTOR_SYNC_QUEUE_MAX_SIZE=10000      # Max queued documents (default: 10000)
+```
+
+### Embedding Service Configuration
+
+The server uses an embedding service to generate vector representations. Two options are available:
+
+#### Ollama (Recommended)
+
+Use a local Ollama instance for embeddings:
+
+```dotenv
+OLLAMA_BASE_URL=http://ollama:11434
+OLLAMA_EMBEDDING_MODEL=nomic-embed-text  # Default model
+OLLAMA_VERIFY_SSL=true                   # Verify SSL certificates
+```
+
+#### Simple Embedding Provider (Fallback)
+
+If `OLLAMA_BASE_URL` is not set, the server falls back to `SimpleEmbeddingProvider`, a simple in-process embedding provider intended for testing. This is **not suitable for production** as its embeddings do not carry real semantic meaning.
+
+### Environment Variables Reference
+
+| Variable | Required | Default | Description |
+|----------|----------|---------|-------------|
+| `QDRANT_URL` | ⚠️ Optional | - | Qdrant service URL (network mode) - mutually exclusive with `QDRANT_LOCATION` |
+| `QDRANT_LOCATION` | ⚠️ Optional | `:memory:` | Local Qdrant path (`:memory:` or `/path/to/data`) - mutually exclusive with `QDRANT_URL` |
+| `QDRANT_API_KEY` | ⚠️ Optional | - | Qdrant API key (network mode only) |
+| `QDRANT_COLLECTION` | ⚠️ Optional | `nextcloud_content` | Qdrant collection name |
+| `VECTOR_SYNC_ENABLED` | ⚠️ Optional | `false` | Enable background vector indexing |
+| `VECTOR_SYNC_SCAN_INTERVAL` | ⚠️ Optional | `300` | Document scan interval (seconds) |
+| `VECTOR_SYNC_PROCESSOR_WORKERS` | ⚠️ Optional | `3` | Concurrent indexing workers |
+| `VECTOR_SYNC_QUEUE_MAX_SIZE` | ⚠️ Optional | `10000` | Max queued documents |
+| `OLLAMA_BASE_URL` | ⚠️ Optional | - | Ollama API endpoint for embeddings |
+| `OLLAMA_EMBEDDING_MODEL` | ⚠️ Optional | `nomic-embed-text` | Embedding model to use |
+| `OLLAMA_VERIFY_SSL` | ⚠️ Optional | `true` | Verify SSL certificates |
+
+### Docker Compose Example
+
+To run Qdrant in network mode with Docker Compose:
+
+```yaml
+services:
+  mcp:
+    environment:
+      - QDRANT_URL=http://qdrant:6333
+      - VECTOR_SYNC_ENABLED=true
+
+  qdrant:
+    image: qdrant/qdrant:latest
+    ports:
+      - 127.0.0.1:6333:6333
+    volumes:
+      - qdrant-data:/qdrant/storage
+    profiles:
+      - qdrant  # Optional service
+
+volumes:
+  qdrant-data:
+```
+
+Start with the Qdrant service:
+```bash
+docker-compose --profile qdrant up
+```
+
+Or use the default in-memory mode (no `--profile` needed):
+```bash
+docker-compose up
+```
+
+---
+
 ## Loading Environment Variables
 
 After creating your `.env` file, load the environment variables:
diff --git a/nextcloud_mcp_server/config.py b/nextcloud_mcp_server/config.py
index fd50504..66cc2a2 100644
--- a/nextcloud_mcp_server/config.py
+++ b/nextcloud_mcp_server/config.py
@@ -1,3 +1,4 @@
+import logging
 import logging.config
 import os
 from dataclasses import dataclass
@@ -162,8 +163,9 @@ class Settings:
     vector_sync_processor_workers: int = 3
     vector_sync_queue_max_size: int = 10000
 
-    # Qdrant settings
-    qdrant_url: str = "http://qdrant:6333"
+    # Qdrant settings (mutually exclusive modes)
+    qdrant_url: Optional[str] = None  # Network mode: http://qdrant:6333
+    qdrant_location: Optional[str] = None  # Local mode: :memory: or /path/to/data
     qdrant_api_key: Optional[str] = None
     qdrant_collection: str = "nextcloud_content"
 
@@ -172,6 +174,29 @@ class Settings:
     ollama_embedding_model: str = "nomic-embed-text"
     ollama_verify_ssl: bool = True
 
+    def __post_init__(self):
+        """Validate Qdrant configuration and set defaults."""
+        logger = logging.getLogger(__name__)
+
+        # Ensure mutual exclusivity
+        if self.qdrant_url and self.qdrant_location:
+            raise ValueError(
+                "Cannot set both QDRANT_URL and QDRANT_LOCATION. "
+                "Use QDRANT_URL for network mode or QDRANT_LOCATION for local mode."
+            )
+
+        # Default to :memory: if neither set
+        if not self.qdrant_url and not self.qdrant_location:
+            self.qdrant_location = ":memory:"
+            logger.info("Using default Qdrant mode: in-memory (:memory:)")
+
+        # Warn if API key set in local mode
+        if self.qdrant_location and self.qdrant_api_key:
+            logger.warning(
+                "QDRANT_API_KEY is set but QDRANT_LOCATION is used (local mode). "
+                "API key is only relevant for network mode and will be ignored."
+            )
+
 
 def get_settings() -> Settings:
     """Get application settings from environment variables.
@@ -220,7 +245,8 @@ def get_settings() -> Settings:
             os.getenv("VECTOR_SYNC_QUEUE_MAX_SIZE", "10000")
         ),
         # Qdrant settings
-        qdrant_url=os.getenv("QDRANT_URL", "http://qdrant:6333"),
+        qdrant_url=os.getenv("QDRANT_URL"),
+        qdrant_location=os.getenv("QDRANT_LOCATION"),
         qdrant_api_key=os.getenv("QDRANT_API_KEY"),
         qdrant_collection=os.getenv("QDRANT_COLLECTION", "nextcloud_content"),
         # Ollama settings
diff --git a/nextcloud_mcp_server/vector/qdrant_client.py b/nextcloud_mcp_server/vector/qdrant_client.py
index 733d769..32664c4 100644
--- a/nextcloud_mcp_server/vector/qdrant_client.py
+++ b/nextcloud_mcp_server/vector/qdrant_client.py
@@ -1,11 +1,12 @@
 """Qdrant client wrapper."""
 
 import logging
-import os
 
 from qdrant_client import AsyncQdrantClient
 from qdrant_client.models import Distance, VectorParams
 
+from nextcloud_mcp_server.config import get_settings
+
 logger = logging.getLogger(__name__)
 
 
@@ -19,6 +20,11 @@ async def get_qdrant_client() -> AsyncQdrantClient:
 
     Automatically creates collection on first use if it doesn't exist.
 
+    Supports three Qdrant modes:
+    - Network mode: QDRANT_URL set (e.g., http://qdrant:6333)
+    - In-memory mode: QDRANT_LOCATION=:memory: (default if nothing configured)
+    - Persistent local mode: QDRANT_LOCATION=/path/to/data
+
     Returns:
         Configured AsyncQdrantClient instance
 
@@ -28,17 +34,33 @@ async def get_qdrant_client() -> AsyncQdrantClient:
     global _qdrant_client
 
     if _qdrant_client is None:
-        url = os.getenv("QDRANT_URL", "http://qdrant:6333")
-        api_key = os.getenv("QDRANT_API_KEY")
+        settings = get_settings()
 
-        _qdrant_client = AsyncQdrantClient(
-            url=url,
-            api_key=api_key,
-            timeout=30,
-        )
+        # Detect mode and initialize client accordingly
+        if settings.qdrant_url:
+            # Network mode
+            logger.info(f"Using Qdrant network mode: {settings.qdrant_url}")
+            _qdrant_client = AsyncQdrantClient(
+                url=settings.qdrant_url,
+                api_key=settings.qdrant_api_key,
+                timeout=30,
+            )
+        elif settings.qdrant_location:
+            # Local mode (either :memory: or persistent path)
+            if settings.qdrant_location == ":memory:":
+                logger.info("Using Qdrant in-memory mode: :memory:")
+                _qdrant_client = AsyncQdrantClient(":memory:")
+            else:
+                # Persistent local mode - use path parameter
+                logger.info(f"Using Qdrant persistent mode: {settings.qdrant_location}")
+                _qdrant_client = AsyncQdrantClient(path=settings.qdrant_location)
+        else:
+            # Should not happen due to __post_init__ validation, but handle gracefully
+            logger.warning("No Qdrant mode configured, defaulting to :memory:")
+            _qdrant_client = AsyncQdrantClient(":memory:")
 
         # Ensure collection exists
-        collection_name = os.getenv("QDRANT_COLLECTION", "nextcloud_content")
+        collection_name = settings.qdrant_collection
 
         # Import here to avoid circular dependency
         from nextcloud_mcp_server.embedding import get_embedding_service
diff --git a/tests/integration/test_semantic_search.py b/tests/integration/test_semantic_search.py
index 17ab66a..b241c98 100644
--- a/tests/integration/test_semantic_search.py
+++ b/tests/integration/test_semantic_search.py
@@ -10,6 +10,9 @@ Uses SimpleEmbeddingProvider for deterministic, in-process embeddings
 without requiring external services like Ollama.
 """
 
+import tempfile
+from pathlib import Path
+
 import pytest
 from qdrant_client import AsyncQdrantClient
 from qdrant_client.models import Distance, PointStruct, VectorParams
@@ -342,3 +345,88 @@ async def test_batch_embedding(simple_embedding_provider: SimpleEmbeddingProvide
     for emb in embeddings:
         norm = math.sqrt(sum(x * x for x in emb))
         assert abs(norm - 1.0) < 1e-6
+
+
+async def test_qdrant_persistent_mode(
+    simple_embedding_provider: SimpleEmbeddingProvider,
+    sample_notes: list[dict],
+):
+    """Test Qdrant in persistent local mode with file storage."""
+
+    with tempfile.TemporaryDirectory() as tmpdir:
+        storage_path = Path(tmpdir) / "qdrant_data"
+
+        # Create first client with persistent storage using path parameter
+        client1 = AsyncQdrantClient(path=str(storage_path))
+
+        try:
+            collection_name = "test_persistent"
+
+            # Create collection and index notes
+            await client1.create_collection(
+                collection_name=collection_name,
+                vectors_config=VectorParams(size=384, distance=Distance.COSINE),
+            )
+
+            # Index sample notes
+            points = []
+            for note in sample_notes:
+                content = f"{note['title']}\n\n{note['content']}"
+                embedding = await simple_embedding_provider.embed(content)
+
+                points.append(
+                    PointStruct(
+                        id=note["id"],
+                        vector=embedding,
+                        payload={
+                            "note_id": note["id"],
+                            "title": note["title"],
+                            "category": note["category"],
+                        },
+                    )
+                )
+
+            await client1.upsert(
+                collection_name=collection_name, points=points, wait=True
+            )
+
+            # Verify data was written
+            count_result = await client1.count(collection_name=collection_name)
+            assert count_result.count == len(sample_notes)
+
+            # Close first client
+            await client1.close()
+
+            # Create new client with same storage path
+            client2 = AsyncQdrantClient(path=str(storage_path))
+
+            try:
+                # Data should persist - verify collection exists
+                collections = await client2.get_collections()
+                collection_names = [c.name for c in collections.collections]
+                assert collection_name in collection_names
+
+                # Verify indexed data persisted
+                count_result = await client2.count(collection_name=collection_name)
+                assert count_result.count == len(sample_notes)
+
+                # Verify search still works
+                query = "Python programming"
+                query_embedding = await simple_embedding_provider.embed(query)
+
+                response = await client2.query_points(
+                    collection_name=collection_name,
+                    query=query_embedding,
+                    limit=3,
+                )
+
+                # Should find Python note as top result
+                assert len(response.points) > 0
+                assert response.points[0].payload["note_id"] == 1
+
+            finally:
+                await client2.close()
+
+        finally:
+            # Cleanup
+            await client1.close()
diff --git a/tests/unit/test_config.py b/tests/unit/test_config.py
new file mode 100644
index 0000000..f24e040
--- /dev/null
+++ b/tests/unit/test_config.py
@@ -0,0 +1,153 @@
+"""Tests for configuration validation."""
+
+import os
+from unittest.mock import patch
+
+import pytest
+
+from nextcloud_mcp_server.config import Settings, get_settings
+
+
+class TestQdrantConfigValidation:
+    """Test Qdrant configuration validation."""
+
+    def test_mutually_exclusive_url_and_location(self):
+        """Test that setting both QDRANT_URL and QDRANT_LOCATION raises ValueError."""
+        with pytest.raises(
+            ValueError,
+            match="Cannot set both QDRANT_URL and QDRANT_LOCATION",
+        ):
+            Settings(
+                qdrant_url="http://qdrant:6333",
+                qdrant_location="/app/data/qdrant",
+            )
+
+    def test_default_to_memory_mode(self):
+        """Test that :memory: is used when neither URL nor location is set."""
+        settings = Settings()
+        assert settings.qdrant_location == ":memory:"
+        assert settings.qdrant_url is None
+
+    def test_network_mode_only(self):
+        """Test network mode with only URL set."""
+        settings = Settings(qdrant_url="http://qdrant:6333")
+        assert settings.qdrant_url == "http://qdrant:6333"
+        assert settings.qdrant_location is None
+
+    def test_local_mode_only(self):
+        """Test local mode with only location set."""
+        settings = Settings(qdrant_location="/app/data/qdrant")
+        assert settings.qdrant_location == "/app/data/qdrant"
+        assert settings.qdrant_url is None
+
+    def test_in_memory_mode_explicit(self):
+        """Test explicit in-memory mode."""
+        settings = Settings(qdrant_location=":memory:")
+        assert settings.qdrant_location == ":memory:"
+        assert settings.qdrant_url is None
+
+    def test_api_key_warning_in_local_mode(self, caplog):
+        """Test that API key in local mode triggers warning."""
+        import logging
+
+        caplog.set_level(logging.WARNING, logger="nextcloud_mcp_server.config")
+        Settings(
+            qdrant_location=":memory:",
+            qdrant_api_key="test-api-key",
+        )
+        assert "API key is only relevant for network mode" in caplog.text
+
+    def test_api_key_no_warning_in_network_mode(self, caplog):
+        """Test that API key in network mode doesn't trigger warning."""
+        import logging
+
+        caplog.set_level(logging.WARNING, logger="nextcloud_mcp_server.config")
+        Settings(
+            qdrant_url="http://qdrant:6333",
+            qdrant_api_key="test-api-key",
+        )
+        assert "API key is only relevant for network mode" not in caplog.text
+
+
+class TestGetSettings:
+    """Test get_settings() function with environment variables."""
+
+    @patch.dict(os.environ, {}, clear=True)
+    def test_get_settings_defaults_to_memory(self):
+        """Test get_settings() defaults to :memory: when no env vars set."""
+        settings = get_settings()
+        assert settings.qdrant_location == ":memory:"
+        assert settings.qdrant_url is None
+
+    @patch.dict(
+        os.environ,
+        {
+            "QDRANT_URL": "http://qdrant:6333",
+            "QDRANT_API_KEY": "test-key",
+        },
+        clear=True,
+    )
+    def test_get_settings_network_mode(self):
+        """Test get_settings() with network mode env vars."""
+        settings = get_settings()
+        assert settings.qdrant_url == "http://qdrant:6333"
+        assert settings.qdrant_api_key == "test-key"
+        assert settings.qdrant_location is None
+
+    @patch.dict(
+        os.environ,
+        {"QDRANT_LOCATION": "/app/data/qdrant"},
+        clear=True,
+    )
+    def test_get_settings_persistent_mode(self):
+        """Test get_settings() with persistent local mode env vars."""
+        settings = get_settings()
+        assert settings.qdrant_location == "/app/data/qdrant"
+        assert settings.qdrant_url is None
+
+    @patch.dict(
+        os.environ,
+        {"QDRANT_LOCATION": ":memory:"},
+        clear=True,
+    )
+    def test_get_settings_explicit_memory(self):
+        """Test get_settings() with explicit :memory: env var."""
+        settings = get_settings()
+        assert settings.qdrant_location == ":memory:"
+        assert settings.qdrant_url is None
+
+    @patch.dict(
+        os.environ,
+        {
+            "QDRANT_URL": "http://qdrant:6333",
+            "QDRANT_LOCATION": "/app/data/qdrant",
+        },
+        clear=True,
+    )
+    def test_get_settings_mutual_exclusion_error(self):
+        """Test get_settings() raises error when both URL and location set."""
+        with pytest.raises(
+            ValueError,
+            match="Cannot set both QDRANT_URL and QDRANT_LOCATION",
+        ):
+            get_settings()
+
+    @patch.dict(
+        os.environ,
+        {
+            "QDRANT_COLLECTION": "test_collection",
+            "VECTOR_SYNC_ENABLED": "true",
+            "VECTOR_SYNC_SCAN_INTERVAL": "600",
+            "VECTOR_SYNC_PROCESSOR_WORKERS": "5",
+            "VECTOR_SYNC_QUEUE_MAX_SIZE": "5000",
+        },
+        clear=True,
+    )
+    def test_get_settings_vector_sync_config(self):
+        """Test get_settings() with vector sync configuration."""
+        settings = get_settings()
+        assert settings.qdrant_collection == "test_collection"
+        assert settings.vector_sync_enabled is True
+        assert settings.vector_sync_scan_interval == 600
+        assert settings.vector_sync_processor_workers == 5
+        assert settings.vector_sync_queue_max_size == 5000