fix: Move grafana_folder from labels to annotations

Fixes Kubernetes label validation error when deploying dashboard ConfigMap.

Problem:
- Kubernetes labels cannot contain spaces (validation regex: [A-Za-z0-9][-A-Za-z0-9_.]*[A-Za-z0-9])
- Previous implementation had grafana_folder: "Nextcloud MCP" as a label
- Deployment failed with: "Invalid value: 'Nextcloud MCP'"

Solution:
- Move grafana_folder from labels to annotations (annotations allow spaces)
- Keep grafana_dashboard="1" as label for ConfigMap discovery
- Grafana sidecar reads folder name from folderAnnotation parameter

Changes:
- dashboard-configmap.yaml: Move grafana_folder to annotations section
- dashboards/README.md: Fix kubectl commands to use annotations
- values.yaml: Update comments to clarify annotation usage

This follows the standard kube-prometheus-stack pattern where:
- Labels are used for ConfigMap discovery (strict validation)
- Annotations are used for metadata like folder names (relaxed validation)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
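
The validation failure described in this commit message can be reproduced locally with a small check. The sketch below is illustrative only: `is_valid_label_value` and `split_metadata` are hypothetical helper names, not part of the chart, and the regex is the commonly documented rule for label values (at most 63 characters, empty or alphanumeric at both ends) rather than code taken from Kubernetes itself.

```python
import re

# Commonly documented rule for Kubernetes label values: empty, or alphanumeric
# at both ends with '-', '_' and '.' allowed in between (assumed here, not
# copied from the Kubernetes source).
LABEL_VALUE_RE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def is_valid_label_value(value: str) -> bool:
    """Return True if value passes the label-value rule sketched above."""
    return len(value) <= 63 and LABEL_VALUE_RE.fullmatch(value) is not None


def split_metadata(folder: str) -> dict:
    """Keep the discovery flag in labels and the folder name in annotations."""
    return {
        "labels": {"grafana_dashboard": "1"},
        "annotations": {"grafana_folder": folder},
    }


if __name__ == "__main__":
    print(is_valid_label_value("Nextcloud MCP"))  # the space is rejected
    print(split_metadata("Nextcloud MCP"))
```

Annotation values are not subject to this rule, which is why the folder name with a space can live there.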
bump: version 0.32.1 → 0.33.0
2025-11-13 13:08:45 +01:00 · 2025-11-13 10:58:05 +00:00 · 2025-11-13 11:57:40 +01:00 · 2025-11-13 11:49:20 +01:00 · 2025-11-13 08:02:35 +01:00 · 2025-11-12 23:11:29 +00:00
23 changed files with 2419 additions and 581 deletions
@@ -20,7 +20,7 @@ jobs:
      - name: Checkout
        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5
      - name: Install uv
-        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
+        uses: astral-sh/setup-uv@5a7eac68fb9809dea845d802897dc5c723910fa3 # v7.1.3
      - name: Install Python 3.11
        run: uv python install 3.11
      - name: Build
@@ -11,7 +11,7 @@ jobs:
    steps:
      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - name: Install the latest version of uv
-        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
+        uses: astral-sh/setup-uv@5a7eac68fb9809dea845d802897dc5c723910fa3 # v7.1.3
      - name: Check format
        run: |
          uv run --frozen ruff format --diff
@@ -56,7 +56,7 @@ jobs:
          up-flags: "--build"
      - name: Install the latest version of uv
-        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
+        uses: astral-sh/setup-uv@5a7eac68fb9809dea845d802897dc5c723910fa3 # v7.1.3
      - name: Install Playwright dependencies
        run: |
@@ -5,6 +5,9 @@ __pycache__/
 .env.local
 .env.*.local
 # Git
 worktrees/
 docker-compose.override.yml
 # Generated by pytest used to login users
@@ -1,3 +1,33 @@
 ## v0.33.0 (2025-11-13)
 ### Feat
 - Add Grafana dashboard and vector sync metric instrumentation
 ## v0.32.1 (2025-11-12)
 ### Fix
 - add dynamic dimension detection for Ollama embedding models
 ## v0.32.0 (2025-11-11)
 ### Feat
 - **ollama**: Pull model on startup if not available in ollama
 - add dynamic vector sync status updates with htmx polling
 - add webhook management UI and BeforeNodeDeletedEvent support
 - validate Nextcloud webhook schemas and document findings
 ### Fix
 - improve webapp tab UI with CSS Grid and viewport-filling container
 ### Refactor
 - move webapp from /user/page to /app
 - consolidate database storage for webhooks and OAuth tokens
 ## v0.31.1 (2025-11-10)
 ### Refactor
@@ -1,4 +1,4 @@
-FROM ghcr.io/astral-sh/uv:0.9.8-python3.11-alpine@sha256:6c842c49ad032f46b62f32a7e7779f45f12671a8e0d82ea24c766ab62d58b396
+FROM ghcr.io/astral-sh/uv:0.9.9-python3.11-alpine@sha256:0faa7934fac1db7f5056f159c1224d144bab864fd2677a4066d25a686ae32edd
 # Install dependencies
 # 1. git (required for caldav dependency from git)
@@ -2,8 +2,8 @@ apiVersion: v2
 name: nextcloud-mcp-server
 description: A Helm chart for Nextcloud MCP Server - enables AI assistants to interact with Nextcloud
 type: application
-version: 0.31.1
+version: 0.33.0
-appVersion: "0.31.1"
+appVersion: "0.33.0"
 keywords:
  - nextcloud
  - mcp
@@ -21,6 +21,10 @@ home: https://github.com/cbcoutinho/nextcloud-mcp-server
 sources:
  - https://github.com/cbcoutinho/nextcloud-mcp-server
 icon: https://raw.githubusercontent.com/nextcloud/server/master/core/img/logo/logo.svg
 annotations:
  # Grafana dashboard support
  grafana_dashboard: "true"
  grafana_dashboard_folder: "Nextcloud MCP"
 dependencies:
  - name: qdrant
    version: "1.15.5"
@@ -280,6 +280,72 @@ Use OpenAI or any OpenAI-compatible API instead of Ollama.
 | `openai.secretKey` | Key in secret containing API key | `api-key` |
 | `openai.baseUrl` | Custom API endpoint (optional) | `""` |
 #### Observability & Monitoring
 The chart includes comprehensive observability features including Prometheus metrics, OpenTelemetry tracing, and Grafana dashboards.
 **Metrics Configuration:**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `observability.metrics.enabled` | Enable Prometheus metrics | `true` |
 | `observability.metrics.port` | Metrics port | `9090` |
 | `observability.metrics.path` | Metrics endpoint path | `/metrics` |
 **Tracing Configuration:**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `observability.tracing.enabled` | Enable OpenTelemetry tracing | `false` |
 | `observability.tracing.endpoint` | OTLP collector endpoint | `""` |
 | `observability.tracing.serviceName` | Service name in traces | `nextcloud-mcp-server` |
 | `observability.tracing.samplingRate` | Trace sampling rate (0.0-1.0) | `1.0` |
 **Logging Configuration:**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `observability.logging.format` | Log format (json or text) | `json` |
 | `observability.logging.level` | Log level | `INFO` |
 | `observability.logging.includeTraceContext` | Include trace IDs in logs | `true` |
 **ServiceMonitor (Prometheus Operator):**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `serviceMonitor.enabled` | Create ServiceMonitor resource | `false` |
 | `serviceMonitor.interval` | Scrape interval | `30s` |
 | `serviceMonitor.scrapeTimeout` | Scrape timeout | `10s` |
 | `serviceMonitor.labels` | Additional labels for ServiceMonitor | `{}` |
 **PrometheusRule (Prometheus Operator):**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `prometheusRule.enabled` | Create PrometheusRule with alert rules | `false` |
 | `prometheusRule.labels` | Additional labels for PrometheusRule | `{}` |
 **Grafana Dashboards:**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `dashboards.enabled` | Enable automatic dashboard provisioning | `false` |
 | `dashboards.grafanaFolder` | Grafana folder name for dashboards | `Nextcloud MCP` |
 | `dashboards.labels` | Additional labels for dashboard ConfigMap | `{}` |
 | `dashboards.annotations` | Additional annotations for dashboard ConfigMap | `{}` |
 When `dashboards.enabled` is `true`, a ConfigMap with the Grafana dashboard is created with the `grafana_dashboard: "1"` label. This enables automatic discovery by Grafana sidecar containers (commonly used with kube-prometheus-stack).
 The dashboard provides comprehensive monitoring including:
 - HTTP request metrics (RED pattern: Rate, Errors, Duration)
 - MCP tool performance and errors
 - Nextcloud API performance by app (notes, calendar, contacts, etc.)
 - OAuth token operations and cache hit rates
 - External dependency health (Nextcloud, Qdrant, Keycloak, Unstructured API)
 - Vector sync processing pipeline (when enabled)
 For manual import or more details, see `charts/nextcloud-mcp-server/dashboards/README.md`.
 ## Examples
 ### Example 1: Basic Auth with Ingress
@@ -6,14 +6,57 @@ This directory contains example Grafana dashboards for monitoring the Nextcloud
 ### nextcloud-mcp-server.json
-Comprehensive dashboard with the following panels:
+All-in-one Operations Dashboard with comprehensive monitoring across all system components.
- **Request Rate**: HTTP requests per second by method and endpoint
+#### Overview Row
- **Error Rate**: Percentage of 5xx errors
+High-level metrics for quick health assessment:
- **Request Latency**: P50 and P95 latency by endpoint
+- **Request Rate** (stat): Total requests per second
- **Top MCP Tools**: Most frequently called tools
+- **Error Rate** (stat): Percentage of 5xx errors with color thresholds
- **Nextcloud API Latency**: API call latency by app (notes, calendar, etc.)
+- **P95 Latency** (stat): 95th percentile request latency
- **Vector Sync Queue**: Queue size for background document processing
+- **Active Requests** (stat): Current in-flight requests
 #### HTTP Metrics (RED Pattern)
 Core request/error/duration metrics:
 - **Request Rate by Endpoint** (timeseries): RPS breakdown by endpoint
 - **Error Rate by Status Code** (timeseries): Error rates for 4xx/5xx codes
 - **Latency Percentiles** (timeseries): P50, P95, P99 latency trends
 - **Status Code Distribution** (piechart): Percentage breakdown of all status codes
 #### MCP Tools Row
 MCP-specific tool performance:
 - **Top Tools by Call Volume** (bargauge): Top 10 most-called tools
 - **Tool Error Rate** (timeseries): Error rates per tool
 - **Tool Execution Duration** (timeseries): P95 latency by tool
 #### Nextcloud API Row
 Backend API performance metrics:
 - **API Calls by App** (timeseries): Request rate per Nextcloud app (notes, calendar, contacts, etc.)
 - **API Latency by App** (timeseries): P95 latency per app
 - **API Retries by Reason** (timeseries): Retry patterns (429, timeout, connection errors)
 - **API Error Rate** (stat): Overall API error percentage
 #### OAuth & Authentication Row
 OAuth token operations and caching:
 - **Token Validations** (timeseries): Success/failure rates for token validation
 - **Token Exchange Operations** (timeseries): RFC 8693 token exchange operations
 - **Token Cache Hit Rate** (stat): Percentage of cache hits (color-coded: red<50%, yellow<80%, green≥80%)
 - **Refresh Token Operations** (timeseries): Refresh token storage operations by type
 #### Dependencies & Health Row
 External dependency status monitoring:
 - **Nextcloud Health** (stat): UP/DOWN status with color coding
 - **Qdrant Health** (stat): Vector database health status
 - **Keycloak Health** (stat): Identity provider health status
 - **Unstructured API Health** (stat): Document processing API status
 - **Health Check Duration** (timeseries): Health check latency by dependency
 - **Database Operation Latency** (timeseries): P95 latency for DB operations (SQLite, Qdrant)
 #### Vector Sync Row (when enabled)
 Document processing pipeline metrics:
 - **Documents Processed Rate** (timeseries): Processing throughput by status (success/failure)
 - **Processing Queue Depth** (gauge): Current queue size with thresholds (yellow>50, red>100)
 - **Qdrant Operations** (timeseries): Vector database operations by type
 - **Document Processing Duration** (timeseries): P95 processing latency
 ## Importing to Grafana
@@ -25,49 +68,77 @@ Comprehensive dashboard with the following panels:
 4. Select your Prometheus data source
 5. Click "Import"
-### Automated Import (Kubernetes)
+### Automated Import (Helm Chart)
-If using the Grafana Operator or kube-prometheus-stack, you can create a ConfigMap:
+The Helm chart now supports automatic dashboard provisioning via Grafana sidecar pattern.
 #### Option 1: Using Helm Chart (Recommended)
 Enable dashboard provisioning in your Helm values:
 ```yaml
 # values.yaml for nextcloud-mcp-server chart
 dashboards:
  enabled: true
  grafanaFolder: "Nextcloud MCP"  # Folder name in Grafana
  labels: {}  # Additional labels if needed
 ```
 Then deploy or upgrade:
 ```bash
-kubectl create configmap nextcloud-mcp-dashboards \
+helm upgrade --install nextcloud-mcp nextcloud-mcp-server \
  --set dashboards.enabled=true
 ```
 The dashboard will be automatically imported by Grafana if the sidecar is configured
 to watch for ConfigMaps with label `grafana_dashboard: "1"`.
 #### Option 2: Using kube-prometheus-stack
 If using kube-prometheus-stack with Grafana sidecar enabled, the dashboard will be
 automatically discovered and imported. Ensure your Grafana deployment has:
 ```yaml
 # kube-prometheus-stack values
 grafana:
  sidecar:
    dashboards:
      enabled: true
      label: grafana_dashboard
      folder: /tmp/dashboards
      provider:
        foldersFromFilesStructure: true
 ```
 #### Option 3: Manual ConfigMap Creation
 For other Grafana setups, create a ConfigMap manually:
 ```bash
 kubectl create configmap nextcloud-mcp-dashboard \
  --from-file=nextcloud-mcp-server.json \
  -n monitoring
-# Add label for Grafana sidecar to discover
+# Add sidecar discovery label
-kubectl label configmap nextcloud-mcp-dashboards \
+kubectl label configmap nextcloud-mcp-dashboard \
  grafana_dashboard=1 \
  -n monitoring
 ```
-Or add to your Helm values:
+# Add folder annotation (annotations support spaces, unlike labels)
-
+kubectl annotate configmap nextcloud-mcp-dashboard \
-```yaml
+  grafana_folder="Nextcloud MCP" \
-# values.yaml for kube-prometheus-stack
+  -n monitoring
 grafana:
  dashboardProviders:
    dashboardproviders.yaml:
      apiVersion: 1
      providers:
        - name: 'nextcloud-mcp'
          orgId: 1
          folder: 'Nextcloud MCP'
          type: file
          disableDeletion: false
          editable: true
          options:
            path: /var/lib/grafana/dashboards/nextcloud-mcp
  dashboardsConfigMaps:
    nextcloud-mcp: nextcloud-mcp-dashboards
 ```
 ## Dashboard Variables
-The dashboard includes two variables:
+The dashboard includes four template variables for dynamic filtering:
- **Data Source**: Select your Prometheus data source
+- **datasource**: Select your Prometheus data source
- **Namespace**: Filter metrics by Kubernetes namespace
+- **namespace**: Filter metrics by Kubernetes namespace (supports "All")
 - **pod**: Filter by specific pod(s) - multi-select enabled (supports "All")
 - **interval**: Query interval for rate calculations (1m, 5m, 10m, 30m, 1h - default: 5m)
 ## Customization
@@ -96,6 +96,30 @@ Your Nextcloud MCP Server has been deployed in {{ .Values.auth.mode }} authentic
   kubectl --namespace {{ .Release.Namespace }} exec -it deploy/{{ include "nextcloud-mcp-server.fullname" . }} -- curl -s http://localhost:{{ include "nextcloud-mcp-server.port" . }}/user/page | grep "Vector Sync"
 {{- end }}
 {{- if .Values.dashboards.enabled }}
 6. Grafana Dashboards:
   - Dashboard provisioning: Enabled
   - ConfigMap: {{ include "nextcloud-mcp-server.fullname" . }}-dashboard
   - Grafana Folder: {{ .Values.dashboards.grafanaFolder }}
   The dashboard will be automatically imported by Grafana if the sidecar is configured
   to watch for ConfigMaps with label "grafana_dashboard: 1".
   To manually import the dashboard:
   kubectl --namespace {{ .Release.Namespace }} get configmap {{ include "nextcloud-mcp-server.fullname" . }}-dashboard -o jsonpath='{.data.nextcloud-mcp-server\.json}' | jq . > dashboard.json
   Then import dashboard.json via Grafana UI (Dashboards → Import).
 {{- else }}
 6. Grafana Dashboards:
   - Dashboard provisioning: Disabled
   - To enable automatic dashboard provisioning, set: dashboards.enabled=true
   Manual import option:
   The dashboard JSON is available in the chart at charts/nextcloud-mcp-server/dashboards/nextcloud-mcp-server.json
 {{- end }}
 For more information and documentation:
 - GitHub: https://github.com/cbcoutinho/nextcloud-mcp-server
 - Documentation: https://github.com/cbcoutinho/nextcloud-mcp-server#readme
@@ -0,0 +1,25 @@
 {{- if .Values.dashboards.enabled }}
 apiVersion: v1
 kind: ConfigMap
 metadata:
  name: {{ include "nextcloud-mcp-server.fullname" . }}-dashboard
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "nextcloud-mcp-server.labels" . | nindent 4 }}
    {{- with .Values.dashboards.labels }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
    # Grafana sidecar discovery label
    grafana_dashboard: "1"
  annotations:
    {{- with .Values.dashboards.annotations }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
    # Grafana folder name (annotations support spaces, unlike labels)
    {{- if .Values.dashboards.grafanaFolder }}
    grafana_folder: {{ .Values.dashboards.grafanaFolder | quote }}
    {{- end }}
 data:
  nextcloud-mcp-server.json: |-
 {{ .Files.Get "dashboards/nextcloud-mcp-server.json" | indent 4 }}
 {{- end }}
@@ -205,6 +205,20 @@ prometheusRule:
  # Additional labels for PrometheusRule (e.g., for Prometheus selector)
  # Example: { prometheus: kube-prometheus }
 # Grafana dashboards (requires Grafana with sidecar enabled)
 dashboards:
  # Enable automatic dashboard provisioning via ConfigMap
  enabled: false
  # Grafana folder name where dashboards will be imported
  # The grafana-sidecar looks for ConfigMaps with label "grafana_dashboard: 1"
  # and reads the folder name from annotation "grafana_folder" (supports spaces)
  grafanaFolder: "Nextcloud MCP"
  # Additional labels for dashboard ConfigMap
  # These will be added alongside the required "grafana_dashboard: 1" label
  labels: {}
  # Additional annotations for dashboard ConfigMap
  annotations: {}
 service:
  type: ClusterIP
  port: 8000
@@ -3,7 +3,7 @@ services:
  # https://hub.docker.com/_/mariadb
  db:
    # Note: Check the recommend version here: https://docs.nextcloud.com/server/latest/admin_manual/installation/system_requirements.html#server
-    image: docker.io/library/mariadb:lts@sha256:ae6119716edac6998ae85508431b3d2e666530ddf4e94c61a10710caec9b0f71
+    image: docker.io/library/mariadb:lts@sha256:404ebf26ed7a56fbab05c29f6f1e70188e5eadb51bba8cee8d355775776deb08
    restart: always
    command: --transaction-isolation=READ-COMMITTED
    volumes:
@@ -418,6 +418,19 @@ async def app_lifespan_basic(server: FastMCP) -> AsyncIterator[AppContext]:
                "NEXTCLOUD_USERNAME is required for vector sync in BasicAuth mode"
            )
        # Initialize Qdrant collection before starting background tasks
        logger.info("Initializing Qdrant collection...")
        from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
        try:
            await get_qdrant_client()  # Triggers collection creation if needed
            logger.info("Qdrant collection ready")
        except Exception as e:
            logger.error(f"Failed to initialize Qdrant collection: {e}")
            raise RuntimeError(
                f"Cannot start vector sync - Qdrant initialization failed: {e}"
            ) from e
        # Initialize shared state
        send_stream, receive_stream = anyio.create_memory_object_stream(
            max_buffer_size=settings.vector_sync_queue_max_size
@@ -1086,6 +1099,19 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
                # Create client since we're outside FastMCP lifespan
                client = NextcloudClient.from_env()
                # Initialize Qdrant collection before starting background tasks
                logger.info("Initializing Qdrant collection...")
                from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
                try:
                    await get_qdrant_client()  # Triggers collection creation if needed
                    logger.info("Qdrant collection ready")
                except Exception as e:
                    logger.error(f"Failed to initialize Qdrant collection: {e}")
                    raise RuntimeError(
                        f"Cannot start vector sync - Qdrant initialization failed: {e}"
                    ) from e
                # Initialize shared state
                send_stream, receive_stream = anyio_module.create_memory_object_stream(
                    max_buffer_size=settings.vector_sync_queue_max_size
@@ -17,6 +17,7 @@ class OllamaEmbeddingProvider(EmbeddingProvider):
        base_url: str,
        model: str = "nomic-embed-text",
        verify_ssl: bool = True,
        timeout=httpx.Timeout(timeout=120, connect=5),
    ):
        """
        Initialize Ollama embedding provider.
@@ -29,8 +30,8 @@ class OllamaEmbeddingProvider(EmbeddingProvider):
        self.base_url = base_url.rstrip("/")
        self.model = model
        self.verify_ssl = verify_ssl
-        self.client = httpx.AsyncClient(verify=verify_ssl, timeout=30.0)
+        self.client = httpx.AsyncClient(verify=verify_ssl, timeout=timeout)
-        self._dimension = 768  # nomic-embed-text default
+        self._dimension: int | None = None  # Will be detected dynamically
        logger.info(
            f"Initialized Ollama provider: {base_url} (model={model}, verify_ssl={verify_ssl})"
        )
@@ -73,13 +74,36 @@ class OllamaEmbeddingProvider(EmbeddingProvider):
            embeddings.append(embedding)
        return embeddings
    async def _detect_dimension(self):
        """
        Detect embedding dimension by generating a test embedding.
        This method queries the model to determine the actual dimension
        instead of relying on hardcoded values.
        """
        if self._dimension is None:
            logger.debug(f"Detecting embedding dimension for model {self.model}...")
            test_embedding = await self.embed("test")
            self._dimension = len(test_embedding)
            logger.info(
                f"Detected embedding dimension: {self._dimension} for model {self.model}"
            )
    def get_dimension(self) -> int:
        """
        Get embedding dimension.
        Returns:
-            Vector dimension (768 for nomic-embed-text)
+            Vector dimension for the configured model
        Raises:
            RuntimeError: If dimension not detected yet (call _detect_dimension first)
        """
        if self._dimension is None:
            raise RuntimeError(
                f"Embedding dimension not detected yet for model {self.model}. "
                "Call _detect_dimension() first or generate an embedding."
            )
        return self._dimension
    def _check_model_is_loaded(self, autoload: bool = True):
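
The hunk above replaces a hardcoded dimension with one that is probed on demand. A minimal sketch of that pattern follows, assuming a stand-in `embed` that returns a fixed-size vector instead of calling Ollama; `LazyDimensionProvider` and `detect_dimension` are hypothetical names used only for illustration.

```python
import asyncio
from typing import Optional


class LazyDimensionProvider:
    """Hypothetical stand-in for an embedding provider (not the project's class)."""

    def __init__(self, fake_vector_size: int) -> None:
        self._fake_vector_size = fake_vector_size
        self._dimension: Optional[int] = None  # unknown until probed

    async def embed(self, text: str) -> list[float]:
        # Stand-in for the HTTP request to the embedding backend.
        return [0.0] * self._fake_vector_size

    async def detect_dimension(self) -> int:
        # Probe the model once; later calls reuse the cached value.
        if self._dimension is None:
            self._dimension = len(await self.embed("test"))
        return self._dimension

    def get_dimension(self) -> int:
        # Synchronous accessor that refuses to guess before detection has run.
        if self._dimension is None:
            raise RuntimeError("Embedding dimension not detected yet")
        return self._dimension


if __name__ == "__main__":
    provider = LazyDimensionProvider(fake_vector_size=4)
    asyncio.run(provider.detect_dimension())
    print(provider.get_dimension())
```

Raising from the synchronous accessor makes a missing detection step fail visibly instead of silently creating a collection with the wrong vector size.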
@@ -352,3 +352,46 @@ def record_dependency_check(dependency: str, duration: float) -> None:
        duration: Check duration in seconds
    """
    dependency_check_duration_seconds.labels(dependency=dependency).observe(duration)
 def record_vector_sync_scan(documents_found: int) -> None:
    """
    Record documents scanned during vector sync.
    Args:
        documents_found: Number of documents discovered in scan
    """
    vector_sync_documents_scanned_total.inc(documents_found)
 def record_vector_sync_processing(duration: float, status: str = "success") -> None:
    """
    Record document processing with duration and status.
    Args:
        duration: Processing duration in seconds
        status: "success" or "error"
    """
    vector_sync_documents_processed_total.labels(status=status).inc()
    vector_sync_processing_duration_seconds.observe(duration)
 def record_qdrant_operation(operation: str, status: str = "success") -> None:
    """
    Record Qdrant vector database operation.
    Args:
        operation: Operation type ("upsert", "search", "delete")
        status: "success" or "error"
    """
    qdrant_operations_total.labels(operation=operation, status=status).inc()
 def update_vector_sync_queue_size(size: int) -> None:
    """
    Update vector sync queue size gauge.
    Args:
        size: Current queue size
    """
    vector_sync_queue_size.set(size)
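
One way a caller might feed a helper such as `record_vector_sync_processing` is to time a unit of work and report the outcome in a `finally` block. The sketch below is an assumption about usage, not code from this change: `timed_processing` is a hypothetical wrapper, the recorder is passed in as a plain callable so the example has no Prometheus dependency, and it uses `time.monotonic()` where the diff itself uses `time.time()`.

```python
import time
from typing import Callable


def timed_processing(
    work: Callable[[], None],
    record: Callable[[float, str], None],
) -> None:
    """Run work(), then report its duration and status through record()."""
    start = time.monotonic()
    status = "success"
    try:
        work()
    except Exception:
        status = "error"
        raise  # the caller still sees the original failure
    finally:
        # Runs on both paths, so every attempt is counted exactly once.
        record(time.monotonic() - start, status)


if __name__ == "__main__":
    def print_recorder(duration: float, status: str) -> None:
        print(f"status={status} duration={duration:.6f}s")

    timed_processing(lambda: None, print_recorder)
```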
@@ -21,6 +21,7 @@ from nextcloud_mcp_server.models.semantic import (
    SemanticSearchResult,
    VectorSyncStatusResponse,
 )
 from nextcloud_mcp_server.observability.metrics import record_qdrant_operation
 logger = logging.getLogger(__name__)
@@ -85,26 +86,33 @@ def configure_semantic_tools(mcp: FastMCP):
            # Note: Currently only searching notes (doc_type="note")
            # Future: Remove doc_type filter to search all apps
            qdrant_client = await get_qdrant_client()
-            search_response = await qdrant_client.query_points(
+            try:
-                collection_name=settings.get_collection_name(),
+                search_response = await qdrant_client.query_points(
-                query=query_embedding,
+                    collection_name=settings.get_collection_name(),
-                query_filter=Filter(
+                    query=query_embedding,
-                    must=[
+                    query_filter=Filter(
-                        FieldCondition(
+                        must=[
-                            key="user_id",
+                            FieldCondition(
-                            match=MatchValue(value=username),
+                                key="user_id",
-                        ),
+                                match=MatchValue(value=username),
-                        FieldCondition(
+                            ),
-                            key="doc_type",
+                            FieldCondition(
-                            match=MatchValue(value="note"),
+                                key="doc_type",
-                        ),
+                                match=MatchValue(value="note"),
-                    ]
+                            ),
-                ),
+                        ]
-                limit=limit * 2,  # Get extra for filtering
+                    ),
-                score_threshold=score_threshold,
+                    limit=limit * 2,  # Get extra for filtering
-                with_payload=True,
+                    score_threshold=score_threshold,
-                with_vectors=False,  # Don't return vectors to save bandwidth
+                    with_payload=True,
-            )
+                    with_vectors=False,  # Don't return vectors to save bandwidth
                )
                # Record successful search operation
                record_qdrant_operation("search", "success")
            except Exception:
                # Record failed search operation
                record_qdrant_operation("search", "error")
                raise
            logger.info(
                f"Qdrant returned {len(search_response.points)} results "
@@ -331,21 +339,71 @@ def configure_semantic_tools(mcp: FastMCP):
                success=True,
            )
-        # 4. Construct context from retrieved documents
+        # 4. Fetch full content for notes to provide complete context to LLM
        # Filter out inaccessible notes (deleted or permissions changed)
        client = await get_client(ctx)
        accessible_results = []
        full_contents = []  # Full content for accessible notes
        for result in search_response.results:
            if result.doc_type == "note":
                try:
                    note = await client.notes.get_note(result.id)
                    # Note is accessible, store full content
                    accessible_results.append(result)
                    full_contents.append(note.get("content", ""))
                    logger.debug(
                        f"Fetched full content for note {result.id} "
                        f"(length: {len(full_contents[-1])} chars)"
                    )
                except Exception as e:
                    # Note might have been deleted or permissions changed
                    # Filter it out to avoid corrupting LLM with inaccessible data
                    logger.warning(
                        f"Failed to fetch full content for note {result.id}: {e}. "
                        f"Excluding from results."
                    )
            else:
                # Non-note document types (future: calendar, deck, files)
                # For now, keep them with excerpts
                accessible_results.append(result)
                full_contents.append(None)
        # Check if we filtered out all results
        if not accessible_results:
            logger.warning(f"All search results became inaccessible for query: {query}")
            return SamplingSearchResponse(
                query=query,
                generated_answer="All matching documents are no longer accessible.",
                sources=[],
                total_found=0,
                search_method="semantic_sampling",
                success=True,
            )
        # 5. Construct context from accessible documents with full content
        context_parts = []
-        for idx, result in enumerate(search_response.results, 1):
+        for idx, (result, content) in enumerate(
            zip(accessible_results, full_contents), 1
        ):
            # Use full content if available (notes), otherwise use excerpt
            if content is not None:
                content_field = f"Content: {content}"
            else:
                content_field = f"Excerpt: {result.excerpt}"
            context_parts.append(
                f"[Document {idx}]\n"
                f"Type: {result.doc_type}\n"
                f"Title: {result.title}\n"
                f"Category: {result.category}\n"
-                f"Excerpt: {result.excerpt}\n"
+                f"{content_field}\n"
                f"Relevance Score: {result.score:.2f}\n"
            )
        context = "\n".join(context_parts)
-        # 5. Construct prompt - reuse user's query, add context and instructions
+        # 6. Construct prompt - reuse user's query, add context and instructions
        prompt = (
            f"{query}\n\n"
            f"Here are relevant documents from Nextcloud (notes, calendar events, deck cards, files, contacts):\n\n"
@@ -401,8 +459,8 @@ def configure_semantic_tools(mcp: FastMCP):
            return SamplingSearchResponse(
                query=query,
                generated_answer=generated_answer,
-                sources=search_response.results,
+                sources=accessible_results,
-                total_found=search_response.total_found,
+                total_found=len(accessible_results),
                search_method="semantic_sampling",
                model_used=sampling_result.model,
                stop_reason=sampling_result.stopReason,
@@ -419,11 +477,11 @@ def configure_semantic_tools(mcp: FastMCP):
                generated_answer=(
                    f"[Sampling request timed out]\n\n"
                    f"The answer generation took too long (>30s). "
-                    f"Found {search_response.total_found} relevant documents. "
+                    f"Found {len(accessible_results)} relevant documents. "
                    f"Please review the sources below or try a simpler query."
                ),
-                sources=search_response.results,
+                sources=accessible_results,
-                total_found=search_response.total_found,
+                total_found=len(accessible_results),
                search_method="semantic_sampling_timeout",
                success=True,
            )
@@ -454,11 +512,11 @@ def configure_semantic_tools(mcp: FastMCP):
                query=query,
                generated_answer=(
                    f"[{user_message}]\n\n"
-                    f"Found {search_response.total_found} relevant documents. "
+                    f"Found {len(accessible_results)} relevant documents. "
                    f"Please review the sources below."
                ),
-                sources=search_response.results,
+                sources=accessible_results,
-                total_found=search_response.total_found,
+                total_found=len(accessible_results),
                search_method=search_method,
                success=True,
            )
@@ -475,11 +533,11 @@ def configure_semantic_tools(mcp: FastMCP):
                query=query,
                generated_answer=(
                    f"[Unexpected error during sampling]\n\n"
-                    f"Found {search_response.total_found} relevant documents. "
+                    f"Found {len(accessible_results)} relevant documents. "
                    f"Please review the sources below."
                ),
-                sources=search_response.results,
+                sources=accessible_results,
-                total_found=search_response.total_found,
+                total_found=len(accessible_results),
                search_method="semantic_sampling_error",
                success=True,
            )
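The three fallback hunks above make the same change: the count, the sources, and the message text now all come from the access-filtered list instead of the raw search response. A minimal sketch of that invariant follows; `FallbackResponse` and `build_fallback` are hypothetical names used only for illustration and are not taken from the repository.

```python
from dataclasses import dataclass, field


@dataclass
class FallbackResponse:
    """Hypothetical stand-in for the sampling response model."""

    query: str
    generated_answer: str
    sources: list = field(default_factory=list)
    total_found: int = 0
    search_method: str = "semantic_sampling_error"
    success: bool = True


def build_fallback(
    query: str, reason: str, accessible_results: list, search_method: str
) -> FallbackResponse:
    """Build a fallback answer whose count always matches the returned sources."""
    # Derive the count once from the filtered list so the message text,
    # `sources`, and `total_found` cannot drift apart.
    count = len(accessible_results)
    return FallbackResponse(
        query=query,
        generated_answer=(
            f"[{reason}]\n\n"
            f"Found {count} relevant documents. "
            f"Please review the sources below."
        ),
        sources=list(accessible_results),
        total_found=count,
        search_method=search_method,
    )
```

Routing every fallback path through one helper like this would also remove the three near-identical blocks, at the cost of one more indirection.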
@@ -15,6 +15,10 @@ from qdrant_client.models import FieldCondition, Filter, MatchValue, PointStruct
 from nextcloud_mcp_server.client import NextcloudClient
 from nextcloud_mcp_server.config import get_settings
 from nextcloud_mcp_server.embedding import get_embedding_service
 from nextcloud_mcp_server.observability.metrics import (
    record_qdrant_operation,
    record_vector_sync_processing,
 )
 from nextcloud_mcp_server.observability.tracing import trace_operation
 from nextcloud_mcp_server.vector.document_chunker import DocumentChunker
 from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
@@ -90,6 +94,8 @@ async def process_document(doc_task: DocumentTask, nc_client: NextcloudClient):
        doc_task: Document task to process
        nc_client: Authenticated Nextcloud client
    """
    start_time = time.time()
    logger.debug(
        f"Processing {doc_task.doc_type}_{doc_task.doc_id} "
        f"for {doc_task.user_id} ({doc_task.operation})"
@@ -105,58 +111,79 @@ async def process_document(doc_task: DocumentTask, nc_client: NextcloudClient):
            "vector_sync.doc_operation": doc_task.operation,
        },
    ):
-        qdrant_client = await get_qdrant_client()
-        settings = get_settings()
-        # Handle deletion
-        if doc_task.operation == "delete":
-            await qdrant_client.delete(
-                collection_name=settings.get_collection_name(),
-                points_selector=Filter(
-                    must=[
-                        FieldCondition(
-                            key="user_id",
-                            match=MatchValue(value=doc_task.user_id),
-                        ),
-                        FieldCondition(
-                            key="doc_id",
-                            match=MatchValue(value=doc_task.doc_id),
-                        ),
-                        FieldCondition(
-                            key="doc_type",
-                            match=MatchValue(value=doc_task.doc_type),
-                        ),
-                    ]
-                ),
-            )
-            logger.info(
-                f"Deleted {doc_task.doc_type}_{doc_task.doc_id} for {doc_task.user_id}"
-            )
-            return
-        # Handle indexing with retry
-        max_retries = 3
-        retry_delay = 1.0
-        for attempt in range(max_retries):
-            try:
-                await _index_document(doc_task, nc_client, qdrant_client)
-                return  # Success
-            except (HTTPStatusError, Exception) as e:
-                if attempt < max_retries - 1:
-                    logger.warning(
-                        f"Retry {attempt + 1}/{max_retries} for "
-                        f"{doc_task.doc_type}_{doc_task.doc_id}: {e}"
-                    )
-                    await anyio.sleep(retry_delay)
-                    retry_delay *= 2  # Exponential backoff
-                else:
-                    logger.error(
-                        f"Failed to index {doc_task.doc_type}_{doc_task.doc_id} "
-                        f"after {max_retries} retries: {e}"
-                    )
-                    raise
+        try:
+            qdrant_client = await get_qdrant_client()
+            settings = get_settings()
+            # Handle deletion
+            if doc_task.operation == "delete":
+                await qdrant_client.delete(
+                    collection_name=settings.get_collection_name(),
+                    points_selector=Filter(
+                        must=[
+                            FieldCondition(
+                                key="user_id",
+                                match=MatchValue(value=doc_task.user_id),
+                            ),
+                            FieldCondition(
+                                key="doc_id",
+                                match=MatchValue(value=doc_task.doc_id),
+                            ),
+                            FieldCondition(
+                                key="doc_type",
+                                match=MatchValue(value=doc_task.doc_type),
+                            ),
+                        ]
+                    ),
+                )
+                logger.info(
+                    f"Deleted {doc_task.doc_type}_{doc_task.doc_id} for {doc_task.user_id}"
+                )
+                # Record successful deletion metrics
+                duration = time.time() - start_time
+                record_qdrant_operation("delete", "success")
+                record_vector_sync_processing(duration, "success")
+                return
+            # Handle indexing with retry
+            max_retries = 3
+            retry_delay = 1.0
+            for attempt in range(max_retries):
+                try:
+                    await _index_document(doc_task, nc_client, qdrant_client)
+
+                    # Record successful processing metrics
+                    duration = time.time() - start_time
+                    record_qdrant_operation("upsert", "success")
+                    record_vector_sync_processing(duration, "success")
+                    return  # Success
+
+                except (HTTPStatusError, Exception) as e:
+                    if attempt < max_retries - 1:
+                        logger.warning(
+                            f"Retry {attempt + 1}/{max_retries} for "
+                            f"{doc_task.doc_type}_{doc_task.doc_id}: {e}"
+                        )
+                        await anyio.sleep(retry_delay)
+                        retry_delay *= 2  # Exponential backoff
+                    else:
+                        logger.error(
+                            f"Failed to index {doc_task.doc_type}_{doc_task.doc_id} "
+                            f"after {max_retries} retries: {e}"
+                        )
+                        # Record failed processing metrics
+                        duration = time.time() - start_time
+                        record_qdrant_operation("upsert", "error")
+                        record_vector_sync_processing(duration, "error")
+                        raise
+        except Exception:
+            # Catch any other unexpected errors
+            duration = time.time() - start_time
+            record_vector_sync_processing(duration, "error")
+            raise
 async def _index_document(
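One detail of the hunk above is worth a closer look: on the final failed retry, the inner branch records an error and re-raises, and the exception then appears to reach the outer `except Exception`, which records the processing error a second time. The sketch below shows one way to keep exponential backoff while recording exactly one outcome per call. It is an illustration under stated assumptions, not the project's implementation: `run_with_retry` and the `record` callback are hypothetical names, and it uses `asyncio` from the standard library where the project uses `anyio`.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def run_with_retry(
    operation: Callable[[], Awaitable[None]],
    record: Callable[[float, str], None],
    max_retries: int = 3,
    retry_delay: float = 1.0,
) -> None:
    """Run `operation` with exponential backoff, recording one outcome."""
    start_time = time.monotonic()
    try:
        for attempt in range(max_retries):
            try:
                await operation()
                break  # Success: leave the retry loop
            except Exception:
                if attempt == max_retries - 1:
                    raise  # Out of retries: let the outer handler record it
                await asyncio.sleep(retry_delay)
                retry_delay *= 2  # Exponential backoff
    except Exception:
        # Single place where a failure is recorded, so it is counted once.
        record(time.monotonic() - start_time, "error")
        raise
    record(time.monotonic() - start_time, "success")
```

Keeping the error metric in the outer handler only means a failure in any step, including client setup, is counted the same way as an exhausted retry.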
@@ -66,10 +66,23 @@ async def get_qdrant_client() -> AsyncQdrantClient:
        from nextcloud_mcp_server.embedding import get_embedding_service
        embedding_service = get_embedding_service()
        # Detect dimension dynamically (for OllamaEmbeddingProvider)
        if hasattr(embedding_service.provider, "_detect_dimension"):
            await embedding_service.provider._detect_dimension()  # type: ignore[call-non-callable]
        expected_dimension = embedding_service.get_dimension()
-        try:
-            # Get existing collection
+        # Explicitly check if collection exists
+        logger.debug(f"Checking if collection '{collection_name}' exists...")
+        collections = await _qdrant_client.get_collections()
+        collection_names = [c.name for c in collections.collections]
+        if collection_name in collection_names:
+            # Collection exists - validate dimensions
+            logger.debug(
+                f"Collection '{collection_name}' found, validating dimensions..."
+            )
            collection_info = await _qdrant_client.get_collection(collection_name)
            actual_dimension = collection_info.config.params.vectors.size
@@ -91,12 +104,12 @@ async def get_qdrant_client() -> AsyncQdrantClient:
                f"(dimension={actual_dimension}, model={settings.ollama_embedding_model})"
            )
-        except Exception as e:
-            # Check if it's a dimension mismatch error (re-raise it)
-            if isinstance(e, ValueError) and "Dimension mismatch" in str(e):
-                raise
-
-            # Collection doesn't exist or other error, create it
+        else:
+            # Collection doesn't exist - create it
+            logger.info(
+                f"Collection '{collection_name}' not found, creating with "
+                f"dimension={expected_dimension}, model={settings.ollama_embedding_model}..."
+            )
            await _qdrant_client.create_collection(
                collection_name=collection_name,
                vectors_config=VectorParams(
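The two hunks above replace "try to read the collection, create it on any exception" with an explicit existence check, so unrelated errors no longer trigger a create. The control flow can be sketched without a running Qdrant instance. `InMemoryVectorStore` and `ensure_collection` below are hypothetical stand-ins written for this illustration; they are not the `qdrant_client` API.

```python
import asyncio


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database client."""

    def __init__(self) -> None:
        self._collections: dict[str, int] = {}

    async def list_collections(self) -> list[str]:
        return list(self._collections)

    async def get_dimension(self, name: str) -> int:
        return self._collections[name]

    async def create_collection(self, name: str, dimension: int) -> None:
        self._collections[name] = dimension


async def ensure_collection(
    store: InMemoryVectorStore, name: str, expected_dimension: int
) -> str:
    """Create the collection if missing; otherwise validate its dimension."""
    if name in await store.list_collections():
        actual = await store.get_dimension(name)
        if actual != expected_dimension:
            # A mismatch is a configuration error, never a reason to create.
            raise ValueError(
                f"Dimension mismatch: collection '{name}' has {actual}, "
                f"expected {expected_dimension}"
            )
        return "reused"
    await store.create_collection(name, expected_dimension)
    return "created"
```

With the branch made explicit, the dimension check no longer needs to be recognized by inspecting an exception message, which is what the removed `except` block did.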
@@ -13,6 +13,7 @@ from qdrant_client.models import FieldCondition, Filter, MatchValue
 from nextcloud_mcp_server.client import NextcloudClient
 from nextcloud_mcp_server.config import get_settings
 from nextcloud_mcp_server.observability.metrics import record_vector_sync_scan
 from nextcloud_mcp_server.observability.tracing import trace_operation
 from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
@@ -181,6 +182,9 @@ async def scan_user_documents(
        ]
        logger.info(f"[SCAN-{scan_id}] Found {len(notes)} notes for {user_id}")
        # Record documents scanned
        record_vector_sync_scan(len(notes))
        if initial_sync:
            # Send everything on first sync
            for note in notes:
@@ -1,6 +1,6 @@
 [project]
 name = "nextcloud-mcp-server"
-version = "0.31.1"
+version = "0.33.0"
 description = "Model Context Protocol (MCP) server for Nextcloud integration - enables AI assistants to interact with Nextcloud data"
 authors = [
    {name = "Chris Coutinho", email = "chris@coutinho.io"}
@@ -0,0 +1,322 @@
 """Integration tests for Qdrant collection auto-creation.
 These tests validate that:
 1. Collections are automatically created on first access
 2. Dimension validation detects mismatches
 3. Idempotent initialization (multiple calls don't fail)
 4. Proper error handling and logging
 """
 from unittest.mock import Mock
 import pytest
 from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
 pytestmark = pytest.mark.integration
@pytest.fixture(autouse=True)
 async def reset_singleton():
    """Reset the global Qdrant client singleton between tests."""
    import nextcloud_mcp_server.vector.qdrant_client as qdrant_module
    # Store original
    original = qdrant_module._qdrant_client
    # Reset for test
    qdrant_module._qdrant_client = None
    yield
    # Restore original
    qdrant_module._qdrant_client = original
@pytest.mark.integration
 async def test_collection_auto_created_on_first_access(monkeypatch):
    """Test that collection is automatically created if it doesn't exist."""
    # Mock settings
    from nextcloud_mcp_server.config import Settings
    mock_settings = Settings(
        qdrant_location=":memory:",
        ollama_embedding_model="nomic-embed-text",
        vector_sync_enabled=False,  # Disable background sync for test
    )
    monkeypatch.setattr(
        "nextcloud_mcp_server.vector.qdrant_client.get_settings", lambda: mock_settings
    )
    # Mock embedding service - must have .provider attribute
    from nextcloud_mcp_server.embedding import SimpleEmbeddingProvider
    mock_provider = SimpleEmbeddingProvider(dimension=384)
    mock_embedding_service = Mock()
    mock_embedding_service.provider = mock_provider
    mock_embedding_service.get_dimension = lambda: mock_provider.get_dimension()
    monkeypatch.setattr(
        "nextcloud_mcp_server.embedding.get_embedding_service",
        lambda: mock_embedding_service,
    )
    # Get client (should trigger collection creation)
    client = await get_qdrant_client()
    # Verify client is initialized
    assert client is not None
    # Verify collection was created
    collection_name = mock_settings.get_collection_name()
    collections = await client.get_collections()
    collection_names = [c.name for c in collections.collections]
    assert collection_name in collection_names
    # Verify collection has correct dimensions
    collection_info = await client.get_collection(collection_name)
    assert collection_info.config.params.vectors.size == 384
@pytest.mark.integration
 async def test_existing_collection_reused(monkeypatch):
    """Test that existing collection is reused without error."""
    # Mock settings
    from nextcloud_mcp_server.config import Settings
    mock_settings = Settings(
        qdrant_location=":memory:",
        ollama_embedding_model="nomic-embed-text",
        vector_sync_enabled=False,
    )
    monkeypatch.setattr(
        "nextcloud_mcp_server.vector.qdrant_client.get_settings", lambda: mock_settings
    )
    # Mock embedding service - must have .provider attribute
    from nextcloud_mcp_server.embedding import SimpleEmbeddingProvider
    mock_provider = SimpleEmbeddingProvider(dimension=384)
    mock_embedding_service = Mock()
    mock_embedding_service.provider = mock_provider
    mock_embedding_service.get_dimension = lambda: mock_provider.get_dimension()
    monkeypatch.setattr(
        "nextcloud_mcp_server.embedding.get_embedding_service",
        lambda: mock_embedding_service,
    )
    # First call - creates collection
    _ = await get_qdrant_client()
    collection_name = mock_settings.get_collection_name()
    # Reset singleton to simulate second initialization
    import nextcloud_mcp_server.vector.qdrant_client as qdrant_module
    qdrant_module._qdrant_client = None
    # Second call - should reuse existing collection
    client2 = await get_qdrant_client()
    # Verify the second client was initialized
    assert client2 is not None
    # Verify collection still exists and wasn't recreated
    collections = await client2.get_collections()
    collection_names = [c.name for c in collections.collections]
    assert collection_name in collection_names
    # Verify dimensions unchanged
    collection_info = await client2.get_collection(collection_name)
    assert collection_info.config.params.vectors.size == 384
@pytest.mark.integration
 async def test_dimension_mismatch_detected(monkeypatch, tmp_path):
    """Test that dimension mismatch raises clear error."""
    # Use persistent temp directory so collection survives client reset
    from nextcloud_mcp_server.config import Settings
    qdrant_path = str(tmp_path / "qdrant_data")
    mock_settings = Settings(
        qdrant_location=qdrant_path,
        ollama_embedding_model="nomic-embed-text",
        vector_sync_enabled=False,
    )
    monkeypatch.setattr(
        "nextcloud_mcp_server.vector.qdrant_client.get_settings", lambda: mock_settings
    )
    # First embedding service: 384 dimensions
    from nextcloud_mcp_server.embedding import SimpleEmbeddingProvider
    mock_provider_1 = SimpleEmbeddingProvider(dimension=384)
    mock_embedding_service_1 = Mock()
    mock_embedding_service_1.provider = mock_provider_1
    mock_embedding_service_1.get_dimension = lambda: mock_provider_1.get_dimension()
    monkeypatch.setattr(
        "nextcloud_mcp_server.embedding.get_embedding_service",
        lambda: mock_embedding_service_1,
    )
    # First call - creates collection with 384 dimensions
    client1 = await get_qdrant_client()
    collection_name = mock_settings.get_collection_name()
    # Verify collection created
    collection_info = await client1.get_collection(collection_name)
    assert collection_info.config.params.vectors.size == 384
    # Close client1 to release file lock
    await client1.close()
    # Reset singleton (but collection persists in temp directory)
    import nextcloud_mcp_server.vector.qdrant_client as qdrant_module
    qdrant_module._qdrant_client = None
    # Change embedding service to different dimension (768)
    mock_provider_2 = SimpleEmbeddingProvider(dimension=768)
    mock_embedding_service_2 = Mock()
    mock_embedding_service_2.provider = mock_provider_2
    mock_embedding_service_2.get_dimension = lambda: mock_provider_2.get_dimension()
    monkeypatch.setattr(
        "nextcloud_mcp_server.embedding.get_embedding_service",
        lambda: mock_embedding_service_2,
    )
    # Second call - should detect dimension mismatch and raise error
    with pytest.raises(ValueError) as exc_info:
        await get_qdrant_client()
    # Verify error message is helpful
    error_msg = str(exc_info.value)
    assert "Dimension mismatch" in error_msg
    assert "384" in error_msg  # Old dimension
    assert "768" in error_msg  # New dimension
    assert "Solutions:" in error_msg  # Includes helpful solutions
@pytest.mark.integration
 async def test_idempotent_initialization(monkeypatch):
    """Test that multiple calls to get_qdrant_client() are idempotent."""
    # Mock settings
    from nextcloud_mcp_server.config import Settings
    mock_settings = Settings(
        qdrant_location=":memory:",
        ollama_embedding_model="nomic-embed-text",
        vector_sync_enabled=False,
    )
    monkeypatch.setattr(
        "nextcloud_mcp_server.vector.qdrant_client.get_settings", lambda: mock_settings
    )
    # Mock embedding service - must have .provider attribute
    from nextcloud_mcp_server.embedding import SimpleEmbeddingProvider
    mock_provider = SimpleEmbeddingProvider(dimension=384)
    mock_embedding_service = Mock()
    mock_embedding_service.provider = mock_provider
    mock_embedding_service.get_dimension = lambda: mock_provider.get_dimension()
    monkeypatch.setattr(
        "nextcloud_mcp_server.embedding.get_embedding_service",
        lambda: mock_embedding_service,
    )
    # Call multiple times
    client1 = await get_qdrant_client()
    client2 = await get_qdrant_client()
    client3 = await get_qdrant_client()
    # All should return same singleton instance
    assert client1 is client2
    assert client2 is client3
    # Collection should exist
    collection_name = mock_settings.get_collection_name()
    collections = await client1.get_collections()
    collection_names = [c.name for c in collections.collections]
    assert collection_name in collection_names
@pytest.mark.integration
 async def test_collection_name_generation(monkeypatch):
    """Test that collection name is correctly generated from deployment ID and model."""
    # Mock settings with custom deployment ID
    from nextcloud_mcp_server.config import Settings
    mock_settings = Settings(
        qdrant_location=":memory:",
        ollama_embedding_model="test-model",
        vector_sync_enabled=False,
    )
    # Mock deployment ID
    monkeypatch.setenv("MCP_DEPLOYMENT_ID", "test-deployment")
    monkeypatch.setattr(
        "nextcloud_mcp_server.vector.qdrant_client.get_settings", lambda: mock_settings
    )
    # Mock embedding service - must have .provider attribute
    from nextcloud_mcp_server.embedding import SimpleEmbeddingProvider
    mock_provider = SimpleEmbeddingProvider(dimension=384)
    mock_embedding_service = Mock()
    mock_embedding_service.provider = mock_provider
    mock_embedding_service.get_dimension = lambda: mock_provider.get_dimension()
    monkeypatch.setattr(
        "nextcloud_mcp_server.embedding.get_embedding_service",
        lambda: mock_embedding_service,
    )
    # Get client
    client = await get_qdrant_client()
    # Verify collection name includes the deployment ID or the model name
    collection_name = mock_settings.get_collection_name()
    assert "test-deployment" in collection_name or "test-model" in collection_name
    # Verify collection was created with that name
    collections = await client.get_collections()
    collection_names = [c.name for c in collections.collections]
    assert collection_name in collection_names
@pytest.mark.integration
 async def test_collection_uses_cosine_distance(monkeypatch):
    """Test that created collection uses COSINE distance metric."""
    # Mock settings
    from nextcloud_mcp_server.config import Settings
    mock_settings = Settings(
        qdrant_location=":memory:",
        ollama_embedding_model="nomic-embed-text",
        vector_sync_enabled=False,
    )
    monkeypatch.setattr(
        "nextcloud_mcp_server.vector.qdrant_client.get_settings", lambda: mock_settings
    )
    # Mock embedding service - must have .provider attribute
    from nextcloud_mcp_server.embedding import SimpleEmbeddingProvider
    mock_provider = SimpleEmbeddingProvider(dimension=384)
    mock_embedding_service = Mock()
    mock_embedding_service.provider = mock_provider
    mock_embedding_service.get_dimension = lambda: mock_provider.get_dimension()
    monkeypatch.setattr(
        "nextcloud_mcp_server.embedding.get_embedding_service",
        lambda: mock_embedding_service,
    )
    # Get client (creates collection)
    client = await get_qdrant_client()
    # Verify collection uses COSINE distance
    collection_name = mock_settings.get_collection_name()
    collection_info = await client.get_collection(collection_name)
    from qdrant_client.models import Distance
    assert collection_info.config.params.vectors.distance == Distance.COSINE
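Every test in the file above repeats the same block that builds a mock embedding service around a real provider. A small helper could carry that setup. The sketch below is a suggestion only: `make_mock_embedding_service` is a hypothetical name, and `FixedDimensionProvider` stands in for the project's `SimpleEmbeddingProvider` so the example runs on its own.

```python
from unittest.mock import Mock


class FixedDimensionProvider:
    """Hypothetical stand-in for SimpleEmbeddingProvider."""

    def __init__(self, dimension: int) -> None:
        self.dimension = dimension

    def get_dimension(self) -> int:
        return self.dimension


def make_mock_embedding_service(dimension: int = 384) -> Mock:
    """Return a Mock service wrapping a real provider of the given dimension."""
    provider = FixedDimensionProvider(dimension)
    service = Mock()
    # The provider is a plain object rather than a Mock, so a check such as
    # hasattr(provider, "_detect_dimension") is False, as it would be for a
    # provider without dynamic detection.
    service.provider = provider
    service.get_dimension = provider.get_dimension
    return service
```

In the tests this could be passed to `monkeypatch.setattr` in place of the inline setup, for example `lambda: make_mock_embedding_service(768)` for the mismatch case.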
@@ -1053,7 +1053,7 @@ wheels = [
 [[package]]
 name = "nextcloud-mcp-server"
-version = "0.31.1"
+version = "0.33.0"
 source = { editable = "." }
 dependencies = [
    { name = "aiosqlite" },
Author	SHA1	Message	Date
Chris Coutinho	4e43d15153	fix: Move grafana_folder from labels to annotations Fixes Kubernetes label validation error when deploying dashboard ConfigMap. Problem: - Kubernetes labels cannot contain spaces (validation regex: [A-Za-z0-9][-A-Za-z0-9_.]*[A-Za-z0-9]) - Previous implementation had grafana_folder: "Nextcloud MCP" as a label - Deployment failed with: "Invalid value: 'Nextcloud MCP'" Solution: - Move grafana_folder from labels to annotations (annotations allow spaces) - Keep grafana_dashboard="1" as label for ConfigMap discovery - Grafana sidecar reads folder name from folderAnnotation parameter Changes: - dashboard-configmap.yaml: Move grafana_folder to annotations section - dashboards/README.md: Fix kubectl commands to use annotations - values.yaml: Update comments to clarify annotation usage This follows the standard kube-prometheus-stack pattern where: - Labels are used for ConfigMap discovery (strict validation) - Annotations are used for metadata like folder names (relaxed validation) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-13 13:08:45 +01:00
github-actions[bot]	15951c38fa	bump: version 0.32.1 → 0.33.0	2025-11-13 10:58:05 +00:00
Chris Coutinho	2de0590839	Merge pull request #292 from cbcoutinho/feat/grafana-dashboard-and-vector-metrics feat: Add Grafana dashboard and vector sync metric instrumentation	2025-11-13 11:57:40 +01:00
Chris Coutinho	4ea5ed72d4	feat: Add Grafana dashboard and vector sync metric instrumentation Implement comprehensive observability for vector database synchronization with Grafana dashboard and Prometheus metrics. ## Part 1: Grafana Dashboard Created all-in-one operations dashboard with 7 rows and 34 panels: ### Dashboard Structure: - Overview Row: Request rate, error rate, P95 latency, active requests - HTTP Metrics (RED): Request/error rates by endpoint, latency percentiles - MCP Tools: Call volume, error rates, execution duration by tool - Nextcloud API: API calls/latency by app, retry patterns - OAuth & Authentication: Token validations, exchanges, cache hit rate - Dependencies & Health: Status for Nextcloud/Qdrant/Keycloak/Unstructured - Vector Sync: Processing throughput, queue depth, Qdrant operations ### Helm Chart Integration: - Added dashboard-configmap.yaml template for automatic provisioning - Configured Grafana sidecar auto-discovery (label: grafana_dashboard="1") - Added dashboards configuration section in values.yaml (opt-in) - Updated Chart.yaml with dashboard annotations - Enhanced NOTES.txt with dashboard deployment instructions - Comprehensive documentation in dashboards/README.md Dashboard supports dynamic filtering via variables: - datasource: Prometheus data source selection - namespace: Filter by Kubernetes namespace - pod: Multi-select pod filtering - interval: Query interval (1m/5m/10m/30m/1h) ## Part 2: Vector Sync Metric Instrumentation Implemented metric recording throughout vector sync pipeline: ### metrics.py: Added convenience functions: - record_vector_sync_scan() - Track documents per scan - record_vector_sync_processing() - Track processing duration/status - record_qdrant_operation() - Track database operations - update_vector_sync_queue_size() - Track queue depth ### scanner.py: - Record number of documents found in each scan - Enables monitoring of scan throughput ### processor.py: - Record processing duration for each document - 
Track success/failure status with timing - Record Qdrant upsert/delete operations - Handle all code paths (success, deletion, error) ### semantic.py: - Wrap Qdrant query_points with try/except - Record search operation success/failure ## Metrics Exposed: - mcp_vector_sync_documents_scanned_total - mcp_vector_sync_documents_processed_total{status} - mcp_vector_sync_processing_duration_seconds (histogram) - mcp_vector_sync_queue_size (gauge) - mcp_qdrant_operations_total{operation,status} This enables monitoring of: - Scan and processing throughput - Processing latency (P50/P95/P99) - Error rates for processing and Qdrant operations - Queue depth trends - Complete observability of vector sync pipeline ## Testing: Verified locally that metrics are recorded correctly: - 36 documents scanned - 3 documents processed (avg 7.5s each) - 3 successful Qdrant upsert operations - Search operations tracked ## Deployment: Enable dashboard provisioning in Helm values: ```yaml dashboards: enabled: true grafanaFolder: "Nextcloud MCP" ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-13 11:49:20 +01:00
Chris Coutinho	d1829fbbd6	Merge pull request #291 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-0.x chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.9	2025-11-13 08:02:35 +01:00
renovate-bot-cbcoutinho[bot]	8332542959	chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.9	2025-11-12 23:11:29 +00:00
Chris Coutinho	619ba5684d	build: Add ./worktrees to .gitignore	2025-11-12 08:27:33 +01:00
github-actions[bot]	747d297008	bump: version 0.32.0 → 0.32.1	2025-11-12 02:16:57 +00:00
Chris Coutinho	ba8486b73b	Merge pull request #289 from cbcoutinho/fix/dynamic-embedding-dimensions fix: add dynamic dimension detection for Ollama embedding models	2025-11-12 03:16:29 +01:00
Chris Coutinho	6812e1aca7	fix: add dynamic dimension detection for Ollama embedding models This fixes dimension mismatch errors when using embedding models with non-standard dimensions (e.g., qwen3-embedding:4b produces 2560-dim vectors instead of the hardcoded 768). Changes: - OllamaEmbeddingProvider: Detect dimensions dynamically by generating test embedding instead of hardcoding to 768 - qdrant_client: Call dimension detection before collection creation - app.py: Initialize Qdrant collection before starting background tasks in streamable-http transport path - tests: Fix integration tests to properly mock EmbeddingService wrapper Fixes dimension mismatch error: "could not broadcast input array from shape (2560,) into shape (768,)" All integration tests passing (6/6). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-12 02:46:30 +01:00
github-actions[bot]	49a9dd43c6	bump: version 0.31.1 → 0.32.0	2025-11-11 23:54:43 +00:00
Chris Coutinho	f6656fee06	Merge pull request #288 from cbcoutinho/feat/webhook-testing-validation feat: webhook-based vector sync with management UI and validation	2025-11-12 00:54:20 +01:00
Chris Coutinho	0005e0dce0	Merge pull request #286 from cbcoutinho/renovate/docker.io-library-mariadb-lts chore(deps): update docker.io/library/mariadb:lts docker digest to 404ebf2	2025-11-11 09:17:23 +01:00
Chris Coutinho	636e5105c3	Merge pull request #287 from cbcoutinho/renovate/astral-sh-setup-uv-7.x chore(deps): update astral-sh/setup-uv action to v7.1.3	2025-11-11 09:17:16 +01:00
renovate-bot-cbcoutinho[bot]	ee7080afb3	chore(deps): update astral-sh/setup-uv action to v7.1.3	2025-11-10 23:10:10 +00:00
renovate-bot-cbcoutinho[bot]	b52f482a51	chore(deps): update docker.io/library/mariadb:lts docker digest to 404ebf2	2025-11-10 23:10:04 +00:00
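The most recent commit in the table above (4e43d15153) moves `grafana_folder` out of labels because label values cannot contain spaces. The check can be reproduced locally before deploying. The sketch below assumes the standard Kubernetes label-value rules (at most 63 characters; empty, or alphanumeric at both ends with `-`, `_`, `.` allowed in between); `is_valid_label_value` is a hypothetical helper, not part of the chart.

```python
import re

# Label value pattern: empty, or starts and ends with an alphanumeric
# character, with '-', '_', '.' and alphanumerics allowed in between.
LABEL_VALUE_RE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")
MAX_LABEL_VALUE_LENGTH = 63


def is_valid_label_value(value: str) -> bool:
    """Return True if `value` would pass label-value validation."""
    return (
        len(value) <= MAX_LABEL_VALUE_LENGTH
        and LABEL_VALUE_RE.fullmatch(value) is not None
    )


if __name__ == "__main__":
    for candidate in ("1", "Nextcloud MCP", "Nextcloud-MCP"):
        print(f"{candidate!r}: {is_valid_label_value(candidate)}")
```

Annotation values are not subject to this pattern, which is why the folder name with a space is accepted once it is moved there.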