bump: version 0.33.0 → 0.33.1

Merge pull request #293 from cbcoutinho/fix/grafana-folder-label-validation
fix: Move grafana_folder from labels to annotations
16 changed files with 1998 additions and 1800 deletions
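The fix in this changeset rests on Kubernetes label-value validation: a label value may contain only alphanumerics, `-`, `_` and `.`, so a folder name with a space is rejected as a label but accepted as an annotation. The following is a minimal sketch of that check, assuming the label-value rule documented by Kubernetes (at most 63 characters, empty allowed). The helper name `is_valid_label_value` is illustrative and not part of this repository.

```python
import re

# Kubernetes label values: empty, or up to 63 characters that start and end
# with an alphanumeric and use only alphanumerics, '-', '_' or '.' in between.
_LABEL_VALUE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def is_valid_label_value(value: str) -> bool:
    """Return True if `value` would pass Kubernetes label-value validation."""
    return len(value) <= 63 and _LABEL_VALUE.fullmatch(value) is not None
```

Under this check `"1"` (the `grafana_dashboard` value) passes and `"Nextcloud MCP"` fails because of the space, which matches the deployment error described in the commit message.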
@@ -1,3 +1,15 @@
 ## v0.33.1 (2025-11-13)
 ### Fix
 - Move grafana_folder from labels to annotations
 ## v0.33.0 (2025-11-13)
 ### Feat
 - Add Grafana dashboard and vector sync metric instrumentation
 ## v0.32.1 (2025-11-12)
 ### Fix
@@ -1,4 +1,4 @@
-FROM ghcr.io/astral-sh/uv:0.9.8-python3.11-alpine@sha256:6c842c49ad032f46b62f32a7e7779f45f12671a8e0d82ea24c766ab62d58b396
+FROM ghcr.io/astral-sh/uv:0.9.9-python3.11-alpine@sha256:0faa7934fac1db7f5056f159c1224d144bab864fd2677a4066d25a686ae32edd
 # Install dependencies
 # 1. git (required for caldav dependency from git)
@@ -2,8 +2,8 @@ apiVersion: v2
 name: nextcloud-mcp-server
 description: A Helm chart for Nextcloud MCP Server - enables AI assistants to interact with Nextcloud
 type: application
-version: 0.32.1
+version: 0.33.1
-appVersion: "0.32.1"
+appVersion: "0.33.1"
 keywords:
  - nextcloud
  - mcp
@@ -21,6 +21,10 @@ home: https://github.com/cbcoutinho/nextcloud-mcp-server
 sources:
  - https://github.com/cbcoutinho/nextcloud-mcp-server
 icon: https://raw.githubusercontent.com/nextcloud/server/master/core/img/logo/logo.svg
 annotations:
  # Grafana dashboard support
  grafana_dashboard: "true"
  grafana_dashboard_folder: "Nextcloud MCP"
 dependencies:
  - name: qdrant
    version: "1.15.5"
@@ -280,6 +280,72 @@ Use OpenAI or any OpenAI-compatible API instead of Ollama.
 | `openai.secretKey` | Key in secret containing API key | `api-key` |
 | `openai.baseUrl` | Custom API endpoint (optional) | `""` |
 #### Observability & Monitoring
 The chart includes comprehensive observability features including Prometheus metrics, OpenTelemetry tracing, and Grafana dashboards.
 **Metrics Configuration:**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `observability.metrics.enabled` | Enable Prometheus metrics | `true` |
 | `observability.metrics.port` | Metrics port | `9090` |
 | `observability.metrics.path` | Metrics endpoint path | `/metrics` |
 **Tracing Configuration:**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `observability.tracing.enabled` | Enable OpenTelemetry tracing | `false` |
 | `observability.tracing.endpoint` | OTLP collector endpoint | `""` |
 | `observability.tracing.serviceName` | Service name in traces | `nextcloud-mcp-server` |
 | `observability.tracing.samplingRate` | Trace sampling rate (0.0-1.0) | `1.0` |
 **Logging Configuration:**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `observability.logging.format` | Log format (json or text) | `json` |
 | `observability.logging.level` | Log level | `INFO` |
 | `observability.logging.includeTraceContext` | Include trace IDs in logs | `true` |
 **ServiceMonitor (Prometheus Operator):**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `serviceMonitor.enabled` | Create ServiceMonitor resource | `false` |
 | `serviceMonitor.interval` | Scrape interval | `30s` |
 | `serviceMonitor.scrapeTimeout` | Scrape timeout | `10s` |
 | `serviceMonitor.labels` | Additional labels for ServiceMonitor | `{}` |
 **PrometheusRule (Prometheus Operator):**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `prometheusRule.enabled` | Create PrometheusRule with alert rules | `false` |
 | `prometheusRule.labels` | Additional labels for PrometheusRule | `{}` |
 **Grafana Dashboards:**
 | Parameter | Description | Default |
 |-----------|-------------|---------|
 | `dashboards.enabled` | Enable automatic dashboard provisioning | `false` |
 | `dashboards.grafanaFolder` | Grafana folder name for dashboards | `Nextcloud MCP` |
 | `dashboards.labels` | Additional labels for dashboard ConfigMap | `{}` |
 | `dashboards.annotations` | Additional annotations for dashboard ConfigMap | `{}` |
 When `dashboards.enabled` is `true`, the chart creates a ConfigMap containing the Grafana dashboard, labeled `grafana_dashboard: "1"` and annotated with `grafana_folder` set to `dashboards.grafanaFolder`. The label enables automatic discovery by Grafana sidecar containers (commonly used with kube-prometheus-stack).
 The dashboard provides comprehensive monitoring including:
 - HTTP request metrics (RED pattern: Rate, Errors, Duration)
 - MCP tool performance and errors
 - Nextcloud API performance by app (notes, calendar, contacts, etc.)
 - OAuth token operations and cache hit rates
 - External dependency health (Nextcloud, Qdrant, Keycloak, Unstructured API)
 - Vector sync processing pipeline (when enabled)
 For manual import or more details, see `charts/nextcloud-mcp-server/dashboards/README.md`.
 ## Examples
 ### Example 1: Basic Auth with Ingress
@@ -6,14 +6,57 @@ This directory contains example Grafana dashboards for monitoring the Nextcloud
 ### nextcloud-mcp-server.json
-Comprehensive dashboard with the following panels:
+All-in-one Operations Dashboard with comprehensive monitoring across all system components.
- **Request Rate**: HTTP requests per second by method and endpoint
+#### Overview Row
- **Error Rate**: Percentage of 5xx errors
+High-level metrics for quick health assessment:
- **Request Latency**: P50 and P95 latency by endpoint
+- **Request Rate** (stat): Total requests per second
- **Top MCP Tools**: Most frequently called tools
+- **Error Rate** (stat): Percentage of 5xx errors with color thresholds
- **Nextcloud API Latency**: API call latency by app (notes, calendar, etc.)
+- **P95 Latency** (stat): 95th percentile request latency
- **Vector Sync Queue**: Queue size for background document processing
+- **Active Requests** (stat): Current in-flight requests
 #### HTTP Metrics (RED Pattern)
 Core request/error/duration metrics:
 - **Request Rate by Endpoint** (timeseries): RPS breakdown by endpoint
 - **Error Rate by Status Code** (timeseries): Error rates for 4xx/5xx codes
 - **Latency Percentiles** (timeseries): P50, P95, P99 latency trends
 - **Status Code Distribution** (piechart): Percentage breakdown of all status codes
 #### MCP Tools Row
 MCP-specific tool performance:
 - **Top Tools by Call Volume** (bargauge): Top 10 most-called tools
 - **Tool Error Rate** (timeseries): Error rates per tool
 - **Tool Execution Duration** (timeseries): P95 latency by tool
 #### Nextcloud API Row
 Backend API performance metrics:
 - **API Calls by App** (timeseries): Request rate per Nextcloud app (notes, calendar, contacts, etc.)
 - **API Latency by App** (timeseries): P95 latency per app
 - **API Retries by Reason** (timeseries): Retry patterns (429, timeout, connection errors)
 - **API Error Rate** (stat): Overall API error percentage
 #### OAuth & Authentication Row
 OAuth token operations and caching:
 - **Token Validations** (timeseries): Success/failure rates for token validation
 - **Token Exchange Operations** (timeseries): RFC 8693 token exchange operations
 - **Token Cache Hit Rate** (stat): Percentage of cache hits (color-coded: red<50%, yellow<80%, green≥80%)
 - **Refresh Token Operations** (timeseries): Refresh token storage operations by type
 #### Dependencies & Health Row
 External dependency status monitoring:
 - **Nextcloud Health** (stat): UP/DOWN status with color coding
 - **Qdrant Health** (stat): Vector database health status
 - **Keycloak Health** (stat): Identity provider health status
 - **Unstructured API Health** (stat): Document processing API status
 - **Health Check Duration** (timeseries): Health check latency by dependency
 - **Database Operation Latency** (timeseries): P95 latency for DB operations (SQLite, Qdrant)
 #### Vector Sync Row (when enabled)
 Document processing pipeline metrics:
 - **Documents Processed Rate** (timeseries): Processing throughput by status (success/failure)
 - **Processing Queue Depth** (gauge): Current queue size with thresholds (yellow>50, red>100)
 - **Qdrant Operations** (timeseries): Vector database operations by type
 - **Document Processing Duration** (timeseries): P95 processing latency
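The color thresholds quoted for the Token Cache Hit Rate and Processing Queue Depth panels can be read as simple step functions. The sketch below only restates the boundaries listed above; the function names are hypothetical, and "green" as the base color for queue depth is an assumption, since the list names only the yellow and red thresholds.

```python
def cache_hit_rate_color(hit_rate_percent: float) -> str:
    """Map a cache hit rate in percent to the documented color band."""
    if hit_rate_percent < 50:
        return "red"
    if hit_rate_percent < 80:
        return "yellow"
    return "green"


def queue_depth_color(depth: int) -> str:
    """Map a queue depth to its color band (base color assumed to be green)."""
    if depth > 100:
        return "red"
    if depth > 50:
        return "yellow"
    return "green"
```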
 ## Importing to Grafana
@@ -25,49 +68,77 @@ Comprehensive dashboard with the following panels:
 4. Select your Prometheus data source
 5. Click "Import"
-### Automated Import (Kubernetes)
+### Automated Import (Helm Chart)
-If using the Grafana Operator or kube-prometheus-stack, you can create a ConfigMap:
+The Helm chart now supports automatic dashboard provisioning via the Grafana sidecar pattern.
 #### Option 1: Using Helm Chart (Recommended)
 Enable dashboard provisioning in your Helm values:
 ```yaml
 # values.yaml for nextcloud-mcp-server chart
 dashboards:
  enabled: true
  grafanaFolder: "Nextcloud MCP"  # Folder name in Grafana
  labels: {}  # Additional labels if needed
 ```
 Then deploy or upgrade:
 ```bash
-kubectl create configmap nextcloud-mcp-dashboards \
+helm upgrade --install nextcloud-mcp nextcloud-mcp-server \
  --set dashboards.enabled=true
 ```
 The dashboard will be automatically imported by Grafana if the sidecar is configured
 to watch for ConfigMaps with label `grafana_dashboard: "1"`.
 #### Option 2: Using kube-prometheus-stack
 If using kube-prometheus-stack with Grafana sidecar enabled, the dashboard will be
 automatically discovered and imported. Ensure your Grafana deployment has:
 ```yaml
 # kube-prometheus-stack values
 grafana:
  sidecar:
    dashboards:
      enabled: true
      label: grafana_dashboard
      folder: /tmp/dashboards
      provider:
        foldersFromFilesStructure: true
 ```
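To make the discovery step concrete, here is a simplified model of what a dashboard sidecar does with these settings: it selects ConfigMaps that carry the discovery label and reads the target folder from an annotation. This is a sketch for illustration, not the sidecar's actual code. It assumes the sidecar has been told to read the folder from an annotation named `grafana_folder` (the commit message refers to a `folderAnnotation` parameter for this), and the function name `discover_dashboards` is hypothetical.

```python
def discover_dashboards(
    configmaps: list,
    label: str = "grafana_dashboard",
    folder_annotation: str = "grafana_folder",
) -> dict:
    """Return {dashboard filename: folder name} for labeled ConfigMaps."""
    found = {}
    for cm in configmaps:
        meta = cm.get("metadata") or {}
        labels = meta.get("labels") or {}
        if label not in labels:
            continue  # not marked for discovery
        annotations = meta.get("annotations") or {}
        # An empty folder name stands for "no folder annotation present".
        folder = annotations.get(folder_annotation, "")
        for filename in cm.get("data") or {}:
            found[filename] = folder
    return found
```

A ConfigMap with the label but without the annotation is still discovered; in this model it simply has no folder assigned.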
 #### Option 3: Manual ConfigMap Creation
 For other Grafana setups, create a ConfigMap manually:
 ```bash
 kubectl create configmap nextcloud-mcp-dashboard \
  --from-file=nextcloud-mcp-server.json \
  -n monitoring
-# Add label for Grafana sidecar to discover
+# Add sidecar discovery label
-kubectl label configmap nextcloud-mcp-dashboards \
+kubectl label configmap nextcloud-mcp-dashboard \
  grafana_dashboard=1 \
  -n monitoring
 ```
-Or add to your Helm values:
+# Add folder annotation (annotations support spaces, unlike labels)
-
+kubectl annotate configmap nextcloud-mcp-dashboard \
-```yaml
+  grafana_folder="Nextcloud MCP" \
-# values.yaml for kube-prometheus-stack
+  -n monitoring
 grafana:
  dashboardProviders:
    dashboardproviders.yaml:
      apiVersion: 1
      providers:
        - name: 'nextcloud-mcp'
          orgId: 1
          folder: 'Nextcloud MCP'
          type: file
          disableDeletion: false
          editable: true
          options:
            path: /var/lib/grafana/dashboards/nextcloud-mcp
  dashboardsConfigMaps:
    nextcloud-mcp: nextcloud-mcp-dashboards
 ```
 ## Dashboard Variables
-The dashboard includes two variables:
+The dashboard includes four template variables for dynamic filtering:
- **Data Source**: Select your Prometheus data source
+- **datasource**: Select your Prometheus data source
- **Namespace**: Filter metrics by Kubernetes namespace
+- **namespace**: Filter metrics by Kubernetes namespace (supports "All")
 - **pod**: Filter by specific pod(s) - multi-select enabled (supports "All")
 - **interval**: Query interval for rate calculations (1m, 5m, 10m, 30m, 1h - default: 5m)
 ## Customization
@@ -96,6 +96,30 @@ Your Nextcloud MCP Server has been deployed in {{ .Values.auth.mode }} authentic
   kubectl --namespace {{ .Release.Namespace }} exec -it deploy/{{ include "nextcloud-mcp-server.fullname" . }} -- curl -s http://localhost:{{ include "nextcloud-mcp-server.port" . }}/user/page | grep "Vector Sync"
 {{- end }}
 {{- if .Values.dashboards.enabled }}
 6. Grafana Dashboards:
   - Dashboard provisioning: Enabled
   - ConfigMap: {{ include "nextcloud-mcp-server.fullname" . }}-dashboard
   - Grafana Folder: {{ .Values.dashboards.grafanaFolder }}
   The dashboard will be automatically imported by Grafana if the sidecar is configured
   to watch for ConfigMaps with label "grafana_dashboard: 1".
   To manually import the dashboard:
   kubectl --namespace {{ .Release.Namespace }} get configmap {{ include "nextcloud-mcp-server.fullname" . }}-dashboard -o jsonpath='{.data.nextcloud-mcp-server\.json}' | jq . > dashboard.json
   Then import dashboard.json via Grafana UI (Dashboards → Import).
 {{- else }}
 6. Grafana Dashboards:
   - Dashboard provisioning: Disabled
   - To enable automatic dashboard provisioning, set: dashboards.enabled=true
   Manual import option:
   The dashboard JSON is available in the chart at charts/nextcloud-mcp-server/dashboards/nextcloud-mcp-server.json
 {{- end }}
 For more information and documentation:
 - GitHub: https://github.com/cbcoutinho/nextcloud-mcp-server
 - Documentation: https://github.com/cbcoutinho/nextcloud-mcp-server#readme
@@ -0,0 +1,25 @@
 {{- if .Values.dashboards.enabled }}
 apiVersion: v1
 kind: ConfigMap
 metadata:
  name: {{ include "nextcloud-mcp-server.fullname" . }}-dashboard
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "nextcloud-mcp-server.labels" . | nindent 4 }}
    {{- with .Values.dashboards.labels }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
    # Grafana sidecar discovery label
    grafana_dashboard: "1"
  annotations:
    {{- with .Values.dashboards.annotations }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
    # Grafana folder name (annotations support spaces, unlike labels)
    {{- if .Values.dashboards.grafanaFolder }}
    grafana_folder: {{ .Values.dashboards.grafanaFolder | quote }}
    {{- end }}
 data:
  nextcloud-mcp-server.json: |-
 {{ .Files.Get "dashboards/nextcloud-mcp-server.json" | indent 4 }}
 {{- end }}
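The template above can be read as a small function from chart values to a ConfigMap manifest. The sketch below mirrors that logic in Python so the label and annotation handling is easy to check. It is an approximation: the chart's common labels (from the `nextcloud-mcp-server.labels` helper) are omitted, and the function name `build_dashboard_configmap` is hypothetical.

```python
from typing import Optional


def build_dashboard_configmap(
    fullname: str,
    namespace: str,
    dashboard_json: str,
    grafana_folder: Optional[str] = "Nextcloud MCP",
    extra_labels: Optional[dict] = None,
    extra_annotations: Optional[dict] = None,
) -> dict:
    """Build the dashboard ConfigMap as a plain dict."""
    labels = dict(extra_labels or {})
    # Discovery label: its value must satisfy label validation.
    labels["grafana_dashboard"] = "1"

    annotations = dict(extra_annotations or {})
    # Folder name goes in an annotation because it may contain spaces.
    if grafana_folder:
        annotations["grafana_folder"] = grafana_folder

    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": f"{fullname}-dashboard",
            "namespace": namespace,
            "labels": labels,
            "annotations": annotations,
        },
        "data": {"nextcloud-mcp-server.json": dashboard_json},
    }
```

Setting `grafana_folder` to an empty string drops the annotation, matching the `if .Values.dashboards.grafanaFolder` guard in the template.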
@@ -205,6 +205,20 @@ prometheusRule:
  # Additional labels for PrometheusRule (e.g., for Prometheus selector)
  # Example: { prometheus: kube-prometheus }
 # Grafana dashboards (requires Grafana with sidecar enabled)
 dashboards:
  # Enable automatic dashboard provisioning via ConfigMap
  enabled: false
  # Grafana folder name where dashboards will be imported
  # The grafana-sidecar looks for ConfigMaps with label "grafana_dashboard: 1"
  # and reads the folder name from annotation "grafana_folder" (supports spaces)
  grafanaFolder: "Nextcloud MCP"
  # Additional labels for dashboard ConfigMap
  # These will be added alongside the required "grafana_dashboard: 1" label
  labels: {}
  # Additional annotations for dashboard ConfigMap
  annotations: {}
 service:
  type: ClusterIP
  port: 8000
@@ -352,3 +352,46 @@ def record_dependency_check(dependency: str, duration: float) -> None:
        duration: Check duration in seconds
    """
    dependency_check_duration_seconds.labels(dependency=dependency).observe(duration)
 def record_vector_sync_scan(documents_found: int) -> None:
    """
    Record documents scanned during vector sync.
    Args:
        documents_found: Number of documents discovered in scan
    """
    vector_sync_documents_scanned_total.inc(documents_found)
 def record_vector_sync_processing(duration: float, status: str = "success") -> None:
    """
    Record document processing with duration and status.
    Args:
        duration: Processing duration in seconds
        status: "success" or "error"
    """
    vector_sync_documents_processed_total.labels(status=status).inc()
    vector_sync_processing_duration_seconds.observe(duration)
 def record_qdrant_operation(operation: str, status: str = "success") -> None:
    """
    Record Qdrant vector database operation.
    Args:
        operation: Operation type ("upsert", "search", "delete")
        status: "success" or "error"
    """
    qdrant_operations_total.labels(operation=operation, status=status).inc()
 def update_vector_sync_queue_size(size: int) -> None:
    """
    Update vector sync queue size gauge.
    Args:
        size: Current queue size
    """
    vector_sync_queue_size.set(size)
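Helpers like these are thin wrappers, so they can be checked without a running Prometheus by swapping the metric object for a mock. The sketch below does this for `record_qdrant_operation` using only the standard library. The stand-in counter is an assumption: the real `qdrant_operations_total` is defined elsewhere in the module and is not shown in this hunk.

```python
from unittest.mock import MagicMock

# Stand-in for the module-level counter (assumption: the real object exposes
# .labels(...).inc(), as the helper in the hunk above implies).
qdrant_operations_total = MagicMock(name="qdrant_operations_total")


def record_qdrant_operation(operation: str, status: str = "success") -> None:
    """Copy of the helper above, bound to the stand-in counter."""
    qdrant_operations_total.labels(operation=operation, status=status).inc()


def test_record_qdrant_operation_labels() -> None:
    """The helper should pick the labeled child once and increment it once."""
    qdrant_operations_total.reset_mock()
    record_qdrant_operation("search", "error")
    qdrant_operations_total.labels.assert_called_once_with(
        operation="search", status="error"
    )
    qdrant_operations_total.labels.return_value.inc.assert_called_once_with()
```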
@@ -21,6 +21,7 @@ from nextcloud_mcp_server.models.semantic import (
    SemanticSearchResult,
    VectorSyncStatusResponse,
 )
 from nextcloud_mcp_server.observability.metrics import record_qdrant_operation
 logger = logging.getLogger(__name__)
@@ -85,26 +86,33 @@ def configure_semantic_tools(mcp: FastMCP):
            # Note: Currently only searching notes (doc_type="note")
            # Future: Remove doc_type filter to search all apps
            qdrant_client = await get_qdrant_client()
-            search_response = await qdrant_client.query_points(
+            try:
-                collection_name=settings.get_collection_name(),
+                search_response = await qdrant_client.query_points(
-                query=query_embedding,
+                    collection_name=settings.get_collection_name(),
-                query_filter=Filter(
+                    query=query_embedding,
-                    must=[
+                    query_filter=Filter(
-                        FieldCondition(
+                        must=[
-                            key="user_id",
+                            FieldCondition(
-                            match=MatchValue(value=username),
+                                key="user_id",
-                        ),
+                                match=MatchValue(value=username),
-                        FieldCondition(
+                            ),
-                            key="doc_type",
+                            FieldCondition(
-                            match=MatchValue(value="note"),
+                                key="doc_type",
-                        ),
+                                match=MatchValue(value="note"),
-                    ]
+                            ),
-                ),
+                        ]
-                limit=limit * 2,  # Get extra for filtering
+                    ),
-                score_threshold=score_threshold,
+                    limit=limit * 2,  # Get extra for filtering
-                with_payload=True,
+                    score_threshold=score_threshold,
-                with_vectors=False,  # Don't return vectors to save bandwidth
+                    with_payload=True,
-            )
+                    with_vectors=False,  # Don't return vectors to save bandwidth
                )
                # Record successful search operation
                record_qdrant_operation("search", "success")
            except Exception:
                # Record failed search operation
                record_qdrant_operation("search", "error")
                raise
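The success/error recording around `query_points` above follows a pattern that recurs wherever a Qdrant call is instrumented. One way to express it once is a small context manager, sketched here under the assumption that the recorder has the same `(operation, status)` signature as `record_qdrant_operation`. The name `track_operation` is hypothetical and does not exist in the repository.

```python
import asyncio
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def track_operation(
    record: Callable[[str, str], None], operation: str
) -> Iterator[None]:
    """Record "success" if the block completes, "error" if it raises."""
    try:
        yield
    except Exception:
        record(operation, "error")
        raise  # the caller still sees the original exception
    else:
        record(operation, "success")


async def demo(record: Callable[[str, str], None]) -> str:
    # A plain `with` block may contain awaits inside a coroutine.
    with track_operation(record, "search"):
        await asyncio.sleep(0)  # placeholder for the real Qdrant call
    return "done"
```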
            logger.info(
                f"Qdrant returned {len(search_response.points)} results "
@@ -331,21 +339,71 @@ def configure_semantic_tools(mcp: FastMCP):
                success=True,
            )
-        # 4. Construct context from retrieved documents
+        # 4. Fetch full content for notes to provide complete context to LLM
        # Filter out inaccessible notes (deleted or permissions changed)
        client = await get_client(ctx)
        accessible_results = []
        full_contents = []  # Full content for accessible notes
        for result in search_response.results:
            if result.doc_type == "note":
                try:
                    note = await client.notes.get_note(result.id)
                    # Note is accessible, store full content
                    accessible_results.append(result)
                    full_contents.append(note.get("content", ""))
                    logger.debug(
                        f"Fetched full content for note {result.id} "
                        f"(length: {len(full_contents[-1])} chars)"
                    )
                except Exception as e:
                    # Note might have been deleted or permissions changed
                    # Filter it out to avoid corrupting LLM with inaccessible data
                    logger.warning(
                        f"Failed to fetch full content for note {result.id}: {e}. "
                        f"Excluding from results."
                    )
            else:
                # Non-note document types (future: calendar, deck, files)
                # For now, keep them with excerpts
                accessible_results.append(result)
                full_contents.append(None)
        # Check if we filtered out all results
        if not accessible_results:
            logger.warning(f"All search results became inaccessible for query: {query}")
            return SamplingSearchResponse(
                query=query,
                generated_answer="All matching documents are no longer accessible.",
                sources=[],
                total_found=0,
                search_method="semantic_sampling",
                success=True,
            )
        # 5. Construct context from accessible documents with full content
        context_parts = []
-        for idx, result in enumerate(search_response.results, 1):
+        for idx, (result, content) in enumerate(
            zip(accessible_results, full_contents), 1
        ):
            # Use full content if available (notes), otherwise use excerpt
            if content is not None:
                content_field = f"Content: {content}"
            else:
                content_field = f"Excerpt: {result.excerpt}"
            context_parts.append(
                f"[Document {idx}]\n"
                f"Type: {result.doc_type}\n"
                f"Title: {result.title}\n"
                f"Category: {result.category}\n"
-                f"Excerpt: {result.excerpt}\n"
+                f"{content_field}\n"
                f"Relevance Score: {result.score:.2f}\n"
            )
        context = "\n".join(context_parts)
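The filtering and context-building steps above can be exercised in isolation on toy data. The sketch below assumes search results are plain dicts and that note content comes from a caller-supplied coroutine; both are simplifications of the real `SemanticSearchResult` objects and `client.notes.get_note`, and the function names are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Tuple


async def collect_accessible(
    results: List[dict],
    fetch_note: Callable[[int], Awaitable[str]],
) -> Tuple[List[dict], List[Optional[str]]]:
    """Keep results whose content can still be fetched, in original order."""
    accessible: List[dict] = []
    contents: List[Optional[str]] = []
    for result in results:
        if result["doc_type"] != "note":
            # Non-note types keep their excerpt; no full content is fetched.
            accessible.append(result)
            contents.append(None)
            continue
        try:
            content = await fetch_note(result["id"])
        except Exception:
            continue  # deleted or no longer permitted: drop it
        accessible.append(result)
        contents.append(content)
    return accessible, contents


def build_context(results: List[dict], contents: List[Optional[str]]) -> str:
    """Number the surviving documents and prefer full content over excerpts."""
    parts = []
    for idx, (result, content) in enumerate(zip(results, contents), 1):
        if content is not None:
            body = f"Content: {content}"
        else:
            body = f"Excerpt: {result['excerpt']}"
        parts.append(f"[Document {idx}]\nTitle: {result['title']}\n{body}\n")
    return "\n".join(parts)
```

Because numbering happens after filtering, document indices in the prompt stay contiguous even when some results are dropped.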
-        # 5. Construct prompt - reuse user's query, add context and instructions
+        # 6. Construct prompt - reuse user's query, add context and instructions
        prompt = (
            f"{query}\n\n"
            f"Here are relevant documents from Nextcloud (notes, calendar events, deck cards, files, contacts):\n\n"
@@ -401,8 +459,8 @@ def configure_semantic_tools(mcp: FastMCP):
            return SamplingSearchResponse(
                query=query,
                generated_answer=generated_answer,
-                sources=search_response.results,
+                sources=accessible_results,
-                total_found=search_response.total_found,
+                total_found=len(accessible_results),
                search_method="semantic_sampling",
                model_used=sampling_result.model,
                stop_reason=sampling_result.stopReason,
@@ -419,11 +477,11 @@ def configure_semantic_tools(mcp: FastMCP):
                generated_answer=(
                    f"[Sampling request timed out]\n\n"
                    f"The answer generation took too long (>30s). "
-                    f"Found {search_response.total_found} relevant documents. "
+                    f"Found {len(accessible_results)} relevant documents. "
                    f"Please review the sources below or try a simpler query."
                ),
-                sources=search_response.results,
+                sources=accessible_results,
-                total_found=search_response.total_found,
+                total_found=len(accessible_results),
                search_method="semantic_sampling_timeout",
                success=True,
            )
@@ -454,11 +512,11 @@ def configure_semantic_tools(mcp: FastMCP):
                query=query,
                generated_answer=(
                    f"[{user_message}]\n\n"
-                    f"Found {search_response.total_found} relevant documents. "
+                    f"Found {len(accessible_results)} relevant documents. "
                    f"Please review the sources below."
                ),
-                sources=search_response.results,
+                sources=accessible_results,
-                total_found=search_response.total_found,
+                total_found=len(accessible_results),
                search_method=search_method,
                success=True,
            )
@@ -475,11 +533,11 @@ def configure_semantic_tools(mcp: FastMCP):
                query=query,
                generated_answer=(
                    f"[Unexpected error during sampling]\n\n"
-                    f"Found {search_response.total_found} relevant documents. "
+                    f"Found {len(accessible_results)} relevant documents. "
                    f"Please review the sources below."
                ),
-                sources=search_response.results,
+                sources=accessible_results,
-                total_found=search_response.total_found,
+                total_found=len(accessible_results),
                search_method="semantic_sampling_error",
                success=True,
            )
@@ -15,6 +15,10 @@ from qdrant_client.models import FieldCondition, Filter, MatchValue, PointStruct
 from nextcloud_mcp_server.client import NextcloudClient
 from nextcloud_mcp_server.config import get_settings
 from nextcloud_mcp_server.embedding import get_embedding_service
 from nextcloud_mcp_server.observability.metrics import (
    record_qdrant_operation,
    record_vector_sync_processing,
 )
 from nextcloud_mcp_server.observability.tracing import trace_operation
 from nextcloud_mcp_server.vector.document_chunker import DocumentChunker
 from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
@@ -90,6 +94,8 @@ async def process_document(doc_task: DocumentTask, nc_client: NextcloudClient):
        doc_task: Document task to process
        nc_client: Authenticated Nextcloud client
    """
    start_time = time.time()
    logger.debug(
        f"Processing {doc_task.doc_type}_{doc_task.doc_id} "
        f"for {doc_task.user_id} ({doc_task.operation})"
@@ -105,58 +111,79 @@ async def process_document(doc_task: DocumentTask, nc_client: NextcloudClient):
            "vector_sync.doc_operation": doc_task.operation,
        },
    ):
-        qdrant_client = await get_qdrant_client()
+        try:
-        settings = get_settings()
+            qdrant_client = await get_qdrant_client()
            settings = get_settings()
-        # Handle deletion
+            # Handle deletion
-        if doc_task.operation == "delete":
+            if doc_task.operation == "delete":
-            await qdrant_client.delete(
+                await qdrant_client.delete(
-                collection_name=settings.get_collection_name(),
+                    collection_name=settings.get_collection_name(),
-                points_selector=Filter(
+                    points_selector=Filter(
-                    must=[
+                        must=[
-                        FieldCondition(
+                            FieldCondition(
-                            key="user_id",
+                                key="user_id",
-                            match=MatchValue(value=doc_task.user_id),
+                                match=MatchValue(value=doc_task.user_id),
-                        ),
+                            ),
-                        FieldCondition(
+                            FieldCondition(
-                            key="doc_id",
+                                key="doc_id",
-                            match=MatchValue(value=doc_task.doc_id),
+                                match=MatchValue(value=doc_task.doc_id),
-                        ),
+                            ),
-                        FieldCondition(
+                            FieldCondition(
-                            key="doc_type",
+                                key="doc_type",
-                            match=MatchValue(value=doc_task.doc_type),
+                                match=MatchValue(value=doc_task.doc_type),
-                        ),
+                            ),
-                    ]
+                        ]
-                ),
+                    ),
-            )
+                )
-            logger.info(
+                logger.info(
-                f"Deleted {doc_task.doc_type}_{doc_task.doc_id} for {doc_task.user_id}"
+                    f"Deleted {doc_task.doc_type}_{doc_task.doc_id} for {doc_task.user_id}"
-            )
+                )
            return
-        # Handle indexing with retry
+                # Record successful deletion metrics
-        max_retries = 3
+                duration = time.time() - start_time
-        retry_delay = 1.0
+                record_qdrant_operation("delete", "success")
                record_vector_sync_processing(duration, "success")
                return
-        for attempt in range(max_retries):
+            # Handle indexing with retry
-            try:
+            max_retries = 3
-                await _index_document(doc_task, nc_client, qdrant_client)
+            retry_delay = 1.0
                return  # Success
-            except (HTTPStatusError, Exception) as e:
+            for attempt in range(max_retries):
-                if attempt < max_retries - 1:
+                try:
-                    logger.warning(
+                    await _index_document(doc_task, nc_client, qdrant_client)
-                        f"Retry {attempt + 1}/{max_retries} for "
+
-                        f"{doc_task.doc_type}_{doc_task.doc_id}: {e}"
+                    # Record successful processing metrics
-                    )
+                    duration = time.time() - start_time
-                    await anyio.sleep(retry_delay)
+                    record_qdrant_operation("upsert", "success")
-                    retry_delay *= 2  # Exponential backoff
+                    record_vector_sync_processing(duration, "success")
-                else:
+                    return  # Success
-                    logger.error(
+
-                        f"Failed to index {doc_task.doc_type}_{doc_task.doc_id} "
+                except (HTTPStatusError, Exception) as e:
-                        f"after {max_retries} retries: {e}"
+                    if attempt < max_retries - 1:
-                    )
+                        logger.warning(
-                    raise
+                            f"Retry {attempt + 1}/{max_retries} for "
                            f"{doc_task.doc_type}_{doc_task.doc_id}: {e}"
                        )
                        await anyio.sleep(retry_delay)
                        retry_delay *= 2  # Exponential backoff
                    else:
                        logger.error(
                            f"Failed to index {doc_task.doc_type}_{doc_task.doc_id} "
                            f"after {max_retries} retries: {e}"
                        )
                        # Record failed processing metrics
                        duration = time.time() - start_time
                        record_qdrant_operation("upsert", "error")
                        record_vector_sync_processing(duration, "error")
                        raise
        except Exception:
            # Catch any other unexpected errors
            duration = time.time() - start_time
            record_vector_sync_processing(duration, "error")
            raise
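The retry loop above combines exponential backoff with outcome recording. The sketch below isolates that pattern and records each call's outcome exactly once. In the hunk above, the final failed attempt records an error and then re-raises into the outer `except Exception`, which records again, so a failed document appears to be counted twice. This sketch makes two substitutions: it uses `asyncio.sleep` and `time.monotonic` in place of the `anyio.sleep` and `time.time` calls in the source. The name `run_with_retry` is hypothetical.

```python
import asyncio
import time
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def run_with_retry(
    operation: Callable[[], Awaitable[T]],
    record: Callable[[float, str], None],
    max_retries: int = 3,
    retry_delay: float = 1.0,
) -> T:
    """Run `operation` with exponential backoff; record one outcome per call."""
    start = time.monotonic()
    for attempt in range(max_retries):
        try:
            result = await operation()
        except Exception:
            if attempt == max_retries - 1:
                # Out of attempts: record the failure once, then propagate.
                record(time.monotonic() - start, "error")
                raise
            await asyncio.sleep(retry_delay)
            retry_delay *= 2  # exponential backoff
        else:
            record(time.monotonic() - start, "success")
            return result
    raise RuntimeError("max_retries must be at least 1")
```

The recorded duration spans all attempts including backoff sleeps, which matches the source's choice of taking `start_time` once at the top of `process_document`.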
 async def _index_document(
@@ -13,6 +13,7 @@ from qdrant_client.models import FieldCondition, Filter, MatchValue
 from nextcloud_mcp_server.client import NextcloudClient
 from nextcloud_mcp_server.config import get_settings
 from nextcloud_mcp_server.observability.metrics import record_vector_sync_scan
 from nextcloud_mcp_server.observability.tracing import trace_operation
 from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
@@ -181,6 +182,9 @@ async def scan_user_documents(
        ]
        logger.info(f"[SCAN-{scan_id}] Found {len(notes)} notes for {user_id}")
        # Record documents scanned
        record_vector_sync_scan(len(notes))
        if initial_sync:
            # Send everything on first sync
            for note in notes:
@@ -1,6 +1,6 @@
 [project]
 name = "nextcloud-mcp-server"
-version = "0.32.1"
+version = "0.33.1"
 description = "Model Context Protocol (MCP) server for Nextcloud integration - enables AI assistants to interact with Nextcloud data"
 authors = [
    {name = "Chris Coutinho", email = "chris@coutinho.io"}
@@ -1053,7 +1053,7 @@ wheels = [
 [[package]]
 name = "nextcloud-mcp-server"
-version = "0.32.1"
+version = "0.33.1"
 source = { editable = "." }
 dependencies = [
    { name = "aiosqlite" },
Author	SHA1	Message	Date
github-actions[bot]	bd76902932	bump: version 0.33.0 → 0.33.1	2025-11-13 12:10:42 +00:00
Chris Coutinho	da65155cde	Merge pull request #293 from cbcoutinho/fix/grafana-folder-label-validation fix: Move grafana_folder from labels to annotations	2025-11-13 13:10:15 +01:00
Chris Coutinho	4e43d15153	fix: Move grafana_folder from labels to annotations Fixes Kubernetes label validation error when deploying dashboard ConfigMap. Problem: - Kubernetes labels cannot contain spaces (validation regex: [A-Za-z0-9][-A-Za-z0-9_.]*[A-Za-z0-9]) - Previous implementation had grafana_folder: "Nextcloud MCP" as a label - Deployment failed with: "Invalid value: 'Nextcloud MCP'" Solution: - Move grafana_folder from labels to annotations (annotations allow spaces) - Keep grafana_dashboard="1" as label for ConfigMap discovery - Grafana sidecar reads folder name from folderAnnotation parameter Changes: - dashboard-configmap.yaml: Move grafana_folder to annotations section - dashboards/README.md: Fix kubectl commands to use annotations - values.yaml: Update comments to clarify annotation usage This follows the standard kube-prometheus-stack pattern where: - Labels are used for ConfigMap discovery (strict validation) - Annotations are used for metadata like folder names (relaxed validation) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-13 13:08:45 +01:00
github-actions[bot]	15951c38fa	bump: version 0.32.1 → 0.33.0	2025-11-13 10:58:05 +00:00
Chris Coutinho	2de0590839	Merge pull request #292 from cbcoutinho/feat/grafana-dashboard-and-vector-metrics feat: Add Grafana dashboard and vector sync metric instrumentation	2025-11-13 11:57:40 +01:00
Chris Coutinho	4ea5ed72d4	feat: Add Grafana dashboard and vector sync metric instrumentation Implement comprehensive observability for vector database synchronization with Grafana dashboard and Prometheus metrics. ## Part 1: Grafana Dashboard Created all-in-one operations dashboard with 7 rows and 34 panels: ### Dashboard Structure: - Overview Row: Request rate, error rate, P95 latency, active requests - HTTP Metrics (RED): Request/error rates by endpoint, latency percentiles - MCP Tools: Call volume, error rates, execution duration by tool - Nextcloud API: API calls/latency by app, retry patterns - OAuth & Authentication: Token validations, exchanges, cache hit rate - Dependencies & Health: Status for Nextcloud/Qdrant/Keycloak/Unstructured - Vector Sync: Processing throughput, queue depth, Qdrant operations ### Helm Chart Integration: - Added dashboard-configmap.yaml template for automatic provisioning - Configured Grafana sidecar auto-discovery (label: grafana_dashboard="1") - Added dashboards configuration section in values.yaml (opt-in) - Updated Chart.yaml with dashboard annotations - Enhanced NOTES.txt with dashboard deployment instructions - Comprehensive documentation in dashboards/README.md Dashboard supports dynamic filtering via variables: - datasource: Prometheus data source selection - namespace: Filter by Kubernetes namespace - pod: Multi-select pod filtering - interval: Query interval (1m/5m/10m/30m/1h) ## Part 2: Vector Sync Metric Instrumentation Implemented metric recording throughout vector sync pipeline: ### metrics.py: Added convenience functions: - record_vector_sync_scan() - Track documents per scan - record_vector_sync_processing() - Track processing duration/status - record_qdrant_operation() - Track database operations - update_vector_sync_queue_size() - Track queue depth ### scanner.py: - Record number of documents found in each scan - Enables monitoring of scan throughput ### processor.py: - Record processing duration for each document - Track success/failure status with timing - Record Qdrant upsert/delete operations - Handle all code paths (success, deletion, error) ### semantic.py: - Wrap Qdrant query_points with try/except - Record search operation success/failure ## Metrics Exposed: - mcp_vector_sync_documents_scanned_total - mcp_vector_sync_documents_processed_total{status} - mcp_vector_sync_processing_duration_seconds (histogram) - mcp_vector_sync_queue_size (gauge) - mcp_qdrant_operations_total{operation,status} This enables monitoring of: - Scan and processing throughput - Processing latency (P50/P95/P99) - Error rates for processing and Qdrant operations - Queue depth trends - Complete observability of vector sync pipeline ## Testing: Verified locally that metrics are recorded correctly: - 36 documents scanned - 3 documents processed (avg 7.5s each) - 3 successful Qdrant upsert operations - Search operations tracked ## Deployment: Enable dashboard provisioning in Helm values: ```yaml dashboards: enabled: true grafanaFolder: "Nextcloud MCP" ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-13 11:49:20 +01:00
Chris Coutinho	d1829fbbd6	Merge pull request #291 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-0.x chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.9	2025-11-13 08:02:35 +01:00
renovate-bot-cbcoutinho[bot]	8332542959	chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.9	2025-11-12 23:11:29 +00:00
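
The label-versus-annotation fix in commit 4e43d15153 can be illustrated with a small check. The sketch below approximates Kubernetes label-value validation using the regex quoted in that commit message, plus two rules assumed from the general Kubernetes label syntax (a 63-character limit and acceptance of empty or single-character values). The function name `is_valid_label_value` is hypothetical and is not part of the chart or of any Kubernetes client library.

```python
import re

# Regex from the commit message, with the leading part made optional so that
# single-character and empty values are accepted (assumption based on the
# general Kubernetes label syntax rules).
_LABEL_VALUE_RE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")
_MAX_LABEL_VALUE_LEN = 63  # assumed length limit for label values


def is_valid_label_value(value: str) -> bool:
    """Approximate check of whether `value` is usable as a label value."""
    if len(value) > _MAX_LABEL_VALUE_LEN:
        return False
    # fullmatch ensures the whole string conforms, not just a prefix.
    return _LABEL_VALUE_RE.fullmatch(value) is not None
```

Under these assumptions, `is_valid_label_value("Nextcloud MCP")` returns `False` because of the space, which is why the folder name moved to an annotation, while `is_valid_label_value("1")` returns `True`, so `grafana_dashboard="1"` can remain a label.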