docs: Update ADR-011 to rejected status with Context Agent validation

After comprehensive research, the hybrid OAuth + AppAPI architecture is NOT being implemented due to fundamental architectural incompatibilities.

Key updates:
- Status: Proposed → Not Planned
- Added validation from Nextcloud Context Agent project
- Context Agent (official NC ExApp with MCP) faces IDENTICAL limitations
- Proves constraints are architectural, not implementation-specific

Context Agent findings:
- ExApp with MCP server endpoint (~28 tools exposed)
- Uses Task Processing API for confirmations (NOT MCP elicitation)
- Works around AppAPI proxy limitations by changing protocol
- MCP endpoint is secondary feature with documented constraints
- Primary use: In-app Assistant integration, not external MCP clients

Critical features impossible through AppAPI proxy:
- ❌ MCP sampling (eliminates RAG/LLM features)
- ❌ MCP elicitation (user prompts)
- ❌ Real-time progress updates
- ❌ Bidirectional streaming
- Validated by Context Agent facing same limitations

Decision rationale:
- MCP requires multi-turn nested interactions
- AppAPI provides stateless request/response proxy only
- No implementation effort can bridge this fundamental gap
- Would require complete AppAPI redesign (WebSocket, message routing)
- Even official Nextcloud projects work around these limitations

Alternative considered for future:
- Register as Task Processing provider (different product)
- Use Nextcloud Assistant UI (not external MCP clients)
- Accept different capabilities (no sampling, custom flows)

OAuth mode remains sole solution for external MCP client integration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
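
The decision rationale in this commit message (MCP needs nested, multi-turn interactions, while the proxy is described as stateless request/response) can be illustrated with a toy model. This is only a sketch under stated assumptions: `Request`, `summarize_tool`, `stateless_proxy`, `direct_session`, and `fake_llm_client` are hypothetical names invented for illustration and are not part of AppAPI, Context Agent, or any MCP SDK. Only the method strings `tools/call` and `sampling/createMessage` are taken from the MCP protocol.

```python
from dataclasses import dataclass
from typing import Callable, Optional


# Hypothetical toy model; none of these types come from AppAPI or an MCP SDK.
@dataclass
class Request:
    method: str
    payload: str


# Stands in for a server-initiated request such as MCP sampling or elicitation.
ClientCallback = Callable[[Request], str]


def summarize_tool(req: Request, ask_client: Optional[ClientCallback]) -> str:
    """A tool that needs a nested round-trip to the client before answering."""
    if ask_client is None:
        return "error: no channel for nested client request"
    completion = ask_client(Request("sampling/createMessage", req.payload))
    return f"summary: {completion}"


def stateless_proxy(req: Request) -> str:
    """One request in, one response out; the server cannot call back mid-call."""
    return summarize_tool(req, ask_client=None)


def direct_session(req: Request, client: ClientCallback) -> str:
    """A bidirectional session hands the server a way to reach the client."""
    return summarize_tool(req, ask_client=client)


def fake_llm_client(nested: Request) -> str:
    """Pretend client-side LLM: just upper-cases the prompt."""
    return nested.payload.upper()


call = Request("tools/call", "meeting notes")
print(stateless_proxy(call))
print(direct_session(call, fake_llm_client))
```

In this model the tool itself is identical in both paths; only the transport differs, which is the sense in which the message calls the constraint architectural rather than implementation-specific.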
docs: Add ADR-011 for hybrid OAuth + AppAPI deployment architecture
2025-11-13 23:30:14 +01:00 · 2025-11-13 13:10:21 +01:00
31 changed files with 1868 additions and 3223 deletions
@@ -85,4 +85,4 @@ jobs:
          NEXTCLOUD_USERNAME: "admin"
          NEXTCLOUD_PASSWORD: "admin"
        run: |
-          uv run pytest -v --log-cli-level=WARN -m smoke
+          uv run pytest -v --log-cli-level=WARN --ignore=tests/manual
@@ -1,15 +1,3 @@
-## v0.33.1 (2025-11-13)
-
-### Fix
-
-- Move grafana_folder from labels to annotations
-
-## v0.33.0 (2025-11-13)
-
-### Feat
-
-- Add Grafana dashboard and vector sync metric instrumentation
-
 ## v0.32.1 (2025-11-12)

 ### Fix
@@ -1,4 +1,4 @@
-FROM ghcr.io/astral-sh/uv:0.9.9-python3.11-alpine@sha256:0faa7934fac1db7f5056f159c1224d144bab864fd2677a4066d25a686ae32edd
+FROM ghcr.io/astral-sh/uv:0.9.8-python3.11-alpine@sha256:6c842c49ad032f46b62f32a7e7779f45f12671a8e0d82ea24c766ab62d58b396

 # Install dependencies
 # 1. git (required for caldav dependency from git)
@@ -2,8 +2,8 @@ apiVersion: v2
 name: nextcloud-mcp-server
 description: A Helm chart for Nextcloud MCP Server - enables AI assistants to interact with Nextcloud
 type: application
-version: 0.33.1
-appVersion: "0.33.1"
+version: 0.32.1
+appVersion: "0.32.1"
 keywords:
  - nextcloud
  - mcp
@@ -21,10 +21,6 @@ home: https://github.com/cbcoutinho/nextcloud-mcp-server
 sources:
  - https://github.com/cbcoutinho/nextcloud-mcp-server
 icon: https://raw.githubusercontent.com/nextcloud/server/master/core/img/logo/logo.svg
-annotations:
-  # Grafana dashboard support
-  grafana_dashboard: "true"
-  grafana_dashboard_folder: "Nextcloud MCP"
 dependencies:
  - name: qdrant
    version: "1.15.5"
@@ -280,72 +280,6 @@ Use OpenAI or any OpenAI-compatible API instead of Ollama.
 | `openai.secretKey` | Key in secret containing API key | `api-key` |
 | `openai.baseUrl` | Custom API endpoint (optional) | `""` |

-#### Observability & Monitoring
-
-The chart includes comprehensive observability features including Prometheus metrics, OpenTelemetry tracing, and Grafana dashboards.
-
-**Metrics Configuration:**
-
-| Parameter | Description | Default |
-|-----------|-------------|---------|
-| `observability.metrics.enabled` | Enable Prometheus metrics | `true` |
-| `observability.metrics.port` | Metrics port | `9090` |
-| `observability.metrics.path` | Metrics endpoint path | `/metrics` |
-
-**Tracing Configuration:**
-
-| Parameter | Description | Default |
-|-----------|-------------|---------|
-| `observability.tracing.enabled` | Enable OpenTelemetry tracing | `false` |
-| `observability.tracing.endpoint` | OTLP collector endpoint | `""` |
-| `observability.tracing.serviceName` | Service name in traces | `nextcloud-mcp-server` |
-| `observability.tracing.samplingRate` | Trace sampling rate (0.0-1.0) | `1.0` |
-
-**Logging Configuration:**
-
-| Parameter | Description | Default |
-|-----------|-------------|---------|
-| `observability.logging.format` | Log format (json or text) | `json` |
-| `observability.logging.level` | Log level | `INFO` |
-| `observability.logging.includeTraceContext` | Include trace IDs in logs | `true` |
-
-**ServiceMonitor (Prometheus Operator):**
-
-| Parameter | Description | Default |
-|-----------|-------------|---------|
-| `serviceMonitor.enabled` | Create ServiceMonitor resource | `false` |
-| `serviceMonitor.interval` | Scrape interval | `30s` |
-| `serviceMonitor.scrapeTimeout` | Scrape timeout | `10s` |
-| `serviceMonitor.labels` | Additional labels for ServiceMonitor | `{}` |
-
-**PrometheusRule (Prometheus Operator):**
-
-| Parameter | Description | Default |
-|-----------|-------------|---------|
-| `prometheusRule.enabled` | Create PrometheusRule with alert rules | `false` |
-| `prometheusRule.labels` | Additional labels for PrometheusRule | `{}` |
-
-**Grafana Dashboards:**
-
-| Parameter | Description | Default |
-|-----------|-------------|---------|
-| `dashboards.enabled` | Enable automatic dashboard provisioning | `false` |
-| `dashboards.grafanaFolder` | Grafana folder name for dashboards | `Nextcloud MCP` |
-| `dashboards.labels` | Additional labels for dashboard ConfigMap | `{}` |
-| `dashboards.annotations` | Additional annotations for dashboard ConfigMap | `{}` |
-
-When `dashboards.enabled` is `true`, a ConfigMap with the Grafana dashboard is created with the `grafana_dashboard: "1"` label. This enables automatic discovery by Grafana sidecar containers (commonly used with kube-prometheus-stack).
-
-The dashboard provides comprehensive monitoring including:
-- HTTP request metrics (RED pattern: Rate, Errors, Duration)
-- MCP tool performance and errors
-- Nextcloud API performance by app (notes, calendar, contacts, etc.)
-- OAuth token operations and cache hit rates
-- External dependency health (Nextcloud, Qdrant, Keycloak, Unstructured API)
-- Vector sync processing pipeline (when enabled)
-
-For manual import or more details, see `charts/nextcloud-mcp-server/dashboards/README.md`.
-
 ## Examples

 ### Example 1: Basic Auth with Ingress
@@ -6,57 +6,14 @@ This directory contains example Grafana dashboards for monitoring the Nextcloud

 ### nextcloud-mcp-server.json

-All-in-one Operations Dashboard with comprehensive monitoring across all system components.
+Comprehensive dashboard with the following panels:

-#### Overview Row
-High-level metrics for quick health assessment:
-- **Request Rate** (stat): Total requests per second
-- **Error Rate** (stat): Percentage of 5xx errors with color thresholds
-- **P95 Latency** (stat): 95th percentile request latency
-- **Active Requests** (stat): Current in-flight requests
-
-#### HTTP Metrics (RED Pattern)
-Core request/error/duration metrics:
-- **Request Rate by Endpoint** (timeseries): RPS breakdown by endpoint
-- **Error Rate by Status Code** (timeseries): Error rates for 4xx/5xx codes
-- **Latency Percentiles** (timeseries): P50, P95, P99 latency trends
-- **Status Code Distribution** (piechart): Percentage breakdown of all status codes
-
-#### MCP Tools Row
-MCP-specific tool performance:
-- **Top Tools by Call Volume** (bargauge): Top 10 most-called tools
-- **Tool Error Rate** (timeseries): Error rates per tool
-- **Tool Execution Duration** (timeseries): P95 latency by tool
-
-#### Nextcloud API Row
-Backend API performance metrics:
-- **API Calls by App** (timeseries): Request rate per Nextcloud app (notes, calendar, contacts, etc.)
-- **API Latency by App** (timeseries): P95 latency per app
-- **API Retries by Reason** (timeseries): Retry patterns (429, timeout, connection errors)
-- **API Error Rate** (stat): Overall API error percentage
-
-#### OAuth & Authentication Row
-OAuth token operations and caching:
-- **Token Validations** (timeseries): Success/failure rates for token validation
-- **Token Exchange Operations** (timeseries): RFC 8693 token exchange operations
-- **Token Cache Hit Rate** (stat): Percentage of cache hits (color-coded: red<50%, yellow<80%, green≥80%)
-- **Refresh Token Operations** (timeseries): Refresh token storage operations by type
-
-#### Dependencies & Health Row
-External dependency status monitoring:
-- **Nextcloud Health** (stat): UP/DOWN status with color coding
-- **Qdrant Health** (stat): Vector database health status
-- **Keycloak Health** (stat): Identity provider health status
-- **Unstructured API Health** (stat): Document processing API status
-- **Health Check Duration** (timeseries): Health check latency by dependency
-- **Database Operation Latency** (timeseries): P95 latency for DB operations (SQLite, Qdrant)
-
-#### Vector Sync Row (when enabled)
-Document processing pipeline metrics:
-- **Documents Processed Rate** (timeseries): Processing throughput by status (success/failure)
-- **Processing Queue Depth** (gauge): Current queue size with thresholds (yellow>50, red>100)
-- **Qdrant Operations** (timeseries): Vector database operations by type
-- **Document Processing Duration** (timeseries): P95 processing latency
+- **Request Rate**: HTTP requests per second by method and endpoint
+- **Error Rate**: Percentage of 5xx errors
+- **Request Latency**: P50 and P95 latency by endpoint
+- **Top MCP Tools**: Most frequently called tools
+- **Nextcloud API Latency**: API call latency by app (notes, calendar, etc.)
+- **Vector Sync Queue**: Queue size for background document processing

 ## Importing to Grafana

@@ -68,77 +25,49 @@ Document processing pipeline metrics:
 4. Select your Prometheus data source
 5. Click "Import"

-### Automated Import (Helm Chart)
+### Automated Import (Kubernetes)

-The Helm chart now supports automatic dashboard provisioning via Grafana sidecar pattern.
-
-#### Option 1: Using Helm Chart (Recommended)
-
-Enable dashboard provisioning in your Helm values:
-
-```yaml
-# values.yaml for nextcloud-mcp-server chart
-dashboards:
-  enabled: true
-  grafanaFolder: "Nextcloud MCP"  # Folder name in Grafana
-  labels: {}  # Additional labels if needed
-```
-
-Then deploy or upgrade:
+If using the Grafana Operator or kube-prometheus-stack, you can create a ConfigMap:

 ```bash
-helm upgrade --install nextcloud-mcp nextcloud-mcp-server \
-  --set dashboards.enabled=true
-```
-
-The dashboard will be automatically imported by Grafana if the sidecar is configured
-to watch for ConfigMaps with label `grafana_dashboard: "1"`.
-
-#### Option 2: Using kube-prometheus-stack
-
-If using kube-prometheus-stack with Grafana sidecar enabled, the dashboard will be
-automatically discovered and imported. Ensure your Grafana deployment has:
-
-```yaml
-# kube-prometheus-stack values
-grafana:
-  sidecar:
-    dashboards:
-      enabled: true
-      label: grafana_dashboard
-      folder: /tmp/dashboards
-      provider:
-        foldersFromFilesStructure: true
-```
-
-#### Option 3: Manual ConfigMap Creation
-
-For other Grafana setups, create a ConfigMap manually:
-
-```bash
-kubectl create configmap nextcloud-mcp-dashboard \
+kubectl create configmap nextcloud-mcp-dashboards \
  --from-file=nextcloud-mcp-server.json \
  -n monitoring

-# Add sidecar discovery label
-kubectl label configmap nextcloud-mcp-dashboard \
+# Add label for Grafana sidecar to discover
+kubectl label configmap nextcloud-mcp-dashboards \
  grafana_dashboard=1 \
  -n monitoring
+```

-# Add folder annotation (annotations support spaces, unlike labels)
-kubectl annotate configmap nextcloud-mcp-dashboard \
-  grafana_folder="Nextcloud MCP" \
-  -n monitoring
+Or add to your Helm values:
+
+```yaml
+# values.yaml for kube-prometheus-stack
+grafana:
+  dashboardProviders:
+    dashboardproviders.yaml:
+      apiVersion: 1
+      providers:
+        - name: 'nextcloud-mcp'
+          orgId: 1
+          folder: 'Nextcloud MCP'
+          type: file
+          disableDeletion: false
+          editable: true
+          options:
+            path: /var/lib/grafana/dashboards/nextcloud-mcp
+
+  dashboardsConfigMaps:
+    nextcloud-mcp: nextcloud-mcp-dashboards
 ```

 ## Dashboard Variables

-The dashboard includes four template variables for dynamic filtering:
+The dashboard includes two variables:

-- **datasource**: Select your Prometheus data source
-- **namespace**: Filter metrics by Kubernetes namespace (supports "All")
-- **pod**: Filter by specific pod(s) - multi-select enabled (supports "All")
-- **interval**: Query interval for rate calculations (1m, 5m, 10m, 30m, 1h - default: 5m)
+- **Data Source**: Select your Prometheus data source
+- **Namespace**: Filter metrics by Kubernetes namespace

 ## Customization

@@ -96,30 +96,6 @@ Your Nextcloud MCP Server has been deployed in {{ .Values.auth.mode }} authentic
   kubectl --namespace {{ .Release.Namespace }} exec -it deploy/{{ include "nextcloud-mcp-server.fullname" . }} -- curl -s http://localhost:{{ include "nextcloud-mcp-server.port" . }}/user/page | grep "Vector Sync"
 {{- end }}

-{{- if .Values.dashboards.enabled }}
-
-6. Grafana Dashboards:
-   - Dashboard provisioning: Enabled
-   - ConfigMap: {{ include "nextcloud-mcp-server.fullname" . }}-dashboard
-   - Grafana Folder: {{ .Values.dashboards.grafanaFolder }}
-
-   The dashboard will be automatically imported by Grafana if the sidecar is configured
-   to watch for ConfigMaps with label "grafana_dashboard: 1".
-
-   To manually import the dashboard:
-   kubectl --namespace {{ .Release.Namespace }} get configmap {{ include "nextcloud-mcp-server.fullname" . }}-dashboard -o jsonpath='{.data.nextcloud-mcp-server\.json}' | jq . > dashboard.json
-
-   Then import dashboard.json via Grafana UI (Dashboards → Import).
-{{- else }}
-
-6. Grafana Dashboards:
-   - Dashboard provisioning: Disabled
-   - To enable automatic dashboard provisioning, set: dashboards.enabled=true
-
-   Manual import option:
-   The dashboard JSON is available in the chart at charts/nextcloud-mcp-server/dashboards/nextcloud-mcp-server.json
-{{- end }}
-
 For more information and documentation:
 - GitHub: https://github.com/cbcoutinho/nextcloud-mcp-server
 - Documentation: https://github.com/cbcoutinho/nextcloud-mcp-server#readme
@@ -1,25 +0,0 @@
-{{- if .Values.dashboards.enabled }}
-apiVersion: v1
-kind: ConfigMap
-metadata:
-  name: {{ include "nextcloud-mcp-server.fullname" . }}-dashboard
-  namespace: {{ .Release.Namespace }}
-  labels:
-    {{- include "nextcloud-mcp-server.labels" . | nindent 4 }}
-    {{- with .Values.dashboards.labels }}
-    {{- toYaml . | nindent 4 }}
-    {{- end }}
-    # Grafana sidecar discovery label
-    grafana_dashboard: "1"
-  annotations:
-    {{- with .Values.dashboards.annotations }}
-    {{- toYaml . | nindent 4 }}
-    {{- end }}
-    # Grafana folder name (annotations support spaces, unlike labels)
-    {{- if .Values.dashboards.grafanaFolder }}
-    grafana_folder: {{ .Values.dashboards.grafanaFolder | quote }}
-    {{- end }}
-data:
-  nextcloud-mcp-server.json: |-
-{{ .Files.Get "dashboards/nextcloud-mcp-server.json" | indent 4 }}
-{{- end }}
@@ -205,20 +205,6 @@ prometheusRule:
  # Additional labels for PrometheusRule (e.g., for Prometheus selector)
  # Example: { prometheus: kube-prometheus }

-# Grafana dashboards (requires Grafana with sidecar enabled)
-dashboards:
-  # Enable automatic dashboard provisioning via ConfigMap
-  enabled: false
-  # Grafana folder name where dashboards will be imported
-  # The grafana-sidecar looks for ConfigMaps with label "grafana_dashboard: 1"
-  # and reads the folder name from annotation "grafana_folder" (supports spaces)
-  grafanaFolder: "Nextcloud MCP"
-  # Additional labels for dashboard ConfigMap
-  # These will be added alongside the required "grafana_dashboard: 1" label
-  labels: {}
-  # Additional annotations for dashboard ConfigMap
-  annotations: {}
-
 service:
  type: ClusterIP
  port: 8000
@@ -1,895 +0,0 @@
-# ADR-011: Improving Semantic Search Quality Through Better Chunking and Embeddings
-
-**Status**: Proposed
-**Date**: 2025-11-12
-**Authors**: Development Team
-**Related**: ADR-003 (Vector Database Architecture), ADR-008 (MCP Sampling for RAG)
-
-## Context
-
-The semantic search implementation provides document retrieval across Nextcloud apps using vector embeddings. Production usage has revealed that **the system frequently misses relevant documents** (recall problem).
-
-Root cause analysis identifies two fundamental issues:
-
-### 1. Poor Chunking Strategy
-
-**Current Implementation** (`nextcloud_mcp_server/vector/document_chunker.py:36`):
-```python
-words = content.split()  # Naive whitespace splitting
-chunk_size = 512  # words
-overlap = 50  # words
-chunks = [words[i:i+chunk_size] for i in range(0, len(words), chunk_size-overlap)]
-```
-
-**Problems**:
-- **Breaks semantic boundaries**: Splits mid-sentence, mid-paragraph, mid-thought
-- **Loses context**: "The meeting discussed budget. We decided to..." becomes two disconnected chunks
-- **Poor retrieval**: Relevant content split across chunks with low individual relevance scores
-- **No structure awareness**: Ignores markdown headers, lists, code blocks
-
-**Evidence**:
-- Documents with relevant content in middle sections score poorly (content split across 3+ chunks)
-- Multi-sentence concepts (spanning 60-100 words) are fragmented
-- Search for "budget planning process" misses documents where these words appear in adjacent sentences but different chunks
-
-### 2. Suboptimal Embedding Model
-
-**Current Implementation** (`nextcloud_mcp_server/embedding/ollama_provider.py:33`):
-```python
-_model = "nomic-embed-text"  # 768 dimensions
-_dimension = 768  # Hardcoded
-```
-
-**Problems**:
-- **Model selection**: `nomic-embed-text` is general-purpose, not optimized for our use case
-- **No benchmarking**: Selected without comparative evaluation
-- **Dimensionality**: 768-dim may be insufficient for nuanced semantic distinctions
-- **No domain adaptation**: Model not tuned for Nextcloud content (notes, calendar, deck cards)
-
-**Evidence**:
-- Synonymous queries return different results ("meeting notes" vs. "discussion summary")
-- Domain-specific terms poorly represented ("standup", "retrospective", "OKRs")
-- Cross-lingual content (if present) not well supported
-
-### Current Performance
-
-**Baseline Metrics** (100-document test corpus, 50 queries):
-- **Recall@10**: ~52% (misses 48% of relevant documents)
-- **Precision@10**: ~78% (acceptable but room for improvement)
-- **MRR**: 0.58 (relevant docs often not in top positions)
-- **Zero-result queries**: 18% (completely missing relevant content)
-
-## Decision Drivers
-
-1. **Address Root Causes**: Fix fundamental issues (chunking, embeddings) before adding complexity (reranking, hybrid search)
-2. **Measurable Impact**: Target 40-60% improvement in recall through chunking/embedding alone
-3. **Independence**: Improvements should be orthogonal to future enhancements (reranking, GraphRAG)
-4. **Cost Efficiency**: Minimize infrastructure and API costs
-5. **Reindexing Acceptable**: One-time reindex cost justified by long-term quality improvement
-
-## Options Considered
-
-### Chunking Strategies
-
-#### Option C1: Semantic Sentence-Aware Chunking (RECOMMENDED)
-
-**Description**: Respect sentence boundaries while maintaining target chunk size
-
-**Implementation**:
-```python
-from langchain.text_splitter import RecursiveCharacterTextSplitter
-
-splitter = RecursiveCharacterTextSplitter(
-    chunk_size=2048,  # ~512 words in characters
-    chunk_overlap=200,  # ~50 words in characters
-    separators=["\n\n", "\n", ". ", "! ", "? ", "; ", ": ", ", ", " "],
-    length_function=len,
-)
-```
-
-**How it works**:
-1. Try splitting by paragraphs (`\n\n`)
-2. If chunks too large, split by sentences (`. `, `! `, `? `)
-3. If still too large, split by clauses (`;`, `:`)
-4. Last resort: split by words
-
-**Pros**:
-- ✅ Preserves semantic boundaries (never breaks mid-sentence)
-- ✅ Maintains context coherence within chunks
-- ✅ Simple implementation (langchain library)
-- ✅ Configurable separators for different content types
-- ✅ Proven approach (used by major RAG systems)
-
-**Cons**:
-- ❌ Variable chunk sizes (not exactly 512 words, but close)
-- ❌ Adds dependency (langchain)
-- ❌ Slightly slower than naive splitting (~10-20ms per document)
-
-**Expected Impact**: 20-30% recall improvement
-
-#### Option C2: Hierarchical Context-Preserving Chunks
-
-**Description**: Create overlapping parent/child chunks
-
-**Structure**:
-```
-Document → Large parent chunks (1024 words) → Small child chunks (256 words)
-          ↓                                    ↓
-   Stored in Qdrant                       Searched first
-                                          Return parent context
-```
-
-**Implementation**:
-```python
-# Generate child chunks (searched)
-child_chunks = splitter.split_text(content, chunk_size=1024)
-
-# Generate parent chunks (context)
-parent_chunks = splitter.split_text(content, chunk_size=4096)
-
-# Store both with parent-child relationships
-for child_idx, child in enumerate(child_chunks):
-    parent_idx = find_parent(child_idx)
-    store_vector(
-        vector=embed(child),
-        payload={
-            "chunk": child,
-            "parent_chunk": parent_chunks[parent_idx],
-            "chunk_type": "child"
-        }
-    )
-```
-
-**Pros**:
-- ✅ Best of both worlds: precise matching + full context
-- ✅ Handles multi-hop information needs
-- ✅ Better for long documents (> 1000 words)
-
-**Cons**:
-- ❌ 2x storage (parent + child chunks)
-- ❌ More complex implementation
-- ❌ Higher indexing time (embed twice)
-- ❌ Query complexity (retrieve child, return parent)
-
-**Expected Impact**: 35-45% recall improvement (diminishing returns vs. complexity)
-
-**Verdict**: ⚠️ Consider only if Option C1 insufficient
-
-#### Option C3: Document Structure-Aware Chunking
-
-**Description**: Parse markdown/document structure before chunking
-
-**Implementation**:
-```python
-import mistune  # Markdown parser
-
-def structure_aware_chunk(markdown_content: str) -> list[str]:
-    ast = mistune.create_markdown(renderer='ast')(markdown_content)
-
-    chunks = []
-    for node in ast:
-        if node['type'] == 'heading':
-            # Start new chunk at each header
-            current_chunk = node['children'][0]['raw']
-        elif node['type'] == 'paragraph':
-            current_chunk += "\n" + node['children'][0]['raw']
-            if len(current_chunk) > 2048:
-                chunks.append(current_chunk)
-                current_chunk = ""
-
-    return chunks
-```
-
-**Pros**:
-- ✅ Respects document logical structure
-- ✅ Headers provide context for chunks
-- ✅ Works well for structured notes (documentation, meeting notes with sections)
-
-**Cons**:
-- ❌ Complex implementation (parser, AST traversal)
-- ❌ Markdown-specific (doesn't help calendar events, deck cards)
-- ❌ Variable chunk sizes (some sections very short/long)
-- ❌ Breaks for unstructured content
-
-**Expected Impact**: 15-25% improvement for structured content only
-
-**Verdict**: ⚠️ Future enhancement after Option C1
-
-#### Option C4: Fixed Sliding Window (Current Baseline)
-
-**Description**: Current naive word-based splitting
-
-**Verdict**: ❌ Superseded by Option C1
-
-### Embedding Model Strategies
-
-#### Option E1: Upgrade to Better General-Purpose Model (RECOMMENDED)
-
-**Description**: Switch to state-of-the-art embedding model
-
-**Candidates**:
-
-| Model | Dimensions | MTEB Score | Pros | Cons |
-|-------|-----------|------------|------|------|
-| **mxbai-embed-large** | 1024 | 64.68 | Best performance, good balance | Larger (slower) |
-| **nomic-embed-text-v1.5** | 768 | 62.39 | Upgraded version of current | Incremental improvement |
-| **bge-large-en-v1.5** | 1024 | 64.23 | Excellent for English | Not multilingual |
-| **nomic-embed-text** (current) | 768 | 60.10 | Baseline | Lower performance |
-
-**MTEB**: Massive Text Embedding Benchmark (higher = better semantic understanding)
-
-**Recommendation**: **mxbai-embed-large-v1**
-- Best MTEB score (64.68)
-- 1024 dimensions (richer semantic space)
-- Works well via Ollama
-- ~15-20% better retrieval quality in benchmarks
-
-**Implementation**:
-```python
-# config.py
-OLLAMA_EMBEDDING_MODEL = "mxbai-embed-large-v1"  # Changed from nomic-embed-text
-
-# ollama_provider.py
-async def get_dimension(self) -> int:
-    # Query Ollama for actual dimension instead of hardcoding
-    response = await self.client.post("/api/show", json={"name": self.model})
-    return response.json()["details"]["embedding_length"]
-```
-
-**Migration**:
-1. Deploy new model to Ollama
-2. Create new Qdrant collection (different dimension)
-3. Reindex all documents with new embeddings
-4. Swap collections atomically
-5. Delete old collection
-
-**Pros**:
-- ✅ Immediate quality improvement (15-20%)
-- ✅ Simple change (config + reindex)
-- ✅ No code complexity
-- ✅ Future-proof (state-of-the-art model)
-
-**Cons**:
-- ❌ Requires full reindex (2-4 hours for 1000 documents)
-- ❌ Larger model = slower embedding (~50ms vs. 30ms per chunk)
-- ❌ Higher dimensionality = more storage (~30% increase)
-
-**Expected Impact**: 15-25% recall improvement
-
-#### Option E2: Multi-Vector Embeddings (ColBERT-style)
-
-**Description**: Generate multiple embeddings per chunk (token-level)
-
-**Architecture**:
-```
-Chunk → Transformer → Token embeddings (e.g., 50 tokens × 128 dim) → Store all
-Query → Transformer → Token embeddings → MaxSim(query_tokens, doc_tokens)
-```
-
-**MaxSim scoring**:
-```python
-def maxsim_score(query_embeddings, doc_embeddings):
-    # For each query token, find max similarity with any doc token
-    scores = []
-    for q_emb in query_embeddings:
-        max_sim = max(cosine_similarity(q_emb, d_emb) for d_emb in doc_embeddings)
-        scores.append(max_sim)
-    return sum(scores)
-```
-
-**Pros**:
-- ✅ Best retrieval quality (state-of-the-art results)
-- ✅ Fine-grained matching (token-level)
-- ✅ Handles partial matches better
-
-**Cons**:
-- ❌ **50-100x storage increase** (50 vectors per chunk vs. 1)
-- ❌ **Slower search** (compute MaxSim for each candidate)
-- ❌ **Complex implementation** (custom scoring, storage schema)
-- ❌ **Requires specialized model** (ColBERTv2, not available in Ollama)
-
-**Expected Impact**: 40-50% improvement, but at very high cost
-
-**Verdict**: ❌ Too complex, too expensive for marginal gain over E1+C1
-
-#### Option E3: Fine-Tuned Domain-Specific Model
-
-**Description**: Fine-tune embedding model on Nextcloud corpus
-
-**Process**:
-1. Collect training data (query-document pairs)
-2. Fine-tune base model (e.g., `nomic-embed-text`) on domain data
-3. Deploy fine-tuned model via Ollama
-4. Reindex with fine-tuned embeddings
-
-**Training data needed**:
-- 1,000+ query-document pairs
-- Labeled relevance (positive/negative examples)
-- Representative of real usage
-
-**Pros**:
-- ✅ Optimized for specific content (notes, calendar, deck)
-- ✅ Better handling of domain terminology
-- ✅ Highest potential quality improvement (30-40%)
-
-**Cons**:
-- ❌ **Requires training data** (expensive to collect)
-- ❌ **GPU infrastructure** needed for fine-tuning
-- ❌ **Expertise required** (ML/NLP knowledge)
-- ❌ **Maintenance burden** (retrain as corpus evolves)
-- ❌ **Time investment**: 2-4 weeks initial setup
-
-**Expected Impact**: 30-40% improvement, but high cost
-
-**Verdict**: ⚠️ Consider only if E1+C1 insufficient AND have training data
-
-#### Option E4: Ensemble Embeddings
-
-**Description**: Generate embeddings with multiple models, combine scores
-
-**Implementation**:
-```python
-models = ["mxbai-embed-large-v1", "bge-large-en-v1.5"]
-
-# Index
-embeddings = [await embed(chunk, model) for model in models]
-store_multi_vector(embeddings)
-
-# Search
-query_embeddings = [await embed(query, model) for model in models]
-scores = [search(q_emb, model) for q_emb, model in zip(query_embeddings, models)]
-combined_score = 0.5 * scores[0] + 0.5 * scores[1]
-```
-
-**Pros**:
-- ✅ Robust to individual model weaknesses
-- ✅ Better coverage of semantic space
-
-**Cons**:
-- ❌ 2x storage and compute
-- ❌ Complex scoring and fusion
-- ❌ Marginal improvement (~5-10%) over single best model
-
-**Expected Impact**: 5-10% over best single model
-
-**Verdict**: ❌ Not worth complexity
-
-### Combined Strategies
-
-#### Option D1: Best Chunking + Best Embedding (RECOMMENDED)
-
-**Combination**: Option C1 (Semantic Chunking) + Option E1 (mxbai-embed-large-v1)
-
-**Expected Impact**:
-- Chunking: +20-30% recall
-- Embedding: +15-25% recall
-- **Combined**: +35-55% recall improvement (not strictly additive, but significant)
-
-**Cost**:
-- Development: 1-2 days
-- Reindex: 2-4 hours (one-time)
-- Ongoing: None (same infrastructure)
-
-**Pros**:
-- ✅ Addresses both root causes
-- ✅ Orthogonal improvements (chunking + embedding)
-- ✅ Simple implementation
-- ✅ No new infrastructure
-- ✅ Future-proof foundation for additional enhancements (reranking, hybrid search)
-
-**Cons**:
-- ❌ Requires full reindex (manageable)
-- ❌ Slightly higher storage (1024 vs. 768 dim)
-
-**Verdict**: ✅ **RECOMMENDED**
-
-## Decision
-
-**Adopt Option D1: Semantic Chunking + Upgraded Embedding Model**
-
-Implement both improvements together to maximize recall improvement:
-
-### 1. Semantic Sentence-Aware Chunking
-
-**Changes**:
-- Replace naive word splitting with `RecursiveCharacterTextSplitter`
-- Preserve sentence boundaries, paragraph structure
-- Maintain similar chunk sizes (~512 words / 2048 characters)
-
-**Implementation**:
-
-```python
-# nextcloud_mcp_server/vector/document_chunker.py
-
-from langchain.text_splitter import RecursiveCharacterTextSplitter
-
-class DocumentChunker:
-    """Chunk documents into semantically coherent pieces."""
-
-    def __init__(
-        self,
-        chunk_size: int = 2048,  # Characters, not words
-        chunk_overlap: int = 200,  # Characters, not words
-    ):
-        self.chunk_size = chunk_size
-        self.chunk_overlap = chunk_overlap
-
-        self.splitter = RecursiveCharacterTextSplitter(
-            chunk_size=chunk_size,
-            chunk_overlap=chunk_overlap,
-            separators=[
-                "\n\n",  # Paragraphs (highest priority)
-                "\n",    # Lines
-                ". ",    # Sentences
-                "! ",
-                "? ",
-                "; ",    # Clauses
-                ": ",
-                ", ",    # Phrases
-                " ",     # Words (last resort)
-            ],
-            length_function=len,
-            is_separator_regex=False,
-        )
-
-    def chunk_text(self, content: str) -> list[str]:
-        """
-        Chunk text while preserving semantic boundaries.
-
-        Args:
-            content: Full document text
-
-        Returns:
-            List of text chunks, each ending at a semantic boundary
-        """
-        if not content:
-            return []
-
-        # Use RecursiveCharacterTextSplitter for semantic boundaries
-        chunks = self.splitter.split_text(content)
-
-        return chunks
-```
-
-**Configuration Changes** (`config.py`):
-```python
-# Old (word-based)
-DOCUMENT_CHUNK_SIZE: int = 512  # words
-DOCUMENT_CHUNK_OVERLAP: int = 50  # words
-
-# New (character-based, more precise)
-DOCUMENT_CHUNK_SIZE: int = 2048  # characters (~512 words)
-DOCUMENT_CHUNK_OVERLAP: int = 200  # characters (~50 words)
-```
-
-**Dependency** (`pyproject.toml`):
-```toml
-[project]
-dependencies = [
-    # ... existing dependencies
-    "langchain-text-splitters>=0.2.0",
-]
-```
-
-### 2. Upgrade Embedding Model
-
-**Changes**:
- Switch from `nomic-embed-text` (768-dim) to `mxbai-embed-large-v1` (1024-dim)
- Dynamic dimension detection (query Ollama instead of hardcoding)
- Create new Qdrant collection for new dimensions
-
-**Implementation**:
-
-```python
-# nextcloud_mcp_server/embedding/ollama_provider.py
-
-import logging
-
-import httpx
-
-logger = logging.getLogger(__name__)
-
-class OllamaEmbeddingProvider(EmbeddingProvider):
-    def __init__(self, base_url: str, model: str, verify_ssl: bool = True):
-        self.base_url = base_url
-        self.model = model
-        self._dimension: int | None = None  # Changed: query dynamically
-        self.client = httpx.AsyncClient(base_url=base_url, verify=verify_ssl)
-
-    async def dimension(self) -> int:
-        """Get embedding dimension from Ollama API."""
-        if self._dimension is None:
-            try:
-                response = await self.client.post(
-                    "/api/show",
-                    json={"name": self.model},
-                    timeout=10.0,
-                )
-                response.raise_for_status()
-                info = response.json()
-                self._dimension = info.get("details", {}).get("embedding_length")
-
-                if self._dimension is None:
-                    # Fallback: generate test embedding to detect dimension
-                    test_emb = await self.embed("test")
-                    self._dimension = len(test_emb)
-
-            except Exception as e:
-                logger.warning(f"Failed to get dimension from Ollama: {e}, using fallback")
-                # Fallback dimensions by model name
-                if "mxbai-embed-large" in self.model:
-                    self._dimension = 1024
-                elif "nomic-embed-text" in self.model:
-                    self._dimension = 768
-                else:
-                    self._dimension = 768  # Default
-
-        return self._dimension
-```
-
-**Configuration Changes** (`config.py`):
-```python
-# Old
-OLLAMA_EMBEDDING_MODEL: str = "nomic-embed-text"
-
-# New
-OLLAMA_EMBEDDING_MODEL: str = "mxbai-embed-large-v1"
-```
-
-**Environment Variable**:
-```bash
-OLLAMA_EMBEDDING_MODEL=mxbai-embed-large-v1
-```
-
-### 3. Migration Strategy
-
-**Reindexing Process**:
-
-```python
-# nextcloud_mcp_server/vector/migration.py
-
-async def migrate_to_new_embeddings():
-    """
-    Migrate from old embeddings to new embeddings.
-
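-    Process:
-    1. Create new collection with new dimension
-    2. Reindex all documents with new embeddings
-    3. Wait for indexing to finish
-    4. Atomic swap (point the alias at the new collection)
-    5. Verify against the benchmark; roll back if recall regresses
-    6. Delete old collection
-    """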
-    old_collection = "nextcloud_content"
-    new_collection = "nextcloud_content_v2"
-
-    # 1. Create new collection
-    await qdrant_client.create_collection(
-        collection_name=new_collection,
-        vectors_config=VectorParams(
-            size=1024,  # mxbai-embed-large-v1 dimension
-            distance=Distance.COSINE,
-        ),
-    )
-
-    # 2. Reindex all documents
-    logger.info("Starting reindex with new embeddings...")
-    scanner = VectorScanner(...)
-    processor = VectorProcessor(collection_name=new_collection, ...)
-
-    await scanner.scan_all()  # Rescans and re-embeds all documents
-
-    # 3. Wait for completion
-    while True:
-        status = await get_sync_status()
-        if status.pending_documents == 0:
-            break
-        await asyncio.sleep(5)
-
-    # 4. Atomic swap
-    # Update config to point to new collection
-    # (or use collection alias in Qdrant)
-    await qdrant_client.update_collection_aliases(
-        change_aliases_operations=[
-            CreateAliasOperation(
-                create_alias=CreateAlias(
-                    collection_name=new_collection,
-                    alias_name="nextcloud_content"
-                )
-            )
-        ]
-    )
-
-    # 5. Verify new collection works
-    test_results = await run_benchmark_queries()
-    if test_results.recall < baseline_recall:
-        # Rollback
-        logger.error("New embeddings worse than baseline, rolling back")
-        await rollback_migration()
-        return False
-
-    # 6. Delete old collection
-    await qdrant_client.delete_collection(old_collection)
-    logger.info("Migration complete!")
-    return True
-```
-
-**Downtime Mitigation**:
- Use Qdrant collection aliases for atomic swap
- Reindex can happen in background
- Only brief downtime during alias swap (~1s)
-
-**Rollback Plan**:
- Keep old collection until validation complete
- If new embeddings worse, swap alias back to old collection
- No data loss
-
-### 4. Validation & Benchmarking
-
-**Before/After Comparison**:
-
-```python
-# tests/benchmarks/chunking_embedding_comparison.py
-
-async def benchmark_chunking_embeddings():
-    """
-    Compare old vs. new chunking and embeddings on test queries.
-    """
-    test_queries = load_benchmark_queries()  # 100 queries with known relevant docs
-
-    # Baseline (current)
-    baseline_results = await run_queries(
-        queries=test_queries,
-        collection="nextcloud_content",  # Old: nomic-embed-text, word chunks
-    )
-
-    # New implementation
-    new_results = await run_queries(
-        queries=test_queries,
-        collection="nextcloud_content_v2",  # New: mxbai-embed-large-v1, semantic chunks
-    )
-
-    # Compute metrics once so the improvement ratios can reuse them
-    baseline = {
-        "recall@10": calculate_recall(baseline_results, k=10),
-        "precision@10": calculate_precision(baseline_results, k=10),
-        "mrr": calculate_mrr(baseline_results),
-        "zero_result_rate": calculate_zero_result_rate(baseline_results),
-    }
-    new = {
-        "recall@10": calculate_recall(new_results, k=10),
-        "precision@10": calculate_precision(new_results, k=10),
-        "mrr": calculate_mrr(new_results),
-        "zero_result_rate": calculate_zero_result_rate(new_results),
-    }
-
-    # Relative improvement over the baseline
-    comparison = {
-        "baseline": baseline,
-        "new": new,
-        "improvement": {
-            "recall_improvement": (new["recall@10"] - baseline["recall@10"]) / baseline["recall@10"],
-            "precision_improvement": (new["precision@10"] - baseline["precision@10"]) / baseline["precision@10"],
-        },
-    }
-
-    return comparison
-```
-
-**Success Criteria**:
- **Recall@10**: Improve from ~52% to ≥75% (+40% improvement)
- **Precision@10**: Maintain ≥75% (no degradation)
- **MRR**: Improve from 0.58 to ≥0.70
- **Zero-result rate**: Reduce from 18% to ≤10%
- **Indexing time**: Maintain ≤10s per document
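
The retrieval metrics named in the success criteria can be sketched as follows. This is a minimal illustration using the standard definitions; the function names and the `(retrieved, relevant)` tuple layout are assumptions, not the project's actual `calculate_*` helpers.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def precision_at_k(retrieved: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the top-k slots occupied by relevant documents."""
    return sum(1 for doc_id in retrieved[:k] if doc_id in relevant) / k


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """Average reciprocal rank over all queries."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def zero_result_rate(runs: list[tuple[list[str], set[str]]]) -> float:
    """Fraction of queries that returned no results at all."""
    return sum(1 for r, _ in runs if not r) / len(runs)


# Hypothetical data: two queries, the second returns nothing.
runs = [(["a", "b", "c", "d"], {"b", "d", "z"}), ([], {"x"})]
```

Aggregate Recall@10 and Precision@10 would then be the mean of the per-query values across the benchmark set.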
-
-**Validation Process**:
-1. Run benchmark on baseline (current implementation)
-2. Implement changes
-3. Run benchmark on new implementation
-4. Compare metrics
-5. If improvement ≥40%, proceed to production
-6. If improvement <40%, investigate and iterate
-
-## Implementation Timeline
-
-### Week 1: Development & Testing
-
-**Day 1-2: Chunking Implementation**
- [ ] Add langchain-text-splitters dependency
- [ ] Refactor `document_chunker.py`
- [ ] Update configuration (character-based chunk sizes)
- [ ] Write unit tests for semantic boundaries
- [ ] Validate: Chunks never break mid-sentence
-
-**Day 3-4: Embedding Implementation**
- [ ] Update `ollama_provider.py` with dynamic dimension detection
- [ ] Update configuration (new model name)
- [ ] Deploy `mxbai-embed-large-v1` to Ollama
- [ ] Test embedding generation with new model
- [ ] Validate: Embeddings are 1024-dim
-
-**Day 5: Migration Script**
- [ ] Write migration script (collection creation, reindexing, alias swap)
- [ ] Test migration on staging environment
- [ ] Validate: No data loss, atomic swap works
-
-### Week 2: Reindexing & Validation
-
-**Day 1-2: Staging Reindex**
- [ ] Run full reindex on staging environment
- [ ] Monitor indexing performance
- [ ] Validate: All documents indexed correctly
-
-**Day 3: Benchmarking**
- [ ] Run benchmark queries on old collection (baseline)
- [ ] Run benchmark queries on new collection
- [ ] Compare metrics (recall, precision, MRR)
- [ ] Validate: ≥40% recall improvement
-
-**Day 4: Production Reindex**
- [ ] Schedule maintenance window (optional, can run in background)
- [ ] Run migration script on production
- [ ] Monitor reindexing progress
- [ ] Atomic swap when complete
-
-**Day 5: Production Validation**
- [ ] Monitor search quality metrics
- [ ] Collect user feedback
- [ ] Compare production metrics to staging
- [ ] Rollback if issues detected
-
-## Cost Analysis
-
-### Development Cost
- **Time**: 1-2 weeks (implementation + validation)
- **Effort**: 40-60 hours @ $100/hour = $4,000 - $6,000
-
-### Infrastructure Cost
- **Storage**: +33% (1024-dim vs. 768-dim)
-  - Example: 1,000 notes × 3 chunks × 1024 dim × 4 bytes = 12 MB (negligible)
- **Compute**: +67% embedding time (50ms vs. 30ms per chunk)
-  - Amortized over batch indexing, minimal impact
- **No new infrastructure**: Uses existing Ollama + Qdrant
-
-### Reindexing Cost (One-Time)
- **Time**: 2-4 hours for 1,000 documents
-  - 1,000 docs × 3 chunks × 50ms = 150 seconds (~2.5 minutes embedding)
-  - + Ollama processing time + Qdrant insertion
- **Downtime**: ~1 second (atomic alias swap)
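
The storage and embedding-time figures above follow from simple arithmetic, sketched below. The helper names are hypothetical, and the storage estimate covers raw float32 vectors only, not Qdrant's index structures or payloads.

```python
def vector_storage_bytes(
    num_docs: int, chunks_per_doc: int, dim: int, bytes_per_float: int = 4
) -> int:
    """Raw storage for float32 vectors, excluding index and payload overhead."""
    return num_docs * chunks_per_doc * dim * bytes_per_float


def embedding_seconds(num_docs: int, chunks_per_doc: int, ms_per_chunk: int) -> float:
    """Sequential embedding time in seconds, ignoring batching and I/O."""
    return num_docs * chunks_per_doc * ms_per_chunk / 1000


new_bytes = vector_storage_bytes(1_000, 3, 1024)   # 1024-dim vectors
old_bytes = vector_storage_bytes(1_000, 3, 768)    # 768-dim vectors
storage_increase = (new_bytes - old_bytes) / old_bytes
reindex_embed_time = embedding_seconds(1_000, 3, 50)
```

With these inputs the raw vectors grow from roughly 9 MB to roughly 12 MB, and pure embedding time for the reindex is 150 seconds.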
-
-### Total Cost
- **Initial**: $4,000 - $6,000 (development + testing)
- **Ongoing**: $0 (no new infrastructure or API costs)
-
-### ROI
- **Recall improvement**: +40-60% (finding relevant documents)
- **User satisfaction**: Reduced zero-result queries (18% → 10%)
- **Foundation**: Enables future enhancements (reranking, hybrid search)
- **Cost per % improvement**: $100 - $150 (excellent ROI)
-
-## Consequences
-
-### Positive
-
-1. **Addresses Root Causes**: Fixes fundamental issues (chunking, embeddings) not symptoms
-2. **High Impact**: Expected 40-60% recall improvement from foundational changes
-3. **Future-Proof**: Creates solid foundation for future enhancements (reranking, hybrid search, GraphRAG)
-4. **Simple**: No architectural changes, no new infrastructure
-5. **Orthogonal**: Improvements are independent, can be validated separately
-6. **Low Risk**: Proven techniques (RecursiveCharacterTextSplitter, mxbai-embed-large-v1)
-7. **Maintainable**: Standard libraries and models, easy to debug
-
-### Negative
-
-1. **Reindexing Required**: 2-4 hours one-time cost (manageable, can run in background)
-2. **Storage Increase**: +33% for higher-dimensional embeddings (12 MB vs. 9 MB for 1K docs)
-3. **Slower Indexing**: +67% embedding time (50ms vs. 30ms per chunk)
-4. **Dependency**: Adds langchain-text-splitters (minimal, well-maintained library)
-5. **Not a Complete Solution**: May still need reranking/hybrid search for optimal recall (but solid foundation)
-
-### Neutral
-
-1. **Model Lock-In**: Committed to mxbai-embed-large-v1, but can change later (another reindex)
-2. **Chunk Size Trade-offs**: ~512 words is heuristic, may need tuning for specific content types
-
-## Monitoring & Success Metrics
-
-### Real-Time Metrics (Grafana)
-
-**Search Quality**:
- `semantic_search_recall_at_10` (target: ≥75%)
- `semantic_search_precision_at_10` (target: ≥75%)
- `semantic_search_mrr` (target: ≥0.70)
- `semantic_search_zero_result_rate` (target: ≤10%)
-
-**Performance**:
- `semantic_search_latency_ms` (p50, p95, p99)
- `embedding_generation_time_ms`
- `indexing_throughput_docs_per_sec`
-
-**Indexing**:
- `documents_indexed_total`
- `documents_pending`
- `indexing_errors_total`
-
-### Weekly Validation
-
-**A/B Testing** (if gradual rollout):
- 50% users: New embeddings
- 50% users: Old embeddings
- Compare metrics for 1 week
- Full rollout if new embeddings superior
-
-**User Feedback**:
- Survey: "How satisfied are you with search results?" (1-5 scale)
- Track: Number of "search not working" support tickets
- Monitor: User-reported false negatives ("I know this doc exists")
-
-### Rollback Criteria
-
-**Automatic Rollback** if:
- Recall decreases by >10% from baseline
- Error rate increases by >50%
- Query latency increases by >100%
-
-**Manual Rollback** if:
- User complaints increase significantly
- Zero-result queries increase instead of decrease
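
The automatic rollback criteria could be evaluated with a check along these lines. This is a sketch only: `SearchMetrics` and `should_auto_rollback` are hypothetical names, and the thresholds are interpreted as relative changes against the baseline, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchMetrics:
    recall: float       # e.g. recall@10 as a fraction
    error_rate: float   # failed queries / total queries
    latency_ms: float   # e.g. p95 query latency


def relative_change(baseline: float, current: float) -> float:
    """Relative change from baseline; infinite if the baseline is zero and current is not."""
    if baseline == 0:
        return 0.0 if current == 0 else float("inf")
    return (current - baseline) / baseline


def should_auto_rollback(baseline: SearchMetrics, current: SearchMetrics) -> bool:
    """True if any automatic rollback criterion is met."""
    recall_drop = -relative_change(baseline.recall, current.recall)
    error_increase = relative_change(baseline.error_rate, current.error_rate)
    latency_increase = relative_change(baseline.latency_ms, current.latency_ms)
    return recall_drop > 0.10 or error_increase > 0.50 or latency_increase > 1.00


# Hypothetical baseline values for illustration.
baseline = SearchMetrics(recall=0.75, error_rate=0.02, latency_ms=200.0)
```

The manual criteria (user complaints, zero-result trends) are deliberately left out, since they require human judgment.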
-
-## Future Enhancements
-
-These improvements create a solid foundation. Future enhancements (in order of priority):
-
-1. **Cross-Encoder Reranking** (ADR-012)
-   - Two-stage retrieval: broad recall (50 candidates) → precise reranking (top 10)
-   - Expected: +15-20% additional recall improvement
-   - Builds on: Better embeddings retrieve better candidates to rerank
-
-2. **Hybrid Search** (ADR-013)
-   - Combine vector search + BM25 keyword search
-   - Expected: +10-15% additional recall (especially for exact matches)
-   - Builds on: Semantic chunks provide better keyword match context
-
-3. **Multi-App Indexing** (ADR-014)
-   - Index calendar, deck, files (currently notes-only)
-   - Expected: Expands searchable corpus 3-5x
-   - Builds on: Proven chunking and embedding strategy
-
-4. **GraphRAG** (ADR-015, conditional)
-   - Only if: Global thematic queries needed OR corpus >10K documents
-   - Expected: Relationship discovery, multi-hop reasoning
-   - Builds on: High-quality embeddings improve graph construction
-
-## References
-
-### Research Papers
-
-1. **RecursiveCharacterTextSplitter**
-   - LangChain Documentation: https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/recursive_text_splitter
-   - Proven technique used by major RAG systems
-
-2. **MTEB Leaderboard** (Massive Text Embedding Benchmark)
-   - https://huggingface.co/spaces/mteb/leaderboard
-   - Comprehensive embedding model comparison
-
-3. **mxbai-embed-large**
-   - Model: https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1
-   - Best general-purpose embedding model (MTEB: 64.68)
-
-### Related ADRs
-
- **ADR-003**: Vector Database and Semantic Search Architecture (original implementation)
- **ADR-008**: MCP Sampling for Multi-App Semantic Search with RAG (answer generation)
-
-### Tools & Libraries
-
- **LangChain Text Splitters**: https://python.langchain.com/docs/modules/data_connection/document_transformers/
- **Ollama Embedding Models**: https://ollama.ai/library
- **Qdrant Collections**: https://qdrant.tech/documentation/concepts/collections/
-
-## Summary
-
-This ADR addresses the root causes of poor semantic search recall:
-
-1. **Better Chunking**: Semantic sentence-aware splitting (preserves context)
-2. **Better Embeddings**: Upgrade to mxbai-embed-large-v1 (richer semantic space)
-
-**Expected Impact**: 40-60% recall improvement with minimal cost and complexity.
-
-**Why This Approach**:
- Fixes fundamentals before adding complexity
- Proven techniques (not experimental)
- Simple implementation (1-2 weeks)
- Creates foundation for future enhancements
- No new infrastructure or ongoing costs
-
-**Next Steps**: Approve ADR → Implement changes → Reindex → Validate → Production rollout
@@ -1,6 +1,5 @@
 import logging
 import os
-import time
 from collections.abc import AsyncIterator
 from contextlib import AsyncExitStack, asynccontextmanager
 from dataclasses import dataclass
@@ -45,10 +44,6 @@ from nextcloud_mcp_server.observability import (
    setup_metrics,
    setup_tracing,
 )
-from nextcloud_mcp_server.observability.metrics import (
-    record_dependency_check,
-    set_dependency_health,
-)
 from nextcloud_mcp_server.server import (
    configure_calendar_tools,
    configure_contacts_tools,
@@ -507,9 +502,9 @@ async def setup_oauth_config():
    - External IdP mode: OIDC_DISCOVERY_URL points to external provider
      → External IdP for OAuth, Nextcloud user_oidc validates tokens and provides API access

-    Uses OIDC environment variables:
+    Uses generic OIDC environment variables:
    - OIDC_DISCOVERY_URL: OIDC discovery endpoint (optional, defaults to NEXTCLOUD_HOST)
-    - NEXTCLOUD_OIDC_CLIENT_ID / NEXTCLOUD_OIDC_CLIENT_SECRET: Static credentials (optional, uses DCR if not provided)
+    - OIDC_CLIENT_ID / OIDC_CLIENT_SECRET: Static credentials (optional, uses DCR if not provided)
    - NEXTCLOUD_OIDC_SCOPES: Requested OAuth scopes

    This is done synchronously before FastMCP initialization because FastMCP
@@ -633,21 +628,19 @@ async def setup_oauth_config():
            )

    # Load client credentials (static or dynamic registration)
-    client_id = os.getenv("NEXTCLOUD_OIDC_CLIENT_ID")
-    client_secret = os.getenv("NEXTCLOUD_OIDC_CLIENT_SECRET")
+    client_id = os.getenv("OIDC_CLIENT_ID")
+    client_secret = os.getenv("OIDC_CLIENT_SECRET")

    if client_id and client_secret:
        logger.info(f"Using static OIDC client credentials: {client_id}")
    elif registration_endpoint:
-        logger.info(
-            "NEXTCLOUD_OIDC_CLIENT_ID not set, attempting Dynamic Client Registration"
-        )
+        logger.info("OIDC_CLIENT_ID not set, attempting Dynamic Client Registration")
        client_id, client_secret = await load_oauth_client_credentials(
            nextcloud_host=nextcloud_host, registration_endpoint=registration_endpoint
        )
    else:
        raise ValueError(
-            "NEXTCLOUD_OIDC_CLIENT_ID and NEXTCLOUD_OIDC_CLIENT_SECRET environment variables are required "
+            "OIDC_CLIENT_ID and OIDC_CLIENT_SECRET environment variables are required "
            "when the OIDC provider does not support Dynamic Client Registration. "
            f"Discovery URL: {discovery_url}"
        )
@@ -1212,35 +1205,12 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
        checks = {}
        is_ready = True

-        # Check Nextcloud host configuration and connectivity
+        # Check Nextcloud host configuration
        nextcloud_host = os.getenv("NEXTCLOUD_HOST")
        if nextcloud_host:
            checks["nextcloud_configured"] = "ok"
-            # Try to connect to Nextcloud
-            start_time = time.time()
-            try:
-                async with httpx.AsyncClient(timeout=2.0) as client:
-                    response = await client.get(f"{nextcloud_host}/status.php")
-                    duration = time.time() - start_time
-                    if response.status_code == 200:
-                        checks["nextcloud_reachable"] = "ok"
-                        set_dependency_health("nextcloud", True)
-                    else:
-                        checks["nextcloud_reachable"] = (
-                            f"error: status {response.status_code}"
-                        )
-                        set_dependency_health("nextcloud", False)
-                        is_ready = False
-                    record_dependency_check("nextcloud", duration)
-            except Exception as e:
-                duration = time.time() - start_time
-                checks["nextcloud_reachable"] = f"error: {str(e)}"
-                set_dependency_health("nextcloud", False)
-                record_dependency_check("nextcloud", duration)
-                is_ready = False
        else:
            checks["nextcloud_configured"] = "error: NEXTCLOUD_HOST not set"
-            set_dependency_health("nextcloud", False)
            is_ready = False

        # Check authentication configuration
@@ -1268,29 +1238,20 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
        qdrant_url = os.getenv("QDRANT_URL")  # Only set in network mode

        if vector_sync_enabled and qdrant_url:
-            start_time = time.time()
            try:
                async with httpx.AsyncClient(timeout=2.0) as client:
                    response = await client.get(f"{qdrant_url}/readyz")
-                    duration = time.time() - start_time
                    if response.status_code == 200:
                        checks["qdrant"] = "ok"
-                        set_dependency_health("qdrant", True)
                    else:
                        checks["qdrant"] = f"error: status {response.status_code}"
-                        set_dependency_health("qdrant", False)
                        is_ready = False
-                    record_dependency_check("qdrant", duration)
            except Exception as e:
-                duration = time.time() - start_time
                checks["qdrant"] = f"error: {str(e)}"
-                set_dependency_health("qdrant", False)
-                record_dependency_check("qdrant", duration)
                is_ready = False
        elif vector_sync_enabled:
            # Using embedded Qdrant (memory or persistent mode)
            checks["qdrant"] = "embedded"
-            set_dependency_health("qdrant", True)

        status_code = 200 if is_ready else 503
        return JSONResponse(
@@ -12,10 +12,6 @@ from mcp.server.fastmcp import Context

 from ..client import NextcloudClient
 from ..config import get_settings
-from ..observability.metrics import (
-    oauth_token_cache_hits_total,
-    oauth_token_exchange_total,
-)
 from .token_exchange import exchange_token_for_audience

 logger = logging.getLogger(__name__)
@@ -142,7 +138,6 @@ async def get_session_client_from_context(
                logger.debug(
                    f"Using cached exchanged token (expires in {expiry - time.time():.1f}s)"
                )
-                oauth_token_cache_hits_total.labels(hit="true").inc()
                return NextcloudClient.from_token(
                    base_url=base_url, token=cached_token, username=username
                )
@@ -150,24 +145,17 @@ async def get_session_client_from_context(
                logger.debug("Cached token expired, removing from cache")
                del _exchange_cache[cache_key]

-        oauth_token_cache_hits_total.labels(hit="false").inc()
-
        # Perform RFC 8693 token exchange
        logger.info(f"Exchanging MCP token for Nextcloud API token (user: {username})")

-        try:
-            # Exchange for Nextcloud resource URI audience
-            exchanged_token, expires_in = await exchange_token_for_audience(
-                subject_token=mcp_token,
-                requested_audience=settings.nextcloud_resource_uri or "nextcloud",
-                requested_scopes=None,  # Nextcloud doesn't support scopes
-            )
-            oauth_token_exchange_total.labels(status="success").inc()
+        # Exchange for Nextcloud resource URI audience
+        exchanged_token, expires_in = await exchange_token_for_audience(
+            subject_token=mcp_token,
+            requested_audience=settings.nextcloud_resource_uri or "nextcloud",
+            requested_scopes=None,  # Nextcloud doesn't support scopes
+        )

-            logger.info(f"Token exchange successful. Token expires in {expires_in}s")
-        except Exception:
-            oauth_token_exchange_total.labels(status="error").inc()
-            raise
+        logger.info(f"Token exchange successful. Token expires in {expires_in}s")

        # Cache the exchanged token
        # Use the minimum of exchange TTL and configured cache TTL
@@ -35,8 +35,6 @@ from typing import Any, Optional
 import aiosqlite
 from cryptography.fernet import Fernet

-from nextcloud_mcp_server.observability.metrics import record_db_operation
-
 logger = logging.getLogger(__name__)


@@ -294,43 +292,35 @@ class RefreshTokenStorage:
        # For Flow 2, set provisioned_at timestamp
        provisioned_at = now if flow_type == "flow2" else None

-        start_time = time.time()
-        try:
-            async with aiosqlite.connect(self.db_path) as db:
-                await db.execute(
-                    """
-                    INSERT OR REPLACE INTO refresh_tokens
-                    (user_id, encrypted_token, expires_at, created_at, updated_at,
-                     flow_type, token_audience, provisioned_at, provisioning_client_id, scopes)
-                    VALUES (?, ?, ?, COALESCE((SELECT created_at FROM refresh_tokens WHERE user_id = ?), ?), ?,
-                            ?, ?, ?, ?, ?)
-                    """,
-                    (
-                        user_id,
-                        encrypted_token,
-                        expires_at,
-                        user_id,
-                        now,
-                        now,
-                        flow_type,
-                        token_audience,
-                        provisioned_at,
-                        provisioning_client_id,
-                        scopes_json,
-                    ),
-                )
-                await db.commit()
-            duration = time.time() - start_time
-            record_db_operation("sqlite", "insert", duration, "success")
-
-            logger.info(
-                f"Stored refresh token for user {user_id}"
-                + (f" (expires at {expires_at})" if expires_at else "")
+        async with aiosqlite.connect(self.db_path) as db:
+            await db.execute(
+                """
+                INSERT OR REPLACE INTO refresh_tokens
+                (user_id, encrypted_token, expires_at, created_at, updated_at,
+                 flow_type, token_audience, provisioned_at, provisioning_client_id, scopes)
+                VALUES (?, ?, ?, COALESCE((SELECT created_at FROM refresh_tokens WHERE user_id = ?), ?), ?,
+                        ?, ?, ?, ?, ?)
+                """,
+                (
+                    user_id,
+                    encrypted_token,
+                    expires_at,
+                    user_id,
+                    now,
+                    now,
+                    flow_type,
+                    token_audience,
+                    provisioned_at,
+                    provisioning_client_id,
+                    scopes_json,
+                ),
            )
-        except Exception:
-            duration = time.time() - start_time
-            record_db_operation("sqlite", "insert", duration, "error")
-            raise
+            await db.commit()
+
+        logger.info(
+            f"Stored refresh token for user {user_id}"
+            + (f" (expires at {expires_at})" if expires_at else "")
+        )

        # Audit log
        await self._audit_log(
@@ -432,45 +422,40 @@ class RefreshTokenStorage:
        if not self._initialized:
            await self.initialize()

-        start_time = time.time()
+        async with aiosqlite.connect(self.db_path) as db:
+            async with db.execute(
+                """
+                SELECT encrypted_token, expires_at, flow_type, token_audience,
+                       provisioned_at, provisioning_client_id, scopes
+                FROM refresh_tokens WHERE user_id = ?
+                """,
+                (user_id,),
+            ) as cursor:
+                row = await cursor.fetchone()
+
+        if not row:
+            logger.debug(f"No refresh token found for user {user_id}")
+            return None
+
+        (
+            encrypted_token,
+            expires_at,
+            flow_type,
+            token_audience,
+            provisioned_at,
+            provisioning_client_id,
+            scopes_json,
+        ) = row
+
+        # Check expiration
+        if expires_at is not None and expires_at < time.time():
+            logger.warning(
+                f"Refresh token for user {user_id} has expired (expired at {expires_at})"
+            )
+            await self.delete_refresh_token(user_id)
+            return None
+
        try:
-            async with aiosqlite.connect(self.db_path) as db:
-                async with db.execute(
-                    """
-                    SELECT encrypted_token, expires_at, flow_type, token_audience,
-                           provisioned_at, provisioning_client_id, scopes
-                    FROM refresh_tokens WHERE user_id = ?
-                    """,
-                    (user_id,),
-                ) as cursor:
-                    row = await cursor.fetchone()
-
-            if not row:
-                logger.debug(f"No refresh token found for user {user_id}")
-                duration = time.time() - start_time
-                record_db_operation("sqlite", "select", duration, "success")
-                return None
-
-            (
-                encrypted_token,
-                expires_at,
-                flow_type,
-                token_audience,
-                provisioned_at,
-                provisioning_client_id,
-                scopes_json,
-            ) = row
-
-            # Check expiration
-            if expires_at is not None and expires_at < time.time():
-                logger.warning(
-                    f"Refresh token for user {user_id} has expired (expired at {expires_at})"
-                )
-                await self.delete_refresh_token(user_id)
-                duration = time.time() - start_time
-                record_db_operation("sqlite", "select", duration, "success")
-                return None
-
            decrypted_token = self.cipher.decrypt(encrypted_token).decode()
            scopes = json.loads(scopes_json) if scopes_json else None

@@ -478,9 +463,6 @@ class RefreshTokenStorage:
                f"Retrieved refresh token for user {user_id} (flow_type: {flow_type})"
            )

-            duration = time.time() - start_time
-            record_db_operation("sqlite", "select", duration, "success")
-
            return {
                "refresh_token": decrypted_token,
                "expires_at": expires_at,
@@ -492,8 +474,6 @@ class RefreshTokenStorage:
                "scopes": scopes,
            }
        except Exception as e:
-            duration = time.time() - start_time
-            record_db_operation("sqlite", "select", duration, "error")
            logger.error(f"Failed to decrypt refresh token for user {user_id}: {e}")
            return None

@@ -588,34 +568,25 @@ class RefreshTokenStorage:
        if not self._initialized:
            await self.initialize()

-        start_time = time.time()
-        try:
-            async with aiosqlite.connect(self.db_path) as db:
-                cursor = await db.execute(
-                    "DELETE FROM refresh_tokens WHERE user_id = ?",
-                    (user_id,),
-                )
-                await db.commit()
-                deleted = cursor.rowcount > 0
+        async with aiosqlite.connect(self.db_path) as db:
+            cursor = await db.execute(
+                "DELETE FROM refresh_tokens WHERE user_id = ?",
+                (user_id,),
+            )
+            await db.commit()
+            deleted = cursor.rowcount > 0

-            duration = time.time() - start_time
-            record_db_operation("sqlite", "delete", duration, "success")
+        if deleted:
+            logger.info(f"Deleted refresh token for user {user_id}")
+            await self._audit_log(
+                event="delete_refresh_token",
+                user_id=user_id,
+                auth_method="offline_access",
+            )
+        else:
+            logger.debug(f"No refresh token to delete for user {user_id}")

-            if deleted:
-                logger.info(f"Deleted refresh token for user {user_id}")
-                await self._audit_log(
-                    event="delete_refresh_token",
-                    user_id=user_id,
-                    auth_method="offline_access",
-                )
-            else:
-                logger.debug(f"No refresh token to delete for user {user_id}")
-
-            return deleted
-        except Exception:
-            duration = time.time() - start_time
-            record_db_operation("sqlite", "delete", duration, "error")
-            raise
+        return deleted

    async def get_all_user_ids(self) -> list[str]:
        """
@@ -26,10 +26,6 @@ from jwt import PyJWKClient
 from mcp.server.auth.provider import AccessToken, TokenVerifier

 from nextcloud_mcp_server.config import Settings
-from nextcloud_mcp_server.observability.metrics import (
-    oauth_token_cache_hits_total,
-    record_oauth_token_validation,
-)

 logger = logging.getLogger(__name__)

@@ -109,11 +105,8 @@ class UnifiedTokenVerifier(TokenVerifier):
        cached = self._get_cached_token(token)
        if cached:
            logger.debug("Token found in cache")
-            oauth_token_cache_hits_total.labels(hit="true").inc()
            return cached

-        oauth_token_cache_hits_total.labels(hit="false").inc()
-
        # Both modes do the same validation (MCP audience only)
        return await self._verify_mcp_audience(token)

@@ -131,24 +124,13 @@ class UnifiedTokenVerifier(TokenVerifier):
        Returns:
            AccessToken if valid with MCP audience, None otherwise
        """
-        validation_method = "unknown"
        try:
            # Attempt JWT verification first
            if self._is_jwt_format(token) and self.jwks_client:
-                validation_method = "jwt"
                payload = await self._verify_jwt_signature(token)
-                if payload:
-                    record_oauth_token_validation("jwt", "valid")
-                else:
-                    record_oauth_token_validation("jwt", "invalid")
            else:
                # Fall back to introspection for opaque tokens
-                validation_method = "introspect"
                payload = await self._introspect_token(token)
-                if payload:
-                    record_oauth_token_validation("introspect", "valid")
-                else:
-                    record_oauth_token_validation("introspect", "invalid")
                if not payload:
                    return None

@@ -164,8 +146,6 @@ class UnifiedTokenVerifier(TokenVerifier):
                    f"Got {audiences}, need MCP ({self.settings.oidc_client_id} or "
                    f"{self.settings.nextcloud_mcp_server_url})"
                )
-                # Record as invalid due to audience mismatch
-                record_oauth_token_validation(validation_method, "invalid")
                return None

            # Log based on mode for clarity
@@ -183,7 +163,6 @@ class UnifiedTokenVerifier(TokenVerifier):

        except Exception as e:
            logger.error(f"Token verification failed: {e}")
-            record_oauth_token_validation(validation_method, "error")
            return None

    def _has_mcp_audience(self, payload: dict[str, Any]) -> bool:
@@ -288,8 +288,8 @@ def get_settings() -> Settings:
    return Settings(
        # OAuth/OIDC settings
        oidc_discovery_url=os.getenv("OIDC_DISCOVERY_URL"),
-        oidc_client_id=os.getenv("NEXTCLOUD_OIDC_CLIENT_ID"),
-        oidc_client_secret=os.getenv("NEXTCLOUD_OIDC_CLIENT_SECRET"),
+        oidc_client_id=os.getenv("OIDC_CLIENT_ID"),
+        oidc_client_secret=os.getenv("OIDC_CLIENT_SECRET"),
        oidc_issuer=os.getenv("OIDC_ISSUER"),
        # Nextcloud settings
        nextcloud_host=os.getenv("NEXTCLOUD_HOST"),
@@ -352,92 +352,3 @@ def record_dependency_check(dependency: str, duration: float) -> None:
        duration: Check duration in seconds
    """
    dependency_check_duration_seconds.labels(dependency=dependency).observe(duration)
-
-
-def record_vector_sync_scan(documents_found: int) -> None:
-    """
-    Record documents scanned during vector sync.
-
-    Args:
-        documents_found: Number of documents discovered in scan
-    """
-    vector_sync_documents_scanned_total.inc(documents_found)
-
-
-def record_vector_sync_processing(duration: float, status: str = "success") -> None:
-    """
-    Record document processing with duration and status.
-
-    Args:
-        duration: Processing duration in seconds
-        status: "success" or "error"
-    """
-    vector_sync_documents_processed_total.labels(status=status).inc()
-    vector_sync_processing_duration_seconds.observe(duration)
-
-
-def record_qdrant_operation(operation: str, status: str = "success") -> None:
-    """
-    Record Qdrant vector database operation.
-
-    Args:
-        operation: Operation type ("upsert", "search", "delete")
-        status: "success" or "error"
-    """
-    qdrant_operations_total.labels(operation=operation, status=status).inc()
-
-
-def update_vector_sync_queue_size(size: int) -> None:
-    """
-    Update vector sync queue size gauge.
-
-    Args:
-        size: Current queue size
-    """
-    vector_sync_queue_size.set(size)
-
-
-# =============================================================================
-# Decorator for Automatic Tool Instrumentation
-# =============================================================================
-
-
-def instrument_tool(func):
-    """
-    Decorator to automatically instrument MCP tool functions with metrics.
-
-    Wraps async tool functions to record execution time and success/error status.
-    Compatible with @mcp.tool() and @require_scopes() decorators.
-
-    Usage:
-        @mcp.tool()
-        @require_scopes("notes:write")
-        @instrument_tool
-        async def nc_notes_create_note(...):
-            ...
-
-    Args:
-        func: The async function to instrument
-
-    Returns:
-        Wrapped function with metrics instrumentation
-    """
-    import functools
-    import time
-
-    @functools.wraps(func)
-    async def wrapper(*args, **kwargs):
-        tool_name = func.__name__
-        start_time = time.time()
-        try:
-            result = await func(*args, **kwargs)
-            duration = time.time() - start_time
-            record_tool_call(tool_name, duration, "success")
-            return result
-        except Exception as e:
-            duration = time.time() - start_time
-            record_tool_call(tool_name, duration, "error")
-            record_tool_error(tool_name, type(e).__name__)
-            raise
-
-    return wrapper
@@ -12,7 +12,6 @@ from nextcloud_mcp_server.models.calendar import (
    ListTodosResponse,
    Todo,
 )
-from nextcloud_mcp_server.observability.metrics import instrument_tool

 logger = logging.getLogger(__name__)

@@ -21,7 +20,6 @@ def configure_calendar_tools(mcp: FastMCP):
    # Calendar tools
    @mcp.tool()
    @require_scopes("calendar:read")
-    @instrument_tool
    async def nc_calendar_list_calendars(ctx: Context) -> ListCalendarsResponse:
        """List all available calendars for the user"""
        client = await get_client(ctx)
@@ -32,7 +30,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:write")
-    @instrument_tool
    async def nc_calendar_create_event(
        calendar_name: str,
        title: str,
@@ -109,7 +106,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:read")
-    @instrument_tool
    async def nc_calendar_list_events(
        calendar_name: str,
        ctx: Context,
@@ -212,7 +208,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:read")
-    @instrument_tool
    async def nc_calendar_get_event(
        calendar_name: str,
        event_uid: str,
@@ -225,7 +220,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:write")
-    @instrument_tool
    async def nc_calendar_update_event(
        calendar_name: str,
        event_uid: str,
@@ -299,7 +293,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:write")
-    @instrument_tool
    async def nc_calendar_delete_event(
        calendar_name: str,
        event_uid: str,
@@ -311,7 +304,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:write")
-    @instrument_tool
    async def nc_calendar_create_meeting(
        title: str,
        date: str,
@@ -378,7 +370,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:read")
-    @instrument_tool
    async def nc_calendar_get_upcoming_events(
        ctx: Context,
        calendar_name: str = "",  # Empty = all calendars
@@ -429,7 +420,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:read")
-    @instrument_tool
    async def nc_calendar_find_availability(
        duration_minutes: int,
        ctx: Context,
@@ -510,7 +500,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:write")
-    @instrument_tool
    async def nc_calendar_bulk_operations(
        operation: str,  # "update", "delete", "move"
        ctx: Context,
@@ -760,7 +749,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("calendar:write")
-    @instrument_tool
    async def nc_calendar_manage_calendar(
        action: str,  # "create", "delete", "update", "list"
        ctx: Context,
@@ -830,7 +818,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("todo:read", "calendar:read")
-    @instrument_tool
    async def nc_calendar_list_todos(
        calendar_name: str,
        ctx: Context,
@@ -876,7 +863,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("todo:write", "calendar:read")
-    @instrument_tool
    async def nc_calendar_create_todo(
        calendar_name: str,
        summary: str,
@@ -920,7 +906,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("todo:write", "calendar:read")
-    @instrument_tool
    async def nc_calendar_update_todo(
        calendar_name: str,
        todo_uid: str,
@@ -981,7 +966,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("todo:write", "calendar:read")
-    @instrument_tool
    async def nc_calendar_delete_todo(
        calendar_name: str,
        todo_uid: str,
@@ -1002,7 +986,6 @@ def configure_calendar_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("todo:read", "calendar:read")
-    @instrument_tool
    async def nc_calendar_search_todos(
        ctx: Context,
        status: Optional[str] = None,
@@ -4,7 +4,6 @@ from mcp.server.fastmcp import Context, FastMCP

 from nextcloud_mcp_server.auth import require_scopes
 from nextcloud_mcp_server.context import get_client
-from nextcloud_mcp_server.observability.metrics import instrument_tool

 logger = logging.getLogger(__name__)

@@ -13,7 +12,6 @@ def configure_contacts_tools(mcp: FastMCP):
    # Contacts tools
    @mcp.tool()
    @require_scopes("contacts:read")
-    @instrument_tool
    async def nc_contacts_list_addressbooks(ctx: Context):
        """List all addressbooks for the user."""
        client = await get_client(ctx)
@@ -21,7 +19,6 @@ def configure_contacts_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("contacts:read")
-    @instrument_tool
    async def nc_contacts_list_contacts(ctx: Context, *, addressbook: str):
        """List all contacts in the specified addressbook."""
        client = await get_client(ctx)
@@ -29,7 +26,6 @@ def configure_contacts_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("contacts:write")
-    @instrument_tool
    async def nc_contacts_create_addressbook(
        ctx: Context, *, name: str, display_name: str
    ):
@@ -46,7 +42,6 @@ def configure_contacts_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("contacts:write")
-    @instrument_tool
    async def nc_contacts_delete_addressbook(ctx: Context, *, name: str):
        """Delete an addressbook."""
        client = await get_client(ctx)
@@ -54,7 +49,6 @@ def configure_contacts_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("contacts:write")
-    @instrument_tool
    async def nc_contacts_create_contact(
        ctx: Context, *, addressbook: str, uid: str, contact_data: dict
    ):
@@ -72,7 +66,6 @@ def configure_contacts_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("contacts:write")
-    @instrument_tool
    async def nc_contacts_delete_contact(ctx: Context, *, addressbook: str, uid: str):
        """Delete a contact."""
        client = await get_client(ctx)
@@ -80,7 +73,6 @@ def configure_contacts_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("contacts:write")
-    @instrument_tool
    async def nc_contacts_update_contact(
        ctx: Context, *, addressbook: str, uid: str, contact_data: dict, etag: str = ""
    ):
@@ -24,7 +24,6 @@ from nextcloud_mcp_server.models.cookbook import (
    UpdateRecipeResponse,
    Version,
 )
-from nextcloud_mcp_server.observability.metrics import instrument_tool

 logger = logging.getLogger(__name__)

@@ -73,7 +72,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:write")
-    @instrument_tool
    async def nc_cookbook_import_recipe(url: str, ctx: Context) -> ImportRecipeResponse:
        """Import a recipe from a URL using schema.org metadata.

@@ -131,7 +129,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:read")
-    @instrument_tool
    async def nc_cookbook_list_recipes(ctx: Context) -> ListRecipesResponse:
        """Get all recipes in the database"""
        client = await get_client(ctx)
@@ -157,7 +154,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:read")
-    @instrument_tool
    async def nc_cookbook_get_recipe(recipe_id: int, ctx: Context) -> Recipe:
        """Get a specific recipe by its ID"""
        client = await get_client(ctx)
@@ -183,7 +179,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:write")
-    @instrument_tool
    async def nc_cookbook_create_recipe(
        name: str,
        description: str | None = None,
@@ -263,7 +258,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:write")
-    @instrument_tool
    async def nc_cookbook_update_recipe(
        recipe_id: int,
        name: str | None = None,
@@ -353,7 +347,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:write")
-    @instrument_tool
    async def nc_cookbook_delete_recipe(
        recipe_id: int, ctx: Context
    ) -> DeleteRecipeResponse:
@@ -389,7 +382,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:read")
-    @instrument_tool
    async def nc_cookbook_search_recipes(
        query: str, ctx: Context
    ) -> SearchRecipesResponse:
@@ -426,7 +418,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:read")
-    @instrument_tool
    async def nc_cookbook_list_categories(ctx: Context) -> ListCategoriesResponse:
        """Get all known categories.

@@ -454,7 +445,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:read")
-    @instrument_tool
    async def nc_cookbook_get_recipes_in_category(
        category: str, ctx: Context
    ) -> ListRecipesResponse:
@@ -491,7 +481,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:read")
-    @instrument_tool
    async def nc_cookbook_list_keywords(ctx: Context) -> ListKeywordsResponse:
        """Get all known keywords/tags"""
        client = await get_client(ctx)
@@ -517,7 +506,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:read")
-    @instrument_tool
    async def nc_cookbook_get_recipes_with_keywords(
        keywords: list[str], ctx: Context
    ) -> ListRecipesResponse:
@@ -552,7 +540,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:write")
-    @instrument_tool
    async def nc_cookbook_set_config(
        folder: str | None = None,
        update_interval: int | None = None,
@@ -596,7 +583,6 @@ def configure_cookbook_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("cookbook:write")
-    @instrument_tool
    async def nc_cookbook_reindex(ctx: Context) -> ReindexResponse:
        """Trigger a rescan of all recipes into the caching database.

@@ -18,7 +18,6 @@ from nextcloud_mcp_server.models.deck import (
    LabelOperationResponse,
    StackOperationResponse,
 )
-from nextcloud_mcp_server.observability.metrics import instrument_tool

 logger = logging.getLogger(__name__)

@@ -119,7 +118,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:read")
-    @instrument_tool
    async def deck_get_boards(ctx: Context) -> list[DeckBoard]:
        """Get all Nextcloud Deck boards"""
        client = await get_client(ctx)
@@ -128,7 +126,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:read")
-    @instrument_tool
    async def deck_get_board(ctx: Context, board_id: int) -> DeckBoard:
        """Get details of a specific Nextcloud Deck board"""
        client = await get_client(ctx)
@@ -137,7 +134,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:read")
-    @instrument_tool
    async def deck_get_stacks(ctx: Context, board_id: int) -> list[DeckStack]:
        """Get all stacks in a Nextcloud Deck board"""
        client = await get_client(ctx)
@@ -146,7 +142,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:read")
-    @instrument_tool
    async def deck_get_stack(ctx: Context, board_id: int, stack_id: int) -> DeckStack:
        """Get details of a specific Nextcloud Deck stack"""
        client = await get_client(ctx)
@@ -155,7 +150,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:read")
-    @instrument_tool
    async def deck_get_cards(
        ctx: Context, board_id: int, stack_id: int
    ) -> list[DeckCard]:
@@ -168,7 +162,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:read")
-    @instrument_tool
    async def deck_get_card(
        ctx: Context, board_id: int, stack_id: int, card_id: int
    ) -> DeckCard:
@@ -179,7 +172,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:read")
-    @instrument_tool
    async def deck_get_labels(ctx: Context, board_id: int) -> list[DeckLabel]:
        """Get all labels in a Nextcloud Deck board"""
        client = await get_client(ctx)
@@ -188,7 +180,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:read")
-    @instrument_tool
    async def deck_get_label(ctx: Context, board_id: int, label_id: int) -> DeckLabel:
        """Get details of a specific Nextcloud Deck label"""
        client = await get_client(ctx)
@@ -199,7 +190,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_create_board(
        ctx: Context, title: str, color: str
    ) -> CreateBoardResponse:
@@ -217,7 +207,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_create_stack(
        ctx: Context, board_id: int, title: str, order: int
    ) -> CreateStackResponse:
@@ -234,7 +223,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_update_stack(
        ctx: Context,
        board_id: int,
@@ -261,7 +249,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_delete_stack(
        ctx: Context, board_id: int, stack_id: int
    ) -> StackOperationResponse:
@@ -283,7 +270,6 @@ def configure_deck_tools(mcp: FastMCP):
    # Card Tools
    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_create_card(
        ctx: Context,
        board_id: int,
@@ -318,7 +304,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_update_card(
        ctx: Context,
        board_id: int,
@@ -372,7 +357,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_delete_card(
        ctx: Context, board_id: int, stack_id: int, card_id: int
    ) -> CardOperationResponse:
@@ -395,7 +379,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_archive_card(
        ctx: Context, board_id: int, stack_id: int, card_id: int
    ) -> CardOperationResponse:
@@ -418,7 +401,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_unarchive_card(
        ctx: Context, board_id: int, stack_id: int, card_id: int
    ) -> CardOperationResponse:
@@ -441,7 +423,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_reorder_card(
        ctx: Context,
        board_id: int,
@@ -474,7 +455,6 @@ def configure_deck_tools(mcp: FastMCP):
    # Label Tools
    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_create_label(
        ctx: Context, board_id: int, title: str, color: str
    ) -> CreateLabelResponse:
@@ -491,7 +471,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_update_label(
        ctx: Context,
        board_id: int,
@@ -518,7 +497,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_delete_label(
        ctx: Context, board_id: int, label_id: int
    ) -> LabelOperationResponse:
@@ -540,7 +518,6 @@ def configure_deck_tools(mcp: FastMCP):
    # Card-Label Assignment Tools
    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_assign_label_to_card(
        ctx: Context, board_id: int, stack_id: int, card_id: int, label_id: int
    ) -> CardOperationResponse:
@@ -564,7 +541,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_remove_label_from_card(
        ctx: Context, board_id: int, stack_id: int, card_id: int, label_id: int
    ) -> CardOperationResponse:
@@ -589,7 +565,6 @@ def configure_deck_tools(mcp: FastMCP):
    # Card-User Assignment Tools
    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_assign_user_to_card(
        ctx: Context, board_id: int, stack_id: int, card_id: int, user_id: str
    ) -> CardOperationResponse:
@@ -613,7 +588,6 @@ def configure_deck_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("deck:write")
-    @instrument_tool
    async def deck_unassign_user_from_card(
        ctx: Context, board_id: int, stack_id: int, card_id: int, user_id: str
    ) -> CardOperationResponse:
@@ -17,7 +17,6 @@ from nextcloud_mcp_server.models.notes import (
    SearchNotesResponse,
    UpdateNoteResponse,
 )
-from nextcloud_mcp_server.observability.metrics import instrument_tool

 logger = logging.getLogger(__name__)

@@ -87,7 +86,6 @@ def configure_notes_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("notes:write")
-    @instrument_tool
    async def nc_notes_create_note(
        title: str, content: str, category: str, ctx: Context
    ) -> CreateNoteResponse:
@@ -134,7 +132,6 @@ def configure_notes_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("notes:write")
-    @instrument_tool
    async def nc_notes_update_note(
        note_id: int,
        etag: str,
@@ -200,7 +197,6 @@ def configure_notes_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("notes:write")
-    @instrument_tool
    async def nc_notes_append_content(
        note_id: int, content: str, ctx: Context
    ) -> AppendContentResponse:
@@ -251,7 +247,6 @@ def configure_notes_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("notes:read")
-    @instrument_tool
    async def nc_notes_search_notes(query: str, ctx: Context) -> SearchNotesResponse:
        """Search notes by title or content, returning only id, title, and category (requires notes:read scope)."""
        client = await get_client(ctx)
@@ -298,7 +293,6 @@ def configure_notes_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("notes:read")
-    @instrument_tool
    async def nc_notes_get_note(note_id: int, ctx: Context) -> Note:
        """Get a specific note by its ID (requires notes:read scope)"""
        client = await get_client(ctx)
@@ -328,7 +322,6 @@ def configure_notes_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("notes:read")
-    @instrument_tool
    async def nc_notes_get_attachment(
        note_id: int, attachment_filename: str, ctx: Context
    ) -> dict[str, str]:
@@ -375,7 +368,6 @@ def configure_notes_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("notes:write")
-    @instrument_tool
    async def nc_notes_delete_note(note_id: int, ctx: Context) -> DeleteNoteResponse:
        """Delete a note permanently"""
        logger.info("Deleting note %s", note_id)
@@ -21,10 +21,6 @@ from nextcloud_mcp_server.models.semantic import (
    SemanticSearchResult,
    VectorSyncStatusResponse,
 )
-from nextcloud_mcp_server.observability.metrics import (
-    instrument_tool,
-    record_qdrant_operation,
-)

 logger = logging.getLogger(__name__)

@@ -34,7 +30,6 @@ def configure_semantic_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("semantic:read")
-    @instrument_tool
    async def nc_semantic_search(
        query: str, ctx: Context, limit: int = 10, score_threshold: float = 0.7
    ) -> SemanticSearchResponse:
@@ -90,33 +85,26 @@ def configure_semantic_tools(mcp: FastMCP):
            # Note: Currently only searching notes (doc_type="note")
            # Future: Remove doc_type filter to search all apps
            qdrant_client = await get_qdrant_client()
-            try:
-                search_response = await qdrant_client.query_points(
-                    collection_name=settings.get_collection_name(),
-                    query=query_embedding,
-                    query_filter=Filter(
-                        must=[
-                            FieldCondition(
-                                key="user_id",
-                                match=MatchValue(value=username),
-                            ),
-                            FieldCondition(
-                                key="doc_type",
-                                match=MatchValue(value="note"),
-                            ),
-                        ]
-                    ),
-                    limit=limit * 2,  # Get extra for filtering
-                    score_threshold=score_threshold,
-                    with_payload=True,
-                    with_vectors=False,  # Don't return vectors to save bandwidth
-                )
-                # Record successful search operation
-                record_qdrant_operation("search", "success")
-            except Exception:
-                # Record failed search operation
-                record_qdrant_operation("search", "error")
-                raise
+            search_response = await qdrant_client.query_points(
+                collection_name=settings.get_collection_name(),
+                query=query_embedding,
+                query_filter=Filter(
+                    must=[
+                        FieldCondition(
+                            key="user_id",
+                            match=MatchValue(value=username),
+                        ),
+                        FieldCondition(
+                            key="doc_type",
+                            match=MatchValue(value="note"),
+                        ),
+                    ]
+                ),
+                limit=limit * 2,  # Get extra for filtering
+                score_threshold=score_threshold,
+                with_payload=True,
+                with_vectors=False,  # Don't return vectors to save bandwidth
+            )

            logger.info(
                f"Qdrant returned {len(search_response.points)} results "
@@ -220,7 +208,6 @@ def configure_semantic_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("semantic:read")
-    @instrument_tool
    async def nc_semantic_search_answer(
        query: str,
        ctx: Context,
@@ -344,71 +331,21 @@ def configure_semantic_tools(mcp: FastMCP):
                success=True,
            )

-        # 4. Fetch full content for notes to provide complete context to LLM
-        # Filter out inaccessible notes (deleted or permissions changed)
-        client = await get_client(ctx)
-        accessible_results = []
-        full_contents = []  # Full content for accessible notes
-
-        for result in search_response.results:
-            if result.doc_type == "note":
-                try:
-                    note = await client.notes.get_note(result.id)
-                    # Note is accessible, store full content
-                    accessible_results.append(result)
-                    full_contents.append(note.get("content", ""))
-                    logger.debug(
-                        f"Fetched full content for note {result.id} "
-                        f"(length: {len(full_contents[-1])} chars)"
-                    )
-                except Exception as e:
-                    # Note might have been deleted or permissions changed
-                    # Filter it out to avoid corrupting LLM with inaccessible data
-                    logger.warning(
-                        f"Failed to fetch full content for note {result.id}: {e}. "
-                        f"Excluding from results."
-                    )
-            else:
-                # Non-note document types (future: calendar, deck, files)
-                # For now, keep them with excerpts
-                accessible_results.append(result)
-                full_contents.append(None)
-
-        # Check if we filtered out all results
-        if not accessible_results:
-            logger.warning(f"All search results became inaccessible for query: {query}")
-            return SamplingSearchResponse(
-                query=query,
-                generated_answer="All matching documents are no longer accessible.",
-                sources=[],
-                total_found=0,
-                search_method="semantic_sampling",
-                success=True,
-            )
-
-        # 5. Construct context from accessible documents with full content
+        # 4. Construct context from retrieved documents
        context_parts = []
-        for idx, (result, content) in enumerate(
-            zip(accessible_results, full_contents), 1
-        ):
-            # Use full content if available (notes), otherwise use excerpt
-            if content is not None:
-                content_field = f"Content: {content}"
-            else:
-                content_field = f"Excerpt: {result.excerpt}"
-
+        for idx, result in enumerate(search_response.results, 1):
            context_parts.append(
                f"[Document {idx}]\n"
                f"Type: {result.doc_type}\n"
                f"Title: {result.title}\n"
                f"Category: {result.category}\n"
-                f"{content_field}\n"
+                f"Excerpt: {result.excerpt}\n"
                f"Relevance Score: {result.score:.2f}\n"
            )

        context = "\n".join(context_parts)

-        # 6. Construct prompt - reuse user's query, add context and instructions
+        # 5. Construct prompt - reuse user's query, add context and instructions
        prompt = (
            f"{query}\n\n"
            f"Here are relevant documents from Nextcloud (notes, calendar events, deck cards, files, contacts):\n\n"
@@ -464,8 +401,8 @@ def configure_semantic_tools(mcp: FastMCP):
            return SamplingSearchResponse(
                query=query,
                generated_answer=generated_answer,
-                sources=accessible_results,
-                total_found=len(accessible_results),
+                sources=search_response.results,
+                total_found=search_response.total_found,
                search_method="semantic_sampling",
                model_used=sampling_result.model,
                stop_reason=sampling_result.stopReason,
@@ -482,11 +419,11 @@ def configure_semantic_tools(mcp: FastMCP):
                generated_answer=(
                    f"[Sampling request timed out]\n\n"
                    f"The answer generation took too long (>30s). "
-                    f"Found {len(accessible_results)} relevant documents. "
+                    f"Found {search_response.total_found} relevant documents. "
                    f"Please review the sources below or try a simpler query."
                ),
-                sources=accessible_results,
-                total_found=len(accessible_results),
+                sources=search_response.results,
+                total_found=search_response.total_found,
                search_method="semantic_sampling_timeout",
                success=True,
            )
@@ -517,11 +454,11 @@ def configure_semantic_tools(mcp: FastMCP):
                query=query,
                generated_answer=(
                    f"[{user_message}]\n\n"
-                    f"Found {len(accessible_results)} relevant documents. "
+                    f"Found {search_response.total_found} relevant documents. "
                    f"Please review the sources below."
                ),
-                sources=accessible_results,
-                total_found=len(accessible_results),
+                sources=search_response.results,
+                total_found=search_response.total_found,
                search_method=search_method,
                success=True,
            )
@@ -538,18 +475,17 @@ def configure_semantic_tools(mcp: FastMCP):
                query=query,
                generated_answer=(
                    f"[Unexpected error during sampling]\n\n"
-                    f"Found {len(accessible_results)} relevant documents. "
+                    f"Found {search_response.total_found} relevant documents. "
                    f"Please review the sources below."
                ),
-                sources=accessible_results,
-                total_found=len(accessible_results),
+                sources=search_response.results,
+                total_found=search_response.total_found,
                search_method="semantic_sampling_error",
                success=True,
            )

    @mcp.tool()
    @require_scopes("semantic:read")
-    @instrument_tool
    async def nc_get_vector_sync_status(ctx: Context) -> VectorSyncStatusResponse:
        """Get the current vector sync status.

@@ -6,7 +6,6 @@ from mcp.server.fastmcp import Context, FastMCP

 from nextcloud_mcp_server.auth import require_scopes
 from nextcloud_mcp_server.context import get_client
-from nextcloud_mcp_server.observability.metrics import instrument_tool


 def configure_sharing_tools(mcp: FastMCP):
@@ -18,7 +17,6 @@ def configure_sharing_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("sharing:write")
-    @instrument_tool
    async def nc_share_create(
        path: str,
        share_with: str,
@@ -58,7 +56,6 @@ def configure_sharing_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("sharing:write")
-    @instrument_tool
    async def nc_share_delete(share_id: int, ctx: Context) -> str:
        """Delete a share by its ID.

@@ -78,7 +75,6 @@ def configure_sharing_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("sharing:write")
-    @instrument_tool
    async def nc_share_get(share_id: int, ctx: Context) -> str:
        """Get information about a specific share.

@@ -97,7 +93,6 @@ def configure_sharing_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("sharing:write")
-    @instrument_tool
    async def nc_share_list(
        ctx: Context, path: str | None = None, shared_with_me: bool = False
    ) -> str:
@@ -119,7 +114,6 @@ def configure_sharing_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("sharing:write")
-    @instrument_tool
    async def nc_share_update(share_id: int, permissions: int, ctx: Context) -> str:
        """Update the permissions of an existing share.

@@ -4,7 +4,6 @@ from mcp.server.fastmcp import Context, FastMCP

 from nextcloud_mcp_server.auth import require_scopes
 from nextcloud_mcp_server.context import get_client
-from nextcloud_mcp_server.observability.metrics import instrument_tool

 logger = logging.getLogger(__name__)

@@ -13,7 +12,6 @@ def configure_tables_tools(mcp: FastMCP):
    # Tables tools
    @mcp.tool()
    @require_scopes("tables:read")
-    @instrument_tool
    async def nc_tables_list_tables(ctx: Context):
        """List all tables available to the user"""
        client = await get_client(ctx)
@@ -21,7 +19,6 @@ def configure_tables_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("tables:read")
-    @instrument_tool
    async def nc_tables_get_schema(table_id: int, ctx: Context):
        """Get the schema/structure of a specific table including columns and views"""
        client = await get_client(ctx)
@@ -29,7 +26,6 @@ def configure_tables_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("tables:read")
-    @instrument_tool
    async def nc_tables_read_table(
        table_id: int,
        ctx: Context,
@@ -42,7 +38,6 @@ def configure_tables_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("tables:write")
-    @instrument_tool
    async def nc_tables_insert_row(table_id: int, data: dict, ctx: Context):
        """Insert a new row into a table.

@@ -53,7 +48,6 @@ def configure_tables_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("tables:write")
-    @instrument_tool
    async def nc_tables_update_row(row_id: int, data: dict, ctx: Context):
        """Update an existing row in a table.

@@ -64,7 +58,6 @@ def configure_tables_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("tables:write")
-    @instrument_tool
    async def nc_tables_delete_row(row_id: int, ctx: Context):
        """Delete a row from a table"""
        client = await get_client(ctx)
@@ -5,7 +5,6 @@ from mcp.server.fastmcp import Context, FastMCP
 from nextcloud_mcp_server.auth import require_scopes
 from nextcloud_mcp_server.context import get_client
 from nextcloud_mcp_server.models import DirectoryListing, FileInfo, SearchFilesResponse
-from nextcloud_mcp_server.observability.metrics import instrument_tool
 from nextcloud_mcp_server.utils.document_parser import (
    is_parseable_document,
    parse_document,
@@ -18,7 +17,6 @@ def configure_webdav_tools(mcp: FastMCP):
    # WebDAV file system tools
    @mcp.tool()
    @require_scopes("files:read")
-    @instrument_tool
    async def nc_webdav_list_directory(
        ctx: Context, path: str = ""
    ) -> DirectoryListing:
@@ -52,7 +50,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:read")
-    @instrument_tool
    async def nc_webdav_read_file(path: str, ctx: Context):
        """Read the content of a file from NextCloud.

@@ -133,7 +130,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:write")
-    @instrument_tool
    async def nc_webdav_write_file(
        path: str, content: str, ctx: Context, content_type: str | None = None
    ):
@@ -162,7 +158,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:write")
-    @instrument_tool
    async def nc_webdav_create_directory(path: str, ctx: Context):
        """Create a directory in NextCloud.

@@ -177,7 +172,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:write")
-    @instrument_tool
    async def nc_webdav_delete_resource(path: str, ctx: Context):
        """Delete a file or directory in NextCloud.

@@ -192,7 +186,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:write")
-    @instrument_tool
    async def nc_webdav_move_resource(
        source_path: str, destination_path: str, ctx: Context, overwrite: bool = False
    ):
@@ -213,7 +206,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:write")
-    @instrument_tool
    async def nc_webdav_copy_resource(
        source_path: str, destination_path: str, ctx: Context, overwrite: bool = False
    ):
@@ -234,7 +226,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:read")
-    @instrument_tool
    async def nc_webdav_search_files(
        ctx: Context,
        scope: str = "",
@@ -351,7 +342,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:read")
-    @instrument_tool
    async def nc_webdav_find_by_name(
        pattern: str, ctx: Context, scope: str = "", limit: int | None = None
    ) -> SearchFilesResponse:
@@ -379,7 +369,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:read")
-    @instrument_tool
    async def nc_webdav_find_by_type(
        mime_type: str, ctx: Context, scope: str = "", limit: int | None = None
    ) -> SearchFilesResponse:
@@ -407,7 +396,6 @@ def configure_webdav_tools(mcp: FastMCP):

    @mcp.tool()
    @require_scopes("files:read")
-    @instrument_tool
    async def nc_webdav_list_favorites(
        ctx: Context, scope: str = "", limit: int | None = None
    ) -> SearchFilesResponse:
@@ -15,11 +15,6 @@ from qdrant_client.models import FieldCondition, Filter, MatchValue, PointStruct
 from nextcloud_mcp_server.client import NextcloudClient
 from nextcloud_mcp_server.config import get_settings
 from nextcloud_mcp_server.embedding import get_embedding_service
-from nextcloud_mcp_server.observability.metrics import (
-    record_qdrant_operation,
-    record_vector_sync_processing,
-    update_vector_sync_queue_size,
-)
 from nextcloud_mcp_server.observability.tracing import trace_operation
 from nextcloud_mcp_server.vector.document_chunker import DocumentChunker
 from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
@@ -62,21 +57,11 @@ async def processor_task(
            with anyio.fail_after(1.0):
                doc_task = await receive_stream.receive()

-            # Update queue size metric after receiving
-            stream_stats = receive_stream.statistics()
-            update_vector_sync_queue_size(stream_stats.current_buffer_used)
-
            # Process document
            await process_document(doc_task, nc_client)

-            # Update queue size metric after processing
-            stream_stats = receive_stream.statistics()
-            update_vector_sync_queue_size(stream_stats.current_buffer_used)
-
        except TimeoutError:
-            # No documents available, update metric to show empty queue
-            stream_stats = receive_stream.statistics()
-            update_vector_sync_queue_size(stream_stats.current_buffer_used)
+            # No documents available, continue
            continue

        except anyio.EndOfStream:
@@ -105,8 +90,6 @@ async def process_document(doc_task: DocumentTask, nc_client: NextcloudClient):
        doc_task: Document task to process
        nc_client: Authenticated Nextcloud client
    """
-    start_time = time.time()
-
    logger.debug(
        f"Processing {doc_task.doc_type}_{doc_task.doc_id} "
        f"for {doc_task.user_id} ({doc_task.operation})"
@@ -122,79 +105,58 @@ async def process_document(doc_task: DocumentTask, nc_client: NextcloudClient):
            "vector_sync.doc_operation": doc_task.operation,
        },
    ):
-        try:
-            qdrant_client = await get_qdrant_client()
-            settings = get_settings()
+        qdrant_client = await get_qdrant_client()
+        settings = get_settings()

-            # Handle deletion
-            if doc_task.operation == "delete":
-                await qdrant_client.delete(
-                    collection_name=settings.get_collection_name(),
-                    points_selector=Filter(
-                        must=[
-                            FieldCondition(
-                                key="user_id",
-                                match=MatchValue(value=doc_task.user_id),
-                            ),
-                            FieldCondition(
-                                key="doc_id",
-                                match=MatchValue(value=doc_task.doc_id),
-                            ),
-                            FieldCondition(
-                                key="doc_type",
-                                match=MatchValue(value=doc_task.doc_type),
-                            ),
-                        ]
-                    ),
-                )
-                logger.info(
-                    f"Deleted {doc_task.doc_type}_{doc_task.doc_id} for {doc_task.user_id}"
-                )
+        # Handle deletion
+        if doc_task.operation == "delete":
+            await qdrant_client.delete(
+                collection_name=settings.get_collection_name(),
+                points_selector=Filter(
+                    must=[
+                        FieldCondition(
+                            key="user_id",
+                            match=MatchValue(value=doc_task.user_id),
+                        ),
+                        FieldCondition(
+                            key="doc_id",
+                            match=MatchValue(value=doc_task.doc_id),
+                        ),
+                        FieldCondition(
+                            key="doc_type",
+                            match=MatchValue(value=doc_task.doc_type),
+                        ),
+                    ]
+                ),
+            )
+            logger.info(
+                f"Deleted {doc_task.doc_type}_{doc_task.doc_id} for {doc_task.user_id}"
+            )
+            return

-                # Record successful deletion metrics
-                duration = time.time() - start_time
-                record_qdrant_operation("delete", "success")
-                record_vector_sync_processing(duration, "success")
-                return
+        # Handle indexing with retry
+        max_retries = 3
+        retry_delay = 1.0

-            # Handle indexing with retry
-            max_retries = 3
-            retry_delay = 1.0
+        for attempt in range(max_retries):
+            try:
+                await _index_document(doc_task, nc_client, qdrant_client)
+                return  # Success

-            for attempt in range(max_retries):
-                try:
-                    await _index_document(doc_task, nc_client, qdrant_client)
-
-                    # Record successful processing metrics
-                    duration = time.time() - start_time
-                    record_qdrant_operation("upsert", "success")
-                    record_vector_sync_processing(duration, "success")
-                    return  # Success
-
-                except (HTTPStatusError, Exception) as e:
-                    if attempt < max_retries - 1:
-                        logger.warning(
-                            f"Retry {attempt + 1}/{max_retries} for "
-                            f"{doc_task.doc_type}_{doc_task.doc_id}: {e}"
-                        )
-                        await anyio.sleep(retry_delay)
-                        retry_delay *= 2  # Exponential backoff
-                    else:
-                        logger.error(
-                            f"Failed to index {doc_task.doc_type}_{doc_task.doc_id} "
-                            f"after {max_retries} retries: {e}"
-                        )
-                        # Record failed processing metrics
-                        duration = time.time() - start_time
-                        record_qdrant_operation("upsert", "error")
-                        record_vector_sync_processing(duration, "error")
-                        raise
-
-        except Exception:
-            # Catch any other unexpected errors
-            duration = time.time() - start_time
-            record_vector_sync_processing(duration, "error")
-            raise
+            except (HTTPStatusError, Exception) as e:
+                if attempt < max_retries - 1:
+                    logger.warning(
+                        f"Retry {attempt + 1}/{max_retries} for "
+                        f"{doc_task.doc_type}_{doc_task.doc_id}: {e}"
+                    )
+                    await anyio.sleep(retry_delay)
+                    retry_delay *= 2  # Exponential backoff
+                else:
+                    logger.error(
+                        f"Failed to index {doc_task.doc_type}_{doc_task.doc_id} "
+                        f"after {max_retries} retries: {e}"
+                    )
+                    raise


 async def _index_document(
@@ -13,7 +13,6 @@ from qdrant_client.models import FieldCondition, Filter, MatchValue

 from nextcloud_mcp_server.client import NextcloudClient
 from nextcloud_mcp_server.config import get_settings
-from nextcloud_mcp_server.observability.metrics import record_vector_sync_scan
 from nextcloud_mcp_server.observability.tracing import trace_operation
 from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client

@@ -182,9 +181,6 @@ async def scan_user_documents(
        ]
        logger.info(f"[SCAN-{scan_id}] Found {len(notes)} notes for {user_id}")

-        # Record documents scanned
-        record_vector_sync_scan(len(notes))
-
        if initial_sync:
            # Send everything on first sync
            for note in notes:
@@ -1,6 +1,6 @@
 [project]
 name = "nextcloud-mcp-server"
-version = "0.33.1"
+version = "0.32.1"
 description = "Model Context Protocol (MCP) server for Nextcloud integration - enables AI assistants to interact with Nextcloud data"
 authors = [
    {name = "Chris Coutinho", email = "chris@coutinho.io"}
@@ -1053,7 +1053,7 @@ wheels = [

 [[package]]
 name = "nextcloud-mcp-server"
-version = "0.33.1"
+version = "0.32.1"
 source = { editable = "." }
 dependencies = [
    { name = "aiosqlite" },