Compare commits

..

29 Commits

Author SHA1 Message Date
github-actions[bot] a0576aa9a2 bump: version 0.29.1 → 0.29.2 2025-11-09 18:28:34 +00:00
Chris Coutinho 4a6c60113b fix(helm): Set default strategy to Recreate 2025-11-09 19:27:55 +01:00
Chris Coutinho a0cb1ac9fe Merge pull request #281 from cbcoutinho/renovate/qdrant-1.x
chore(deps): update helm release qdrant to v1
2025-11-09 18:38:22 +01:00
renovate-bot-cbcoutinho[bot] de4f1032aa chore(deps): update helm release qdrant to v1 2025-11-09 17:08:13 +00:00
Chris Coutinho 178be5da6d Merge pull request #279 from cbcoutinho/renovate/ollama-1.x
chore(deps): update helm release ollama to v1.34.0
2025-11-09 18:04:08 +01:00
Chris Coutinho 61d8c851c9 Merge pull request #272 from cbcoutinho/renovate/softprops-action-gh-release-2.x
chore(deps): update softprops/action-gh-release action to v2.4.2
2025-11-09 17:02:19 +01:00
Chris Coutinho a8c63c8379 Merge pull request #278 from cbcoutinho/renovate/azure-setup-helm-4.x
chore(deps): update azure/setup-helm action to v4.3.1
2025-11-09 17:01:59 +01:00
renovate-bot-cbcoutinho[bot] 3147180ccd chore(deps): update helm release ollama to v1.34.0 2025-11-09 11:08:18 +00:00
renovate-bot-cbcoutinho[bot] 380578dd2e chore(deps): update softprops/action-gh-release action to v2.4.2 2025-11-09 11:07:57 +00:00
renovate-bot-cbcoutinho[bot] 10c5557aea chore(deps): update azure/setup-helm action to v4.3.1 2025-11-09 11:07:52 +00:00
github-actions[bot] 7772b1ac2e bump: version 0.29.0 → 0.29.1 2025-11-09 08:54:26 +00:00
Chris Coutinho 0513bec105 Merge pull request #275 from cbcoutinho/feature/observability-monitoring
fix(observability): isolate metrics endpoint to dedicated port
2025-11-09 09:54:00 +01:00
Chris Coutinho 4e89e92b65 fix(observability): isolate metrics endpoint to dedicated port
Security fix: Move Prometheus metrics endpoint from main HTTP port to
dedicated port 9090 to prevent external exposure of metrics data.

Changes:
- Use prometheus_client.start_http_server() for dedicated metrics server
- Remove /metrics route from main application routes
- Metrics now only accessible on port 9090 (configurable via METRICS_PORT)
- Main application port no longer serves /metrics endpoint

This follows security best practice of isolating monitoring endpoints
from application traffic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 09:53:36 +01:00
github-actions[bot] af96378cb6 bump: version 0.28.0 → 0.29.0 2025-11-09 08:29:53 +00:00
Chris Coutinho c5da11aa4c Merge pull request #274 from cbcoutinho/feature/observability-monitoring
feature/observability monitoring
2025-11-09 09:29:25 +01:00
Chris Coutinho 5e4667a643 fix(readiness): Only check external Qdrant in network mode
The readiness probe incorrectly tried to connect to an external Qdrant service
even when using memory or persistent mode (embedded Qdrant). This caused pods
to never become ready in Kubernetes deployments using the default configuration.

Root cause:
- In memory/persistent modes, QDRANT_URL env var is NOT set
- Readiness check used default 'http://qdrant:6333' anyway
- Tried to connect to non-existent service
- Connection failed -> 503 -> pod stuck in not-ready state

Fix:
- Only check external Qdrant health if QDRANT_URL is explicitly set (network mode)
- For embedded modes (memory/persistent), report status as 'embedded' without blocking
- Background scanner tasks don't block readiness (already non-blocking via anyio.start_soon)

This allows pods to become ready immediately when using embedded Qdrant,
while still validating external Qdrant connectivity in network mode.

Fixes: Kubernetes pods failing readiness check with default Qdrant configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 09:28:09 +01:00
Chris Coutinho 093ac5b5ba feat(helm): Add observability support with ServiceMonitor and Grafana dashboard
Add comprehensive observability configuration to Helm chart:

**Helm Values:**
- Add observability configuration section for metrics, tracing, and logging
- Add serviceMonitor configuration (disabled by default)
- Add prometheusRule configuration (disabled by default)

**Templates:**
- Update deployment to include observability environment variables
- Update deployment to expose metrics port (9090)
- Update service to expose metrics port
- Add ServiceMonitor template for Prometheus Operator
- Add PrometheusRule template with critical and warning alerts

**Dashboards:**
- Add comprehensive Grafana dashboard JSON with 6 panels:
  - Request Rate (by method and endpoint)
  - Error Rate (5xx errors percentage)
  - Request Latency (P50/P95 by endpoint)
  - Top MCP Tools (by invocation volume)
  - Nextcloud API Latency (by app)
  - Vector Sync Queue Size
- Add dashboard README with import instructions

**Alert Rules:**
- Critical: Server down, high error rate (>5%), high latency (>1s), dependency down
- Warning: Token validation errors (>1%), vector sync queue high (>100), Qdrant slow (>500ms)

All features are opt-in via values.yaml configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 09:10:11 +01:00
github-actions[bot] ae81f0334e bump: version 0.27.3 → 0.28.0 2025-11-09 08:04:06 +00:00
Chris Coutinho 23f3a231a5 Merge pull request #273 from cbcoutinho/feature/observability-monitoring
Feature/observability monitoring
2025-11-09 09:03:40 +01:00
Chris Coutinho 7be40a33e1 fix(vector): Handle missing 'modified' field in notes gracefully
The vector scanner crashed when encountering notes without a 'modified' field,
causing KeyError and preventing initial sync from completing.

Changes:
- Use dict.get() with fallback value (0) instead of direct key access
- Log warnings for notes missing 'modified' field
- Apply fix to both initial sync and incremental sync code paths

This ensures the scanner continues processing all notes even if some have
missing metadata fields, preventing scanner crashes that could affect
deployment readiness.

Fixes: Notes without 'modified' field causing scanner crash and readiness check failure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 09:03:05 +01:00
Chris Coutinho 578de4d7d6 feat(observability): Add comprehensive monitoring with Prometheus and OpenTelemetry
- Add Prometheus metrics for HTTP, MCP tools, Nextcloud API, OAuth, vector sync, and DB operations
- Add OpenTelemetry distributed tracing with OTLP export
- Add structured JSON logging with trace context correlation
- Add ObservabilityMiddleware for automatic HTTP instrumentation
- Add app_name attribute to all client classes for per-app metrics
- Add configuration for metrics, tracing, and logging via environment variables
- Add documentation in docs/observability.md
- Fix graceful degradation when tracing is disabled (default state)
- Fix uvicorn logging configuration to use observability formatters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 08:54:04 +01:00
github-actions[bot] 8f0f989c6d bump: version 0.27.2 → 0.27.3 2025-11-09 06:52:31 +00:00
Chris Coutinho f8a2935c22 fix(ci): Use helm dependency build instead of update to use Chart.lock 2025-11-09 07:52:00 +01:00
github-actions[bot] 137dc80075 bump: version 0.27.1 → 0.27.2 2025-11-09 06:45:44 +00:00
Chris Coutinho 725ac65e6a fix(helm): update Qdrant dependency condition to match new mode structure
The Qdrant subchart was being included by default even in memory/persistent
modes. Changed the dependency condition from `qdrant.enabled` to
`qdrant.networkMode.deploySubchart` to align with the three-mode structure.

Now the Qdrant subchart is ONLY deployed when:
- qdrant.mode: "network"
- qdrant.networkMode.deploySubchart: true

Verified all three modes:
- Memory mode (:memory:): No subchart, QDRANT_LOCATION=:memory:
- Persistent mode (path): No subchart, QDRANT_LOCATION=/app/data/qdrant, PVC created
- Network mode (subchart): Qdrant subchart deployed, QDRANT_URL=http://...:6333
- Network mode (external): No subchart, QDRANT_URL=<external-url>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 07:45:06 +01:00
github-actions[bot] f51edff25d bump: version 0.27.0 → 0.27.1 2025-11-09 06:22:00 +00:00
Chris Coutinho 50ba6ccc88 fix(ci): add Helm repository setup to chart release workflow
The chart-releaser was failing because it couldn't resolve the
dependencies (Qdrant and Ollama subcharts) when packaging.

Changes:
- Add azure/setup-helm action to install Helm v3.16.0
- Add step to add Qdrant and Ollama Helm repositories
- Run helm dependency update before chart-releaser runs

This fixes the error:
"Error: no repository definition for https://qdrant.github.io/qdrant-helm, https://otwld.github.io/ollama-helm"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 07:21:17 +01:00
github-actions[bot] 538bbc375e bump: version 0.26.1 → 0.27.0 2025-11-09 06:15:27 +00:00
Chris Coutinho d4c686eba7 Merge pull request #271 from cbcoutinho/docs/adr-007-background-vector-sync
feat: implement ADR-007 background vector sync and semantic search
2025-11-09 07:15:00 +01:00
34 changed files with 2923 additions and 27 deletions
+1 -1
View File
@@ -25,7 +25,7 @@ jobs:
github_token: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
changelog_increment_filename: body.md
- name: Release
uses: softprops/action-gh-release@6da8fa9354ddfdc4aeace5fc48d7f679b5214090 # v2.4.1
uses: softprops/action-gh-release@5be0e66d93ac7ed76da52eca8bb058f665c3a5fe # v2.4.2
with:
body_path: "body.md"
tag_name: v${{ env.REVISION }}
+12
View File
@@ -24,6 +24,18 @@ jobs:
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Install Helm
uses: azure/setup-helm@1a275c3b69536ee54be43f2070a358922e12c8d4 # v4.3.1
with:
version: v3.16.0
- name: Add Helm repositories and update dependencies
run: |
helm repo add qdrant https://qdrant.github.io/qdrant-helm
helm repo add ollama https://otwld.github.io/ollama-helm
helm repo update
helm dependency build charts/nextcloud-mcp-server
- name: Run chart-releaser
uses: helm/chart-releaser-action@cae68fefc6b5f367a0275617c9f83181ba54714f # v1.7.0
env:
+74
View File
@@ -1,3 +1,77 @@
## v0.29.2 (2025-11-09)
### Fix
- **helm**: Set default strategy to Recreate
## v0.29.1 (2025-11-09)
### Fix
- **observability**: isolate metrics endpoint to dedicated port
## v0.29.0 (2025-11-09)
### Feat
- **helm**: Add observability support with ServiceMonitor and Grafana dashboard
### Fix
- **readiness**: Only check external Qdrant in network mode
## v0.28.0 (2025-11-09)
### Feat
- **observability**: Add comprehensive monitoring with Prometheus and OpenTelemetry
### Fix
- **vector**: Handle missing 'modified' field in notes gracefully
## v0.27.3 (2025-11-09)
### Fix
- **ci**: Use helm dependency build instead of update to use Chart.lock
## v0.27.2 (2025-11-09)
### Fix
- **helm**: update Qdrant dependency condition to match new mode structure
## v0.27.1 (2025-11-09)
### Fix
- **ci**: add Helm repository setup to chart release workflow
## v0.27.0 (2025-11-09)
### Feat
- **helm**: add Qdrant local mode support with three deployment options [skip ci]
- add Qdrant local mode support with in-memory and persistent storage
- implement ADR-009 - refactor semantic search to use generic semantic:read scope
- implement MCP sampling for semantic search RAG (ADR-008)
- add optional vector database and semantic search to helm chart
- add vector sync processing status to /user/page endpoint
- implement semantic search tool and fix vector sync issues (ADR-007 Phase 3)
- implement vector sync scanner and processor (ADR-007 Phase 2)
### Fix
- implement deletion grace period and vector sync status tool
- remove unnecessary urllib3<2.0 constraint
- integrate vector sync tasks with Starlette lifespan for streamable-http
### Refactor
- migrate vector sync from asyncio.Queue to anyio memory object streams
- update to Qdrant query_points API and fix Playwright Keycloak login
## v0.26.1 (2025-11-08)
### Fix
+4 -4
View File
@@ -1,9 +1,9 @@
dependencies:
- name: qdrant
repository: https://qdrant.github.io/qdrant-helm
version: 0.9.0
version: 1.15.5
- name: ollama
repository: https://otwld.github.io/ollama-helm
version: 1.33.0
digest: sha256:c53b7a604d202460f60408a62025ae837cad8d4da970b1e5bb404e2b41289f94
generated: "2025-11-08T23:44:59.709689907+01:00"
version: 1.34.0
digest: sha256:d51c97d05be2614b751c0dd7267ef7dc959eff5ebef859c5f895c5c554b7a874
generated: "2025-11-09T17:08:02.86648061Z"
+5 -5
View File
@@ -2,8 +2,8 @@ apiVersion: v2
name: nextcloud-mcp-server
description: A Helm chart for Nextcloud MCP Server - enables AI assistants to interact with Nextcloud
type: application
version: 0.26.1
appVersion: "0.26.1"
version: 0.29.2
appVersion: "0.29.2"
keywords:
- nextcloud
- mcp
@@ -23,10 +23,10 @@ sources:
icon: https://raw.githubusercontent.com/nextcloud/server/master/core/img/logo/logo.svg
dependencies:
- name: qdrant
version: "0.9.0"
version: "1.15.5"
repository: https://qdrant.github.io/qdrant-helm
condition: qdrant.enabled
condition: qdrant.networkMode.deploySubchart
- name: ollama
version: "1.33.0"
version: "1.34.0"
repository: https://otwld.github.io/ollama-helm
condition: ollama.enabled
@@ -0,0 +1,90 @@
# Grafana Dashboards
This directory contains example Grafana dashboards for monitoring the Nextcloud MCP Server.
## Dashboards
### nextcloud-mcp-server.json
Comprehensive dashboard with the following panels:
- **Request Rate**: HTTP requests per second by method and endpoint
- **Error Rate**: Percentage of 5xx errors
- **Request Latency**: P50 and P95 latency by endpoint
- **Top MCP Tools**: Most frequently called tools
- **Nextcloud API Latency**: API call latency by app (notes, calendar, etc.)
- **Vector Sync Queue**: Queue size for background document processing
## Importing to Grafana
### Manual Import
1. Open Grafana UI
2. Navigate to Dashboards → Import
3. Upload `nextcloud-mcp-server.json`
4. Select your Prometheus data source
5. Click "Import"
### Automated Import (Kubernetes)
If using the Grafana Operator or kube-prometheus-stack, you can create a ConfigMap:
```bash
kubectl create configmap nextcloud-mcp-dashboards \
--from-file=nextcloud-mcp-server.json \
-n monitoring
# Add label for Grafana sidecar to discover
kubectl label configmap nextcloud-mcp-dashboards \
grafana_dashboard=1 \
-n monitoring
```
Or add to your Helm values:
```yaml
# values.yaml for kube-prometheus-stack
grafana:
dashboardProviders:
dashboardproviders.yaml:
apiVersion: 1
providers:
- name: 'nextcloud-mcp'
orgId: 1
folder: 'Nextcloud MCP'
type: file
disableDeletion: false
editable: true
options:
path: /var/lib/grafana/dashboards/nextcloud-mcp
dashboardsConfigMaps:
nextcloud-mcp: nextcloud-mcp-dashboards
```
## Dashboard Variables
The dashboard includes two variables:
- **Data Source**: Select your Prometheus data source
- **Namespace**: Filter metrics by Kubernetes namespace
## Customization
You can customize the dashboard by:
1. Adjusting refresh rate (default: 30s)
2. Modifying time range (default: last 6 hours)
3. Adding new panels for specific metrics
4. Adjusting thresholds in existing panels
## Metrics Reference
All metrics are documented in `/docs/observability.md`. Key metric prefixes:
- `mcp_http_*` - HTTP server metrics
- `mcp_tool_*` - MCP tool invocation metrics
- `mcp_nextcloud_api_*` - Nextcloud API call metrics
- `mcp_oauth_*` - OAuth token validation metrics
- `mcp_vector_sync_*` - Vector database sync metrics
- `mcp_db_*` - Database operation metrics
@@ -0,0 +1,630 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "grafana",
"uid": "-- Grafana --"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"tooltip": false,
"viz": false,
"legend": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "reqps"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": ["mean", "max"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"expr": "sum(rate(mcp_http_requests_total{namespace=\"$namespace\"}[5m])) by (method, endpoint)",
"legendFormat": "{{method}} {{endpoint}}",
"refId": "A"
}
],
"title": "Request Rate",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"tooltip": false,
"viz": false,
"legend": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "line"
}
},
"mappings": [],
"max": 100,
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 5
}
]
},
"unit": "percent"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"id": 2,
"options": {
"legend": {
"calcs": ["mean", "max"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"expr": "sum(rate(mcp_http_requests_total{status_code=~\"5..\", namespace=\"$namespace\"}[5m])) / sum(rate(mcp_http_requests_total{namespace=\"$namespace\"}[5m])) * 100",
"legendFormat": "Error Rate",
"refId": "A"
}
],
"title": "Error Rate (%)",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"tooltip": false,
"viz": false,
"legend": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "s"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 8
},
"id": 3,
"options": {
"legend": {
"calcs": ["mean", "max"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"expr": "histogram_quantile(0.95, sum(rate(mcp_http_request_duration_seconds_bucket{namespace=\"$namespace\"}[5m])) by (le, endpoint))",
"legendFormat": "{{endpoint}} (p95)",
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"expr": "histogram_quantile(0.50, sum(rate(mcp_http_request_duration_seconds_bucket{namespace=\"$namespace\"}[5m])) by (le, endpoint))",
"legendFormat": "{{endpoint}} (p50)",
"refId": "B"
}
],
"title": "Request Latency (P50/P95)",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"tooltip": false,
"viz": false,
"legend": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8
},
"id": 4,
"options": {
"legend": {
"calcs": ["mean", "max"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"expr": "topk(10, sum(rate(mcp_tool_calls_total{namespace=\"$namespace\"}[5m])) by (tool_name))",
"legendFormat": "{{tool_name}}",
"refId": "A"
}
],
"title": "Top MCP Tools by Volume",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"tooltip": false,
"viz": false,
"legend": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "s"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 16
},
"id": 5,
"options": {
"legend": {
"calcs": ["mean", "max"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"expr": "histogram_quantile(0.95, sum(rate(mcp_nextcloud_api_duration_seconds_bucket{namespace=\"$namespace\"}[5m])) by (le, app))",
"legendFormat": "{{app}} (p95)",
"refId": "A"
}
],
"title": "Nextcloud API Latency by App",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"tooltip": false,
"viz": false,
"legend": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 16
},
"id": 6,
"options": {
"legend": {
"calcs": ["mean", "lastNotNull"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"expr": "mcp_vector_sync_queue_size{namespace=\"$namespace\"}",
"legendFormat": "Queue Size",
"refId": "A"
}
],
"title": "Vector Sync Queue Size",
"type": "timeseries"
}
],
"refresh": "30s",
"schemaVersion": 38,
"style": "dark",
"tags": ["nextcloud", "mcp", "observability"],
"templating": {
"list": [
{
"current": {
"selected": false,
"text": "Prometheus",
"value": "Prometheus"
},
"hide": 0,
"includeAll": false,
"label": "Data Source",
"multi": false,
"name": "datasource",
"options": [],
"query": "prometheus",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"type": "datasource"
},
{
"current": {
"selected": false,
"text": "default",
"value": "default"
},
"datasource": {
"type": "prometheus",
"uid": "${datasource}"
},
"definition": "label_values(mcp_http_requests_total, namespace)",
"hide": 0,
"includeAll": false,
"label": "Namespace",
"multi": false,
"name": "namespace",
"options": [],
"query": {
"query": "label_values(mcp_http_requests_total, namespace)",
"refId": "PrometheusVariableQueryEditor-VariableQuery"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "Nextcloud MCP Server",
"uid": "nextcloud-mcp-server",
"version": 1,
"weekStart": ""
}
@@ -5,6 +5,8 @@ metadata:
labels:
{{- include "nextcloud-mcp-server.labels" . | nindent 4 }}
spec:
strategy:
type: Recreate
{{- if not .Values.autoscaling.enabled }}
replicas: {{ .Values.replicaCount }}
{{- end }}
@@ -56,6 +58,11 @@ spec:
- name: http
containerPort: {{ include "nextcloud-mcp-server.port" . }}
protocol: TCP
{{- if .Values.observability.metrics.enabled }}
- name: metrics
containerPort: {{ .Values.observability.metrics.port }}
protocol: TCP
{{- end }}
env:
# Nextcloud connection
- name: NEXTCLOUD_HOST
@@ -200,6 +207,27 @@ spec:
value: {{ .Values.openai.baseUrl | quote }}
{{- end }}
{{- end }}
# Observability
- name: METRICS_ENABLED
value: {{ .Values.observability.metrics.enabled | quote }}
- name: METRICS_PORT
value: {{ .Values.observability.metrics.port | quote }}
{{- if .Values.observability.tracing.enabled }}
- name: OTEL_ENABLED
value: "true"
- name: OTEL_EXPORTER_OTLP_ENDPOINT
value: {{ .Values.observability.tracing.endpoint | quote }}
- name: OTEL_SERVICE_NAME
value: {{ .Values.observability.tracing.serviceName | quote }}
- name: OTEL_TRACES_SAMPLER_ARG
value: {{ .Values.observability.tracing.samplingRate | quote }}
{{- end }}
- name: LOG_FORMAT
value: {{ .Values.observability.logging.format | quote }}
- name: LOG_LEVEL
value: {{ .Values.observability.logging.level | quote }}
- name: LOG_INCLUDE_TRACE_CONTEXT
value: {{ .Values.observability.logging.includeTraceContext | quote }}
{{- with .Values.extraEnv }}
{{- toYaml . | nindent 12 }}
{{- end }}
@@ -0,0 +1,92 @@
{{- if and .Values.observability.metrics.enabled .Values.prometheusRule.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: {{ include "nextcloud-mcp-server.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "nextcloud-mcp-server.labels" . | nindent 4 }}
{{- with .Values.prometheusRule.labels }}
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
groups:
- name: nextcloud-mcp-server.critical
interval: 30s
rules:
- alert: NextcloudMCPServerDown
expr: up{job="{{ include "nextcloud-mcp-server.fullname" . }}"} == 0
for: 5m
labels:
severity: critical
annotations:
summary: "Nextcloud MCP Server is down"
description: "{{ `{{` }} $labels.pod {{ `}}` }} has been down for more than 5 minutes."
- alert: NextcloudMCPHighErrorRate
expr: |
sum(rate(mcp_http_requests_total{status_code=~"5..", job="{{ include "nextcloud-mcp-server.fullname" . }}"}[5m]))
/ sum(rate(mcp_http_requests_total{job="{{ include "nextcloud-mcp-server.fullname" . }}"}[5m])) > 0.05
for: 5m
labels:
severity: critical
annotations:
summary: "High error rate on Nextcloud MCP Server"
description: "Error rate is {{ `{{` }} printf \"%.2f%%\" (mul $value 100) {{ `}}` }} (threshold: 5%)"
- alert: NextcloudMCPHighLatency
expr: |
histogram_quantile(0.95,
sum(rate(mcp_http_request_duration_seconds_bucket{job="{{ include "nextcloud-mcp-server.fullname" . }}"}[5m])) by (le, endpoint)
) > 1
for: 5m
labels:
severity: critical
annotations:
summary: "High latency on Nextcloud MCP Server"
description: "P95 latency is {{ `{{` }} printf \"%.2fs\" $value {{ `}}` }} on {{ `{{` }} $labels.endpoint {{ `}}` }} (threshold: 1s)"
- alert: NextcloudMCPDependencyDown
expr: mcp_dependency_health{job="{{ include "nextcloud-mcp-server.fullname" . }}"} == 0
for: 2m
labels:
severity: critical
annotations:
summary: "Nextcloud MCP dependency is down"
description: "Dependency {{ `{{` }} $labels.dependency {{ `}}` }} has been down for more than 2 minutes."
- name: nextcloud-mcp-server.warning
interval: 30s
rules:
- alert: NextcloudMCPTokenValidationErrors
expr: |
sum(rate(mcp_oauth_token_validations_total{result="error", job="{{ include "nextcloud-mcp-server.fullname" . }}"}[10m]))
/ sum(rate(mcp_oauth_token_validations_total{job="{{ include "nextcloud-mcp-server.fullname" . }}"}[10m])) > 0.01
for: 10m
labels:
severity: warning
annotations:
summary: "High token validation error rate"
description: "Token validation error rate is {{ `{{` }} printf \"%.2f%%\" (mul $value 100) {{ `}}` }} (threshold: 1%)"
- alert: NextcloudMCPVectorSyncQueueHigh
expr: mcp_vector_sync_queue_size{job="{{ include "nextcloud-mcp-server.fullname" . }}"} > 100
for: 15m
labels:
severity: warning
annotations:
summary: "Vector sync queue is high"
description: "Vector sync queue size is {{ `{{` }} $value {{ `}}` }} (threshold: 100)"
- alert: NextcloudMCPQdrantSlowQueries
expr: |
histogram_quantile(0.95,
sum(rate(mcp_db_operation_duration_seconds_bucket{db="qdrant", job="{{ include "nextcloud-mcp-server.fullname" . }}"}[10m])) by (le)
) > 0.5
for: 10m
labels:
severity: warning
annotations:
summary: "Qdrant queries are slow"
description: "P95 Qdrant query latency is {{ `{{` }} printf \"%.2fs\" $value {{ `}}` }} (threshold: 0.5s)"
{{- end }}
@@ -15,5 +15,11 @@ spec:
targetPort: http
protocol: TCP
name: http
{{- if .Values.observability.metrics.enabled }}
- port: {{ .Values.observability.metrics.port }}
targetPort: metrics
protocol: TCP
name: metrics
{{- end }}
selector:
{{- include "nextcloud-mcp-server.selectorLabels" . | nindent 4 }}
@@ -0,0 +1,32 @@
{{- if and .Values.observability.metrics.enabled .Values.serviceMonitor.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: {{ include "nextcloud-mcp-server.fullname" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "nextcloud-mcp-server.labels" . | nindent 4 }}
{{- with .Values.serviceMonitor.labels }}
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
selector:
matchLabels:
{{- include "nextcloud-mcp-server.selectorLabels" . | nindent 6 }}
endpoints:
- port: metrics
path: {{ .Values.observability.metrics.path }}
interval: {{ .Values.serviceMonitor.interval }}
scrapeTimeout: {{ .Values.serviceMonitor.scrapeTimeout }}
scheme: http
relabelings:
# Add namespace label
- sourceLabels: [__meta_kubernetes_namespace]
targetLabel: namespace
# Add pod label
- sourceLabels: [__meta_kubernetes_pod_name]
targetLabel: pod
# Add service label
- sourceLabels: [__meta_kubernetes_service_name]
targetLabel: service
{{- end }}
+37
View File
@@ -168,6 +168,43 @@ securityContext:
runAsNonRoot: true
runAsUser: 1000
# Observability Configuration
observability:
# Prometheus metrics
metrics:
enabled: true
port: 9090
path: /metrics
# OpenTelemetry tracing
tracing:
enabled: false
endpoint: "" # e.g., "http://opentelemetry-collector:4317"
serviceName: "nextcloud-mcp-server"
samplingRate: 1.0
# Logging configuration
logging:
format: json # "json" or "text"
level: INFO
includeTraceContext: true
# Prometheus ServiceMonitor (requires Prometheus Operator)
serviceMonitor:
enabled: false
interval: 30s
scrapeTimeout: 10s
labels: {}
# Additional labels for ServiceMonitor (e.g., for Prometheus selector)
# Example: { prometheus: kube-prometheus }
# Prometheus alert rules (requires Prometheus Operator)
prometheusRule:
enabled: false
labels: {}
# Additional labels for PrometheusRule (e.g., for Prometheus selector)
# Example: { prometheus: kube-prometheus }
service:
type: ClusterIP
port: 8000
+3 -1
View File
@@ -88,11 +88,13 @@ services:
- VECTOR_SYNC_SCAN_INTERVAL=10
- VECTOR_SYNC_PROCESSOR_WORKERS=1
- LOG_FORMAT=json
# Qdrant configuration (three modes):
# 1. Network mode: Set QDRANT_URL=http://qdrant:6333 (requires qdrant service)
# 2. In-memory mode: Set QDRANT_LOCATION=:memory: (default if nothing set)
# 3. Persistent local: Set QDRANT_LOCATION=/app/data/qdrant (stored in mcp-data volume)
- QDRANT_LOCATION=:memory:
- QDRANT_LOCATION=/app/data/qdrant
# - QDRANT_URL=http://qdrant:6333 # Uncomment for network mode
# - QDRANT_API_KEY=${QDRANT_API_KEY:-my_secret_api_key} # Only for network mode
- QDRANT_COLLECTION=nextcloud_content
+260
View File
@@ -0,0 +1,260 @@
# Observability and Monitoring
The Nextcloud MCP Server includes comprehensive observability features for production deployments:
- **Prometheus metrics** for monitoring performance and health
- **OpenTelemetry distributed tracing** for debugging request flows
- **Structured JSON logging** with trace correlation
- **Kubernetes integration** via ServiceMonitor and PrometheusRule
## Quick Start
### Local Development with Prometheus
```bash
# Enable metrics (enabled by default)
export METRICS_ENABLED=true
export METRICS_PORT=9090
# Enable tracing (optional)
export OTEL_ENABLED=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
# Start the server
docker-compose up -d mcp
```
Access metrics at: `http://localhost:9090/metrics`
### Kubernetes Deployment
Metrics are automatically scraped if you have Prometheus Operator installed:
```bash
helm install nextcloud-mcp charts/nextcloud-mcp-server \
--set observability.metrics.enabled=true \
--set observability.tracing.enabled=true \
--set observability.tracing.endpoint=http://opentelemetry-collector:4317 \
--set serviceMonitor.enabled=true
```
## Configuration
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `METRICS_ENABLED` | `true` | Enable Prometheus metrics |
| `METRICS_PORT` | `9090` | Port for metrics endpoint |
| `OTEL_ENABLED` | `false` | Enable OpenTelemetry tracing |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | - | OTLP gRPC endpoint (e.g., `http://otel-collector:4317`) |
| `OTEL_SERVICE_NAME` | `nextcloud-mcp-server` | Service name in traces |
| `OTEL_TRACES_SAMPLER` | `always_on` | Trace sampling strategy |
| `OTEL_TRACES_SAMPLER_ARG` | `1.0` | Sampling rate (0.0-1.0) |
| `LOG_FORMAT` | `json` | Log format (`json` or `text`) |
| `LOG_LEVEL` | `INFO` | Minimum log level |
| `LOG_INCLUDE_TRACE_CONTEXT` | `true` | Include trace IDs in logs |
### Helm Chart Configuration
```yaml
observability:
metrics:
enabled: true
port: 9090
path: /metrics
tracing:
enabled: true
endpoint: "http://opentelemetry-collector:4317"
samplingRate: 1.0
logging:
format: json
level: INFO
includeTraceContext: true
serviceMonitor:
enabled: true
interval: 30s
scrapeTimeout: 10s
```
## Metrics
### HTTP Server Metrics (RED)
- `mcp_http_requests_total` - Total HTTP requests
- `mcp_http_request_duration_seconds` - Request latency histogram
- `mcp_http_requests_in_progress` - In-flight requests gauge
### MCP Tool Metrics
- `mcp_tool_calls_total` - Tool invocation count by status
- `mcp_tool_duration_seconds` - Tool execution latency
- `mcp_tool_errors_total` - Tool errors by type
### Nextcloud API Metrics
- `mcp_nextcloud_api_requests_total` - API calls by app and status
- `mcp_nextcloud_api_duration_seconds` - API latency by app
- `mcp_nextcloud_api_retries_total` - Retry count (429, timeout, etc.)
### OAuth Flow Metrics
- `mcp_oauth_token_validations_total` - Token validation count
- `mcp_oauth_token_exchange_total` - Token exchange operations
- `mcp_oauth_token_cache_hits_total` - Cache hit/miss rate
- `mcp_oauth_refresh_token_operations_total` - Refresh token storage ops
### Vector Sync Metrics (when enabled)
- `mcp_vector_sync_documents_scanned_total` - Documents discovered
- `mcp_vector_sync_documents_processed_total` - Processing results
- `mcp_vector_sync_processing_duration_seconds` - Processing latency
- `mcp_vector_sync_queue_size` - Current queue depth
- `mcp_qdrant_operations_total` - Qdrant DB operations
### Database Metrics
- `mcp_db_operations_total` - DB operations (SQLite, Qdrant)
- `mcp_db_operation_duration_seconds` - DB latency
### Dependency Health
- `mcp_dependency_health` - External dependency status (1=up, 0=down)
- `mcp_dependency_check_duration_seconds` - Health check latency
## Distributed Tracing
### Span Hierarchy
```
HTTP POST /messages
├── mcp.tool.nc_notes_create_note
│ └── nextcloud.api.notes.POST
│ └── httpx request (auto-instrumented)
└── oauth.token.validate (if OAuth mode)
└── httpx request to IdP
```
### Span Attributes
- **MCP tools**: `mcp.tool.name`, `mcp.tool.args` (sanitized)
- **Nextcloud API**: `nextcloud.app`, `http.method`, `http.status_code`
- **OAuth**: `oauth.operation`, `oauth.method`
- **Vector sync**: `vector_sync.operation`, `vector_sync.document_count`
### Trace Context in Logs
When tracing is enabled, all logs include `trace_id` and `span_id`:
```json
{
"timestamp": "2025-01-09T12:34:56.789Z",
"level": "INFO",
"logger": "nextcloud_mcp_server.server.notes",
"message": "Note created successfully",
"trace_id": "a1b2c3d4e5f6...",
"span_id": "123456789abc...",
"note_id": 42
}
```
## Dashboards
### Prometheus Queries
**Request Rate (req/s)**:
```promql
sum(rate(mcp_http_requests_total[5m])) by (method, endpoint)
```
**Error Rate (%)**:
```promql
sum(rate(mcp_http_requests_total{status_code=~"5.."}[5m]))
/ sum(rate(mcp_http_requests_total[5m])) * 100
```
**P95 Latency**:
```promql
histogram_quantile(0.95,
sum(rate(mcp_http_request_duration_seconds_bucket[5m])) by (le, endpoint)
)
```
**Top Tools by Volume**:
```promql
topk(10, sum(rate(mcp_tool_calls_total[5m])) by (tool_name))
```
**Nextcloud API Health**:
```promql
sum(rate(mcp_nextcloud_api_requests_total{status_code!~"2.."}[5m])) by (app)
```
## Alerts
### Recommended Alert Rules
**Critical**:
- Server down for >5min
- Error rate >5% for >5min
- P95 latency >1s for >5min
- Dependency down for >2min
**Warning**:
- Token validation errors >1% for >10min
- Vector sync queue >100 for >15min
- Qdrant slow (p95 >500ms) for >10min
See `charts/nextcloud-mcp-server/templates/prometheusrule.yaml` for complete definitions.
## Troubleshooting
### Metrics Not Appearing
1. Check metrics are enabled: `curl http://localhost:9090/metrics`
2. Verify ServiceMonitor labels match Prometheus selector
3. Check Prometheus target status: `http://prometheus:9090/targets`
### Traces Not Appearing
1. Verify OTLP endpoint is reachable: `curl http://otel-collector:4317`
2. Check collector logs for errors
3. Verify sampling rate is not 0.0
4. Check trace backend (Jaeger/Tempo) connectivity
### High Cardinality Metrics
If you see cardinality warnings:
- Middleware normalizes endpoints (e.g., `/user/123``/user/*`)
- OAuth tokens are never included in metric labels
- User IDs are not tracked (use tracing for per-user debugging)
## Performance Impact
- **Metrics**: <1% overhead (counters/histograms are very fast)
- **Tracing**: ~2-5% overhead at 100% sampling
- **JSON logging**: <1% overhead vs text logging
**Recommendation**: Always enable metrics. Enable tracing in staging/production with 10-50% sampling.
## Architecture
The observability stack integrates at multiple layers:
1. **HTTP Layer**: `ObservabilityMiddleware` tracks all HTTP requests
2. **MCP Layer**: Tools use `@trace_mcp_tool` for span creation
3. **Client Layer**: `BaseNextcloudClient` tracks all API calls
4. **OAuth Layer**: Token operations are traced and metered
5. **Background Tasks**: Vector sync operations emit metrics/traces
All components use shared Prometheus `Registry` and OpenTelemetry `TracerProvider`.
## References
- [Prometheus Best Practices](https://prometheus.io/docs/practices/)
- [OpenTelemetry Python SDK](https://opentelemetry.io/docs/languages/python/)
- [Prometheus Operator](https://prometheus-operator.dev/)
- [Grafana Dashboards](https://grafana.com/docs/grafana/latest/dashboards/)
+57 -7
View File
@@ -32,13 +32,17 @@ from nextcloud_mcp_server.auth import (
from nextcloud_mcp_server.auth.unified_verifier import UnifiedTokenVerifier
from nextcloud_mcp_server.client import NextcloudClient
from nextcloud_mcp_server.config import (
LOGGING_CONFIG,
get_document_processor_config,
get_settings,
setup_logging,
)
from nextcloud_mcp_server.context import get_client as get_nextcloud_client
from nextcloud_mcp_server.document_processors import get_registry
from nextcloud_mcp_server.observability import (
ObservabilityMiddleware,
get_uvicorn_logging_config,
setup_metrics,
setup_tracing,
)
from nextcloud_mcp_server.server import (
configure_calendar_tools,
configure_contacts_tools,
@@ -776,7 +780,28 @@ async def setup_oauth_config():
def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
setup_logging()
# Initialize observability (logging will be configured by uvicorn)
settings = get_settings()
# Setup Prometheus metrics (always enabled by default)
if settings.metrics_enabled:
setup_metrics(port=settings.metrics_port)
logger.info(
f"Prometheus metrics enabled on dedicated port {settings.metrics_port}"
)
# Setup OpenTelemetry tracing (optional)
if settings.tracing_enabled:
setup_tracing(
service_name=settings.otel_service_name,
otlp_endpoint=settings.otel_exporter_otlp_endpoint,
sampling_rate=settings.otel_traces_sampler_arg,
)
logger.info(
f"OpenTelemetry tracing enabled (endpoint: {settings.otel_exporter_otlp_endpoint})"
)
else:
logger.info("OpenTelemetry tracing disabled (set OTEL_ENABLED=true to enable)")
# Determine authentication mode
oauth_enabled = is_oauth_mode()
@@ -1148,13 +1173,15 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
checks["auth_configured"] = "error: credentials not set"
is_ready = False
# Check Qdrant status if vector sync is enabled
# Check Qdrant status if using network mode (external Qdrant service)
# In-memory and persistent modes use embedded Qdrant, no external service to check
vector_sync_enabled = (
os.getenv("VECTOR_SYNC_ENABLED", "false").lower() == "true"
)
if vector_sync_enabled:
qdrant_url = os.getenv("QDRANT_URL") # Only set in network mode
if vector_sync_enabled and qdrant_url:
try:
qdrant_url = os.getenv("QDRANT_URL", "http://qdrant:6333")
async with httpx.AsyncClient(timeout=2.0) as client:
response = await client.get(f"{qdrant_url}/readyz")
if response.status_code == 200:
@@ -1165,6 +1192,9 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
except Exception as e:
checks["qdrant"] = f"error: {str(e)}"
is_ready = False
elif vector_sync_enabled:
# Using embedded Qdrant (memory or persistent mode)
checks["qdrant"] = "embedded"
status_code = 200 if is_ready else 503
return JSONResponse(
@@ -1183,6 +1213,9 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
routes.append(Route("/health/ready", health_ready, methods=["GET"]))
logger.info("Health check endpoints enabled: /health/live, /health/ready")
# Note: Metrics endpoint is NOT exposed on main HTTP port for security reasons.
# Metrics are served on dedicated port via setup_metrics() (default: 9090)
if oauth_enabled:
# Import OAuth routes (ADR-004 Progressive Consent)
from nextcloud_mcp_server.auth.oauth_routes import oauth_authorize
@@ -1374,6 +1407,11 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
expose_headers=["*"],
)
# Add observability middleware (metrics + tracing)
if settings.metrics_enabled or settings.tracing_enabled:
app.add_middleware(ObservabilityMiddleware)
logger.info("Observability middleware enabled (metrics and/or tracing)")
# Add exception handler for scope challenges (OAuth mode only)
if oauth_enabled:
@@ -1630,8 +1668,20 @@ def run(
app = get_app(transport=transport, enabled_apps=enabled_apps)
# Get observability settings and create uvicorn logging config
settings = get_settings()
uvicorn_log_config = get_uvicorn_logging_config(
log_format=settings.log_format,
log_level=settings.log_level,
include_trace_context=settings.log_include_trace_context,
)
uvicorn.run(
app=app, host=host, port=port, log_level=log_level, log_config=LOGGING_CONFIG
app=app,
host=host,
port=port,
log_level=log_level,
log_config=uvicorn_log_config,
)
+57 -4
View File
@@ -7,6 +7,12 @@ from functools import wraps
from httpx import AsyncClient, HTTPStatusError, RequestError, codes
from nextcloud_mcp_server.observability.metrics import (
record_nextcloud_api_call,
record_nextcloud_api_retry,
)
from nextcloud_mcp_server.observability.tracing import trace_nextcloud_api_call
logger = logging.getLogger(__name__)
@@ -38,6 +44,9 @@ def retry_on_429(func):
logger.warning(
f"429 Client Error: Too Many Requests, Number of attempts: {retries}"
)
# Record retry metric (extract app name from args if available)
if len(args) > 0 and hasattr(args[0], "app_name"):
record_nextcloud_api_retry(app=args[0].app_name, reason="429")
time.sleep(5)
elif e.response.status_code == 404:
# 404 errors are often expected (e.g., checking if attachments exist)
@@ -72,6 +81,9 @@ def retry_on_429(func):
class BaseNextcloudClient(ABC):
"""Base class for all Nextcloud app clients."""
# Subclasses should set this to identify the app for metrics/tracing
app_name: str = "unknown"
def __init__(self, http_client: AsyncClient, username: str):
"""Initialize with shared HTTP client and username.
@@ -88,7 +100,7 @@ class BaseNextcloudClient(ABC):
@retry_on_429
async def _make_request(self, method: str, url: str, **kwargs):
"""Common request wrapper with logging and error handling.
"""Common request wrapper with logging, tracing, and error handling.
Args:
method: HTTP method
@@ -99,6 +111,47 @@ class BaseNextcloudClient(ABC):
Response object
"""
logger.debug(f"Making {method} request to {url}")
response = await self._client.request(method, url, **kwargs)
response.raise_for_status()
return response
# Start timer for metrics
start_time = time.time()
status_code = 0
try:
# Wrap request in trace span
with trace_nextcloud_api_call(
app=self.app_name,
method=method,
path=url,
):
response = await self._client.request(method, url, **kwargs)
status_code = response.status_code
response.raise_for_status()
# Record successful API call metrics
duration = time.time() - start_time
record_nextcloud_api_call(
app=self.app_name,
method=method,
status_code=status_code,
duration=duration,
)
return response
except (HTTPStatusError, RequestError) as e:
# Record error metrics
if isinstance(e, HTTPStatusError):
status_code = e.response.status_code
else:
status_code = 0 # Connection error, no status code
duration = time.time() - start_time
record_nextcloud_api_call(
app=self.app_name,
method=method,
status_code=status_code,
duration=duration,
)
# Re-raise the exception
raise
+2
View File
@@ -13,6 +13,8 @@ logger = logging.getLogger(__name__)
class ContactsClient(BaseNextcloudClient):
"""Client for NextCloud CardDAV contact operations."""
app_name = "contacts"
def _get_carddav_base_path(self) -> str:
"""Helper to get the base CardDAV path for contacts."""
return f"/remote.php/dav/addressbooks/users/{self.username}"
+2
View File
@@ -13,6 +13,8 @@ logger = logging.getLogger(__name__)
class CookbookClient(BaseNextcloudClient):
"""Client for Nextcloud Cookbook app operations."""
app_name = "cookbook"
async def get_version(self) -> Dict[str, Any]:
"""Get Cookbook app and API version."""
response = await self._make_request("GET", "/apps/cookbook/api/version")
+2
View File
@@ -17,6 +17,8 @@ from nextcloud_mcp_server.models.deck import (
class DeckClient(BaseNextcloudClient):
"""Client for Nextcloud Deck app operations."""
app_name = "deck"
def _get_deck_headers(
self, additional_headers: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
+2
View File
@@ -11,6 +11,8 @@ logger = logging.getLogger(__name__)
class GroupsClient(BaseNextcloudClient):
"""Client for Nextcloud Groups API operations."""
app_name = "groups"
@retry_on_429
async def search_groups(
self,
+2
View File
@@ -11,6 +11,8 @@ logger = logging.getLogger(__name__)
class NotesClient(BaseNextcloudClient):
"""Client for Nextcloud Notes app operations."""
app_name = "notes"
async def get_settings(self) -> Dict[str, Any]:
"""Get Notes app settings."""
response = await self._make_request("GET", "/apps/notes/api/v1/settings")
+2
View File
@@ -11,6 +11,8 @@ logger = logging.getLogger(__name__)
class SharingClient(BaseNextcloudClient):
"""Client for Nextcloud OCS Sharing API operations."""
app_name = "sharing"
@retry_on_429
async def create_share(
self,
+2
View File
@@ -11,6 +11,8 @@ logger = logging.getLogger(__name__)
class TablesClient(BaseNextcloudClient):
"""Client for Nextcloud Tables app operations."""
app_name = "tables"
async def list_tables(self) -> List[Dict[str, Any]]:
"""List all tables available to the user."""
response = await self._make_request(
+2
View File
@@ -7,6 +7,8 @@ from nextcloud_mcp_server.models.users import UserDetails
class UsersClient(BaseNextcloudClient):
"""Client for Nextcloud User API operations."""
app_name = "users"
def _get_user_headers(
self, additional_headers: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
+2
View File
@@ -15,6 +15,8 @@ logger = logging.getLogger(__name__)
class WebDAVClient(BaseNextcloudClient):
"""Client for Nextcloud WebDAV operations."""
app_name = "webdav"
async def delete_resource(self, path: str) -> Dict[str, Any]:
"""Delete a resource (file or directory) via WebDAV DELETE."""
# Ensure path ends with a slash if it's a directory
+24
View File
@@ -174,6 +174,18 @@ class Settings:
ollama_embedding_model: str = "nomic-embed-text"
ollama_verify_ssl: bool = True
# Observability settings
metrics_enabled: bool = True
metrics_port: int = 9090
tracing_enabled: bool = False
otel_exporter_otlp_endpoint: Optional[str] = None
otel_service_name: str = "nextcloud-mcp-server"
otel_traces_sampler: str = "always_on"
otel_traces_sampler_arg: float = 1.0
log_format: str = "json" # "json" or "text"
log_level: str = "INFO"
log_include_trace_context: bool = True
def __post_init__(self):
"""Validate Qdrant configuration and set defaults."""
logger = logging.getLogger(__name__)
@@ -253,4 +265,16 @@ def get_settings() -> Settings:
ollama_base_url=os.getenv("OLLAMA_BASE_URL"),
ollama_embedding_model=os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text"),
ollama_verify_ssl=os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true",
# Observability settings
metrics_enabled=os.getenv("METRICS_ENABLED", "true").lower() == "true",
metrics_port=int(os.getenv("METRICS_PORT", "9090")),
tracing_enabled=os.getenv("OTEL_ENABLED", "false").lower() == "true",
otel_exporter_otlp_endpoint=os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT"),
otel_service_name=os.getenv("OTEL_SERVICE_NAME", "nextcloud-mcp-server"),
otel_traces_sampler=os.getenv("OTEL_TRACES_SAMPLER", "always_on"),
otel_traces_sampler_arg=float(os.getenv("OTEL_TRACES_SAMPLER_ARG", "1.0")),
log_format=os.getenv("LOG_FORMAT", "json"),
log_level=os.getenv("LOG_LEVEL", "INFO"),
log_include_trace_context=os.getenv("LOG_INCLUDE_TRACE_CONTEXT", "true").lower()
== "true",
)
@@ -0,0 +1,31 @@
"""
Observability module for the Nextcloud MCP Server.
This module provides:
- Prometheus metrics collection
- OpenTelemetry distributed tracing
- Enhanced structured logging with trace correlation
- Monitoring middleware for Starlette/FastAPI
Usage:
from nextcloud_mcp_server.observability import setup_observability
# In app.py lifespan
setup_observability(app, config)
"""
from nextcloud_mcp_server.observability.logging_config import (
get_uvicorn_logging_config,
setup_logging,
)
from nextcloud_mcp_server.observability.metrics import setup_metrics
from nextcloud_mcp_server.observability.middleware import ObservabilityMiddleware
from nextcloud_mcp_server.observability.tracing import setup_tracing
__all__ = [
"setup_logging",
"get_uvicorn_logging_config",
"setup_metrics",
"setup_tracing",
"ObservabilityMiddleware",
]
@@ -0,0 +1,290 @@
"""
Enhanced logging configuration for the Nextcloud MCP Server.
This module provides:
- Structured JSON logging with python-json-logger
- Trace context injection (trace_id, span_id) for correlation with distributed traces
- Configurable log formats (JSON or text)
- Log level configuration per component
"""
import logging
import sys
from typing import Any
from pythonjsonlogger import jsonlogger
from nextcloud_mcp_server.observability.tracing import get_trace_context
class TraceContextFormatter(jsonlogger.JsonFormatter):
"""
JSON formatter that injects OpenTelemetry trace context into log records.
This allows logs to be correlated with distributed traces by including
trace_id and span_id in each log entry.
"""
def add_fields(
self,
log_record: dict[str, Any],
record: logging.LogRecord,
message_dict: dict[str, Any],
) -> None:
"""
Add custom fields to the log record, including trace context.
Args:
log_record: Dictionary to be serialized as JSON
record: LogRecord instance
message_dict: Dictionary of extra fields from log call
"""
# Call parent to add standard fields
super().add_fields(log_record, record, message_dict)
# Add trace context if available
trace_context = get_trace_context()
if trace_context:
log_record["trace_id"] = trace_context.get("trace_id")
log_record["span_id"] = trace_context.get("span_id")
# Add standard fields with consistent naming
log_record["timestamp"] = self.formatTime(record)
log_record["level"] = record.levelname
log_record["logger"] = record.name
log_record["message"] = record.getMessage()
# Include exception info if present
if record.exc_info:
log_record["exception"] = self.formatException(record.exc_info)
class TraceContextTextFormatter(logging.Formatter):
"""
Text formatter that includes OpenTelemetry trace context.
Format: [LEVEL] [timestamp] logger - message [trace_id=xxx span_id=yyy]
"""
def format(self, record: logging.LogRecord) -> str:
"""
Format log record with trace context.
Args:
record: LogRecord instance
Returns:
Formatted log string
"""
# Format base message
base_message = super().format(record)
# Add trace context if available
trace_context = get_trace_context()
if trace_context:
trace_id = trace_context.get("trace_id", "")
span_id = trace_context.get("span_id", "")
return f"{base_message} [trace_id={trace_id} span_id={span_id}]"
return base_message
def setup_logging(
log_format: str = "json",
log_level: str = "INFO",
include_trace_context: bool = True,
) -> None:
"""
Configure logging for the Nextcloud MCP Server.
Args:
log_format: "json" for JSON logging, "text" for human-readable text (default: "json")
log_level: Minimum log level (DEBUG, INFO, WARNING, ERROR, CRITICAL) (default: "INFO")
include_trace_context: Whether to include trace context in logs (default: True)
"""
# Get root logger
root_logger = logging.getLogger()
root_logger.setLevel(getattr(logging, log_level.upper(), logging.INFO))
# Remove existing handlers
root_logger.handlers.clear()
# Create console handler
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(getattr(logging, log_level.upper(), logging.INFO))
# Configure formatter based on format preference
if log_format.lower() == "json":
if include_trace_context:
formatter = TraceContextFormatter(
"%(timestamp)s %(level)s %(name)s %(message)s",
datefmt="%Y-%m-%dT%H:%M:%S",
)
else:
formatter = jsonlogger.JsonFormatter(
"%(timestamp)s %(level)s %(name)s %(message)s",
datefmt="%Y-%m-%dT%H:%M:%S",
)
else: # text format
if include_trace_context:
formatter = TraceContextTextFormatter(
"%(levelname)s [%(asctime)s] %(name)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
else:
formatter = logging.Formatter(
"%(levelname)s [%(asctime)s] %(name)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
console_handler.setFormatter(formatter)
root_logger.addHandler(console_handler)
# Configure specific logger levels
configure_component_loggers(log_level)
root_logger.info(
f"Logging configured: format={log_format}, level={log_level}, "
f"trace_context={include_trace_context}"
)
def configure_component_loggers(default_level: str = "INFO") -> None:
"""
Configure log levels for specific components.
This allows fine-grained control over logging verbosity for different
parts of the application.
Args:
default_level: Default log level for most components
"""
# Map of logger names to log levels
logger_levels = {
# Application loggers
"nextcloud_mcp_server": default_level,
"nextcloud_mcp_server.server": default_level,
"nextcloud_mcp_server.client": default_level,
"nextcloud_mcp_server.auth": default_level,
"nextcloud_mcp_server.observability": default_level,
# HTTP client loggers (less verbose by default)
"httpx": "WARNING",
"httpcore": "WARNING",
# Server loggers
"uvicorn": "INFO",
"uvicorn.access": "INFO",
"uvicorn.error": "INFO",
# MCP framework
"mcp": "INFO",
# OpenTelemetry (less verbose)
"opentelemetry": "WARNING",
}
for logger_name, level in logger_levels.items():
logger = logging.getLogger(logger_name)
logger.setLevel(getattr(logging, level.upper(), logging.INFO))
def get_logger(name: str) -> logging.Logger:
"""
Get a logger instance for a specific module.
This is a convenience function that wraps logging.getLogger()
to ensure consistent logger configuration.
Args:
name: Logger name (typically __name__)
Returns:
Logger instance
"""
return logging.getLogger(name)
def get_uvicorn_logging_config(
log_format: str = "json",
log_level: str = "INFO",
include_trace_context: bool = True,
) -> dict:
"""
Get uvicorn-compatible logging configuration.
This creates a logging config dict that uvicorn can use while maintaining
our observability setup (JSON format, trace context, etc.).
Args:
log_format: "json" or "text"
log_level: Minimum log level
include_trace_context: Whether to include trace IDs in logs
Returns:
Logging config dict compatible with uvicorn's log_config parameter
"""
# Determine formatter class based on format and trace context
if log_format.lower() == "json":
if include_trace_context:
formatter_class = "nextcloud_mcp_server.observability.logging_config.TraceContextFormatter"
else:
formatter_class = "pythonjsonlogger.jsonlogger.JsonFormatter"
format_string = "%(timestamp)s %(level)s %(name)s %(message)s"
else:
if include_trace_context:
formatter_class = "nextcloud_mcp_server.observability.logging_config.TraceContextTextFormatter"
else:
formatter_class = "logging.Formatter"
format_string = "%(levelname)s [%(asctime)s] %(name)s - %(message)s"
return {
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"default": {
"()": formatter_class,
"format": format_string,
"datefmt": "%Y-%m-%d %H:%M:%S",
},
},
"handlers": {
"default": {
"formatter": "default",
"class": "logging.StreamHandler",
"stream": "ext://sys.stdout",
},
},
"loggers": {
"": {
"handlers": ["default"],
"level": log_level.upper(),
},
"uvicorn": {
"handlers": ["default"],
"level": "INFO",
"propagate": False,
},
"uvicorn.access": {
"handlers": ["default"],
"level": "INFO",
"propagate": False,
},
"uvicorn.error": {
"handlers": ["default"],
"level": "INFO",
"propagate": False,
},
"httpx": {
"handlers": ["default"],
"level": "WARNING",
"propagate": False,
},
"httpcore": {
"handlers": ["default"],
"level": "WARNING",
"propagate": False,
},
"opentelemetry": {
"handlers": ["default"],
"level": "WARNING",
"propagate": False,
},
},
}
@@ -0,0 +1,354 @@
"""
Prometheus metrics for the Nextcloud MCP Server.
This module defines all Prometheus metrics for monitoring server health, performance,
and resource usage. Metrics are organized by category:
- HTTP Server Metrics (RED: Rate, Errors, Duration)
- MCP Tool Metrics (per-tool invocation tracking)
- MCP Resource Metrics
- Nextcloud API Client Metrics
- OAuth Flow Metrics
- Vector Sync Metrics (conditional on feature flag)
- Database Operation Metrics
- External Dependency Health Metrics
"""
import logging
from prometheus_client import (
Counter,
Gauge,
Histogram,
start_http_server,
)
logger = logging.getLogger(__name__)
# =============================================================================
# HTTP Server Metrics (RED + System)
# =============================================================================
http_requests_total = Counter(
"mcp_http_requests_total",
"Total HTTP requests received",
["method", "endpoint", "status_code"],
)
http_request_duration_seconds = Histogram(
"mcp_http_request_duration_seconds",
"HTTP request latency in seconds",
["method", "endpoint"],
buckets=(0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0),
)
http_requests_in_progress = Gauge(
"mcp_http_requests_in_progress",
"Number of HTTP requests currently being processed",
["method", "endpoint"],
)
# =============================================================================
# MCP Tool Metrics
# =============================================================================
mcp_tool_calls_total = Counter(
"mcp_tool_calls_total",
"Total MCP tool invocations",
["tool_name", "status"], # status: success | error
)
mcp_tool_duration_seconds = Histogram(
"mcp_tool_duration_seconds",
"MCP tool execution duration in seconds",
["tool_name"],
buckets=(0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 30.0),
)
mcp_tool_errors_total = Counter(
"mcp_tool_errors_total",
"Total MCP tool errors by type",
["tool_name", "error_type"],
)
# =============================================================================
# MCP Resource Metrics
# =============================================================================
mcp_resource_requests_total = Counter(
"mcp_resource_requests_total",
"Total MCP resource requests",
["resource_uri", "status"],
)
mcp_resource_duration_seconds = Histogram(
"mcp_resource_duration_seconds",
"MCP resource request duration in seconds",
["resource_uri"],
buckets=(0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5),
)
# =============================================================================
# Nextcloud API Client Metrics
# =============================================================================
nextcloud_api_requests_total = Counter(
"mcp_nextcloud_api_requests_total",
"Total Nextcloud API requests",
["app", "method", "status_code"], # app: notes, calendar, contacts, etc.
)
nextcloud_api_duration_seconds = Histogram(
"mcp_nextcloud_api_duration_seconds",
"Nextcloud API request duration in seconds",
["app", "method"],
buckets=(0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0),
)
nextcloud_api_retries_total = Counter(
"mcp_nextcloud_api_retries_total",
"Total Nextcloud API retries",
["app", "reason"], # reason: 429 | timeout | connection_error
)
# =============================================================================
# OAuth Flow Metrics
# =============================================================================
oauth_token_validations_total = Counter(
"mcp_oauth_token_validations_total",
"Total OAuth token validation attempts",
["method", "result"], # method: introspect | jwt; result: valid | invalid | error
)
oauth_token_exchange_total = Counter(
"mcp_oauth_token_exchange_total",
"Total OAuth token exchange operations (RFC 8693)",
["status"], # status: success | error
)
oauth_token_cache_hits_total = Counter(
"mcp_oauth_token_cache_hits_total",
"Total OAuth token cache lookups",
["hit"], # hit: true | false
)
oauth_refresh_token_operations_total = Counter(
"mcp_oauth_refresh_token_operations_total",
"Total refresh token storage operations",
[
"operation",
"status",
], # operation: store | retrieve | delete; status: success | error
)
# =============================================================================
# Vector Sync Metrics (optional feature)
# =============================================================================
vector_sync_documents_scanned_total = Counter(
"mcp_vector_sync_documents_scanned_total",
"Total documents scanned for vector sync",
)
vector_sync_documents_processed_total = Counter(
"mcp_vector_sync_documents_processed_total",
"Total documents processed for vector sync",
["status"], # status: success | error
)
vector_sync_processing_duration_seconds = Histogram(
"mcp_vector_sync_processing_duration_seconds",
"Document processing duration in seconds",
buckets=(0.1, 0.5, 1.0, 2.5, 5.0, 10.0, 30.0, 60.0),
)
vector_sync_queue_size = Gauge(
"mcp_vector_sync_queue_size",
"Current number of documents in processing queue",
)
qdrant_operations_total = Counter(
"mcp_qdrant_operations_total",
"Total Qdrant vector database operations",
[
"operation",
"status",
], # operation: upsert | search | delete; status: success | error
)
# =============================================================================
# Database Metrics
# =============================================================================
db_operations_total = Counter(
"mcp_db_operations_total",
"Total database operations",
["db", "operation", "status"], # db: sqlite | qdrant; operation varies
)
db_operation_duration_seconds = Histogram(
"mcp_db_operation_duration_seconds",
"Database operation duration in seconds",
["db", "operation"],
buckets=(0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0),
)
# =============================================================================
# External Dependency Health Metrics
# =============================================================================
dependency_health = Gauge(
"mcp_dependency_health",
"External dependency health status (1=up, 0=down)",
["dependency"], # dependency: nextcloud | keycloak | qdrant | unstructured
)
dependency_check_duration_seconds = Histogram(
"mcp_dependency_check_duration_seconds",
"Dependency health check duration in seconds",
["dependency"],
buckets=(0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5),
)
# =============================================================================
# Metrics Setup and HTTP Handler
# =============================================================================
def setup_metrics(port: int = 9090) -> None:
"""
Initialize Prometheus metrics collection and start HTTP server.
Starts a dedicated HTTP server on the specified port to serve metrics.
This server runs in a separate thread and is isolated from the main application.
Args:
port: Port to serve metrics on (default: 9090)
Note:
Metrics endpoint (/metrics) is ONLY accessible on this dedicated port,
not on the main application HTTP port. This is a security best practice
to prevent external exposure of metrics.
"""
try:
start_http_server(port)
logger.info(f"Prometheus metrics server started on port {port}")
except OSError as e:
if "Address already in use" in str(e):
logger.warning(
f"Metrics port {port} already in use (metrics server likely already running)"
)
else:
logger.error(f"Failed to start metrics server on port {port}: {e}")
raise
# =============================================================================
# Convenience Functions for Common Metric Updates
# =============================================================================
def record_tool_call(tool_name: str, duration: float, status: str = "success") -> None:
"""
Record metrics for an MCP tool call.
Args:
tool_name: Name of the MCP tool
duration: Execution duration in seconds
status: "success" or "error"
"""
mcp_tool_calls_total.labels(tool_name=tool_name, status=status).inc()
mcp_tool_duration_seconds.labels(tool_name=tool_name).observe(duration)
def record_tool_error(tool_name: str, error_type: str) -> None:
"""
Record an MCP tool error.
Args:
tool_name: Name of the MCP tool
error_type: Type of error (e.g., "HTTPStatusError", "ValueError")
"""
mcp_tool_errors_total.labels(tool_name=tool_name, error_type=error_type).inc()
def record_nextcloud_api_call(
app: str,
method: str,
status_code: int,
duration: float,
) -> None:
"""
Record metrics for a Nextcloud API call.
Args:
app: Nextcloud app name (notes, calendar, contacts, etc.)
method: HTTP method (GET, POST, PUT, DELETE, PROPFIND, etc.)
status_code: HTTP status code
duration: Request duration in seconds
"""
nextcloud_api_requests_total.labels(
app=app, method=method, status_code=str(status_code)
).inc()
nextcloud_api_duration_seconds.labels(app=app, method=method).observe(duration)
def record_nextcloud_api_retry(app: str, reason: str) -> None:
"""
Record a Nextcloud API retry.
Args:
app: Nextcloud app name
reason: Retry reason (429, timeout, connection_error)
"""
nextcloud_api_retries_total.labels(app=app, reason=reason).inc()
def record_oauth_token_validation(method: str, result: str) -> None:
"""
Record an OAuth token validation.
Args:
method: Validation method ("introspect" or "jwt")
result: Validation result ("valid", "invalid", or "error")
"""
oauth_token_validations_total.labels(method=method, result=result).inc()
def record_db_operation(
db: str, operation: str, duration: float, status: str = "success"
) -> None:
"""
Record a database operation.
Args:
db: Database type ("sqlite" or "qdrant")
operation: Operation type (e.g., "insert", "select", "upsert", "search")
duration: Operation duration in seconds
status: "success" or "error"
"""
db_operations_total.labels(db=db, operation=operation, status=status).inc()
db_operation_duration_seconds.labels(db=db, operation=operation).observe(duration)
def set_dependency_health(dependency: str, is_healthy: bool) -> None:
"""
Update external dependency health status.
Args:
dependency: Dependency name (nextcloud, keycloak, qdrant, unstructured)
is_healthy: True if dependency is healthy, False otherwise
"""
dependency_health.labels(dependency=dependency).set(1 if is_healthy else 0)
def record_dependency_check(dependency: str, duration: float) -> None:
"""
Record a dependency health check duration.
Args:
dependency: Dependency name
duration: Check duration in seconds
"""
dependency_check_duration_seconds.labels(dependency=dependency).observe(duration)
@@ -0,0 +1,200 @@
"""
Observability middleware for the Nextcloud MCP Server.
This module provides Starlette middleware that automatically instruments
HTTP requests with:
- Prometheus metrics (request count, latency, in-flight requests)
- OpenTelemetry distributed tracing
- Request/response timing and error tracking
"""
import logging
import time
from typing import Callable
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response
from nextcloud_mcp_server.observability.metrics import (
http_request_duration_seconds,
http_requests_in_progress,
http_requests_total,
)
from nextcloud_mcp_server.observability.tracing import (
add_span_attribute,
trace_operation,
)
logger = logging.getLogger(__name__)
class ObservabilityMiddleware(BaseHTTPMiddleware):
"""
Starlette middleware for automatic HTTP request instrumentation.
This middleware:
- Records Prometheus metrics for each request (RED metrics)
- Creates OpenTelemetry spans for distributed tracing
- Tracks request timing and errors
- Handles in-flight request counting
"""
async def dispatch(
self,
request: Request,
call_next: Callable,
) -> Response:
"""
Process HTTP request with observability instrumentation.
Args:
request: Starlette request object
call_next: Next middleware or route handler
Returns:
Response from downstream handler
"""
# Extract request details
method = request.method
path = request.url.path
endpoint = self._get_endpoint_label(path)
# Increment in-flight requests counter
http_requests_in_progress.labels(method=method, endpoint=endpoint).inc()
# Record start time
start_time = time.time()
try:
# Create span for request (OpenTelemetry auto-instrumentation will create parent span)
with trace_operation(
f"HTTP {method} {endpoint}",
attributes={
"http.method": method,
"http.path": path,
"http.scheme": request.url.scheme,
"http.host": request.url.hostname,
},
):
# Process request
response = await call_next(request)
# Add response status to span
add_span_attribute("http.status_code", response.status_code)
# Record metrics
duration = time.time() - start_time
self._record_request_metrics(
method=method,
endpoint=endpoint,
status_code=response.status_code,
duration=duration,
)
return response
except Exception:
# Record error metrics
duration = time.time() - start_time
self._record_request_metrics(
method=method,
endpoint=endpoint,
status_code=500, # Internal server error
duration=duration,
)
logger.error(
f"Request failed: {method} {path}",
exc_info=True,
extra={
"method": method,
"path": path,
"duration_seconds": duration,
},
)
# Re-raise exception to be handled by error middleware
raise
finally:
# Decrement in-flight requests counter
http_requests_in_progress.labels(method=method, endpoint=endpoint).dec()
def _get_endpoint_label(self, path: str) -> str:
"""
Get endpoint label for metrics, normalizing dynamic path segments.
This prevents metric cardinality explosion by grouping similar paths.
Args:
path: Request path
Returns:
Normalized endpoint label
"""
# Health check endpoints
if path.startswith("/health/"):
return "/health/*"
# Metrics endpoint
if path == "/metrics":
return "/metrics"
# MCP protocol endpoints
if path == "/sse" or path.startswith("/sse/"):
return "/sse"
if path == "/messages" or path.startswith("/messages/"):
return "/messages"
# OAuth/OIDC endpoints
if path.startswith("/oauth/"):
return "/oauth/*"
if path.startswith("/oidc/"):
return "/oidc/*"
# Catch-all for other paths
return path
def _record_request_metrics(
self,
method: str,
endpoint: str,
status_code: int,
duration: float,
) -> None:
"""
Record Prometheus metrics for an HTTP request.
Args:
method: HTTP method
endpoint: Normalized endpoint label
status_code: HTTP status code
duration: Request duration in seconds
"""
# Record request count
http_requests_total.labels(
method=method,
endpoint=endpoint,
status_code=str(status_code),
).inc()
# Record request duration
http_request_duration_seconds.labels(
method=method,
endpoint=endpoint,
).observe(duration)
# Log slow requests (>1 second)
if duration > 1.0:
logger.warning(
f"Slow request: {method} {endpoint} took {duration:.3f}s",
extra={
"method": method,
"endpoint": endpoint,
"status_code": status_code,
"duration_seconds": duration,
},
)
@@ -0,0 +1,363 @@
"""
OpenTelemetry distributed tracing for the Nextcloud MCP Server.
This module provides:
- OpenTelemetry SDK initialization with OTLP exporter
- Auto-instrumentation for ASGI (Starlette/FastAPI) and httpx
- Helper functions for creating custom spans
- Context propagation utilities
- Span attribute standardization
"""
import logging
from contextlib import contextmanager
from typing import Any
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
from opentelemetry.instrumentation.logging import LoggingInstrumentor
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.trace import Status, StatusCode, Tracer
logger = logging.getLogger(__name__)
# Global tracer instance (initialized in setup_tracing)
_tracer: Tracer | None = None
def setup_tracing(
service_name: str = "nextcloud-mcp-server",
otlp_endpoint: str | None = None,
sampling_rate: float = 1.0,
) -> Tracer:
"""
Initialize OpenTelemetry tracing with OTLP exporter.
Args:
service_name: Service name for traces (default: "nextcloud-mcp-server")
otlp_endpoint: OTLP gRPC endpoint (e.g., "http://otel-collector:4317")
If None, tracing is initialized but no exporter is configured
sampling_rate: Sampling rate (0.0-1.0). Default 1.0 (100% sampling)
Returns:
Tracer instance for creating custom spans
"""
global _tracer
# Create resource with service name
resource = Resource.create(
{
"service.name": service_name,
"service.version": "0.27.2", # TODO: Extract from pyproject.toml
}
)
# Create tracer provider
provider = TracerProvider(resource=resource)
# Configure OTLP exporter if endpoint is provided
if otlp_endpoint:
try:
otlp_exporter = OTLPSpanExporter(endpoint=otlp_endpoint, insecure=True)
span_processor = BatchSpanProcessor(otlp_exporter)
provider.add_span_processor(span_processor)
logger.info(
f"OpenTelemetry tracing enabled with OTLP endpoint: {otlp_endpoint}"
)
except Exception as e:
logger.warning(
f"Failed to initialize OTLP exporter: {e}. Continuing without trace export."
)
else:
logger.info(
"OpenTelemetry tracing initialized without OTLP exporter (traces will be generated but not exported)"
)
# Set global tracer provider
trace.set_tracer_provider(provider)
# Auto-instrument httpx for Nextcloud API calls
HTTPXClientInstrumentor().instrument()
# Auto-instrument logging to inject trace context
LoggingInstrumentor().instrument(set_logging_format=True)
# Get and store tracer
_tracer = trace.get_tracer(__name__)
logger.info(f"OpenTelemetry tracing initialized for service: {service_name}")
return _tracer
def get_tracer() -> Tracer | None:
"""
Get the global tracer instance.
Returns:
Tracer instance for creating custom spans, or None if tracing is not enabled
Note:
Returns None if setup_tracing() was never called (tracing disabled).
Calling code should handle None gracefully.
"""
return _tracer
@contextmanager
def trace_operation(
operation_name: str,
attributes: dict[str, Any] | None = None,
record_exception: bool = True,
):
"""
Context manager for tracing an operation with automatic error handling.
Usage:
with trace_operation("mcp.tool.nc_notes_create_note", {"note.title": "My Note"}):
# Your code here
pass
Args:
operation_name: Name of the operation (span name)
attributes: Optional attributes to add to the span
record_exception: Whether to record exceptions in the span (default: True)
Yields:
Span instance for adding additional attributes (or None if tracing disabled)
"""
tracer = get_tracer()
# If tracing is not enabled, just yield without creating a span
if tracer is None:
yield None
return
with tracer.start_as_current_span(operation_name) as span:
# Set initial attributes
if attributes:
for key, value in attributes.items():
span.set_attribute(key, value)
try:
yield span
span.set_status(Status(StatusCode.OK))
except Exception as e:
if record_exception:
span.record_exception(e)
span.set_status(Status(StatusCode.ERROR, str(e)))
raise
def trace_mcp_tool(tool_name: str, tool_args: dict[str, Any] | None = None):
"""
Create a span for an MCP tool invocation.
Usage:
with trace_mcp_tool("nc_notes_create_note", {"title": "My Note"}):
# Tool implementation
pass
Args:
tool_name: Name of the MCP tool
tool_args: Optional tool arguments (sensitive data will be sanitized)
Returns:
Context manager for the span
"""
attributes = {
"mcp.tool.name": tool_name,
}
# Add sanitized tool args (avoid logging sensitive data)
if tool_args:
# Only include non-sensitive arguments
safe_args = {
k: v
for k, v in tool_args.items()
if k not in ("password", "token", "secret", "api_key", "etag")
}
if safe_args:
attributes["mcp.tool.args"] = str(safe_args)
return trace_operation(f"mcp.tool.{tool_name}", attributes)
def trace_nextcloud_api_call(
app: str,
method: str,
path: str | None = None,
):
"""
Create a span for a Nextcloud API call.
Usage:
with trace_nextcloud_api_call("notes", "POST", "/apps/notes/api/v1/notes"):
# API call implementation
pass
Args:
app: Nextcloud app name (notes, calendar, contacts, etc.)
method: HTTP method (GET, POST, PUT, DELETE, etc.)
path: Optional API path
Returns:
Context manager for the span
"""
attributes = {
"nextcloud.app": app,
"http.method": method,
}
if path:
attributes["http.path"] = path
return trace_operation(f"nextcloud.api.{app}.{method}", attributes)
def trace_oauth_operation(operation: str, details: dict[str, Any] | None = None):
"""
Create a span for an OAuth operation.
Usage:
with trace_oauth_operation("token.validate", {"method": "jwt"}):
# OAuth validation logic
pass
Args:
operation: OAuth operation name (e.g., "token.validate", "token.exchange")
details: Optional operation details (sensitive data will be sanitized)
Returns:
Context manager for the span
"""
attributes = {"oauth.operation": operation}
if details:
# Only include non-sensitive details
safe_details = {
k: v
for k, v in details.items()
if k not in ("token", "refresh_token", "access_token", "client_secret")
}
if safe_details:
attributes.update(safe_details)
return trace_operation(f"oauth.{operation}", attributes)
def trace_vector_sync_operation(
operation: str,
document_count: int | None = None,
):
"""
Create a span for a vector sync operation.
Usage:
with trace_vector_sync_operation("scan", document_count=10):
# Vector sync logic
pass
Args:
operation: Operation name (scan, process, embed, upsert)
document_count: Optional number of documents being processed
Returns:
Context manager for the span
"""
attributes = {"vector_sync.operation": operation}
if document_count is not None:
attributes["vector_sync.document_count"] = document_count
return trace_operation(f"vector_sync.{operation}", attributes)
def trace_db_operation(
db: str,
operation: str,
table: str | None = None,
):
"""
Create a span for a database operation.
Usage:
with trace_db_operation("sqlite", "insert", "refresh_tokens"):
# Database operation
pass
Args:
db: Database type (sqlite, qdrant)
operation: Operation type (insert, select, update, delete, upsert, search)
table: Optional table/collection name
Returns:
Context manager for the span
"""
attributes = {
"db.system": db,
"db.operation": operation,
}
if table:
attributes["db.table"] = table
return trace_operation(f"db.{db}.{operation}", attributes)
def add_span_attribute(key: str, value: Any) -> None:
"""
Add an attribute to the current span (if any).
Args:
key: Attribute key
value: Attribute value
Note:
This is a no-op if tracing is not enabled or there's no active span.
"""
if _tracer is None:
return # Tracing not enabled
span = trace.get_current_span()
if span.is_recording():
span.set_attribute(key, value)
def add_span_event(name: str, attributes: dict[str, Any] | None = None) -> None:
"""
Add an event to the current span (if any).
Args:
name: Event name
attributes: Optional event attributes
Note:
This is a no-op if tracing is not enabled or there's no active span.
"""
if _tracer is None:
return # Tracing not enabled
span = trace.get_current_span()
if span.is_recording():
span.add_event(name, attributes=attributes or {})
def get_trace_context() -> dict[str, str]:
"""
Get current trace context as a dictionary.
Returns:
Dictionary with trace_id and span_id (or empty dict if tracing disabled or no active span)
"""
if _tracer is None:
return {} # Tracing not enabled
span = trace.get_current_span()
if span.is_recording():
span_context = span.get_span_context()
return {
"trace_id": format(span_context.trace_id, "032x"),
"span_id": format(span_context.span_id, "016x"),
}
return {}
+17 -3
View File
@@ -105,13 +105,20 @@ async def scan_user_documents(
if initial_sync:
# Send everything on first sync
for note in notes:
# Handle missing 'modified' field (use 0 as fallback)
modified_at = note.get("modified", 0)
if modified_at == 0:
logger.warning(
f"Note {note['id']} missing 'modified' field, using 0 as fallback"
)
await send_stream.send(
DocumentTask(
user_id=user_id,
doc_id=str(note["id"]),
doc_type="note",
operation="index",
modified_at=note["modified"],
modified_at=modified_at,
)
)
logger.info(f"Sent {len(notes)} documents for initial sync: {user_id}")
@@ -147,6 +154,13 @@ async def scan_user_documents(
doc_id = str(note["id"])
indexed_at = indexed_docs.get(doc_id)
# Handle missing 'modified' field (use 0 as fallback)
modified_at = note.get("modified", 0)
if modified_at == 0:
logger.warning(
f"Note {doc_id} missing 'modified' field, using 0 as fallback"
)
# If document reappeared, remove from potentially_deleted
doc_key = (user_id, doc_id)
if doc_key in _potentially_deleted:
@@ -156,14 +170,14 @@ async def scan_user_documents(
del _potentially_deleted[doc_key]
# Send if never indexed or modified since last index
if indexed_at is None or note["modified"] > indexed_at:
if indexed_at is None or modified_at > indexed_at:
await send_stream.send(
DocumentTask(
user_id=user_id,
doc_id=doc_id,
doc_type="note",
operation="index",
modified_at=note["modified"],
modified_at=modified_at,
)
)
queued += 1
+10 -1
View File
@@ -1,6 +1,6 @@
[project]
name = "nextcloud-mcp-server"
version = "0.26.1"
version = "0.29.2"
description = "Model Context Protocol (MCP) server for Nextcloud integration - enables AI assistants to interact with Nextcloud data"
authors = [
{name = "Chris Coutinho", email = "chris@coutinho.io"}
@@ -22,6 +22,15 @@ dependencies = [
"aiosqlite>=0.20.0", # Async SQLite for refresh token storage
"authlib>=1.6.5",
"qdrant-client>=1.7.0",
# Observability dependencies
"prometheus-client>=0.21.0", # Prometheus metrics
"opentelemetry-api>=1.28.2", # OpenTelemetry API
"opentelemetry-sdk>=1.28.2", # OpenTelemetry SDK
"opentelemetry-instrumentation-asgi>=0.49b2", # Auto-instrument ASGI/Starlette
"opentelemetry-instrumentation-httpx>=0.49b2", # Auto-instrument httpx client
"opentelemetry-instrumentation-logging>=0.49b2", # Logging integration
"opentelemetry-exporter-otlp-proto-grpc>=1.28.2", # OTLP gRPC exporter
"python-json-logger>=3.2.0", # Structured JSON logging
]
classifiers = [
"Development Status :: 4 - Beta",
Generated
+228 -1
View File
@@ -57,6 +57,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/31/da/e42d7a9d8dd33fa775f467e4028a47936da2f01e4b0e561f9ba0d74cb0ca/argcomplete-3.6.2-py3-none-any.whl", hash = "sha256:65b3133a29ad53fb42c48cf5114752c7ab66c1c38544fdf6460f450c09b42591", size = 43708, upload-time = "2025-04-03T04:57:01.591Z" },
]
[[package]]
name = "asgiref"
version = "3.10.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/46/08/4dfec9b90758a59acc6be32ac82e98d1fbfc321cb5cfa410436dbacf821c/asgiref-3.10.0.tar.gz", hash = "sha256:d89f2d8cd8b56dada7d52fa7dc8075baa08fb836560710d38c292a7a3f78c04e", size = 37483, upload-time = "2025-10-05T09:15:06.557Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/17/9c/fc2331f538fbf7eedba64b2052e99ccf9ba9d6888e2f41441ee28847004b/asgiref-3.10.0-py3-none-any.whl", hash = "sha256:aef8a81283a34d0ab31630c9b7dfe70c812c95eba78171367ca8745e88124734", size = 24050, upload-time = "2025-10-05T09:15:05.11Z" },
]
[[package]]
name = "asttokens"
version = "3.0.0"
@@ -487,6 +496,18 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/c1/ea/53f2148663b321f21b5a606bd5f191517cf40b7072c0497d3c92c4a13b1e/executing-2.2.1-py2.py3-none-any.whl", hash = "sha256:760643d3452b4d777d295bb167ccc74c64a81df23fb5e08eff250c425a4b2017", size = 28317, upload-time = "2025-09-01T09:48:08.5Z" },
]
[[package]]
name = "googleapis-common-protos"
version = "1.72.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "protobuf" },
]
sdist = { url = "https://files.pythonhosted.org/packages/e5/7b/adfd75544c415c487b33061fe7ae526165241c1ea133f9a9125a56b39fd8/googleapis_common_protos-1.72.0.tar.gz", hash = "sha256:e55a601c1b32b52d7a3e65f43563e2aa61bcd737998ee672ac9b951cd49319f5", size = 147433, upload-time = "2025-11-06T18:29:24.087Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/c4/ab/09169d5a4612a5f92490806649ac8d41e3ec9129c636754575b3553f4ea4/googleapis_common_protos-1.72.0-py3-none-any.whl", hash = "sha256:4299c5a82d5ae1a9702ada957347726b167f9f8d1fc352477702a1e851ff4038", size = 297515, upload-time = "2025-11-06T18:29:13.14Z" },
]
[[package]]
name = "greenlet"
version = "3.2.4"
@@ -692,6 +713,18 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/0e/61/66938bbb5fc52dbdf84594873d5b51fb1f7c7794e9c0f5bd885f30bc507b/idna-3.11-py3-none-any.whl", hash = "sha256:771a87f49d9defaf64091e6e6fe9c18d4833f140bd19464795bc32d966ca37ea", size = 71008, upload-time = "2025-10-12T14:55:18.883Z" },
]
[[package]]
name = "importlib-metadata"
version = "8.7.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "zipp" },
]
sdist = { url = "https://files.pythonhosted.org/packages/76/66/650a33bd90f786193e4de4b3ad86ea60b53c89b669a5c7be931fac31cdb0/importlib_metadata-8.7.0.tar.gz", hash = "sha256:d13b81ad223b890aa16c5471f2ac3056cf76c5f10f82d6f9292f0b415f389000", size = 56641, upload-time = "2025-04-27T15:29:01.736Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/20/b0/36bd937216ec521246249be3bf9855081de4c5e06a0c9b4219dbeda50373/importlib_metadata-8.7.0-py3-none-any.whl", hash = "sha256:e5dd1551894c77868a30651cef00984d50e1002d06942a7101d34870c5f02afd", size = 27656, upload-time = "2025-04-27T15:29:00.214Z" },
]
[[package]]
name = "iniconfig"
version = "2.3.0"
@@ -1026,7 +1059,7 @@ wheels = [
[[package]]
name = "nextcloud-mcp-server"
version = "0.26.1"
version = "0.29.2"
source = { editable = "." }
dependencies = [
{ name = "aiosqlite" },
@@ -1036,9 +1069,17 @@ dependencies = [
{ name = "httpx" },
{ name = "icalendar" },
{ name = "mcp", extra = ["cli"] },
{ name = "opentelemetry-api" },
{ name = "opentelemetry-exporter-otlp-proto-grpc" },
{ name = "opentelemetry-instrumentation-asgi" },
{ name = "opentelemetry-instrumentation-httpx" },
{ name = "opentelemetry-instrumentation-logging" },
{ name = "opentelemetry-sdk" },
{ name = "pillow" },
{ name = "prometheus-client" },
{ name = "pydantic" },
{ name = "pyjwt", extra = ["crypto"] },
{ name = "python-json-logger" },
{ name = "pythonvcard4" },
{ name = "qdrant-client" },
]
@@ -1067,9 +1108,17 @@ requires-dist = [
{ name = "httpx", specifier = ">=0.28.1,<0.29.0" },
{ name = "icalendar", specifier = ">=6.0.0,<7.0.0" },
{ name = "mcp", extras = ["cli"], specifier = ">=1.21,<1.22" },
{ name = "opentelemetry-api", specifier = ">=1.28.2" },
{ name = "opentelemetry-exporter-otlp-proto-grpc", specifier = ">=1.28.2" },
{ name = "opentelemetry-instrumentation-asgi", specifier = ">=0.49b2" },
{ name = "opentelemetry-instrumentation-httpx", specifier = ">=0.49b2" },
{ name = "opentelemetry-instrumentation-logging", specifier = ">=0.49b2" },
{ name = "opentelemetry-sdk", specifier = ">=1.28.2" },
{ name = "pillow", specifier = ">=12.0.0,<12.1.0" },
{ name = "prometheus-client", specifier = ">=0.21.0" },
{ name = "pydantic", specifier = ">=2.11.4" },
{ name = "pyjwt", extras = ["crypto"], specifier = ">=2.8.0" },
{ name = "python-json-logger", specifier = ">=3.2.0" },
{ name = "pythonvcard4", specifier = ">=0.2.0" },
{ name = "qdrant-client", specifier = ">=1.7.0" },
]
@@ -1170,6 +1219,157 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/95/8e/2844c3959ce9a63acc7c8e50881133d86666f0420bcde695e115ced0920f/numpy-2.3.4-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:81b3a59793523e552c4a96109dde028aa4448ae06ccac5a76ff6532a85558a7f", size = 12973130, upload-time = "2025-10-15T16:18:09.397Z" },
]
[[package]]
name = "opentelemetry-api"
version = "1.38.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "importlib-metadata" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/08/d8/0f354c375628e048bd0570645b310797299754730079853095bf000fba69/opentelemetry_api-1.38.0.tar.gz", hash = "sha256:f4c193b5e8acb0912b06ac5b16321908dd0843d75049c091487322284a3eea12", size = 65242, upload-time = "2025-10-16T08:35:50.25Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/ae/a2/d86e01c28300bd41bab8f18afd613676e2bd63515417b77636fc1add426f/opentelemetry_api-1.38.0-py3-none-any.whl", hash = "sha256:2891b0197f47124454ab9f0cf58f3be33faca394457ac3e09daba13ff50aa582", size = 65947, upload-time = "2025-10-16T08:35:30.23Z" },
]
[[package]]
name = "opentelemetry-exporter-otlp-proto-common"
version = "1.38.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "opentelemetry-proto" },
]
sdist = { url = "https://files.pythonhosted.org/packages/19/83/dd4660f2956ff88ed071e9e0e36e830df14b8c5dc06722dbde1841accbe8/opentelemetry_exporter_otlp_proto_common-1.38.0.tar.gz", hash = "sha256:e333278afab4695aa8114eeb7bf4e44e65c6607d54968271a249c180b2cb605c", size = 20431, upload-time = "2025-10-16T08:35:53.285Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/a7/9e/55a41c9601191e8cd8eb626b54ee6827b9c9d4a46d736f32abc80d8039fc/opentelemetry_exporter_otlp_proto_common-1.38.0-py3-none-any.whl", hash = "sha256:03cb76ab213300fe4f4c62b7d8f17d97fcfd21b89f0b5ce38ea156327ddda74a", size = 18359, upload-time = "2025-10-16T08:35:34.099Z" },
]
[[package]]
name = "opentelemetry-exporter-otlp-proto-grpc"
version = "1.38.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "googleapis-common-protos" },
{ name = "grpcio" },
{ name = "opentelemetry-api" },
{ name = "opentelemetry-exporter-otlp-proto-common" },
{ name = "opentelemetry-proto" },
{ name = "opentelemetry-sdk" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/a2/c0/43222f5b97dc10812bc4f0abc5dc7cd0a2525a91b5151d26c9e2e958f52e/opentelemetry_exporter_otlp_proto_grpc-1.38.0.tar.gz", hash = "sha256:2473935e9eac71f401de6101d37d6f3f0f1831db92b953c7dcc912536158ebd6", size = 24676, upload-time = "2025-10-16T08:35:53.83Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/28/f0/bd831afbdba74ca2ce3982142a2fad707f8c487e8a3b6fef01f1d5945d1b/opentelemetry_exporter_otlp_proto_grpc-1.38.0-py3-none-any.whl", hash = "sha256:7c49fd9b4bd0dbe9ba13d91f764c2d20b0025649a6e4ac35792fb8d84d764bc7", size = 19695, upload-time = "2025-10-16T08:35:35.053Z" },
]
[[package]]
name = "opentelemetry-instrumentation"
version = "0.59b0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "opentelemetry-api" },
{ name = "opentelemetry-semantic-conventions" },
{ name = "packaging" },
{ name = "wrapt" },
]
sdist = { url = "https://files.pythonhosted.org/packages/04/ed/9c65cd209407fd807fa05be03ee30f159bdac8d59e7ea16a8fe5a1601222/opentelemetry_instrumentation-0.59b0.tar.gz", hash = "sha256:6010f0faaacdaf7c4dff8aac84e226d23437b331dcda7e70367f6d73a7db1adc", size = 31544, upload-time = "2025-10-16T08:39:31.959Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/10/f5/7a40ff3f62bfe715dad2f633d7f1174ba1a7dd74254c15b2558b3401262a/opentelemetry_instrumentation-0.59b0-py3-none-any.whl", hash = "sha256:44082cc8fe56b0186e87ee8f7c17c327c4c2ce93bdbe86496e600985d74368ee", size = 33020, upload-time = "2025-10-16T08:38:31.463Z" },
]
[[package]]
name = "opentelemetry-instrumentation-asgi"
version = "0.59b0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "asgiref" },
{ name = "opentelemetry-api" },
{ name = "opentelemetry-instrumentation" },
{ name = "opentelemetry-semantic-conventions" },
{ name = "opentelemetry-util-http" },
]
sdist = { url = "https://files.pythonhosted.org/packages/b7/a4/cfbb6fc1ec0aa9bf5a93f548e6a11ab3ac1956272f17e0d399aa2c1f85bc/opentelemetry_instrumentation_asgi-0.59b0.tar.gz", hash = "sha256:2509d6fe9fd829399ce3536e3a00426c7e3aa359fc1ed9ceee1628b56da40e7a", size = 25116, upload-time = "2025-10-16T08:39:36.092Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/f3/88/fe02d809963b182aafbf5588685d7a05af8861379b0ec203d48e360d4502/opentelemetry_instrumentation_asgi-0.59b0-py3-none-any.whl", hash = "sha256:ba9703e09d2c33c52fa798171f344c8123488fcd45017887981df088452d3c53", size = 16797, upload-time = "2025-10-16T08:38:37.214Z" },
]
[[package]]
name = "opentelemetry-instrumentation-httpx"
version = "0.59b0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "opentelemetry-api" },
{ name = "opentelemetry-instrumentation" },
{ name = "opentelemetry-semantic-conventions" },
{ name = "opentelemetry-util-http" },
{ name = "wrapt" },
]
sdist = { url = "https://files.pythonhosted.org/packages/18/6b/1bdf36b68cace9b4eae3cbbade4150c71c90aa392b127dda5bb5c2a49307/opentelemetry_instrumentation_httpx-0.59b0.tar.gz", hash = "sha256:a1cb9b89d9f05a82701cc9ab9cfa3db54fd76932489449778b350bc1b9f0e872", size = 19886, upload-time = "2025-10-16T08:39:48.428Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/58/16/c1e0745d20af392ec9060693531d7f01239deb2d81e460d0c379719691b8/opentelemetry_instrumentation_httpx-0.59b0-py3-none-any.whl", hash = "sha256:7dc9f66aef4ca3904d877f459a70c78eafd06131dc64d713b9b1b5a7d0a48f05", size = 15197, upload-time = "2025-10-16T08:38:55.507Z" },
]
[[package]]
name = "opentelemetry-instrumentation-logging"
version = "0.59b0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "opentelemetry-api" },
{ name = "opentelemetry-instrumentation" },
]
sdist = { url = "https://files.pythonhosted.org/packages/be/88/9c5f70fa8b8d96d30be378fc6eb1776e13aea456db15009f4eaef4928847/opentelemetry_instrumentation_logging-0.59b0.tar.gz", hash = "sha256:1b51116444edc74f699daf9002ded61529397100c9bc903c8b9aaa75a5218c76", size = 9969, upload-time = "2025-10-16T08:39:51.653Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2c/a0/340cc45d71437c2f7e27f13c1d2e335b18bbc7a24fd7d174018500b3c7ba/opentelemetry_instrumentation_logging-0.59b0-py3-none-any.whl", hash = "sha256:fdd4eddbd093fc421df8f7d356ecb15b320a1f3396b56bce5543048a5c457eea", size = 12577, upload-time = "2025-10-16T08:38:58.064Z" },
]
[[package]]
name = "opentelemetry-proto"
version = "1.38.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "protobuf" },
]
sdist = { url = "https://files.pythonhosted.org/packages/51/14/f0c4f0f6371b9cb7f9fa9ee8918bfd59ac7040c7791f1e6da32a1839780d/opentelemetry_proto-1.38.0.tar.gz", hash = "sha256:88b161e89d9d372ce723da289b7da74c3a8354a8e5359992be813942969ed468", size = 46152, upload-time = "2025-10-16T08:36:01.612Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/b6/6a/82b68b14efca5150b2632f3692d627afa76b77378c4999f2648979409528/opentelemetry_proto-1.38.0-py3-none-any.whl", hash = "sha256:b6ebe54d3217c42e45462e2a1ae28c3e2bf2ec5a5645236a490f55f45f1a0a18", size = 72535, upload-time = "2025-10-16T08:35:45.749Z" },
]
[[package]]
name = "opentelemetry-sdk"
version = "1.38.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "opentelemetry-api" },
{ name = "opentelemetry-semantic-conventions" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/85/cb/f0eee1445161faf4c9af3ba7b848cc22a50a3d3e2515051ad8628c35ff80/opentelemetry_sdk-1.38.0.tar.gz", hash = "sha256:93df5d4d871ed09cb4272305be4d996236eedb232253e3ab864c8620f051cebe", size = 171942, upload-time = "2025-10-16T08:36:02.257Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2f/2e/e93777a95d7d9c40d270a371392b6d6f1ff170c2a3cb32d6176741b5b723/opentelemetry_sdk-1.38.0-py3-none-any.whl", hash = "sha256:1c66af6564ecc1553d72d811a01df063ff097cdc82ce188da9951f93b8d10f6b", size = 132349, upload-time = "2025-10-16T08:35:46.995Z" },
]
[[package]]
name = "opentelemetry-semantic-conventions"
version = "0.59b0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "opentelemetry-api" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/40/bc/8b9ad3802cd8ac6583a4eb7de7e5d7db004e89cb7efe7008f9c8a537ee75/opentelemetry_semantic_conventions-0.59b0.tar.gz", hash = "sha256:7a6db3f30d70202d5bf9fa4b69bc866ca6a30437287de6c510fb594878aed6b0", size = 129861, upload-time = "2025-10-16T08:36:03.346Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/24/7d/c88d7b15ba8fe5c6b8f93be50fc11795e9fc05386c44afaf6b76fe191f9b/opentelemetry_semantic_conventions-0.59b0-py3-none-any.whl", hash = "sha256:35d3b8833ef97d614136e253c1da9342b4c3c083bbaf29ce31d572a1c3825eed", size = 207954, upload-time = "2025-10-16T08:35:48.054Z" },
]
[[package]]
name = "opentelemetry-util-http"
version = "0.59b0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/34/f7/13cd081e7851c42520ab0e96efb17ffbd901111a50b8252ec1e240664020/opentelemetry_util_http-0.59b0.tar.gz", hash = "sha256:ae66ee91be31938d832f3b4bc4eb8a911f6eddd38969c4a871b1230db2a0a560", size = 9412, upload-time = "2025-10-16T08:40:11.335Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/20/56/62282d1d4482061360449dacc990c89cad0fc810a2ed937b636300f55023/opentelemetry_util_http-0.59b0-py3-none-any.whl", hash = "sha256:6d036a07563bce87bf521839c0671b507a02a0d39d7ea61b88efa14c6e25355d", size = 7648, upload-time = "2025-10-16T08:39:25.706Z" },
]
[[package]]
name = "packaging"
version = "25.0"
@@ -1327,6 +1527,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/4b/a6/38c8e2f318bf67d338f4d629e93b0b4b9af331f455f0390ea8ce4a099b26/portalocker-3.2.0-py3-none-any.whl", hash = "sha256:3cdc5f565312224bc570c49337bd21428bba0ef363bbcf58b9ef4a9f11779968", size = 22424, upload-time = "2025-06-14T13:20:38.083Z" },
]
[[package]]
name = "prometheus-client"
version = "0.23.1"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/23/53/3edb5d68ecf6b38fcbcc1ad28391117d2a322d9a1a3eff04bfdb184d8c3b/prometheus_client-0.23.1.tar.gz", hash = "sha256:6ae8f9081eaaaf153a2e959d2e6c4f4fb57b12ef76c8c7980202f1e57b48b2ce", size = 80481, upload-time = "2025-09-18T20:47:25.043Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/b8/db/14bafcb4af2139e046d03fd00dea7873e48eafe18b7d2797e73d6681f210/prometheus_client-0.23.1-py3-none-any.whl", hash = "sha256:dd1913e6e76b59cfe44e7a4b83e01afc9873c1bdfd2ed8739f1e76aeca115f99", size = 61145, upload-time = "2025-09-18T20:47:23.875Z" },
]
[[package]]
name = "prompt-toolkit"
version = "3.0.51"
@@ -1655,6 +1864,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/5f/ed/539768cf28c661b5b068d66d96a2f155c4971a5d55684a514c1a0e0dec2f/python_dotenv-1.1.1-py3-none-any.whl", hash = "sha256:31f23644fe2602f88ff55e1f5c79ba497e01224ee7737937930c448e4d0e24dc", size = 20556, upload-time = "2025-06-24T04:21:06.073Z" },
]
[[package]]
name = "python-json-logger"
version = "4.0.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/29/bf/eca6a3d43db1dae7070f70e160ab20b807627ba953663ba07928cdd3dc58/python_json_logger-4.0.0.tar.gz", hash = "sha256:f58e68eb46e1faed27e0f574a55a0455eecd7b8a5b88b85a784519ba3cff047f", size = 17683, upload-time = "2025-10-06T04:15:18.984Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/51/e5/fecf13f06e5e5f67e8837d777d1bc43fac0ed2b77a676804df5c34744727/python_json_logger-4.0.0-py3-none-any.whl", hash = "sha256:af09c9daf6a813aa4cc7180395f50f2a9e5fa056034c9953aec92e381c5ba1e2", size = 15548, upload-time = "2025-10-06T04:15:17.553Z" },
]
[[package]]
name = "python-multipart"
version = "0.0.20"
@@ -2317,3 +2535,12 @@ sdist = { url = "https://files.pythonhosted.org/packages/79/2b/8ae5f59ab852c8fe3
wheels = [
{ url = "https://files.pythonhosted.org/packages/0f/b7/4bac35b4079b76c07d8faddf89467e9891b1610cfe8d03b0ebb5610e4423/x_wr_timezone-2.0.1-py3-none-any.whl", hash = "sha256:e74a53b9f4f7def8138455c240e65e47c224778bce3c024fcd6da2cbe91ca038", size = 11102, upload-time = "2025-02-06T17:10:39.192Z" },
]
[[package]]
name = "zipp"
version = "3.23.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/e3/02/0f2892c661036d50ede074e376733dca2ae7c6eb617489437771209d4180/zipp-3.23.0.tar.gz", hash = "sha256:a07157588a12518c9d4034df3fbbee09c814741a33ff63c05fa29d26a2404166", size = 25547, upload-time = "2025-06-08T17:06:39.4Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2e/54/647ade08bf0db230bfea292f893923872fd20be6ac6f53b2b936ba839d75/zipp-3.23.0-py3-none-any.whl", hash = "sha256:071652d6115ed432f5ce1d34c336c0adfd6a884660d1e9712a256d3d3bd4b14e", size = 10276, upload-time = "2025-06-08T17:06:38.034Z" },
]