nextcloud-mcp-server

Author	SHA1	Message	Date
Chris Coutinho	c126c3ec03	fix: Preserve 3D plot camera and improve documentation This commit addresses PR feedback and fixes plot camera behavior. ## JavaScript Fix - Camera Preservation - Changed plot update strategy from recreating layout to using Plotly.restyle() - Query point visibility now toggles via restyle() which only modifies trace visibility - Camera position/zoom naturally preserved since layout remains untouched - Resolves jumpy plot behavior when toggling "Show Query Point" checkbox Related: nextcloud_mcp_server/auth/static/vector-viz.js:58-73 ## Documentation Improvements - Condensed vector-sync-ui.md from 316 to 94 lines (~70% reduction) - Removed redundant FAQ section (content merged into main sections) - Simplified use cases from 4 detailed sections to 3 focused paragraphs - Streamlined troubleshooting to 3 common issues - Merged technical details into overview section - Retained all essential information while improving readability ## Screenshot Updates Removed old/outdated images (5 files): - rag-workflow-bidirectional-final.png - rag-workflow-prominent-llm.png - rag-workflow-simple-final.png - vector-viz-interface.png - welcome-page.png Replaced with current screenshots (3 files): - vector-viz-document-types-2col.png - Now shows plot + results - vector-viz-chunk-context.png - Centered content view - vector-viz-results.png - Updated results list 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 14:10:53 +01:00
Chris Coutinho	53689d076b	feat: Improve vector visualization with static assets and fixes - Extract CSS and JavaScript into separate static files - Created nextcloud_mcp_server/auth/static/vector-viz.css - Created nextcloud_mcp_server/auth/static/vector-viz.js - Updated templates to reference external assets - Fix vector visualization issues: - Normalize vectors before PCA to match Qdrant's cosine distance - Add zero-norm and NaN detection/handling for large datasets - Enable responsive Plotly sizing (autosize + responsive config) - Widen plot area to full viewport width with minimized margins - Improve visualization accuracy: - Query point now positioned correctly relative to documents - Handles 200+ points without JSON serialization errors - Full-width plot maximizes screen space utilization 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 04:10:44 +01:00
Chris Coutinho	eec923eff5	feat: Replace custom document chunker with LangChain MarkdownTextSplitter Migrates from custom word-based chunking to LangChain's MarkdownTextSplitter for better semantic search quality. This implements the chunking portion of ADR-011. Changes: - Replace custom regex word chunker with MarkdownTextSplitter - Optimized for Markdown content (headers, code blocks, lists) - Convert from word-based (512 words) to character-based (2048 chars) chunking - Maintain backward-compatible ChunkWithPosition interface - Update configuration defaults and validation - Update all unit tests (12/12 passing) Benefits: - Respects markdown structure boundaries - Never breaks code blocks or headers mid-chunk - Preserves semantic coherence within chunks - Expected 20-30% improvement in recall quality - Industry-standard approach (used by production RAG systems) Note: Full reindex required to apply new chunking to existing documents. Current vector database still contains old word-based chunks. Related: ADR-011 (Improving Semantic Search Quality) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-18 12:17:23 +01:00
Chris Coutinho	6cfd7e2729	feat: add configurable fusion algorithms for BM25 hybrid search Added support for two fusion algorithms (RRF and DBSF) to combine dense semantic and sparse BM25 search results, with comprehensive documentation and unit tests. Changes: - Added fusion parameter to nc_semantic_search and nc_semantic_search_answer tools - Updated ADR-014 with detailed comparison of RRF vs DBSF fusion algorithms - Added unit tests for fusion algorithm initialization and validation - Updated search_method in responses to include fusion type (e.g., "bm25_hybrid_rrf") Fusion Algorithms: - RRF (Reciprocal Rank Fusion): Default, rank-based, general-purpose - DBSF (Distribution-Based Score Fusion): Score normalization using statistics RRF is recommended for most use cases due to its robustness and established track record. DBSF may provide better results when retrieval systems have very different score distributions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-17 06:48:43 +01:00
Chris Coutinho	ed33b39062	docs: fix ADR-014 template text and numbering - Remove template instruction text from line 1 - Fix ADR numbering from 007 to 014 to match filename	2025-11-16 12:08:37 +01:00
Chris Coutinho	1504df6fb5	Merge branch 'master' into feature/bedrock	2025-11-16 12:08:23 +01:00
Chris Coutinho	c28fc955ca	Merge origin/master into feature/bm25 Resolved conflicts: - viz_routes.py: Kept bm25's extract_dense_vector() function for robust vector handling - hybrid.py: Removed (bm25 uses native Qdrant RRF fusion instead) - uv.lock: Regenerated after accepting master's dependencies This merge brings in: - RAG evaluation framework (ADR-013) - Performance optimizations (double-fetch elimination) - Migration from asyncio to anyio - OpenTelemetry tracing improvements - Notes app enhancements 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 11:52:40 +01:00
Chris Coutinho	5b484c9226	feat: add unified provider architecture with Amazon Bedrock support Refactored LLM provider infrastructure to support sustainable additions of new providers with both embedding and text generation capabilities. ## Major Changes ### Unified Provider Architecture (ADR-015) - Created `nextcloud_mcp_server/providers/` with unified Provider ABC - Providers now support optional capabilities (embeddings and/or generation) - Auto-detection registry with priority: Bedrock → Ollama → Simple - Backward compatible - existing code continues to work ### New Providers - BedrockProvider: Full Amazon Bedrock integration - Embeddings: Titan Embed, Cohere Embed models - Generation: Claude, Llama, Titan Text, Mistral models - Model-specific request/response handling - AWS credential chain integration - OllamaProvider: Migrated with both capabilities support - AnthropicProvider: Moved from test code to production providers - SimpleProvider: Migrated in-memory fallback provider ### Breaking Changes None - full backward compatibility maintained: - `embedding.get_embedding_service()` still works - RAG evaluation tests updated to use unified providers - All existing tests pass (127 unit tests) ### Testing - Added 9 comprehensive Bedrock unit tests with mocked boto3 - All existing unit tests pass - Type checking (ty) and linting (ruff) pass - Verified backward compatibility ### Documentation - `docs/ADR-015-unified-provider-architecture.md`: Comprehensive ADR - `docs/bedrock-setup.md`: AWS setup guide with IAM permissions - `CLAUDE.md`: Updated with provider architecture section ### Dependencies - Added `boto3>=1.35.0` to dev dependencies (optional) ## Environment Variables ### Bedrock - `AWS_REGION`: AWS region (e.g., "us-east-1") - `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings - `BEDROCK_GENERATION_MODEL`: Model ID for generation - `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`: Optional credentials ### Ollama - `OLLAMA_BASE_URL`: API URL - `OLLAMA_EMBEDDING_MODEL`: Embedding model (default: "nomic-embed-text") - `OLLAMA_GENERATION_MODEL`: Generation model ## AWS Bedrock Permissions Required Minimal IAM policy: ```json { "Effect": "Allow", "Action": ["bedrock:InvokeModel"], "Resource": ["arn:aws:bedrock:::foundation-model/"] } ``` See `docs/bedrock-setup.md` for detailed setup instructions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 11:36:58 +01:00
Chris Coutinho	8799450c7d	Merge pull request #306 from cbcoutinho/rag-evaluation feat: RAG evaluation framework with performance improvements	2025-11-16 11:17:41 +01:00
Chris Coutinho	c4bf077050	feat: Add OpenTelemetry tracing to @instrument_tool decorator Enhances the @instrument_tool decorator to create distributed traces for all MCP tool executions, improving observability and debugging. Changes: - Modified @instrument_tool to wrap tool execution in trace_operation - Added automatic span creation with mcp.tool.* span names - Sanitized tool arguments before adding to span attributes (excludes password, token, secret, api_key, etag, ctx) - Limited argument strings to 500 characters to prevent huge spans - Maintained existing Prometheus metrics functionality - Updated docs/observability.md to reflect correct decorator name - Added comprehensive unit tests All ~50+ MCP tools now emit traces automatically without code changes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 11:16:05 +01:00
Chris Coutinho	2aa82d849c	Merge branch 'feature/bm25'	2025-11-16 07:57:36 +01:00
Chris Coutinho	6fe5596c13	feat: Implement BM25 hybrid search with native Qdrant RRF fusion Replace custom keyword/fuzzy search algorithms with industry-standard BM25 sparse vectors, combined with dense semantic vectors using Qdrant's native Reciprocal Rank Fusion (RRF). This consolidates search architecture and improves relevance for both semantic and keyword queries. Key changes: - Add fastembed dependency for BM25 sparse vector generation - Update Qdrant collection schema to support named vectors (dense + sparse) - Create BM25SparseEmbeddingProvider using FastEmbed's Qdrant/bm25 model - Implement BM25HybridSearchAlgorithm with native Qdrant RRF prefetch - Update document processor to generate both dense and sparse embeddings - Simplify nc_semantic_search() tool to use BM25 hybrid only - Remove legacy keyword.py, fuzzy.py, and custom hybrid.py (736 lines) - Update ADR-014 with implementation notes and test results Benefits: - Consolidated architecture (single Qdrant database) - Native database-level RRF fusion (more efficient) - Industry-standard BM25 (replaces brittle custom keyword search) - Better relevance across semantic and keyword queries - Simplified codebase (-285 net lines) Tests: All 125 tests passing (118 unit, 7 integration) Implements ADR-014: Replace Custom Keyword Search with BM25 Hybrid Search 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 06:59:44 +01:00
Chris Coutinho	f5bc3e3bc3	docs: init ADR	2025-11-16 06:24:25 +01:00
Chris Coutinho	fca8ab0cfd	Merge remote-tracking branch 'origin/master' into rag-evaluation	2025-11-16 00:32:59 +01:00
Chris Coutinho	c272ddd82d	feat: implement RAG evaluation framework with CLI tooling - Add ADR-013 documenting RAG evaluation architecture - Implement two-part evaluation: Context Recall (retrieval) + Answer Correctness (generation) - Create Click CLI for ground truth generation and corpus upload - Add pytest fixtures and tests for retrieval/generation quality - Use BeIR/nfcorpus dataset with 5 selected test queries - Support Ollama and Anthropic LLM providers - Generate synthetic ground truth answers offline - Add comprehensive documentation in tests/rag_evaluation/README.md The framework separates one-time setup (generate/upload) from test execution, making tests much faster (~6-12 min vs ~15-25 min per run). Tests are manual only (not in CI) and require external LLM access. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-15 23:11:21 +01:00
Chris Coutinho	56bd85c0f7	docs: Emphasize server-side processing in ADR-012 viz pane Updates ADR-012 to clarify that all search and filtering operations must happen server-side, not in the browser. Key changes: - Enhanced viz pane data flow showing server-side processing - Added performance benefits section (384x bandwidth reduction) - Detailed server-side filtering approach: * Query execution via search/algorithms.py * User ID filtering (multi-tenant security) * Document type filtering * PCA reduction (768-dim → 2D) on server * Only 2D coordinates + metadata sent to client - Updated Phase 3 implementation plan: * Remove ALL client-side search logic * Implement /app/vector-viz server endpoint * htmx form submission for queries * Performance optimizations (caching, streaming) This ensures: - Minimal bandwidth usage (only 2 floats per doc vs 768) - Client handles only visualization, not computation - Can visualize 10,000+ documents without client lag - Raw vectors never leave server (security) - Same search logic as MCP tool (consistency) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-15 00:02:54 +01:00
Chris Coutinho	5e67277049	docs: Add architecture diagrams and viz pane UI to ADR-012 Enhances ADR-012 with detailed architecture visualization and UI mockup for the vector visualization pane. Added sections: - Architecture diagram showing MCP tool and viz pane integration - Data flow diagrams for both MCP requests and viz pane interactions - Detailed UI mockup with ASCII art showing: * Search configuration controls * Algorithm selector with weight sliders * Interactive 2D scatter plot (Plotly.js) * Results panel with scores * Performance comparison table - Technology stack details (htmx, Alpine.js, Plotly.js, Tailwind CSS) The diagrams illustrate how the viz pane and MCP tool share the same search algorithm implementations from search/algorithms.py, ensuring consistency between user testing interface and programmatic API. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-15 00:00:40 +01:00
Chris Coutinho	66a7109130	docs: Add ADR-012 for unified multi-algorithm search Proposes unified search architecture with client-configurable algorithm selection and weighting. Addresses the need for flexible search options beyond pure semantic search. Key features: - Four algorithms: semantic, keyword, fuzzy, hybrid - Client-configurable weights for hybrid search - Shared implementation between viz pane and MCP tools - Reciprocal Rank Fusion (RRF) for result combination - Backward compatible with existing nc_semantic_search() Implements designs from: - ADR-003: Hybrid search with RRF (previously unimplemented) - ADR-001: Token-based keyword search (previously unimplemented) Supersedes ADR-011's placeholder for "ADR-013: Hybrid Search" 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-14 23:56:09 +01:00
Chris Coutinho	73b3d80026	Merge pull request #294 from cbcoutinho/feature/app_api docs: Add ADR-011 for hybrid OAuth + AppAPI deployment architecture	2025-11-13 23:43:25 +01:00
Chris Coutinho	26099d643d	docs: Update ADR-011 to rejected status with Context Agent validation After comprehensive research, the hybrid OAuth + AppAPI architecture is NOT being implemented due to fundamental architectural incompatibilities. Key updates: - Status: Proposed → Not Planned - Added validation from Nextcloud Context Agent project - Context Agent (official NC ExApp with MCP) faces IDENTICAL limitations - Proves constraints are architectural, not implementation-specific Context Agent findings: - ExApp with MCP server endpoint (~28 tools exposed) - Uses Task Processing API for confirmations (NOT MCP elicitation) - Works around AppAPI proxy limitations by changing protocol - MCP endpoint is secondary feature with documented constraints - Primary use: In-app Assistant integration, not external MCP clients Critical features impossible through AppAPI proxy: - ❌ MCP sampling (eliminates RAG/LLM features) - ❌ MCP elicitation (user prompts) - ❌ Real-time progress updates - ❌ Bidirectional streaming - Validated by Context Agent facing same limitations Decision rationale: - MCP requires multi-turn nested interactions - AppAPI provides stateless request/response proxy only - No implementation effort can bridge this fundamental gap - Would require complete AppAPI redesign (WebSocket, message routing) - Even official Nextcloud projects work around these limitations Alternative considered for future: - Register as Task Processing provider (different product) - Use Nextcloud Assistant UI (not external MCP clients) - Accept different capabilities (no sampling, custom flows) OAuth mode remains sole solution for external MCP client integration. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-13 23:30:14 +01:00
Chris Coutinho	c3023d2cc3	feat: Complete Phase 5 - Instrument all 93 MCP tools Applied @instrument_tool decorator to all 86 remaining tools across 8 server files. Instrumented files: - calendar.py: 16 tools - contacts.py: 7 tools - deck.py: 25 tools - webdav.py: 11 tools - tables.py: 6 tools - sharing.py: 5 tools - cookbook.py: 13 tools - semantic.py: 3 tools Total: 93 tools instrumented (7 in notes.py + 86 in other files) These metrics populate: - MCP Tool Calls panel (by tool name and status) - MCP Tool Duration panel (histogram) - MCP Tool Errors panel (by tool name and error type) This completes PR #295 - All 5 phases of metrics instrumentation done: ✅ Phase 1: Queue size metrics (2 locations) ✅ Phase 2: Health checks (1 location) ✅ Phase 3: Database operations (3 methods) ✅ Phase 4: OAuth token metrics (3 locations) ✅ Phase 5: MCP tool metrics (93 tools) All 34 dashboard panels now have data sources.	2025-11-13 16:58:44 +01:00
Chris Coutinho	ff3123a190	docs: Add ADR-011 for hybrid OAuth + AppAPI deployment architecture This ADR documents the architectural decision to support both OAuth and AppAPI (ExApp) deployment modes in a single codebase with 90%+ code sharing. Key additions: - Comprehensive analysis of AppAPI limitations and challenges - Feature parity matrix comparing OAuth vs AppAPI modes - Resolution of critical open questions via research: * Non-browser client authentication (app passwords/OAuth) * Streaming transport compatibility (buffered, not real-time) * Callbacks/webhooks (MCP notifications not possible in AppAPI) - Detailed implementation plan with 4 phases (10 days) - Mode-aware architecture with abstraction layer Critical findings: - AppAPI mode does NOT support MCP sampling (RAG features) - No real-time progress updates (use Nextcloud notifications) - Buffered streaming only (Streamable HTTP works, WebSocket doesn't) - Requires app password support in AppAPI proxy Deployment mode selection: - OAuth: Multi-tenant, external clients, sampling/RAG, real-time updates - AppAPI: Single-tenant, simplified install, native UI, admin-controlled Related to investigation of ~/Software/app_api/ and ~/Software/nc_py_api/ for AppAPI integration patterns. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-13 13:10:21 +01:00
Chris Coutinho	d86a185e04	refactor: move webapp from /user/page to /app Simplified the webapp routing structure by consolidating the admin UI to a single clean endpoint. Changes: - Moved webapp from /user/page to /app (root of mount) - Removed /user JSON endpoint (no longer needed) - Updated mount point from /user to /app in app.py - Updated all route path checks (3 locations) - Updated OAuth redirects to point to /app - Updated all HTMX endpoint references - Updated documentation (ADR-007, CHANGELOG) - Added redirect from /app to /app/ for trailing slash handling New Route Structure: - /app - Main webapp (HTML UI with tabs) - /app/revoke - Revoke background access - /app/webhooks - Webhook management UI - /app/webhooks/enable/{preset_id} - Enable webhook preset - /app/webhooks/disable/{preset_id} - Disable webhook preset Breaking Change: Existing bookmarks to /user or /user/page will no longer work. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-11 20:53:43 +01:00
Chris Coutinho	f4759e424d	feat: add webhook management UI and BeforeNodeDeletedEvent support Added comprehensive webhook management capabilities including: Webhook Client & API: - Added WebhooksClient for Nextcloud webhooks API integration - Create, list, update, and delete webhooks programmatically - Support for event filters in webhook registration Webhook Presets: - Added preset system for common webhook configurations - notes_sync: BeforeNodeDeletedEvent for Notes file operations - calendar_sync: Calendar events (create, update, delete) - deck_sync: Deck card operations - files_sync: File system changes - forms_sync: Form submissions (conditional) - Filter presets by installed apps Admin UI: - Added multi-pane app view with tabs (User Info, Vector Sync, Webhooks) - Webhooks tab for admin users only - Enable/disable preset webhooks via UI - View currently registered webhooks - Uses htmx for dynamic loading and Alpine.js for tab state - Admin permission checking via OCS API CLI Improvements: - Refactored CLI to separate module (cli.py) - Updated entry point in pyproject.toml BeforeNodeDeletedEvent Fix: - Updated ADR-010 to document NodeDeletedEvent issue - BeforeNodeDeletedEvent includes node.id before deletion - NodeDeletedEvent lacks node.id (file already deleted) - Implemented per Nextcloud maintainer recommendation Testing: - Added comprehensive webhook client tests - Added webhook preset filtering tests - Added admin permission tests Configuration: - Updated docker-compose.yml Qdrant settings 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-11 20:35:08 +01:00
Chris Coutinho	b58e7238ae	feat: validate Nextcloud webhook schemas and document findings Manual testing of Nextcloud webhook_listeners app to validate webhook payloads against ADR-010 expected schemas and document implementation requirements for webhook-based vector synchronization. ## Changes - Add test webhook endpoint at /webhooks/nextcloud in app.py - Captures and logs webhook payloads for analysis - Returns 200 OK immediately for webhook delivery confirmation - Create webhook-testing-findings.md with comprehensive test results - Captured payloads for 5/6 webhook event types - Critical findings: missing node.id in deletions, type mismatches - Implementation recommendations with code examples - Update ADR-010 with Appendix A: Manual Webhook Testing Results - Document actual vs expected webhook behavior - Update event mapping table with tested webhook status - Add 6 specific implementation recommendations - Include testing implications for future development ## Testing Results ✅ NodeCreatedEvent - fires correctly, includes node.id (integer) ✅ NodeWrittenEvent - fires correctly, includes node.id (integer) ✅ NodeDeletedEvent - fires but missing node.id field (path only) ✅ CalendarObjectCreatedEvent - fires correctly with full iCal ✅ CalendarObjectUpdatedEvent - fires correctly with full iCal ❌ CalendarObjectDeletedEvent - does not fire (potential NC bug) ## Key Findings 1. NodeDeletedEvent missing node.id field - requires path-based fallback 2. node.id returns integer not string - needs casting for consistency 3. Multiple webhooks fire per operation - needs deduplication logic 4. Calendar deletion webhooks don't fire - reported as issue #53497 5. Calendar webhooks include full iCal content - enables rich parsing ## GitHub Issues - Created issue #56371: NodeDeletedEvent missing node.id field - Commented on issue #53497: CalendarObjectDeletedEvent not firing Closes #283 --- _This commit was generated with the help of AI, and reviewed by a Human_	2025-11-11 12:13:20 +01:00
Chris Coutinho	a6e5f3d8ff	refactor: simplify OpenTelemetry tracing configuration Simplifies the OpenTelemetry tracing setup by removing the redundant OTEL_ENABLED flag and using the presence of OTEL_EXPORTER_OTLP_ENDPOINT to determine if tracing should be enabled. This follows the standard OpenTelemetry environment variable conventions more closely. Changes: - Remove OTEL_ENABLED/tracing_enabled flag in favor of checking if OTEL_EXPORTER_OTLP_ENDPOINT is set - Add OTEL_EXPORTER_VERIFY_SSL configuration option for OTLP endpoints with self-signed certificates (defaults to false for development) - Move HTTPXClientInstrumentor initialization to module level to ensure httpx calls are traced across all Nextcloud API requests - Add tracing spans to vector sync operations (scan_user_documents) - Fix authorization header logging to only warn about missing headers in OAuth mode (BasicAuth mode doesn't use Authorization headers) - Update observability documentation to reflect simplified configuration - Refactor Dockerfile to use --no-editable flag for uv sync Breaking changes: - OTEL_ENABLED environment variable is removed - Tracing is now automatically enabled when OTEL_EXPORTER_OTLP_ENDPOINT is set Migration guide: - Remove OTEL_ENABLED=true from environment configuration - Tracing will be enabled automatically if OTEL_EXPORTER_OTLP_ENDPOINT is configured 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-10 22:48:37 +01:00
Chris Coutinho	3a41860d27	docs: Add ADR-010 for webhook-based vector sync Add architecture decision record for integrating Nextcloud webhooks into the vector database synchronization system. Key features: - Webhook endpoint at /webhooks/nextcloud receives push notifications - Complements existing polling (ADR-007) without replacing it - Optional authentication via WEBHOOK_SECRET - Simple architecture: webhooks are just another DocumentTask producer - Administrators can reduce polling frequency when webhooks are configured Benefits: - Reduced latency: seconds to minutes instead of up to 1 hour - Lower API load: ~95% reduction when polling frequency is increased - Better scalability: only process changed documents - No changes required to scanner or processor components 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-10 05:28:36 +01:00
Chris Coutinho	cb39b3fca4	feat(vector): Add configurable chunk size and overlap for document embedding Enable users to tune document chunking parameters to match their embedding model and content type by adding DOCUMENT_CHUNK_SIZE and DOCUMENT_CHUNK_OVERLAP environment variables. - config.py: Added `document_chunk_size` (default: 512) and `document_chunk_overlap` (default: 50) configuration fields with validation: - Ensures overlap < chunk_size - Warns if chunk_size < 100 words - Prevents negative overlap values - processor.py: Updated DocumentChunker instantiation to use config settings instead of hardcoded values (line 174-177) - tests/unit/test_config.py: Added TestChunkConfigValidation class with 9 tests covering: - Default values - Valid configurations - Validation errors (overlap >= chunk_size, negative overlap) - Warning for small chunk sizes - Environment variable loading - docs/configuration.md: Added comprehensive "Document Chunking Configuration" section with: - Chunk size selection guidance (256-384 vs 512 vs 768-1024 words) - Overlap recommendations (10-20% of chunk size) - Configuration examples for different use cases - Added env vars to reference table - docs/semantic-search-architecture.md: Added "Document Chunking Strategy" section with: - Chunking process explanation - Example showing sliding window behavior - Search behavior with chunks - Tuning recommendations - env.sample: Added complete "Semantic Search & Vector Sync Configuration" section with: - Vector sync settings - Qdrant configuration (3 modes) - Ollama embedding service - Document chunking configuration - docker-compose.yml: Added commented examples for DOCUMENT_CHUNK_SIZE and DOCUMENT_CHUNK_OVERLAP with usage notes \`\`\`bash DOCUMENT_CHUNK_SIZE=512 DOCUMENT_CHUNK_OVERLAP=50 \`\`\` 1. \`overlap\` must be less than \`chunk_size\` 2. \`overlap\` cannot be negative 3. Warning issued if \`chunk_size\` < 100 words Precise matching (small notes, specific queries): \`\`\`bash DOCUMENT_CHUNK_SIZE=256 DOCUMENT_CHUNK_OVERLAP=25 \`\`\` Balanced (default, general purpose): \`\`\`bash DOCUMENT_CHUNK_SIZE=512 DOCUMENT_CHUNK_OVERLAP=50 \`\`\` Contextual (long documents, broader topics): \`\`\`bash DOCUMENT_CHUNK_SIZE=1024 DOCUMENT_CHUNK_OVERLAP=100 \`\`\` ✅ User control - Tune chunking to match embedding model capabilities ✅ Experimentation - Test different chunk sizes for optimal results ✅ Model alignment - Match chunk size to embedding context window ✅ Backward compatible - Defaults maintain existing behavior ✅ Well validated - Comprehensive tests prevent misconfiguration All 22 config validation tests pass (9 new tests for chunking): - Default values work correctly - Validation prevents invalid configurations - Environment variables load properly - Warning system works as expected With configurable chunk sizes, users can now experiment with different Ollama embedding models and tune chunk parameters for optimal semantic search quality. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-10 02:47:57 +01:00
Chris Coutinho	e575c8e57b	feat(vector): Support multiple embedding models with auto-generated collection names This PR enables safe switching between embedding models and multi-server deployments by implementing auto-generated Qdrant collection names based on deployment ID and model name. ## Problem Previously, all deployments used a single hardcoded collection name "nextcloud_content", which caused two critical issues: 1. Dimension mismatches when switching models: Changing OLLAMA_EMBEDDING_MODEL (e.g., nomic-embed-text at 768D → all-minilm at 384D) would cause runtime errors as vectors couldn't be inserted into a collection with incompatible dimensions. 2. Collection collisions in multi-server setups: Multiple MCP servers sharing a single Qdrant instance would overwrite each other's data, making horizontal scaling impossible. ## Solution ### Auto-Generated Collection Naming Collections are now automatically named using the pattern: \`{deployment-id}-{model-name}\` Deployment ID: Uses \`OTEL_SERVICE_NAME\` if configured (and not default value), otherwise falls back to \`hostname\` for simple Docker deployments. Model Name: From \`OLLAMA_EMBEDDING_MODEL\` with path separators sanitized. Examples: - \`my-mcp-server-nomic-embed-text\` (with OTEL_SERVICE_NAME=my-mcp-server) - \`mcp-container-all-minilm\` (simple Docker, hostname=mcp-container) Override: Users can still set \`QDRANT_COLLECTION\` explicitly to bypass auto-generation for backward compatibility. ### Dimension Validation Added startup validation that checks collection dimensions match the embedding service. If a mismatch is detected, the server fails fast with a clear error message explaining: - Expected vs actual dimensions - Likely cause (model change) - Solutions (delete collection, use different name, or revert model) ### Improved Sampling Error Handling Enhanced MCP sampling rejection handling to treat user rejections as normal behavior rather than errors: - User rejections ("rejected", "denied") → INFO log, no traceback - Unsupported clients → INFO log, no traceback - Other MCP errors → WARNING log, no traceback - Unexpected errors → ERROR log WITH traceback This aligns with the MCP specification where clients SHOULD prompt users for approval/denial of sampling requests. ## Changes ### Core Implementation - nextcloud_mcp_server/config.py: Added \`get_collection_name()\` method with deployment ID detection and model name sanitization - nextcloud_mcp_server/vector/qdrant_client.py: Dimension validation on collection open with helpful error messages - nextcloud_mcp_server/vector/{scanner,processor}.py: Updated to use \`get_collection_name()\` - nextcloud_mcp_server/auth/userinfo_routes.py: Vector sync status uses \`get_collection_name()\` - nextcloud_mcp_server/server/semantic.py: - Updated semantic search tools to use \`get_collection_name()\` - Improved sampling rejection error handling (McpError vs Exception) ### Documentation - docs/semantic-search-architecture.md: New comprehensive architecture document (557 lines) covering background sync, semantic search flow, RAG implementation, and deployment modes - docs/configuration.md: Added detailed "Qdrant Collection Naming" section with examples and multi-server deployment guidance - docker-compose.yml: Added comments explaining collection naming behavior - README.md: Updated semantic search descriptions to clarify experimental status, Notes-only support, and infrastructure requirements ## Migration Guide For existing single-server deployments: Option 1 (Recommended): Use explicit collection name for continuity \`\`\`bash QDRANT_COLLECTION=nextcloud_content # Keep existing collection \`\`\` Option 2: Allow auto-generation and re-embed \`\`\`bash # Remove QDRANT_COLLECTION override # New collection will be created based on deployment ID + model # Requires re-embedding all documents (may take time) \`\`\` For new multi-server deployments: Set unique OTEL service names per server: \`\`\`bash # Server 1 OTEL_SERVICE_NAME=mcp-prod OLLAMA_EMBEDDING_MODEL=nomic-embed-text # → Collection: "mcp-prod-nomic-embed-text" # Server 2 OTEL_SERVICE_NAME=mcp-staging OLLAMA_EMBEDDING_MODEL=nomic-embed-text # → Collection: "mcp-staging-nomic-embed-text" \`\`\` ## Benefits ✅ Safe model switching: Each model gets its own collection, preventing dimension mismatch errors ✅ Multi-server support: Multiple MCP servers can share one Qdrant instance without conflicts ✅ Clear ownership: Collection names show which deployment and model owns the data ✅ Better error messages: Dimension validation provides actionable guidance ✅ Backward compatible: Existing deployments can continue using \`QDRANT_COLLECTION\` override ## Testing Validated with: - Single-server deployments (default hostname-based naming) - Multi-server deployments (OTEL service name-based naming) - Model switching scenarios (dimension validation) - Collection override scenarios (backward compatibility) Next steps: Testing various Ollama embedding models to investigate optimal chunk sizes and performance characteristics. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-10 01:18:30 +01:00
Chris Coutinho	578de4d7d6	feat(observability): Add comprehensive monitoring with Prometheus and OpenTelemetry - Add Prometheus metrics for HTTP, MCP tools, Nextcloud API, OAuth, vector sync, and DB operations - Add OpenTelemetry distributed tracing with OTLP export - Add structured JSON logging with trace context correlation - Add ObservabilityMiddleware for automatic HTTP instrumentation - Add app_name attribute to all client classes for per-app metrics - Add configuration for metrics, tracing, and logging via environment variables - Add documentation in docs/observability.md - Fix graceful degradation when tracing is disabled (default state) - Fix uvicorn logging configuration to use observability formatters 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 08:54:04 +01:00
Chris Coutinho	857d8f2152	feat: add Qdrant local mode support with in-memory and persistent storage Adds flexible Qdrant deployment modes to reduce infrastructure requirements for local development and smaller deployments: Configuration Changes: - Add QDRANT_LOCATION environment variable (mutually exclusive with QDRANT_URL) - Three modes: network (URL), in-memory (:memory:, default), persistent (file path) - Settings dataclass validation via __post_init__ ensures mutual exclusivity - API key warning when set in local mode (ignored, only for network mode) Client Initialization: - Auto-detect mode: network (url + api_key) vs local (:memory: or path=) - In-memory: AsyncQdrantClient(":memory:") - zero config default - Persistent: AsyncQdrantClient(path="/app/data/qdrant") - file storage - Network: AsyncQdrantClient(url, api_key) - production mode Docker Compose Updates: - Qdrant service moved to optional profile (--profile qdrant) - MCP service uses QDRANT_LOCATION=:memory: by default - Added mcp-data volume for persistent storage (/app/data) - No hard dependency on qdrant service Documentation: - Comprehensive configuration guide in docs/configuration.md - All three modes documented with pros/cons - Docker Compose examples for each mode - Environment variable reference table Tests: - 13 new config validation tests (mutual exclusivity, defaults, warnings) - Persistent mode integration test (create, close, reopen, verify persistence) - All 82 unit tests + 5 smoke tests pass Breaking Change: - Default changed from QDRANT_URL=http://qdrant:6333 to QDRANT_LOCATION=:memory: - Simplifies local development (no external service needed) - Production deployments: explicitly set QDRANT_URL or QDRANT_LOCATION Related: ADR-007 background vector sync implementation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 07:07:07 +01:00
Chris Coutinho	31799ffd9a	docs: remove VECTOR_SYNC_ENABLED_APPS env var, use per-user database settings Replace static VECTOR_SYNC_ENABLED_APPS environment variable with per-user database storage for which apps to index. This allows each user to control their own indexing preferences (e.g., enable notes and calendar but not deck or files). Rationale: - Nextcloud doesn't support granular OAuth scopes at the app level - Per-user settings provide flexibility for multi-user deployments - Users control app enablement via nc_enable_vector_sync MCP tool - Aligns with OAuth architecture where users manage their own settings Changes: - ADR-007: Remove VECTOR_SYNC_ENABLED_APPS from configuration section - ADR-007: Update scanner implementation to read from database - ADR-007: Add explanation of per-user app enablement mechanism - ADR-007: Clarify that nc_enable_vector_sync tool manages this setting 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 05:11:56 +01:00
Chris Coutinho	5cc598e1b1	docs: refactor semantic search from notes-specific to multi-app architecture Update ADRs to reflect that vector database and semantic search support multiple Nextcloud apps (notes, calendar, deck, files, contacts) rather than being notes-specific. Introduce semantic:read/write OAuth scopes to replace app-specific scope requirements for cross-app search. Changes: - ADR-007: Add plugin architecture (DocumentScanner, DocumentProcessor, DocumentVerifier) for multi-app vector sync - ADR-008: Rename tools from nc_notes_semantic_* to nc_semantic_*, update scope from notes:read to semantic:read - ADR-009: NEW - Document decision to use generic semantic:read scope with dual-phase authorization instead of requiring all app scopes - oauth-architecture.md: Add semantic:read/write scope documentation - README.md: Move semantic search to dedicated section separate from Notes This is a breaking change that correctly positions semantic search as a cross-app capability before broader adoption. Existing deployments will need to re-authenticate with the new semantic:read scope. Relates to user request to decouple vector database from notes-only model and establish proper OAuth scope boundaries for multi-app semantic search. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 04:47:20 +01:00
Chris Coutinho	bb5d4f464f	feat: implement MCP sampling for semantic search RAG (ADR-008) Add nc_notes_semantic_search_answer tool that combines semantic search with MCP sampling to generate natural language answers from retrieved Nextcloud Notes. This enables Retrieval-Augmented Generation (RAG) patterns without requiring a server-side LLM. Key features: - Client-side LLM generation via ctx.session.create_message() - Graceful fallback when sampling unavailable - Proper source citations in generated answers - No results optimization (skips sampling when no docs found) - Comprehensive unit and integration tests Implementation details: - SamplingSearchResponse model with generated_answer and sources - Fixed prompt template with document context and citation instructions - Model preferences hint Claude Sonnet for balanced performance - Falls back to returning documents without answer on sampling failure Updates: - Add ADR-008 documenting sampling architecture decision - Add MCP sampling pattern guidance to CLAUDE.md - Update README.md and docs/notes.md (7 → 9 tools) - Add 4 unit tests and 6 integration tests 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 01:00:18 +01:00
Chris Coutinho	dc93da2ea0	docs: add ADR-007 for background vector database synchronization Add comprehensive ADR-007 documenting background vector database synchronization architecture using anyio TaskGroups for in-process concurrency. This supersedes ADR-003's conceptual background worker. Key decisions: - In-process architecture using anyio TaskGroups (not Celery) - Scanner task runs hourly, detects changes via timestamp comparison - In-memory asyncio.Queue for pending documents - Pool of 3 concurrent processor tasks for I/O-bound embedding workloads - Qdrant metadata as single source of truth for indexing state - Simple user controls: enable/disable with status visibility Benefits: - Single container deployment (was 3: mcp, celery-worker, celery-beat) - No distributed task queue infrastructure - Shared process state (no volume coordination) - Sufficient throughput for I/O-bound embedding APIs - Simpler debugging and deployment Update ADR-003 status to "Superseded by ADR-007" with reference link. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-08 20:32:49 +01:00
Chris Coutinho	0c9a9ea24d	fix: Consolidate OAuth callbacks and implement PKCE for all flows This PR fixes multiple OAuth-related issues: ## Unified OAuth Callback - Consolidated `/oauth/callback-nextcloud` and `/oauth/login-callback` into single `/oauth/callback` endpoint - Flow type determined by session lookup via state parameter (no query params in redirect_uri) - Fixes redirect_uri validation issues with IdPs requiring exact match - Legacy endpoints kept as aliases for backwards compatibility ## PKCE Implementation - Implemented PKCE (RFC 7636) for Flow 2 (resource provisioning) - Generate code_verifier and code_challenge - Store code_verifier in session storage - Retrieve and use in token exchange - Fixed PKCE for browser login (integrated mode) - Previously only worked for external IdP (Keycloak) - Now works for both Nextcloud OIDC and external IdP ## Login Elicitation Fixes (ADR-006) - Fixed elicitation URL to route through MCP server endpoint - Changed from direct Nextcloud URL to `/oauth/authorize-nextcloud` - Ensures PKCE is properly handled by server - Fixed login detection after OAuth flow completes - Look up refresh token by state parameter instead of user_id - Works even when Flow 1 token not present - Added `get_refresh_token_by_provisioning_client_id()` method ## Session Authentication - Fixed `/user/page` redirect loop - Shared oauth_context with mounted browser_app - SessionAuthBackend can now validate sessions correctly ## Tests - Added comprehensive login elicitation test suite - Updated scope authorization test expectations - All 43 OAuth tests passing ## Files Changed - `app.py`: Shared oauth_context, unified callback route - `oauth_routes.py`: Unified callback, PKCE for Flow 2 - `browser_oauth_routes.py`: PKCE for integrated mode - `oauth_tools.py`: Fixed elicitation URL generation - `refresh_token_storage.py`: Added lookup by provisioning_client_id - `test_login_elicitation.py`: New test suite 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-07 21:08:55 +01:00
Chris Coutinho	659087e4c7	fix: Implement proper OAuth resource parameters and PRM-based discovery This commit completes the OAuth audience validation implementation per RFC 7519, RFC 8707 (Resource Indicators), and RFC 9728 (Protected Resource Metadata). ## Key Changes ### OAuth Resource Parameters (RFC 8707) - Add `resource` parameter to Flow 1 (MCP client auth) with MCP server audience - Add `resource` parameter to Flow 2 (Nextcloud access) with Nextcloud audience - Add `nextcloud_resource_uri` to oauth_context configuration - Fix undefined variable error in starlette_lifespan ### PRM-Based Resource Discovery (RFC 9728) - Update tests to fetch resource identifier from PRM endpoint - Add fallback to hardcoded value if PRM fetch fails - Demonstrate correct OAuth client implementation pattern ### ADR-005 Documentation Updates - Update to reflect simplified RFC 7519 compliant implementation - Document that MCP validates only its own audience (not Nextcloud's) - Add section on OAuth resource parameters and PRM discovery - Update implementation checklist to show completed items - Mark status as "Implemented" with update date ## Implementation Details The solution follows RFC 7519 Section 4.1.3: resource servers validate only their own presence in the audience claim. This simplifies the logic while maintaining security: - MCP server validates MCP audience only - Nextcloud independently validates its own audience - No dual validation required at MCP layer - Token reuse is allowed per RFC 8707 for multi-audience tokens ## Test Results ✅ test_mcp_oauth_server_connection - PASSED ✅ test_deck_board_view_permissions - PASSED ✅ test_prm_endpoint - PASSED All OAuth flows now properly specify target resources, resulting in correct audience validation throughout the system. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-05 23:19:03 +01:00
Chris Coutinho	7d9ab5559c	fix: Simplify token verifier to be RFC 7519 compliant Per RFC 7519 Section 4.1.3, resource servers should only validate their own presence in the audience claim, not check for other resource servers. Changes: - UnifiedTokenVerifier now validates only MCP audience (not Nextcloud's) - Nextcloud independently validates its own audience when receiving API calls - This is NOT token passthrough (we validate tokens before use) - This IS token reuse which is explicitly allowed by RFC 8707 Updates: - Simplified _validate_multi_audience() to follow OAuth spec - Updated docstrings and comments to clarify RFC 7519 compliance - Fixed unit tests that expected dual-audience validation - Updated ADR-005 to document the correct OAuth interpretation - All tests pass: unit (65), smoke (5), OAuth integration This makes the implementation simpler, more maintainable, and properly aligned with OAuth 2.0 specifications while maintaining security. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-05 21:44:04 +01:00
Chris Coutinho	28c2debf3e	docs: Add ADR-005 for unified token verifier architecture This ADR addresses the critical token passthrough vulnerability identified in Issue #261 by proposing a unified token verifier that eliminates the security issue while maintaining flexibility. Key changes: - Consolidates two non-compliant verifiers into single UnifiedTokenVerifier - Implements two-layer architecture (verification + exchange) - Supports multi-audience mode (default) and token exchange mode (opt-in) - Removes all token passthrough paths to comply with MCP security spec - Works within python-sdk constraints using proper separation of concerns The solution provides: - Single source of truth for token validation - MCP specification compliance - Minimal performance impact (1-2% of LLM request time) - Clear migration path for existing deployments BREAKING CHANGE: All OAuth deployments must be reconfigured to specify resource URIs (NEXTCLOUD_MCP_SERVER_URL and NEXTCLOUD_RESOURCE_URI) and choose between multi-audience or token exchange mode. Related: #261 Supersedes: Token passthrough mode in ADR-004 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-05 18:34:43 +01:00
Chris Coutinho	d14f2f666d	feat: Add userinfo route/page	2025-11-04 00:03:24 +01:00
Chris Coutinho	71e77e95bc	refactor: integrate token exchange into unified get_client() pattern Resolves the token exchange implementation gap where get_session_client() was implemented but never used by tools. Unifies token acquisition into a single async get_client() method that handles both pass-through and token exchange modes transparently. Core Changes: - Make get_client() async and merge token exchange logic into it - Remove scopes parameter from token exchange (Nextcloud doesn't support OAuth scopes) - Update all 8 tool modules to use await get_client(ctx) - Fix provisioning decorator to skip checks in BasicAuth mode Token Acquisition Modes: 1. BasicAuth: Returns shared client (no token operations) 2. OAuth pass-through (default): Verifies and passes Flow 1 token to Nextcloud 3. OAuth token exchange (opt-in): Exchanges Flow 1 token for ephemeral token via RFC 8693 Key Architectural Clarifications: - Progressive Consent (Flow 1/2) = Authorization architecture - Token Exchange = Token acquisition pattern during tool execution - Refresh tokens from Flow 2 are NEVER used for tool calls (only background jobs) - Nextcloud scopes are "soft-scopes" enforced by MCP server, not IdP Documentation Updates: - ADR-004: Added comprehensive token acquisition patterns section - CRITICAL-TOKEN-EXCHANGE-PATTERN.md: Updated to reflect implementation status - CLAUDE.md: Updated architectural patterns with async get_client() Testing: - All 36 unit tests passing - All 4 smoke tests passing (BasicAuth mode) - Linting issues fixed (ruff) Configuration: ENABLE_TOKEN_EXCHANGE=false (default) - pass-through mode ENABLE_TOKEN_EXCHANGE=true (opt-in) - token exchange mode 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 20:33:56 +01:00
Chris Coutinho	027fc0b2d6	docs: Add critical token exchange pattern documentation Documents the architectural flaw in current implementation where session tokens and background tokens are not properly separated. Key issues identified: - Session tokens should be exchanged on-demand (RFC 8693) - Background tokens should use separate refresh token grant - Current implementation reuses refresh tokens incorrectly - No separation between foreground and background operations This is a P0 blocker that must be fixed before production use. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 20:33:55 +01:00
Chris Coutinho	9d514f52b0	docs: Refactor ADR-004 to Progressive Consent architecture with dual OAuth flows Replace hybrid flow model with true progressive consent where MCP client authenticates directly to IdP (Flow 1) and server requests separate explicit provisioning for Nextcloud access (Flow 2). This separates client authentication from resource authorization, uses distinct client_id for each flow, and keeps server stateless by default until user explicitly grants offline access via provision_nextcloud_access tool. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 02:55:27 +01:00
Chris Coutinho	0d45120470	docs: Update ADR-004 with progressive consent architecture Refactor ADR-004 to document the proper OAuth architecture where MCP clients are registered at the IdP level (not with MCP server) and use a progressive consent pattern with dual OAuth flows. ## Key Changes ### MCP Client Registration - Document that MCP clients (Claude Desktop, etc.) register at IdP level - Show DCR and pre-registration options - Clarify client validation happens against IdP registry ### Progressive Consent Architecture Replace single "Hybrid Flow" with three-phase progressive consent: Phase 1: MCP Client Authentication (Always) - MCP client uses own client_id (e.g., "claude-desktop") - User consents to "Claude Desktop accessing MCP Server" - MCP server validates client exists at IdP - Stores MCP client access token Phase 2: Nextcloud Consent (Conditional) - Only if MCP server doesn't have refresh token for user - MCP server uses own client_id ("nextcloud-mcp-server") - User consents to "MCP Server accessing Nextcloud offline" - MCP server stores master refresh token - SSO: If already authenticated, only consent needed Phase 3: Token Exchange (Standard PKCE) - Client exchanges MCP authorization code - Validates PKCE code_verifier - Returns access token (aud: mcp-server) - Client never sees master refresh token ### Implementation Status Section - Document current implementation as "simplified hybrid flow" - List what's implemented vs what needs refactoring - Clarify current tests use simplified version - Note progressive consent is target architecture ## Benefits of Progressive Consent ✅ Standards-compliant: Proper OAuth clients at IdP level ✅ Secure: Client validation against IdP registry ✅ Efficient: Nextcloud consent only once per user ✅ Transparent: Users understand each authorization step ✅ SSO-friendly: Minimal re-authentication in Phase 2 ## Implementation Tracking The refactoring from simplified hybrid flow to progressive consent will be tracked in a separate issue. Current implementation demonstrates: - MCP server can intercept OAuth callbacks - Refresh tokens stored securely - PKCE flow works end-to-end - Tool execution succeeds 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 02:34:30 +01:00
Chris Coutinho	babd60e08b	feat: Implement ADR-004 Hybrid Flow with comprehensive integration tests Implement the ADR-004 Hybrid Flow OAuth pattern where the MCP server intercepts the OAuth callback to obtain master refresh tokens while maintaining PKCE security for clients. ## Implementation ### OAuth Routes (ADR-004 Hybrid Flow) - Add `/oauth/authorize` endpoint: Intercepts client OAuth initiation - Add `/oauth/callback` endpoint: Receives IdP callback, stores master token - Add `/oauth/token` endpoint: Exchanges MCP code for client access token - Implement PKCE code challenge/verifier validation - Store OAuth sessions with state/challenge correlation ### MCP Server Integration - Update `setup_oauth_config()` to return client_id and client_secret - Initialize OAuth context in Starlette lifespan for login routes - Add OAuth session storage to RefreshTokenStorage - Configure authlib dependency for OAuth flow management ### Integration Tests - Create `test_adr004_hybrid_flow.py` with Playwright automation - Add `adr004_hybrid_flow_mcp_client` session-scoped fixture - Test MCP session establishment with hybrid flow token - Test tool execution using stored refresh tokens (on-behalf-of pattern) - Test persistent access across multiple operations - All tests passing: ✅ 3 passed in 8.82s ### Documentation - Update ADR-004 with comprehensive Testing section - Add integration test commands and coverage details - Document test implementation and verification steps - Create TESTING_INSTRUCTIONS.md for manual and automated testing - Include manual test scripts for reference/debugging ## What This Enables ✅ PKCE code challenge/verifier flow ✅ MCP server intercepts OAuth callback and stores master refresh token ✅ Client receives MCP access token (not master token) ✅ MCP session establishment with hybrid flow token ✅ Tool execution using stored refresh tokens (on-behalf-of pattern) ✅ Multiple operations without re-authentication ✅ Proper token isolation (client never sees master token) ## Testing Run ADR-004 integration tests: ```bash uv run pytest tests/server/oauth/test_adr004_hybrid_flow.py --browser firefox -v ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 02:18:30 +01:00
Chris Coutinho	f48e039e9e	docs: WIP with Hybrid token	2025-11-03 01:19:46 +01:00
Chris Coutinho	14a8f70503	docs: Correct ADR-004 to Token Broker Architecture with strict audience isolation Critical architectural corrections to properly implement secure token brokering: ## Key Changes: 1. Removed Dual Token Concept: MCP server no longer generates its own JWTs. Instead, it acts as a token broker using IdP-issued tokens with proper audience validation. 2. Strict Audience Isolation: - Tokens with `aud: "mcp-server"` can ONLY authenticate to MCP server - Tokens with `aud: "nextcloud"` can ONLY access Nextcloud APIs - No tokens have multiple audiences (security boundary violation) - Compromised MCP tokens cannot access Nextcloud directly 3. Linked Authorization Pattern: Single OAuth flow obtains a master refresh token capable of minting tokens for different audiences as needed. This solves the challenge of needing both MCP authentication and Nextcloud access from a single user authorization. 4. Token Broker Implementation: - Validates incoming tokens have `audience: "mcp-server"` - Uses stored refresh tokens to obtain `audience: "nextcloud"` tokens - Never exposes Nextcloud tokens to MCP clients - Maintains short-lived cache for performance 5. PKCE and Native Client Updates: - Proper 302 redirects (no HTML pages) - Complete PKCE verification in token endpoint - IdP tokens returned directly (not MCP-generated) 6. Security Enhancements: - Comprehensive audience validation examples - Token exchange pattern documentation - Keycloak configuration for audience mapping - Trust boundary diagrams This architecture maintains strict security boundaries while enabling the MCP server to act on behalf of users for both authentication and resource access, following OAuth best practices and enterprise security standards. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 00:44:34 +01:00
Chris Coutinho	bf8120682e	docs: Rewrite ADR-004 for Federated Authentication Architecture Major rewrite of ADR-004 to reflect federated authentication pattern with shared identity provider (IdP) instead of direct Nextcloud authentication. Key changes: - Replaced "Sign-in with Nextcloud" with "Federated Authentication" - Added shared IdP (Keycloak, Okta, Azure AD) as central auth provider - MCP server now acts as OAuth client to shared IdP, not Nextcloud - Single user authentication grants both identity and Nextcloud access - Updated all diagrams to show 4-party architecture - Removed authorize_nextcloud tool - uses standard 401 flow - Added proper token rotation with reuse detection - Clarified Pattern 3 vs Pattern 4 differences in comparison doc - Pattern 3 can use external IdPs via user_oidc (not limited to NC) Architecture benefits: - True single sign-on with enterprise IdP support - OAuth-compliant on-behalf-of pattern - Supports SAML/LDAP backends through IdP - Nextcloud validates IdP tokens, not MCP-specific tokens 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-02 23:58:15 +01:00
Chris Coutinho	f2af5a39a8	docs: Add ADR-004 - MCP Server as OAuth Client for Offline Access - Supersedes ADR-002 which fundamentally misunderstood MCP protocol constraints - Introduces "Sign-in with Nextcloud" architecture pattern - MCP server becomes OAuth client to enable offline/background operations - Implements full token rotation with reuse detection for security - Includes comprehensive implementation details and migration strategy Key architectural shift: - From: Pass-through authentication (stateless, no offline access) - To: MCP server as OAuth client (stateful, full offline capabilities) The solution enables background workers to operate independently of MCP sessions by storing and rotating refresh tokens securely. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-02 23:31:39 +01:00
Chris Coutinho	34df5f5b9a	feat: Implement dual-tier token exchange (Standard V2 + Legacy V1 impersonation) This commit implements and documents both RFC 8693 token exchange tiers from ADR-002, enabling both production-ready delegation and advanced impersonation capabilities. - Enable Keycloak preview features (`--features=preview`) to support both Standard V2 and Legacy V1 token exchange modes - Update Tier 1 status from "NOT IMPLEMENTED" to "IMPLEMENTED (Legacy V1)" - Add detailed empirical testing results showing: - Standard V2 rejects `requested_subject` parameter - Legacy V1 accepts parameter but requires impersonation permissions - Complete configuration steps for enabling impersonation - Add comparison table showing when to use each tier - Add "When to Use" guidance for both tiers - Document that Tier 2 (Delegation) is the recommended default - Update docstring to document both Tier 1 and Tier 2 support - Add tier-specific logging (shows which tier is being used) - Document permission requirements for Tier 1 impersonation tests/integration/auth/test_token_exchange_standard_v2.py: - Test delegation without impersonation (Tier 2) - Verify sub claim remains unchanged (service account identity) - Verify no special permissions required - Test exchanged tokens work with Nextcloud APIs - All tests PASS ✅ tests/integration/auth/test_token_exchange_legacy_v1.py: - Test impersonation with `requested_subject` (Tier 1) - Verify sub claim changes to target user - Auto-skip if impersonation permissions not configured - Document permission requirements in test docstrings - Test exchanged tokens work with Nextcloud APIs tests/manual/test_impersonation.py: - Comprehensive impersonation validation script - Tests both Standard V2 and Legacy V1 behavior - Decodes JWT tokens to verify sub claim changes - Validates tokens against Nextcloud APIs tests/manual/configure_impersonation.py: - Automated permission configuration helper - Documents manual Keycloak CLI configuration steps Both token exchange tiers are now fully implemented and tested: - Tier 2 (Delegation) - ✅ RECOMMENDED - Standard V2 (production-ready) - No special permissions required - Service account identity preserved - Tier 1 (Impersonation) - ✅ Advanced use only - Legacy V1 (--features=preview required) - Requires manual permission grant via Keycloak CLI - Subject claim changes to target user 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-02 22:03:22 +01:00

1 2

83 Commits