nextcloud-mcp-server

Author	SHA1	Message	Date
Chris Coutinho	e575c8e57b	feat(vector): Support multiple embedding models with auto-generated collection names This PR enables safe switching between embedding models and multi-server deployments by implementing auto-generated Qdrant collection names based on deployment ID and model name. ## Problem Previously, all deployments used a single hardcoded collection name "nextcloud_content", which caused two critical issues: 1. Dimension mismatches when switching models: Changing OLLAMA_EMBEDDING_MODEL (e.g., nomic-embed-text at 768D → all-minilm at 384D) would cause runtime errors as vectors couldn't be inserted into a collection with incompatible dimensions. 2. Collection collisions in multi-server setups: Multiple MCP servers sharing a single Qdrant instance would overwrite each other's data, making horizontal scaling impossible. ## Solution ### Auto-Generated Collection Naming Collections are now automatically named using the pattern: \`{deployment-id}-{model-name}\` Deployment ID: Uses \`OTEL_SERVICE_NAME\` if configured (and not default value), otherwise falls back to \`hostname\` for simple Docker deployments. Model Name: From \`OLLAMA_EMBEDDING_MODEL\` with path separators sanitized. Examples: - \`my-mcp-server-nomic-embed-text\` (with OTEL_SERVICE_NAME=my-mcp-server) - \`mcp-container-all-minilm\` (simple Docker, hostname=mcp-container) Override: Users can still set \`QDRANT_COLLECTION\` explicitly to bypass auto-generation for backward compatibility. ### Dimension Validation Added startup validation that checks collection dimensions match the embedding service. If a mismatch is detected, the server fails fast with a clear error message explaining: - Expected vs actual dimensions - Likely cause (model change) - Solutions (delete collection, use different name, or revert model) ### Improved Sampling Error Handling Enhanced MCP sampling rejection handling to treat user rejections as normal behavior rather than errors: - User rejections ("rejected", "denied") → INFO log, no traceback - Unsupported clients → INFO log, no traceback - Other MCP errors → WARNING log, no traceback - Unexpected errors → ERROR log WITH traceback This aligns with the MCP specification where clients SHOULD prompt users for approval/denial of sampling requests. ## Changes ### Core Implementation - nextcloud_mcp_server/config.py: Added \`get_collection_name()\` method with deployment ID detection and model name sanitization - nextcloud_mcp_server/vector/qdrant_client.py: Dimension validation on collection open with helpful error messages - nextcloud_mcp_server/vector/{scanner,processor}.py: Updated to use \`get_collection_name()\` - nextcloud_mcp_server/auth/userinfo_routes.py: Vector sync status uses \`get_collection_name()\` - nextcloud_mcp_server/server/semantic.py: - Updated semantic search tools to use \`get_collection_name()\` - Improved sampling rejection error handling (McpError vs Exception) ### Documentation - docs/semantic-search-architecture.md: New comprehensive architecture document (557 lines) covering background sync, semantic search flow, RAG implementation, and deployment modes - docs/configuration.md: Added detailed "Qdrant Collection Naming" section with examples and multi-server deployment guidance - docker-compose.yml: Added comments explaining collection naming behavior - README.md: Updated semantic search descriptions to clarify experimental status, Notes-only support, and infrastructure requirements ## Migration Guide For existing single-server deployments: Option 1 (Recommended): Use explicit collection name for continuity \`\`\`bash QDRANT_COLLECTION=nextcloud_content # Keep existing collection \`\`\` Option 2: Allow auto-generation and re-embed \`\`\`bash # Remove QDRANT_COLLECTION override # New collection will be created based on deployment ID + model # Requires re-embedding all documents (may take time) \`\`\` For new multi-server deployments: Set unique OTEL service names per server: \`\`\`bash # Server 1 OTEL_SERVICE_NAME=mcp-prod OLLAMA_EMBEDDING_MODEL=nomic-embed-text # → Collection: "mcp-prod-nomic-embed-text" # Server 2 OTEL_SERVICE_NAME=mcp-staging OLLAMA_EMBEDDING_MODEL=nomic-embed-text # → Collection: "mcp-staging-nomic-embed-text" \`\`\` ## Benefits ✅ Safe model switching: Each model gets its own collection, preventing dimension mismatch errors ✅ Multi-server support: Multiple MCP servers can share one Qdrant instance without conflicts ✅ Clear ownership: Collection names show which deployment and model owns the data ✅ Better error messages: Dimension validation provides actionable guidance ✅ Backward compatible: Existing deployments can continue using \`QDRANT_COLLECTION\` override ## Testing Validated with: - Single-server deployments (default hostname-based naming) - Multi-server deployments (OTEL service name-based naming) - Model switching scenarios (dimension validation) - Collection override scenarios (backward compatibility) Next steps: Testing various Ollama embedding models to investigate optimal chunk sizes and performance characteristics. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-10 01:18:30 +01:00
Chris Coutinho	72232f937a	refactor: migrate vector sync from asyncio.Queue to anyio memory object streams Replace asyncio.Queue with anyio.create_memory_object_stream() throughout the vector sync system for better library consistency and improved shutdown semantics. ## Changes Made scanner.py: - Changed parameter type from `asyncio.Queue` to `MemoryObjectSendStream[DocumentTask]` - Replaced all `await document_queue.put()` calls with `await send_stream.send()` - Wrapped scanner loop in `async with send_stream:` context manager for automatic cleanup - Updated log messages: "Queued" → "Sent" - Removed `import asyncio` (no longer needed) processor.py: - Changed parameter type from `asyncio.Queue` to `MemoryObjectReceiveStream[DocumentTask]` - Replaced `asyncio.wait_for(document_queue.get(), timeout=1.0)` with `anyio.fail_after(1.0)` + `await receive_stream.receive()` - Removed all `document_queue.task_done()` calls (not needed with streams) - Added `anyio.EndOfStream` exception handling for graceful shutdown when scanner closes - Removed `import asyncio` (no longer needed) app.py: - Removed `import asyncio` from top-level imports - Added `from anyio.streams.memory import MemoryObjectReceiveStream, MemoryObjectSendStream` - Updated AppContext dataclass: - Replaced `document_queue: Optional[asyncio.Queue]` with: - `document_send_stream: Optional[MemoryObjectSendStream]` - `document_receive_stream: Optional[MemoryObjectReceiveStream]` - Updated `app_lifespan_basic()`: - Replaced `asyncio.Queue(maxsize=...)` with `anyio.create_memory_object_stream(max_buffer_size=...)` - Pass `send_stream` to scanner_task - Pass `receive_stream.clone()` to each processor_task (enables multiple consumers) - Updated AppContext yield to include both streams - Updated `starlette_lifespan()`: - Same changes as app_lifespan_basic for streamable-http transport - Removed `import asyncio as asyncio_module` (no longer needed) - Updated app.state storage to use send_stream and receive_stream semantic.py: - Updated `nc_get_vector_sync_status()` tool: - Access `document_receive_stream` instead of `document_queue` from lifespan context - Use `stream_stats.current_buffer_used` instead of `queue.qsize()` for pending count - More reliable metrics (qsize() was not guaranteed accurate) ## Benefits 1. Library Consistency: Pure anyio throughout codebase (was mixing asyncio.Queue with anyio.Event and anyio.create_task_group) 2. Graceful Shutdown: `async with send_stream:` automatically closes stream on exit, signaling EndOfStream to all processors 3. Better Timeout Handling: `anyio.fail_after()` is more idiomatic than `asyncio.wait_for()` 4. Stream Cloning: Easy to add multiple consumers via `receive_stream.clone()` 5. Better Statistics: `.statistics()` provides accurate buffer metrics (qsize() was unreliable) 6. Type Safety: Separate send/receive types prevent accidental misuse 7. No task_done() tracking: Streams handle completion automatically ## Testing - ✅ All 69 unit tests passing - ✅ All 5 smoke tests passing - ✅ No regressions in functionality - ✅ Graceful shutdown behavior improved ## References - https://anyio.readthedocs.io/en/stable/why.html#queue-fix - https://anyio.readthedocs.io/en/stable/streams.html#memory-object-streams 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 06:43:44 +01:00
Chris Coutinho	4b026e9aa0	feat: implement ADR-009 - refactor semantic search to use generic semantic:read scope This implements ADR-009, which documents the decision to use a generic `semantic:read` OAuth scope instead of requiring all app-specific scopes for semantic search functionality. Changes: - Created new `nextcloud_mcp_server/models/semantic.py` with semantic search models - SemanticSearchResult (with new doc_type field for multi-app support) - SemanticSearchResponse - SamplingSearchResponse - VectorSyncStatusResponse - Created new `nextcloud_mcp_server/server/semantic.py` with semantic search tools - nc_semantic_search (renamed from nc_notes_semantic_search) - nc_semantic_search_answer (renamed from nc_notes_semantic_search_answer) - nc_get_vector_sync_status (renamed from nc_notes_get_vector_sync_status) - All tools now use @require_scopes("semantic:read") instead of "notes:read" - Updated `nextcloud_mcp_server/server/notes.py` - Removed semantic search tools (moved to semantic.py) - Removed semantic search model imports - Removed unused MCP imports (ModelHint, ModelPreferences, etc.) - Updated `nextcloud_mcp_server/models/notes.py` - Removed semantic search models (moved to semantic.py) - Updated `nextcloud_mcp_server/app.py` - Import configure_semantic_tools - Register semantic tools when VECTOR_SYNC_ENABLED=true - Updated `nextcloud_mcp_server/server/__init__.py` - Export configure_semantic_tools - Updated tests - tests/integration/test_sampling.py: Use new tool names - tests/unit/test_response_models.py: Import from semantic.py, add doc_type field Architecture: - Semantic search is now a cross-app feature, not tied to Notes - Uses dual-phase authorization: semantic:read scope + per-document verification - Supports future multi-app indexing (notes, calendar, deck, files, contacts) Test results: - All 69 unit tests passing - All 5 smoke tests passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 05:53:53 +01:00
Chris Coutinho	a6c76c5cc1	chore: Add openid scope to nc_notes_get_vector_sync_status	2025-11-09 03:27:17 +01:00
Chris Coutinho	a854656d3c	fix: implement deletion grace period and vector sync status tool This commit addresses issues with vector database synchronization that were causing test failures: 1. Deletion Grace Period (scanner.py) - Fixed premature deletion of documents due to pagination cursor inconsistencies in Notes API - Implemented 2-scan verification with 1.5x scan interval grace period (15 seconds default) - Documents must be missing for 2 consecutive scans before deletion - Documents that reappear are removed from deletion tracking - Prevents false deletions during concurrent note creation/indexing 2. Vector Sync Status Tool (server/notes.py, models/notes.py) - Added nc_notes_get_vector_sync_status MCP tool - Returns indexed_count, pending_count, status, and enabled fields - Enables tests and clients to wait for vector sync completion - Uses lifespan context to access document queue and Qdrant client 3. Test Improvements (test_sampling.py, conftest.py) - Added temporary_note_factory fixture for creating multiple test notes - Updated all sampling tests to wait for vector sync completion - Adjusted score_threshold to 0.0 for SimpleEmbeddingProvider (feature hashing produces low-quality embeddings) - Fixed CallToolResult extraction (removed ["result"] key access) - Removed invalid @pytest.mark.asyncio markers (anyio mode) All integration tests now pass successfully. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 03:11:39 +01:00
Chris Coutinho	bb5d4f464f	feat: implement MCP sampling for semantic search RAG (ADR-008) Add nc_notes_semantic_search_answer tool that combines semantic search with MCP sampling to generate natural language answers from retrieved Nextcloud Notes. This enables Retrieval-Augmented Generation (RAG) patterns without requiring a server-side LLM. Key features: - Client-side LLM generation via ctx.session.create_message() - Graceful fallback when sampling unavailable - Proper source citations in generated answers - No results optimization (skips sampling when no docs found) - Comprehensive unit and integration tests Implementation details: - SamplingSearchResponse model with generated_answer and sources - Fixed prompt template with document context and citation instructions - Model preferences hint Claude Sonnet for balanced performance - Falls back to returning documents without answer on sampling failure Updates: - Add ADR-008 documenting sampling architecture decision - Add MCP sampling pattern guidance to CLAUDE.md - Update README.md and docs/notes.md (7 → 9 tools) - Add 4 unit tests and 6 integration tests 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 01:00:18 +01:00
Chris Coutinho	1a57f97d3a	refactor: update to Qdrant query_points API and fix Playwright Keycloak login - Replace deprecated qdrant_client.search() with query_points() API - Update semantic search implementation in notes.py - Update all integration tests to use query_points() - Fix Keycloak login in test_keycloak_dcr.py to use form.submit() instead of button click - Remove unnecessary popup handler code - Simplify consent screen logging	2025-11-08 22:41:14 +01:00
Chris Coutinho	fdd82f59e2	feat: implement semantic search tool and fix vector sync issues (ADR-007 Phase 3) Completes the ADR-007 implementation by adding user-facing semantic search functionality. Previous phases implemented scanner and processor for background indexing; this adds the query interface. Changes: - Add nc_notes_semantic_search MCP tool for natural language queries - Fix Qdrant point IDs to use UUIDs instead of strings (was causing 400 errors) - Reduce scan interval default from 1 hour to 5 minutes for faster updates - Add SemanticSearchResult and SemanticSearchNotesResponse models - Implement dual-phase authorization (Qdrant filter + Nextcloud API verification) The semantic search enables finding notes by meaning rather than exact keywords, using vector embeddings to understand query intent. Point ID fix resolves critical bug where all document indexing failed with "invalid point ID" errors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-08 21:51:12 +01:00
Chris Coutinho	71326384da	feat: add real elicitation integration test with python-sdk MCP client This commit adds proper integration testing of the login elicitation flow (ADR-006) using python-sdk's MCP client with actual elicitation callback support, and fixes user_id extraction to support both JWT and opaque tokens. ## Changes ### 1. Enhanced create_mcp_client_session helper (tests/conftest.py) - Added `elicitation_callback` parameter to function signature - Pass callback to ClientSession constructor - Added necessary imports: RequestContext, ElicitRequestParams, ElicitResult, ErrorData from mcp package - Allows fixtures to provide custom elicitation handlers ### 2. New fixture: nc_mcp_oauth_client_with_elicitation (tests/conftest.py) - Creates MCP client with Playwright-based elicitation callback - Callback implementation: - Extracts OAuth URL from elicitation message using regex - Uses Playwright browser to complete OAuth flow automatically - Handles Nextcloud login form (username/password) - Handles consent screen if present - Waits for OAuth callback completion - Returns ElicitResult(action="accept") on success - Function-scoped to allow independent test state - Tracks elicitation invocations via session.elicitation_triggered ### 3. Fixed user_id extraction for opaque tokens (oauth_tools.py) - Created extract_user_id_from_token() helper to handle both JWT and opaque tokens by calling userinfo endpoint when needed - Fixed check_logged_in to use helper instead of broken ctx.authorization - Fixed revoke_nextcloud_access to use helper instead of ctx.context.get() - Both tools now properly extract user_id from access tokens ### 4. Enhanced integration tests (test_elicitation_integration.py) - Updated tests to revoke refresh tokens via MCP tool - All 4 tests now pass: - test_check_logged_in_with_real_elicitation_callback: Complete flow - test_elicitation_callback_url_extraction: URL extraction validation - test_elicitation_stores_refresh_token: Token persistence verification - test_second_check_logged_in_does_not_elicit: No redundant elicitations ### 5. Added diagnostic logging (oauth_routes.py) - Track user_id extraction from ID tokens during OAuth callbacks - Log refresh token storage with user_id and flow_type ## Test Results ✅ 4/4 tests pass The test suite successfully validates: - Elicitation callback is triggered when no refresh token exists - Playwright automation completes OAuth flow - Refresh token is stored after OAuth with correct user_id - Tool returns "yes" after successful login - Already-logged-in users don't get redundant elicitations ## Why This Matters Previous tests (test_login_elicitation.py) only validated response formats and acknowledged they couldn't test real elicitation protocol. This test exercises the REAL MCP elicitation protocol end-to-end: 1. MCP server calls ctx.elicit() 2. python-sdk ClientSession invokes custom callback 3. Callback completes OAuth via Playwright 4. Client returns acceptance to server 5. Tool proceeds with authenticated state This proves the python-sdk MCP client can handle elicitation in production environments with both JWT and opaque tokens. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-07 22:55:49 +01:00
Chris Coutinho	11cdab475f	feat: unify session architecture and enhance login status visibility This commit addresses the "Login not detected" issue after completing OAuth login via elicitation by unifying the session architecture and adding comprehensive visibility into background session status. ## Changes ### 1. Enhanced check_logged_in with comprehensive logging (oauth_tools.py) - Added detailed logging at each step of token lookup - Implemented fallback strategy: first search by provisioning_client_id, then fall back to user_id lookup - This allows detection of refresh tokens created via any flow (elicitation or browser login) - Log messages include flow_type, provisioned_at, and provisioning_client_id for debugging ### 2. Unified session architecture (browser_oauth_routes.py) - Browser login now stores provisioning_client_id=state when saving refresh token - This makes browser and elicitation flows consistent - both can be found by the same state parameter - Treats Flow 2 (elicitation) and browser login as the same "background session" ### 3. Enhanced /user/page with session status (userinfo_routes.py) - Added comprehensive background access section showing: - Background Access: Granted/Not Granted (with visual indicators) - Flow Type: browser/flow2/hybrid - Provisioned At: timestamp - Token Audience: nextcloud/mcp - Scopes: detailed scope list - Status displayed regardless of which flow created the session (browser login or elicitation) ### 4. Added revoke functionality (userinfo_routes.py, app.py) - New POST endpoint: /user/revoke - Allows users to revoke background access (delete refresh token) - Browser session cookie remains valid for UI access - Confirmation dialog before revocation - Success page with auto-redirect back to /user/page - Registered route in app.py browser_routes ## Testing All tests pass: - 6/6 login elicitation tests pass - 21/21 core OAuth tests pass - Comprehensive logging helps debug future issues ## Fixes Resolves: "Login not detected. Please ensure you completed the login at the provided URL before clicking OK." The issue occurred because elicitation and browser login created separate sessions. Now they are unified under the same architecture. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-07 21:50:55 +01:00
Chris Coutinho	0c9a9ea24d	fix: Consolidate OAuth callbacks and implement PKCE for all flows This PR fixes multiple OAuth-related issues: ## Unified OAuth Callback - Consolidated `/oauth/callback-nextcloud` and `/oauth/login-callback` into single `/oauth/callback` endpoint - Flow type determined by session lookup via state parameter (no query params in redirect_uri) - Fixes redirect_uri validation issues with IdPs requiring exact match - Legacy endpoints kept as aliases for backwards compatibility ## PKCE Implementation - Implemented PKCE (RFC 7636) for Flow 2 (resource provisioning) - Generate code_verifier and code_challenge - Store code_verifier in session storage - Retrieve and use in token exchange - Fixed PKCE for browser login (integrated mode) - Previously only worked for external IdP (Keycloak) - Now works for both Nextcloud OIDC and external IdP ## Login Elicitation Fixes (ADR-006) - Fixed elicitation URL to route through MCP server endpoint - Changed from direct Nextcloud URL to `/oauth/authorize-nextcloud` - Ensures PKCE is properly handled by server - Fixed login detection after OAuth flow completes - Look up refresh token by state parameter instead of user_id - Works even when Flow 1 token not present - Added `get_refresh_token_by_provisioning_client_id()` method ## Session Authentication - Fixed `/user/page` redirect loop - Shared oauth_context with mounted browser_app - SessionAuthBackend can now validate sessions correctly ## Tests - Added comprehensive login elicitation test suite - Updated scope authorization test expectations - All 43 OAuth tests passing ## Files Changed - `app.py`: Shared oauth_context, unified callback route - `oauth_routes.py`: Unified callback, PKCE for Flow 2 - `browser_oauth_routes.py`: PKCE for integrated mode - `oauth_tools.py`: Fixed elicitation URL generation - `refresh_token_storage.py`: Added lookup by provisioning_client_id - `test_login_elicitation.py`: New test suite 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-07 21:08:55 +01:00
Chris Coutinho	6cccd92b3b	build: Add type checking	2025-11-05 15:19:55 +01:00
Chris Coutinho	881b0ba03c	feat: add scope protection to OAuth provisioning tools Add @require_scopes("openid") decorator to OAuth backend tools (provision_nextcloud_access, revoke_nextcloud_access, check_provisioning_status) to ensure they're only visible to authenticated OIDC users. Design rationale: - OAuth provisioning tools are "meta-tools" that manage authentication itself - They don't access Nextcloud resources, so don't need resource scopes - Requiring 'openid' ensures user is authenticated via OIDC - Enables Progressive Consent: users authenticate first, then provision access - Aligns with dual OAuth flow architecture (Flow 1 + Flow 2) Changes: - Add @require_scopes("openid") to all three OAuth provisioning tools - Update test expectations: users with only OIDC default scopes see OAuth provisioning tools but not resource tools - All tests pass (13/13 in test_scope_authorization.py) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-04 09:25:20 +01:00
Chris Coutinho	01d1cf9190	feat: integrate token exchange into MCP server application Wire up RFC 8693 token exchange throughout the MCP server to support stateless per-request token conversion for external IdP scenarios. Changes: Authentication Flow: - Add exchange_token_for_audience() for pure RFC 8693 exchange - Update context_helper to use stateless token exchange - Remove fallback to standard OAuth on exchange failure - Make storage initialization lazy (only for delegation, not MCP tools) Application Configuration: - Add ENABLE_TOKEN_EXCHANGE environment variable support - Skip provisioning tools when token exchange enabled - Pass mcp_client_id to token broker for proper validation - Update docker-compose.yml with token exchange config Token Exchange Service: - Add TOKEN_EXCHANGE_GRANT constant - Implement exchange_token_for_audience() method - Support both "mcp-server" and client_id audiences - Lazy storage initialization for delegation scenarios - Enhanced error handling and logging Progressive Token Verifier: - Add mcp_client_id parameter for external IdP validation - Accept both "mcp-server" and configured client_id - Support external IdP token verification Key Behavior Changes: - When ENABLE_TOKEN_EXCHANGE=true: Each MCP tool call triggers stateless token exchange (client token → Nextcloud token) - When ENABLE_TOKEN_EXCHANGE=false: Uses pass-through mode (validates Flow 1 token and passes to Nextcloud) - No provisioning tools registered in exchange mode - No refresh tokens needed for request-time operations This completes the token exchange implementation. The MCP server now supports both pass-through (default) and exchange (opt-in) modes for federated authentication architectures. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-04 02:32:40 +01:00
Chris Coutinho	71e77e95bc	refactor: integrate token exchange into unified get_client() pattern Resolves the token exchange implementation gap where get_session_client() was implemented but never used by tools. Unifies token acquisition into a single async get_client() method that handles both pass-through and token exchange modes transparently. Core Changes: - Make get_client() async and merge token exchange logic into it - Remove scopes parameter from token exchange (Nextcloud doesn't support OAuth scopes) - Update all 8 tool modules to use await get_client(ctx) - Fix provisioning decorator to skip checks in BasicAuth mode Token Acquisition Modes: 1. BasicAuth: Returns shared client (no token operations) 2. OAuth pass-through (default): Verifies and passes Flow 1 token to Nextcloud 3. OAuth token exchange (opt-in): Exchanges Flow 1 token for ephemeral token via RFC 8693 Key Architectural Clarifications: - Progressive Consent (Flow 1/2) = Authorization architecture - Token Exchange = Token acquisition pattern during tool execution - Refresh tokens from Flow 2 are NEVER used for tool calls (only background jobs) - Nextcloud scopes are "soft-scopes" enforced by MCP server, not IdP Documentation Updates: - ADR-004: Added comprehensive token acquisition patterns section - CRITICAL-TOKEN-EXCHANGE-PATTERN.md: Updated to reflect implementation status - CLAUDE.md: Updated architectural patterns with async get_client() Testing: - All 36 unit tests passing - All 4 smoke tests passing (BasicAuth mode) - Linting issues fixed (ruff) Configuration: ENABLE_TOKEN_EXCHANGE=false (default) - pass-through mode ENABLE_TOKEN_EXCHANGE=true (opt-in) - token exchange mode 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 20:33:56 +01:00
Chris Coutinho	d768909fd4	feat: Implement ADR-004 Progressive Consent foundation (partial) Implements Progressive Consent architecture with dual OAuth flows: - Flow 1: Direct client authentication (aud: "mcp-server") - Flow 2: Resource provisioning with refresh tokens Components added: - Client registry with validation (client_registry.py) - Progressive token verifier (progressive_token_verifier.py) - Token broker service integration - Provisioning decorator for MCP tools - OAuth provisioning tools (provision_nextcloud_access, etc.) Configuration: - Progressive Consent enabled by default (ENABLE_PROGRESSIVE_CONSENT=true) - Client validation with pre-registered clients - Audience separation framework KNOWN ISSUE - Token Exchange Pattern Incorrect: The current implementation does NOT properly implement token exchange. MCP session tokens should be EXCHANGED for delegated Nextcloud tokens during tool calls, not stored/reused. Critical corrections needed: 1. Session tokens: Flow 1 token → exchange → ephemeral Nextcloud token - Generated on-demand per tool call - Short-lived, not stored - Scopes limited to tool requirements 2. Background tokens: Flow 2 refresh token → background Nextcloud token - Only for offline/background jobs - Potentially different scopes than session tokens - Must NOT be used for MCP session tool calls The token exchange mechanism needs to be implemented to properly separate session-time delegation from background job authorization. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 20:33:55 +01:00
Chris Coutinho	d16bcdcfbb	feat: Implement ADR-004 Progressive Consent foundation components - Token Broker Service manages Nextcloud access tokens with audience validation - Implements short-lived token caching (5-minute TTL) with early refresh - Enhanced token storage schema with ADR-004 fields (flow_type, audience, provisioning) - MCP provisioning tools for explicit Flow 2 resource authorization - Comprehensive unit tests for Token Broker Service (14 tests, all passing) - Environment configuration for Progressive Consent mode This implements the foundation for the dual OAuth flow architecture where: - Flow 1: MCP clients authenticate to MCP server (aud: "mcp-server") - Flow 2: MCP server gets delegated Nextcloud access (aud: "nextcloud") Users must explicitly call provision_nextcloud_access tool to grant resource access, implementing the "stateless by default" principle from ADR-004. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 07:51:07 +01:00
Chris Coutinho	a36038422b	feat: Add text processing background worker for telling client about progress	2025-10-25 19:52:45 +02:00
Chris Coutinho	2147fc1696	refactor: Transform document parsing into pluggable processor architecture Refactors PR #190's hardcoded Unstructured.io integration into a flexible, extensible plugin system supporting multiple text extraction engines. - `DocumentProcessor` ABC: Abstract interface for all processors - `ProcessorRegistry`: Central registry for discovery and routing - `ProcessingResult`: Standardized output format across processors - `UnstructuredProcessor`: Refactored from `UnstructuredClient` - `TesseractProcessor`: Local OCR for images (lightweight alternative) - `CustomHTTPProcessor`: Generic wrapper for custom HTTP APIs - New `get_document_processor_config()` returns structured config - Supports enabling/disabling individual processors - Per-processor configuration via environment variables - Breaking Change: `ENABLE_UNSTRUCTURED_PARSING` replaced with: - `ENABLE_DOCUMENT_PROCESSING=true/false` (master switch) - `ENABLE_UNSTRUCTURED=true/false` (per-processor) - `ENABLE_TESSERACT=true/false` - `ENABLE_CUSTOM_PROCESSOR=true/false` - `parse_document()` now uses `ProcessorRegistry` - Auto-selects appropriate processor based on MIME type - Processor priority system (Unstructured=10, Tesseract=5, Custom=1) - `initialize_document_processors()` registers processors at startup - Integrated into both BasicAuth and OAuth lifespans - Graceful degradation if processors fail to initialize ```env ENABLE_DOCUMENT_PROCESSING=false ENABLE_UNSTRUCTURED=false UNSTRUCTURED_API_URL=http://unstructured:8000 UNSTRUCTURED_STRATEGY=auto # auto\|fast\|hi_res UNSTRUCTURED_LANGUAGES=eng,deu ENABLE_TESSERACT=false TESSERACT_LANG=eng ENABLE_CUSTOM_PROCESSOR=false CUSTOM_PROCESSOR_URL=http://localhost:9000/process CUSTOM_PROCESSOR_TYPES=application/pdf,image/jpeg ``` - Removed: `tests/test_unstructured_config.py` (legacy tests) - Added: `tests/unit/test_document_processor_config.py` - 7 unit tests for new config system - Tests individual and multi-processor configurations - Added: - `nextcloud_mcp_server/document_processors/__init__.py` - `nextcloud_mcp_server/document_processors/base.py` - `nextcloud_mcp_server/document_processors/registry.py` - `nextcloud_mcp_server/document_processors/unstructured.py` - `nextcloud_mcp_server/document_processors/tesseract.py` - `nextcloud_mcp_server/document_processors/custom_http.py` - `tests/unit/test_document_processor_config.py` - Modified: - `nextcloud_mcp_server/config.py` - New plugin config system - `nextcloud_mcp_server/app.py` - Processor initialization - `nextcloud_mcp_server/utils/document_parser.py` - Uses registry - `nextcloud_mcp_server/server/webdav.py` - Import updates - `env.sample` - New configuration format - `docker-compose.yml` - (profile changes from previous work) - Removed: - `nextcloud_mcp_server/client/unstructured_client.py` - Replaced by UnstructuredProcessor - `tests/test_unstructured_config.py` - Replaced with new tests ✅ Extensible: Add processors without modifying core code ✅ Testable: Mock processors for unit tests ✅ Configurable: Enable only needed processors ✅ Flexible: Choose fast (Tesseract) vs accurate (Unstructured) ✅ Opt-in: Disabled by default, no mandatory dependencies Users upgrading from PR #190 need to update environment variables: ```bash ENABLE_UNSTRUCTURED_PARSING=true ENABLE_DOCUMENT_PROCESSING=true ENABLE_UNSTRUCTURED=true ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 19:28:35 +02:00
yuisheaven	f0e5333e43	Merge branch 'master' into feature/introduce_files_parsing_with_unstructured_service_for_webdav_files_retrieval	2025-10-25 17:23:38 +02:00
Chris Coutinho	81ca799410	fix: Update webdav models for proper serialization	2025-10-24 06:01:02 +02:00
Chris Coutinho	d452684535	feat: Split read/write scopes into app:read/write scopes	2025-10-24 04:38:49 +02:00
yuisheaven	29df645d53	Merge branch 'master' into feature/introduce_files_parsing_with_unstructured_service_for_webdav_files_retrieval	2025-10-23 21:30:09 +02:00
Chris Coutinho	c069d78f80	feat: Initialize JWT-scoped tools	2025-10-22 06:21:16 +02:00
yuisheaven	64649c902d	Merge branch 'master' into feature/introduce_files_parsing_with_unstructured_service_for_webdav_files_retrieval	2025-10-21 20:37:00 +02:00
Chris Coutinho	92e18825bc	feat(caldav): Add support for tasks	2025-10-19 18:02:43 +02:00
Chris Coutinho	6158a890af	feat(webdav): Add search and list favorite response tools	2025-10-18 22:02:26 +02:00
Chris Coutinho	37164dbdbc	chore: sort imports	2025-10-18 22:02:25 +02:00
Chris Coutinho	dbcf9d93ca	chore: Improve RequestError message details Show exception type and cause when str(e) is empty for better debugging	2025-10-17 04:37:31 +02:00
Chris Coutinho	2999d4b65e	fix: Handle RequestError in mcp tools	2025-10-17 04:17:41 +02:00
Chris Coutinho	9de59db718	feat(cookbook): Add full Cookbook app support with 13 tools and 2 resources - Import recipes from URLs using schema.org metadata - Full CRUD operations for recipes - Search, categorize, and organize recipes - Manage keywords/tags and categories - Configure app settings and trigger reindexing	2025-10-17 03:08:16 +02:00
Chris Coutinho	a38c795124	feat: add sharing API client and server tools	2025-10-15 02:59:26 +02:00
Chris Coutinho	13e4915e38	test: Remove unused pytest fixtures	2025-10-14 01:23:39 +02:00
Chris Coutinho	4d7e4b9a4b	feat(server): Experimental support for OAuth2/OIDC authentication	2025-10-14 01:22:15 +02:00
yuisheaven	3ff6346c03	ran ruff format via uv	2025-10-05 02:16:42 +02:00
yuisheaven	76dce41ed9	added first versoin of the new document_parser utility and added it to the webdav file retrieval logic	2025-10-04 04:28:24 +02:00
Chris Coutinho	79e6250377	update deprecated log warnings	2025-09-24 00:17:57 +02:00
Chris Coutinho	cc9650b077	refactor: Add tools for all resources to enable tool-only workflows	2025-09-24 00:13:24 +02:00
Chris Coutinho	7498b501eb	chore: Remove remaining tools	2025-09-11 09:31:13 +02:00
Chris Coutinho	e7a5caa0d6	Merge remote-tracking branch 'origin/master' into feature/deck	2025-09-11 00:37:58 +02:00
Chris Coutinho	d2d413afcd	feat(deck): Add support for stack, cards, labels	2025-09-11 00:35:02 +02:00
Chris Coutinho	167053578d	feat(deck): Initialize Deck app client/server	2025-09-11 00:10:25 +02:00
Pedro Ruiz	5d4902a73e	feat: Add WebDAV resource copy functionality	2025-09-10 22:15:16 +02:00
Pedro Ruiz	b55b9640c6	feat: Add WebDAV resource move/rename functionality	2025-09-10 22:12:17 +02:00
Chris Coutinho	f79b957644	test: Update tests with McpError	2025-08-31 21:08:04 +02:00
Chris Coutinho	ef1fb9e9aa	fix(server): Replace ErrorResponses with standard McpErrors	2025-08-31 20:58:12 +02:00
Chris Coutinho	892a8d2d23	fix(notes): Include ETags in responses to avoid accidently updates	2025-08-31 19:20:51 +02:00
Chris Coutinho	949fb7124b	fix(notes): Remove note contents from responses to reduce token usage	2025-08-31 11:55:15 +02:00
Chris Coutinho	4cf5f2a95a	feat(client): Preserve fields when modifying contacts/calendar resources	2025-08-30 19:19:20 +02:00
Chris Coutinho	9b00530e8e	feat(server): Add structured output to all tool/resource output BREAKING CHANGE	2025-08-30 18:27:32 +02:00

1 2

61 Commits