nextcloud-mcp-server

Author	SHA1	Message	Date
Chris Coutinho	9fab6cb550	feat: Implement ADR-005 unified token verifier to eliminate token passthrough vulnerability Replace two non-compliant token verifiers (NextcloudTokenVerifier and ProgressiveConsentTokenVerifier) with a single UnifiedTokenVerifier that properly validates token audiences per MCP Security Best Practices specification. The previous implementation had a critical security vulnerability where tokens intended for the MCP server were passed directly to Nextcloud APIs without proper audience validation (token passthrough anti-pattern). This violates OAuth 2.0 security principles and the MCP specification. Changes: - Add UnifiedTokenVerifier supporting two compliant modes: * Multi-audience mode (default): Validates tokens contain BOTH MCP and Nextcloud audiences, enabling direct use without exchange * Token exchange mode (opt-in): Validates MCP audience only, exchanges for Nextcloud tokens via RFC 8693 with caching to minimize latency - Remove token passthrough vulnerability from context.py and context_helper.py - Implement token exchange caching (5-minute TTL default) to reduce network calls - Add required environment variables for audience validation: * NEXTCLOUD_MCP_SERVER_URL - MCP server URL (used as audience) * NEXTCLOUD_RESOURCE_URI - Nextcloud resource identifier * TOKEN_EXCHANGE_CACHE_TTL - Cache TTL for exchanged tokens - Update docker-compose.yml with resource URI configuration for both OAuth modes - Add comprehensive test suite (29 tests) covering both authentication modes - Remove legacy NextcloudTokenVerifier and ProgressiveConsentTokenVerifier Security improvements: - Eliminates token passthrough anti-pattern - Enforces proper audience separation between MCP and Nextcloud - Complies with MCP Security Best Practices and RFC 8707/8693 - Maintains performance with token exchange caching Test results: 65/65 unit tests passed, 5/5 smoke tests passed 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-05 18:53:14 +01:00
Chris Coutinho	15113dbb03	fix: remove Hybrid Flow, make Progressive Consent default (ADR-004) Eliminates scope escalation security vulnerability by removing Hybrid Flow and making Progressive Consent the only OAuth mode. Changes: - Delete oauth_callback() and oauth_token() (Hybrid Flow only, ~314 lines) - Fix scope flows: Flow 1 requests resource scopes, Flow 2 requests identity+offline - Remove ENABLE_PROGRESSIVE_CONSENT flag (always enabled in OAuth mode) - Update documentation to reflect Progressive Consent as default - Delete test_adr004_hybrid_flow.py test file - Remove unused variables (ruff lint fixes) Security improvements: - No scope escalation: client gets exactly what it requests - Clear separation: MCP session tokens vs Nextcloud offline tokens - OAuth2 compliant: follows best practices for scope handling 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-04 00:26:07 +01:00
Chris Coutinho	71e77e95bc	refactor: integrate token exchange into unified get_client() pattern Resolves the token exchange implementation gap where get_session_client() was implemented but never used by tools. Unifies token acquisition into a single async get_client() method that handles both pass-through and token exchange modes transparently. Core Changes: - Make get_client() async and merge token exchange logic into it - Remove scopes parameter from token exchange (Nextcloud doesn't support OAuth scopes) - Update all 8 tool modules to use await get_client(ctx) - Fix provisioning decorator to skip checks in BasicAuth mode Token Acquisition Modes: 1. BasicAuth: Returns shared client (no token operations) 2. OAuth pass-through (default): Verifies and passes Flow 1 token to Nextcloud 3. OAuth token exchange (opt-in): Exchanges Flow 1 token for ephemeral token via RFC 8693 Key Architectural Clarifications: - Progressive Consent (Flow 1/2) = Authorization architecture - Token Exchange = Token acquisition pattern during tool execution - Refresh tokens from Flow 2 are NEVER used for tool calls (only background jobs) - Nextcloud scopes are "soft-scopes" enforced by MCP server, not IdP Documentation Updates: - ADR-004: Added comprehensive token acquisition patterns section - CRITICAL-TOKEN-EXCHANGE-PATTERN.md: Updated to reflect implementation status - CLAUDE.md: Updated architectural patterns with async get_client() Testing: - All 36 unit tests passing - All 4 smoke tests passing (BasicAuth mode) - Linting issues fixed (ruff) Configuration: ENABLE_TOKEN_EXCHANGE=false (default) - pass-through mode ENABLE_TOKEN_EXCHANGE=true (opt-in) - token exchange mode 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-03 20:33:56 +01:00
Chris Coutinho	a36038422b	feat: Add text processing background worker for telling client about progress	2025-10-25 19:52:45 +02:00
Chris Coutinho	2147fc1696	refactor: Transform document parsing into pluggable processor architecture Refactors PR #190's hardcoded Unstructured.io integration into a flexible, extensible plugin system supporting multiple text extraction engines. - `DocumentProcessor` ABC: Abstract interface for all processors - `ProcessorRegistry`: Central registry for discovery and routing - `ProcessingResult`: Standardized output format across processors - `UnstructuredProcessor`: Refactored from `UnstructuredClient` - `TesseractProcessor`: Local OCR for images (lightweight alternative) - `CustomHTTPProcessor`: Generic wrapper for custom HTTP APIs - New `get_document_processor_config()` returns structured config - Supports enabling/disabling individual processors - Per-processor configuration via environment variables - Breaking Change: `ENABLE_UNSTRUCTURED_PARSING` replaced with: - `ENABLE_DOCUMENT_PROCESSING=true/false` (master switch) - `ENABLE_UNSTRUCTURED=true/false` (per-processor) - `ENABLE_TESSERACT=true/false` - `ENABLE_CUSTOM_PROCESSOR=true/false` - `parse_document()` now uses `ProcessorRegistry` - Auto-selects appropriate processor based on MIME type - Processor priority system (Unstructured=10, Tesseract=5, Custom=1) - `initialize_document_processors()` registers processors at startup - Integrated into both BasicAuth and OAuth lifespans - Graceful degradation if processors fail to initialize ```env ENABLE_DOCUMENT_PROCESSING=false ENABLE_UNSTRUCTURED=false UNSTRUCTURED_API_URL=http://unstructured:8000 UNSTRUCTURED_STRATEGY=auto # auto|fast|hi_res UNSTRUCTURED_LANGUAGES=eng,deu ENABLE_TESSERACT=false TESSERACT_LANG=eng ENABLE_CUSTOM_PROCESSOR=false CUSTOM_PROCESSOR_URL=http://localhost:9000/process CUSTOM_PROCESSOR_TYPES=application/pdf,image/jpeg ``` - Removed: `tests/test_unstructured_config.py` (legacy tests) - Added: `tests/unit/test_document_processor_config.py` - 7 unit tests for new config system - Tests individual and multi-processor configurations - Added: - `nextcloud_mcp_server/document_processors/__init__.py` - `nextcloud_mcp_server/document_processors/base.py` - `nextcloud_mcp_server/document_processors/registry.py` - `nextcloud_mcp_server/document_processors/unstructured.py` - `nextcloud_mcp_server/document_processors/tesseract.py` - `nextcloud_mcp_server/document_processors/custom_http.py` - `tests/unit/test_document_processor_config.py` - Modified: - `nextcloud_mcp_server/config.py` - New plugin config system - `nextcloud_mcp_server/app.py` - Processor initialization - `nextcloud_mcp_server/utils/document_parser.py` - Uses registry - `nextcloud_mcp_server/server/webdav.py` - Import updates - `env.sample` - New configuration format - `docker-compose.yml` - (profile changes from previous work) - Removed: - `nextcloud_mcp_server/client/unstructured_client.py` - Replaced by UnstructuredProcessor - `tests/test_unstructured_config.py` - Replaced with new tests ✅ Extensible: Add processors without modifying core code ✅ Testable: Mock processors for unit tests ✅ Configurable: Enable only needed processors ✅ Flexible: Choose fast (Tesseract) vs accurate (Unstructured) ✅ Opt-in: Disabled by default, no mandatory dependencies Users upgrading from PR #190 need to update environment variables: ```bash ENABLE_UNSTRUCTURED_PARSING=true ENABLE_DOCUMENT_PROCESSING=true ENABLE_UNSTRUCTURED=true ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-25 19:28:35 +02:00
yuisheaven	64649c902d	Merge branch 'master' into feature/introduce_files_parsing_with_unstructured_service_for_webdav_files_retrieval	2025-10-21 20:37:00 +02:00
Chris Coutinho	5e829fc7e7	refactor: Unify logging & remove factory deployment	2025-10-18 01:15:06 +02:00
yuisheaven	3ff6346c03	ran ruff format via uv	2025-10-05 02:16:42 +02:00
yuisheaven	c9a687171a	added envs for unstructured to control OCR quality and OCR languages	2025-10-04 05:21:02 +02:00
yuisheaven	76dce41ed9	added first version of the new document_parser utility and added it to the webdav file retrieval logic	2025-10-04 04:28:24 +02:00
Chris Coutinho	a2c78ee1ef	test: Add tests for MCP tools and resources	2025-07-27 17:43:55 +02:00
Chris Coutinho	ee32a1bfe8	feat: Switch to using async client	2025-06-06 18:41:57 +02:00
Chris Coutinho	e93eb9d302	fix: Configure logging	2025-05-25 11:46:41 +02:00
Chris Coutinho	b0012d6e4a	wip Move testing to container	2025-05-10 12:47:10 +02:00
Chris Coutinho	0d8666a2d7	Initial commit	2025-05-04 23:24:55 +02:00

15 Commits