nextcloud-mcp-server

Author	SHA1	Message	Date
Chris Coutinho	4a5766b84e	feat(config): enable DCR for multi-user BasicAuth with offline access Allows multi-user BasicAuth mode to use Dynamic Client Registration (DCR) for OAuth credentials when ENABLE_OFFLINE_ACCESS is enabled, making it consistent with OAuth modes and reducing configuration burden. Changes: Configuration Validation: - Relaxed OAuth credential requirements for multi-user BasicAuth - OAuth credentials now optional when offline access enabled - Will use DCR as fallback if NEXTCLOUD_OIDC_CLIENT_ID/SECRET not set - Updated validation to log info instead of error when DCR will be used Startup Logic (app.py): - Added DCR workflow for multi-user BasicAuth before uvicorn starts - Creates oauth_context for management APIs when offline access enabled - Allows Astrolabe to authenticate management API calls with OAuth - DCR runs synchronously at same lifecycle point as OAuth modes - Added traceback import for better error logging - Fixed type assertions for nextcloud_host - Fixed undefined variable references in vector sync logging Management API: - Improved auth mode detection using proper detect_auth_mode() - Added auth_mode field to /status endpoint: * "basic" - Single-user BasicAuth * "multi_user_basic" - Multi-user BasicAuth * "oauth" - OAuth modes * "smithery" - Smithery stateless - Added supports_app_passwords indicator for multi-user BasicAuth Docker Compose: - Updated mcp-multi-user-basic service configuration: * Enabled vector sync (VECTOR_SYNC_ENABLED=true) * Added ENABLE_OFFLINE_ACCESS=true for app password support * Added NEXTCLOUD_MCP_SERVER_URL for Astrolabe integration * Documented optional static OAuth credentials Testing: - Updated test_config_validators.py to expect DCR fallback - Enhanced configure_astrolabe_for_mcp_server fixture with verification - Added debug logging to test_users_setup fixture Workflow: 1. User configures ENABLE_OFFLINE_ACCESS=true 2. Server checks for static NEXTCLOUD_OIDC_CLIENT_ID/SECRET 3. If not found, performs DCR before uvicorn starts 4. DCR registers client with Nextcloud OIDC provider 5. OAuth credentials used for Astrolabe management API auth 6. Background sync can retrieve user app passwords via Astrolabe 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-22 19:43:24 +01:00
Chris Coutinho	1a5bb10cd0	feat(config): consolidate configuration with smart dependency resolution (ADR-021) Simplifies configuration by consolidating overlapping settings and adding automatic dependency resolution. This makes semantic search configuration significantly easier for users while maintaining 100% backward compatibility. ## Key Changes ### Variable Renaming (Backward Compatible) - `VECTOR_SYNC_ENABLED` → `ENABLE_SEMANTIC_SEARCH` (old name still works) - `ENABLE_OFFLINE_ACCESS` → `ENABLE_BACKGROUND_OPERATIONS` (old name still works) - Deprecation warnings logged when old names used - Old names will be removed in v1.0.0 ### Smart Dependency Resolution - `ENABLE_SEMANTIC_SEARCH` automatically enables background operations in multi-user modes - No need to set both `ENABLE_OFFLINE_ACCESS` and `VECTOR_SYNC_ENABLED` anymore - Single-user mode doesn't auto-enable background ops (not needed) ### Explicit Mode Selection (Optional) - New `MCP_DEPLOYMENT_MODE` environment variable - Valid values: single_user_basic, multi_user_basic, oauth_single_audience, oauth_token_exchange, smithery - Removes ambiguity about which deployment mode is active - Falls back to auto-detection if not set (existing behavior) ### Configuration Templates - Reorganized `env.sample` by deployment mode with clear sections - Added mode-specific quick-start templates: - `env.sample.single-user` - Simplest configuration - `env.sample.oauth-multi-user` - Recommended multi-user - `env.sample.oauth-advanced` - Token exchange mode ## Implementation Details ### Files Modified - `nextcloud_mcp_server/config.py` - Smart dependency resolution helpers - `nextcloud_mcp_server/config_validators.py` - Simplified validation, explicit mode - `tests/unit/test_config_validators.py` - 19 new tests (60 total, all passing) - `env.sample` - Reorganized by deployment mode - `docs/configuration.md` - Complete rewrite with consolidated approach - `docs/troubleshooting.md` - New consolidation troubleshooting section - `README.md` - Updated variable references ### New Files - `docs/ADR-021-configuration-consolidation.md` - Architecture decision record - `docs/configuration-migration-v2.md` - Comprehensive migration guide - `env.sample.single-user` - Single-user quick-start template - `env.sample.oauth-multi-user` - OAuth multi-user quick-start template - `env.sample.oauth-advanced` - Token exchange quick-start template ## User Impact ### Before (Confusing) ```bash ENABLE_OFFLINE_ACCESS=true # Why both? VECTOR_SYNC_ENABLED=true # What's the relationship? ``` ### After (Simplified) ```bash MCP_DEPLOYMENT_MODE=oauth_single_audience # Explicit (optional) ENABLE_SEMANTIC_SEARCH=true # Auto-enables background ops! ``` ### Benefits - 📉 2 fewer variables to understand for semantic search - 📋 Clear intent ("I want semantic search") - 🎯 Explicit mode declaration available - 🔄 100% backward compatible - ✅ All 265 unit tests passing ## Testing - All 60 config validation tests passing - 10 new tests for configuration consolidation - 9 new tests for explicit mode selection - Full unit test suite: 265 tests passing - Backward compatibility verified ## Migration Users can migrate at their own pace. Old variable names continue working with deprecation warnings. See docs/configuration-migration-v2.md for detailed migration instructions. Related: ADR-021 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-21 20:36:36 +01:00
Chris Coutinho	981f102b27	fix(config): address reviewer feedback - Restore CI test filter (-m unit -m smoke) for faster CI runs - Replace local path reference with ADR-020 reference in config_validators.py - Add comprehensive BasicAuthMiddleware unit tests (10 tests covering all edge cases) Addresses critical CI issue and improves test coverage for multi-user BasicAuth mode.	2025-12-20 21:16:17 +01:00
Chris Coutinho	286a3eb20f	feat(auth): add multi-user BasicAuth pass-through mode Implement multi-user BasicAuth pass-through mode (ADR-020) where each request includes BasicAuth credentials that are forwarded to Nextcloud APIs without persistent storage. Changes: - Add _get_client_from_basic_auth() in context.py to extract credentials from Authorization header (set by BasicAuthMiddleware) - Add AstrolabeClient for app password provisioning via Astrolabe API - Update oauth_sync.py with dual credential support (app passwords first, then refresh tokens as fallback) - Simplify oauth_tools.py provisioning logic - Add integration tests for app password provisioning and multi-user BasicAuth Features: - Stateless multi-user mode: credentials passed per-request - Optional background sync via app passwords (stored in Astrolabe) - Falls back to refresh tokens if app password not available - Test coverage for provisioning flow and pass-through mode Related: ADR-019 (Multi-user BasicAuth), ADR-020 (Deployment Modes) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-20 20:55:31 +01:00
Chris Coutinho	4507359760	refactor(config): centralize configuration validation and simplify startup Implement centralized configuration validation (ADR-020) to simplify deployment mode detection and improve error messages. Changes: - Create ADR-020 documenting 5 deployment modes with required/optional config - Add config_validators.py with validate_configuration() and mode detection - Simplify app.py startup with single validation point at get_app() - Remove duplicate is_oauth_mode() function (43 lines) - Fix DeploymentMode mapping (only SELF_HOSTED and SMITHERY_STATELESS exist) - Add comprehensive unit tests (41 tests covering all modes and edge cases) - Add enable_multi_user_basic_auth to Settings and BasicAuthMiddleware Docker Compose: - Remove conflicting ENABLE_MULTI_USER_BASIC_AUTH from mcp-oauth service - Add dedicated mcp-multi-user-basic service on port 8003 Test Results: - 237/237 integration tests PASSED - All deployment modes verified: single-user BasicAuth, multi-user BasicAuth, OAuth single-audience, OAuth token exchange (Keycloak), Smithery stateless 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-20 20:49:28 +01:00
Chris Coutinho	daabd90359	fix(security): address critical security issues from PR #401 code review Implemented 6 critical security fixes identified during PR #401 review: 1. Token Rotation Race Condition (Issue 1) - Added in-progress marker pattern to prevent concurrent refresh - Prevents token invalidation when multiple requests refresh simultaneously - File: token_broker.py:324, 343-390 2. Hardcoded Localhost URL (Issue 2) - Added getNextcloudBaseUrl() with fallback chain - Supports overwrite.cli.url, trusted_domains, and localhost fallback - File: IdpTokenRefresher.php:38-61, 116 3. Error Information Leakage (Issue 3) - Replaced 13 instances of str(e) with sanitized errors - Prevents exposure of stack traces, paths, and tokens - File: management.py:368, 444, 492, 510, 546, 571, 625, 643, 695, 750, 919, 956, 1121 4. Input Validation Gaps (Issue 4) - Added validation helpers: _parse_int_param, _parse_float_param, _validate_query_string - Applied bounds checking to get_chunk_context and unified_search - File: management.py:119-164, 807-835, 1197-1212 5. PHP Refresh Token Validation (Issue 5) - Added explicit refresh_token presence check - Prevents silent token rotation failures - File: IdpTokenRefresher.php:122-132 6. Cookie Security Configuration (Issue 6) - Added _should_use_secure_cookies() with auto-detection - Supports explicit COOKIE_SECURE env var or auto-detect from NEXTCLOUD_HOST - Files: browser_oauth_routes.py:27-44, 470; env.sample:54-57 Testing: - Unit tests: 195 passed - Integration tests: 102 passed, 4 skipped - OAuth tests: 9 passed - All linting and type checks passed Follow-up work tracked in issues #408-#417 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-19 13:57:33 +01:00
Chris Coutinho	fe54733a39	fix(oauth): enable PKCE for all clients and add token_broker to oauth_context This commit fixes two OAuth issues in the Astrolabe app: 1. Always use PKCE (RFC 9207): - PKCE is now used for all OAuth flows (public and confidential clients) - Previous code only used PKCE for public clients, causing failures - Confidential clients now use both PKCE + client_secret (defense in depth) - Nextcloud OIDC provider requires PKCE, so token exchange was failing 2. Add token_broker to oauth_context: - Token broker is now stored in oauth_context for management API access - Fixes "Token broker not configured" error when revoking access - Revoke endpoint needs token_broker to delete refresh tokens and invalidate cache Changes: - OAuthController.php: Always generate PKCE verifier/challenge for all clients - OAuthController.php: Always include code_verifier in token exchange - app.py: Store token_broker in oauth_context after creation Fixes: - Astrolabe OAuth flow now works with Nextcloud OIDC - Revoke/disconnect functionality now works in Astrolabe settings 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-18 01:55:04 +01:00
Chris Coutinho	e4f3beee01	fix: resolve type checking warnings for CI - Add type casts for Starlette app state access - Add assertions for cipher, card, board, stack after initialization - Add None checks for XML element text attributes - Handle __package__ being None in tracing setup - Fix TokenBrokerService initialization to use storage credentials Resolves 42 type warnings from ty-check, enabling CI linting to pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-18 00:44:58 +01:00
Chris Coutinho	54b69f0d68	fix: move Alembic to package submodule for Docker compatibility - Move alembic/ directory to nextcloud_mcp_server/alembic/ subpackage - Update migrations.py to use package location instead of alembic.ini - Update env.py to set script_location dynamically - Update alembic.ini for development CLI usage - Fix Dockerfile typo: .vnev -> .venv This fixes FileNotFoundError when running in Docker with non-editable install. The alembic package is now installed with the main package, making it work in both development and production environments. Resolves: Docker startup error 'alembic.ini not found at /app/.venv/lib/python3.12/site-packages/alembic.ini' 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-18 00:42:59 +01:00
Chris Coutinho	3fa376905c	feat: add Alembic database migration system Implements Alembic for managing token storage database schema versions. Migrations run automatically on startup with full backward compatibility. Changes: - Add Alembic dependency (1.14.0+) and SQLAlchemy (auto-installed) - Create migration infrastructure in alembic/ directory - Add initial migration (001) capturing current schema - Modify RefreshTokenStorage.initialize() to run migrations via anyio - Add CLI commands: db upgrade, current, history, downgrade, migrate - Add comprehensive migration documentation Backward Compatibility: - Pre-Alembic databases automatically stamped with revision 001 - No schema changes for existing databases - Automatic upgrade on first startup after update Migration Strategy: Three scenarios handled: 1. New database → Run migrations from scratch 2. Pre-Alembic database → Stamp with 001 (no changes) 3. Alembic-managed → Upgrade to latest Architecture: - Uses anyio.to_thread.run_sync() for structured concurrency - Alembic env.py runs with anyio.run() in worker thread - SQLite-friendly migration patterns documented - No ThreadPoolExecutor needed (anyio handles it) CLI Usage: ```bash nextcloud-mcp-server db upgrade # Upgrade to latest nextcloud-mcp-server db current # Show version nextcloud-mcp-server db history # View changelog nextcloud-mcp-server db downgrade # Rollback (with confirmation) nextcloud-mcp-server db migrate "description" # Create migration ``` Testing: - All 13 webhook storage tests pass - New/pre-Alembic database scenarios validated - anyio integration tested 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-18 00:02:09 +01:00
Chris Coutinho	44391d3d1d	fix: address critical code review issues (4 fixes) This commit addresses 4 critical issues identified in code review: 1. Token Rotation Race Condition (token_broker.py) - Added per-user locking mechanism to prevent concurrent refresh token corruption - Implemented double-check pattern for cache after acquiring lock - Users can now safely refresh concurrently without token desync 2. Hardcoded OAuth Client ID (PHP files) - Made client ID configurable via `astroglobe_client_id` in system config - Updated McpServerClient to provide getClientId() method - Injected McpServerClient into IdpTokenRefresher and OAuthController - Updated admin settings UI to display client ID configuration status - App gracefully handles missing client ID with warnings in admin UI 3. Missing Cache Invalidation (management.py:revoke_user_access) - Added cache.invalidate() call when revoking user access - Ensures both storage AND cache are cleared atomically - Prevents stale cached tokens from being used after revocation 4. Error Message Exposure (management.py) - Created _sanitize_error_for_client() helper function - Updated all error handlers to log detailed errors internally - Returns generic messages to clients to prevent information leakage - Protects against exposing database paths, API URLs, tokens, etc. All changes are backward compatible and preserve existing functionality. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-18 00:01:54 +01:00
Chris Coutinho	5eec34c17e	feat(auth): implement refresh token rotation for Nextcloud OIDC Add support for one-time use refresh tokens with automatic rotation to align with Nextcloud OIDC security model. Changes: - TokenBrokerService improvements: - Add user_id parameter to refresh methods - Detect and store rotated refresh tokens - Add offline_access scope to token requests - Handle refresh token rotation on every use - Add management API endpoints: - /api/v1/webhooks (GET/POST) - List/create webhooks - /api/v1/webhooks/{id} (DELETE) - Delete webhook - /api/v1/search (POST) - Unified search - /api/v1/chunk-context (GET) - Get chunk context - /api/v1/apps (GET) - List installed apps - Update tests for refresh token rotation - Add --headed flag to pytest for Playwright debugging Benefits: - Aligns with Nextcloud OIDC one-time refresh token model - Prevents refresh token invalidation after first use - Enables long-lived background operations - Provides full webhook lifecycle management 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-18 00:01:53 +01:00
Chris Coutinho	85db90a2df	feat(search): add file_path metadata and chunk offsets to search results Changes: - Add file_path to metadata in semantic and BM25 hybrid search algorithms for PDF viewer integration (search/semantic.py:161-163, search/bm25_hybrid.py:230-232) - Include chunk_start_offset, chunk_end_offset, page_number, and page_count in search results for rich chunk display (api/management.py:981-1004) - Add point_id field to SearchResult for batch retrieval (models/semantic.py) - Fix type narrowing for chunk context API parameters (api/management.py:1102-1111) - Fix None-safety in doc_types discovery (search/algorithms.py:114) This enables the Astroglobe UI to display PDF pages at the correct location for matched chunks. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-18 00:01:29 +01:00
Chris Coutinho	73783b85d5	feat(astrolabe): use proper icons and thumbnails in unified search Improve search result display to match Nextcloud's native search providers by using mimetype-specific icons and preview thumbnails. File Results: - Use preview thumbnails for images/PDFs (core.Preview API) - Use mimetype-specific icon classes (icon-pdf, icon-text, icon-image, etc.) - Detect folders and use icon-folder appropriately Other Document Types: - Notes: icon-notes - Deck Cards: icon-deck - Calendar: icon-calendar - News: icon-rss - Contacts: icon-contacts API Changes: - Management API now includes mime_type in search results - SemanticSearchProvider uses IMimeTypeDetector and IPreview services This makes Astroglobe search results visually consistent with Files, Notes, and other native providers. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-18 00:01:24 +01:00
Chris Coutinho	97b48ca3dd	feat(astrolabe): add 3D PCA visualization for semantic search - Add Plotly.js 3D scatter plot showing search results in PCA space - Create shared visualization.py module to avoid code duplication - Pass include_pca parameter through API chain to enable coordinates - Fix OAuth redirects to use /settings/user/astroglobe The visualization shows document embeddings projected to 3D via PCA, with the query point highlighted in red. Uses Viridis colorscale for score visualization, matching the existing vector-viz page. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-18 00:01:09 +01:00
Chris Coutinho	a58a14111b	feat(vector-sync): enable background sync in OAuth mode Add multi-user background vector synchronization when running in OAuth mode with ENABLE_OFFLINE_ACCESS=true. Key changes: Architecture (oauth_sync.py): - User Manager task polls RefreshTokenStorage for provisioned users - Per-user scanner tasks fetch documents using OAuth tokens - Shared processor pool indexes documents from all users Token Broker improvements: - Accept client_id/client_secret instead of encryption_key - Remove redundant token audience pre-validation (Nextcloud validates) - Add _rewrite_token_endpoint for Docker internal URL routing - Remove double-decryption (storage handles encryption internally) Browser OAuth flow fixes: - Add 'resource' parameter to request Nextcloud-scoped tokens - Store and retrieve next_url for proper redirect after consent - Rewrite token endpoint URLs for internal Docker access Configuration: - Add vector_sync_user_poll_interval setting (default: 60s) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-14 20:00:41 +01:00
Chris Coutinho	e0320e761c	perf(deck): optimize card lookup by storing board_id/stack_id in metadata Addresses reviewer feedback on PR #395 about O(n²) performance issue. Changes: - scanner.py: Add metadata field to DocumentTask with board_id/stack_id - scanner.py: Populate metadata during deck card scanning (both initial and incremental sync) - processor.py: Use metadata for O(1) card lookup via get_card() API when available - processor.py: Fallback to iteration for legacy data without metadata - context.py: Add _get_deck_metadata_from_qdrant() helper to retrieve metadata from Qdrant - context.py: Use metadata for fast path lookup in chunk context expansion - context.py: Add user_id parameter to _fetch_document_text() for metadata retrieval Performance Impact: - Before: O(boards × stacks × cards) iteration for each card lookup - After: O(1) direct API call using stored board_id/stack_id - Graceful degradation: Falls back to iteration for legacy data Testing: - All existing integration tests pass (test_deck_vector_search.py) - Type checking passes with no new errors 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-14 00:23:12 +01:00
Chris Coutinho	20404cf3f2	feat(vector): add Deck card vector search with visualization support Adds comprehensive vector search support for Nextcloud Deck cards, including semantic search indexing, chunk preview in the vector viz UI, and proper deep linking to cards. Vector Search Indexing - Add deck_card scanning in scanner.py (scan_deck_cards function) - Index cards from non-archived, non-deleted boards - Store metadata: board_id, board_title, stack_id, stack_title, card_type, duedate, owner - Content structure: title + "\n\n" + description (matches indexing format) - Incremental sync based on lastModified timestamp - Deletion tracking with grace period Vector Visualization Support - Add deck_card handler in context.py for chunk preview expansion - Include board_id in search result metadata (bm25_hybrid.py, semantic.py) - Expose metadata in viz_routes.py JSON responses - Update vector-viz.js to construct proper Deck URLs: /apps/deck/board/{board_id}/card/{card_id} - Update vector_viz.html filter label from "Deck" to "Deck Cards" Bug Fixes - Skip soft-deleted boards (deletedAt > 0) to prevent 403 Forbidden errors - Applies to scanner, processor, and context expansion code paths - Deck API returns deleted boards but rejects stack access with 403 Testing - Add integration tests in test_deck_vector_search.py: - test_deck_card_semantic_search: Filtered search with doc_type="deck_card" - test_deck_card_appears_in_cross_app_search: Cross-app search includes deck cards - test_deck_card_chunk_context: Chunk context fetching for viz preview Documentation - Update README.md: Add Deck cards to semantic search feature list - Update semantic-search-architecture.md: Document deck_card support - Update nc_semantic_search tool documentation Type Safety - Fix type narrowing for page_boundaries (could be None) using cast() - Fix scanner.py payload None check for type safety Resolves vector search for Deck cards across indexing, search, and visualization. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-13 23:51:18 +01:00
Chris Coutinho	9d0a993c2a	feat(vector-viz): add news_item support for links and chunk expansion Add support for news_item document type in the vector visualization page: - Add "News" checkbox to document type filter options - Add URL handler to link news items to /apps/news/item/{id} - Add content fetching for news items in chunk context expansion This enables users to search and view news articles in the vector visualization, with clickable links back to Nextcloud News and the ability to expand chunks to see full article context. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-13 21:34:47 +01:00
Chris Coutinho	d61e33113c	fix(news): revert get_item() to use get_items() + filter Reverts the "perf(news): use direct API endpoint for get_item()" change from commit `92c4bf3` which incorrectly assumed GET /items/{itemId} exists. The News API (v1-2, v1-3, v2) does not provide a direct endpoint to retrieve individual items. The only /items/{itemId} routes are POST operations for marking items read/unread/starred. Changes: - Restore original get_item() implementation that fetches all items and filters in Python - Update exception from HTTPStatusError to ValueError - Restore documentation explaining API limitation - Update unit tests to mock get_items() instead of _make_request() - Add test for ValueError when item not found Fixes vector processor 405 errors when indexing news items. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-13 15:47:27 +01:00
Chris Coutinho	c8da826ef7	Merge pull request #382 from cbcoutinho/renovate/mcp-1.x fix(deps): update dependency mcp to >=1.23,<1.24	2025-12-12 18:00:04 +01:00
Chris Coutinho	ec70e70a5d	fix: Disable DNS rebinding protection for containerized deployments MCP Python SDK 1.23.0 introduced automatic DNS rebinding protection that auto-enables when host="127.0.0.1" (the default). This breaks containerized deployments (Kubernetes, Docker) because the protection rejects requests with Host headers like "nextcloud-mcp-server.default.svc.cluster.local:8000". Root cause: - FastMCP defaults to host="127.0.0.1" - SDK auto-enables DNS rebinding protection with allowed_hosts=["127.0.0.1:", "localhost:", "[::1]:*"] - K8s/Docker requests use service DNS names or proxied hostnames - Protection middleware rejects these requests (421 Misdirected Request) Solution: - Explicitly pass transport_security=TransportSecuritySettings(enable_dns_rebinding_protection=False) - Applied to all three FastMCP initializations (OAuth, Smithery, BasicAuth) - DNS rebinding attacks mitigated by OAuth authentication and network isolation This fixes issue #373 and enables MCP 1.23.x upgrade in PR #382. For detailed analysis, see docs/MCP-1.23-DNS-REBINDING-FIX.md	2025-12-12 17:30:22 +01:00
Chris Coutinho	19183ad14a	fix: address PR review feedback Address all reviewer comments from PR #387: 1. ✅ Add unit tests for annotations (tests/server/test_annotations.py) - 10 comprehensive test functions validating all annotation patterns - Tests for titles, read-only, destructive, idempotent operations - Validates specific ADR-017 decisions (webdav write, semantic search) - Cross-category consistency checks 2. ✅ Fix nc_webdav_write_file idempotency classification - Changed from idempotentHint=False to idempotentHint=True - Rationale: Uses HTTP PUT without version control - Writing same content to same path = same end state (idempotent) 3. ✅ Fix semantic search openWorldHint inconsistency - Changed from openWorldHint=False to openWorldHint=True - Rationale: Consistent with other Nextcloud tools - Nextcloud is external to MCP server (indexed data is implementation detail) 4. ✅ Update ADR-017 with resolved decisions - Converted Open Questions to Resolved Questions - Added detailed rationale for webdav write and semantic search - Updated status from Proposed to Implemented - Added decision timeline with dates 5. ✅ Add MCP Tool Annotations guidelines to CLAUDE.md - Comprehensive section with code examples for all patterns - Key principles documented (idempotency, destructive, open world) - References ADR-017 for detailed rationale All OAuth tools verified to have proper annotations (oauth_tools.py lines 686-751).	2025-12-11 13:50:55 +01:00
Chris Coutinho	e1412320a7	feat: add MCP tool annotations for enhanced UX Add ToolAnnotations to all 105+ MCP tools across 13 modules to enable better client-side UX with human-readable titles and behavioral hints. Changes: - Add title and ToolAnnotations to all @mcp.tool() decorators - Apply correct idempotency classification per ADR-017 - Add destructiveHint for delete operations - Set openWorldHint=False for semantic search (internal data only) Modules updated: - OAuth (4 tools): Authentication and provisioning - Notes (7 tools): Note management - WebDAV (11 tools): File operations - Semantic (3 tools): Semantic search and RAG - Calendar (16 tools): Events and todos - Contacts (7 tools): Address book management - Sharing (5 tools): File/folder sharing - Tables (6 tools): Structured data - Deck (25 tools): Kanban board management - Cookbook (13 tools): Recipe management - News (8 tools): RSS feed reader Annotation patterns: - Read operations: readOnlyHint=True, openWorldHint=True - Create operations: idempotentHint=False, openWorldHint=True - Update operations: idempotentHint=False, openWorldHint=True - Delete operations: destructiveHint=True, idempotentHint=True, openWorldHint=True See docs/ADR-017-mcp-tool-annotations.md for rationale and implementation details. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-11 12:45:02 +01:00
Chris Coutinho	3f06e2ee77	fix: resolve all type checking errors (8 errors fixed) Fixed 8 type checker errors across the codebase: - vector/scanner.py: Handle None scroll results with null-safe iteration - search/{bm25_hybrid,semantic}.py: Add None checks for result.payload - auth/{unified_verifier,webhook_routes}.py: Assert non-None auth credentials - client/webdav.py: Add None checks before int() conversions - providers/openai.py: Assert embedding_model is not None - search/algorithms.py: Explicitly type doc_types set and cast values - observability/logging_config.py: Match parent class signature (log_data) Also fixed test_create_tag_creates_system_tag to match WebDAV implementation (was testing OCS API endpoint, now tests correct WebDAV endpoint with Content-Location header). Type checker: 0 errors (down from 8), 20 warnings (ignored) Tests: All 192 unit tests passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-08 01:09:02 +01:00
Chris Coutinho	92c4bf36f6	perf(news): use direct API endpoint for get_item() Replace O(n) fetch-all-and-filter approach with O(1) direct API call. The News API v1-3 supports GET /items/{id} for single-item retrieval. - Update get_item() to use direct endpoint - Add unit test for get_item() method - Fixes critical performance issue identified in code review 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-29 17:22:51 +01:00
Chris Coutinho	a5cb6e1242	refactor(news): simplify vector sync to fetch all items Remove the complex starred+unread filtering logic in scan_news_items(). The News app's auto-purge feature (default: 200 items per feed) already limits the total number of items, making explicit filtering unnecessary. Changes: - Replace two API calls (starred + unread) with single all-items call - Remove deduplication logic that merged both lists - Update docstring to explain the simpler approach This reduces code complexity while maintaining the same effective coverage. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-29 15:05:34 +01:00
Chris Coutinho	a33f6a2f15	feat(news): add Nextcloud News app integration Add full integration for the Nextcloud News (RSS/Atom reader) app: - Add NewsClient with complete CRUD operations for folders, feeds, and items - Add 8 read-only MCP tools for listing/getting folders, feeds, items - Add Pydantic models for News entities with camelCase alias support - Add vector sync support for starred + unread items - Add HTML to Markdown converter using markdownify for better embeddings - Add Docker post-install hook to enable News app - Add 25 unit tests for NewsClient API methods Vector sync indexes starred and unread items, providing a balanced approach that captures important (starred) and current (unread) content without indexing the entire article history. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-29 14:39:31 +01:00
Chris Coutinho	92b97bda00	fix: Add rate limit retry logic to OpenAI provider Add exponential backoff retry handling for OpenAI API rate limits (429 errors). This is needed for GitHub Models API which has stricter rate limits than standard OpenAI API. - Add retry_on_rate_limit decorator with exponential backoff - Max 5 retries with delays: 2s → 4s → 8s → 16s → 32s - Apply to embed(), _embed_batch_request(), and generate() methods 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-23 17:24:48 +01:00
Chris Coutinho	5c73b85f65	fix: Increase MCP sampling timeout to 5 minutes for slower LLMs - Increase sampling timeout from 30s to 300s in semantic.py to accommodate slower local LLMs like Ollama - Refactor RAG integration tests to support multiple providers (ollama, openai, anthropic, bedrock) - Remove unnecessary embedding_provider fixture since MCP server handles embeddings internally - Add --provider flag via tests/integration/conftest.py - Add provider_fixtures.py with factory functions for generation providers 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-23 05:43:48 +01:00
Chris Coutinho	4e9859117c	fix: Share vector sync state with FastMCP session lifespan via module singleton The refactor in `fafeaf3` moved background tasks to Starlette server lifespan but broke nc_get_vector_sync_status because it still looked for streams in FastMCP's AppContext (lifespan_context). Add VectorSyncState module singleton to bridge the lifespans: - starlette_lifespan sets the singleton when starting background tasks - app_lifespan_basic reads from singleton and includes in AppContext - MCP tools can now access document_receive_stream for pending count 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-23 04:20:47 +01:00
Chris Coutinho	fafeaf3d83	refactor: Move background tasks to server lifespan and deprecate SSE transport - Move scanner/processor tasks from FastMCP session lifespan to Starlette server lifespan (correct architecture: background tasks run once at server level, not per-session) - Change default CLI transport from SSE to streamable-http - Remove SSE transport option from CLI (SSE is deprecated) - Remove SSE client session factory from test fixtures - Add tracing instrumentation to BM25 hybrid search operations for better observability 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-23 04:02:30 +01:00
Chris Coutinho	2ab8dad6a5	fix: Use WebDAV for tag creation and add LLM-as-a-judge for RAG tests - Change create_tag() to use WebDAV POST instead of OCS API which returned 404 in some Nextcloud versions - Add llm_judge() helper that evaluates system output against ground truth with simple TRUE/FALSE prompt - Replace keyword-based assertions in RAG tests with LLM judge for more flexible semantic evaluation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-23 02:24:01 +01:00
Chris Coutinho	2896fa1dc9	feat: Add tag management methods to WebDAV client - Add get_file_info() to get file info including file ID via PROPFIND - Add create_tag() to create system tags via OCS API - Add get_or_create_tag() for idempotent tag creation - Add assign_tag_to_file() to assign tags to files via WebDAV - Add remove_tag_from_file() to remove tags from files Also refactors RAG evaluation: - Add indexed_manual_pdf fixture using existing nc_client/nc_mcp_client - Remove manual tag creation steps from workflow (now handled by fixture) - Add comprehensive unit tests for new WebDAV methods 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-23 01:51:42 +01:00
Chris Coutinho	208365cd3d	feat: Add OpenAI provider support for embeddings and generation Adds OpenAI provider to the unified provider architecture (ADR-015), supporting: - OpenAI API (api.openai.com) - GitHub Models API (models.github.ai/inference) - OpenAI-compatible endpoints (Fireworks, Together, etc.) Features: - Embedding support with text-embedding-3-small/large models - Text generation via chat completions API - Automatic retry with exponential backoff for rate limits - Provider auto-detection in registry (priority after Bedrock) Environment variables: - OPENAI_API_KEY: API key (required) - OPENAI_BASE_URL: Base URL override (optional) - OPENAI_EMBEDDING_MODEL: Embedding model (default: text-embedding-3-small) - OPENAI_GENERATION_MODEL: Generation model (default: gpt-4o-mini) Also adds: - Integration tests for RAG pipeline with MCP sampling - MCP client sampling support for integration tests - Ground truth Q&A pairs for Nextcloud User Manual 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-23 00:33:32 +01:00
Chris Coutinho	03b984d5a7	fix(smithery): Enable JSON response format for scanner compatibility The Smithery scanner was reporting "0 tools" despite the server returning valid tool definitions. Root cause: the server was returning SSE-formatted responses (event: message\ndata: {...}) which the scanner couldn't parse. Changes: - Add json_response=True to FastMCP for Smithery stateless mode - Clean up verbose docstring examples in semantic.py and webdav.py The MCP spec allows both SSE and plain JSON responses for HTTP transport. Setting json_response=True returns Content-Type: application/json with plain JSON-RPC instead of text/event-stream with SSE format. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 22:01:18 +01:00
Chris Coutinho	ea79e94842	Merge pull request #343 from cbcoutinho/fix/vector-viz-search perf: Optimize vector viz search performance	2025-11-22 19:53:40 +01:00
Chris Coutinho	b0612cfa0f	perf: Optimize vector viz search performance - Replace sequential Qdrant scroll calls with batch retrieve (50 HTTP requests → 1 request, ~50x faster vector fetch) - Add point_id to SearchResult to enable batch retrieval by Qdrant point ID - Reuse query embedding from search algorithm in viz_routes (eliminates redundant embedding call, saves ~30ms) - Make BM25 encode() async with thread pool to avoid blocking event loop (~4.4s was blocking, now properly async) - Run PCA computation in thread pool to avoid blocking event loop (~1.2s was blocking, now properly async) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 19:47:43 +01:00
Chris Coutinho	3b41776110	Merge pull request #342 from cbcoutinho/feature/smithery feat: Add Smithery stateless deployment support (ADR-016)	2025-11-22 19:39:53 +01:00
Chris Coutinho	39fba49cfe	fix(smithery): Add JSON Schema metadata to mcp-config endpoint Add proper JSON Schema metadata fields per Smithery documentation: - $schema: JSON Schema draft-07 - $id: Schema identifier URL - title: Human-readable title - description: Schema description - x-query-style: "flat" (no nested objects in our schema) - additionalProperties: false 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 18:28:05 +01:00
Chris Coutinho	706a15f0bc	fix(smithery): Use container runtime pattern for config discovery ADR-016: For container runtime deployment, Smithery does not auto-generate the .well-known/mcp-config endpoint like it does for Python CLI runtime. Changes: - Remove [tool.smithery] from pyproject.toml (not used in container mode) - Remove smithery_server.py (Python CLI runtime specific) - Add .well-known/mcp-config endpoint to return JSON Schema config - Add SmitheryConfigMiddleware to extract config from URL query params - Use ContextVar to pass session config to tool handlers The container runtime passes config as URL query parameters to /mcp: GET /mcp?nextcloud_url=...&username=...&app_password=... Tested: - All 164 unit tests passing - Docker container builds successfully - .well-known/mcp-config returns valid JSON Schema - Health endpoints working 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 18:22:55 +01:00
Chris Coutinho	b8dc413b73	feat: Add Smithery CLI deployment support - Add smithery package as dependency - Create smithery_server.py with @smithery.server() decorator - Add SmitheryConfigSchema for session config (nextcloud_url, username, app_password) - Add [tool.smithery] section to pyproject.toml - Remove manual .well-known/mcp-config endpoint (Smithery handles this) Smithery CLI will automatically: - Extract config schema from the decorated function - Handle session config parsing from query parameters - Make config accessible via ctx.session_config in tools 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 18:05:33 +01:00
Chris Coutinho	8d29ce0122	fix: Add Smithery lifespan and auth mode detection - Add SmitheryAppContext dataclass for stateless mode - Add app_lifespan_smithery() with minimal lifespan (no shared state) - Update is_oauth_mode() to detect Smithery mode and return BasicAuth - Use Smithery lifespan when SMITHERY_DEPLOYMENT=true - Add .well-known/mcp-config endpoint for config discovery - Skip document processors in Smithery mode (not enabled) Fixes startup issues in Smithery mode where missing env credentials would incorrectly trigger OAuth mode. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 17:48:53 +01:00
Chris Coutinho	f93d650992	feat: Implement ADR-016 Smithery stateless deployment mode Adds support for Smithery hosted deployment with stateless operation: - Add DeploymentMode enum with SELF_HOSTED and SMITHERY_STATELESS modes - Add get_deployment_mode() to detect mode from SMITHERY_DEPLOYMENT env var - Update get_client() to create per-request clients from session config - Add conditional tool registration (skip semantic search in Smithery mode) - Add conditional /app admin UI mounting (skip in Smithery mode) - Create smithery.yaml with configSchema for user credentials - Create Dockerfile.smithery for minimal stateless container - Create smithery_main.py entrypoint for Smithery deployment In Smithery mode: - Users provide nextcloud_url, username, app_password via session config - Each request creates a fresh NextcloudClient (no state between requests) - Semantic search tools are disabled (no vector database) - Admin UI (/app) is disabled (no webhooks, vector viz) All existing self-hosted functionality remains unchanged. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 17:30:42 +01:00
Chris Coutinho	34fd17ba55	fix: Use alpha_composite for proper RGBA highlight blending Drawing directly with ImageDraw on RGBA mode doesn't blend alpha properly. Use Image.alpha_composite() with a transparent overlay to achieve correct semi-transparent highlight fills. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 17:04:29 +01:00
Chris Coutinho	8baa07db84	fix: Remove pymupdf.layout.activate() to fix page_chunks behavior pymupdf.layout.activate() causes pymupdf4llm.to_markdown() to ignore the page_chunks=True option, returning a single string instead of list[dict]. This broke per-page chunking needed for semantic search indexing. See: https://github.com/pymupdf/pymupdf4llm/issues/323 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 16:58:35 +01:00
Chris Coutinho	ba8a53803a	refactor: Simplify PDF text extraction with single to_markdown call Replace parallel per-page extraction with single to_markdown(page_chunks=True) call. This is more efficient as pymupdf4llm can optimize internally for full-document processing instead of making N separate calls for N pages. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 03:52:02 +01:00
Chris Coutinho	31fade9730	perf: Optimize PDF processing with parallel extraction and single-render highlights Phase 1 - PDF Highlighting Optimization: - Render each page ONCE instead of once per chunk (N chunks = 1 render, not N) - Use PIL to draw bounding boxes on copied base images (fast) instead of re-rendering page via pymupdf (slow) - Add _find_chunk_bbox() to extract bbox without modifying page Phase 2 - Parallel Page Extraction: - Use anyio task group with run_sync() for parallel page extraction - Each page extracted in separate thread via anyio.to_thread.run_sync() - Event loop stays responsive during extraction - Remove obsolete _process_sync() method Expected improvement: 30-50% reduction in total PDF processing time. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 03:11:56 +01:00
Chris Coutinho	fffe483c02	fix: Centralize PDF processing and generate separate images per chunk Previously, pymupdf4llm.to_markdown() was called twice - once in PyMuPDFProcessor during indexing and again in PDFHighlighter during visualization. Different image path lengths caused different character offsets, leading to highlighted pages not matching their chunks. Also fixed issue where all chunks on the same page showed all highlights instead of just their own highlight. Now restores original page contents between chunks using xref stream caching. Changes: - Add PDFHighlighter class requiring pre-computed page_boundaries and full_text from document processor (no fallback extraction) - Pass pre-computed data from processor to highlighter - Extract page-relative portion of chunk text for cross-page chunks - Add bounding box highlighting using text anchor search - Run highlight generation in parallel with embedding/BM25 - Cache and restore page contents to isolate highlights per chunk Results: Highlighting success rate improved from 51% to 95% (121/128). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 02:46:30 +01:00
Chris Coutinho	a62a007c87	feat: Add context expansion to semantic search with chunk overlap removal Implements optional context expansion for semantic search results that fetches adjacent chunks (N-1 and N+1) from Qdrant to provide before/after context. Removes configurable chunk overlap (default 200 chars) to avoid duplicate text appearing in both context and excerpt. Key changes: - Add include_context and context_chars parameters to nc_semantic_search and nc_semantic_search_answer tools - Implement Qdrant cache fast path for chunk retrieval (avoids re-fetching and re-parsing documents, especially important for PDFs) - Add _get_chunk_by_index_from_qdrant() to fetch adjacent chunks - Remove chunk overlap from before_context (last N chars) and after_context (first N chars) to prevent duplicate text - Fetch context in parallel with anyio.Semaphore (max 20 concurrent) - Pass through page_number from SearchResult to SemanticSearchResult - Remove document-level deduplication (keep chunk-level dedup from algorithm) Context expansion is opt-in via include_context=true parameter. When enabled: - Populates has_context_expansion, marked_text, before_context, after_context - Adds truncation flags when context exceeds context_chars limit - Falls back to document fetch for legacy data with truncated excerpts Related: nextcloud_mcp_server/search/context.py:87-382, nextcloud_mcp_server/server/semantic.py:161-255	2025-11-21 01:02:22 +01:00

1 2 3 4 5 ...

364 Commits