nextcloud-mcp-server

Author	SHA1	Message	Date
Chris Coutinho	a11ae9c027	refactor: enforce PLC0415 (import-outside-top-level) for source code Enable ruff PLC0415 rule for all source files (tests excluded via per-file-ignores). Move 136 inline imports to top-level across 33 files. 8 imports suppressed with noqa for legitimate reasons: circular dependencies (client/__init__.py, context.py), optional dependency guards (app.py document processors, auth/userinfo_routes.py), and post-env-setup imports (smithery_main.py). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 08:04:50 +01:00
Chris Coutinho	056414752e	fix(mcp): Move all imports to the top of modules	2025-12-26 10:05:27 -06:00
Chris Coutinho	e4f3beee01	fix: resolve type checking warnings for CI - Add type casts for Starlette app state access - Add assertions for cipher, card, board, stack after initialization - Add None checks for XML element text attributes - Handle __package__ being None in tracing setup - Fix TokenBrokerService initialization to use storage credentials Resolves 42 type warnings from ty-check, enabling CI linting to pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-18 00:44:58 +01:00
Chris Coutinho	e0320e761c	perf(deck): optimize card lookup by storing board_id/stack_id in metadata Addresses reviewer feedback on PR #395 about O(n²) performance issue. Changes: - scanner.py: Add metadata field to DocumentTask with board_id/stack_id - scanner.py: Populate metadata during deck card scanning (both initial and incremental sync) - processor.py: Use metadata for O(1) card lookup via get_card() API when available - processor.py: Fallback to iteration for legacy data without metadata - context.py: Add _get_deck_metadata_from_qdrant() helper to retrieve metadata from Qdrant - context.py: Use metadata for fast path lookup in chunk context expansion - context.py: Add user_id parameter to _fetch_document_text() for metadata retrieval Performance Impact: - Before: O(boards × stacks × cards) iteration for each card lookup - After: O(1) direct API call using stored board_id/stack_id - Graceful degradation: Falls back to iteration for legacy data Testing: - All existing integration tests pass (test_deck_vector_search.py) - Type checking passes with no new errors 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-14 00:23:12 +01:00
Chris Coutinho	20404cf3f2	feat(vector): add Deck card vector search with visualization support Adds comprehensive vector search support for Nextcloud Deck cards, including semantic search indexing, chunk preview in the vector viz UI, and proper deep linking to cards. Vector Search Indexing - Add deck_card scanning in scanner.py (scan_deck_cards function) - Index cards from non-archived, non-deleted boards - Store metadata: board_id, board_title, stack_id, stack_title, card_type, duedate, owner - Content structure: title + "\n\n" + description (matches indexing format) - Incremental sync based on lastModified timestamp - Deletion tracking with grace period Vector Visualization Support - Add deck_card handler in context.py for chunk preview expansion - Include board_id in search result metadata (bm25_hybrid.py, semantic.py) - Expose metadata in viz_routes.py JSON responses - Update vector-viz.js to construct proper Deck URLs: /apps/deck/board/{board_id}/card/{card_id} - Update vector_viz.html filter label from "Deck" to "Deck Cards" Bug Fixes - Skip soft-deleted boards (deletedAt > 0) to prevent 403 Forbidden errors - Applies to scanner, processor, and context expansion code paths - Deck API returns deleted boards but rejects stack access with 403 Testing - Add integration tests in test_deck_vector_search.py: - test_deck_card_semantic_search: Filtered search with doc_type="deck_card" - test_deck_card_appears_in_cross_app_search: Cross-app search includes deck cards - test_deck_card_chunk_context: Chunk context fetching for viz preview Documentation - Update README.md: Add Deck cards to semantic search feature list - Update semantic-search-architecture.md: Document deck_card support - Update nc_semantic_search tool documentation Type Safety - Fix type narrowing for page_boundaries (could be None) using cast() - Fix scanner.py payload None check for type safety Resolves vector search for Deck cards across indexing, search, and visualization. 
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-13 23:51:18 +01:00
Chris Coutinho	a33f6a2f15	feat(news): add Nextcloud News app integration Add full integration for the Nextcloud News (RSS/Atom reader) app: - Add NewsClient with complete CRUD operations for folders, feeds, and items - Add 8 read-only MCP tools for listing/getting folders, feeds, items - Add Pydantic models for News entities with camelCase alias support - Add vector sync support for starred + unread items - Add HTML to Markdown converter using markdownify for better embeddings - Add Docker post-install hook to enable News app - Add 25 unit tests for NewsClient API methods Vector sync indexes starred and unread items, providing a balanced approach that captures important (starred) and current (unread) content without indexing the entire article history. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-29 14:39:31 +01:00
Chris Coutinho	fffe483c02	fix: Centralize PDF processing and generate separate images per chunk Previously, pymupdf4llm.to_markdown() was called twice - once in PyMuPDFProcessor during indexing and again in PDFHighlighter during visualization. Different image path lengths caused different character offsets, leading to highlighted pages not matching their chunks. Also fixed issue where all chunks on the same page showed all highlights instead of just their own highlight. Now restores original page contents between chunks using xref stream caching. Changes: - Add PDFHighlighter class requiring pre-computed page_boundaries and full_text from document processor (no fallback extraction) - Pass pre-computed data from processor to highlighter - Extract page-relative portion of chunk text for cross-page chunks - Add bounding box highlighting using text anchor search - Run highlight generation in parallel with embedding/BM25 - Cache and restore page contents to isolate highlights per chunk Results: Highlighting success rate improved from 51% to 95% (121/128). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-22 02:46:30 +01:00
Chris Coutinho	a62a007c87	feat: Add context expansion to semantic search with chunk overlap removal Implements optional context expansion for semantic search results that fetches adjacent chunks (N-1 and N+1) from Qdrant to provide before/after context. Removes configurable chunk overlap (default 200 chars) to avoid duplicate text appearing in both context and excerpt. Key changes: - Add include_context and context_chars parameters to nc_semantic_search and nc_semantic_search_answer tools - Implement Qdrant cache fast path for chunk retrieval (avoids re-fetching and re-parsing documents, especially important for PDFs) - Add _get_chunk_by_index_from_qdrant() to fetch adjacent chunks - Remove chunk overlap from before_context (last N chars) and after_context (first N chars) to prevent duplicate text - Fetch context in parallel with anyio.Semaphore (max 20 concurrent) - Pass through page_number from SearchResult to SemanticSearchResult - Remove document-level deduplication (keep chunk-level dedup from algorithm) Context expansion is opt-in via include_context=true parameter. When enabled: - Populates has_context_expansion, marked_text, before_context, after_context - Adds truncation flags when context exceeds context_chars limit - Falls back to document fetch for legacy data with truncated excerpts Related: nextcloud_mcp_server/search/context.py:87-382, nextcloud_mcp_server/server/semantic.py:161-255	2025-11-21 01:02:22 +01:00
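The overlap removal described in a62a007c87 can be sketched as follows. This is a minimal illustration, not the code in `nextcloud_mcp_server/search/context.py`: the function name `strip_overlap` and its signature are hypothetical, and it assumes adjacent chunks share exactly `overlap` characters.

```python
def strip_overlap(prev_chunk, chunk, next_chunk, overlap=200):
    """Return (before_context, excerpt, after_context) without duplicated text.

    Hypothetical helper: assumes each chunk shares exactly `overlap`
    characters with its neighbour.
    """
    # The tail of the previous chunk repeats the head of the matched chunk,
    # so drop the last `overlap` characters from the "before" context.
    before = prev_chunk[: max(0, len(prev_chunk) - overlap)] if prev_chunk else ""
    # The head of the next chunk repeats the tail of the matched chunk,
    # so drop the first `overlap` characters from the "after" context.
    after = next_chunk[overlap:] if next_chunk else ""
    return before, chunk, after


text = "abcdefghijklmn"
# Sliding window of 6 characters with a 2-character overlap.
chunks = [text[0:6], text[4:10], text[8:14]]
before, excerpt, after = strip_overlap(chunks[0], chunks[1], chunks[2], overlap=2)
print(before + excerpt + after)
```

With the overlap stripped, concatenating the three parts reproduces the original text with no repeated characters.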
Chris Coutinho	5a251a99e6	fix: Set is_placeholder=False in processor to fix search filtering The processor was not setting is_placeholder field when writing real document chunks to Qdrant. This caused the placeholder filter to exclude all documents (since None != False), resulting in 0 search results. Now explicitly sets is_placeholder: False in payload when writing real indexed chunks, allowing search filters to correctly distinguish between placeholders and real documents.	2025-11-20 17:15:19 +01:00
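The `None != False` problem fixed in 5a251a99e6 can be reproduced in plain Python. The sketch below is only an analogy for a strict "field must equal False" filter; it does not call Qdrant, and the helper name `passes_placeholder_filter` is hypothetical.

```python
def passes_placeholder_filter(payload: dict) -> bool:
    # Strict match: the field must be present and exactly False.
    # A payload that never set the field yields None, which does not match.
    return payload.get("is_placeholder") is False


legacy_chunk = {"doc_id": 1}                            # field never written
fixed_chunk = {"doc_id": 2, "is_placeholder": False}    # written by the fixed processor
placeholder = {"doc_id": 3, "is_placeholder": True}     # queued, not yet indexed

visible = [p["doc_id"] for p in (legacy_chunk, fixed_chunk, placeholder)
           if passes_placeholder_filter(p)]
print(visible)
```

Only the chunk that explicitly stores `is_placeholder: False` survives the filter, which is why writing the field in the processor restored search results.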
Chris Coutinho	13b2d0048c	feat: Implement Qdrant placeholder state management Introduces a placeholder-based state tracking system to prevent duplicate document processing during the gap between scanner queuing and processor completion. Key Changes: 1. Placeholder Helper Functions (`vector/placeholder.py`): - `write_placeholder_point()` - Creates zero-vector placeholder when queuing - `query_document_metadata()` - Queries for existing entry (placeholder or real) - `delete_placeholder_point()` - Removes placeholder before writing real vectors - `get_placeholder_filter()` - Filters placeholders from user-facing queries 2. Scanner Updates (`vector/scanner.py`): - Replace `indexed_at` comparison with `modified_at` comparison - Write placeholder before queuing each document - Query per-document metadata instead of bulk-querying indexed_at - Fixes bug where files were resubmitted every scan cycle 3. Processor Updates (`vector/processor.py`): - Delete placeholder before upserting real vectors - Ensures no duplicate points in Qdrant 4. Query Filters (all search files): - Add `get_placeholder_filter()` to all user-facing queries - Ensures placeholders never appear in search results or visualizations - Applied to: bm25_hybrid.py, semantic.py, viz_routes.py, algorithms.py Architecture: - Placeholders use zero vectors with dimension from embedding service - Payload includes `is_placeholder: True` flag for filtering - Status field tracks: "pending", "processing", "completed", "failed" - Deterministic UUIDs using uuid5 for consistent point IDs Impact: - Eliminates duplicate processing of same documents - Fixes race condition where long-running documents get queued multiple times - Prevents scanner from resubmitting files every scan cycle - Maintains clean separation between in-flight and indexed documents 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 15:04:00 +01:00
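A rough sketch of the placeholder point described in 13b2d0048c, using only the standard library. The key format, the choice of `uuid.NAMESPACE_URL`, and the function name `build_placeholder` are assumptions made for illustration; the commit only states that IDs are derived with uuid5, vectors are zero-filled, and the payload carries `is_placeholder` and a status.

```python
import uuid


def build_placeholder(user_id: str, doc_type: str, doc_id, dimension: int) -> dict:
    """Build a placeholder point as a plain dict (hypothetical shape)."""
    # uuid5 is deterministic: the same key always yields the same point ID,
    # so re-queuing a document targets the same point instead of adding one.
    key = f"{user_id}:{doc_type}:{doc_id}"  # assumed key format
    point_id = str(uuid.uuid5(uuid.NAMESPACE_URL, key))
    return {
        "id": point_id,
        "vector": [0.0] * dimension,  # zero vector sized to the embedding model
        "payload": {
            "user_id": user_id,
            "doc_type": doc_type,
            "doc_id": doc_id,
            "is_placeholder": True,
            "status": "pending",
        },
    }


first = build_placeholder("alice", "note", 42, dimension=768)
again = build_placeholder("alice", "note", 42, dimension=768)
other = build_placeholder("alice", "note", 43, dimension=768)
print(first["id"] == again["id"], first["id"] == other["id"])
```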
Chris Coutinho	d67aa6ae5c	fix: Align PDF text extraction between indexing and context expansion This commit fixes two critical issues with PDF processing: 1. Text extraction mismatch (context expansion bug): - Indexing used pymupdf4llm.to_markdown() producing markdown text - Context expansion used page.get_text() producing plain text - Different text formats caused character offset misalignment - Search would find correct chunk, but expansion showed wrong section - Fixed by making context.py use pymupdf4llm.to_markdown() consistently 2. Diagnostic logging for page number assignment: - Added logging to verify page_boundaries exist in metadata - Added logging to verify assign_page_numbers() assigns values - Helps diagnose why page numbers show as null in search results 3. mime_type storage bug: - Fixed incorrect field reference in processor.py:405 - Was using file_metadata.get("content_type", "") - Should use content_type from WebDAV response Changes: - nextcloud_mcp_server/search/context.py: Use pymupdf4llm.to_markdown() for PDF text extraction to match indexing method - nextcloud_mcp_server/vector/processor.py: Add diagnostic logging for page boundaries and assignment, fix mime_type storage - tests/unit/client/test_webdav.py: Fix import sorting 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 13:57:50 +01:00
Chris Coutinho	d0691d5aa0	feat: Switch files to use numeric IDs with file_path resolution - scanner.py: Use file_info['id'] as doc_id instead of file_path - scanner.py: Pass file_path in DocumentTask for content retrieval - processor.py: Store file_path in Qdrant payload for later lookup - context.py: Add _get_file_path_from_qdrant() to resolve file_id → file_path - context.py: Update get_chunk_with_context() to handle file ID resolution This makes the system resilient to file renames since file IDs are stable identifiers in Nextcloud, while file paths can change.	2025-11-20 12:00:47 +01:00
Chris Coutinho	b8010270c1	fix: Add async/await, PDF metadata, and type safety fixes This commit addresses multiple issues with async operations, PDF metadata extraction, and type safety in document processing and search. ## Async/Await Fixes - processor.py:259 - Added await for chunker.chunk_text(content) - processor.py:270 - Added await for bm25_service.encode_batch(chunk_texts) - tests/unit/test_document_chunker.py - Converted all 12 test methods to async ## PDF Metadata Enhancement - pymupdf.py:143 - Added file_size metadata extraction - pymupdf.py:145-206 - Refactored to extract text page-by-page - Manually loop through pages instead of using page_chunks=True - Generate page_boundaries metadata for precise page tracking - Works around pymupdf.layout.activate() breaking page_chunks=True - processor.py:32-66 - Added assign_page_numbers() helper function - Assigns page numbers to chunks based on overlap with page boundaries - Handles chunks spanning multiple pages - processor.py:298-300 - Call assign_page_numbers() for PDF files ## Type Safety Fixes - bm25_hybrid.py:184 - Removed int() conversion of doc_id - semantic.py:131 - Removed int() conversion of doc_id - viz_routes.py:275 - Removed int() conversion of doc_id - Added comments documenting that doc_id can be int (notes) or str (file paths) ## Testing - All 18 tests passing (12 unit + 6 integration) - No type errors in modified files - Container logs show successful processing - Vector viz searches working correctly 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 02:37:07 +01:00
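The page assignment added in b8010270c1 can be illustrated with a small overlap calculation. This is a sketch under stated assumptions, not the `assign_page_numbers()` helper itself: the `(page, start, end)` tuple layout is hypothetical, and choosing the page with the largest overlap for a chunk that spans pages is an assumed tie-break the commit does not spell out.

```python
def page_for_chunk(chunk_start: int, chunk_end: int, page_boundaries):
    """Return the page whose character range overlaps the chunk the most.

    `page_boundaries` is a list of (page_number, start_offset, end_offset)
    tuples with half-open ranges [start, end). Returns None if no page overlaps.
    """
    best_page, best_overlap = None, 0
    for page, start, end in page_boundaries:
        # Length of the intersection of [chunk_start, chunk_end) and [start, end).
        overlap = min(chunk_end, end) - max(chunk_start, start)
        if overlap > best_overlap:
            best_page, best_overlap = page, overlap
    return best_page


boundaries = [(1, 0, 100), (2, 100, 250)]
print(page_for_chunk(10, 50, boundaries))    # entirely on page 1
print(page_for_chunk(80, 140, boundaries))   # spans both pages, mostly page 2
```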
Chris Coutinho	3aa7128f45	feat: add chunk position tracking to vector indexing and search Track character offsets (start_offset, end_offset) for each chunk in vector database metadata, enabling precise chunk highlighting in visualization pane. Changes: - processor.py: Store chunk_start_offset and chunk_end_offset in Qdrant metadata - processor.py: Added metadata_version=2 to indicate position tracking support - search/semantic.py: Return chunk positions from search results - server/semantic.py: Expose chunk positions in API responses (SemanticSearchResult) Enables viz pane to: 1. Display exact matched chunk with surrounding context 2. Highlight the precise portion of text that matched the query 3. Build user trust by showing what the RAG system actually retrieved Position tracking uses ChunkWithPosition dataclass from document_chunker.py which provides character-accurate offsets in the original document. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-17 06:47:58 +01:00
Chris Coutinho	c28fc955ca	Merge origin/master into feature/bm25 Resolved conflicts: - viz_routes.py: Kept bm25's extract_dense_vector() function for robust vector handling - hybrid.py: Removed (bm25 uses native Qdrant RRF fusion instead) - uv.lock: Regenerated after accepting master's dependencies This merge brings in: - RAG evaluation framework (ADR-013) - Performance optimizations (double-fetch elimination) - Migration from asyncio to anyio - OpenTelemetry tracing improvements - Notes app enhancements 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 11:52:40 +01:00
Chris Coutinho	6fe5596c13	feat: Implement BM25 hybrid search with native Qdrant RRF fusion Replace custom keyword/fuzzy search algorithms with industry-standard BM25 sparse vectors, combined with dense semantic vectors using Qdrant's native Reciprocal Rank Fusion (RRF). This consolidates search architecture and improves relevance for both semantic and keyword queries. Key changes: - Add fastembed dependency for BM25 sparse vector generation - Update Qdrant collection schema to support named vectors (dense + sparse) - Create BM25SparseEmbeddingProvider using FastEmbed's Qdrant/bm25 model - Implement BM25HybridSearchAlgorithm with native Qdrant RRF prefetch - Update document processor to generate both dense and sparse embeddings - Simplify nc_semantic_search() tool to use BM25 hybrid only - Remove legacy keyword.py, fuzzy.py, and custom hybrid.py (736 lines) - Update ADR-014 with implementation notes and test results Benefits: - Consolidated architecture (single Qdrant database) - Native database-level RRF fusion (more efficient) - Industry-standard BM25 (replaces brittle custom keyword search) - Better relevance across semantic and keyword queries - Simplified codebase (-285 net lines) Tests: All 125 tests passing (118 unit, 7 integration) Implements ADR-014: Replace Custom Keyword Search with BM25 Hybrid Search 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 06:59:44 +01:00
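For readers unfamiliar with Reciprocal Rank Fusion, the textbook form of the scoring used conceptually in 6fe5596c13 is shown below. In the project the fusion runs inside Qdrant, so this client-side function is purely illustrative; the name `rrf_fuse` and the constant `k=60` (a commonly cited default) are assumptions, not values taken from the commit.

```python
from collections import defaultdict


def rrf_fuse(rankings, k: int = 60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. Higher fused score ranks first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    return sorted(scores, key=lambda d: (-scores[d], str(d)))


dense_hits = ["a", "b", "c"]    # e.g. semantic (dense vector) ranking
sparse_hits = ["b", "d", "a"]   # e.g. BM25 (sparse vector) ranking
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)
```

Documents that appear in both lists ("a" and "b") outrank those found by only one retriever, which is the property that makes the hybrid useful for both semantic and keyword queries.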
Chris Coutinho	c8d9cc24e0	refactor: migrate asyncio to anyio for consistent structured concurrency Replace asyncio primitives with anyio equivalents throughout the codebase to establish a single async pattern. This provides better structured concurrency with automatic cancellation on errors and aligns with the pytest anyio configuration. Changes: - hybrid.py: Replace asyncio.gather() with anyio task groups - token_broker.py: Replace asyncio.Lock() with anyio.Lock() - storage.py: Replace asyncio.run() with anyio.run() - app.py: Replace tg.start_soon() with await tg.start() for task status - processor.py: Add task_status parameter for structured startup - scanner.py: Add task_status parameter for structured startup - CLAUDE.md: Update async/await patterns guidance The change from start_soon() to await tg.start() enables proper task initialization signaling, ensuring background tasks are ready before proceeding. This follows anyio best practices for structured concurrency. All 118 unit tests pass with the new implementation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 03:51:45 +01:00
Chris Coutinho	a667d7c59c	feat: Add metrics instrumentation for queue, health, and database operations Implement Prometheus metrics to populate empty Grafana dashboard panels. ## Phase 1: Queue Size Metrics ✅ File: `processor.py` - Track vector sync queue depth in real-time - Update metric after receiving and processing each document - Update metric during timeout (empty queue) - Enables: "Processing Queue Depth" panel ## Phase 2: Health Check Metrics ✅ File: `app.py` - Add Nextcloud connectivity check with timing - Add Qdrant health check with timing - Record dependency health status (up/down) - Record health check duration - Enables: 4 health status panels + health check duration panel ## Phase 3: Database Operation Metrics (Partial) ⏳ File: `storage.py` - Instrument `store_refresh_token()` method - Track SQLite INSERT operation timing and success/error status - Enables: Partial data for database operation latency panel ## Metrics Now Exposed ### Queue Metrics: - `mcp_vector_sync_queue_size` - Real-time queue depth ### Health Metrics: - `mcp_dependency_health{dependency="nextcloud"}` - UP/DOWN status - `mcp_dependency_health{dependency="qdrant"}` - UP/DOWN status - `mcp_dependency_check_duration_seconds{dependency}` - Health check latency ### Database Metrics: - `mcp_db_operations_total{db="sqlite",operation="insert"}` - Operation count - `mcp_db_operation_duration_seconds{db="sqlite",operation="insert"}` - Operation latency ## Dashboard Impact Panels Now Populated (7/34 panels): - ✅ Processing Queue Depth - ✅ Nextcloud Health - ✅ Qdrant Health - ✅ Health Check Duration - ✅ Database Operation Latency (partial) - ✅ Vector sync panels (already working from PR #292) Panels Still Empty (remaining work): - ⏳ OAuth panels (4): Token validations, exchanges, cache hit rate, refresh ops - ⏳ MCP tool panels (3): Call volume, error rates, execution duration - ⏳ Database panel: Needs more SQLite operations instrumented (~29 remaining) ## Testing Verified metric definitions 
exist and will be recorded on next deployment. ## Next Steps Phase 4: OAuth token metrics (unified_verifier.py, context_helper.py, storage.py) Phase 5: MCP tool metrics (all server/*.py files with @mcp.tool()) Phase 3 completion: Remaining 29 database operations in storage.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-13 16:14:38 +01:00
Chris Coutinho	4ea5ed72d4	feat: Add Grafana dashboard and vector sync metric instrumentation Implement comprehensive observability for vector database synchronization with Grafana dashboard and Prometheus metrics. ## Part 1: Grafana Dashboard Created all-in-one operations dashboard with 7 rows and 34 panels: ### Dashboard Structure: - Overview Row: Request rate, error rate, P95 latency, active requests - HTTP Metrics (RED): Request/error rates by endpoint, latency percentiles - MCP Tools: Call volume, error rates, execution duration by tool - Nextcloud API: API calls/latency by app, retry patterns - OAuth & Authentication: Token validations, exchanges, cache hit rate - Dependencies & Health: Status for Nextcloud/Qdrant/Keycloak/Unstructured - Vector Sync: Processing throughput, queue depth, Qdrant operations ### Helm Chart Integration: - Added dashboard-configmap.yaml template for automatic provisioning - Configured Grafana sidecar auto-discovery (label: grafana_dashboard="1") - Added dashboards configuration section in values.yaml (opt-in) - Updated Chart.yaml with dashboard annotations - Enhanced NOTES.txt with dashboard deployment instructions - Comprehensive documentation in dashboards/README.md Dashboard supports dynamic filtering via variables: - datasource: Prometheus data source selection - namespace: Filter by Kubernetes namespace - pod: Multi-select pod filtering - interval: Query interval (1m/5m/10m/30m/1h) ## Part 2: Vector Sync Metric Instrumentation Implemented metric recording throughout vector sync pipeline: ### metrics.py: Added convenience functions: - record_vector_sync_scan() - Track documents per scan - record_vector_sync_processing() - Track processing duration/status - record_qdrant_operation() - Track database operations - update_vector_sync_queue_size() - Track queue depth ### scanner.py: - Record number of documents found in each scan - Enables monitoring of scan throughput ### processor.py: - Record processing duration for each document - 
Track success/failure status with timing - Record Qdrant upsert/delete operations - Handle all code paths (success, deletion, error) ### semantic.py: - Wrap Qdrant query_points with try/except - Record search operation success/failure ## Metrics Exposed: - mcp_vector_sync_documents_scanned_total - mcp_vector_sync_documents_processed_total{status} - mcp_vector_sync_processing_duration_seconds (histogram) - mcp_vector_sync_queue_size (gauge) - mcp_qdrant_operations_total{operation,status} This enables monitoring of: - Scan and processing throughput - Processing latency (P50/P95/P99) - Error rates for processing and Qdrant operations - Queue depth trends - Complete observability of vector sync pipeline ## Testing: Verified locally that metrics are recorded correctly: - 36 documents scanned - 3 documents processed (avg 7.5s each) - 3 successful Qdrant upsert operations - Search operations tracked ## Deployment: Enable dashboard provisioning in Helm values: ```yaml dashboards: enabled: true grafanaFolder: "Nextcloud MCP" ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-13 11:49:20 +01:00
Chris Coutinho	a6e5f3d8ff	refactor: simplify OpenTelemetry tracing configuration Simplifies the OpenTelemetry tracing setup by removing the redundant OTEL_ENABLED flag and using the presence of OTEL_EXPORTER_OTLP_ENDPOINT to determine if tracing should be enabled. This follows the standard OpenTelemetry environment variable conventions more closely. Changes: - Remove OTEL_ENABLED/tracing_enabled flag in favor of checking if OTEL_EXPORTER_OTLP_ENDPOINT is set - Add OTEL_EXPORTER_VERIFY_SSL configuration option for OTLP endpoints with self-signed certificates (defaults to false for development) - Move HTTPXClientInstrumentor initialization to module level to ensure httpx calls are traced across all Nextcloud API requests - Add tracing spans to vector sync operations (scan_user_documents) - Fix authorization header logging to only warn about missing headers in OAuth mode (BasicAuth mode doesn't use Authorization headers) - Update observability documentation to reflect simplified configuration - Refactor Dockerfile to use --no-editable flag for uv sync Breaking changes: - OTEL_ENABLED environment variable is removed - Tracing is now automatically enabled when OTEL_EXPORTER_OTLP_ENDPOINT is set Migration guide: - Remove OTEL_ENABLED=true from environment configuration - Tracing will be enabled automatically if OTEL_EXPORTER_OTLP_ENDPOINT is configured 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-10 22:48:37 +01:00
Chris Coutinho	cb39b3fca4	feat(vector): Add configurable chunk size and overlap for document embedding Enable users to tune document chunking parameters to match their embedding model and content type by adding DOCUMENT_CHUNK_SIZE and DOCUMENT_CHUNK_OVERLAP environment variables. - config.py: Added `document_chunk_size` (default: 512) and `document_chunk_overlap` (default: 50) configuration fields with validation: - Ensures overlap < chunk_size - Warns if chunk_size < 100 words - Prevents negative overlap values - processor.py: Updated DocumentChunker instantiation to use config settings instead of hardcoded values (line 174-177) - tests/unit/test_config.py: Added TestChunkConfigValidation class with 9 tests covering: - Default values - Valid configurations - Validation errors (overlap >= chunk_size, negative overlap) - Warning for small chunk sizes - Environment variable loading - docs/configuration.md: Added comprehensive "Document Chunking Configuration" section with: - Chunk size selection guidance (256-384 vs 512 vs 768-1024 words) - Overlap recommendations (10-20% of chunk size) - Configuration examples for different use cases - Added env vars to reference table - docs/semantic-search-architecture.md: Added "Document Chunking Strategy" section with: - Chunking process explanation - Example showing sliding window behavior - Search behavior with chunks - Tuning recommendations - env.sample: Added complete "Semantic Search & Vector Sync Configuration" section with: - Vector sync settings - Qdrant configuration (3 modes) - Ollama embedding service - Document chunking configuration - docker-compose.yml: Added commented examples for DOCUMENT_CHUNK_SIZE and DOCUMENT_CHUNK_OVERLAP with usage notes ```bash DOCUMENT_CHUNK_SIZE=512 DOCUMENT_CHUNK_OVERLAP=50 ``` 1. `overlap` must be less than `chunk_size` 2. `overlap` cannot be negative 3. 
Warning issued if `chunk_size` < 100 words Precise matching (small notes, specific queries): ```bash DOCUMENT_CHUNK_SIZE=256 DOCUMENT_CHUNK_OVERLAP=25 ``` Balanced (default, general purpose): ```bash DOCUMENT_CHUNK_SIZE=512 DOCUMENT_CHUNK_OVERLAP=50 ``` Contextual (long documents, broader topics): ```bash DOCUMENT_CHUNK_SIZE=1024 DOCUMENT_CHUNK_OVERLAP=100 ``` ✅ User control - Tune chunking to match embedding model capabilities ✅ Experimentation - Test different chunk sizes for optimal results ✅ Model alignment - Match chunk size to embedding context window ✅ Backward compatible - Defaults maintain existing behavior ✅ Well validated - Comprehensive tests prevent misconfiguration All 22 config validation tests pass (9 new tests for chunking): - Default values work correctly - Validation prevents invalid configurations - Environment variables load properly - Warning system works as expected With configurable chunk sizes, users can now experiment with different Ollama embedding models and tune chunk parameters for optimal semantic search quality. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-10 02:47:57 +01:00
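The three validation rules listed in cb39b3fca4 can be expressed as a short check. This is a minimal sketch, assuming a free-standing function named `validate_chunk_config` (hypothetical); the project's own validation lives in `config.py` and may be structured differently. Only the defaults (512 and 50) and the three rules come from the commit message.

```python
import warnings


def validate_chunk_config(chunk_size: int = 512, overlap: int = 50):
    """Validate chunking settings and return them unchanged if acceptable."""
    # Rule 2: overlap cannot be negative.
    if overlap < 0:
        raise ValueError("DOCUMENT_CHUNK_OVERLAP cannot be negative")
    # Rule 1: overlap must be strictly less than chunk_size.
    if overlap >= chunk_size:
        raise ValueError("DOCUMENT_CHUNK_OVERLAP must be less than DOCUMENT_CHUNK_SIZE")
    # Rule 3: small chunks are allowed but produce a warning.
    if chunk_size < 100:
        warnings.warn("DOCUMENT_CHUNK_SIZE below 100 words may hurt search quality")
    return chunk_size, overlap


print(validate_chunk_config())            # defaults
print(validate_chunk_config(1024, 100))   # "contextual" preset from the commit
```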
Chris Coutinho	e575c8e57b	feat(vector): Support multiple embedding models with auto-generated collection names This PR enables safe switching between embedding models and multi-server deployments by implementing auto-generated Qdrant collection names based on deployment ID and model name. ## Problem Previously, all deployments used a single hardcoded collection name "nextcloud_content", which caused two critical issues: 1. Dimension mismatches when switching models: Changing OLLAMA_EMBEDDING_MODEL (e.g., nomic-embed-text at 768D → all-minilm at 384D) would cause runtime errors as vectors couldn't be inserted into a collection with incompatible dimensions. 2. Collection collisions in multi-server setups: Multiple MCP servers sharing a single Qdrant instance would overwrite each other's data, making horizontal scaling impossible. ## Solution ### Auto-Generated Collection Naming Collections are now automatically named using the pattern: `{deployment-id}-{model-name}` Deployment ID: Uses `OTEL_SERVICE_NAME` if configured (and not default value), otherwise falls back to `hostname` for simple Docker deployments. Model Name: From `OLLAMA_EMBEDDING_MODEL` with path separators sanitized. Examples: - `my-mcp-server-nomic-embed-text` (with OTEL_SERVICE_NAME=my-mcp-server) - `mcp-container-all-minilm` (simple Docker, hostname=mcp-container) Override: Users can still set `QDRANT_COLLECTION` explicitly to bypass auto-generation for backward compatibility. ### Dimension Validation Added startup validation that checks collection dimensions match the embedding service. 
If a mismatch is detected, the server fails fast with a clear error message explaining: - Expected vs actual dimensions - Likely cause (model change) - Solutions (delete collection, use different name, or revert model) ### Improved Sampling Error Handling Enhanced MCP sampling rejection handling to treat user rejections as normal behavior rather than errors: - User rejections ("rejected", "denied") → INFO log, no traceback - Unsupported clients → INFO log, no traceback - Other MCP errors → WARNING log, no traceback - Unexpected errors → ERROR log WITH traceback This aligns with the MCP specification where clients SHOULD prompt users for approval/denial of sampling requests. ## Changes ### Core Implementation - nextcloud_mcp_server/config.py: Added \`get_collection_name()\` method with deployment ID detection and model name sanitization - nextcloud_mcp_server/vector/qdrant_client.py: Dimension validation on collection open with helpful error messages - nextcloud_mcp_server/vector/{scanner,processor}.py: Updated to use \`get_collection_name()\` - nextcloud_mcp_server/auth/userinfo_routes.py: Vector sync status uses \`get_collection_name()\` - nextcloud_mcp_server/server/semantic.py: - Updated semantic search tools to use \`get_collection_name()\` - Improved sampling rejection error handling (McpError vs Exception) ### Documentation - docs/semantic-search-architecture.md: New comprehensive architecture document (557 lines) covering background sync, semantic search flow, RAG implementation, and deployment modes - docs/configuration.md: Added detailed "Qdrant Collection Naming" section with examples and multi-server deployment guidance - docker-compose.yml: Added comments explaining collection naming behavior - README.md: Updated semantic search descriptions to clarify experimental status, Notes-only support, and infrastructure requirements ## Migration Guide For existing single-server deployments: Option 1 (Recommended): Use explicit collection name for continuity 
```bash
QDRANT_COLLECTION=nextcloud_content  # Keep existing collection
```

Option 2: Allow auto-generation and re-embed

```bash
# Remove QDRANT_COLLECTION override
# New collection will be created based on deployment ID + model
# Requires re-embedding all documents (may take time)
```

For new multi-server deployments: Set unique OTEL service names per server:

```bash
# Server 1
OTEL_SERVICE_NAME=mcp-prod
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# → Collection: "mcp-prod-nomic-embed-text"

# Server 2
OTEL_SERVICE_NAME=mcp-staging
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# → Collection: "mcp-staging-nomic-embed-text"
```

## Benefits
✅ Safe model switching: Each model gets its own collection, preventing dimension mismatch errors
✅ Multi-server support: Multiple MCP servers can share one Qdrant instance without conflicts
✅ Clear ownership: Collection names show which deployment and model owns the data
✅ Better error messages: Dimension validation provides actionable guidance
✅ Backward compatible: Existing deployments can continue using `QDRANT_COLLECTION` override

## Testing
Validated with:
- Single-server deployments (default hostname-based naming)
- Multi-server deployments (OTEL service name-based naming)
- Model switching scenarios (dimension validation)
- Collection override scenarios (backward compatibility)

Next steps: Testing various Ollama embedding models to investigate optimal chunk sizes and performance characteristics.

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-10 01:18:30 +01:00
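Note on e575c8e57b: the naming rule described in this commit can be sketched roughly as below. This is an illustration, not the repository's `get_collection_name()` implementation. The function name `sketch_collection_name`, the placeholder default service name, the fallback model name, and the choice of `-` as the replacement for path separators are all assumptions made for the example.

```python
import socket
from typing import Mapping

# Assumption: placeholder for whatever default OTEL service name the
# project treats as "not configured". The real value is not given above.
ASSUMED_DEFAULT_SERVICE_NAME = "nextcloud-mcp-server"


def sketch_collection_name(env: Mapping[str, str]) -> str:
    """Hypothetical helper deriving '{deployment-id}-{model-name}'."""
    # An explicit QDRANT_COLLECTION bypasses auto-generation.
    override = env.get("QDRANT_COLLECTION")
    if override:
        return override

    # Deployment ID: OTEL_SERVICE_NAME if set to a non-default value,
    # otherwise the machine hostname.
    service = env.get("OTEL_SERVICE_NAME", "")
    if service and service != ASSUMED_DEFAULT_SERVICE_NAME:
        deployment_id = service
    else:
        deployment_id = socket.gethostname()

    # Assumption: fall back to nomic-embed-text when no model is configured.
    model = env.get("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text")
    # Assumption: path separators are replaced with "-".
    safe_model = model.replace("/", "-").replace("\\", "-")
    return f"{deployment_id}-{safe_model}"
```

Passing the environment as a mapping (for example `os.environ`) keeps the sketch easy to exercise with the values from the migration guide, such as `OTEL_SERVICE_NAME=mcp-prod`.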
Chris Coutinho	72232f937a	refactor: migrate vector sync from asyncio.Queue to anyio memory object streams Replace asyncio.Queue with anyio.create_memory_object_stream() throughout the vector sync system for better library consistency and improved shutdown semantics. ## Changes Made scanner.py: - Changed parameter type from `asyncio.Queue` to `MemoryObjectSendStream[DocumentTask]` - Replaced all `await document_queue.put()` calls with `await send_stream.send()` - Wrapped scanner loop in `async with send_stream:` context manager for automatic cleanup - Updated log messages: "Queued" → "Sent" - Removed `import asyncio` (no longer needed) processor.py: - Changed parameter type from `asyncio.Queue` to `MemoryObjectReceiveStream[DocumentTask]` - Replaced `asyncio.wait_for(document_queue.get(), timeout=1.0)` with `anyio.fail_after(1.0)` + `await receive_stream.receive()` - Removed all `document_queue.task_done()` calls (not needed with streams) - Added `anyio.EndOfStream` exception handling for graceful shutdown when scanner closes - Removed `import asyncio` (no longer needed) app.py: - Removed `import asyncio` from top-level imports - Added `from anyio.streams.memory import MemoryObjectReceiveStream, MemoryObjectSendStream` - Updated AppContext dataclass: - Replaced `document_queue: Optional[asyncio.Queue]` with: - `document_send_stream: Optional[MemoryObjectSendStream]` - `document_receive_stream: Optional[MemoryObjectReceiveStream]` - Updated `app_lifespan_basic()`: - Replaced `asyncio.Queue(maxsize=...)` with `anyio.create_memory_object_stream(max_buffer_size=...)` - Pass `send_stream` to scanner_task - Pass `receive_stream.clone()` to each processor_task (enables multiple consumers) - Updated AppContext yield to include both streams - Updated `starlette_lifespan()`: - Same changes as app_lifespan_basic for streamable-http transport - Removed `import asyncio as asyncio_module` (no longer needed) - Updated app.state storage to use send_stream and receive_stream 
semantic.py: - Updated `nc_get_vector_sync_status()` tool: - Access `document_receive_stream` instead of `document_queue` from lifespan context - Use `stream_stats.current_buffer_used` instead of `queue.qsize()` for pending count - More reliable metrics (qsize() was not guaranteed accurate) ## Benefits 1. Library Consistency: Pure anyio throughout codebase (was mixing asyncio.Queue with anyio.Event and anyio.create_task_group) 2. Graceful Shutdown: `async with send_stream:` automatically closes stream on exit, signaling EndOfStream to all processors 3. Better Timeout Handling: `anyio.fail_after()` is more idiomatic than `asyncio.wait_for()` 4. Stream Cloning: Easy to add multiple consumers via `receive_stream.clone()` 5. Better Statistics: `.statistics()` provides accurate buffer metrics (qsize() was unreliable) 6. Type Safety: Separate send/receive types prevent accidental misuse 7. No task_done() tracking: Streams handle completion automatically ## Testing - ✅ All 69 unit tests passing - ✅ All 5 smoke tests passing - ✅ No regressions in functionality - ✅ Graceful shutdown behavior improved ## References - https://anyio.readthedocs.io/en/stable/why.html#queue-fix - https://anyio.readthedocs.io/en/stable/streams.html#memory-object-streams 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 06:43:44 +01:00
Chris Coutinho	fdd82f59e2	feat: implement semantic search tool and fix vector sync issues (ADR-007 Phase 3) Completes the ADR-007 implementation by adding user-facing semantic search functionality. Previous phases implemented scanner and processor for background indexing; this adds the query interface. Changes: - Add nc_notes_semantic_search MCP tool for natural language queries - Fix Qdrant point IDs to use UUIDs instead of strings (was causing 400 errors) - Reduce scan interval default from 1 hour to 5 minutes for faster updates - Add SemanticSearchResult and SemanticSearchNotesResponse models - Implement dual-phase authorization (Qdrant filter + Nextcloud API verification) The semantic search enables finding notes by meaning rather than exact keywords, using vector embeddings to understand query intent. Point ID fix resolves critical bug where all document indexing failed with "invalid point ID" errors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-08 21:51:12 +01:00
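Note on fdd82f59e2: Qdrant accepts only unsigned integers or UUIDs as point IDs, which is why arbitrary strings were rejected. The commit does not say how the UUIDs are derived. One possible approach, shown here purely as an assumption, is a deterministic `uuid5` over the document coordinates, so that re-indexing the same chunk overwrites the same point. The namespace, the key layout, and the function name are hypothetical.

```python
import uuid

# Assumption: an arbitrary fixed namespace chosen for this sketch only.
ASSUMED_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/vector-sync-sketch")


def sketch_point_id(user_id: str, doc_type: str, doc_id: str, chunk_index: int) -> str:
    """Hypothetical helper: map a chunk's coordinates to a stable UUID string."""
    key = f"{user_id}:{doc_type}:{doc_id}:{chunk_index}"
    return str(uuid.uuid5(ASSUMED_NAMESPACE, key))
```

A random `uuid.uuid4()` would also satisfy Qdrant's ID format, but it would create a new point on every re-index instead of updating the existing one.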
Chris Coutinho	8f45e996e8	feat: implement vector sync scanner and processor (ADR-007 Phase 2) Implements background vector database synchronization using anyio TaskGroups for BasicAuth mode with single-user credentials. Scanner Implementation: - Periodic document discovery (hourly, configurable) - Timestamp-based change detection (Nextcloud vs Qdrant) - Wake event for immediate scanning on-demand - Supports both initial sync (all docs) and incremental sync (changes only) - Detects deleted documents and queues for removal Processor Implementation: - Concurrent document processing pool (3 workers default) - I/O-bound embedding generation via Ollama API - Retry logic with exponential backoff (3 retries) - Document chunking (512 words, 50-word overlap) - Handles both index and delete operations - Upserts vectors to Qdrant with rich metadata App Lifespan Integration: - Extended AppContext with background task state - Modified app_lifespan_basic() to start tasks via anyio TaskGroups - Graceful shutdown with coordinated task cancellation - Only activates when VECTOR_SYNC_ENABLED=true Embedding Service: - OllamaEmbeddingProvider with TLS support - Singleton pattern for shared client instances - Batch embedding support for efficiency - Auto-detects embedding dimension (768 for nomic-embed-text) Qdrant Client: - Async client wrapper with singleton pattern - Auto-creates collection on first use - COSINE distance metric for semantic similarity - Integrates with embedding service for dimension detection Health Check Enhancement: - Added Qdrant status check to /health/ready endpoint - Only checks when VECTOR_SYNC_ENABLED=true - 2-second timeout for health probe - Reports connection errors with details Configuration: - VECTOR_SYNC_ENABLED: Enable background sync - VECTOR_SYNC_SCAN_INTERVAL: Scanner frequency (3600s default) - VECTOR_SYNC_PROCESSOR_WORKERS: Concurrent processors (3 default) - QDRANT_URL, QDRANT_API_KEY, QDRANT_COLLECTION: Vector DB config - OLLAMA_BASE_URL, 
OLLAMA_EMBEDDING_MODEL: Embedding service config Dependencies Added: - qdrant-client>=1.7.0: Vector database client Docker Compose: - Added Qdrant service with health check - Exposed ports 6333 (REST) and 6334 (gRPC) - Configured MCP service with vector sync environment - Added qdrant-data volume for persistence Known Issue: - FastMCP lifespan not triggering for streamable-http transport - Background tasks will start once lifespan integration is complete - Lifespan triggers on MCP session establishment, not server startup Related: ADR-007 Background Vector Database Synchronization 🤖 Generated with Claude Code (https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-08 21:14:38 +01:00
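Note on 8f45e996e8: the "512 words, 50-word overlap" chunking mentioned in this commit can be illustrated with a small sliding-window function. This is a sketch under the assumption that words are split on whitespace; the function name is hypothetical and the repository's processor may tokenize differently.

```python
def chunk_words(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into word windows where consecutive windows share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()  # assumption: whitespace-delimited words
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk consisting only of overlap is emitted.
        if start + chunk_size >= len(words):
            break
    return chunks
```

With the defaults, each new chunk starts 462 words after the previous one, so a sentence that straddles a chunk boundary still appears intact in at least one chunk as long as it is shorter than the overlap.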

25 Commits