nextcloud-mcp-server

Author	SHA1	Message	Date
Chris Coutinho	5a251a99e6	fix: Set is_placeholder=False in processor to fix search filtering The processor was not setting is_placeholder field when writing real document chunks to Qdrant. This caused the placeholder filter to exclude all documents (since None != False), resulting in 0 search results. Now explicitly sets is_placeholder: False in payload when writing real indexed chunks, allowing search filters to correctly distinguish between placeholders and real documents.	2025-11-20 17:15:19 +01:00
Chris Coutinho	25ef33de7f	feat: Use Ollama native batch API in embed_batch() - Switch from sequential loop to /api/embed batch endpoint - Use 'input' array parameter instead of individual 'prompt' requests - Process in chunks of 32 to avoid quality degradation (issue #6262) - Reduces HTTP overhead: 128 texts = 4 requests instead of 128 - Maintains backward compatibility with embed() for single embeddings Ref: ollama/ollama#6262	2025-11-20 16:50:13 +01:00
Chris Coutinho	ec2c274cd9	fix: Increase placeholder staleness threshold to 5x scan interval - Changed from 2x (120s) to 5x (300s) scan interval - Large PDFs take 3-4 minutes to process, need longer threshold - Prevents premature requeuing of in-flight documents	2025-11-20 15:36:49 +01:00
Chris Coutinho	47f0b3db9a	fix: Add placeholder staleness check to prevent duplicate processing - Only requeue documents if placeholder is older than 2x scan interval (120s default) - Prevents scanner from immediately requeuing in-flight documents - Fixes issue where PDFs were being reprocessed every 60 seconds - Staleness check applied to both notes and files scanning logic	2025-11-20 15:30:10 +01:00
Chris Coutinho	233de3508f	fix: Use empty SparseVector instead of None for placeholders Qdrant validation rejects None for sparse vectors in named vector dicts. Use models.SparseVector(indices=[], values=[]) instead to create valid empty sparse vectors for placeholder points. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 15:15:10 +01:00
Chris Coutinho	13b2d0048c	feat: Implement Qdrant placeholder state management Introduces a placeholder-based state tracking system to prevent duplicate document processing during the gap between scanner queuing and processor completion. Key Changes: 1. Placeholder Helper Functions (`vector/placeholder.py`): - `write_placeholder_point()` - Creates zero-vector placeholder when queuing - `query_document_metadata()` - Queries for existing entry (placeholder or real) - `delete_placeholder_point()` - Removes placeholder before writing real vectors - `get_placeholder_filter()` - Filters placeholders from user-facing queries 2. Scanner Updates (`vector/scanner.py`): - Replace `indexed_at` comparison with `modified_at` comparison - Write placeholder before queuing each document - Query per-document metadata instead of bulk-querying indexed_at - Fixes bug where files were resubmitted every scan cycle 3. Processor Updates (`vector/processor.py`): - Delete placeholder before upserting real vectors - Ensures no duplicate points in Qdrant 4. Query Filters (all search files): - Add `get_placeholder_filter()` to all user-facing queries - Ensures placeholders never appear in search results or visualizations - Applied to: bm25_hybrid.py, semantic.py, viz_routes.py, algorithms.py Architecture: - Placeholders use zero vectors with dimension from embedding service - Payload includes `is_placeholder: True` flag for filtering - Status field tracks: "pending", "processing", "completed", "failed" - Deterministic UUIDs using uuid5 for consistent point IDs Impact: - Eliminates duplicate processing of same documents - Fixes race condition where long-running documents get queued multiple times - Prevents scanner from resubmitting files every scan cycle - Maintains clean separation between in-flight and indexed documents 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 15:04:00 +01:00
Chris Coutinho	944dd760ca	fix: Return empty array instead of null for query_coords when no results When vector visualization search returns zero results, the code was returning query_coords: null, which caused JavaScript error "can't access property 0, queryCoords is null" when the frontend tried to access the array. Changed to return empty array [] to match expected type and prevent crash. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 14:18:02 +01:00
Chris Coutinho	d67aa6ae5c	fix: Align PDF text extraction between indexing and context expansion This commit fixes two critical issues with PDF processing: 1. Text extraction mismatch (context expansion bug): - Indexing used pymupdf4llm.to_markdown() producing markdown text - Context expansion used page.get_text() producing plain text - Different text formats caused character offset misalignment - Search would find correct chunk, but expansion showed wrong section - Fixed by making context.py use pymupdf4llm.to_markdown() consistently 2. Diagnostic logging for page number assignment: - Added logging to verify page_boundaries exist in metadata - Added logging to verify assign_page_numbers() assigns values - Helps diagnose why page numbers show as null in search results 3. mime_type storage bug: - Fixed incorrect field reference in processor.py:405 - Was using file_metadata.get("content_type", "") - Should use content_type from WebDAV response Changes: - nextcloud_mcp_server/search/context.py: Use pymupdf4llm.to_markdown() for PDF text extraction to match indexing method - nextcloud_mcp_server/vector/processor.py: Add diagnostic logging for page boundaries and assignment, fix mime_type storage - tests/unit/client/test_webdav.py: Fix import sorting 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 13:57:50 +01:00
Chris Coutinho	f1a5fac1b9	fix: Update models and viz to use int-only doc_id - algorithms.py: Revert SearchResult.id to int (all docs use int IDs now) - semantic.py: Revert SemanticSearchResult.id to int, remove Union import - viz_routes.py: Remove str() conversion when querying doc_id from Qdrant - viz_routes.py: Convert doc_id from query param to int in chunk context Fixes vector visualization which was collapsing all chunks to a single point because Qdrant queries were failing to match doc_id (string vs int).	2025-11-20 12:32:27 +01:00
Chris Coutinho	d0691d5aa0	feat: Switch files to use numeric IDs with file_path resolution - scanner.py: Use file_info['id'] as doc_id instead of file_path - scanner.py: Pass file_path in DocumentTask for content retrieval - processor.py: Store file_path in Qdrant payload for later lookup - context.py: Add _get_file_path_from_qdrant() to resolve file_id → file_path - context.py: Update get_chunk_with_context() to handle file ID resolution This makes the system resilient to file renames since file IDs are stable identifiers in Nextcloud, while file paths can change.	2025-11-20 12:00:47 +01:00
Chris Coutinho	f1610bbd2e	fix: Reconstruct full content for notes to match indexed offsets Notes are indexed as "{title}\n\n{content}" in processor.py but were being retrieved as just content during chunk expansion, causing chunk_start_offset and chunk_end_offset to be misaligned. This fix reconstructs the full content structure when fetching notes for chunk expansion, ensuring the displayed chunks match the excerpts shown in search results. Fixes chunk/excerpt mismatch reported in vector visualization. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 11:33:12 +01:00
Chris Coutinho	327d843f64	feat: Implement per-chunk vector visualization with context expansion Major improvements to vector visualization page: - Refactor PCA to display individual chunks instead of averaged documents - Add context expansion module for fetching surrounding text from notes and PDFs - Update deduplication to use (doc_id, doc_type, chunk_start, chunk_end) keys - Fix Alpine.js rendering with chunk-specific keys including offsets - Refactor authentication helper to return NextcloudClient for better reuse - Add async context manager support to NextcloudClient Technical details: - viz_routes.py: Fetch specific chunk vectors instead of averaging per document - context.py: New module supporting both notes and PDF text extraction via PyMuPDF - search algorithms: Extract page_number, chunk_index, total_chunks from Qdrant - vector-viz.js/html: Use chunk positions in expansion tracking keys This enables users to see which specific chunks match their query and view them with surrounding context in the PCA visualization. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 11:22:20 +01:00
Chris Coutinho	b8010270c1	fix: Add async/await, PDF metadata, and type safety fixes This commit addresses multiple issues with async operations, PDF metadata extraction, and type safety in document processing and search. ## Async/Await Fixes - processor.py:259 - Added await for chunker.chunk_text(content) - processor.py:270 - Added await for bm25_service.encode_batch(chunk_texts) - tests/unit/test_document_chunker.py - Converted all 12 test methods to async ## PDF Metadata Enhancement - pymupdf.py:143 - Added file_size metadata extraction - pymupdf.py:145-206 - Refactored to extract text page-by-page - Manually loop through pages instead of using page_chunks=True - Generate page_boundaries metadata for precise page tracking - Works around pymupdf.layout.activate() breaking page_chunks=True - processor.py:32-66 - Added assign_page_numbers() helper function - Assigns page numbers to chunks based on overlap with page boundaries - Handles chunks spanning multiple pages - processor.py:298-300 - Call assign_page_numbers() for PDF files ## Type Safety Fixes - bm25_hybrid.py:184 - Removed int() conversion of doc_id - semantic.py:131 - Removed int() conversion of doc_id - viz_routes.py:275 - Removed int() conversion of doc_id - Added comments documenting that doc_id can be int (notes) or str (file paths) ## Testing - All 18 tests passing (12 unit + 6 integration) - No type errors in modified files - Container logs show successful processing - Vector viz searches working correctly 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 02:37:07 +01:00
Chris Coutinho	c4ce28f05d	fix: Improve 3D plot rendering with explicit dimensions and window resize support - Get container dimensions before creating Plotly layout to render at correct size immediately - Add init() method with window resize listener for responsive plot sizing - Remove post-render resize call (no longer needed with explicit dimensions) - Improve colorbar positioning and scene domain configuration This eliminates the visual "jump" during initial render and ensures the plot resizes smoothly when the browser window changes size. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 19:43:20 +01:00
Chris Coutinho	c126c3ec03	fix: Preserve 3D plot camera and improve documentation This commit addresses PR feedback and fixes plot camera behavior. ## JavaScript Fix - Camera Preservation - Changed plot update strategy from recreating layout to using Plotly.restyle() - Query point visibility now toggles via restyle() which only modifies trace visibility - Camera position/zoom naturally preserved since layout remains untouched - Resolves jumpy plot behavior when toggling "Show Query Point" checkbox Related: nextcloud_mcp_server/auth/static/vector-viz.js:58-73 ## Documentation Improvements - Condensed vector-sync-ui.md from 316 to 94 lines (~70% reduction) - Removed redundant FAQ section (content merged into main sections) - Simplified use cases from 4 detailed sections to 3 focused paragraphs - Streamlined troubleshooting to 3 common issues - Merged technical details into overview section - Retained all essential information while improving readability ## Screenshot Updates Removed old/outdated images (5 files): - rag-workflow-bidirectional-final.png - rag-workflow-prominent-llm.png - rag-workflow-simple-final.png - vector-viz-interface.png - welcome-page.png Replaced with current screenshots (3 files): - vector-viz-document-types-2col.png - Now shows plot + results - vector-viz-chunk-context.png - Centered content view - vector-viz-results.png - Updated results list 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 14:10:53 +01:00
Chris Coutinho	9bd02d7ef7	fix: Preserve 3D plot camera position and fix CSS loading Two fixes for the vector visualization page: 1. CSS Loading Fix: Moved CSS <link> from vector_viz.html fragment to user_info.html <head> block. HTMX fragments don't process <link> tags in <head>, causing unstyled page. Now CSS loads correctly. 2. Camera Preservation: Modified renderPlot() to preserve camera position when toggling query point visibility. Previously, toggling the "Show Query Point" checkbox would reset zoom/rotation to default. Now reads existing camera settings from plot before updating. Related: nextcloud_mcp_server/auth/static/vector-viz.js:123-130 Related: nextcloud_mcp_server/auth/templates/user_info.html:12 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 13:51:08 +01:00
Chris Coutinho	53689d076b	feat: Improve vector visualization with static assets and fixes - Extract CSS and JavaScript into separate static files - Created nextcloud_mcp_server/auth/static/vector-viz.css - Created nextcloud_mcp_server/auth/static/vector-viz.js - Updated templates to reference external assets - Fix vector visualization issues: - Normalize vectors before PCA to match Qdrant's cosine distance - Add zero-norm and NaN detection/handling for large datasets - Enable responsive Plotly sizing (autosize + responsive config) - Widen plot area to full viewport width with minimized margins - Improve visualization accuracy: - Query point now positioned correctly relative to documents - Handles 200+ points without JSON serialization errors - Full-width plot maximizes screen space utilization 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 04:10:44 +01:00
Chris Coutinho	9db20a4d01	feat: Redesign UI to match Nextcloud ecosystem aesthetic This commit updates the web interface to better align with Nextcloud's design system and improve the Vector Viz layout. Changes: - Replace emoji icons with Material Design SVG icons for better consistency with Nextcloud apps - Simplify navigation styling with minimal padding and subtle active states (250px width) - Update CSS variables to match Nextcloud design system - Restructure Vector Viz from two-column to single-column vertical layout for better plot visibility - Move search controls to compact horizontal grid at top - Make navigation toggle always visible (not just on mobile) - Fix plot container sizing with overflow:visible to prevent colorbar clipping - Remove heavy shadows and custom card styling for cleaner aesthetic - Add error and success page templates with consistent styling Technical details: - Preserve Alpine.js for reactive functionality - Use CSS Grid for responsive horizontal controls layout - Add smooth transitions for navigation collapse/expand - Maintain HTMX for dynamic content loading 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 00:45:19 +01:00
Chris Coutinho	eec923eff5	feat: Replace custom document chunker with LangChain MarkdownTextSplitter Migrates from custom word-based chunking to LangChain's MarkdownTextSplitter for better semantic search quality. This implements the chunking portion of ADR-011. Changes: - Replace custom regex word chunker with MarkdownTextSplitter - Optimized for Markdown content (headers, code blocks, lists) - Convert from word-based (512 words) to character-based (2048 chars) chunking - Maintain backward-compatible ChunkWithPosition interface - Update configuration defaults and validation - Update all unit tests (12/12 passing) Benefits: - Respects markdown structure boundaries - Never breaks code blocks or headers mid-chunk - Preserves semantic coherence within chunks - Expected 20-30% improvement in recall quality - Industry-standard approach (used by production RAG systems) Note: Full reindex required to apply new chunking to existing documents. Current vector database still contains old word-based chunks. Related: ADR-011 (Improving Semantic Search Quality) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-18 12:17:23 +01:00
Chris Coutinho	d374bfa1e5	feat(viz): Add dual-score display and improve UI controls This commit enhances the vector visualization interface with better score transparency and improved UX: Dual-Score Display: - Store original algorithm scores before normalization (viz_routes.py:203) - Display both raw and normalized scores: "Raw Score: 0.842 (89% relative)" - Update plot hover text with dual scores (userinfo_routes.py:740) - Fixes issue where all queries showed at least one 100% match regardless of actual relevance (normalization artifact) UI Improvements: 1. Fusion Method dropdown: Changed from x-show to :disabled - Prevents jarring layout shift when switching algorithms - Dropdown stays visible but grayed out when Semantic is selected - Better UX with opacity: 0.5 and cursor: not-allowed 2. Score Threshold: Changed step from 0.1 to "any" - Allows arbitrary float precision (0.7, 0.85, 0.123) - Users can now fine-tune threshold values 3. Document Types: Converted multi-select to checkbox grid - Replaced clunky Ctrl/Cmd multi-select listbox - Checkbox grid with cleaner layout - Positioned left of Score Threshold and Result Limit inputs - More intuitive UX Technical Details: - Raw score ranges vary by algorithm: - Semantic: 0.0-1.0 (cosine similarity) - BM25 RRF: ~0.001-0.033 (Reciprocal Rank Fusion) - BM25 DBSF: Can exceed 1.0 (Distribution-Based Score Fusion) - Normalized scores (0-1) used for visual encoding (marker size, color) - Original scores preserved in API response via getattr fallback Files modified: - nextcloud_mcp_server/auth/viz_routes.py (store original_score) - nextcloud_mcp_server/auth/templates/vector_viz.html (UI controls) - nextcloud_mcp_server/auth/userinfo_routes.py (plot hover text) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-17 08:05:49 +01:00
Chris Coutinho	3aa7128f45	feat: add chunk position tracking to vector indexing and search Track character offsets (start_offset, end_offset) for each chunk in vector database metadata, enabling precise chunk highlighting in visualization pane. Changes: - processor.py: Store chunk_start_offset and chunk_end_offset in Qdrant metadata - processor.py: Added metadata_version=2 to indicate position tracking support - search/semantic.py: Return chunk positions from search results - server/semantic.py: Expose chunk positions in API responses (SemanticSearchResult) Enables viz pane to: 1. Display exact matched chunk with surrounding context 2. Highlight the precise portion of text that matched the query 3. Build user trust by showing what the RAG system actually retrieved Position tracking uses ChunkWithPosition dataclass from document_chunker.py which provides character-accurate offsets in the original document. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-17 06:47:58 +01:00
Chris Coutinho	c3282534eb	feat: add vector viz template and chunk context endpoint Extracted vector visualization HTML template to separate file to resolve syntax conflicts between Jinja2, Alpine.js, and CSS. Added chunk context endpoint for fetching matched chunks with surrounding text. Changes: - Moved vector_viz.html to templates/ directory (separates Jinja2/Alpine.js/CSS) - Added /app/chunk-context endpoint for retrieving chunk text with context - Updated .dockerignore to include HTML files in Docker builds - Moved anthropic and boto3 to main dependencies (needed for production features) - Added jinja2 dependency for template rendering Fixes Jinja2 TemplateSyntaxError caused by CSS colons being parsed as Jinja2 syntax when template was inline in Python code. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-17 06:46:52 +01:00
Chris Coutinho	862308418e	fix: prevent infinite loop in DocumentChunker with position tracking Fixed a critical infinite loop bug in document_chunker.py that occurred when the overlap parameter caused the chunker to not make forward progress. Changes: - Added ChunkWithPosition dataclass to track character positions - Refactored chunk_text() to use regex word matching for accurate position tracking - Added safety check to ensure forward progress (next_start_idx > start_idx) - Changed return type from list[str] to list[ChunkWithPosition] The bug manifested when: 1. end_idx reached len(word_matches) (processing last chunk) 2. next_start_idx = end_idx - overlap would not advance past start_idx 3. Loop would continue indefinitely without making progress Fix ensures chunker always terminates by breaking when not advancing. All 9 unit tests now pass in 1.66s (previously timing out at 180s). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-17 06:39:15 +01:00
Chris Coutinho	3464b21845	fix: Relax SearchResult validation to support DBSF fusion scores > 1.0 Fix false-positive validation error where DBSF (Distribution-Based Score Fusion) correctly produces scores > 1.0 but SearchResult validation incorrectly rejected them. Root Cause: SearchResult.__post_init__() enforced scores in [0.0, 1.0] range, but DBSF sums normalized scores from multiple retrieval systems (dense semantic + sparse BM25), resulting in scores like 1.55 when both systems strongly agree a document is relevant. Changes: - Relaxed validation to allow any score ≥ 0.0 (algorithms.py:147-157) - Updated SearchResult and SemanticSearchResult documentation to explain score ranges for RRF ([0.0, 1.0]) vs DBSF (unbounded) - Added comprehensive test coverage for both fusion methods - Added DBSF fusion option to vector visualization UI - Updated viz routes and vizApp() to support fusion parameter selection Testing: All 157 unit tests pass, type checking passes, ruff passes Fixes error: "Configuration error: Score must be between 0.0 and 1.0, got 1.1528953" 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-17 06:32:30 +01:00
Chris Coutinho	1504df6fb5	Merge branch 'master' into feature/bedrock	2025-11-16 12:08:23 +01:00
Chris Coutinho	c28fc955ca	Merge origin/master into feature/bm25 Resolved conflicts: - viz_routes.py: Kept bm25's extract_dense_vector() function for robust vector handling - hybrid.py: Removed (bm25 uses native Qdrant RRF fusion instead) - uv.lock: Regenerated after accepting master's dependencies This merge brings in: - RAG evaluation framework (ADR-013) - Performance optimizations (double-fetch elimination) - Migration from asyncio to anyio - OpenTelemetry tracing improvements - Notes app enhancements 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 11:52:40 +01:00
Chris Coutinho	ad4b45889f	fix: suppress Starlette middleware type warnings in ty checker	2025-11-16 11:43:50 +01:00
Chris Coutinho	5b484c9226	feat: add unified provider architecture with Amazon Bedrock support Refactored LLM provider infrastructure to support sustainable additions of new providers with both embedding and text generation capabilities. ## Major Changes ### Unified Provider Architecture (ADR-015) - Created `nextcloud_mcp_server/providers/` with unified Provider ABC - Providers now support optional capabilities (embeddings and/or generation) - Auto-detection registry with priority: Bedrock → Ollama → Simple - Backward compatible - existing code continues to work ### New Providers - BedrockProvider: Full Amazon Bedrock integration - Embeddings: Titan Embed, Cohere Embed models - Generation: Claude, Llama, Titan Text, Mistral models - Model-specific request/response handling - AWS credential chain integration - OllamaProvider: Migrated with both capabilities support - AnthropicProvider: Moved from test code to production providers - SimpleProvider: Migrated in-memory fallback provider ### Breaking Changes None - full backward compatibility maintained: - `embedding.get_embedding_service()` still works - RAG evaluation tests updated to use unified providers - All existing tests pass (127 unit tests) ### Testing - Added 9 comprehensive Bedrock unit tests with mocked boto3 - All existing unit tests pass - Type checking (ty) and linting (ruff) pass - Verified backward compatibility ### Documentation - `docs/ADR-015-unified-provider-architecture.md`: Comprehensive ADR - `docs/bedrock-setup.md`: AWS setup guide with IAM permissions - `CLAUDE.md`: Updated with provider architecture section ### Dependencies - Added `boto3>=1.35.0` to dev dependencies (optional) ## Environment Variables ### Bedrock - `AWS_REGION`: AWS region (e.g., "us-east-1") - `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings - `BEDROCK_GENERATION_MODEL`: Model ID for generation - `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`: Optional credentials ### Ollama - `OLLAMA_BASE_URL`: API URL - `OLLAMA_EMBEDDING_MODEL`: Embedding model (default: "nomic-embed-text") - `OLLAMA_GENERATION_MODEL`: Generation model ## AWS Bedrock Permissions Required Minimal IAM policy: ```json { "Effect": "Allow", "Action": ["bedrock:InvokeModel"], "Resource": ["arn:aws:bedrock:::foundation-model/"] } ``` See `docs/bedrock-setup.md` for detailed setup instructions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 11:36:58 +01:00
Chris Coutinho	8799450c7d	Merge pull request #306 from cbcoutinho/rag-evaluation feat: RAG evaluation framework with performance improvements	2025-11-16 11:17:41 +01:00
Chris Coutinho	c4bf077050	feat: Add OpenTelemetry tracing to @instrument_tool decorator Enhances the @instrument_tool decorator to create distributed traces for all MCP tool executions, improving observability and debugging. Changes: - Modified @instrument_tool to wrap tool execution in trace_operation - Added automatic span creation with mcp.tool.* span names - Sanitized tool arguments before adding to span attributes (excludes password, token, secret, api_key, etag, ctx) - Limited argument strings to 500 characters to prevent huge spans - Maintained existing Prometheus metrics functionality - Updated docs/observability.md to reflect correct decorator name - Added comprehensive unit tests All ~50+ MCP tools now emit traces automatically without code changes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 11:16:05 +01:00
Chris Coutinho	f559ca049e	Merge branch 'rag-evaluation'	2025-11-16 10:26:19 +01:00
Chris Coutinho	02700a8e2c	perf: Eliminate double-fetching in semantic search sampling Performance optimization that removes redundant verification step and makes content fetching parallel in nc_semantic_search_answer tool. Changes: - Remove verification.py module (only had 1 caller) - Refactor nc_semantic_search to do inline deduplication instead of calling verify_search_results() - Migrate verification patterns (anyio task group, semaphore limiting) to nc_semantic_search_answer's content fetching - Change content fetching from sequential loop to parallel execution Performance impact: - Before: 10 API calls (5 parallel verification + 5 sequential content) = ~5.5s overhead - After: 5 API calls (parallel content fetch) = ~0.5s overhead - Result: 50% fewer API calls, ~10x faster for sampling operations Technical details: - Uses anyio.create_task_group() for structured concurrency - Semaphore limiting (max_concurrent=20) prevents connection pool exhaustion - Index-based storage maintains result ordering - Expected failures (deleted notes) logged at debug level - Deduplication handles hybrid search returning same doc from dense + sparse 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 10:25:04 +01:00
Chris Coutinho	1faf572546	Merge branch 'feature/bm25' Resolves conflict in viz_routes.py by combining: - Named vector extraction from feature/bm25 - Performance timing from master	2025-11-16 08:18:39 +01:00
Chris Coutinho	944b6dcf5a	fix: Handle named vectors in visualization and semantic search - viz_routes.py: Extract "dense" vector from named vector dict - semantic.py: Specify using="dense" for BM25 hybrid collections - Fixes "X must be 2D array" error in hybrid search - Fixes "Dense vector is not found" error in semantic search 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 08:16:35 +01:00
Chris Coutinho	2aa82d849c	Merge branch 'feature/bm25'	2025-11-16 07:57:36 +01:00
Chris Coutinho	fc6a2f14e4	fix: Update vizApp to use bm25_hybrid algorithm and remove deprecated weights The visualization UI was still using the old 'hybrid' algorithm name and weight parameters that were replaced by the BM25 hybrid search refactor. This caused "Unknown algorithm: hybrid" errors when using the search & visualize feature. Changes: - Update default algorithm from 'hybrid' to 'bm25_hybrid' - Update default scoreThreshold from 0.7 to 0.0 to match backend - Remove deprecated semanticWeight, keywordWeight, fuzzyWeight parameters - Remove weight parameters from search request Fixes the visualization search functionality after BM25 hybrid refactor. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 07:54:20 +01:00
Chris Coutinho	16c22c953b	fix: Update viz routes to use BM25 hybrid search after refactor - Remove obsolete search algorithm imports (Fuzzy, Keyword, Hybrid) - Update UI to only show Semantic and BM25 Hybrid algorithms - Replace manual weight controls with RRF fusion info message - Update default algorithm from "hybrid" to "bm25_hybrid" - Remove weight parameters (semantic_weight, keyword_weight, fuzzy_weight) - Update score_threshold default from 0.7 to 0.0 for RRF scoring - Document ty type checker in CLAUDE.md Fixes unresolved-import type errors after BM25 refactor. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 07:23:11 +01:00
Chris Coutinho	529daf2b48	ci: temp disable sse in ci	2025-11-16 07:03:18 +01:00
Chris Coutinho	137d1d6c75	perf: fix vector viz search performance and visual encoding This commit addresses critical performance issues with vector visualization search (reducing time from 40s to ~2s) and improves result visualization through better visual encoding. ## Performance Fixes ### 1. Fix blocking sleep in retry decorator (base.py:51) - Changed `time.sleep(5)` to `await anyio.sleep(5)` in @retry_on_429 - Prevents entire event loop from freezing during rate limit retries - Impact: Reduced search time from 22s to 16s initially ### 2. Add concurrency limiting for verification (verification.py:77-93) - Added `anyio.Semaphore(20)` to limit concurrent HTTP requests - Prevents connection pool exhaustion (RequestError) from 90+ simultaneous requests - Fixes false filtering (was filtering 77/90 results incorrectly) - Note: Semaphore still in code but verification removed from viz endpoint ### 3. Remove unnecessary verification from viz endpoint (viz_routes.py:483-486) - Visualization only needs Qdrant metadata (title, excerpt), not full content - Verification only required for sampling (LLM needs full note content) - Impact: Reduced search time from 43.7s to ~2s (final fix) ### 4. Restore streaming scanner pattern (scanner.py) - Process notes one-at-a-time using async generator - Avoids loading all notes into memory ## Visualization Improvements ### 5. Result-relative score normalization (viz_routes.py:489-504) - Normalize scores within result set: best=1.0, worst=0.0 - Removes arbitrary RRF normalization (theoretical max didn't make sense) - Makes visual encoding meaningful regardless of algorithm scores ### 6. Power scaling for marker sizes (userinfo_routes.py:743) - Changed from linear `8 + (score * 12)` to power `6 + (score² * 14)` - Creates dramatic visual contrast: 0.0→6px, 0.5→9.5px, 1.0→20px - Combined with opacity (0.2-1.0) for clear visual hierarchy ### 7. Multi-channel visual encoding (userinfo_routes.py:740-745) - Size: Quadratically scaled with score² - Opacity: Linear 0.2-1.0 (keeps all points visible) - Color: Viridis gradient (blue→yellow) - Effect: Top results are large/bright/opaque, context results small/dim/transparent ## Result - Search time: 40s → ~2s (20x faster) - Visual contrast: Subtle → dramatic (clear result hierarchy) - No arbitrary cutoffs: All results visible, best naturally highlighted 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 07:01:35 +01:00
Chris Coutinho	6fe5596c13	feat: Implement BM25 hybrid search with native Qdrant RRF fusion Replace custom keyword/fuzzy search algorithms with industry-standard BM25 sparse vectors, combined with dense semantic vectors using Qdrant's native Reciprocal Rank Fusion (RRF). This consolidates search architecture and improves relevance for both semantic and keyword queries. Key changes: - Add fastembed dependency for BM25 sparse vector generation - Update Qdrant collection schema to support named vectors (dense + sparse) - Create BM25SparseEmbeddingProvider using FastEmbed's Qdrant/bm25 model - Implement BM25HybridSearchAlgorithm with native Qdrant RRF prefetch - Update document processor to generate both dense and sparse embeddings - Simplify nc_semantic_search() tool to use BM25 hybrid only - Remove legacy keyword.py, fuzzy.py, and custom hybrid.py (736 lines) - Update ADR-014 with implementation notes and test results Benefits: - Consolidated architecture (single Qdrant database) - Native database-level RRF fusion (more efficient) - Industry-standard BM25 (replaces brittle custom keyword search) - Better relevance across semantic and keyword queries - Simplified codebase (-285 net lines) Tests: All 125 tests passing (118 unit, 7 integration) Implements ADR-014: Replace Custom Keyword Search with BM25 Hybrid Search 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 06:59:44 +01:00
Chris Coutinho	c8d9cc24e0	refactor: migrate asyncio to anyio for consistent structured concurrency Replace asyncio primitives with anyio equivalents throughout the codebase to establish a single async pattern. This provides better structured concurrency with automatic cancellation on errors and aligns with the pytest anyio configuration. Changes: - hybrid.py: Replace asyncio.gather() with anyio task groups - token_broker.py: Replace asyncio.Lock() with anyio.Lock() - storage.py: Replace asyncio.run() with anyio.run() - app.py: Replace tg.start_soon() with await tg.start() for task status - processor.py: Add task_status parameter for structured startup - scanner.py: Add task_status parameter for structured startup - CLAUDE.md: Update async/await patterns guidance The change from start_soon() to await tg.start() enables proper task initialization signaling, ensuring background tasks are ready before proceeding. This follows anyio best practices for structured concurrency. All 118 unit tests pass with the new implementation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-16 03:51:45 +01:00
Chris Coutinho	eaeb8eae28	feat: Normalize hybrid search RRF scores to 0-1 range Improve user comprehension by scaling RRF scores to match the intuitive 0-1 range used by other search algorithms. ## Problem RRF (Reciprocal Rank Fusion) scores had a drastically different scale than semantic/keyword/fuzzy scores: - Semantic similarity: 0.0 to 1.0 (typical: 0.5-0.9) - RRF scores: 0.0 to ~0.016 (typical: 0.005-0.015) This caused user confusion - a score of 0.0078 looked terrible but was actually excellent (near theoretical maximum). ## Solution Normalize RRF scores using the formula: `normalized_score = rrf_score * (rrf_k + 1) / total_weight` Where: - rrf_k = 60 (RRF constant) - total_weight = sum of algorithm weights (default: 1.0) Example transformation: - Before: 0.0078 (confusing) - After: 0.477 (intuitive) ## Changes nextcloud_mcp_server/search/hybrid.py: - Store total_weight as instance variable (line 63) - Calculate normalization factor in _reciprocal_rank_fusion() (line 209) - Apply normalization to all RRF scores (line 217) - Preserve raw RRF score in metadata for debugging (line 222) ## Impact User Experience: - Hybrid search scores now comparable with semantic/keyword/fuzzy - Score of 0.5 indicates good match across all algorithms - Consistent scale improves score threshold usability Backward Compatibility: - Raw RRF scores preserved in metadata["rrf_score_raw"] - Result ordering unchanged (normalization is linear transformation) - Breaking change: Existing score thresholds need adjustment Performance: - Negligible overhead (single multiplication per result) ## Testing Verified with nc_semantic_search and nc_semantic_search_answer: - Hybrid scores now 0.47-0.7 range (was 0.003-0.011) - Semantic scores unchanged (0.75) - Result ordering preserved 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-15 06:48:58 +01:00
Chris Coutinho	42376483ab	refactor: Optimize Nextcloud access verification with centralized filtering Move access verification from individual search algorithms to final output stage, eliminating redundant API calls and improving performance. ## Changes New: - `search/verification.py`: Centralized verification using anyio task groups - Deduplicates results by (doc_id, doc_type) before verification - Verifies all unique documents in parallel using structured concurrency - Filters out inaccessible documents in single pass Modified Search Algorithms: - `search/semantic.py`: Removed _deduplicate_and_verify() and _verify_document_access() - `search/keyword.py`: Removed _verify_access() and parallel verification - `search/fuzzy.py`: Removed _verify_access() and parallel verification - `search/hybrid.py`: Removed nextcloud_client parameter passing All algorithms now return unverified results from Qdrant payload. Modified Output Stages: - `server/semantic.py`: Added verify_search_results() call after search - `auth/viz_routes.py`: Added verify_search_results() call after search Both endpoints now verify access once at final stage with deduplication. ## Performance Impact Before: - Hybrid mode (limit=10): 30 API calls (10 per algorithm × 3 algorithms) - Single algorithm: 10-20 API calls (with verification buffer) After: - Hybrid mode (limit=10): 10 API calls (deduplicated verification) - Single algorithm: 10 API calls (deduplicated verification) Performance Gain: 3x reduction in API calls for hybrid search ## Architecture Benefits - Separation of concerns: Algorithms handle scoring, output stage handles security - Deduplication: Each document verified exactly once - Parallel execution: All verifications run concurrently via anyio task groups - Consistency: Same verification logic across MCP tools and viz endpoints 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-15 06:21:06 +01:00
Chris Coutinho	ed0825e661	feat: Enhance vector visualization UI and parallelize search verification Vector Visualization Improvements: - Add interactive vector viz tab with Alpine.js and Plotly.js to user info page - Refactor viz route CSS for better scoping and maintainability - Remove unused nextcloud_host variable Performance Optimizations: - Parallelize access verification in fuzzy and keyword search algorithms - Use asyncio.gather() to verify multiple documents concurrently - Add exception handling with return_exceptions=True for resilience Dependencies: - Update third_party/oidc submodule to include RFC 9728 resource_url support 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-15 05:39:07 +01:00
Chris Coutinho	e3153822f7	perf: Exclude vector-sync status polling from distributed tracing Skip tracing for /app/vector-sync/status to reduce noise from HTMX polling. Metrics collection continues for this endpoint. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-15 05:19:35 +01:00
Chris Coutinho	2b35dd729f	fix: Reorder tabs and fix viz pane session access - Move Webhooks tab to the right (User Info | Vector Sync | Vector Viz | Webhooks) - Use request.user.display_name instead of session for viz routes - Fixes session middleware error when accessing via iframe	2025-11-15 02:41:42 +01:00
Chris Coutinho	eb32bbbc6b	feat: Add Vector Viz tab to app home page - Add Vector Viz button to tab navigation - Embed viz pane in iframe for seamless integration - Only shown when vector sync is enabled	2025-11-15 02:38:05 +01:00
Chris Coutinho	916af1c8f3	feat: Add vector visualization pane with multi-select document types - Add /app/vector-viz endpoint for interactive search testing - Implement server-side PCA dimensionality reduction (768-dim → 2D) - Support multi-select document type filter for cross-app search - Support all search algorithms: semantic, keyword, fuzzy, hybrid - Display 2D scatter plot of vector embeddings using Plotly - Show search results with scores and document types - Register viz routes in app.py	2025-11-15 02:32:10 +01:00
Chris Coutinho	9a62c8478f	feat: Implement custom PCA to remove sklearn dependency - Add custom PCA implementation using numpy eigendecomposition - Replace sklearn.decomposition.PCA with custom implementation - Maintains same API (fit, transform, fit_transform) - Supports explained_variance_ratio_ for variance analysis - Removes scikit-learn dependency from project - Add type hints and assertion for type safety	2025-11-15 02:02:57 +01:00
Chris Coutinho	2a078093ed	refactor!: Make all search algorithms query Qdrant payload, not Nextcloud BREAKING CHANGE: Search algorithms now require Qdrant to be populated. Vector sync must be enabled and documents indexed for search to work. - Keyword and fuzzy search now query Qdrant scroll API for title/excerpt - Remove inefficient Nextcloud API fetching pattern - Add optional Nextcloud verification for security - Deduplicate by (doc_id, doc_type) tuple, keeping chunk_index=0 - Align with document processor pattern that already stores text in Qdrant	2025-11-15 01:56:41 +01:00
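
Note on commit eaeb8eae28: the message gives the normalization as `normalized_score = rrf_score * (rrf_k + 1) / total_weight` with `rrf_k = 60`. The arithmetic can be sketched as below. This is an illustration only, not the code from hybrid.py (which commit 6fe5596c13 later removed in favor of Qdrant's native RRF). It assumes weighted RRF with 1-based ranks, and the function names `rrf_scores` and `normalize_rrf` are hypothetical.

```python
from collections import defaultdict

RRF_K = 60  # RRF constant quoted in the commit message


def rrf_scores(rankings, weights, k=RRF_K):
    """Fuse ranked lists of document ids into raw RRF scores.

    Assumption: each list contributes weight / (k + rank), rank starting at 1.
    """
    fused = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += weight / (k + rank)
    return dict(fused)


def normalize_rrf(raw, total_weight=1.0, k=RRF_K):
    """Apply the commit's formula: rrf_score * (rrf_k + 1) / total_weight."""
    return {doc: score * (k + 1) / total_weight for doc, score in raw.items()}


# Two hypothetical result lists (e.g. semantic and keyword), equal weights.
rankings = [["a", "b", "c"], ["a", "c", "b"]]
weights = [0.5, 0.5]

raw = rrf_scores(rankings, weights)
scaled = normalize_rrf(raw, total_weight=sum(weights))

for doc in sorted(scaled, key=scaled.get, reverse=True):
    print(f"{doc}: raw={raw[doc]:.4f} normalized={scaled[doc]:.3f}")
```

Under these assumptions a document ranked first in every list has a raw score of 1/61 (about 0.016, the ceiling the commit mentions) and a normalized score of 1.0. Because the scaling is a single multiplication by a positive constant, result ordering is unchanged.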

364 Commits