Commit Graph

3 Commits

Author SHA1 Message Date
Chris Coutinho 6812e1aca7 fix: add dynamic dimension detection for Ollama embedding models
This fixes dimension mismatch errors when using embedding models with
non-standard dimensions (e.g., qwen3-embedding:4b produces 2560-dim
vectors instead of the hardcoded 768).

Changes:
- OllamaEmbeddingProvider: Detect dimensions dynamically by generating
  test embedding instead of hardcoding to 768
- qdrant_client: Call dimension detection before collection creation
- app.py: Initialize Qdrant collection before starting background tasks
  in streamable-http transport path
- tests: Fix integration tests to properly mock EmbeddingService wrapper

Fixes dimension mismatch error:
"could not broadcast input array from shape (2560,) into shape (768,)"

All integration tests passing (6/6).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-12 02:46:30 +01:00
Chris Coutinho 7e93097137 feat(ollama): Pull model on startup if not available in ollama 2025-11-12 00:37:26 +01:00
Chris Coutinho 8f45e996e8 feat: implement vector sync scanner and processor (ADR-007 Phase 2)
Implements background vector database synchronization using anyio
TaskGroups for BasicAuth mode with single-user credentials.

Scanner Implementation:
- Periodic document discovery (hourly, configurable)
- Timestamp-based change detection (Nextcloud vs Qdrant)
- Wake event for immediate scanning on-demand
- Supports both initial sync (all docs) and incremental sync (changes only)
- Detects deleted documents and queues for removal

Processor Implementation:
- Concurrent document processing pool (3 workers default)
- I/O-bound embedding generation via Ollama API
- Retry logic with exponential backoff (3 retries)
- Document chunking (512 words, 50-word overlap)
- Handles both index and delete operations
- Upserts vectors to Qdrant with rich metadata

App Lifespan Integration:
- Extended AppContext with background task state
- Modified app_lifespan_basic() to start tasks via anyio TaskGroups
- Graceful shutdown with coordinated task cancellation
- Only activates when VECTOR_SYNC_ENABLED=true

Embedding Service:
- OllamaEmbeddingProvider with TLS support
- Singleton pattern for shared client instances
- Batch embedding support for efficiency
- Auto-detects embedding dimension (768 for nomic-embed-text)

Qdrant Client:
- Async client wrapper with singleton pattern
- Auto-creates collection on first use
- COSINE distance metric for semantic similarity
- Integrates with embedding service for dimension detection

Health Check Enhancement:
- Added Qdrant status check to /health/ready endpoint
- Only checks when VECTOR_SYNC_ENABLED=true
- 2-second timeout for health probe
- Reports connection errors with details

Configuration:
- VECTOR_SYNC_ENABLED: Enable background sync
- VECTOR_SYNC_SCAN_INTERVAL: Scanner frequency (3600s default)
- VECTOR_SYNC_PROCESSOR_WORKERS: Concurrent processors (3 default)
- QDRANT_URL, QDRANT_API_KEY, QDRANT_COLLECTION: Vector DB config
- OLLAMA_BASE_URL, OLLAMA_EMBEDDING_MODEL: Embedding service config

Dependencies Added:
- qdrant-client>=1.7.0: Vector database client

Docker Compose:
- Added Qdrant service with health check
- Exposed ports 6333 (REST) and 6334 (gRPC)
- Configured MCP service with vector sync environment
- Added qdrant-data volume for persistence

Known Issue:
- FastMCP lifespan not triggering for streamable-http transport
- Background tasks will start once lifespan integration is complete
- Lifespan triggers on MCP session establishment, not server startup

Related: ADR-007 Background Vector Database Synchronization

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-08 21:14:38 +01:00