Commit Graph

138 Commits

Author SHA1 Message Date
Chris Coutinho fca8ab0cfd Merge remote-tracking branch 'origin/master' into rag-evaluation 2025-11-16 00:32:59 +01:00
Chris Coutinho c272ddd82d feat: implement RAG evaluation framework with CLI tooling
- Add ADR-013 documenting RAG evaluation architecture
- Implement two-part evaluation: Context Recall (retrieval) + Answer Correctness (generation)
- Create Click CLI for ground truth generation and corpus upload
- Add pytest fixtures and tests for retrieval/generation quality
- Use BeIR/nfcorpus dataset with 5 selected test queries
- Support Ollama and Anthropic LLM providers
- Generate synthetic ground truth answers offline
- Add comprehensive documentation in tests/rag_evaluation/README.md

The framework separates one-time setup (generate/upload) from test execution,
making tests much faster (~6-12 min vs ~15-25 min per run).

Tests are manual only (not in CI) and require external LLM access.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-15 23:11:21 +01:00
github-actions[bot] 682923dcc8 bump: version 0.34.2 → 0.35.0 2025-11-15 00:46:11 +00:00
github-actions[bot] 56a5c63994 bump: version 0.34.1 → 0.34.2 2025-11-13 21:11:36 +00:00
github-actions[bot] dd12c957f6 bump: version 0.34.0 → 0.34.1 2025-11-13 21:10:16 +00:00
github-actions[bot] 2f138e7539 bump: version 0.33.1 → 0.34.0 2025-11-13 16:15:29 +00:00
github-actions[bot] bd76902932 bump: version 0.33.0 → 0.33.1 2025-11-13 12:10:42 +00:00
github-actions[bot] 15951c38fa bump: version 0.32.1 → 0.33.0 2025-11-13 10:58:05 +00:00
github-actions[bot] 747d297008 bump: version 0.32.0 → 0.32.1 2025-11-12 02:16:57 +00:00
github-actions[bot] 49a9dd43c6 bump: version 0.31.1 → 0.32.0 2025-11-11 23:54:43 +00:00
github-actions[bot] ce666934f2 bump: version 0.31.0 → 0.31.1 2025-11-10 22:21:48 +00:00
github-actions[bot] f44bf3e8f2 bump: version 0.30.0 → 0.31.0 2025-11-10 07:02:49 +00:00
renovate-bot-cbcoutinho[bot] fb1af697f7 chore(deps): lock file maintenance 2025-11-10 05:13:55 +00:00
github-actions[bot] 126b5a7626 bump: version 0.29.2 → 0.30.0 2025-11-10 02:50:11 +00:00
github-actions[bot] a0576aa9a2 bump: version 0.29.1 → 0.29.2 2025-11-09 18:28:34 +00:00
github-actions[bot] 7772b1ac2e bump: version 0.29.0 → 0.29.1 2025-11-09 08:54:26 +00:00
github-actions[bot] af96378cb6 bump: version 0.28.0 → 0.29.0 2025-11-09 08:29:53 +00:00
github-actions[bot] ae81f0334e bump: version 0.27.3 → 0.28.0 2025-11-09 08:04:06 +00:00
Chris Coutinho 23f3a231a5 Merge pull request #273 from cbcoutinho/feature/observability-monitoring
Feature/observability monitoring
2025-11-09 09:03:40 +01:00
Chris Coutinho 578de4d7d6 feat(observability): Add comprehensive monitoring with Prometheus and OpenTelemetry
- Add Prometheus metrics for HTTP, MCP tools, Nextcloud API, OAuth, vector sync, and DB operations
- Add OpenTelemetry distributed tracing with OTLP export
- Add structured JSON logging with trace context correlation
- Add ObservabilityMiddleware for automatic HTTP instrumentation
- Add app_name attribute to all client classes for per-app metrics
- Add configuration for metrics, tracing, and logging via environment variables
- Add documentation in docs/observability.md
- Fix graceful degradation when tracing is disabled (default state)
- Fix uvicorn logging configuration to use observability formatters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 08:54:04 +01:00
github-actions[bot] 8f0f989c6d bump: version 0.27.2 → 0.27.3 2025-11-09 06:52:31 +00:00
github-actions[bot] 137dc80075 bump: version 0.27.1 → 0.27.2 2025-11-09 06:45:44 +00:00
github-actions[bot] f51edff25d bump: version 0.27.0 → 0.27.1 2025-11-09 06:22:00 +00:00
github-actions[bot] 538bbc375e bump: version 0.26.1 → 0.27.0 2025-11-09 06:15:27 +00:00
Chris Coutinho e96c02e4d4 fix: remove unnecessary urllib3<2.0 constraint
The urllib3<2.0 constraint was added unnecessarily during troubleshooting.
urllib3 2.x works perfectly fine with qdrant-client. The import path for
urllib3.util.Url and parse_url remains the same across 1.x and 2.x versions.

Changes:
- Remove urllib3<2.0 constraint from pyproject.toml
- Upgrade to urllib3 2.5.0 (latest)
- All integration tests pass with urllib3 2.x

Verified:
- from urllib3.util import Url, parse_url works in 2.5.0
- All 6 semantic search integration tests pass
- qdrant-client 1.15.1 works correctly with urllib3 2.5.0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-08 22:18:31 +01:00
Chris Coutinho 7b8c3f93a8 test: add integration tests for semantic search with in-process embeddings
Adds comprehensive integration tests for vector database semantic search that
work without external dependencies (Ollama), making them suitable for CI/CD.

Changes:
- Add SimpleEmbeddingProvider: in-process TF-IDF-like embeddings using feature hashing
- Make Ollama optional: embedding service now falls back to SimpleEmbeddingProvider
- Add 6 integration tests covering semantic search, filtering, and batch operations
- Downgrade urllib3 to 1.26.x for qdrant-client compatibility
- Update docker-compose.yml to comment out Ollama configuration (optional)

The SimpleEmbeddingProvider generates deterministic, normalized embeddings
suitable for testing semantic similarity without requiring external services.
Tests validate that similar texts have higher cosine similarity and that
semantic search correctly ranks results by relevance.

Test coverage:
- Deterministic embedding generation
- Semantic similarity between texts
- Full search flow with Qdrant (in-memory)
- Category filtering
- Empty result handling
- Batch embedding generation

All tests pass and can run in GitHub CI without Ollama infrastructure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-08 22:13:33 +01:00
Chris Coutinho 8f45e996e8 feat: implement vector sync scanner and processor (ADR-007 Phase 2)
Implements background vector database synchronization using anyio
TaskGroups for BasicAuth mode with single-user credentials.

Scanner Implementation:
- Periodic document discovery (hourly, configurable)
- Timestamp-based change detection (Nextcloud vs Qdrant)
- Wake event for immediate scanning on-demand
- Supports both initial sync (all docs) and incremental sync (changes only)
- Detects deleted documents and queues for removal

Processor Implementation:
- Concurrent document processing pool (3 workers default)
- I/O-bound embedding generation via Ollama API
- Retry logic with exponential backoff (3 retries)
- Document chunking (512 words, 50-word overlap)
- Handles both index and delete operations
- Upserts vectors to Qdrant with rich metadata

App Lifespan Integration:
- Extended AppContext with background task state
- Modified app_lifespan_basic() to start tasks via anyio TaskGroups
- Graceful shutdown with coordinated task cancellation
- Only activates when VECTOR_SYNC_ENABLED=true

Embedding Service:
- OllamaEmbeddingProvider with TLS support
- Singleton pattern for shared client instances
- Batch embedding support for efficiency
- Auto-detects embedding dimension (768 for nomic-embed-text)

Qdrant Client:
- Async client wrapper with singleton pattern
- Auto-creates collection on first use
- COSINE distance metric for semantic similarity
- Integrates with embedding service for dimension detection

Health Check Enhancement:
- Added Qdrant status check to /health/ready endpoint
- Only checks when VECTOR_SYNC_ENABLED=true
- 2-second timeout for health probe
- Reports connection errors with details

Configuration:
- VECTOR_SYNC_ENABLED: Enable background sync
- VECTOR_SYNC_SCAN_INTERVAL: Scanner frequency (3600s default)
- VECTOR_SYNC_PROCESSOR_WORKERS: Concurrent processors (3 default)
- QDRANT_URL, QDRANT_API_KEY, QDRANT_COLLECTION: Vector DB config
- OLLAMA_BASE_URL, OLLAMA_EMBEDDING_MODEL: Embedding service config

Dependencies Added:
- qdrant-client>=1.7.0: Vector database client

Docker Compose:
- Added Qdrant service with health check
- Exposed ports 6333 (REST) and 6334 (gRPC)
- Configured MCP service with vector sync environment
- Added qdrant-data volume for persistence

Known Issue:
- FastMCP lifespan not triggering for streamable-http transport
- Background tasks will start once lifespan integration is complete
- Lifespan triggers on MCP session establishment, not server startup

Related: ADR-007 Background Vector Database Synchronization

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-08 21:14:38 +01:00
github-actions[bot] 4ceaf45ffd bump: version 0.26.0 → 0.26.1 2025-11-08 03:59:28 +00:00
Chris Coutinho 21b878a2e7 Merge pull request #265 from cbcoutinho/renovate/mcp-1.x
fix(deps): update dependency mcp to >=1.21,<1.22
2025-11-08 04:59:05 +01:00
github-actions[bot] 218f0bd366 bump: version 0.25.0 → 0.26.0 2025-11-08 03:48:50 +00:00
renovate-bot-cbcoutinho[bot] c1e135c4a2 fix(deps): update dependency mcp to >=1.21,<1.22 2025-11-07 05:06:10 +00:00
github-actions[bot] 77e491beea bump: version 0.24.1 → 0.25.0 2025-11-05 23:02:25 +00:00
Chris Coutinho 461971a1a8 Merge pull request #262 from cbcoutinho/feature/user-settings
Feature/user settings
2025-11-05 15:59:54 +01:00
Chris Coutinho 6cccd92b3b build: Add type checking 2025-11-05 15:19:55 +01:00
github-actions[bot] 7052c19de0 bump: version 0.24.0 → 0.24.1 2025-11-04 12:28:13 +00:00
renovate-bot-cbcoutinho[bot] 3e988acb97 fix(deps): update dependency mcp to >=1.20,<1.21 2025-11-04 11:08:34 +00:00
github-actions[bot] f587a4e31f bump: version 0.23.0 → 0.24.0 2025-11-04 10:27:39 +00:00
Chris Coutinho 4e1d143e54 Merge remote-tracking branch 'origin/master' into feature/keycloak 2025-11-03 02:49:00 +01:00
github-actions[bot] 02a2c4a16f bump: version 0.22.7 → 0.23.0 2025-11-03 01:48:39 +00:00
Chris Coutinho babd60e08b feat: Implement ADR-004 Hybrid Flow with comprehensive integration tests
Implement the ADR-004 Hybrid Flow OAuth pattern where the MCP server
intercepts the OAuth callback to obtain master refresh tokens while
maintaining PKCE security for clients.

## Implementation

### OAuth Routes (ADR-004 Hybrid Flow)
- Add `/oauth/authorize` endpoint: Intercepts client OAuth initiation
- Add `/oauth/callback` endpoint: Receives IdP callback, stores master token
- Add `/oauth/token` endpoint: Exchanges MCP code for client access token
- Implement PKCE code challenge/verifier validation
- Store OAuth sessions with state/challenge correlation

### MCP Server Integration
- Update `setup_oauth_config()` to return client_id and client_secret
- Initialize OAuth context in Starlette lifespan for login routes
- Add OAuth session storage to RefreshTokenStorage
- Configure authlib dependency for OAuth flow management

### Integration Tests
- Create `test_adr004_hybrid_flow.py` with Playwright automation
- Add `adr004_hybrid_flow_mcp_client` session-scoped fixture
- Test MCP session establishment with hybrid flow token
- Test tool execution using stored refresh tokens (on-behalf-of pattern)
- Test persistent access across multiple operations
- All tests passing:  3 passed in 8.82s

### Documentation
- Update ADR-004 with comprehensive Testing section
- Add integration test commands and coverage details
- Document test implementation and verification steps
- Create TESTING_INSTRUCTIONS.md for manual and automated testing
- Include manual test scripts for reference/debugging

## What This Enables

 PKCE code challenge/verifier flow
 MCP server intercepts OAuth callback and stores master refresh token
 Client receives MCP access token (not master token)
 MCP session establishment with hybrid flow token
 Tool execution using stored refresh tokens (on-behalf-of pattern)
 Multiple operations without re-authentication
 Proper token isolation (client never sees master token)

## Testing

Run ADR-004 integration tests:
```bash
uv run pytest tests/server/oauth/test_adr004_hybrid_flow.py --browser firefox -v
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-03 02:18:30 +01:00
Chris Coutinho e331544cee feat: Implement RFC 8693 token exchange for Keycloak (ADR-002 Tier 2)
Implements OAuth 2.0 Token Exchange (RFC 8693) enabling the MCP server to
exchange service account tokens for user-scoped tokens. This provides an
alternative to refresh tokens for background operations.

**Core Implementation:**
- Added `get_service_account_token()` method to KeycloakOAuthClient for
  client_credentials grant
- Added `exchange_token_for_user()` method implementing RFC 8693 token exchange
- Fixed Fernet encryption key handling in RefreshTokenStorage (was incorrectly
  base64 decoding already-encoded keys)
- Updated OAuth configuration to support offline_access scope and refresh token
  storage infrastructure

**Keycloak Configuration:**
- Enabled `serviceAccountsEnabled` in realm-export.json
- Added `token.exchange.grant.enabled` attribute
- Added `client.token.exchange.standard.enabled` attribute (required for
  Keycloak 26.2+ Standard Token Exchange V2)
- Fresh Keycloak imports now correctly enable token exchange

**Docker Compose:**
- Added TOKEN_ENCRYPTION_KEY and ENABLE_OFFLINE_ACCESS environment variables
- Created oauth-tokens volume for refresh token storage
- Configured both mcp-oauth and mcp-keycloak services

**Testing & Documentation:**
- Added tests/manual/test_token_exchange.py - Validates complete RFC 8693 flow
- Added tests/manual/test_nextcloud_impersonate.py - Documents session-based
  impersonation limitations
- Added docs/oauth-impersonation-findings.md - Comprehensive investigation
  findings and resolution documentation

**Verified Working:**
 Service account token acquisition (client_credentials grant)
 RFC 8693 token exchange for internal-to-internal tokens
 Exchanged tokens validate with Nextcloud APIs
 Keycloak 26.4.2 Standard Token Exchange V2 support

**Known Limitations:**
- User impersonation (requested_subject) requires Keycloak Legacy V1 with
  preview features
- Cross-client token exchange limited to same realm
- Refresh token storage infrastructure ready but unused (MCP protocol limitation)

Dependencies: aiosqlite>=0.20.0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-02 22:03:19 +01:00
github-actions[bot] 5259658458 bump: version 0.22.6 → 0.22.7 2025-10-29 11:18:41 +00:00
github-actions[bot] e1aca04aff bump: version 0.22.5 → 0.22.6 2025-10-29 10:57:44 +00:00
github-actions[bot] e647c87dd8 bump: version 0.22.4 → 0.22.5 2025-10-29 10:54:54 +00:00
github-actions[bot] 202058bdc8 bump: version 0.22.3 → 0.22.4 2025-10-29 10:44:11 +00:00
github-actions[bot] 8221046d8a bump: version 0.22.2 → 0.22.3 2025-10-29 10:35:58 +00:00
github-actions[bot] 9ec7637579 bump: version 0.22.1 → 0.22.2 2025-10-29 10:30:39 +00:00
github-actions[bot] 3878beaf65 bump: version 0.22.0 → 0.22.1 2025-10-29 10:17:08 +00:00
github-actions[bot] 0e7e74867f bump: version 0.21.0 → 0.22.0 2025-10-29 09:32:27 +00:00
github-actions[bot] 57a2157c58 bump: version 0.20.0 → 0.21.0 2025-10-25 18:33:56 +00:00