Commit Graph

92 Commits

Author SHA1 Message Date
Chris Coutinho 4a5766b84e feat(config): enable DCR for multi-user BasicAuth with offline access
Allows multi-user BasicAuth mode to use Dynamic Client Registration (DCR)
for OAuth credentials when ENABLE_OFFLINE_ACCESS is enabled, making it
consistent with OAuth modes and reducing configuration burden.

**Changes:**

Configuration Validation:
- Relaxed OAuth credential requirements for multi-user BasicAuth
- OAuth credentials now optional when offline access enabled
- Will use DCR as fallback if NEXTCLOUD_OIDC_CLIENT_ID/SECRET not set
- Updated validation to log info instead of error when DCR will be used

Startup Logic (app.py):
- Added DCR workflow for multi-user BasicAuth before uvicorn starts
- Creates oauth_context for management APIs when offline access enabled
- Allows Astrolabe to authenticate management API calls with OAuth
- DCR runs synchronously at same lifecycle point as OAuth modes
- Added traceback import for better error logging
- Fixed type assertions for nextcloud_host
- Fixed undefined variable references in vector sync logging

Management API:
- Improved auth mode detection using proper detect_auth_mode()
- Added auth_mode field to /status endpoint:
  * "basic" - Single-user BasicAuth
  * "multi_user_basic" - Multi-user BasicAuth
  * "oauth" - OAuth modes
  * "smithery" - Smithery stateless
- Added supports_app_passwords indicator for multi-user BasicAuth

Docker Compose:
- Updated mcp-multi-user-basic service configuration:
  * Enabled vector sync (VECTOR_SYNC_ENABLED=true)
  * Added ENABLE_OFFLINE_ACCESS=true for app password support
  * Added NEXTCLOUD_MCP_SERVER_URL for Astrolabe integration
  * Documented optional static OAuth credentials

Testing:
- Updated test_config_validators.py to expect DCR fallback
- Enhanced configure_astrolabe_for_mcp_server fixture with verification
- Added debug logging to test_users_setup fixture

**Workflow:**
1. User configures ENABLE_OFFLINE_ACCESS=true
2. Server checks for static NEXTCLOUD_OIDC_CLIENT_ID/SECRET
3. If not found, performs DCR before uvicorn starts
4. DCR registers client with Nextcloud OIDC provider
5. OAuth credentials used for Astrolabe management API auth
6. Background sync can retrieve user app passwords via Astrolabe

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-22 19:43:24 +01:00
Chris Coutinho 4507359760 refactor(config): centralize configuration validation and simplify startup
Implement centralized configuration validation (ADR-020) to simplify
deployment mode detection and improve error messages.

Changes:
- Create ADR-020 documenting 5 deployment modes with required/optional config
- Add config_validators.py with validate_configuration() and mode detection
- Simplify app.py startup with single validation point at get_app()
- Remove duplicate is_oauth_mode() function (43 lines)
- Fix DeploymentMode mapping (only SELF_HOSTED and SMITHERY_STATELESS exist)
- Add comprehensive unit tests (41 tests covering all modes and edge cases)
- Add enable_multi_user_basic_auth to Settings and BasicAuthMiddleware

Docker Compose:
- Remove conflicting ENABLE_MULTI_USER_BASIC_AUTH from mcp-oauth service
- Add dedicated mcp-multi-user-basic service on port 8003

Test Results:
- 237/237 integration tests PASSED
- All deployment modes verified: single-user BasicAuth, multi-user BasicAuth,
  OAuth single-audience, OAuth token exchange (Keycloak), Smithery stateless

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-20 20:49:28 +01:00
Chris Coutinho fe54733a39 fix(oauth): enable PKCE for all clients and add token_broker to oauth_context
This commit fixes two OAuth issues in the Astrolabe app:

1. **Always use PKCE (RFC 9207)**:
   - PKCE is now used for all OAuth flows (public and confidential clients)
   - Previous code only used PKCE for public clients, causing failures
   - Confidential clients now use both PKCE + client_secret (defense in depth)
   - Nextcloud OIDC provider requires PKCE, so token exchange was failing

2. **Add token_broker to oauth_context**:
   - Token broker is now stored in oauth_context for management API access
   - Fixes "Token broker not configured" error when revoking access
   - Revoke endpoint needs token_broker to delete refresh tokens and invalidate cache

Changes:
- OAuthController.php: Always generate PKCE verifier/challenge for all clients
- OAuthController.php: Always include code_verifier in token exchange
- app.py: Store token_broker in oauth_context after creation

Fixes:
- Astrolabe OAuth flow now works with Nextcloud OIDC
- Revoke/disconnect functionality now works in Astrolabe settings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-18 01:55:04 +01:00
Chris Coutinho e4f3beee01 fix: resolve type checking warnings for CI
- Add type casts for Starlette app state access
- Add assertions for cipher, card, board, stack after initialization
- Add None checks for XML element text attributes
- Handle __package__ being None in tracing setup
- Fix TokenBrokerService initialization to use storage credentials

Resolves 42 type warnings from ty-check, enabling CI linting to pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-18 00:44:58 +01:00
Chris Coutinho 5eec34c17e feat(auth): implement refresh token rotation for Nextcloud OIDC
Add support for one-time use refresh tokens with automatic rotation
to align with Nextcloud OIDC security model.

Changes:
- TokenBrokerService improvements:
  - Add user_id parameter to refresh methods
  - Detect and store rotated refresh tokens
  - Add offline_access scope to token requests
  - Handle refresh token rotation on every use
- Add management API endpoints:
  - /api/v1/webhooks (GET/POST) - List/create webhooks
  - /api/v1/webhooks/{id} (DELETE) - Delete webhook
  - /api/v1/search (POST) - Unified search
  - /api/v1/chunk-context (GET) - Get chunk context
  - /api/v1/apps (GET) - List installed apps
- Update tests for refresh token rotation
- Add --headed flag to pytest for Playwright debugging

Benefits:
- Aligns with Nextcloud OIDC one-time refresh token model
- Prevents refresh token invalidation after first use
- Enables long-lived background operations
- Provides full webhook lifecycle management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-18 00:01:53 +01:00
Chris Coutinho a58a14111b feat(vector-sync): enable background sync in OAuth mode
Add multi-user background vector synchronization when running in OAuth
mode with ENABLE_OFFLINE_ACCESS=true. Key changes:

Architecture (oauth_sync.py):
- User Manager task polls RefreshTokenStorage for provisioned users
- Per-user scanner tasks fetch documents using OAuth tokens
- Shared processor pool indexes documents from all users

Token Broker improvements:
- Accept client_id/client_secret instead of encryption_key
- Remove redundant token audience pre-validation (Nextcloud validates)
- Add _rewrite_token_endpoint for Docker internal URL routing
- Remove double-decryption (storage handles encryption internally)

Browser OAuth flow fixes:
- Add 'resource' parameter to request Nextcloud-scoped tokens
- Store and retrieve next_url for proper redirect after consent
- Rewrite token endpoint URLs for internal Docker access

Configuration:
- Add vector_sync_user_poll_interval setting (default: 60s)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 20:00:41 +01:00
Chris Coutinho ec70e70a5d fix: Disable DNS rebinding protection for containerized deployments
MCP Python SDK 1.23.0 introduced automatic DNS rebinding protection that
auto-enables when host="127.0.0.1" (the default). This breaks containerized
deployments (Kubernetes, Docker) because the protection rejects requests
with Host headers like "nextcloud-mcp-server.default.svc.cluster.local:8000".

Root cause:
- FastMCP defaults to host="127.0.0.1"
- SDK auto-enables DNS rebinding protection with allowed_hosts=["127.0.0.1:*", "localhost:*", "[::1]:*"]
- K8s/Docker requests use service DNS names or proxied hostnames
- Protection middleware rejects these requests (421 Misdirected Request)

Solution:
- Explicitly pass transport_security=TransportSecuritySettings(enable_dns_rebinding_protection=False)
- Applied to all three FastMCP initializations (OAuth, Smithery, BasicAuth)
- DNS rebinding attacks mitigated by OAuth authentication and network isolation

This fixes issue #373 and enables MCP 1.23.x upgrade in PR #382.

For detailed analysis, see docs/MCP-1.23-DNS-REBINDING-FIX.md
2025-12-12 17:30:22 +01:00
Chris Coutinho a33f6a2f15 feat(news): add Nextcloud News app integration
Add full integration for the Nextcloud News (RSS/Atom reader) app:

- Add NewsClient with complete CRUD operations for folders, feeds, and items
- Add 8 read-only MCP tools for listing/getting folders, feeds, items
- Add Pydantic models for News entities with camelCase alias support
- Add vector sync support for starred + unread items
- Add HTML to Markdown converter using markdownify for better embeddings
- Add Docker post-install hook to enable News app
- Add 25 unit tests for NewsClient API methods

Vector sync indexes starred and unread items, providing a balanced approach
that captures important (starred) and current (unread) content without
indexing the entire article history.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-29 14:39:31 +01:00
Chris Coutinho 4e9859117c fix: Share vector sync state with FastMCP session lifespan via module singleton
The refactor in fafeaf3 moved background tasks to Starlette server lifespan
but broke nc_get_vector_sync_status because it still looked for streams in
FastMCP's AppContext (lifespan_context).

Add VectorSyncState module singleton to bridge the lifespans:
- starlette_lifespan sets the singleton when starting background tasks
- app_lifespan_basic reads from singleton and includes in AppContext
- MCP tools can now access document_receive_stream for pending count

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 04:20:47 +01:00
Chris Coutinho fafeaf3d83 refactor: Move background tasks to server lifespan and deprecate SSE transport
- Move scanner/processor tasks from FastMCP session lifespan to Starlette
  server lifespan (correct architecture: background tasks run once at
  server level, not per-session)
- Change default CLI transport from SSE to streamable-http
- Remove SSE transport option from CLI (SSE is deprecated)
- Remove SSE client session factory from test fixtures
- Add tracing instrumentation to BM25 hybrid search operations for
  better observability

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 04:02:30 +01:00
Chris Coutinho 03b984d5a7 fix(smithery): Enable JSON response format for scanner compatibility
The Smithery scanner was reporting "0 tools" despite the server returning
valid tool definitions. Root cause: the server was returning SSE-formatted
responses (event: message\ndata: {...}) which the scanner couldn't parse.

Changes:
- Add json_response=True to FastMCP for Smithery stateless mode
- Clean up verbose docstring examples in semantic.py and webdav.py

The MCP spec allows both SSE and plain JSON responses for HTTP transport.
Setting json_response=True returns Content-Type: application/json with
plain JSON-RPC instead of text/event-stream with SSE format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 22:01:18 +01:00
Chris Coutinho 39fba49cfe fix(smithery): Add JSON Schema metadata to mcp-config endpoint
Add proper JSON Schema metadata fields per Smithery documentation:
- $schema: JSON Schema draft-07
- $id: Schema identifier URL
- title: Human-readable title
- description: Schema description
- x-query-style: "flat" (no nested objects in our schema)
- additionalProperties: false

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:28:05 +01:00
Chris Coutinho 706a15f0bc fix(smithery): Use container runtime pattern for config discovery
ADR-016: For container runtime deployment, Smithery does not auto-generate
the .well-known/mcp-config endpoint like it does for Python CLI runtime.

Changes:
- Remove [tool.smithery] from pyproject.toml (not used in container mode)
- Remove smithery_server.py (Python CLI runtime specific)
- Add .well-known/mcp-config endpoint to return JSON Schema config
- Add SmitheryConfigMiddleware to extract config from URL query params
- Use ContextVar to pass session config to tool handlers

The container runtime passes config as URL query parameters to /mcp:
  GET /mcp?nextcloud_url=...&username=...&app_password=...

Tested:
- All 164 unit tests passing
- Docker container builds successfully
- .well-known/mcp-config returns valid JSON Schema
- Health endpoints working

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:22:55 +01:00
Chris Coutinho b8dc413b73 feat: Add Smithery CLI deployment support
- Add smithery package as dependency
- Create smithery_server.py with @smithery.server() decorator
- Add SmitheryConfigSchema for session config (nextcloud_url, username, app_password)
- Add [tool.smithery] section to pyproject.toml
- Remove manual .well-known/mcp-config endpoint (Smithery handles this)

Smithery CLI will automatically:
- Extract config schema from the decorated function
- Handle session config parsing from query parameters
- Make config accessible via ctx.session_config in tools

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:05:33 +01:00
Chris Coutinho 8d29ce0122 fix: Add Smithery lifespan and auth mode detection
- Add SmitheryAppContext dataclass for stateless mode
- Add app_lifespan_smithery() with minimal lifespan (no shared state)
- Update is_oauth_mode() to detect Smithery mode and return BasicAuth
- Use Smithery lifespan when SMITHERY_DEPLOYMENT=true
- Add .well-known/mcp-config endpoint for config discovery
- Skip document processors in Smithery mode (not enabled)

Fixes startup issues in Smithery mode where missing env credentials
would incorrectly trigger OAuth mode.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:48:53 +01:00
Chris Coutinho f93d650992 feat: Implement ADR-016 Smithery stateless deployment mode
Adds support for Smithery hosted deployment with stateless operation:

- Add DeploymentMode enum with SELF_HOSTED and SMITHERY_STATELESS modes
- Add get_deployment_mode() to detect mode from SMITHERY_DEPLOYMENT env var
- Update get_client() to create per-request clients from session config
- Add conditional tool registration (skip semantic search in Smithery mode)
- Add conditional /app admin UI mounting (skip in Smithery mode)
- Create smithery.yaml with configSchema for user credentials
- Create Dockerfile.smithery for minimal stateless container
- Create smithery_main.py entrypoint for Smithery deployment

In Smithery mode:
- Users provide nextcloud_url, username, app_password via session config
- Each request creates a fresh NextcloudClient (no state between requests)
- Semantic search tools are disabled (no vector database)
- Admin UI (/app) is disabled (no webhooks, vector viz)

All existing self-hosted functionality remains unchanged.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:30:42 +01:00
Chris Coutinho b8010270c1 fix: Add async/await, PDF metadata, and type safety fixes
This commit addresses multiple issues with async operations, PDF metadata
extraction, and type safety in document processing and search.

## Async/Await Fixes
- processor.py:259 - Added await for chunker.chunk_text(content)
- processor.py:270 - Added await for bm25_service.encode_batch(chunk_texts)
- tests/unit/test_document_chunker.py - Converted all 12 test methods to async

## PDF Metadata Enhancement
- pymupdf.py:143 - Added file_size metadata extraction
- pymupdf.py:145-206 - Refactored to extract text page-by-page
  - Manually loop through pages instead of using page_chunks=True
  - Generate page_boundaries metadata for precise page tracking
  - Works around pymupdf.layout.activate() breaking page_chunks=True
- processor.py:32-66 - Added assign_page_numbers() helper function
  - Assigns page numbers to chunks based on overlap with page boundaries
  - Handles chunks spanning multiple pages
- processor.py:298-300 - Call assign_page_numbers() for PDF files

## Type Safety Fixes
- bm25_hybrid.py:184 - Removed int() conversion of doc_id
- semantic.py:131 - Removed int() conversion of doc_id
- viz_routes.py:275 - Removed int() conversion of doc_id
- Added comments documenting that doc_id can be int (notes) or str (file paths)

## Testing
- All 18 tests passing (12 unit + 6 integration)
- No type errors in modified files
- Container logs show successful processing
- Vector viz searches working correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 02:37:07 +01:00
Chris Coutinho 53689d076b feat: Improve vector visualization with static assets and fixes
- Extract CSS and JavaScript into separate static files
  - Created nextcloud_mcp_server/auth/static/vector-viz.css
  - Created nextcloud_mcp_server/auth/static/vector-viz.js
  - Updated templates to reference external assets

- Fix vector visualization issues:
  - Normalize vectors before PCA to match Qdrant's cosine distance
  - Add zero-norm and NaN detection/handling for large datasets
  - Enable responsive Plotly sizing (autosize + responsive config)
  - Widen plot area to full viewport width with minimized margins

- Improve visualization accuracy:
  - Query point now positioned correctly relative to documents
  - Handles 200+ points without JSON serialization errors
  - Full-width plot maximizes screen space utilization

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 04:10:44 +01:00
Chris Coutinho c3282534eb feat: add vector viz template and chunk context endpoint
Extracted vector visualization HTML template to separate file to resolve
syntax conflicts between Jinja2, Alpine.js, and CSS. Added chunk context
endpoint for fetching matched chunks with surrounding text.

Changes:
- Moved vector_viz.html to templates/ directory (separates Jinja2/Alpine.js/CSS)
- Added /app/chunk-context endpoint for retrieving chunk text with context
- Updated .dockerignore to include HTML files in Docker builds
- Moved anthropic and boto3 to main dependencies (needed for production features)
- Added jinja2 dependency for template rendering

Fixes Jinja2 TemplateSyntaxError caused by CSS colons being parsed as
Jinja2 syntax when template was inline in Python code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 06:46:52 +01:00
Chris Coutinho ad4b45889f fix: suppress Starlette middleware type warnings in ty checker 2025-11-16 11:43:50 +01:00
Chris Coutinho c8d9cc24e0 refactor: migrate asyncio to anyio for consistent structured concurrency
Replace asyncio primitives with anyio equivalents throughout the codebase
to establish a single async pattern. This provides better structured
concurrency with automatic cancellation on errors and aligns with the
pytest anyio configuration.

Changes:
- hybrid.py: Replace asyncio.gather() with anyio task groups
- token_broker.py: Replace asyncio.Lock() with anyio.Lock()
- storage.py: Replace asyncio.run() with anyio.run()
- app.py: Replace tg.start_soon() with await tg.start() for task status
- processor.py: Add task_status parameter for structured startup
- scanner.py: Add task_status parameter for structured startup
- CLAUDE.md: Update async/await patterns guidance

The change from start_soon() to await tg.start() enables proper task
initialization signaling, ensuring background tasks are ready before
proceeding. This follows anyio best practices for structured concurrency.

All 118 unit tests pass with the new implementation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 03:51:45 +01:00
Chris Coutinho 916af1c8f3 feat: Add vector visualization pane with multi-select document types
- Add /app/vector-viz endpoint for interactive search testing
- Implement server-side PCA dimensionality reduction (768-dim → 2D)
- Support multi-select document type filter for cross-app search
- Support all search algorithms: semantic, keyword, fuzzy, hybrid
- Display 2D scatter plot of vector embeddings using Plotly
- Show search results with scores and document types
- Register viz routes in app.py
2025-11-15 02:32:10 +01:00
Chris Coutinho 14a59fdff3 fix: Use NEXTCLOUD_OIDC_CLIENT_ID/SECRET env vars consistently
Fixes #296

The application code was looking for OIDC_CLIENT_ID and OIDC_CLIENT_SECRET
(without NEXTCLOUD_ prefix), but the Helm chart, documentation, and CLI
all use NEXTCLOUD_OIDC_CLIENT_ID and NEXTCLOUD_OIDC_CLIENT_SECRET.

This mismatch caused OAuth deployments via Helm to fail with crashloops
because the credentials weren't being found.

Changes:
- app.py: Use NEXTCLOUD_OIDC_CLIENT_ID/SECRET in setup_oauth_config()
- config.py: Use NEXTCLOUD_OIDC_CLIENT_ID/SECRET in get_settings()
- Updated documentation comments and error messages

This aligns with the documented naming convention where all Nextcloud-related
environment variables use the NEXTCLOUD_ prefix.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 21:48:58 +01:00
Chris Coutinho a667d7c59c feat: Add metrics instrumentation for queue, health, and database operations
Implement Prometheus metrics to populate empty Grafana dashboard panels.

## Phase 1: Queue Size Metrics 
**File**: `processor.py`
- Track vector sync queue depth in real-time
- Update metric after receiving and processing each document
- Update metric during timeout (empty queue)
- Enables: "Processing Queue Depth" panel

## Phase 2: Health Check Metrics 
**File**: `app.py`
- Add Nextcloud connectivity check with timing
- Add Qdrant health check with timing
- Record dependency health status (up/down)
- Record health check duration
- Enables: 4 health status panels + health check duration panel

## Phase 3: Database Operation Metrics (Partial) 
**File**: `storage.py`
- Instrument `store_refresh_token()` method
- Track SQLite INSERT operation timing and success/error status
- Enables: Partial data for database operation latency panel

## Metrics Now Exposed

### Queue Metrics:
- `mcp_vector_sync_queue_size` - Real-time queue depth

### Health Metrics:
- `mcp_dependency_health{dependency="nextcloud"}` - UP/DOWN status
- `mcp_dependency_health{dependency="qdrant"}` - UP/DOWN status
- `mcp_dependency_check_duration_seconds{dependency}` - Health check latency

### Database Metrics:
- `mcp_db_operations_total{db="sqlite",operation="insert"}` - Operation count
- `mcp_db_operation_duration_seconds{db="sqlite",operation="insert"}` - Operation latency

## Dashboard Impact

**Panels Now Populated** (7/34 panels):
-  Processing Queue Depth
-  Nextcloud Health
-  Qdrant Health
-  Health Check Duration
-  Database Operation Latency (partial)
-  Vector sync panels (already working from PR #292)

**Panels Still Empty** (remaining work):
-  OAuth panels (4): Token validations, exchanges, cache hit rate, refresh ops
-  MCP tool panels (3): Call volume, error rates, execution duration
-  Database panel: Needs more SQLite operations instrumented (~29 remaining)

## Testing

Verified metric definitions exist and will be recorded on next deployment.

## Next Steps

Phase 4: OAuth token metrics (unified_verifier.py, context_helper.py, storage.py)
Phase 5: MCP tool metrics (all server/*.py files with @mcp.tool())
Phase 3 completion: Remaining 29 database operations in storage.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 16:14:38 +01:00
Chris Coutinho 6812e1aca7 fix: add dynamic dimension detection for Ollama embedding models
This fixes dimension mismatch errors when using embedding models with
non-standard dimensions (e.g., qwen3-embedding:4b produces 2560-dim
vectors instead of the hardcoded 768).

Changes:
- OllamaEmbeddingProvider: Detect dimensions dynamically by generating
  test embedding instead of hardcoding to 768
- qdrant_client: Call dimension detection before collection creation
- app.py: Initialize Qdrant collection before starting background tasks
  in streamable-http transport path
- tests: Fix integration tests to properly mock EmbeddingService wrapper

Fixes dimension mismatch error:
"could not broadcast input array from shape (2560,) into shape (768,)"

All integration tests passing (6/6).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-12 02:46:30 +01:00
Chris Coutinho 12c96af819 feat: add dynamic vector sync status updates with htmx polling
Implement real-time vector sync status updates in the /app UI without
requiring page refreshes. The status (indexed documents, pending
documents, sync state) now updates automatically every 3 seconds.

Changes:
- Add vector_sync_status_fragment() endpoint that returns HTML fragment
  with current vector sync status
- Modify user_info_html() to use htmx loading for vector sync section
  with hx-trigger="load" on initial render
- Status fragment includes hx-trigger="every 3s" for continuous polling
- Add /app/vector-sync/status route to browser_routes

The implementation uses htmx (already loaded on page) to poll the status
endpoint, providing near real-time updates with minimal overhead. The
endpoint queries Qdrant for indexed count and reads from memory streams
for pending count, returning only the status HTML fragment.

Pattern follows existing webhook management UI which also uses htmx
for dynamic loading.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 21:04:31 +01:00
Chris Coutinho d86a185e04 refactor: move webapp from /user/page to /app
Simplified the webapp routing structure by consolidating the admin UI
to a single clean endpoint.

Changes:
- Moved webapp from /user/page to /app (root of mount)
- Removed /user JSON endpoint (no longer needed)
- Updated mount point from /user to /app in app.py
- Updated all route path checks (3 locations)
- Updated OAuth redirects to point to /app
- Updated all HTMX endpoint references
- Updated documentation (ADR-007, CHANGELOG)
- Added redirect from /app to /app/ for trailing slash handling

New Route Structure:
- /app - Main webapp (HTML UI with tabs)
- /app/revoke - Revoke background access
- /app/webhooks - Webhook management UI
- /app/webhooks/enable/{preset_id} - Enable webhook preset
- /app/webhooks/disable/{preset_id} - Disable webhook preset

Breaking Change: Existing bookmarks to /user or /user/page will no longer work.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 20:53:43 +01:00
Chris Coutinho 1bced88c97 refactor: consolidate database storage for webhooks and OAuth tokens
Refactored the storage system to use a unified SQLite database for both
webhook tracking and OAuth token storage, available in both BasicAuth
and OAuth modes.

Changes:
- Renamed refresh_token_storage.py → storage.py
- Made TOKEN_ENCRYPTION_KEY optional (only required for OAuth token ops)
- Added registered_webhooks table with schema versioning
- Added webhook storage methods (store, get, delete, list, clear)
- Initialize storage in both BasicAuth and OAuth modes
- Updated webhook routes to persist registrations in database
- Database-first pattern for webhook status checks (performance)
- Updated all imports across codebase

Storage Behavior:
- Database created automatically at startup if needed
- Existing databases detected and reused
- Server fails fast if database initialization fails
- No migrations needed (OAuth feature is experimental)

Testing:
- Added 13 comprehensive unit tests for webhook storage
- All 118 unit tests pass
- All 5 smoke tests pass
- Verified fail-fast behavior on initialization errors

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 20:01:49 +01:00
Chris Coutinho b58e7238ae feat: validate Nextcloud webhook schemas and document findings
Manual testing of Nextcloud webhook_listeners app to validate webhook
payloads against ADR-010 expected schemas and document implementation
requirements for webhook-based vector synchronization.

## Changes

- Add test webhook endpoint at /webhooks/nextcloud in app.py
  - Captures and logs webhook payloads for analysis
  - Returns 200 OK immediately for webhook delivery confirmation

- Create webhook-testing-findings.md with comprehensive test results
  - Captured payloads for 5/6 webhook event types
  - Critical findings: missing node.id in deletions, type mismatches
  - Implementation recommendations with code examples

- Update ADR-010 with Appendix A: Manual Webhook Testing Results
  - Document actual vs expected webhook behavior
  - Update event mapping table with tested webhook status
  - Add 6 specific implementation recommendations
  - Include testing implications for future development

## Testing Results

 NodeCreatedEvent - fires correctly, includes node.id (integer)
 NodeWrittenEvent - fires correctly, includes node.id (integer)
 NodeDeletedEvent - fires but missing node.id field (path only)
 CalendarObjectCreatedEvent - fires correctly with full iCal
 CalendarObjectUpdatedEvent - fires correctly with full iCal
 CalendarObjectDeletedEvent - does not fire (potential NC bug)

## Key Findings

1. NodeDeletedEvent missing node.id field - requires path-based fallback
2. node.id returns integer not string - needs casting for consistency
3. Multiple webhooks fire per operation - needs deduplication logic
4. Calendar deletion webhooks don't fire - reported as issue #53497
5. Calendar webhooks include full iCal content - enables rich parsing

## GitHub Issues

- Created issue #56371: NodeDeletedEvent missing node.id field
- Commented on issue #53497: CalendarObjectDeletedEvent not firing

Closes #283

---

_This commit was generated with the help of AI, and reviewed by a Human_
2025-11-11 12:13:20 +01:00
Chris Coutinho a6e5f3d8ff refactor: simplify OpenTelemetry tracing configuration
Simplifies the OpenTelemetry tracing setup by removing the redundant
OTEL_ENABLED flag and using the presence of OTEL_EXPORTER_OTLP_ENDPOINT
to determine if tracing should be enabled. This follows the standard
OpenTelemetry environment variable conventions more closely.

Changes:
- Remove OTEL_ENABLED/tracing_enabled flag in favor of checking if
  OTEL_EXPORTER_OTLP_ENDPOINT is set
- Add OTEL_EXPORTER_VERIFY_SSL configuration option for OTLP endpoints
  with self-signed certificates (defaults to false for development)
- Move HTTPXClientInstrumentor initialization to module level to ensure
  httpx calls are traced across all Nextcloud API requests
- Add tracing spans to vector sync operations (scan_user_documents)
- Fix authorization header logging to only warn about missing headers
  in OAuth mode (BasicAuth mode doesn't use Authorization headers)
- Update observability documentation to reflect simplified configuration
- Refactor Dockerfile to use --no-editable flag for uv sync

Breaking changes:
- OTEL_ENABLED environment variable is removed
- Tracing is now automatically enabled when OTEL_EXPORTER_OTLP_ENDPOINT
  is set

Migration guide:
- Remove OTEL_ENABLED=true from environment configuration
- Tracing will be enabled automatically if OTEL_EXPORTER_OTLP_ENDPOINT
  is configured

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 22:48:37 +01:00
Chris Coutinho f3050e9b45 chore: Remove /health and /metrics endpoints from logging 2025-11-10 02:07:45 +01:00
Chris Coutinho 4e89e92b65 fix(observability): isolate metrics endpoint to dedicated port
Security fix: Move Prometheus metrics endpoint from main HTTP port to
dedicated port 9090 to prevent external exposure of metrics data.

Changes:
- Use prometheus_client.start_http_server() for dedicated metrics server
- Remove /metrics route from main application routes
- Metrics now only accessible on port 9090 (configurable via METRICS_PORT)
- Main application port no longer serves /metrics endpoint

This follows security best practice of isolating monitoring endpoints
from application traffic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 09:53:36 +01:00
Chris Coutinho 5e4667a643 fix(readiness): Only check external Qdrant in network mode
The readiness probe incorrectly tried to connect to an external Qdrant service
even when using memory or persistent mode (embedded Qdrant). This caused pods
to never become ready in Kubernetes deployments using the default configuration.

Root cause:
- In memory/persistent modes, QDRANT_URL env var is NOT set
- Readiness check used default 'http://qdrant:6333' anyway
- Tried to connect to non-existent service
- Connection failed -> 503 -> pod stuck in not-ready state

Fix:
- Only check external Qdrant health if QDRANT_URL is explicitly set (network mode)
- For embedded modes (memory/persistent), report status as 'embedded' without blocking
- Background scanner tasks don't block readiness (already non-blocking via anyio.start_soon)

This allows pods to become ready immediately when using embedded Qdrant,
while still validating external Qdrant connectivity in network mode.

Fixes: Kubernetes pods failing readiness check with default Qdrant configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 09:28:09 +01:00
Chris Coutinho 578de4d7d6 feat(observability): Add comprehensive monitoring with Prometheus and OpenTelemetry
- Add Prometheus metrics for HTTP, MCP tools, Nextcloud API, OAuth, vector sync, and DB operations
- Add OpenTelemetry distributed tracing with OTLP export
- Add structured JSON logging with trace context correlation
- Add ObservabilityMiddleware for automatic HTTP instrumentation
- Add app_name attribute to all client classes for per-app metrics
- Add configuration for metrics, tracing, and logging via environment variables
- Add documentation in docs/observability.md
- Fix graceful degradation when tracing is disabled (default state)
- Fix uvicorn logging configuration to use observability formatters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 08:54:04 +01:00
Chris Coutinho 72232f937a refactor: migrate vector sync from asyncio.Queue to anyio memory object streams
Replace asyncio.Queue with anyio.create_memory_object_stream() throughout
the vector sync system for better library consistency and improved shutdown
semantics.

## Changes Made

**scanner.py**:
- Changed parameter type from `asyncio.Queue` to `MemoryObjectSendStream[DocumentTask]`
- Replaced all `await document_queue.put()` calls with `await send_stream.send()`
- Wrapped scanner loop in `async with send_stream:` context manager for automatic cleanup
- Updated log messages: "Queued" → "Sent"
- Removed `import asyncio` (no longer needed)

**processor.py**:
- Changed parameter type from `asyncio.Queue` to `MemoryObjectReceiveStream[DocumentTask]`
- Replaced `asyncio.wait_for(document_queue.get(), timeout=1.0)` with `anyio.fail_after(1.0)` + `await receive_stream.receive()`
- Removed all `document_queue.task_done()` calls (not needed with streams)
- Added `anyio.EndOfStream` exception handling for graceful shutdown when scanner closes
- Removed `import asyncio` (no longer needed)

**app.py**:
- Removed `import asyncio` from top-level imports
- Added `from anyio.streams.memory import MemoryObjectReceiveStream, MemoryObjectSendStream`
- Updated AppContext dataclass:
  - Replaced `document_queue: Optional[asyncio.Queue]` with:
    - `document_send_stream: Optional[MemoryObjectSendStream]`
    - `document_receive_stream: Optional[MemoryObjectReceiveStream]`
- Updated `app_lifespan_basic()`:
  - Replaced `asyncio.Queue(maxsize=...)` with `anyio.create_memory_object_stream(max_buffer_size=...)`
  - Pass `send_stream` to scanner_task
  - Pass `receive_stream.clone()` to each processor_task (enables multiple consumers)
  - Updated AppContext yield to include both streams
- Updated `starlette_lifespan()`:
  - Same changes as app_lifespan_basic for streamable-http transport
  - Removed `import asyncio as asyncio_module` (no longer needed)
  - Updated app.state storage to use send_stream and receive_stream

**semantic.py**:
- Updated `nc_get_vector_sync_status()` tool:
  - Access `document_receive_stream` instead of `document_queue` from lifespan context
  - Use `stream_stats.current_buffer_used` instead of `queue.qsize()` for pending count
  - More reliable metrics (qsize() was not guaranteed accurate)

## Benefits

1. **Library Consistency**: Pure anyio throughout codebase (was mixing asyncio.Queue with anyio.Event and anyio.create_task_group)
2. **Graceful Shutdown**: `async with send_stream:` automatically closes stream on exit, signaling EndOfStream to all processors
3. **Better Timeout Handling**: `anyio.fail_after()` is more idiomatic than `asyncio.wait_for()`
4. **Stream Cloning**: Easy to add multiple consumers via `receive_stream.clone()`
5. **Better Statistics**: `.statistics()` provides accurate buffer metrics (qsize() was unreliable)
6. **Type Safety**: Separate send/receive types prevent accidental misuse
7. **No task_done() tracking**: Streams handle completion automatically

## Testing

-  All 69 unit tests passing
-  All 5 smoke tests passing
-  No regressions in functionality
-  Graceful shutdown behavior improved

## References

- https://anyio.readthedocs.io/en/stable/why.html#queue-fix
- https://anyio.readthedocs.io/en/stable/streams.html#memory-object-streams

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 06:43:44 +01:00
Chris Coutinho 4b026e9aa0 feat: implement ADR-009 - refactor semantic search to use generic semantic:read scope
This implements ADR-009, which documents the decision to use a generic
`semantic:read` OAuth scope instead of requiring all app-specific scopes
for semantic search functionality.

Changes:
- Created new `nextcloud_mcp_server/models/semantic.py` with semantic search models
  - SemanticSearchResult (with new doc_type field for multi-app support)
  - SemanticSearchResponse
  - SamplingSearchResponse
  - VectorSyncStatusResponse

- Created new `nextcloud_mcp_server/server/semantic.py` with semantic search tools
  - nc_semantic_search (renamed from nc_notes_semantic_search)
  - nc_semantic_search_answer (renamed from nc_notes_semantic_search_answer)
  - nc_get_vector_sync_status (renamed from nc_notes_get_vector_sync_status)
  - All tools now use @require_scopes("semantic:read") instead of "notes:read"

- Updated `nextcloud_mcp_server/server/notes.py`
  - Removed semantic search tools (moved to semantic.py)
  - Removed semantic search model imports
  - Removed unused MCP imports (ModelHint, ModelPreferences, etc.)

- Updated `nextcloud_mcp_server/models/notes.py`
  - Removed semantic search models (moved to semantic.py)

- Updated `nextcloud_mcp_server/app.py`
  - Import configure_semantic_tools
  - Register semantic tools when VECTOR_SYNC_ENABLED=true

- Updated `nextcloud_mcp_server/server/__init__.py`
  - Export configure_semantic_tools

- Updated tests
  - tests/integration/test_sampling.py: Use new tool names
  - tests/unit/test_response_models.py: Import from semantic.py, add doc_type field

Architecture:
- Semantic search is now a cross-app feature, not tied to Notes
- Uses dual-phase authorization: semantic:read scope + per-document verification
- Supports future multi-app indexing (notes, calendar, deck, files, contacts)

Test results:
- All 69 unit tests passing
- All 5 smoke tests passing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-09 05:53:53 +01:00
Chris Coutinho ee183e1c1c feat: add vector sync processing status to /user/page endpoint
Add real-time processing status display to the browser UI at /user/page
showing indexed document count, pending queue size, and sync status.
Implements the status display described in ADR-007 lines 280-298.

Changes:
- Store document_queue and related state in app.state for route access
- Add _get_processing_status() helper to query Qdrant and check queue
- Display status section in user_info_html() with indexed/pending counts
- Show color-coded status badge (green "Idle" or orange "Syncing")
- Only displays when VECTOR_SYNC_ENABLED=true

Status appears in both BasicAuth and OAuth modes, positioned after
session info but before logout buttons. Numbers are formatted with
commas for readability.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-08 23:59:18 +01:00
Chris Coutinho 4dbb2eb468 fix: integrate vector sync tasks with Starlette lifespan for streamable-http
Fixes background task startup for streamable-http transport by integrating
vector sync initialization into the Starlette lifespan context manager.

Starlette Lifespan Integration:
- Moved background task startup from FastMCP lifespan to Starlette lifespan
- FastMCP lifespan only triggers on MCP session establishment
- Starlette lifespan runs on server startup (correct timing)
- Fixed module scoping issues with local imports (anyio_module, asyncio_module)
- Added conditional startup based on oauth_enabled flag

Scanner Fixes:
- Fixed NotesClient method: list_notes() → get_all_notes()
- Properly handle AsyncIterator with list comprehension
- Collects all notes before processing

Verified Working:
- Background tasks start successfully on server startup
- Scanner fetches notes from Nextcloud API
- Processor pool (3 workers) ready for document processing
- Health endpoint reports Qdrant status
- No startup errors

Phase 3 Complete:
- BasicAuth mode with vector sync fully functional
- Background tasks integrate cleanly with streamable-http transport
- Graceful shutdown with coordinated task cancellation

Related: ADR-007 Background Vector Database Synchronization

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-08 21:20:26 +01:00
Chris Coutinho 8f45e996e8 feat: implement vector sync scanner and processor (ADR-007 Phase 2)
Implements background vector database synchronization using anyio
TaskGroups for BasicAuth mode with single-user credentials.

Scanner Implementation:
- Periodic document discovery (hourly, configurable)
- Timestamp-based change detection (Nextcloud vs Qdrant)
- Wake event for immediate scanning on-demand
- Supports both initial sync (all docs) and incremental sync (changes only)
- Detects deleted documents and queues for removal

Processor Implementation:
- Concurrent document processing pool (3 workers default)
- I/O-bound embedding generation via Ollama API
- Retry logic with exponential backoff (3 retries)
- Document chunking (512 words, 50-word overlap)
- Handles both index and delete operations
- Upserts vectors to Qdrant with rich metadata

App Lifespan Integration:
- Extended AppContext with background task state
- Modified app_lifespan_basic() to start tasks via anyio TaskGroups
- Graceful shutdown with coordinated task cancellation
- Only activates when VECTOR_SYNC_ENABLED=true

Embedding Service:
- OllamaEmbeddingProvider with TLS support
- Singleton pattern for shared client instances
- Batch embedding support for efficiency
- Auto-detects embedding dimension (768 for nomic-embed-text)

Qdrant Client:
- Async client wrapper with singleton pattern
- Auto-creates collection on first use
- COSINE distance metric for semantic similarity
- Integrates with embedding service for dimension detection

Health Check Enhancement:
- Added Qdrant status check to /health/ready endpoint
- Only checks when VECTOR_SYNC_ENABLED=true
- 2-second timeout for health probe
- Reports connection errors with details

Configuration:
- VECTOR_SYNC_ENABLED: Enable background sync
- VECTOR_SYNC_SCAN_INTERVAL: Scanner frequency (3600s default)
- VECTOR_SYNC_PROCESSOR_WORKERS: Concurrent processors (3 default)
- QDRANT_URL, QDRANT_API_KEY, QDRANT_COLLECTION: Vector DB config
- OLLAMA_BASE_URL, OLLAMA_EMBEDDING_MODEL: Embedding service config

Dependencies Added:
- qdrant-client>=1.7.0: Vector database client

Docker Compose:
- Added Qdrant service with health check
- Exposed ports 6333 (REST) and 6334 (gRPC)
- Configured MCP service with vector sync environment
- Added qdrant-data volume for persistence

Known Issue:
- FastMCP lifespan not triggering for streamable-http transport
- Background tasks will start once lifespan integration is complete
- Lifespan triggers on MCP session establishment, not server startup

Related: ADR-007 Background Vector Database Synchronization

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-08 21:14:38 +01:00
Chris Coutinho 11cdab475f feat: unify session architecture and enhance login status visibility
This commit addresses the "Login not detected" issue after completing
OAuth login via elicitation by unifying the session architecture and
adding comprehensive visibility into background session status.

## Changes

### 1. Enhanced check_logged_in with comprehensive logging (oauth_tools.py)
- Added detailed logging at each step of token lookup
- Implemented fallback strategy: first search by provisioning_client_id,
  then fall back to user_id lookup
- This allows detection of refresh tokens created via any flow
  (elicitation or browser login)
- Log messages include flow_type, provisioned_at, and provisioning_client_id
  for debugging

### 2. Unified session architecture (browser_oauth_routes.py)
- Browser login now stores provisioning_client_id=state when saving
  refresh token
- This makes browser and elicitation flows consistent - both can be
  found by the same state parameter
- Treats Flow 2 (elicitation) and browser login as the same "background
  session"

### 3. Enhanced /user/page with session status (userinfo_routes.py)
- Added comprehensive background access section showing:
  - Background Access: Granted/Not Granted (with visual indicators)
  - Flow Type: browser/flow2/hybrid
  - Provisioned At: timestamp
  - Token Audience: nextcloud/mcp
  - Scopes: detailed scope list
- Status displayed regardless of which flow created the session
  (browser login or elicitation)

### 4. Added revoke functionality (userinfo_routes.py, app.py)
- New POST endpoint: /user/revoke
- Allows users to revoke background access (delete refresh token)
- Browser session cookie remains valid for UI access
- Confirmation dialog before revocation
- Success page with auto-redirect back to /user/page
- Registered route in app.py browser_routes

## Testing
All tests pass:
- 6/6 login elicitation tests pass
- 21/21 core OAuth tests pass
- Comprehensive logging helps debug future issues

## Fixes
Resolves: "Login not detected. Please ensure you completed the login
at the provided URL before clicking OK."

The issue occurred because elicitation and browser login created
separate sessions. Now they are unified under the same architecture.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-07 21:50:55 +01:00
Chris Coutinho 0c9a9ea24d fix: Consolidate OAuth callbacks and implement PKCE for all flows
This PR fixes multiple OAuth-related issues:

## Unified OAuth Callback
- Consolidated `/oauth/callback-nextcloud` and `/oauth/login-callback` into single `/oauth/callback` endpoint
- Flow type determined by session lookup via state parameter (no query params in redirect_uri)
- Fixes redirect_uri validation issues with IdPs requiring exact match
- Legacy endpoints kept as aliases for backwards compatibility

## PKCE Implementation
- Implemented PKCE (RFC 7636) for Flow 2 (resource provisioning)
  - Generate code_verifier and code_challenge
  - Store code_verifier in session storage
  - Retrieve and use in token exchange
- Fixed PKCE for browser login (integrated mode)
  - Previously only worked for external IdP (Keycloak)
  - Now works for both Nextcloud OIDC and external IdP

## Login Elicitation Fixes (ADR-006)
- Fixed elicitation URL to route through MCP server endpoint
  - Changed from direct Nextcloud URL to `/oauth/authorize-nextcloud`
  - Ensures PKCE is properly handled by server
- Fixed login detection after OAuth flow completes
  - Look up refresh token by state parameter instead of user_id
  - Works even when Flow 1 token not present
- Added `get_refresh_token_by_provisioning_client_id()` method

## Session Authentication
- Fixed `/user/page` redirect loop
  - Shared oauth_context with mounted browser_app
  - SessionAuthBackend can now validate sessions correctly

## Tests
- Added comprehensive login elicitation test suite
- Updated scope authorization test expectations
- All 43 OAuth tests passing

## Files Changed
- `app.py`: Shared oauth_context, unified callback route
- `oauth_routes.py`: Unified callback, PKCE for Flow 2
- `browser_oauth_routes.py`: PKCE for integrated mode
- `oauth_tools.py`: Fixed elicitation URL generation
- `refresh_token_storage.py`: Added lookup by provisioning_client_id
- `test_login_elicitation.py`: New test suite

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-07 21:08:55 +01:00
Chris Coutinho 659087e4c7 fix: Implement proper OAuth resource parameters and PRM-based discovery
This commit completes the OAuth audience validation implementation per RFC 7519,
RFC 8707 (Resource Indicators), and RFC 9728 (Protected Resource Metadata).

## Key Changes

### OAuth Resource Parameters (RFC 8707)
- Add `resource` parameter to Flow 1 (MCP client auth) with MCP server audience
- Add `resource` parameter to Flow 2 (Nextcloud access) with Nextcloud audience
- Add `nextcloud_resource_uri` to oauth_context configuration
- Fix undefined variable error in starlette_lifespan

### PRM-Based Resource Discovery (RFC 9728)
- Update tests to fetch resource identifier from PRM endpoint
- Add fallback to hardcoded value if PRM fetch fails
- Demonstrate correct OAuth client implementation pattern

### ADR-005 Documentation Updates
- Update to reflect simplified RFC 7519 compliant implementation
- Document that MCP validates only its own audience (not Nextcloud's)
- Add section on OAuth resource parameters and PRM discovery
- Update implementation checklist to show completed items
- Mark status as "Implemented" with update date

## Implementation Details

The solution follows RFC 7519 Section 4.1.3: resource servers validate only
their own presence in the audience claim. This simplifies the logic while
maintaining security:

- MCP server validates MCP audience only
- Nextcloud independently validates its own audience
- No dual validation required at MCP layer
- Token reuse is allowed per RFC 8707 for multi-audience tokens

## Test Results
 test_mcp_oauth_server_connection - PASSED
 test_deck_board_view_permissions - PASSED
 test_prm_endpoint - PASSED

All OAuth flows now properly specify target resources, resulting in correct
audience validation throughout the system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 23:19:03 +01:00
Chris Coutinho 5deb3132c3 fix: Correct OAuth token audience validation for multi-audience mode
Fix two issues preventing OAuth tests from passing:

1. Set oidc_client_id and oidc_client_secret on Settings object
   - These were being read from environment but not propagated to the
     UnifiedTokenVerifier settings instance

2. Use client_issuer instead of issuer for JWT validation
   - client_issuer accounts for NEXTCLOUD_PUBLIC_ISSUER_URL override
   - Fixes "Invalid issuer" errors when public URL differs from internal

3. Accept resource URL with /mcp path in audience validation
   - During DCR, resource_url is registered as "{mcp_server_url}/mcp"
   - Tokens correctly include this full path as audience
   - Verifier now accepts both "http://localhost:8001" and
     "http://localhost:8001/mcp" as valid MCP audiences

These changes restore OAuth functionality while maintaining ADR-005
security requirements for proper audience validation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 19:03:35 +01:00
Chris Coutinho 9fab6cb550 feat: Implement ADR-005 unified token verifier to eliminate token passthrough vulnerability
Replace two non-compliant token verifiers (NextcloudTokenVerifier and
ProgressiveConsentTokenVerifier) with a single UnifiedTokenVerifier that properly
validates token audiences per MCP Security Best Practices specification.

The previous implementation had a critical security vulnerability where tokens
intended for the MCP server were passed directly to Nextcloud APIs without
proper audience validation (token passthrough anti-pattern). This violates
OAuth 2.0 security principles and the MCP specification.

Changes:
- Add UnifiedTokenVerifier supporting two compliant modes:
  * Multi-audience mode (default): Validates tokens contain BOTH MCP and
    Nextcloud audiences, enabling direct use without exchange
  * Token exchange mode (opt-in): Validates MCP audience only, exchanges
    for Nextcloud tokens via RFC 8693 with caching to minimize latency

- Remove token passthrough vulnerability from context.py and context_helper.py
- Implement token exchange caching (5-minute TTL default) to reduce network calls
- Add required environment variables for audience validation:
  * NEXTCLOUD_MCP_SERVER_URL - MCP server URL (used as audience)
  * NEXTCLOUD_RESOURCE_URI - Nextcloud resource identifier
  * TOKEN_EXCHANGE_CACHE_TTL - Cache TTL for exchanged tokens

- Update docker-compose.yml with resource URI configuration for both OAuth modes
- Add comprehensive test suite (29 tests) covering both authentication modes
- Remove legacy NextcloudTokenVerifier and ProgressiveConsentTokenVerifier

Security improvements:
- Eliminates token passthrough anti-pattern
- Enforces proper audience separation between MCP and Nextcloud
- Complies with MCP Security Best Practices and RFC 8707/8693
- Maintains performance with token exchange caching

Test results: 65/65 unit tests passed, 5/5 smoke tests passed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 18:53:14 +01:00
Chris Coutinho 6cccd92b3b build: Add type checking 2025-11-05 15:19:55 +01:00
Chris Coutinho 8983f25eaf fix: add missing await for get_nextcloud_client in capabilities resource
Fix nc_get_capabilities resource handler that was missing await when
calling get_nextcloud_client(ctx), causing error:
'coroutine' object has no attribute 'capabilities'

Root cause:
- get_nextcloud_client() is an async function (context.py:9)
- Returns a coroutine that must be awaited
- app.py:737 called it without await

Solution:
- Add await: client = await get_nextcloud_client(ctx)
- The handler is already async, so can await the call

Test fixed:
- test_mcp_resources_access now passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-04 10:22:50 +01:00
Chris Coutinho de99296779 feat: implement scope-based audience mapping and RFC 9728 support
This commit removes hardcoded Keycloak audience mappers and implements
dynamic audience assignment based on OAuth client scopes and RFC 8707
resource indicators.

## MCP Server Changes

### Protected Resource Metadata (app.py)
- Change resource field from client_id to URL (RFC 9728 compliance)
- Use `{mcp_server_url}/mcp` as resource identifier
- Update DCR registration to include all Nextcloud API scopes
- Add resource_url parameter to client registration

### Client Registration (auth/client_registration.py)
- Add resource_url parameter to register_client()
- Pass resource_url to DCR endpoint
- Support RFC 9728 resource metadata

### Browser OAuth Routes (auth/browser_oauth_routes.py)
- Enhanced error logging for token exchange failures
- Log HTTP status code and response body for debugging
- Improved error messages for OAuth provisioning issues

### Token Verifier (auth/progressive_token_verifier.py)
- Add introspection_uri and client_secret parameters
- Initialize HTTP client for introspection requests
- Enable opaque token validation support

## Keycloak Configuration

### realm-export.json
- **Remove** hardcoded `audience-mcp-server` protocol mapper
- Audience now determined by client scopes:
  - External clients: RFC 8707 resource parameter → `aud: {resource_url}`
  - MCP Server: `token-exchange-nextcloud` scope → `aud: "nextcloud"`

### OIDC App (third_party/oidc)
- Updated submodule with RFC 9728 support
- Added resource_url database field
- Enhanced introspection authorization logic

## Architecture

Two separate audience flows:

1. **Gemini CLI → MCP Server**
   - Client requests: `resource=http://localhost:8002/mcp`
   - Token audience: `aud: "http://localhost:8002/mcp"`
   - MCP server validates via progressive_token_verifier

2. **MCP Server → Nextcloud APIs**
   - MCP server includes: `scope=token-exchange-nextcloud`
   - Token audience: `aud: "nextcloud"` (via scope mapper)
   - Nextcloud user_oidc validates via SelfEncodedValidator

## Benefits
-  RFC 8707 compliant (resource indicators)
-  RFC 9728 compliant (protected resource metadata)
-  Dynamic audience based on OAuth context
-  Fixes Gemini CLI authentication failures
-  Maintains Nextcloud API access for background jobs
-  Clear security boundaries between flows

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-04 05:28:58 +01:00
Chris Coutinho 10dffd0c10 fix: restructure routes to prevent SessionAuthBackend from interfering with FastMCP OAuth
SessionAuthBackend middleware was wrapping the entire app including FastMCP,
which prevented FastMCP's OAuth token verification from running properly.
When SessionAuthBackend returned None for /mcp paths, Starlette marked requests
as "anonymous" and allowed them through, bypassing FastMCP's authentication.

Changes:

1. Route restructuring (app.py):
   - Create separate Starlette app for browser routes (/user, /user/page)
   - Apply SessionAuthBackend only to browser app
   - Mount browser app at /user/* before FastMCP
   - Mount FastMCP at / (catch-all with its own OAuth)
   - Remove global SessionAuthBackend middleware

2. SessionAuthBackend cleanup (session_backend.py):
   - Remove path exclusion logic (no longer needed)
   - Simplify to only handle browser routes
   - Update docstring to reflect mount-based isolation

Benefits:
- FastMCP's OAuth token verification now runs properly
- No middleware interference between authentication mechanisms
- Clear separation: SessionAuth for browser UI, OAuth Bearer for MCP clients
- Tests confirm OAuth authentication works correctly

Testing:
- All OAuth tests pass (test_mcp_oauth_*, test_jwt_*)
- Browser routes still require session auth
- FastMCP routes use OAuth Bearer tokens exclusively

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-04 03:34:53 +01:00
Chris Coutinho 192c4bf009 fix: correct OAuth token audience validation using RFC 8707 resource parameter
The test_mcp_oauth_server_connection test was failing because OAuth tokens
had the wrong audience claim. The MCP server's progressive_token_verifier
expects tokens with audience matching its OAuth client ID, but tokens were
being issued with Nextcloud's default resource server audience.

Changes:

1. Test fixtures (tests/conftest.py):
   - Add get_mcp_server_resource_metadata() helper to fetch PRM metadata
   - Update playwright_oauth_token to include resource parameter in auth requests
   - Update _get_oauth_token_with_scopes to support optional resource parameter
   - Automatically fetch resource ID from MCP server's PRM endpoint

2. MCP Server (nextcloud_mcp_server/app.py):
   - Fix Protected Resource Metadata endpoint to return OAuth client ID
   - Change "resource" field from URL to client ID for proper audience validation
   - Ensures tokens obtained with resource parameter have correct audience claim

How it works:
1. Test fetches /.well-known/oauth-protected-resource from MCP server
2. Extracts resource field (MCP server's client ID)
3. Includes &resource=<client-id> in OAuth authorization request (RFC 8707)
4. Nextcloud OIDC issues tokens with aud: [<client-id>]
5. MCP server's progressive_token_verifier accepts tokens (audience matches)

Fixes OAuth test failures:
- test_mcp_oauth_server_connection
- test_mcp_oauth_tool_execution
- test_mcp_oauth_client_with_playwright

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-04 03:06:11 +01:00
Chris Coutinho 01d1cf9190 feat: integrate token exchange into MCP server application
Wire up RFC 8693 token exchange throughout the MCP server to support
stateless per-request token conversion for external IdP scenarios.

Changes:

Authentication Flow:
- Add exchange_token_for_audience() for pure RFC 8693 exchange
- Update context_helper to use stateless token exchange
- Remove fallback to standard OAuth on exchange failure
- Make storage initialization lazy (only for delegation, not MCP tools)

Application Configuration:
- Add ENABLE_TOKEN_EXCHANGE environment variable support
- Skip provisioning tools when token exchange enabled
- Pass mcp_client_id to token broker for proper validation
- Update docker-compose.yml with token exchange config

Token Exchange Service:
- Add TOKEN_EXCHANGE_GRANT constant
- Implement exchange_token_for_audience() method
- Support both "mcp-server" and client_id audiences
- Lazy storage initialization for delegation scenarios
- Enhanced error handling and logging

Progressive Token Verifier:
- Add mcp_client_id parameter for external IdP validation
- Accept both "mcp-server" and configured client_id
- Support external IdP token verification

Key Behavior Changes:
- When ENABLE_TOKEN_EXCHANGE=true: Each MCP tool call triggers
  stateless token exchange (client token → Nextcloud token)
- When ENABLE_TOKEN_EXCHANGE=false: Uses pass-through mode
  (validates Flow 1 token and passes to Nextcloud)
- No provisioning tools registered in exchange mode
- No refresh tokens needed for request-time operations

This completes the token exchange implementation. The MCP server now
supports both pass-through (default) and exchange (opt-in) modes for
federated authentication architectures.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-04 02:32:40 +01:00