Compare commits

243 Commits

Author SHA1 Message Date
github-actions[bot] f70d743c8b bump: version 0.48.6 → 0.49.0 2025-12-08 06:23:14 +00:00
Chris Coutinho 251b8a10c0 Merge pull request #363 from cbcoutinho/feature/news-app-integration
feat(news): add Nextcloud News app integration
2025-12-08 07:22:42 +01:00
Chris Coutinho 3f06e2ee77 fix: resolve all type checking errors (8 errors fixed)
Fixed 8 type checker errors across the codebase:

- vector/scanner.py: Handle None scroll results with null-safe iteration
- search/{bm25_hybrid,semantic}.py: Add None checks for result.payload
- auth/{unified_verifier,webhook_routes}.py: Assert non-None auth credentials
- client/webdav.py: Add None checks before int() conversions
- providers/openai.py: Assert embedding_model is not None
- search/algorithms.py: Explicitly type doc_types set and cast values
- observability/logging_config.py: Match parent class signature (log_data)

Also fixed test_create_tag_creates_system_tag to match WebDAV implementation
(was testing OCS API endpoint, now tests correct WebDAV endpoint with
Content-Location header).

Type checker: 0 errors (down from 8), 20 warnings (ignored)
Tests: All 192 unit tests passing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-08 01:09:02 +01:00
Chris Coutinho 7f11c793ef Merge remote-tracking branch 'origin/master' into feature/news-app-integration 2025-12-07 22:36:48 +01:00
Chris Coutinho e28dcbff9a Merge pull request #378 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-0.x
chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.16
2025-12-07 13:28:38 +01:00
renovate-bot-cbcoutinho[bot] 89ec0186a4 chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.16 2025-12-07 11:06:50 +00:00
Chris Coutinho 6e1efde8c6 Merge pull request #375 from cbcoutinho/renovate/qdrant-qdrant-v1.16.2
chore(deps): update qdrant/qdrant:v1.16.2 docker digest to dab6de3
2025-12-05 20:19:08 +01:00
Chris Coutinho 6aa80d4210 Merge pull request #377 from cbcoutinho/renovate/hoverkraft-tech-compose-action-2.x
chore(deps): update hoverkraft-tech/compose-action action to v2.4.2
2025-12-05 20:18:56 +01:00
Chris Coutinho 4e86006b3f Merge pull request #376 from cbcoutinho/renovate/qdrant-1.x
chore(deps): update helm release qdrant to v1.16.2
2025-12-05 20:18:32 +01:00
renovate-bot-cbcoutinho[bot] 679e22a7c2 chore(deps): update hoverkraft-tech/compose-action action to v2.4.2 2025-12-05 11:11:41 +00:00
renovate-bot-cbcoutinho[bot] 4d3228a4a8 chore(deps): update helm release qdrant to v1.16.2 2025-12-05 11:11:34 +00:00
renovate-bot-cbcoutinho[bot] 0aa307f0b6 chore(deps): update qdrant/qdrant:v1.16.2 docker digest to dab6de3 2025-12-05 11:11:18 +00:00
Chris Coutinho 6a69ecefb1 Merge pull request #372 from cbcoutinho/renovate/qdrant-qdrant-1.x
chore(deps): update qdrant/qdrant docker tag to v1.16.2
2025-12-04 13:56:27 +01:00
renovate-bot-cbcoutinho[bot] c05beb66e9 chore(deps): update qdrant/qdrant docker tag to v1.16.2 2025-12-04 11:09:16 +00:00
Chris Coutinho 34ddb24014 Merge pull request #368 from cbcoutinho/renovate/actions-checkout-digest
chore(deps): update actions/checkout digest to 8e8c483
2025-12-03 13:09:39 +01:00
Chris Coutinho 9d69613df7 Merge pull request #369 from cbcoutinho/renovate/actions-checkout-6.x
chore(deps): update actions/checkout action to v6.0.1
2025-12-03 13:09:26 +01:00
github-actions[bot] 630f818538 bump: version 0.48.5 → 0.48.6 2025-12-03 12:09:01 +00:00
Chris Coutinho b280a720ff Merge pull request #370 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-0.x
chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.15
2025-12-03 13:08:59 +01:00
Chris Coutinho 48bac9c212 Merge pull request #371 from cbcoutinho/renovate/mcp-1.x
fix(deps): update dependency mcp to >=1.23,<1.24
2025-12-03 13:08:30 +01:00
renovate-bot-cbcoutinho[bot] e88c49fb50 fix(deps): update dependency mcp to >=1.23,<1.24 2025-12-03 11:13:29 +00:00
renovate-bot-cbcoutinho[bot] 9e10a5a400 chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.15 2025-12-03 11:12:56 +00:00
renovate-bot-cbcoutinho[bot] 1dbea24fa2 chore(deps): update actions/checkout action to v6.0.1 2025-12-03 11:12:49 +00:00
renovate-bot-cbcoutinho[bot] 0606228b40 chore(deps): update actions/checkout digest to 8e8c483 2025-12-03 11:12:44 +00:00
Chris Coutinho f35b9f0988 Merge pull request #366 from cbcoutinho/renovate/anthropics-claude-code-action-digest
chore(deps): update anthropics/claude-code-action digest to 6337623
2025-12-02 13:17:39 +01:00
Chris Coutinho c400c46672 Merge pull request #367 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-0.x
chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.14
2025-12-02 13:15:58 +01:00
renovate-bot-cbcoutinho[bot] fbdeb2161d chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.14 2025-12-02 11:08:38 +00:00
renovate-bot-cbcoutinho[bot] 8c7d03dd29 chore(deps): update anthropics/claude-code-action digest to 6337623 2025-12-02 11:08:33 +00:00
Chris Coutinho 135ce7b2df Merge pull request #364 from cbcoutinho/renovate/quay.io-keycloak-keycloak-26.x
chore(deps): update quay.io/keycloak/keycloak docker tag to v26.4.7
2025-12-02 07:07:36 +01:00
Chris Coutinho 0e47ae051b Merge pull request #365 from cbcoutinho/renovate/softprops-action-gh-release-2.x
chore(deps): update softprops/action-gh-release action to v2.5.0
2025-12-01 15:43:03 +01:00
renovate-bot-cbcoutinho[bot] 04255473d2 chore(deps): update softprops/action-gh-release action to v2.5.0 2025-12-01 11:07:53 +00:00
renovate-bot-cbcoutinho[bot] ce6bbff389 chore(deps): update quay.io/keycloak/keycloak docker tag to v26.4.7 2025-12-01 11:07:45 +00:00
Chris Coutinho 92c4bf36f6 perf(news): use direct API endpoint for get_item()
Replace O(n) fetch-all-and-filter approach with O(1) direct API call.
The News API v1-3 supports GET /items/{id} for single-item retrieval.

- Update get_item() to use direct endpoint
- Add unit test for get_item() method
- Fixes critical performance issue identified in code review

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-29 17:22:51 +01:00
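Note on 92c4bf36f6: a minimal sketch of the difference between fetch-all-and-filter and a direct single-item lookup. FakeNewsAPI, get_item_by_scan, and get_item_direct are hypothetical names for illustration, and the in-memory class only stands in for the real News API; the /items/{id} path follows the commit message.

```python
class FakeNewsAPI:
    """In-memory stand-in for the News API (hypothetical, illustration only)."""

    def __init__(self, items):
        self._items = {item["id"]: item for item in items}
        self.payload_sizes = []  # number of items returned per call

    def get(self, path):
        if path == "/items":
            result = list(self._items.values())
            self.payload_sizes.append(len(result))
            return result
        if path.startswith("/items/"):
            item = self._items[int(path.rsplit("/", 1)[1])]
            self.payload_sizes.append(1)
            return item
        raise KeyError(path)


def get_item_by_scan(api, item_id):
    """Old approach: download every item, then filter client-side."""
    for item in api.get("/items"):
        if item["id"] == item_id:
            return item
    raise LookupError(item_id)


def get_item_direct(api, item_id):
    """New approach: ask the server for exactly one item."""
    return api.get(f"/items/{item_id}")
```

Both functions return the same item; only the amount of data transferred per call differs.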
Chris Coutinho 0bedbf1877 Merge remote-tracking branch 'origin/master' into feature/news-app-integration 2025-11-29 17:19:16 +01:00
Chris Coutinho a5cb6e1242 refactor(news): simplify vector sync to fetch all items
Remove the complex starred+unread filtering logic in scan_news_items().
The News app's auto-purge feature (default: 200 items per feed) already
limits the total number of items, making explicit filtering unnecessary.

Changes:
- Replace two API calls (starred + unread) with single all-items call
- Remove deduplication logic that merged both lists
- Update docstring to explain the simpler approach

This reduces code complexity while maintaining the same effective coverage.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-29 15:05:34 +01:00
Chris Coutinho a33f6a2f15 feat(news): add Nextcloud News app integration
Add full integration for the Nextcloud News (RSS/Atom reader) app:

- Add NewsClient with complete CRUD operations for folders, feeds, and items
- Add 8 read-only MCP tools for listing/getting folders, feeds, items
- Add Pydantic models for News entities with camelCase alias support
- Add vector sync support for starred + unread items
- Add HTML to Markdown converter using markdownify for better embeddings
- Add Docker post-install hook to enable News app
- Add 25 unit tests for NewsClient API methods

Vector sync indexes starred and unread items, providing a balanced approach
that captures important (starred) and current (unread) content without
indexing the entire article history.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-29 14:39:31 +01:00
Chris Coutinho d79e9090e6 Merge pull request #351 from cbcoutinho/renovate/pin-dependencies
chore(deps): pin anthropics/claude-code-action action to a7e4c51
2025-11-29 12:39:10 +01:00
renovate-bot-cbcoutinho[bot] 97fd660e38 chore(deps): pin anthropics/claude-code-action action to a7e4c51 2025-11-29 11:05:15 +00:00
Chris Coutinho 96e168d035 Merge pull request #362 from cbcoutinho/renovate/actions-checkout-6.x
chore(deps): update actions/checkout action to v6
2025-11-29 00:07:55 +01:00
renovate-bot-cbcoutinho[bot] 4d2b77ecaf chore(deps): update actions/checkout action to v6 2025-11-28 23:06:18 +00:00
github-actions[bot] e48da80a4b bump: version 0.48.4 → 0.48.5 2025-11-28 23:03:07 +00:00
Chris Coutinho 6125312f61 Merge pull request #313 from cbcoutinho/renovate/pillow-12.x
fix(deps): update dependency pillow to v12
2025-11-29 00:02:36 +01:00
claude[bot] 007fd0c2e3 chore: add Renovate package rule to block Pillow >=12.0.0
Pillow 12.x is incompatible with fastembed which requires pillow<12.0.0.
Added package rule to prevent Renovate from updating Pillow to version 12+
and reverted pyproject.toml to use pillow<12.0.0.

Co-authored-by: Chris Coutinho <cbcoutinho@users.noreply.github.com>
2025-11-28 23:01:46 +00:00
Chris Coutinho c4f90d6a57 Merge pull request #361 from cbcoutinho/add-claude-github-actions-1764370764331
Add Claude Code GitHub Workflow
2025-11-29 00:00:04 +01:00
Chris Coutinho 5dd62c9466 "Claude Code Review workflow" 2025-11-28 23:59:26 +01:00
Chris Coutinho 4d072d7217 "Claude PR Assistant workflow" 2025-11-28 23:59:25 +01:00
Chris Coutinho b4242b1394 Merge pull request #360 from cbcoutinho/renovate/docker-metadata-action-digest
chore(deps): update docker/metadata-action digest to c299e40
2025-11-28 00:07:01 +01:00
renovate-bot-cbcoutinho[bot] fa2343dff9 chore(deps): update docker/metadata-action digest to c299e40 2025-11-27 17:04:27 +00:00
Chris Coutinho 1b1667bc2b Merge pull request #357 from cbcoutinho/renovate/shivammathur-setup-php-digest
chore(deps): update shivammathur/setup-php digest to 44454db
2025-11-26 18:25:06 +01:00
Chris Coutinho c2b4bf9c67 Merge pull request #358 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-0.x
chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.13
2025-11-26 18:24:46 +01:00
Chris Coutinho 0845fefe6c Merge pull request #359 from cbcoutinho/renovate/qdrant-1.x
chore(deps): update helm release qdrant to v1.16.1
2025-11-26 18:24:34 +01:00
renovate-bot-cbcoutinho[bot] d911556a84 chore(deps): update helm release qdrant to v1.16.1 2025-11-26 17:04:52 +00:00
renovate-bot-cbcoutinho[bot] 38be8d9401 chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.13 2025-11-26 17:04:31 +00:00
renovate-bot-cbcoutinho[bot] 9f3190f62a chore(deps): update shivammathur/setup-php digest to 44454db 2025-11-26 17:04:26 +00:00
Chris Coutinho 41aeb7e0f2 Merge pull request #356 from cbcoutinho/renovate/quay.io-keycloak-keycloak-26.x
chore(deps): update quay.io/keycloak/keycloak docker tag to v26.4.6
2025-11-26 00:50:25 +01:00
renovate-bot-cbcoutinho[bot] f8e67519e1 chore(deps): update quay.io/keycloak/keycloak docker tag to v26.4.6 2025-11-25 23:06:05 +00:00
Chris Coutinho 4279dcba1e Merge pull request #354 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-0.x
chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.12
2025-11-25 18:19:32 +01:00
Chris Coutinho be7e3d6b56 Merge pull request #355 from cbcoutinho/renovate/qdrant-qdrant-1.x
chore(deps): update qdrant/qdrant docker tag to v1.16.1
2025-11-25 18:19:07 +01:00
renovate-bot-cbcoutinho[bot] 41e128190b chore(deps): update qdrant/qdrant docker tag to v1.16.1 2025-11-25 17:06:22 +00:00
renovate-bot-cbcoutinho[bot] ba869ccde5 chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.12 2025-11-25 17:06:11 +00:00
Chris Coutinho 27fe066b23 Merge pull request #353 from cbcoutinho/renovate/docker.io-library-nextcloud-32.0.2
chore(deps): update docker.io/library/nextcloud:32.0.2 docker digest to 8cb1dc8
2025-11-23 19:41:19 +01:00
renovate-bot-cbcoutinho[bot] e94b8ff714 chore(deps): update docker.io/library/nextcloud:32.0.2 docker digest to 8cb1dc8 2025-11-23 17:04:03 +00:00
github-actions[bot] e3a6894904 bump: version 0.48.3 → 0.48.4 2025-11-23 16:40:06 +00:00
Chris Coutinho 92b97bda00 fix: Add rate limit retry logic to OpenAI provider
Add exponential backoff retry handling for OpenAI API rate limits
(429 errors). This is needed for GitHub Models API which has stricter
rate limits than standard OpenAI API.

- Add retry_on_rate_limit decorator with exponential backoff
- Max 5 retries with delays: 2s → 4s → 8s → 16s → 32s
- Apply to embed(), _embed_batch_request(), and generate() methods

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 17:24:48 +01:00
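Note on 92b97bda00: a minimal sketch of a retry decorator with the delay schedule listed above (2s, 4s, 8s, 16s, 32s; at most 5 retries). RateLimitError is a hypothetical stand-in for the provider's 429 error, and the injectable sleep parameter is an assumption added so the schedule can be checked without waiting; the project's actual decorator may differ.

```python
import asyncio
import functools


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 error from the provider."""


def retry_on_rate_limit(max_retries=5, base_delay=2.0, sleep=asyncio.sleep):
    """Retry an async callable on RateLimitError, doubling the delay each time."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return await func(*args, **kwargs)
                except RateLimitError:
                    if attempt == max_retries:
                        raise  # retries exhausted, surface the error
                    # attempt 0 waits base_delay, attempt 1 waits 2x, and so on
                    await sleep(base_delay * (2 ** attempt))

        return wrapper

    return decorator
```

With the defaults, a call is attempted up to six times in total: the initial attempt plus five retries.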
Chris Coutinho d5c6039296 ci: Update rag pipeline 2025-11-23 16:33:39 +01:00
Chris Coutinho 3fa13c8bfd ci: Update rag pipeline 2025-11-23 16:12:37 +01:00
Chris Coutinho 9d306b71fa ci: Fix pytest path 2025-11-23 15:43:45 +01:00
Chris Coutinho 38a936c120 Merge pull request #352 from cbcoutinho/renovate/major-github-artifact-actions
chore(deps): update actions/upload-artifact action to v5
2025-11-23 12:43:43 +01:00
renovate-bot-cbcoutinho[bot] 86d13a7240 fix(deps): update dependency pillow to v12 2025-11-23 05:05:03 +00:00
renovate-bot-cbcoutinho[bot] 0b2d449ffa chore(deps): update actions/upload-artifact action to v5 2025-11-23 05:04:36 +00:00
Chris Coutinho d881373dce ci: Remove third_party from app mounts 2025-11-23 05:48:17 +01:00
github-actions[bot] 9ade4c65f3 bump: version 0.48.2 → 0.48.3 2025-11-23 04:44:17 +00:00
Chris Coutinho 5c73b85f65 fix: Increase MCP sampling timeout to 5 minutes for slower LLMs
- Increase sampling timeout from 30s to 300s in semantic.py to accommodate
  slower local LLMs like Ollama
- Refactor RAG integration tests to support multiple providers (ollama,
  openai, anthropic, bedrock)
- Remove unnecessary embedding_provider fixture since MCP server handles
  embeddings internally
- Add --provider flag via tests/integration/conftest.py
- Add provider_fixtures.py with factory functions for generation providers

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 05:43:48 +01:00
github-actions[bot] f5764c01fc bump: version 0.48.1 → 0.48.2 2025-11-23 03:25:23 +00:00
Chris Coutinho 8c7c2a4407 Merge pull request #350 from cbcoutinho/feature/openai-provider-support
feature/openai provider support
2025-11-23 04:24:55 +01:00
Chris Coutinho 978de5e9a4 Merge branch 'master' into feature/openai-provider-support 2025-11-23 04:23:50 +01:00
Chris Coutinho 4e9859117c fix: Share vector sync state with FastMCP session lifespan via module singleton
The refactor in fafeaf3 moved background tasks to Starlette server lifespan
but broke nc_get_vector_sync_status because it still looked for streams in
FastMCP's AppContext (lifespan_context).

Add VectorSyncState module singleton to bridge the lifespans:
- starlette_lifespan sets the singleton when starting background tasks
- app_lifespan_basic reads from singleton and includes in AppContext
- MCP tools can now access document_receive_stream for pending count

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 04:20:47 +01:00
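Note on 4e9859117c: a minimal sketch of bridging two lifespans through a module-level singleton. VectorSyncState and document_receive_stream are names from the commit message; the setter/getter helpers, the two lifespan functions, and the list used in place of a real stream are assumptions for illustration.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class VectorSyncState:
    document_receive_stream: Optional[Any] = None


# Module-level singleton shared by both lifespans.
_state = VectorSyncState()


def set_vector_sync_state(stream):
    _state.document_receive_stream = stream


def get_vector_sync_state():
    return _state


@asynccontextmanager
async def server_lifespan():
    """Runs once per server: starts background work and publishes its stream."""
    set_vector_sync_state(["doc-1", "doc-2"])  # placeholder for a real stream
    try:
        yield
    finally:
        set_vector_sync_state(None)


@asynccontextmanager
async def session_lifespan():
    """Runs per session: copies the shared stream into the session context."""
    yield {"document_receive_stream": get_vector_sync_state().document_receive_stream}


async def pending_count():
    async with server_lifespan():
        async with session_lifespan() as ctx:
            return len(ctx["document_receive_stream"])
```

The session lifespan never starts background work itself; it only reads what the server lifespan published.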
Chris Coutinho a134a0fc08 fix: Share vector sync state with FastMCP session lifespan via module singleton
The refactor in fafeaf3 moved background tasks to Starlette server lifespan
but broke nc_get_vector_sync_status because it still looked for streams in
FastMCP's AppContext (lifespan_context).

Add VectorSyncState module singleton to bridge the lifespans:
- starlette_lifespan sets the singleton when starting background tasks
- app_lifespan_basic reads from singleton and includes in AppContext
- MCP tools can now access document_receive_stream for pending count

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 04:20:09 +01:00
Chris Coutinho 6df58af0c3 ci: Decrease polling interval to 5s 2025-11-23 04:09:37 +01:00
github-actions[bot] 852606ec8b bump: version 0.48.0 → 0.48.1 2025-11-23 03:03:55 +00:00
Chris Coutinho caae6922be Merge pull request #349 from cbcoutinho/feature/openai-provider-support
feature/openai provider support
2025-11-23 04:03:29 +01:00
Chris Coutinho fafeaf3d83 refactor: Move background tasks to server lifespan and deprecate SSE transport
- Move scanner/processor tasks from FastMCP session lifespan to Starlette
  server lifespan (correct architecture: background tasks run once at
  server level, not per-session)
- Change default CLI transport from SSE to streamable-http
- Remove SSE transport option from CLI (SSE is deprecated)
- Remove SSE client session factory from test fixtures
- Add tracing instrumentation to BM25 hybrid search operations for
  better observability

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 04:02:30 +01:00
Chris Coutinho 2ab8dad6a5 fix: Use WebDAV for tag creation and add LLM-as-a-judge for RAG tests
- Change create_tag() to use WebDAV POST instead of OCS API which
  returned 404 in some Nextcloud versions
- Add llm_judge() helper that evaluates system output against ground
  truth with simple TRUE/FALSE prompt
- Replace keyword-based assertions in RAG tests with LLM judge for
  more flexible semantic evaluation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 02:24:01 +01:00
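Note on 2ab8dad6a5: a minimal sketch of an LLM-as-a-judge helper with a TRUE/FALSE prompt. The llm_judge name comes from the commit message; the signature, the prompt wording, and the generate callable (any function mapping a prompt string to a reply string) are assumptions.

```python
def llm_judge(generate, question, answer, ground_truth):
    """Ask a model whether `answer` agrees with `ground_truth`; return a bool."""
    prompt = (
        "You are grading an answer against a reference.\n"
        f"Question: {question}\n"
        f"Reference answer: {ground_truth}\n"
        f"System answer: {answer}\n"
        "Reply with exactly TRUE if the system answer agrees with the "
        "reference, otherwise reply with exactly FALSE."
    )
    verdict = generate(prompt).strip().upper()
    # Anything that does not start with TRUE is treated as a failed check.
    return verdict.startswith("TRUE")
```

Treating every non-TRUE reply as a failure keeps a malformed model response from passing a test by accident.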
Chris Coutinho 50216accde Merge pull request #348 from cbcoutinho/feature/openai-provider-support
feature/openai provider support
2025-11-23 01:56:49 +01:00
Chris Coutinho bf2fdac2d0 ci: Fix health endpoint 2025-11-23 01:56:17 +01:00
github-actions[bot] 626c4bf562 bump: version 0.47.0 → 0.48.0 2025-11-23 00:53:24 +00:00
Chris Coutinho a56b3f3d51 Merge pull request #347 from cbcoutinho/feature/openai-provider-support
feature/openai provider support
2025-11-23 01:52:55 +01:00
Chris Coutinho 2896fa1dc9 feat: Add tag management methods to WebDAV client
- Add get_file_info() to get file info including file ID via PROPFIND
- Add create_tag() to create system tags via OCS API
- Add get_or_create_tag() for idempotent tag creation
- Add assign_tag_to_file() to assign tags to files via WebDAV
- Add remove_tag_from_file() to remove tags from files

Also refactors RAG evaluation:
- Add indexed_manual_pdf fixture using existing nc_client/nc_mcp_client
- Remove manual tag creation steps from workflow (now handled by fixture)
- Add comprehensive unit tests for new WebDAV methods

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 01:51:42 +01:00
Chris Coutinho 04251401aa ci: Add permissions to github token 2025-11-23 01:26:22 +01:00
github-actions[bot] e86b6e83ae bump: version 0.46.2 → 0.47.0 2025-11-23 00:23:47 +00:00
Chris Coutinho 6f5e75da15 Merge pull request #346 from cbcoutinho/feature/openai-provider-support
feat: Add OpenAI provider support for embeddings and generation
2025-11-23 01:23:18 +01:00
Chris Coutinho b2742aab80 ci: Add RAG evaluation workflow with workflow_dispatch
Adds a manually-triggered GitHub Actions workflow for RAG evaluation:
- Builds Nextcloud User Manual PDF from documentation source
- Uploads PDF to Nextcloud via WebDAV
- Tags file with 'vector-index' for vector sync indexing
- Waits for vector sync to complete
- Runs RAG integration tests with OpenAI/GitHub Models API

Inputs:
- embedding_model: OpenAI embedding model (default: openai/text-embedding-3-small)
- generation_model: OpenAI generation model (default: openai/gpt-4o-mini)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 01:22:16 +01:00
Chris Coutinho 208365cd3d feat: Add OpenAI provider support for embeddings and generation
Adds OpenAI provider to the unified provider architecture (ADR-015),
supporting:
- OpenAI API (api.openai.com)
- GitHub Models API (models.github.ai/inference)
- OpenAI-compatible endpoints (Fireworks, Together, etc.)

Features:
- Embedding support with text-embedding-3-small/large models
- Text generation via chat completions API
- Automatic retry with exponential backoff for rate limits
- Provider auto-detection in registry (priority after Bedrock)

Environment variables:
- OPENAI_API_KEY: API key (required)
- OPENAI_BASE_URL: Base URL override (optional)
- OPENAI_EMBEDDING_MODEL: Embedding model (default: text-embedding-3-small)
- OPENAI_GENERATION_MODEL: Generation model (default: gpt-4o-mini)

Also adds:
- Integration tests for RAG pipeline with MCP sampling
- MCP client sampling support for integration tests
- Ground truth Q&A pairs for Nextcloud User Manual

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 00:33:32 +01:00
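Note on 208365cd3d: a minimal sketch of reading the four environment variables listed above, with the stated defaults. OpenAISettings and load_openai_settings are hypothetical names; the provider's real configuration code is not shown in this log.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OpenAISettings:
    api_key: str
    base_url: Optional[str]
    embedding_model: str
    generation_model: str


def load_openai_settings(environ=None):
    """Build settings from environment variables; only the API key is required."""
    env = os.environ if environ is None else environ
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise ValueError("OPENAI_API_KEY is required")
    return OpenAISettings(
        api_key=api_key,
        # An unset or empty base URL means "use the library default endpoint".
        base_url=env.get("OPENAI_BASE_URL") or None,
        embedding_model=env.get("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small"),
        generation_model=env.get("OPENAI_GENERATION_MODEL", "gpt-4o-mini"),
    )
```

Passing a plain dict as environ makes the function easy to exercise in tests without touching the process environment.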
Chris Coutinho 26f679d86e Merge pull request #332 from cbcoutinho/renovate/docker.io-library-python-3.12-slim-trixie
chore(deps): update docker.io/library/python:3.12-slim-trixie docker digest to b43ff04
2025-11-23 00:29:07 +01:00
Chris Coutinho cf39a15db1 Merge pull request #345 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-0.x
chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.11
2025-11-23 00:28:53 +01:00
renovate-bot-cbcoutinho[bot] 1f3c35f162 chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.11 2025-11-22 23:04:43 +00:00
renovate-bot-cbcoutinho[bot] 2bccc3dad9 chore(deps): update docker.io/library/python:3.12-slim-trixie docker digest to b43ff04 2025-11-22 23:04:40 +00:00
github-actions[bot] 959cb8b21a bump: version 0.46.1 → 0.46.2 2025-11-22 21:02:53 +00:00
Chris Coutinho f8a2410a0a Merge pull request #344 from cbcoutinho/fix/smithery-json-response
fix(smithery): Enable JSON response format for scanner compatibility
2025-11-22 22:02:24 +01:00
Chris Coutinho 03b984d5a7 fix(smithery): Enable JSON response format for scanner compatibility
The Smithery scanner was reporting "0 tools" despite the server returning
valid tool definitions. Root cause: the server was returning SSE-formatted
responses (event: message\ndata: {...}) which the scanner couldn't parse.

Changes:
- Add json_response=True to FastMCP for Smithery stateless mode
- Clean up verbose docstring examples in semantic.py and webdav.py

The MCP spec allows both SSE and plain JSON responses for HTTP transport.
Setting json_response=True returns Content-Type: application/json with
plain JSON-RPC instead of text/event-stream with SSE format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 22:01:18 +01:00
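Note on 03b984d5a7: a minimal sketch of why a client that expects plain JSON cannot read an SSE-framed body, and what decoding both looks like. parse_mcp_http_body is a hypothetical helper; it handles a single SSE event only and is not a full event-stream parser.

```python
import json


def parse_mcp_http_body(content_type, body):
    """Decode one JSON-RPC message from a plain JSON or SSE-framed body."""
    if content_type.startswith("application/json"):
        return json.loads(body)
    if content_type.startswith("text/event-stream"):
        data_lines = []
        for line in body.splitlines():
            if line.startswith("data:"):
                value = line[len("data:"):]
                # SSE drops one optional space after the colon.
                if value.startswith(" "):
                    value = value[1:]
                data_lines.append(value)
        return json.loads("\n".join(data_lines))
    raise ValueError(f"unsupported content type: {content_type}")
```

Calling json.loads directly on the SSE body fails because the payload is wrapped in "event:" and "data:" lines, which matches the symptom described in the commit.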
github-actions[bot] 57db18c6a3 bump: version 0.46.0 → 0.46.1 2025-11-22 18:54:11 +00:00
Chris Coutinho ea79e94842 Merge pull request #343 from cbcoutinho/fix/vector-viz-search
perf: Optimize vector viz search performance
2025-11-22 19:53:40 +01:00
Chris Coutinho b0612cfa0f perf: Optimize vector viz search performance
- Replace sequential Qdrant scroll calls with batch retrieve
  (50 HTTP requests → 1 request, ~50x faster vector fetch)

- Add point_id to SearchResult to enable batch retrieval by Qdrant point ID

- Reuse query embedding from search algorithm in viz_routes
  (eliminates redundant embedding call, saves ~30ms)

- Make BM25 encode() async with thread pool to avoid blocking event loop
  (~4.4s was blocking, now properly async)

- Run PCA computation in thread pool to avoid blocking event loop
  (~1.2s was blocking, now properly async)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 19:47:43 +01:00
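Note on b0612cfa0f: a minimal sketch of moving a blocking call off the event loop. It uses asyncio.to_thread from the standard library as a stand-in; the commit only says "thread pool", so the project's actual mechanism may differ. blocking_encode and its sleep are placeholders for real BM25 or PCA work.

```python
import asyncio
import time


def blocking_encode(text):
    """Placeholder for CPU- or IO-heavy synchronous work."""
    time.sleep(0.05)
    return len(text.split())


async def encode_async(text):
    # Runs the blocking function in a worker thread; the loop stays free.
    return await asyncio.to_thread(blocking_encode, text)


async def main():
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.005)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    result = await encode_async("a b c")
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass
    return result, ticks
```

The heartbeat task keeps ticking while the encode call runs, which it could not do if blocking_encode were called directly inside the coroutine.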
github-actions[bot] 4e61d73da5 bump: version 0.45.0 → 0.46.0 2025-11-22 18:40:24 +00:00
Chris Coutinho 3b41776110 Merge pull request #342 from cbcoutinho/feature/smithery
feat: Add Smithery stateless deployment support (ADR-016)
2025-11-22 19:39:53 +01:00
Chris Coutinho 3e3d38696c docs(smithery): Make Smithery the primary Quick Start option
Reorganize README to promote Smithery as the fastest way to get started:
- Quick Start now features Smithery one-click deployment
- Docker instructions moved to separate "Docker (Self-Hosted)" section
- Added note about Smithery's stateless mode limitations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 19:38:11 +01:00
Chris Coutinho 7b22e5be0f build: add smithery image to docker compose 2025-11-22 19:06:25 +01:00
Chris Coutinho 39fba49cfe fix(smithery): Add JSON Schema metadata to mcp-config endpoint
Add proper JSON Schema metadata fields per Smithery documentation:
- $schema: JSON Schema draft-07
- $id: Schema identifier URL
- title: Human-readable title
- description: Schema description
- x-query-style: "flat" (no nested objects in our schema)
- additionalProperties: false

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:28:05 +01:00
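Note on 39fba49cfe: a sketch of what a schema document carrying the listed metadata fields might look like, written as a Python dict. The field names come from the commit message and the three properties from the session config described elsewhere in this log; the $id URL, title, and description values are placeholders, not the project's real values.

```python
import json

MCP_CONFIG_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "https://example.com/.well-known/mcp-config",  # placeholder URL
    "title": "Example MCP session configuration",  # placeholder text
    "description": "Credentials supplied per session.",  # placeholder text
    "x-query-style": "flat",
    "type": "object",
    "properties": {
        "nextcloud_url": {"type": "string"},
        "username": {"type": "string"},
        "app_password": {"type": "string"},
    },
    "required": ["nextcloud_url", "username", "app_password"],
    "additionalProperties": False,
}

# Serialized form, as an endpoint would return it.
MCP_CONFIG_JSON = json.dumps(MCP_CONFIG_SCHEMA, indent=2)
```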
Chris Coutinho 706a15f0bc fix(smithery): Use container runtime pattern for config discovery
ADR-016: For container runtime deployment, Smithery does not auto-generate
the .well-known/mcp-config endpoint like it does for Python CLI runtime.

Changes:
- Remove [tool.smithery] from pyproject.toml (not used in container mode)
- Remove smithery_server.py (Python CLI runtime specific)
- Add .well-known/mcp-config endpoint to return JSON Schema config
- Add SmitheryConfigMiddleware to extract config from URL query params
- Use ContextVar to pass session config to tool handlers

The container runtime passes config as URL query parameters to /mcp:
  GET /mcp?nextcloud_url=...&username=...&app_password=...

Tested:
- All 164 unit tests passing
- Docker container builds successfully
- .well-known/mcp-config returns valid JSON Schema
- Health endpoints working

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:22:55 +01:00
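Note on 706a15f0bc: a minimal sketch of passing per-request config from URL query parameters to a tool handler through a ContextVar. Only the standard library is used; handle_request stands in for the middleware, and whoami_tool is a hypothetical handler. The query parameter names follow the commit message.

```python
import contextvars
from urllib.parse import parse_qs

# Holds the config for the request currently being handled.
session_config = contextvars.ContextVar("session_config", default=None)


def config_from_query(query_string):
    """Flatten a query string into a dict, keeping the first value per key."""
    parsed = parse_qs(query_string)
    return {key: values[0] for key, values in parsed.items()}


def handle_request(query_string, handler):
    """Middleware stand-in: set the config, run the handler, then restore."""
    token = session_config.set(config_from_query(query_string))
    try:
        return handler()
    finally:
        session_config.reset(token)


def whoami_tool():
    """Tool handler stand-in: reads config without receiving it as an argument."""
    config = session_config.get()
    if config is None:
        raise RuntimeError("no session config for this request")
    return f"{config['username']}@{config['nextcloud_url']}"
```

Resetting the token in a finally block keeps one request's credentials from leaking into the next.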
Chris Coutinho b8dc413b73 feat: Add Smithery CLI deployment support
- Add smithery package as dependency
- Create smithery_server.py with @smithery.server() decorator
- Add SmitheryConfigSchema for session config (nextcloud_url, username, app_password)
- Add [tool.smithery] section to pyproject.toml
- Remove manual .well-known/mcp-config endpoint (Smithery handles this)

Smithery CLI will automatically:
- Extract config schema from the decorated function
- Handle session config parsing from query parameters
- Make config accessible via ctx.session_config in tools

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 18:05:33 +01:00
Chris Coutinho 8d29ce0122 fix: Add Smithery lifespan and auth mode detection
- Add SmitheryAppContext dataclass for stateless mode
- Add app_lifespan_smithery() with minimal lifespan (no shared state)
- Update is_oauth_mode() to detect Smithery mode and return BasicAuth
- Use Smithery lifespan when SMITHERY_DEPLOYMENT=true
- Add .well-known/mcp-config endpoint for config discovery
- Skip document processors in Smithery mode (not enabled)

Fixes startup issues in Smithery mode where missing env credentials
would incorrectly trigger OAuth mode.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:48:53 +01:00
Chris Coutinho a272e7cbab build: Fix Dockerfile.smithery 2025-11-22 17:35:16 +01:00
Chris Coutinho ce55b239e2 build: Fix Dockerfile.smithery 2025-11-22 17:33:12 +01:00
Chris Coutinho 432ab73741 build: Add missing deps 2025-11-22 17:32:20 +01:00
Chris Coutinho f93d650992 feat: Implement ADR-016 Smithery stateless deployment mode
Adds support for Smithery hosted deployment with stateless operation:

- Add DeploymentMode enum with SELF_HOSTED and SMITHERY_STATELESS modes
- Add get_deployment_mode() to detect mode from SMITHERY_DEPLOYMENT env var
- Update get_client() to create per-request clients from session config
- Add conditional tool registration (skip semantic search in Smithery mode)
- Add conditional /app admin UI mounting (skip in Smithery mode)
- Create smithery.yaml with configSchema for user credentials
- Create Dockerfile.smithery for minimal stateless container
- Create smithery_main.py entrypoint for Smithery deployment

In Smithery mode:
- Users provide nextcloud_url, username, app_password via session config
- Each request creates a fresh NextcloudClient (no state between requests)
- Semantic search tools are disabled (no vector database)
- Admin UI (/app) is disabled (no webhooks, vector viz)

All existing self-hosted functionality remains unchanged.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:30:42 +01:00
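Note on f93d650992: a minimal sketch of mode detection and conditional feature registration. DeploymentMode, get_deployment_mode, and the SMITHERY_DEPLOYMENT variable are named in the commit message; the enum string values, the "true" comparison details, and enabled_features are assumptions for illustration.

```python
import os
from enum import Enum


class DeploymentMode(Enum):
    SELF_HOSTED = "self_hosted"
    SMITHERY_STATELESS = "smithery_stateless"


def get_deployment_mode(environ=None):
    """Pick the mode from SMITHERY_DEPLOYMENT; anything but 'true' is self-hosted."""
    env = os.environ if environ is None else environ
    flag = env.get("SMITHERY_DEPLOYMENT", "").strip().lower()
    if flag == "true":
        return DeploymentMode.SMITHERY_STATELESS
    return DeploymentMode.SELF_HOSTED


def enabled_features(mode):
    """Hypothetical helper: stateless mode skips features that need shared state."""
    features = {"core_tools"}
    if mode is DeploymentMode.SELF_HOSTED:
        features |= {"semantic_search", "admin_ui"}
    return features
```

Defaulting to SELF_HOSTED when the variable is unset matches the commit's statement that existing self-hosted behavior is unchanged.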
github-actions[bot] f9da19d1a1 bump: version 0.44.1 → 0.45.0 2025-11-22 16:14:35 +00:00
Chris Coutinho d2b6a26fe4 Merge pull request #341 from cbcoutinho/fix/async-await-and-pdf-metadata
fix: Async/await patterns, PDF metadata, and vector visualization improvements
2025-11-22 17:14:06 +01:00
Chris Coutinho 482ef89a73 docs: Add ADR-016 for Smithery stateless deployment
Add architecture decision record for supporting Smithery-hosted MCP
server in a stateless mode for multi-user public Nextcloud instances.

Key decisions:
- New SMITHERY_STATELESS deployment mode alongside SELF_HOSTED
- Session-based configuration (nextcloud_url, username, app_password)
- Feature subset excluding semantic search and background sync
- Admin UI (/app) excluded in Smithery mode
- Per-request client creation from session config

This enables users to try the MCP server without self-hosting
infrastructure while supporting multiple Nextcloud instances.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:13:18 +01:00
Chris Coutinho 34fd17ba55 fix: Use alpha_composite for proper RGBA highlight blending
Drawing directly with ImageDraw on RGBA mode doesn't blend alpha
properly. Use Image.alpha_composite() with a transparent overlay
to achieve correct semi-transparent highlight fills.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 17:04:29 +01:00
Chris Coutinho 8baa07db84 fix: Remove pymupdf.layout.activate() to fix page_chunks behavior
pymupdf.layout.activate() causes pymupdf4llm.to_markdown() to ignore the
page_chunks=True option, returning a single string instead of list[dict].
This broke per-page chunking needed for semantic search indexing.

See: https://github.com/pymupdf/pymupdf4llm/issues/323

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 16:58:35 +01:00
Chris Coutinho ba8a53803a refactor: Simplify PDF text extraction with single to_markdown call
Replace parallel per-page extraction with single to_markdown(page_chunks=True)
call. This is more efficient as pymupdf4llm can optimize internally for
full-document processing instead of making N separate calls for N pages.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 03:52:02 +01:00
Chris Coutinho 31fade9730 perf: Optimize PDF processing with parallel extraction and single-render highlights
Phase 1 - PDF Highlighting Optimization:
- Render each page ONCE instead of once per chunk (N chunks = 1 render, not N)
- Use PIL to draw bounding boxes on copied base images (fast) instead of
  re-rendering page via pymupdf (slow)
- Add _find_chunk_bbox() to extract bbox without modifying page

Phase 2 - Parallel Page Extraction:
- Use anyio task group with run_sync() for parallel page extraction
- Each page extracted in separate thread via anyio.to_thread.run_sync()
- Event loop stays responsive during extraction
- Remove obsolete _process_sync() method

Expected improvement: 30-50% reduction in total PDF processing time.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 03:11:56 +01:00
Chris Coutinho fffe483c02 fix: Centralize PDF processing and generate separate images per chunk
Previously, pymupdf4llm.to_markdown() was called twice - once in
PyMuPDFProcessor during indexing and again in PDFHighlighter during
visualization. Different image path lengths caused different character
offsets, leading to highlighted pages not matching their chunks.

Also fixed an issue where all chunks on the same page showed every highlight
instead of only their own. The original page contents are now restored
between chunks using xref stream caching.

Changes:
- Add PDFHighlighter class requiring pre-computed page_boundaries and
  full_text from document processor (no fallback extraction)
- Pass pre-computed data from processor to highlighter
- Extract page-relative portion of chunk text for cross-page chunks
- Add bounding box highlighting using text anchor search
- Run highlight generation in parallel with embedding/BM25
- Cache and restore page contents to isolate highlights per chunk

Results: Highlighting success rate improved from 51% to 95% (121/128).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-22 02:46:30 +01:00
Chris Coutinho 8c79993280 Merge pull request #334 from cbcoutinho/renovate/docker.io-library-redis-alpine
chore(deps): update docker.io/library/redis:alpine docker digest to 6cbef35
2025-11-21 14:24:54 +01:00
Chris Coutinho 8a0672a6be Merge pull request #339 from cbcoutinho/renovate/astral-sh-setup-uv-7.x
chore(deps): update astral-sh/setup-uv action to v7.1.4
2025-11-21 14:24:42 +01:00
Chris Coutinho 395f798ee2 Merge pull request #340 from cbcoutinho/renovate/ollama-1.x
chore(deps): update helm release ollama to v1.35.0
2025-11-21 14:24:26 +01:00
renovate-bot-cbcoutinho[bot] debff75221 chore(deps): update helm release ollama to v1.35.0 2025-11-21 11:09:18 +00:00
renovate-bot-cbcoutinho[bot] 4bf0a6c22e chore(deps): update astral-sh/setup-uv action to v7.1.4 2025-11-21 11:08:53 +00:00
Chris Coutinho fb025821cb Merge pull request #335 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-0.x
chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.11
2025-11-21 09:45:31 +01:00
Chris Coutinho ff880fd4c9 Merge pull request #338 from cbcoutinho/renovate/docker.io-library-nextcloud-32.x
chore(deps): update docker.io/library/nextcloud docker tag to v32.0.2
2025-11-21 09:34:20 +01:00
renovate-bot-cbcoutinho[bot] 03495d901d chore(deps): update docker.io/library/nextcloud docker tag to v32.0.2 2025-11-21 05:14:28 +00:00
github-actions[bot] 798958f20a bump: version 0.44.0 → 0.44.1 2025-11-21 00:39:23 +00:00
Chris Coutinho 699295c5be Merge pull request #336 from cbcoutinho/renovate/mcp-1.x
fix(deps): update dependency mcp to >=1.22,<1.23
2025-11-21 01:38:50 +01:00
Chris Coutinho a62a007c87 feat: Add context expansion to semantic search with chunk overlap removal
Implements optional context expansion for semantic search results that
fetches adjacent chunks (N-1 and N+1) from Qdrant to provide before/after
context. Removes configurable chunk overlap (default 200 chars) to avoid
duplicate text appearing in both context and excerpt.

Key changes:
- Add include_context and context_chars parameters to nc_semantic_search
  and nc_semantic_search_answer tools
- Implement Qdrant cache fast path for chunk retrieval (avoids re-fetching
  and re-parsing documents, especially important for PDFs)
- Add _get_chunk_by_index_from_qdrant() to fetch adjacent chunks
- Remove chunk overlap from before_context (last N chars) and after_context
  (first N chars) to prevent duplicate text
- Fetch context in parallel with anyio.Semaphore (max 20 concurrent)
- Pass through page_number from SearchResult to SemanticSearchResult
- Remove document-level deduplication (keep chunk-level dedup from algorithm)

Context expansion is opt-in via include_context=true parameter. When enabled:
- Populates has_context_expansion, marked_text, before_context, after_context
- Adds truncation flags when context exceeds context_chars limit
- Falls back to document fetch for legacy data with truncated excerpts
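
The overlap removal described above can be sketched as follows. This is a minimal illustration, assuming adjacent chunks share exactly `overlap` characters at each boundary; the function name `trim_context_overlap` is hypothetical and is not taken from context.py.

```python
def trim_context_overlap(before: str, after: str, overlap: int = 200) -> tuple[str, str]:
    """Drop the text that adjacent chunks share with the matched chunk.

    The tail of chunk N-1 repeats the head of chunk N, and the head of
    chunk N+1 repeats the tail of chunk N, so both are cut off.
    """
    if overlap <= 0:
        return before, after
    # Last `overlap` chars of the previous chunk duplicate the excerpt start.
    trimmed_before = before[:-overlap] if len(before) > overlap else ""
    # First `overlap` chars of the next chunk duplicate the excerpt end.
    trimmed_after = after[overlap:]
    return trimmed_before, trimmed_after
```

With the duplicates trimmed, concatenating before_context, the excerpt, and after_context should reproduce a contiguous span of the source text.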

Related: nextcloud_mcp_server/search/context.py:87-382,
         nextcloud_mcp_server/server/semantic.py:161-255
2025-11-21 01:02:22 +01:00
renovate-bot-cbcoutinho[bot] d4fc1de80d fix(deps): update dependency mcp to >=1.22,<1.23 2025-11-20 23:11:11 +00:00
renovate-bot-cbcoutinho[bot] 0902b5653f chore(deps): update ghcr.io/astral-sh/uv docker tag to v0.9.11 2025-11-20 23:10:47 +00:00
renovate-bot-cbcoutinho[bot] 0b6a02075c chore(deps): update docker.io/library/redis:alpine docker digest to 6cbef35 2025-11-20 23:10:43 +00:00
Chris Coutinho 7880a8de30 Merge pull request #333 from cbcoutinho/renovate/actions-checkout-6.x
chore(deps): update actions/checkout action to v6
2025-11-20 20:17:21 +01:00
renovate-bot-cbcoutinho[bot] 2abedd6b4b chore(deps): update actions/checkout action to v6 2025-11-20 17:12:30 +00:00
Chris Coutinho 5a251a99e6 fix: Set is_placeholder=False in processor to fix search filtering
The processor was not setting is_placeholder field when writing real
document chunks to Qdrant. This caused the placeholder filter to exclude
all documents (since None != False), resulting in 0 search results.

Now explicitly sets is_placeholder: False in payload when writing real
indexed chunks, allowing search filters to correctly distinguish between
placeholders and real documents.
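
A pure-Python stand-in for the filter behaviour, to show why a missing field hides real chunks. This is only a sketch: `passes_placeholder_filter` and the payloads are hypothetical, and it does not call qdrant-client.

```python
def passes_placeholder_filter(payload: dict) -> bool:
    """Stand-in for a filter that requires is_placeholder == False."""
    # A payload without the field yields None, and None is not False.
    return payload.get("is_placeholder") is False


legacy_chunk = {"doc_id": 1, "excerpt": "text"}  # field never written
fixed_chunk = {"doc_id": 1, "excerpt": "text", "is_placeholder": False}
placeholder = {"doc_id": 2, "is_placeholder": True}

visible = [
    p
    for p in (legacy_chunk, fixed_chunk, placeholder)
    if passes_placeholder_filter(p)
]
```

Only the chunk that carries an explicit `is_placeholder: False` survives the filter.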
2025-11-20 17:15:19 +01:00
Chris Coutinho 25ef33de7f feat: Use Ollama native batch API in embed_batch()
- Switch from sequential loop to /api/embed batch endpoint
- Use 'input' array parameter instead of individual 'prompt' requests
- Process in chunks of 32 to avoid quality degradation (issue #6262)
- Reduces HTTP overhead: 128 texts = 4 requests instead of 128
- Maintains backward compatibility with embed() for single embeddings
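
A minimal sketch of the batching arithmetic, assuming request bodies of the form `{"model": ..., "input": [...]}` for /api/embed. The helper names and the model name in the usage note are hypothetical; no HTTP call is made here.

```python
from typing import Iterator, Sequence


def batched(texts: Sequence[str], batch_size: int = 32) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start : start + batch_size]


def build_embed_payloads(
    model: str, texts: Sequence[str], batch_size: int = 32
) -> list[dict]:
    """Build one request body per batch, using the 'input' array parameter."""
    return [
        {"model": model, "input": list(batch)}
        for batch in batched(texts, batch_size)
    ]
```

For example, `build_embed_payloads("some-embedding-model", texts)` with 128 texts produces 4 request bodies of 32 inputs each.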

Ref: ollama/ollama#6262
2025-11-20 16:50:13 +01:00
Chris Coutinho ec2c274cd9 fix: Increase placeholder staleness threshold to 5x scan interval
- Changed from 2x (120s) to 5x (300s) scan interval
- Large PDFs take 3-4 minutes to process, need longer threshold
- Prevents premature requeuing of in-flight documents
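
The staleness check can be sketched like this, assuming a 60s scan interval (implied by 5x = 300s). The function name, parameters, and timestamp source are hypothetical, not the scanner's actual API.

```python
from datetime import datetime, timedelta, timezone


def is_placeholder_stale(
    queued_at: datetime,
    now: datetime,
    scan_interval_s: int = 60,
    multiplier: int = 5,
) -> bool:
    """Return True when a placeholder is old enough to requeue its document."""
    threshold = timedelta(seconds=scan_interval_s * multiplier)
    return now - queued_at > threshold
```

A PDF queued 200 seconds ago is stale under the old 2x threshold but still counts as in-flight under the 5x threshold.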
2025-11-20 15:36:49 +01:00
Chris Coutinho 47f0b3db9a fix: Add placeholder staleness check to prevent duplicate processing
- Only requeue documents if placeholder is older than 2x scan interval (120s default)
- Prevents scanner from immediately requeuing in-flight documents
- Fixes issue where PDFs were being reprocessed every 60 seconds
- Staleness check applied to both notes and files scanning logic
2025-11-20 15:30:10 +01:00
Chris Coutinho 233de3508f fix: Use empty SparseVector instead of None for placeholders
Qdrant validation rejects None for sparse vectors in named vector dicts.
Use models.SparseVector(indices=[], values=[]) instead to create valid
empty sparse vectors for placeholder points.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 15:15:10 +01:00
Chris Coutinho 13b2d0048c feat: Implement Qdrant placeholder state management
Introduces a placeholder-based state tracking system to prevent duplicate
document processing during the gap between scanner queuing and processor
completion.

**Key Changes:**

1. **Placeholder Helper Functions** (`vector/placeholder.py`):
   - `write_placeholder_point()` - Creates zero-vector placeholder when queuing
   - `query_document_metadata()` - Queries for existing entry (placeholder or real)
   - `delete_placeholder_point()` - Removes placeholder before writing real vectors
   - `get_placeholder_filter()` - Filters placeholders from user-facing queries

2. **Scanner Updates** (`vector/scanner.py`):
   - Replace `indexed_at` comparison with `modified_at` comparison
   - Write placeholder before queuing each document
   - Query per-document metadata instead of bulk-querying indexed_at
   - Fixes bug where files were resubmitted every scan cycle

3. **Processor Updates** (`vector/processor.py`):
   - Delete placeholder before upserting real vectors
   - Ensures no duplicate points in Qdrant

4. **Query Filters** (all search files):
   - Add `get_placeholder_filter()` to all user-facing queries
   - Ensures placeholders never appear in search results or visualizations
   - Applied to: bm25_hybrid.py, semantic.py, viz_routes.py, algorithms.py

**Architecture:**
- Placeholders use zero vectors with dimension from embedding service
- Payload includes `is_placeholder: True` flag for filtering
- Status field tracks: "pending", "processing", "completed", "failed"
- Deterministic UUIDs using uuid5 for consistent point IDs
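
A sketch of deterministic point IDs with `uuid.uuid5`. The name format and the choice of `uuid.NAMESPACE_URL` are assumptions for illustration; the commit does not state which namespace or key layout placeholder.py uses.

```python
import uuid


def point_id(doc_id: int, doc_type: str, chunk_index: int = 0) -> str:
    """Derive a stable point ID from the document identity.

    uuid5 hashes (namespace, name), so the same inputs always give the
    same UUID and a later upsert overwrites the earlier point.
    """
    name = f"{doc_type}:{doc_id}:{chunk_index}"  # hypothetical key layout
    return str(uuid.uuid5(uuid.NAMESPACE_URL, name))
```

Because the ID is a pure function of its inputs, the scanner and the processor can compute it independently without sharing state.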

**Impact:**
- Eliminates duplicate processing of same documents
- Fixes race condition where long-running documents get queued multiple times
- Prevents scanner from resubmitting files every scan cycle
- Maintains clean separation between in-flight and indexed documents

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 15:04:00 +01:00
Chris Coutinho 944dd760ca fix: Return empty array instead of null for query_coords when no results
When vector visualization search returns zero results, the code was returning
query_coords: null, which caused JavaScript error "can't access property 0,
queryCoords is null" when the frontend tried to access the array.

Changed to return empty array [] to match expected type and prevent crash.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 14:18:02 +01:00
Chris Coutinho d67aa6ae5c fix: Align PDF text extraction between indexing and context expansion
This commit fixes two bugs in PDF processing and adds diagnostic logging:

1. **Text extraction mismatch (context expansion bug)**:
   - Indexing used pymupdf4llm.to_markdown() producing markdown text
   - Context expansion used page.get_text() producing plain text
   - Different text formats caused character offset misalignment
   - Search would find correct chunk, but expansion showed wrong section
   - Fixed by making context.py use pymupdf4llm.to_markdown() consistently

2. **Diagnostic logging for page number assignment**:
   - Added logging to verify page_boundaries exist in metadata
   - Added logging to verify assign_page_numbers() assigns values
   - Helps diagnose why page numbers show as null in search results

3. **mime_type storage bug**:
   - Fixed incorrect field reference in processor.py:405
   - Was using file_metadata.get("content_type", "")
   - Should use content_type from WebDAV response

Changes:
- nextcloud_mcp_server/search/context.py: Use pymupdf4llm.to_markdown()
  for PDF text extraction to match indexing method
- nextcloud_mcp_server/vector/processor.py: Add diagnostic logging for
  page boundaries and assignment, fix mime_type storage
- tests/unit/client/test_webdav.py: Fix import sorting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 13:57:50 +01:00
Chris Coutinho f1a5fac1b9 fix: Update models and viz to use int-only doc_id
- algorithms.py: Revert SearchResult.id to int (all docs use int IDs now)
- semantic.py: Revert SemanticSearchResult.id to int, remove Union import
- viz_routes.py: Remove str() conversion when querying doc_id from Qdrant
- viz_routes.py: Convert doc_id from query param to int in chunk context

Fixes vector visualization which was collapsing all chunks to a single
point because Qdrant queries were failing to match doc_id (string vs int).
2025-11-20 12:32:27 +01:00
Chris Coutinho d0691d5aa0 feat: Switch files to use numeric IDs with file_path resolution
- scanner.py: Use file_info['id'] as doc_id instead of file_path
- scanner.py: Pass file_path in DocumentTask for content retrieval
- processor.py: Store file_path in Qdrant payload for later lookup
- context.py: Add _get_file_path_from_qdrant() to resolve file_id → file_path
- context.py: Update get_chunk_with_context() to handle file ID resolution

This makes the system resilient to file renames since file IDs are stable
identifiers in Nextcloud, while file paths can change.
2025-11-20 12:00:47 +01:00
Chris Coutinho f1610bbd2e fix: Reconstruct full content for notes to match indexed offsets
Notes are indexed as "{title}\n\n{content}" in processor.py but were
being retrieved as just content during chunk expansion, causing
chunk_start_offset and chunk_end_offset to be misaligned.

This fix reconstructs the full content structure when fetching notes
for chunk expansion, ensuring the displayed chunks match the excerpts
shown in search results.
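
A small illustration of the misalignment, using a made-up note. `indexed_text` is a hypothetical helper that mirrors the "{title}\n\n{content}" layout named above.

```python
def indexed_text(title: str, content: str) -> str:
    """Rebuild the text exactly as it was laid out at indexing time."""
    return f"{title}\n\n{content}"


title, content = "Groceries", "Buy milk and eggs."
full = indexed_text(title, content)

# Offsets are recorded against the full indexed text.
start = full.index("milk")
end = start + len("milk")

correct = full[start:end]     # slice of the reconstructed text
shifted = content[start:end]  # slice of the bare content, off by the title prefix
```

The shift equals `len(title) + 2`, which is why slicing the bare content returns text further along in the note.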

Fixes chunk/excerpt mismatch reported in vector visualization.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 11:33:12 +01:00
Chris Coutinho 327d843f64 feat: Implement per-chunk vector visualization with context expansion
Major improvements to vector visualization page:
- Refactor PCA to display individual chunks instead of averaged documents
- Add context expansion module for fetching surrounding text from notes and PDFs
- Update deduplication to use (doc_id, doc_type, chunk_start, chunk_end) keys
- Fix Alpine.js rendering with chunk-specific keys including offsets
- Refactor authentication helper to return NextcloudClient for better reuse
- Add async context manager support to NextcloudClient

Technical details:
- viz_routes.py: Fetch specific chunk vectors instead of averaging per document
- context.py: New module supporting both notes and PDF text extraction via PyMuPDF
- search algorithms: Extract page_number, chunk_index, total_chunks from Qdrant
- vector-viz.js/html: Use chunk positions in expansion tracking keys

This enables users to see which specific chunks match their query
and view them with surrounding context in the PCA visualization.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 11:22:20 +01:00
Chris Coutinho b8010270c1 fix: Add async/await, PDF metadata, and type safety fixes
This commit addresses multiple issues with async operations, PDF metadata
extraction, and type safety in document processing and search.

## Async/Await Fixes
- processor.py:259 - Added await for chunker.chunk_text(content)
- processor.py:270 - Added await for bm25_service.encode_batch(chunk_texts)
- tests/unit/test_document_chunker.py - Converted all 12 test methods to async

## PDF Metadata Enhancement
- pymupdf.py:143 - Added file_size metadata extraction
- pymupdf.py:145-206 - Refactored to extract text page-by-page
  - Manually loop through pages instead of using page_chunks=True
  - Generate page_boundaries metadata for precise page tracking
  - Works around pymupdf.layout.activate() breaking page_chunks=True
- processor.py:32-66 - Added assign_page_numbers() helper function
  - Assigns page numbers to chunks based on overlap with page boundaries
  - Handles chunks spanning multiple pages
- processor.py:298-300 - Call assign_page_numbers() for PDF files
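
One way overlap-based page assignment could work, as a sketch only. The boundary format (`page`, `start`, `end` keys with half-open character ranges) and the largest-overlap rule for chunks spanning pages are assumptions, not the actual assign_page_numbers() implementation.

```python
from typing import Optional


def assign_page(
    chunk_start: int, chunk_end: int, page_boundaries: list[dict]
) -> Optional[int]:
    """Return the page whose character range overlaps the chunk the most."""
    best_page: Optional[int] = None
    best_overlap = 0
    for boundary in page_boundaries:
        # Length of the intersection of [chunk_start, chunk_end) and the page.
        overlap = min(chunk_end, boundary["end"]) - max(chunk_start, boundary["start"])
        if overlap > best_overlap:
            best_page, best_overlap = boundary["page"], overlap
    return best_page
```

A chunk that straddles two pages is attributed to the page holding most of its characters; a chunk outside every boundary gets `None`.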

## Type Safety Fixes
- bm25_hybrid.py:184 - Removed int() conversion of doc_id
- semantic.py:131 - Removed int() conversion of doc_id
- viz_routes.py:275 - Removed int() conversion of doc_id
- Added comments documenting that doc_id can be int (notes) or str (file paths)

## Testing
- All 18 tests passing (12 unit + 6 integration)
- No type errors in modified files
- Container logs show successful processing
- Vector viz searches working correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 02:37:07 +01:00
Chris Coutinho 0f24bdb17a docs: Add svg 2025-11-19 23:44:23 +01:00
github-actions[bot] bf11f16e2f bump: version 0.43.0 → 0.44.0 2025-11-19 22:43:03 +00:00
Chris Coutinho bf05ff8d6e Merge pull request #329 from cbcoutinho/feature/nextcloud-ui-improvements
feat: Redesign UI and improve vector visualization
2025-11-19 23:42:32 +01:00
Chris Coutinho c4ce28f05d fix: Improve 3D plot rendering with explicit dimensions and window resize support
- Get container dimensions before creating Plotly layout to render at correct size immediately
- Add init() method with window resize listener for responsive plot sizing
- Remove post-render resize call (no longer needed with explicit dimensions)
- Improve colorbar positioning and scene domain configuration

This eliminates the visual "jump" during initial render and ensures the plot resizes smoothly when the browser window changes size.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 19:43:20 +01:00
Chris Coutinho 9b2a06964b Merge pull request #331 from cbcoutinho/renovate/commitizen-tools-commitizen-action-0.x
chore(deps): update commitizen-tools/commitizen-action action to v0.26.0
2025-11-19 14:42:06 +01:00
Chris Coutinho c126c3ec03 fix: Preserve 3D plot camera and improve documentation
This commit addresses PR feedback and fixes plot camera behavior.

## JavaScript Fix - Camera Preservation
- Changed plot update strategy from recreating layout to using Plotly.restyle()
- Query point visibility now toggles via restyle() which only modifies trace visibility
- Camera position/zoom naturally preserved since layout remains untouched
- Resolves jumpy plot behavior when toggling "Show Query Point" checkbox

Related: nextcloud_mcp_server/auth/static/vector-viz.js:58-73

## Documentation Improvements
- Condensed vector-sync-ui.md from 316 to 94 lines (~70% reduction)
- Removed redundant FAQ section (content merged into main sections)
- Simplified use cases from 4 detailed sections to 3 focused paragraphs
- Streamlined troubleshooting to 3 common issues
- Merged technical details into overview section
- Retained all essential information while improving readability

## Screenshot Updates
Removed old/outdated images (5 files):
- rag-workflow-bidirectional-final.png
- rag-workflow-prominent-llm.png
- rag-workflow-simple-final.png
- vector-viz-interface.png
- welcome-page.png

Replaced with current screenshots (3 files):
- vector-viz-document-types-2col.png - Now shows plot + results
- vector-viz-chunk-context.png - Centered content view
- vector-viz-results.png - Updated results list

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 14:10:53 +01:00
Chris Coutinho 9bd02d7ef7 fix: Preserve 3D plot camera position and fix CSS loading
Two fixes for the vector visualization page:

1. **CSS Loading Fix**: Moved CSS <link> from vector_viz.html fragment
   to user_info.html <head> block. HTMX fragments don't process <link>
   tags in <head>, which left the page unstyled. Now CSS loads correctly.

2. **Camera Preservation**: Modified renderPlot() to preserve camera
   position when toggling query point visibility. Previously, toggling
   the "Show Query Point" checkbox would reset zoom/rotation to default.
   Now reads existing camera settings from plot before updating.

Related: nextcloud_mcp_server/auth/static/vector-viz.js:123-130
Related: nextcloud_mcp_server/auth/templates/user_info.html:12

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 13:51:08 +01:00
renovate-bot-cbcoutinho[bot] e38a830f02 chore(deps): update commitizen-tools/commitizen-action action to v0.26.0 2025-11-19 11:07:37 +00:00
Chris Coutinho 18b753c3c7 Merge pull request #330 from cbcoutinho/renovate/docker.io-library-nextcloud-32.0.1
chore(deps): update docker.io/library/nextcloud:32.0.1 docker digest to d572839
2025-11-19 09:57:27 +01:00
renovate-bot-cbcoutinho[bot] b0735bae85 chore(deps): update docker.io/library/nextcloud:32.0.1 docker digest to d572839 2025-11-19 05:08:00 +00:00
Chris Coutinho 53689d076b feat: Improve vector visualization with static assets and fixes
- Extract CSS and JavaScript into separate static files
  - Created nextcloud_mcp_server/auth/static/vector-viz.css
  - Created nextcloud_mcp_server/auth/static/vector-viz.js
  - Updated templates to reference external assets

- Fix vector visualization issues:
  - Normalize vectors before PCA to match Qdrant's cosine distance
  - Add zero-norm and NaN detection/handling for large datasets
  - Enable responsive Plotly sizing (autosize + responsive config)
  - Widen plot area to full viewport width with minimized margins

- Improve visualization accuracy:
  - Query point now positioned correctly relative to documents
  - Handles 200+ points without JSON serialization errors
  - Full-width plot maximizes screen space utilization
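
A dependency-free sketch of the normalization step with zero-norm and NaN handling. Skipping bad rows and returning the surviving indices is an assumption; the commit does not say how the real code treats such vectors.

```python
import math


def normalize_rows(vectors: list[list[float]]) -> tuple[list[int], list[list[float]]]:
    """L2-normalize each vector, skipping rows that cannot be normalized.

    Returns the indices of the kept rows alongside the unit vectors so
    callers can keep metadata aligned with the surviving points.
    """
    kept: list[int] = []
    unit_vectors: list[list[float]] = []
    for index, vec in enumerate(vectors):
        if any(math.isnan(x) or math.isinf(x) for x in vec):
            continue  # non-finite values would poison the projection
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0.0:
            continue  # a zero vector has no direction to normalize
        kept.append(index)
        unit_vectors.append([x / norm for x in vec])
    return kept, unit_vectors
```

On unit vectors, Euclidean distance is a monotonic function of cosine distance, so a projection of normalized vectors better reflects the ordering the cosine metric produces.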

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 04:10:44 +01:00
Chris Coutinho 0f7d6c0e33 Merge pull request #327 from cbcoutinho/renovate/docker.io-library-python-3.12-slim-trixie
chore(deps): update docker.io/library/python:3.12-slim-trixie docker digest to 2e683fc
2025-11-19 01:53:05 +01:00
Chris Coutinho 16701fdb72 Merge pull request #328 from cbcoutinho/renovate/docker.io-library-redis-alpine
chore(deps): update docker.io/library/redis:alpine docker digest to 5013e94
2025-11-19 01:52:57 +01:00
Chris Coutinho 9db20a4d01 feat: Redesign UI to match Nextcloud ecosystem aesthetic
This commit updates the web interface to better align with Nextcloud's
design system and improve the Vector Viz layout.

Changes:
- Replace emoji icons with Material Design SVG icons for better
  consistency with Nextcloud apps
- Simplify navigation styling with minimal padding and subtle active
  states (250px width)
- Update CSS variables to match Nextcloud design system
- Restructure Vector Viz from two-column to single-column vertical
  layout for better plot visibility
- Move search controls to compact horizontal grid at top
- Make navigation toggle always visible (not just on mobile)
- Fix plot container sizing with overflow:visible to prevent colorbar
  clipping
- Remove heavy shadows and custom card styling for cleaner aesthetic
- Add error and success page templates with consistent styling

Technical details:
- Preserve Alpine.js for reactive functionality
- Use CSS Grid for responsive horizontal controls layout
- Add smooth transitions for navigation collapse/expand
- Maintain HTMX for dynamic content loading

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 00:45:19 +01:00
renovate-bot-cbcoutinho[bot] 7ddf8370e6 chore(deps): update docker.io/library/redis:alpine docker digest to 5013e94 2025-11-18 23:10:41 +00:00
renovate-bot-cbcoutinho[bot] 98dff98e9c chore(deps): update docker.io/library/python:3.12-slim-trixie docker digest to 2e683fc 2025-11-18 23:10:36 +00:00
Chris Coutinho 73e8012707 Merge pull request #325 from cbcoutinho/renovate/docker.io-library-python-3.12-slim-trixie
chore(deps): update docker.io/library/python:3.12-slim-trixie docker digest to 2bbc83f
2025-11-18 14:06:14 +01:00
Chris Coutinho c2fd87a5d3 Merge pull request #324 from cbcoutinho/renovate/docker.io-library-nextcloud-32.0.1
chore(deps): update docker.io/library/nextcloud:32.0.1 docker digest to f6232ea
2025-11-18 14:03:38 +01:00
github-actions[bot] 441d94301e bump: version 0.42.0 → 0.43.0 2025-11-18 12:56:15 +00:00
Chris Coutinho b488d69939 Merge pull request #326 from cbcoutinho/feature/notes2
feat: Replace custom document chunker with LangChain MarkdownTextSplitter
2025-11-18 13:55:34 +01:00
Chris Coutinho eec923eff5 feat: Replace custom document chunker with LangChain MarkdownTextSplitter
Migrates from custom word-based chunking to LangChain's MarkdownTextSplitter
for better semantic search quality. This implements the chunking portion of
ADR-011.

Changes:
- Replace custom regex word chunker with MarkdownTextSplitter
- Optimized for Markdown content (headers, code blocks, lists)
- Convert from word-based (512 words) to character-based (2048 chars) chunking
- Maintain backward-compatible ChunkWithPosition interface
- Update configuration defaults and validation
- Update all unit tests (12/12 passing)

Benefits:
- Respects markdown structure boundaries
- Never breaks code blocks or headers mid-chunk
- Preserves semantic coherence within chunks
- Expected 20-30% improvement in recall quality
- Industry-standard approach (used by production RAG systems)

Note: Full reindex required to apply new chunking to existing documents.
Current vector database still contains old word-based chunks.

Related: ADR-011 (Improving Semantic Search Quality)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-18 12:17:23 +01:00
renovate-bot-cbcoutinho[bot] 3642faf32c chore(deps): update docker.io/library/python:3.12-slim-trixie docker digest to 2bbc83f 2025-11-18 11:08:08 +00:00
renovate-bot-cbcoutinho[bot] 3b1cd96722 chore(deps): update docker.io/library/nextcloud:32.0.1 docker digest to f6232ea 2025-11-18 11:08:03 +00:00
Chris Coutinho 219d064459 Merge pull request #321 from cbcoutinho/renovate/pin-dependencies
chore(deps): pin ghcr.io/astral-sh/uv docker tag to 29bd450
2025-11-18 00:15:32 +01:00
Chris Coutinho d0ab8d071a Merge pull request #322 from cbcoutinho/renovate/actions-checkout-digest
chore(deps): update actions/checkout digest to 93cb6ef
2025-11-18 00:15:20 +01:00
Chris Coutinho b792e9d9a3 Merge pull request #323 from cbcoutinho/renovate/docker.io-library-mariadb-lts
chore(deps): update docker.io/library/mariadb:lts docker digest to 1cac849
2025-11-18 00:14:46 +01:00
renovate-bot-cbcoutinho[bot] 4288814ff4 chore(deps): update docker.io/library/mariadb:lts docker digest to 1cac849 2025-11-17 23:11:14 +00:00
renovate-bot-cbcoutinho[bot] f34a1c5677 chore(deps): update actions/checkout digest to 93cb6ef 2025-11-17 23:11:10 +00:00
renovate-bot-cbcoutinho[bot] 6d48f90112 chore(deps): pin ghcr.io/astral-sh/uv docker tag to 29bd450 2025-11-17 23:11:04 +00:00
Chris Coutinho b72aeca55f test: Add custom notes app 2025-11-17 22:14:01 +01:00
Chris Coutinho c1ae818b75 Merge pull request #317 from cbcoutinho/renovate/ghcr.io-astral-sh-uv-latest
chore(deps): update ghcr.io/astral-sh/uv:latest docker digest to 29bd450
2025-11-17 19:40:24 +01:00
Chris Coutinho ebca2bfc70 build: pin uv to 0.9.10, use --no-cache 2025-11-17 19:33:15 +01:00
Chris Coutinho 6dcd0bae48 Merge pull request #318 from cbcoutinho/renovate/actions-checkout-5.x
chore(deps): update actions/checkout action to v5.0.1
2025-11-17 19:23:32 +01:00
Chris Coutinho 818f643dca Merge pull request #319 from cbcoutinho/renovate/qdrant-1.x
chore(deps): update helm release qdrant to v1.16.0
2025-11-17 19:23:25 +01:00
Chris Coutinho d31b490f13 Merge pull request #320 from cbcoutinho/renovate/qdrant-qdrant-1.x
chore(deps): update qdrant/qdrant docker tag to v1.16.0
2025-11-17 19:23:16 +01:00
renovate-bot-cbcoutinho[bot] 839cf159b8 chore(deps): update qdrant/qdrant docker tag to v1.16.0 2025-11-17 17:09:02 +00:00
renovate-bot-cbcoutinho[bot] cefb438017 chore(deps): update helm release qdrant to v1.16.0 2025-11-17 17:08:54 +00:00
renovate-bot-cbcoutinho[bot] efc78a835e chore(deps): update actions/checkout action to v5.0.1 2025-11-17 17:08:34 +00:00
renovate-bot-cbcoutinho[bot] fa25a1b4df chore(deps): update ghcr.io/astral-sh/uv:latest docker digest to 29bd450 2025-11-17 17:08:28 +00:00
github-actions[bot] 8367208a03 bump: version 0.41.0 → 0.42.0 2025-11-17 07:25:33 +00:00
Chris Coutinho 52acc4bc07 Merge pull request #316 from cbcoutinho/feature/cleanup
feat(viz): Add dual-score display and improve UI controls
2025-11-17 08:25:04 +01:00
Chris Coutinho d374bfa1e5 feat(viz): Add dual-score display and improve UI controls
This commit enhances the vector visualization interface with better score
transparency and improved UX:

**Dual-Score Display:**
- Store original algorithm scores before normalization (viz_routes.py:203)
- Display both raw and normalized scores: "Raw Score: 0.842 (89% relative)"
- Update plot hover text with dual scores (userinfo_routes.py:740)
- Fixes issue where all queries showed at least one 100% match regardless
  of actual relevance (normalization artifact)

**UI Improvements:**
1. Fusion Method dropdown: Changed from x-show to :disabled
   - Prevents jarring layout shift when switching algorithms
   - Dropdown stays visible but grayed out when Semantic is selected
   - Better UX with opacity: 0.5 and cursor: not-allowed

2. Score Threshold: Changed step from 0.1 to "any"
   - Allows arbitrary float precision (0.7, 0.85, 0.123)
   - Users can now fine-tune threshold values

3. Document Types: Converted multi-select to checkbox grid
   - Replaced clunky Ctrl/Cmd multi-select listbox
   - Checkbox grid with cleaner layout
   - Positioned left of Score Threshold and Result Limit inputs
   - More intuitive UX

**Technical Details:**
- Raw score ranges vary by algorithm:
  - Semantic: 0.0-1.0 (cosine similarity)
  - BM25 RRF: ~0.001-0.033 (Reciprocal Rank Fusion)
  - BM25 DBSF: Can exceed 1.0 (Distribution-Based Score Fusion)
- Normalized scores (0-1) used for visual encoding (marker size, color)
- Original scores preserved in API response via getattr fallback
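
A sketch of why normalization alone hides weak matches, and of the dual-score label. Dividing by the maximum is an assumption about the normalization used; both function names are hypothetical.

```python
def normalize_scores(raw_scores: list[float]) -> list[float]:
    """Scale scores relative to the best hit; the top hit always becomes 1.0."""
    if not raw_scores:
        return []
    top = max(raw_scores)
    if top <= 0:
        return [0.0 for _ in raw_scores]
    return [score / top for score in raw_scores]


def format_score(raw: float, relative: float) -> str:
    """Render the raw score next to its relative (normalized) value."""
    return f"Raw Score: {raw:.3f} ({relative:.0%} relative)"
```

Raw scores of 0.02 and 0.01 normalize to 100% and 50%, exactly like 0.9 and 0.45 would, which is why the raw value is shown as well.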

Files modified:
- nextcloud_mcp_server/auth/viz_routes.py (store original_score)
- nextcloud_mcp_server/auth/templates/vector_viz.html (UI controls)
- nextcloud_mcp_server/auth/userinfo_routes.py (plot hover text)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 08:05:49 +01:00
github-actions[bot] b1f7b1d30b bump: version 0.40.0 → 0.41.0 2025-11-17 05:57:12 +00:00
Chris Coutinho b8bdbb499f Merge pull request #315 from cbcoutinho/feature/cleanup
Feature/cleanup
2025-11-17 06:56:43 +01:00
Chris Coutinho 2522b13d35 ci: Add unit tests to ci 2025-11-17 06:51:40 +01:00
Chris Coutinho 6cfd7e2729 feat: add configurable fusion algorithms for BM25 hybrid search
Added support for two fusion algorithms (RRF and DBSF) to combine dense
semantic and sparse BM25 search results, with comprehensive documentation
and unit tests.

Changes:
- Added fusion parameter to nc_semantic_search and nc_semantic_search_answer tools
- Updated ADR-014 with detailed comparison of RRF vs DBSF fusion algorithms
- Added unit tests for fusion algorithm initialization and validation
- Updated search_method in responses to include fusion type (e.g., "bm25_hybrid_rrf")

Fusion Algorithms:
- RRF (Reciprocal Rank Fusion): Default, rank-based, general-purpose
- DBSF (Distribution-Based Score Fusion): Score normalization using statistics

RRF is recommended for most use cases due to its robustness and established
track record. DBSF may provide better results when retrieval systems have
very different score distributions.
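
For reference, a standalone sketch of Reciprocal Rank Fusion. The constant k = 60 is the conventional value from the RRF literature and is an assumption here; the fusion in this project runs inside the search backend, whose constant and tie-breaking may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked lists by summing 1 / (k + rank) per document.

    Only ranks matter, so the dense and sparse retrievers can use
    completely different score scales.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A document ranked first by both retrievers scores 2/61 under these assumptions, and a document found by only one retriever can never outrank it.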

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 06:48:43 +01:00
Chris Coutinho 3aa7128f45 feat: add chunk position tracking to vector indexing and search
Track character offsets (start_offset, end_offset) for each chunk in vector
database metadata, enabling precise chunk highlighting in visualization pane.

Changes:
- processor.py: Store chunk_start_offset and chunk_end_offset in Qdrant metadata
- processor.py: Added metadata_version=2 to indicate position tracking support
- search/semantic.py: Return chunk positions from search results
- server/semantic.py: Expose chunk positions in API responses (SemanticSearchResult)

Enables viz pane to:
1. Display exact matched chunk with surrounding context
2. Highlight the precise portion of text that matched the query
3. Build user trust by showing what the RAG system actually retrieved

Position tracking uses ChunkWithPosition dataclass from document_chunker.py
which provides character-accurate offsets in the original document.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 06:47:58 +01:00
Chris Coutinho c3282534eb feat: add vector viz template and chunk context endpoint
Extracted vector visualization HTML template to separate file to resolve
syntax conflicts between Jinja2, Alpine.js, and CSS. Added chunk context
endpoint for fetching matched chunks with surrounding text.

Changes:
- Moved vector_viz.html to templates/ directory (separates Jinja2/Alpine.js/CSS)
- Added /app/chunk-context endpoint for retrieving chunk text with context
- Updated .dockerignore to include HTML files in Docker builds
- Moved anthropic and boto3 to main dependencies (needed for production features)
- Added jinja2 dependency for template rendering

Fixes Jinja2 TemplateSyntaxError caused by CSS colons being parsed as
Jinja2 syntax when template was inline in Python code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 06:46:52 +01:00
Chris Coutinho 862308418e fix: prevent infinite loop in DocumentChunker with position tracking
Fixed a critical infinite loop bug in document_chunker.py that occurred
when the overlap parameter caused the chunker to not make forward progress.

Changes:
- Added ChunkWithPosition dataclass to track character positions
- Refactored chunk_text() to use regex word matching for accurate position tracking
- Added safety check to ensure forward progress (next_start_idx > start_idx)
- Changed return type from list[str] to list[ChunkWithPosition]

The bug manifested when:
1. end_idx reached len(word_matches) (processing last chunk)
2. next_start_idx = end_idx - overlap would not advance past start_idx
3. Loop would continue indefinitely without making progress

Fix ensures chunker always terminates by breaking when not advancing.
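
The forward-progress guard can be sketched as follows. This is a simplified illustration, not the code in document_chunker.py: the `ChunkWithPosition` field names, the `\S+` word pattern, and the default sizes are assumptions.

```python
import re
from dataclasses import dataclass


@dataclass
class ChunkWithPosition:
    text: str
    start_offset: int  # character offset of the first word in the chunk
    end_offset: int    # character offset just past the last word


def chunk_text(text: str, chunk_size: int = 5, overlap: int = 2) -> list[ChunkWithPosition]:
    """Split text into word-based chunks while tracking character offsets."""
    word_matches = list(re.finditer(r"\S+", text))
    chunks: list[ChunkWithPosition] = []
    start_idx = 0
    while start_idx < len(word_matches):
        end_idx = min(start_idx + chunk_size, len(word_matches))
        start_offset = word_matches[start_idx].start()
        end_offset = word_matches[end_idx - 1].end()
        chunks.append(
            ChunkWithPosition(text[start_offset:end_offset], start_offset, end_offset)
        )
        if end_idx >= len(word_matches):
            break  # last chunk emitted
        next_start_idx = end_idx - overlap
        if next_start_idx <= start_idx:
            # Safety check: overlap >= chunk_size would never advance.
            # Note that breaking here drops any remaining words.
            break
        start_idx = next_start_idx
    return chunks
```

With `chunk_size=3` and `overlap=1`, consecutive chunks share one word, and slicing the original text with each chunk's offsets returns the chunk text exactly.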

All 9 unit tests now pass in 1.66s (previously timing out at 180s).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 06:39:15 +01:00
Chris Coutinho 3464b21845 fix: Relax SearchResult validation to support DBSF fusion scores > 1.0
Fix false-positive validation error where DBSF (Distribution-Based Score
Fusion) correctly produces scores > 1.0 but SearchResult validation
incorrectly rejected them.

**Root Cause**: SearchResult.__post_init__() enforced scores in [0.0, 1.0]
range, but DBSF sums normalized scores from multiple retrieval systems
(dense semantic + sparse BM25), resulting in scores like 1.55 when both
systems strongly agree a document is relevant.

**Changes**:
- Relaxed validation to allow any score ≥ 0.0 (algorithms.py:147-157)
- Updated SearchResult and SemanticSearchResult documentation to explain
  score ranges for RRF ([0.0, 1.0]) vs DBSF (unbounded)
- Added comprehensive test coverage for both fusion methods
- Added DBSF fusion option to vector visualization UI
- Updated viz routes and vizApp() to support fusion parameter selection

**Testing**: All 157 unit tests pass, type checking passes, ruff passes

Fixes error: "Configuration error: Score must be between 0.0 and 1.0, got 1.1528953"
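
A minimal sketch of the relaxed check, assuming a dataclass with hypothetical `doc_id` and `score` fields. The real `SearchResult` in algorithms.py has more fields and may word its error differently.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    doc_id: int   # hypothetical field, for illustration only
    score: float

    def __post_init__(self) -> None:
        # Only a lower bound is enforced: fused scores that sum several
        # normalized scores may legitimately exceed 1.0.
        if self.score < 0.0:
            raise ValueError(f"Score must be >= 0.0, got {self.score}")
```

Under this check a fused score such as 1.1528953 is accepted, while a negative score is still rejected.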

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 06:32:30 +01:00
Chris Coutinho ea01ce7673 Merge pull request #311 from cbcoutinho/renovate/python-replacement
chore(deps): replace python docker tag with docker.io/library/python
2025-11-16 12:11:52 +01:00
Chris Coutinho 216cb94383 Merge branch 'master' into renovate/python-replacement 2025-11-16 12:11:36 +01:00
Chris Coutinho 5f3e0b84a3 Merge pull request #310 from cbcoutinho/renovate/pin-dependencies
chore(deps): pin dependencies
2025-11-16 12:10:57 +01:00
github-actions[bot] 39131cefcc bump: version 0.39.0 → 0.40.0 2025-11-16 11:09:40 +00:00
Chris Coutinho 9498c0fa36 Merge pull request #309 from cbcoutinho/feature/bedrock
feat: Unified Provider Architecture + Amazon Bedrock Support
2025-11-16 12:09:12 +01:00
Chris Coutinho ed33b39062 docs: fix ADR-014 template text and numbering
- Remove template instruction text from line 1
- Fix ADR numbering from 007 to 014 to match filename
2025-11-16 12:08:37 +01:00
Chris Coutinho 1504df6fb5 Merge branch 'master' into feature/bedrock 2025-11-16 12:08:23 +01:00
renovate-bot-cbcoutinho[bot] 392e1536b9 chore(deps): replace python docker tag with docker.io/library/python 2025-11-16 11:07:34 +00:00
renovate-bot-cbcoutinho[bot] 00ed3f07e5 chore(deps): pin dependencies 2025-11-16 11:07:28 +00:00
github-actions[bot] 050e9a56b9 bump: version 0.38.0 → 0.39.0 2025-11-16 11:02:48 +00:00
Chris Coutinho 7fccd47722 Merge pull request #304 from cbcoutinho/feature/bm25
feat: Replace custom keyword search with BM25 hybrid search via Qdrant
2025-11-16 12:02:18 +01:00
Chris Coutinho f65b95ef07 Update Dockerfile 2025-11-16 11:58:13 +01:00
Chris Coutinho c28fc955ca Merge origin/master into feature/bm25
Resolved conflicts:
- viz_routes.py: Kept bm25's extract_dense_vector() function for robust vector handling
- hybrid.py: Removed (bm25 uses native Qdrant RRF fusion instead)
- uv.lock: Regenerated after accepting master's dependencies

This merge brings in:
- RAG evaluation framework (ADR-013)
- Performance optimizations (double-fetch elimination)
- Migration from asyncio to anyio
- OpenTelemetry tracing improvements
- Notes app enhancements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 11:52:40 +01:00
Chris Coutinho ad4b45889f fix: suppress Starlette middleware type warnings in ty checker 2025-11-16 11:43:50 +01:00
Chris Coutinho 5b484c9226 feat: add unified provider architecture with Amazon Bedrock support
Refactored LLM provider infrastructure to support sustainable additions of new providers with both embedding and text generation capabilities.

## Major Changes

### Unified Provider Architecture (ADR-015)
- Created `nextcloud_mcp_server/providers/` with unified Provider ABC
- Providers now support optional capabilities (embeddings and/or generation)
- Auto-detection registry with priority: Bedrock → Ollama → Simple
- Backward compatible - existing code continues to work
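
The detection priority can be pictured with a small sketch. The function name and the exact conditions that select each provider are assumptions for illustration; only the environment variable names and the Bedrock → Ollama → Simple order come from this commit.

```python
from collections.abc import Mapping


def detect_provider(env: Mapping[str, str]) -> str:
    """Return the name of the first provider whose settings are present.

    Assumed rules: Bedrock needs a region plus at least one model ID,
    Ollama needs a base URL, and Simple is the fallback.
    """
    has_bedrock_model = bool(
        env.get("BEDROCK_EMBEDDING_MODEL") or env.get("BEDROCK_GENERATION_MODEL")
    )
    if env.get("AWS_REGION") and has_bedrock_model:
        return "bedrock"
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"
    return "simple"
```

Passing `os.environ` as `env` would apply the same rules to the running process; taking a mapping as a parameter keeps the sketch easy to test.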

### New Providers
- **BedrockProvider**: Full Amazon Bedrock integration
  - Embeddings: Titan Embed, Cohere Embed models
  - Generation: Claude, Llama, Titan Text, Mistral models
  - Model-specific request/response handling
  - AWS credential chain integration
- **OllamaProvider**: Migrated with both capabilities support
- **AnthropicProvider**: Moved from test code to production providers
- **SimpleProvider**: Migrated in-memory fallback provider

### Breaking Changes
None - full backward compatibility maintained:
- `embedding.get_embedding_service()` still works
- RAG evaluation tests updated to use unified providers
- All existing tests pass (127 unit tests)

### Testing
- Added 9 comprehensive Bedrock unit tests with mocked boto3
- All existing unit tests pass
- Type checking (ty) and linting (ruff) pass
- Verified backward compatibility

### Documentation
- `docs/ADR-015-unified-provider-architecture.md`: Comprehensive ADR
- `docs/bedrock-setup.md`: AWS setup guide with IAM permissions
- `CLAUDE.md`: Updated with provider architecture section

### Dependencies
- Added `boto3>=1.35.0` to dev dependencies (optional)

## Environment Variables

### Bedrock
- `AWS_REGION`: AWS region (e.g., "us-east-1")
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings
- `BEDROCK_GENERATION_MODEL`: Model ID for generation
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`: Optional credentials

### Ollama
- `OLLAMA_BASE_URL`: API URL
- `OLLAMA_EMBEDDING_MODEL`: Embedding model (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL`: Generation model

## AWS Bedrock Permissions Required

Minimal IAM policy:
```json
{
  "Effect": "Allow",
  "Action": ["bedrock:InvokeModel"],
  "Resource": ["arn:aws:bedrock:*::foundation-model/*"]
}
```

See `docs/bedrock-setup.md` for detailed setup instructions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 11:36:58 +01:00
github-actions[bot] b58b200452 bump: version 0.37.0 → 0.38.0 2025-11-16 10:18:37 +00:00
Chris Coutinho c1aad94aa7 Merge pull request #308 from cbcoutinho/revert-305-feature/notes
Revert "Feature/notes"
2025-11-16 11:18:12 +01:00
github-actions[bot] 10129354d9 bump: version 0.36.0 → 0.37.0 2025-11-16 10:18:00 +00:00
Chris Coutinho 259d33b41d Revert "Feature/notes" 2025-11-16 11:17:59 +01:00
Chris Coutinho 32d8eaaab6 Merge pull request #305 from cbcoutinho/feature/notes
Feature/notes
2025-11-16 11:17:51 +01:00
Chris Coutinho 8799450c7d Merge pull request #306 from cbcoutinho/rag-evaluation
feat: RAG evaluation framework with performance improvements
2025-11-16 11:17:41 +01:00
Chris Coutinho 1a02819999 Merge pull request #307 from cbcoutinho/feature/mcp-tool-tracing
feat: Add OpenTelemetry tracing to @instrument_tool decorator
2025-11-16 11:17:33 +01:00
Chris Coutinho c4bf077050 feat: Add OpenTelemetry tracing to @instrument_tool decorator
Enhances the @instrument_tool decorator to create distributed traces
for all MCP tool executions, improving observability and debugging.

Changes:
- Modified @instrument_tool to wrap tool execution in trace_operation
- Added automatic span creation with mcp.tool.* span names
- Sanitized tool arguments before adding to span attributes
  (excludes password, token, secret, api_key, etag, ctx)
- Limited argument strings to 500 characters to prevent huge spans
- Maintained existing Prometheus metrics functionality
- Updated docs/observability.md to reflect correct decorator name
- Added comprehensive unit tests
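
The argument sanitization described above might look roughly like this. It is a sketch only: the helper name, the exact-match rule on argument names, and the `mcp.tool.arg.` attribute prefix are assumptions, and no OpenTelemetry calls are shown.

```python
SENSITIVE_KEYS = {"password", "token", "secret", "api_key", "etag", "ctx"}
MAX_ATTRIBUTE_LENGTH = 500


def sanitize_arguments(kwargs: dict[str, object]) -> dict[str, str]:
    """Build span attributes from tool arguments, skipping sensitive names."""
    attributes: dict[str, str] = {}
    for name, value in kwargs.items():
        if name.lower() in SENSITIVE_KEYS:
            continue  # never record credentials or the request context
        # Stringify and truncate so one large argument cannot bloat the span.
        attributes[f"mcp.tool.arg.{name}"] = str(value)[:MAX_ATTRIBUTE_LENGTH]
    return attributes
```

The returned dictionary could then be attached to the span created for the tool call.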

All ~50+ MCP tools now emit traces automatically without code changes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 11:16:05 +01:00
Chris Coutinho f559ca049e Merge branch 'rag-evaluation' 2025-11-16 10:26:19 +01:00
Chris Coutinho 8e7b3c3ded Merge branch 'feature/notes' 2025-11-16 09:18:58 +01:00
Chris Coutinho 758cd5dbfb build: bump submodule 2025-11-16 09:18:45 +01:00
Chris Coutinho c74695af16 Merge branch 'feature/notes' 2025-11-16 08:28:00 +01:00
Chris Coutinho f36f92120c build: bump submodule 2025-11-16 08:27:49 +01:00
Chris Coutinho 1faf572546 Merge branch 'feature/bm25'
Resolves conflict in viz_routes.py by combining:
- Named vector extraction from feature/bm25
- Performance timing from master
2025-11-16 08:18:39 +01:00
Chris Coutinho 944b6dcf5a fix: Handle named vectors in visualization and semantic search
- viz_routes.py: Extract "dense" vector from named vector dict
- semantic.py: Specify using="dense" for BM25 hybrid collections
- Fixes "X must be 2D array" error in hybrid search
- Fixes "Dense vector  is not found" error in semantic search

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 08:16:35 +01:00
Chris Coutinho 2aa82d849c Merge branch 'feature/bm25' 2025-11-16 07:57:36 +01:00
Chris Coutinho fc6a2f14e4 fix: Update vizApp to use bm25_hybrid algorithm and remove deprecated weights
The visualization UI was still using the old 'hybrid' algorithm name and
weight parameters that were replaced by the BM25 hybrid search refactor.
This caused "Unknown algorithm: hybrid" errors when using the search
& visualize feature.

Changes:
- Update default algorithm from 'hybrid' to 'bm25_hybrid'
- Update default scoreThreshold from 0.7 to 0.0 to match backend
- Remove deprecated semanticWeight, keywordWeight, fuzzyWeight parameters
- Remove weight parameters from search request

Fixes the visualization search functionality after BM25 hybrid refactor.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 07:54:20 +01:00
Chris Coutinho d1fb7eb633 Merge branch 'rag-evaluation' 2025-11-16 07:46:17 +01:00
Chris Coutinho 5e80f22d42 Merge pull request #303 from cbcoutinho/renovate/commitizen-tools-commitizen-action-0.x
chore(deps): update commitizen-tools/commitizen-action action to v0.25.0
2025-11-16 07:37:05 +01:00
Chris Coutinho 96cee48258 build: Migrate image to debian-based 2025-11-16 07:32:01 +01:00
Chris Coutinho 16c22c953b fix: Update viz routes to use BM25 hybrid search after refactor
- Remove obsolete search algorithm imports (Fuzzy, Keyword, Hybrid)
- Update UI to only show Semantic and BM25 Hybrid algorithms
- Replace manual weight controls with RRF fusion info message
- Update default algorithm from "hybrid" to "bm25_hybrid"
- Remove weight parameters (semantic_weight, keyword_weight, fuzzy_weight)
- Update score_threshold default from 0.7 to 0.0 for RRF scoring
- Document ty type checker in CLAUDE.md

Fixes unresolved-import type errors after BM25 refactor.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 07:23:11 +01:00
Chris Coutinho b96657c935 ci: Add open-webui to docker-compose 2025-11-16 07:00:20 +01:00
Chris Coutinho 6fe5596c13 feat: Implement BM25 hybrid search with native Qdrant RRF fusion
Replace custom keyword/fuzzy search algorithms with industry-standard BM25
sparse vectors, combined with dense semantic vectors using Qdrant's native
Reciprocal Rank Fusion (RRF). This consolidates search architecture and
improves relevance for both semantic and keyword queries.

Key changes:
- Add fastembed dependency for BM25 sparse vector generation
- Update Qdrant collection schema to support named vectors (dense + sparse)
- Create BM25SparseEmbeddingProvider using FastEmbed's Qdrant/bm25 model
- Implement BM25HybridSearchAlgorithm with native Qdrant RRF prefetch
- Update document processor to generate both dense and sparse embeddings
- Simplify nc_semantic_search() tool to use BM25 hybrid only
- Remove legacy keyword.py, fuzzy.py, and custom hybrid.py (736 lines)
- Update ADR-014 with implementation notes and test results

Benefits:
- Consolidated architecture (single Qdrant database)
- Native database-level RRF fusion (more efficient)
- Industry-standard BM25 (replaces brittle custom keyword search)
- Better relevance across semantic and keyword queries
- Simplified codebase (-285 net lines)

Tests: All 125 tests passing (118 unit, 7 integration)

Implements ADR-014: Replace Custom Keyword Search with BM25 Hybrid Search

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 06:59:44 +01:00
Chris Coutinho b174e7f8fb ci: Add notes app for development 2025-11-16 06:57:28 +01:00
Chris Coutinho f5bc3e3bc3 docs: init ADR 2025-11-16 06:24:25 +01:00
renovate-bot-cbcoutinho[bot] a9eb2c1da2 chore(deps): update commitizen-tools/commitizen-action action to v0.25.0 2025-11-16 05:07:20 +00:00
github-actions[bot] 7a7ed79d56 bump: version 0.35.0 → 0.36.0 2025-11-15 23:32:55 +00:00
120 changed files with 17796 additions and 3039 deletions
+2
@@ -5,3 +5,5 @@
!uv.lock
!nextcloud_mcp_server/**/*.py
!nextcloud_mcp_server/**/*.html
!nextcloud_mcp_server/auth/static/*
+3 -3
@@ -15,17 +15,17 @@ jobs:
packages: write
steps:
- name: Check out
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6
with:
fetch-depth: 0
token: "${{ secrets.PERSONAL_ACCESS_TOKEN }}"
- name: Create bump and changelog
uses: commitizen-tools/commitizen-action@5b0848cd060263e24602d1eba03710e056ef7711 # 0.24.0
uses: commitizen-tools/commitizen-action@bb4f1df6601e2a1a891506581b0c53acdc88e07d # 0.26.0
with:
github_token: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
changelog_increment_filename: body.md
- name: Release
uses: softprops/action-gh-release@5be0e66d93ac7ed76da52eca8bb058f665c3a5fe # v2.4.2
uses: softprops/action-gh-release@a06a81a03ee405af7f2048a818ed3f03bbf83c7b # v2.5.0
with:
body_path: "body.md"
tag_name: v${{ env.REVISION }}
+57
@@ -0,0 +1,57 @@
name: Claude Code Review
on:
pull_request:
types: [opened, synchronize]
# Optional: Only run on specific file changes
# paths:
# - "src/**/*.ts"
# - "src/**/*.tsx"
# - "src/**/*.js"
# - "src/**/*.jsx"
jobs:
claude-review:
# Optional: Filter by PR author
# if: |
# github.event.pull_request.user.login == 'external-contributor' ||
# github.event.pull_request.user.login == 'new-developer' ||
# github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR'
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
issues: read
id-token: write
steps:
- name: Checkout repository
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6
with:
fetch-depth: 1
- name: Run Claude Code Review
id: claude-review
uses: anthropics/claude-code-action@6337623ebba10cf8c8214b507993f8062fd4ccfb # v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
prompt: |
REPO: ${{ github.repository }}
PR NUMBER: ${{ github.event.pull_request.number }}
Please review this pull request and provide feedback on:
- Code quality and best practices
- Potential bugs or issues
- Performance considerations
- Security concerns
- Test coverage
Use the repository's CLAUDE.md for guidance on style and conventions. Be constructive and helpful in your feedback.
Use `gh pr comment` with your Bash tool to leave your review as a comment on the PR.
# See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
# or https://docs.claude.com/en/docs/claude-code/cli-reference for available options
claude_args: '--allowed-tools "Bash(gh issue view:*),Bash(gh search:*),Bash(gh issue list:*),Bash(gh pr comment:*),Bash(gh pr diff:*),Bash(gh pr view:*),Bash(gh pr list:*)"'
+50
@@ -0,0 +1,50 @@
name: Claude Code
on:
issue_comment:
types: [created]
pull_request_review_comment:
types: [created]
issues:
types: [opened, assigned]
pull_request_review:
types: [submitted]
jobs:
claude:
if: |
(github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
(github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
(github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
(github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
issues: read
id-token: write
actions: read # Required for Claude to read CI results on PRs
steps:
- name: Checkout repository
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6
with:
fetch-depth: 1
- name: Run Claude Code
id: claude
uses: anthropics/claude-code-action@6337623ebba10cf8c8214b507993f8062fd4ccfb # v1
with:
claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
# This is an optional setting that allows Claude to read CI results on PRs
additional_permissions: |
actions: read
# Optional: Give a custom prompt to Claude. If this is not specified, Claude will perform the instructions specified in the comment that tagged it.
# prompt: 'Update the pull request description to include a summary of changes.'
# Optional: Add claude_args to customize behavior and configuration
# See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
# or https://docs.claude.com/en/docs/claude-code/cli-reference for available options
# claude_args: '--allowed-tools Bash(gh pr:*)'
+2 -2
@@ -12,11 +12,11 @@ jobs:
packages: write
steps:
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6
- name: Docker meta
id: meta
uses: docker/metadata-action@318604b99e75e41977312d83839a89be02ca4893 # v5
uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5
with:
# list of Docker images to use as base name for tags
images: |
+1 -1
@@ -14,7 +14,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6
with:
fetch-depth: 0
+105
@@ -0,0 +1,105 @@
name: RAG Evaluation
on:
workflow_dispatch:
inputs:
manual_path:
description: 'Path to Nextcloud User Manual PDF in Nextcloud'
required: false
default: 'Nextcloud Manual.pdf'
embedding_model:
description: 'OpenAI embedding model'
required: false
default: 'openai/text-embedding-3-small'
generation_model:
description: 'OpenAI generation model'
required: false
default: 'openai/gpt-4o-mini'
jobs:
rag-evaluation:
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
models: read
steps:
- uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1
- name: Run docker compose with vector sync
uses: hoverkraft-tech/compose-action@248470ecc5ed40d8ed3d4480d8260d77179ef579 # v2.4.2
with:
compose-file: |
./docker-compose.yml
./docker-compose.ci.yml
up-flags: "--build"
env:
# Environment variables passed to docker-compose.ci.yml
OPENAI_API_KEY: ${{ secrets.GITHUB_TOKEN }}
OPENAI_BASE_URL: "https://models.github.ai/inference"
OPENAI_EMBEDDING_MODEL: ${{ inputs.embedding_model }}
OPENAI_GENERATION_MODEL: ${{ inputs.generation_model }}
VECTOR_SYNC_SCAN_INTERVAL: "5"
- name: Install the latest version of uv
uses: astral-sh/setup-uv@1e862dfacbd1d6d858c55d9b792c756523627244 # v7.1.4
- name: Wait for Nextcloud to be ready
run: |
echo "Waiting for Nextcloud..."
max_attempts=60
attempt=0
until curl -o /dev/null -s -w "%{http_code}\n" http://localhost:8080/ocs/v2.php/apps/serverinfo/api/v1/info | grep -q "401"; do
attempt=$((attempt + 1))
if [ $attempt -ge $max_attempts ]; then
echo "Service did not become ready in time."
exit 1
fi
echo "Attempt $attempt/$max_attempts: Service not ready, sleeping for 5 seconds..."
sleep 5
done
echo "Nextcloud is ready."
- name: Wait for MCP server to be ready
run: |
echo "Waiting for MCP server..."
max_attempts=30
attempt=0
until curl -o /dev/null -s -w "%{http_code}\n" http://localhost:8000/health/live | grep -q "200"; do
attempt=$((attempt + 1))
if [ $attempt -ge $max_attempts ]; then
echo "MCP server did not become ready in time."
exit 1
fi
echo "Attempt $attempt/$max_attempts: MCP not ready, sleeping for 2 seconds..."
sleep 2
done
echo "MCP server is ready."
- name: Run RAG evaluation tests
env:
NEXTCLOUD_HOST: "http://localhost:8080"
NEXTCLOUD_USERNAME: "admin"
NEXTCLOUD_PASSWORD: "admin"
RAG_MANUAL_PATH: ${{ inputs.manual_path }}
OPENAI_API_KEY: ${{ secrets.GITHUB_TOKEN }}
OPENAI_BASE_URL: "https://models.github.ai/inference"
OPENAI_EMBEDDING_MODEL: ${{ inputs.embedding_model }}
OPENAI_GENERATION_MODEL: ${{ inputs.generation_model }}
run: |
uv run pytest tests/integration/test_rag.py -v --log-cli-level=INFO --provider openai
- name: Capture MCP container logs
if: always()
run: |
echo "=== MCP Container Logs ==="
docker compose logs mcp --tail=500
- name: Upload test results
if: always()
uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # v5
with:
name: rag-evaluation-results
path: |
pytest-results.xml
retention-days: 30
+2 -2
@@ -18,9 +18,9 @@ jobs:
contents: read
steps:
- name: Checkout
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6
- name: Install uv
uses: astral-sh/setup-uv@5a7eac68fb9809dea845d802897dc5c723910fa3 # v7.1.3
uses: astral-sh/setup-uv@1e862dfacbd1d6d858c55d9b792c756523627244 # v7.1.4
- name: Install Python 3.11
run: uv python install 3.11
- name: Build
+7 -7
@@ -9,9 +9,9 @@ jobs:
linting:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1
- name: Install the latest version of uv
uses: astral-sh/setup-uv@5a7eac68fb9809dea845d802897dc5c723910fa3 # v7.1.3
uses: astral-sh/setup-uv@1e862dfacbd1d6d858c55d9b792c756523627244 # v7.1.4
- name: Check format
run: |
uv run --frozen ruff format --diff
@@ -27,7 +27,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1
with:
submodules: 'true'
@@ -35,7 +35,7 @@ jobs:
###### Required to build OIDC App ######
- name: Set up php 8.4
uses: shivammathur/setup-php@bf6b4fbd49ca58e4608c9c89fba0b8d90bd2a39f # v2
uses: shivammathur/setup-php@44454db4f0199b8b9685a5d763dc37cbf79108e1 # v2
with:
php-version: 8.4
coverage: none
@@ -49,14 +49,14 @@ jobs:
- name: Run docker compose
uses: hoverkraft-tech/compose-action@3846bcd61da338e9eaaf83e7ed0234a12b099b72 # v2.4.1
uses: hoverkraft-tech/compose-action@248470ecc5ed40d8ed3d4480d8260d77179ef579 # v2.4.2
with:
compose-file: "./docker-compose.yml"
#compose-flags: "--profile qdrant"
up-flags: "--build"
- name: Install the latest version of uv
uses: astral-sh/setup-uv@5a7eac68fb9809dea845d802897dc5c723910fa3 # v7.1.3
uses: astral-sh/setup-uv@1e862dfacbd1d6d858c55d9b792c756523627244 # v7.1.4
- name: Install Playwright dependencies
run: |
@@ -85,4 +85,4 @@ jobs:
NEXTCLOUD_USERNAME: "admin"
NEXTCLOUD_PASSWORD: "admin"
run: |
uv run pytest -v --log-cli-level=WARN -m smoke
uv run pytest -v --log-cli-level=WARN -m unit -m smoke
+3 -3
@@ -1,6 +1,6 @@
[submodule "oidc"]
path = third_party/oidc
url = https://github.com/cbcoutinho/oidc
[submodule "third_party/oidc"]
path = third_party/oidc
url = https://github.com/cbcoutinho/oidc
[submodule "third_party/notes"]
path = third_party/notes
url = https://github.com/cbcoutinho/notes
+254
@@ -1,3 +1,257 @@
## v0.49.0 (2025-12-08)
### Feat
- **news**: add Nextcloud News app integration
### Fix
- resolve all type checking errors (8 errors fixed)
### Refactor
- **news**: simplify vector sync to fetch all items
### Perf
- **news**: use direct API endpoint for get_item()
## v0.48.6 (2025-12-03)
### Fix
- **deps**: update dependency mcp to >=1.23,<1.24
## v0.48.5 (2025-11-28)
### Fix
- **deps**: update dependency pillow to v12
## v0.48.4 (2025-11-23)
### Fix
- Add rate limit retry logic to OpenAI provider
## v0.48.3 (2025-11-23)
### Fix
- Increase MCP sampling timeout to 5 minutes for slower LLMs
## v0.48.2 (2025-11-23)
### Fix
- Share vector sync state with FastMCP session lifespan via module singleton
## v0.48.1 (2025-11-23)
### Fix
- Use WebDAV for tag creation and add LLM-as-a-judge for RAG tests
### Refactor
- Move background tasks to server lifespan and deprecate SSE transport
## v0.48.0 (2025-11-23)
### Feat
- Add tag management methods to WebDAV client
## v0.47.0 (2025-11-23)
### Feat
- Add OpenAI provider support for embeddings and generation
## v0.46.2 (2025-11-22)
### Fix
- **smithery**: Enable JSON response format for scanner compatibility
## v0.46.1 (2025-11-22)
### Perf
- Optimize vector viz search performance
## v0.46.0 (2025-11-22)
### Feat
- Add Smithery CLI deployment support
- Implement ADR-016 Smithery stateless deployment mode
### Fix
- **smithery**: Add JSON Schema metadata to mcp-config endpoint
- **smithery**: Use container runtime pattern for config discovery
- Add Smithery lifespan and auth mode detection
## v0.45.0 (2025-11-22)
### Feat
- Add context expansion to semantic search with chunk overlap removal
- Use Ollama native batch API in embed_batch()
- Implement Qdrant placeholder state management
- Switch files to use numeric IDs with file_path resolution
- Implement per-chunk vector visualization with context expansion
### Fix
- Use alpha_composite for proper RGBA highlight blending
- Remove pymupdf.layout.activate() to fix page_chunks behavior
- Centralize PDF processing and generate separate images per chunk
- Set is_placeholder=False in processor to fix search filtering
- Increase placeholder staleness threshold to 5x scan interval
- Add placeholder staleness check to prevent duplicate processing
- Use empty SparseVector instead of None for placeholders
- Return empty array instead of null for query_coords when no results
- Align PDF text extraction between indexing and context expansion
- Update models and viz to use int-only doc_id
- Reconstruct full content for notes to match indexed offsets
- Add async/await, PDF metadata, and type safety fixes
### Refactor
- Simplify PDF text extraction with single to_markdown call
### Perf
- Optimize PDF processing with parallel extraction and single-render highlights
## v0.44.1 (2025-11-21)
### Fix
- **deps**: update dependency mcp to >=1.22,<1.23
## v0.44.0 (2025-11-19)
### Feat
- Improve vector visualization with static assets and fixes
- Redesign UI to match Nextcloud ecosystem aesthetic
### Fix
- Improve 3D plot rendering with explicit dimensions and window resize support
- Preserve 3D plot camera and improve documentation
- Preserve 3D plot camera position and fix CSS loading
## v0.43.0 (2025-11-18)
### Feat
- Replace custom document chunker with LangChain MarkdownTextSplitter
## v0.42.0 (2025-11-17)
### Feat
- **viz**: Add dual-score display and improve UI controls
## v0.41.0 (2025-11-17)
### Feat
- add configurable fusion algorithms for BM25 hybrid search
- add chunk position tracking to vector indexing and search
- add vector viz template and chunk context endpoint
### Fix
- prevent infinite loop in DocumentChunker with position tracking
- Relax SearchResult validation to support DBSF fusion scores > 1.0
## v0.40.0 (2025-11-16)
### Feat
- add unified provider architecture with Amazon Bedrock support
### Fix
- suppress Starlette middleware type warnings in ty checker
## v0.39.0 (2025-11-16)
### Feat
- Implement BM25 hybrid search with native Qdrant RRF fusion
### Fix
- Handle named vectors in visualization and semantic search
- Update vizApp to use bm25_hybrid algorithm and remove deprecated weights
- Update viz routes to use BM25 hybrid search after refactor
## v0.38.0 (2025-11-16)
### Feat
- add concurrent uploads and --force flag to upload command
- implement RAG evaluation framework with CLI tooling
### Fix
- download qrels from BEIR ZIP instead of HuggingFace
### Refactor
- migrate asyncio to anyio for consistent structured concurrency
- replace httpx client with NextcloudClient in upload command
### Perf
- Eliminate double-fetching in semantic search sampling
- fix vector viz search performance and visual encoding
- make note deletion concurrent in upload --force
## v0.37.0 (2025-11-16)
### Feat
- Add OpenTelemetry tracing to @instrument_tool decorator
## v0.36.0 (2025-11-15)
### BREAKING CHANGE
- Search algorithms now require Qdrant to be populated.
Vector sync must be enabled and documents indexed for search to work.
### Feat
- Normalize hybrid search RRF scores to 0-1 range
- Enhance vector visualization UI and parallelize search verification
- Add Vector Viz tab to app home page
- Add vector visualization pane with multi-select document types
- Implement custom PCA to remove sklearn dependency
- Add multi-document Protocol with cross-app search support
- Update nc_semantic_search tool with algorithm selection
- Implement unified search algorithm module
### Fix
- Reorder tabs and fix viz pane session access
### Refactor
- Optimize Nextcloud access verification with centralized filtering
- Make all search algorithms query Qdrant payload, not Nextcloud
### Perf
- Exclude vector-sync status polling from distributed tracing
## v0.35.0 (2025-11-15)
### Feat
+58 -2
@@ -17,13 +17,17 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- **Use Python 3.10+ union syntax**: `str | None` instead of `Optional[str]`
- **Use lowercase generics**: `dict[str, Any]` instead of `Dict[str, Any]`
- **Type all function signatures** - Parameters and return types
- **No explicit type checker configured** - Ruff handles linting only
- **Type checker**: `ty` is configured for static type checking
```bash
uv run ty check -- nextcloud_mcp_server
```
### Code Quality
- **Run ruff before committing**:
- **Run ruff and ty before committing**:
```bash
uv run ruff check
uv run ruff format
uv run ty check -- nextcloud_mcp_server
```
- **Ruff configuration** in pyproject.toml (extends select: ["I"] for import sorting)
@@ -57,8 +61,60 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- `nextcloud_mcp_server/server/` - MCP tool/resource definitions
- `nextcloud_mcp_server/auth/` - OAuth/OIDC authentication
- `nextcloud_mcp_server/models/` - Pydantic response models
- `nextcloud_mcp_server/providers/` - Unified LLM provider infrastructure (embeddings + generation)
- `tests/` - Layered test suite (unit, smoke, integration, load)
### Provider Architecture (ADR-015)
**Unified Provider System** for embeddings and text generation:
**Location:** `nextcloud_mcp_server/providers/`
- `base.py` - `Provider` ABC with optional capabilities
- `registry.py` - Auto-detection and factory pattern
- `ollama.py` - Ollama provider (embeddings + generation)
- `anthropic.py` - Anthropic provider (generation only)
- `bedrock.py` - Amazon Bedrock provider (embeddings + generation)
- `simple.py` - Simple in-memory provider (embeddings only, fallback)
**Usage:**
```python
from nextcloud_mcp_server.providers import get_provider
provider = get_provider() # Auto-detects from environment
# Check capabilities
if provider.supports_embeddings:
embeddings = await provider.embed_batch(texts)
if provider.supports_generation:
text = await provider.generate("prompt", max_tokens=500)
```
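As a rough illustration of the "optional capabilities" idea behind the usage snippet above, the sketch below shows one way a base class could expose capability flags. Only the names `Provider`, `supports_embeddings`, `supports_generation`, `embed_batch`, and `generate` come from this file; the class bodies and `ToyEmbeddingProvider` are hypothetical and are not the code in `base.py`.

```python
import asyncio
from abc import ABC


class Provider(ABC):
    """Hypothetical base class: every capability is off unless a subclass opts in."""

    @property
    def supports_embeddings(self) -> bool:
        return False

    @property
    def supports_generation(self) -> bool:
        return False

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        raise NotImplementedError("this provider does not offer embeddings")

    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        raise NotImplementedError("this provider does not offer generation")


class ToyEmbeddingProvider(Provider):
    """Illustrative embeddings-only provider (not part of the repository)."""

    @property
    def supports_embeddings(self) -> bool:
        return True

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        # Toy 2-dimensional "embedding": text length and number of spaces.
        return [[float(len(t)), float(t.count(" "))] for t in texts]


provider = ToyEmbeddingProvider()
if provider.supports_embeddings:
    vectors = asyncio.run(provider.embed_batch(["a b", "abc"]))
```

Callers check the flag before calling a method, so a generation-only or embeddings-only provider can share the same interface without stub implementations.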
**Environment Variables:**
Bedrock:
- `AWS_REGION` - AWS region (e.g., "us-east-1")
- `BEDROCK_EMBEDDING_MODEL` - Embedding model ID (e.g., "amazon.titan-embed-text-v2:0")
- `BEDROCK_GENERATION_MODEL` - Generation model ID (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` - Optional, uses AWS credential chain
Ollama:
- `OLLAMA_BASE_URL` - API URL (e.g., "http://localhost:11434")
- `OLLAMA_EMBEDDING_MODEL` - Embedding model (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL` - Generation model (e.g., "llama3.2:1b")
- `OLLAMA_VERIFY_SSL` - SSL verification (default: "true")
Simple (fallback, no config needed):
- `SIMPLE_EMBEDDING_DIMENSION` - Dimension (default: 384)
**Auto-Detection Priority:** Bedrock → Ollama → Simple
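The priority order can be pictured as a simple cascade over the environment. The sketch below is only an illustration: the function name `detect_provider_name` and the exact conditions that count as "configured" are assumptions, not the logic in `registry.py`.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def detect_provider_name(env: Mapping[str, str] | None = None) -> str:
    """Return "bedrock", "ollama", or "simple" following Bedrock -> Ollama -> Simple."""
    env = os.environ if env is None else env

    # Assumption: Bedrock counts as configured when a region and at least one model ID are set.
    has_bedrock_model = bool(
        env.get("BEDROCK_EMBEDDING_MODEL") or env.get("BEDROCK_GENERATION_MODEL")
    )
    if env.get("AWS_REGION") and has_bedrock_model:
        return "bedrock"

    # Assumption: Ollama counts as configured when its base URL is set.
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"

    # Fallback that needs no configuration.
    return "simple"


choice = detect_provider_name(
    {"AWS_REGION": "us-east-1", "OLLAMA_BASE_URL": "http://localhost:11434"}
)
```

In this example `choice` is `"ollama"`, because a region without a Bedrock model ID is not treated as a complete Bedrock configuration under the stated assumption.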
**Backward Compatibility:**
- Old code using `nextcloud_mcp_server.embedding.get_embedding_service()` still works
- `EmbeddingService` now wraps `get_provider()` internally
**For Details:** See `docs/ADR-015-unified-provider-architecture.md`
## Development Commands (Quick Reference)
### Testing
+10 -3
@@ -1,17 +1,24 @@
FROM ghcr.io/astral-sh/uv:0.9.9-python3.11-alpine@sha256:0faa7934fac1db7f5056f159c1224d144bab864fd2677a4066d25a686ae32edd
FROM docker.io/library/python:3.12-slim-trixie@sha256:b43ff04d5df04ad5cabb80890b7ef74e8410e3395b19af970dcd52d7a4bff921
COPY --from=ghcr.io/astral-sh/uv:0.9.16@sha256:ae9ff79d095a61faf534a882ad6378e8159d2ce322691153d68d2afac7422840 /uv /uvx /bin/
# Install dependencies
# 1. git (required for caldav dependency from git)
# 2. sqlite for development with token db
RUN apk add --no-cache git sqlite
RUN apt update && apt install --no-install-recommends --no-install-suggests -y \
git \
tesseract-ocr \
sqlite3 && apt clean
WORKDIR /app
COPY . .
RUN uv sync --locked --no-dev --no-editable
RUN uv sync --locked --no-dev --no-editable --no-cache
ENV PYTHONUNBUFFERED=1
ENV VIRTUAL_ENV=/app/.venv
ENV PATH=/app/.venv/bin:$PATH
ENV TESSDATA_PREFIX=/usr/share/tesseract-ocr/5/tessdata
ENTRYPOINT ["/app/.venv/bin/nextcloud-mcp-server", "--host", "0.0.0.0"]
+44
@@ -0,0 +1,44 @@
# Dockerfile for Smithery stateless deployment
# ADR-016: Stateless mode for multi-user public Nextcloud instances
#
# This image excludes:
# - Vector database dependencies (qdrant-client)
# - Background sync workers
# - Admin UI routes (/app)
# - Semantic search tools
#
# Features included:
# - Core Nextcloud tools (notes, calendar, contacts, files, deck, tables, cookbook)
# - Per-session app password authentication
# - Multi-user support via Smithery session config
FROM docker.io/library/python:3.12-slim-trixie@sha256:b43ff04d5df04ad5cabb80890b7ef74e8410e3395b19af970dcd52d7a4bff921
WORKDIR /app
# Install uv for fast dependency management
COPY --from=ghcr.io/astral-sh/uv:0.9.16@sha256:ae9ff79d095a61faf534a882ad6378e8159d2ce322691153d68d2afac7422840 /uv /uvx /bin/
# Install dependencies
# 1. git (required for caldav dependency from git)
# 2. sqlite for development with token db
RUN apt update && apt install --no-install-recommends --no-install-suggests -y \
git
# Copy project files
COPY . .
RUN uv sync --locked --no-dev --no-editable --no-cache
# Set Smithery mode environment variables
ENV SMITHERY_DEPLOYMENT=true
ENV VECTOR_SYNC_ENABLED=false
# Smithery sets PORT=8081 by default
EXPOSE 8081
# Health check endpoint
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD uv run python -c "import httpx; httpx.get('http://localhost:${PORT:-8081}/health/live').raise_for_status()"
CMD ["/app/.venv/bin/smithery-main"]
+26 -2
@@ -1,6 +1,11 @@
<p align="center">
<img src="astrolabe.svg" alt="Nextcloud MCP Server" width="128" height="128">
</p>
# Nextcloud MCP Server
[![Docker Image](https://img.shields.io/badge/docker-ghcr.io/cbcoutinho/nextcloud--mcp--server-blue)](https://github.com/cbcoutinho/nextcloud-mcp-server/pkgs/container/nextcloud-mcp-server)
[![smithery badge](https://smithery.ai/badge/@cbcoutinho/nextcloud-mcp-server)](https://smithery.ai/server/@cbcoutinho/nextcloud-mcp-server)
**A production-ready MCP server that connects AI assistants to your Nextcloud instance.**
@@ -13,7 +18,20 @@ This is a **dedicated standalone MCP server** designed for external MCP clients
## Quick Start
Get up and running in 60 seconds using Docker:
The fastest way to get started is via [Smithery](https://smithery.ai/server/@cbcoutinho/nextcloud-mcp-server) - no Docker or self-hosting required:
1. Visit the [Smithery marketplace page](https://smithery.ai/server/@cbcoutinho/nextcloud-mcp-server)
2. Click "Deploy" and configure:
- **Nextcloud URL**: Your Nextcloud instance (e.g., `https://cloud.example.com`)
- **Username**: Your Nextcloud username
- **App Password**: Generate one in Nextcloud → Settings → Security → Devices & sessions
> [!NOTE]
> Smithery runs in stateless mode without semantic search. For full features, use [Docker](#docker-self-hosted) or see [ADR-016](docs/ADR-016-smithery-stateless-deployment.md).
## Docker (Self-Hosted)
For full features including semantic search, run with Docker:
```bash
# 1. Create a minimal configuration
@@ -29,10 +47,15 @@ docker run -p 127.0.0.1:8000:8000 --env-file .env --rm \
# 3. Test the connection
curl http://127.0.0.1:8000/health/ready
# 4. Connect to the endpoint
http://127.0.0.1:8000/sse
# Or with --transport streamable-http
http://127.0.0.1:8000/mcp
```
**Next Steps:**
- Create an app password in Nextcloud: Settings → Security → Devices & sessions
- Connect your MCP client (Claude Desktop, IDEs, `mcp dev`, etc.)
- See [docs/installation.md](docs/installation.md) for other deployment options (local, Kubernetes)
@@ -123,6 +146,7 @@ This enables natural language queries and helps discover related content across
- **[App Documentation](docs/)** - Notes, Calendar, Contacts, WebDAV, Deck, Cookbook, Tables
- **[Document Processing](docs/configuration.md#document-processing)** - OCR and text extraction setup
- **[Semantic Search Architecture](docs/semantic-search-architecture.md)** - Experimental vector search (Notes only, opt-in)
- **[Vector Sync UI Guide](docs/user-guide/vector-sync-ui.md)** - Browser interface for semantic search visualization and testing
### Advanced Topics
- **[OAuth Architecture](docs/oauth-architecture.md)** - How OAuth works (experimental)
+5
@@ -0,0 +1,5 @@
#!/bin/bash
set -euox pipefail
php /var/www/html/occ app:enable news
@@ -2,4 +2,30 @@
set -euox pipefail
php /var/www/html/occ app:enable notes
echo "Installing and configuring notes app for testing..."
# Check if development notes app is mounted at /opt/apps/notes
if [ -d /opt/apps/notes ]; then
echo "Development notes app found at /opt/apps/notes"
# Remove any existing notes app in apps (from app store or old symlink)
if [ -e /var/www/html/custom_apps/notes ]; then
echo "Removing existing notes in apps..."
rm -rf /var/www/html/custom_apps/notes
fi
# Create symlink from apps to the mounted development version
# Per Nextcloud docs: apps outside server root need symlinks in server root
echo "Creating symlink: custom_apps/notes -> /opt/apps/notes"
ln -sf /opt/apps/notes /var/www/html/custom_apps/notes
echo "Enabling notes app from /opt/apps (development mode via symlink)"
php /var/www/html/occ app:enable notes
elif [ -d /var/www/html/custom_apps/notes ]; then
echo "notes app directory found in apps (already installed)"
php /var/www/html/occ app:enable notes
else
echo "notes app not found, installing from app store..."
php /var/www/html/occ app:install notes
php /var/www/html/occ app:enable notes
fi
+4
@@ -0,0 +1,4 @@
<svg xmlns="http://www.w3.org/2000/svg" width="512" height="512" viewBox="0 0 512 512">
<rect width="512" height="512" rx="80" ry="80" fill="#0082C9"/>
<path d="M255.9 21.04c-11.8 0-22.2 4.08-28.6 10.01-5.6 4.98-8.6 11.41-8.6 18.11 0 5.55 2.2 11.01 5.9 15.48-16.4 4.97-30.1 13.64-39 24.53 22.1-7.67 45.7-11.86 70.3-11.86 24.6 0 48.3 4.19 70.3 11.86-8.9-10.89-22.6-19.56-39-24.53 3.9-4.47 5.9-9.93 5.9-15.48 0-6.7-3-13.13-8.5-18.11-6.4-5.93-16.9-10.01-28.7-10.01zm0 20.34c5.3 0 10.1 1.27 13.6 3.52 1.7 1.16 3.4 2.43 3.4 4.27 0 1.76-1.7 3.03-3.4 4.19-3.5 2.33-8.3 3.61-13.6 3.61-5.3 0-10.1-1.28-13.6-3.61-1.6-1.16-3.3-2.43-3.3-4.19 0-1.84 1.7-3.11 3.3-4.27 3.5-2.25 8.3-3.52 13.6-3.52zm.1 48.1c-110.8 0-200.72 90.02-200.72 200.82S145.2 491 256 491s200.7-89.9 200.7-200.7c0-110.8-89.9-200.82-200.7-200.82zm0 32.62c92.9 0 168.2 75.3 168.2 168.2 0 92.8-75.3 168.2-168.2 168.2-92.9 0-168.26-75.4-168.26-168.2 0-92.9 75.36-168.2 168.26-168.2zm-8.2 6.3c-9.6.5-19 1.9-28.3 4.1l2.3 7.8c8.4-2 17.1-3.3 26-3.8v-8.1zm16.2 0v8.1c9 .5 17.7 1.8 26 3.8l2.2-7.8c-9.1-2.2-18.6-3.6-28.2-4.1zm-60 8.5c-9 3.2-17.6 7-25.8 11.6l4.1 7.1c7.7-4.3 15.6-7.9 23.9-10.8l-2.2-7.9zm103.7 0-2 7.9c8.4 2.9 16.2 6.5 23.8 10.8l4.2-7.1c-8.2-4.6-16.9-8.4-26-11.6zm-143.3 20.3c-7.5 5.4-14.6 11.4-21.1 17.9l5.8 5.8c5.9-6.1 12.5-11.7 19.5-16.6l-4.2-7.1zm182.9 0-4 7.1c6.9 4.9 13.5 10.5 19.5 16.6l5.7-5.8c-6.5-6.5-13.7-12.5-21.2-17.9zm-91.4 11.5c-37 0-67.4 28.6-70.3 64.9l15.9 4.7c.7-29.6 24.7-53.4 54.4-53.4 30.1 0 54.4 24.4 54.4 54.3 0 15-6.2 28.7-16 38.5l.1.1c1.7 2.7 3 5.6 4.1 8.6.9 3 1.7 5.7 2.3 8.6v.4c33.8-16.7 57.2-51.5 57.2-91.7 0-3.8-.2-7.3-.6-10.9-3.2-3.3-6.3-6.4-9.8-9.5 1.5 6.5 2.3 13.4 2.3 20.4 0 28.7-13 54.7-33.5 71.8 6.3-10.6 10.1-23 10.1-36.3 0-38.9-31.7-70.5-70.6-70.5zm-91.8 14.6c-3.3 3.1-6.5 6.2-9.7 9.5-.3 3.6-.5 7.1-.5 10.9 0 7.3.7 14.2 2.1 20.9l9.1 2.7c-2.1-7.5-3.1-15.4-3.1-23.6 0-7 .7-13.9 2.1-20.4zm-31.6 4c-5.8 7.1-10.9 14.6-15.4 22.6l7.1 4c4.1-7.4 8.8-14.3 14-20.8l-5.7-5.8zm246.8 0-5.7 5.8c5.3 6.5 10 13.4 13.9 20.8l7.1-4c-4.4-8-9.5-15.5-15.3-22.6zm-269.2 37.1c-2.5 5.7-4.6 11.4-6.4 17.6l.1-.3c3.4-5 7.9-9.3 12.9-12.5l.3-.6-6.9-4.2zm291.8 0-7.2 4.2c3.2 7.3 5.7 
15.1 7.6 23.1l7.9-2.1c-2.1-8.8-4.9-17.3-8.3-25.2zm-261.2 11.5c-13.4.1-25.7 9-29.7 22.5l114.8 34.2c-4.9 16.7 4.6 34.2 21.2 39.2L361.7 366c16.6 5 34.1-4.4 39.1-21l-114.6-34.4c4.9-16.5-4.7-34.1-21.3-39.1 0 0-72.4-21.5-114.8-34.3-3.1-.9-6.3-1.4-9.4-1.3zm-42.09 29.7c-.9 6.9-1.4 14-1.4 21.3 0 1.3.1 2.9.1 4.2h8.09v-4.2c0-6.5.4-12.9 1.2-19.2l-7.99-2.1zm314.59 0-7.9 2.1c.7 6.3 1.3 12.7 1.3 19.2 0 1.3 0 2.9-.2 4.2h8.2v-4.2c0-7.3-.5-14.4-1.4-21.3zm-157.3 24.7c6.3 0 11.5 5 11.5 11.3 0 6.4-5.2 11.6-11.5 11.6s-11.5-5.2-11.5-11.6c0-6.3 5.2-11.3 11.5-11.3zM98.51 307.4c1 8.2 2.89 16.4 5.09 24.3l7.9-2.1c-2.1-7.2-3.8-14.6-4.8-22.2h-8.19zm306.69 0c-1.1 7.6-2.7 15-4.8 22.2l7.8 2.1c2.2-7.9 4.1-16.1 5.2-24.3h-8.2zm-191.3 10.9c-19 13.3-31.4 35.3-31.4 60.1 0 10.4 2.3 20.4 6.2 29.7 8.8 4.9 17.9 8.8 27.6 11.7-10.8-10.7-17.5-25.2-17.5-41.4 0-19 9.3-36 23.7-46.3-3.8-4.1-6.7-8.7-8.6-13.8zM116.8 345l-7.9 2c3.1 7.6 6.8 14.7 11 21.6l6.9-4.2c-3.8-6.2-7-12.8-10-19.4zm194.8 20.5c.9 4.1 1.4 8.5 1.4 12.9 0 16.2-6.7 30.7-17.4 41.4 9.6-2.9 18.8-6.8 27.5-11.7 4-9.3 6.2-19.3 6.2-29.7 0-2.7-.2-5.2-.4-7.7l-17.3-5.2zM136 377.9l-7.1 4.1c4.7 6.2 9.7 12.1 15.3 17.3l5.7-5.5c-5.1-5-9.7-10.3-13.9-15.9zm243.9 2.3-.2.1c-2.1.3-4 .6-6.2.7h-.1c-3.6 4.5-7.3 8.8-11.5 12.8l5.8 5.5c5.5-5.2 10.5-11.1 15.2-17.3l-3-1.8zm-217.8 24-5.9 5.9c6 4.8 12.2 9.7 18.8 13.6l3.8-7.8c-5.7-2.9-11.4-6.8-16.7-11.7zm187.7 0c-5.4 4.9-11.1 8.8-16.8 11.7l3.9 7.8c6.5-3.9 12.8-8.8 18.7-13.6l-5.8-5.9zm-156.4 19.5-4.1 6.8c6.6 4 13.7 5.8 20.7 8.8l2.2-7.9c-6.5-1.9-12.7-4.8-18.8-7.7zm125.2 0c-6.2 2.9-12.5 5.8-19.1 7.7l2.3 7.9c7.2-3 14-4.8 20.7-8.8l-3.9-6.8zm-90.7 11.7-2 7.8c7.1 1 14.5 1.9 21.9 1.9v-7.7c-6.8 0-13.5-1.1-19.9-2zm55.9 0c-6.3.9-13 2-19.8 2v7.7c7.5 0 14.8-.9 22.1-1.9l-2.3-7.8z" fill="#fff"/>
</svg>


+4 -4
@@ -1,9 +1,9 @@
dependencies:
- name: qdrant
repository: https://qdrant.github.io/qdrant-helm
version: 1.15.5
version: 1.16.2
- name: ollama
repository: https://otwld.github.io/ollama-helm
version: 1.34.0
digest: sha256:d51c97d05be2614b751c0dd7267ef7dc959eff5ebef859c5f895c5c554b7a874
generated: "2025-11-09T17:08:02.86648061Z"
version: 1.35.0
digest: sha256:bcb0779739e4710b90bb65f6a7baeaa295bd0ba9776f8a1cf8d9b69d233c8ec0
generated: "2025-12-05T11:11:27.999374001Z"
+4 -4
@@ -2,8 +2,8 @@ apiVersion: v2
name: nextcloud-mcp-server
description: A Helm chart for Nextcloud MCP Server - enables AI assistants to interact with Nextcloud
type: application
version: 0.35.0
appVersion: "0.35.0"
version: 0.49.0
appVersion: "0.49.0"
keywords:
- nextcloud
- mcp
@@ -27,10 +27,10 @@ annotations:
grafana_dashboard_folder: "Nextcloud MCP"
dependencies:
- name: qdrant
version: "1.15.5"
version: "1.16.2"
repository: https://qdrant.github.io/qdrant-helm
condition: qdrant.networkMode.deploySubchart
- name: ollama
version: "1.34.0"
version: "1.35.0"
repository: https://otwld.github.io/ollama-helm
condition: ollama.enabled
+25
@@ -0,0 +1,25 @@
# CI-specific overrides for RAG evaluation pipeline
# This file is used by the rag-evaluation.yml workflow to configure the MCP
# container with OpenAI/GitHub Models API for vector embeddings.
#
# Usage:
# docker compose -f docker-compose.yml -f docker-compose.ci.yml up
#
# Environment variables (set in CI workflow):
# OPENAI_API_KEY - API key for embeddings (GitHub Models uses GITHUB_TOKEN)
# OPENAI_BASE_URL - API endpoint (e.g., https://models.github.ai/inference)
# OPENAI_EMBEDDING_MODEL - Model name (e.g., openai/text-embedding-3-small)
# OPENAI_GENERATION_MODEL - Model name for generation (e.g., openai/gpt-4o-mini)
services:
mcp:
environment:
# OpenAI provider configuration (required for CI vector sync)
- OPENAI_API_KEY=${OPENAI_API_KEY}
- OPENAI_BASE_URL=${OPENAI_BASE_URL:-https://models.github.ai/inference}
- OPENAI_EMBEDDING_MODEL=${OPENAI_EMBEDDING_MODEL:-openai/text-embedding-3-small}
- OPENAI_GENERATION_MODEL=${OPENAI_GENERATION_MODEL:-openai/gpt-4o-mini}
# Faster sync for CI
- VECTOR_SYNC_SCAN_INTERVAL=${VECTOR_SYNC_SCAN_INTERVAL:-5}
# Enable document processing for PDF parsing
- ENABLE_DOCUMENT_PROCESSING=true
+25 -5
@@ -3,7 +3,7 @@ services:
# https://hub.docker.com/_/mariadb
db:
# Note: Check the recommended version here: https://docs.nextcloud.com/server/latest/admin_manual/installation/system_requirements.html#server
image: docker.io/library/mariadb:lts@sha256:6b848cb24fbbd87429917f6c4422ac53c343e85692eb0fef86553e99e4f422f3
image: docker.io/library/mariadb:lts@sha256:1cac8492bd78b1ec693238dc600be173397efd7b55eabc725abc281dc855b482
restart: always
command: --transaction-isolation=READ-COMMITTED
volumes:
@@ -17,11 +17,11 @@ services:
# Note: Redis is an external service. You can find more information about the configuration here:
# https://hub.docker.com/_/redis
redis:
image: docker.io/library/redis:alpine@sha256:28c9c4d7596949a24b183eaaab6455f8e5d55ecbf72d02ff5e2c17fe72671d31
image: docker.io/library/redis:alpine@sha256:6cbef353e480a8a6e7f10ec545f13d7d3fa85a212cdcc5ffaf5a1c818b9d3798
restart: always
app:
image: docker.io/library/nextcloud:32.0.1@sha256:5b043f7ea2f609d5ff5635f475c30d303bec17775a5c3f7fa435e3818e669120
image: docker.io/library/nextcloud:32.0.2@sha256:8cb1dc8c26944115469dd22f4965d2ed35bab9cf8c48d2bb052c8e9f83821ded
restart: always
ports:
- 0.0.0.0:8080:80
@@ -158,7 +158,7 @@ services:
- oauth-tokens:/app/data
keycloak:
image: quay.io/keycloak/keycloak:26.4.5@sha256:653852bfdea2be6e958b9e90a976eff1c6de34edd55f2f679bdc48ef16bc528e
image: quay.io/keycloak/keycloak:26.4.7@sha256:9409c59bdfb65dbffa20b11e6f18b8abb9281d480c7ca402f51ed3d5977e6007
command:
- "start-dev"
- "--import-realm"
@@ -224,8 +224,28 @@ services:
- keycloak-tokens:/app/data
- keycloak-oauth-storage:/app/.oauth
# Smithery stateless deployment mode (ADR-016)
# Test with: docker compose --profile smithery up smithery
# Then: curl http://localhost:8081/.well-known/mcp-config
smithery:
build:
context: .
dockerfile: Dockerfile.smithery
restart: always
depends_on:
app:
condition: service_healthy
ports:
- 127.0.0.1:8081:8081
environment:
- SMITHERY_DEPLOYMENT=true
- VECTOR_SYNC_ENABLED=false
- PORT=8081
profiles:
- smithery
qdrant:
image: qdrant/qdrant:v1.15.5@sha256:0fb8897412abc81d1c0430a899b9a81eb8328aa634e7242d1bc804c1fe8fe863
image: qdrant/qdrant:v1.16.2@sha256:dab6de32f7b2cc599985a7c764db3e8b062f70508fb85ca074aa856f829bf335
restart: always
ports:
- 127.0.0.1:6333:6333 # REST API
@@ -1,7 +1,8 @@
# ADR-011: Improving Semantic Search Quality Through Better Chunking and Embeddings
**Status**: Proposed
**Status**: Partially Implemented (Chunking Complete, Embeddings Pending)
**Date**: 2025-11-12
**Implementation Date**: 2025-11-18 (Chunking)
**Authors**: Development Team
**Related**: ADR-003 (Vector Database Architecture), ADR-008 (MCP Sampling for RAG)
@@ -893,3 +894,50 @@ This ADR addresses the root causes of poor semantic search recall:
- No new infrastructure or ongoing costs
**Next Steps**: Approve ADR → Implement changes → Reindex → Validate → Production rollout
## Implementation Status
### Completed (2025-11-18)
**✅ Semantic Markdown-Aware Chunking (Option C1 + C3 Hybrid)**
Implementation details:
- Replaced custom word-based chunking with `MarkdownTextSplitter` from LangChain
- Optimized for Nextcloud Notes markdown content with special handling for:
- Headers (`#`, `##`, `###`, etc.)
- Code blocks (` ``` `)
- Lists (`-`, `*`, `1.`)
- Horizontal rules (`---`)
- Paragraphs and sentences
- Maintained `ChunkWithPosition` interface for backward compatibility
- Updated configuration defaults:
- `DOCUMENT_CHUNK_SIZE`: 512 words → 2048 characters
- `DOCUMENT_CHUNK_OVERLAP`: 50 words → 200 characters
- Updated unit tests to verify position tracking and boundary preservation
- All tests passing with markdown-aware character-based chunking
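To make the new character-based defaults concrete, here is a minimal sliding-window sketch with position tracking. It is not the LangChain `MarkdownTextSplitter` used by the implementation and it ignores markdown boundaries entirely; the field names on `ChunkWithPosition` and the function `chunk_by_characters` are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkWithPosition:
    """Hypothetical shape: chunk text plus its character offsets in the source."""

    text: str
    start_offset: int
    end_offset: int


def chunk_by_characters(
    text: str, chunk_size: int = 2048, overlap: int = 200
) -> list[ChunkWithPosition]:
    """Split text into fixed-size character windows that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")

    chunks: list[ChunkWithPosition] = []
    step = chunk_size - overlap  # how far each window advances
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(ChunkWithPosition(text[start:end], start, end))
        if end == len(text):
            break  # the last window reached the end of the document
        start += step
    return chunks


document = "x" * 5000
chunks = chunk_by_characters(document)
```

With the defaults, each window starts 1848 characters after the previous one, so consecutive chunks share 200 characters. A markdown-aware splitter would instead move these boundaries to headers, code fences, and paragraph breaks.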
**Files Modified**:
- `nextcloud_mcp_server/vector/document_chunker.py` - LangChain integration
- `nextcloud_mcp_server/config.py` - Character-based defaults
- `tests/unit/test_document_chunker.py` - Updated test suite
**Dependencies Added**:
- `langchain-text-splitters>=1.0.0` (already present in `pyproject.toml`)
**Migration Required**:
- ⚠️ Full reindex required to apply new chunking strategy
- Existing documents in vector database use old word-based chunks
- See "Migration Strategy" section above for reindexing process
### Pending
**⏳ Embedding Model Upgrade (Option E1)**
Still to be implemented:
- Switch from `nomic-embed-text` (768-dim) to `mxbai-embed-large-v1` (1024-dim)
- Implement dynamic dimension detection in `ollama_provider.py`
- Create migration script for collection reindexing
- Run benchmarking to validate improvement
- Deploy to production with atomic collection swap
**Estimated Timeline**: 1-2 weeks for implementation and validation
+241
@@ -0,0 +1,241 @@
# ADR-014: Replace Custom Keyword Search with BM25 Hybrid Search via Qdrant
**Date:** 2025-11-16
**Status:** Implemented
---
### 1. Context
Our RAG application currently employs two separate retrieval mechanisms:
1. **Dense (Semantic) Search:** Using vector embeddings stored in our Qdrant database to find semantically similar context.
2. **Keyword Search:** A custom-built fuzzy/character-based search to match specific keywords, acronyms, and product codes that semantic search often misses.
This dual-system approach has several drawbacks:
* **Poor Relevance:** Our current keyword search is basic (e.g., `LIKE` queries or simple fuzzy matching). It is not as effective as modern full-text search algorithms like BM25.
* **Clunky Fusion:** We lack a robust, principled method to combine the results from the two systems. This leads to disjointed logic in the application layer and suboptimal context being passed to the LLM.
* **Architectural Complexity:** We must maintain two separate search pathways (one to Qdrant, one to the keyword search mechanism), increasing code complexity and maintenance overhead.
Our vector database, **Qdrant**, natively supports **hybrid search** by combining dense vectors with BM25-based **sparse vectors** in a single collection.
### 2. Decision
We will **deprecate and remove** the existing custom keyword/fuzzy search functionality.
We will **replace it by implementing native hybrid search within Qdrant**. This involves:
1. **Modifying the Qdrant Collection:** Updating our collection to support a named sparse vector index configured for BM25.
2. **Updating the Ingestion Pipeline:** For every document chunk, we will generate and upsert *both*:
* Its **dense vector** (from our existing embedding model).
* Its **sparse vector** (generated using a BM25-compatible model, e.g., `Qdrant/bm25` from `fastembed`).
3. **Refactoring Retrieval Logic:** All retrieval calls will be consolidated into a single Qdrant query using the `query_points` endpoint. This query will use the `prefetch` parameter to execute both dense and sparse searches, and Qdrant's built-in **Reciprocal Rank Fusion (RRF)** to automatically merge the results into a single, relevance-ranked list.
4. **Backfilling:** A one-time migration script will be created to generate and add sparse vectors for all existing documents in the Qdrant collection.
---
### 3. Considered Options
#### Option 1: Native Qdrant Hybrid Search (Chosen)
* Use Qdrant's built-in sparse vector and RRF capabilities.
* **Pros:**
* **Consolidated Architecture:** Manages both dense and sparse indexes in one database.
* **No Data Sync Issues:** Updates are atomic. A single `upsert` updates both representations.
* **Built-in Fusion:** RRF is handled natively and efficiently by the database.
* **Superior Relevance:** Replaces our brittle custom search with the industry-standard BM25.
* **Cons:**
* Requires a one-time data backfill which may be time-consuming.
* Adds a new step (sparse vector generation) to the ingestion pipeline.
#### Option 2: External Full-Text Search (e.g., Elasticsearch)
* Keep Qdrant for dense search and add a separate Elasticsearch/OpenSearch cluster for BM25.
* **Pros:**
* Provides a very powerful, dedicated full-text search engine.
* **Cons:**
* **High Complexity:** Introduces a new, stateful service to deploy, manage, and scale.
* **Data Sync Nightmare:** We would be responsible for ensuring that the document IDs and content in Qdrant and Elasticsearch are always perfectly synchronized. This is a major source of bugs.
* **Manual Fusion:** The application would have to query both systems and perform RRF manually.
#### Option 3: Keep Current System
* Make no changes.
* **Pros:**
* No engineering effort required.
* **Cons:**
* Fails to address the known relevance and architectural problems.
* Our RAG application's performance will remain suboptimal, especially for keyword-sensitive queries.
---
### 4. Rationale
**Option 1 is the clear winner.** It directly solves our primary problem (poor keyword matching) by adopting the industry-standard BM25.
Critically, it achieves this while **simplifying** our overall architecture, not complicating it. By leveraging features already present in our existing database (Qdrant), we avoid the massive operational and synchronization overhead of adding a second search system (Option 2).
This decision consolidates our retrieval logic, eliminates the data consistency problem, and moves the complex fusion logic (RRF) from the application layer into the database, where it can be performed more efficiently.
### 5. Consequences
**New Work:**
* **Ingestion:** The data ingestion pipeline must be updated to add the `fastembed` library (or similar), generate sparse vectors, and upsert them to the new named vector field in Qdrant.
* **Retrieval:** The application's retrieval service must be refactored to use the `query_points` endpoint with `prefetch` and `fusion=models.Fusion.RRF`.
* **Migration:** A one-time backfill script must be written and executed to add sparse vectors for all existing documents.
* **Infrastructure:** The Qdrant collection schema must be updated (or re-created) to add the `sparse_vectors_config`.
**Positive:**
* **Improved Accuracy:** Retrieval will be significantly more accurate, handling both semantic and keyword queries robustly.
* **Simplified Code:** The application's retrieval logic will be cleaner and simpler, with one endpoint instead of two.
* **Reduced Maintenance:** We will remove the custom fuzzy-search code, which is brittle and difficult to maintain.
**Negative:**
* The data backfill process will require careful management to avoid downtime.
* Ingestion time will slightly increase due to the extra step of sparse vector generation. This is considered a negligible trade-off for the gains in relevance.
---
### 6. Implementation Notes
**Implementation completed on 2025-11-16**
**Key Changes:**
1. **Dependencies** (pyproject.toml:25):
- Added `fastembed>=0.4.2` for BM25 sparse vector embeddings
- Adjusted `pillow` version constraint to be compatible with fastembed
2. **Qdrant Collection Schema** (nextcloud_mcp_server/vector/qdrant_client.py:113-128):
- Updated to named vectors: `{"dense": VectorParams(...), "sparse": SparseVectorParams(...)}`
- Added sparse vector configuration with BM25 index
- Maintains backward compatibility with existing collections (detects legacy schema)
3. **BM25 Embedding Provider** (nextcloud_mcp_server/embedding/bm25_provider.py):
- Created `BM25SparseEmbeddingProvider` using FastEmbed's `Qdrant/bm25` model
- Implements `encode()` and `encode_batch()` methods
- Returns sparse vectors as `{indices: list[int], values: list[float]}` format
4. **Document Indexing Pipeline** (nextcloud_mcp_server/vector/processor.py:229-255):
- Generates both dense (semantic) and sparse (BM25) embeddings for each document chunk
- Updates `PointStruct` to use named vectors: `vector={"dense": ..., "sparse": ...}`
- Maintains same chunking strategy (512 words, 50-word overlap)
5. **BM25 Hybrid Search Algorithm** (nextcloud_mcp_server/search/bm25_hybrid.py):
- Implements `BM25HybridSearchAlgorithm` using Qdrant's native RRF fusion
- Uses `prefetch` parameter for parallel dense + sparse search
- Applies `fusion=models.Fusion.RRF` for automatic result merging
- Maintains same deduplication and filtering logic as semantic search
6. **MCP Tool Updates** (nextcloud_mcp_server/server/semantic.py:39-68):
- Simplified `nc_semantic_search()` to use BM25 hybrid only
- Removed `algorithm`, `semantic_weight`, `keyword_weight`, `fuzzy_weight` parameters
- Updated default `score_threshold=0.0` for RRF scoring
- Returns `search_method="bm25_hybrid"` in responses
7. **Legacy Algorithm Removal**:
- Deleted `nextcloud_mcp_server/search/keyword.py` (278 lines)
- Deleted `nextcloud_mcp_server/search/fuzzy.py` (220 lines)
- Deleted `nextcloud_mcp_server/search/hybrid.py` (238 lines - custom RRF)
- Updated `nextcloud_mcp_server/search/__init__.py` to export only BM25 hybrid
**Migration Strategy:**
- No migration required (vector sync feature is experimental)
- New documents automatically indexed with both dense + sparse vectors
- Collection re-creation on first startup with updated schema
**Test Results:**
- All unit tests passing (118 passed)
- All integration tests passing (7 semantic search tests)
- Code formatting verified with ruff
**Benefits Realized:**
- ✅ Consolidated architecture (single Qdrant database for both dense + sparse)
- ✅ Native fusion algorithms (database-level, more efficient)
- ✅ Industry-standard BM25 (replaces custom keyword search)
- ✅ Simplified codebase (removed 736 lines of legacy code)
- ✅ Better relevance (handles both semantic and keyword queries)
- ✅ Configurable fusion methods (RRF and DBSF)
---
### 7. Fusion Algorithm Options
**Update: 2025-11-16**
The BM25 hybrid search now supports two fusion algorithms for combining dense (semantic) and sparse (BM25) search results:
#### Reciprocal Rank Fusion (RRF)
**Default fusion method.** RRF is a widely-used, well-established algorithm that combines rankings from multiple retrieval systems using the reciprocal rank formula:
```
RRF(doc) = Σ 1/(k + rank_i(doc))
```
where `k` is a constant (typically 60) and `rank_i(doc)` is the rank of the document in retrieval system `i`.
**Characteristics:**
- **General-purpose**: Works well across diverse query types and document collections
- **Rank-based**: Focuses on relative rankings rather than absolute scores
- **Established**: Well-tested, documented, and understood in IR literature
- **Robust**: Less sensitive to score distribution differences between systems
**When to use RRF:**
- Default choice for most use cases
- When you have mixed query types (semantic + keyword)
- When retrieval systems have very different score ranges
- When you want predictable, well-understood behavior
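The RRF formula above can be sketched in a few lines of plain Python. This is an illustration of the formula only, assuming ranks start at 1 and `k = 60` as described in the text; Qdrant performs the fusion server-side and its internal constants may differ. The function name `rrf_fuse` and the sample document IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document IDs: score(doc) = sum of 1 / (k + rank_i(doc))."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Assumption: the best hit in each list has rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_ranking = ["note-3", "note-1", "note-7"]   # hypothetical dense (semantic) results
sparse_ranking = ["note-1", "note-9", "note-3"]  # hypothetical sparse (BM25) results
fused = rrf_fuse([dense_ranking, sparse_ranking])
```

Documents that appear in both lists (`note-1`, `note-3`) accumulate two terms and therefore outrank documents that appear in only one, regardless of how the raw dense and sparse scores were scaled.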
#### Distribution-Based Score Fusion (DBSF)
**Alternative fusion method.** DBSF normalizes scores from each retrieval system using distribution statistics before combining them:
1. **Normalization**: For each query, calculates mean (μ) and standard deviation (σ) of scores
2. **Outlier handling**: Uses μ ± 3σ as normalization bounds
3. **Fusion**: Sums normalized scores across systems
**Characteristics:**
- **Score-aware**: Uses actual relevance scores, not just rankings
- **Statistical**: Normalizes based on score distribution properties
- ⚠️ **Experimental**: Newer algorithm, less battle-tested than RRF
- ⚠️ **Sensitive**: May behave differently depending on score distributions
**When to use DBSF:**
- When retrieval systems have vastly different score ranges that RRF doesn't balance well
- When you want to experiment with score-based (vs rank-based) fusion
- When statistical normalization better matches your use case
- For A/B testing against RRF to measure retrieval quality improvements
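The three DBSF steps described above can be sketched as follows. This is a simplified illustration, not Qdrant's implementation: the use of the population standard deviation, clamping to the 0-1 range, and the handling of a zero standard deviation are assumptions, and the names `dbsf_normalize` and `dbsf_fuse` are hypothetical.

```python
from statistics import mean, pstdev


def dbsf_normalize(scores: dict[str, float]) -> dict[str, float]:
    """Rescale one system's scores so that mean - 3*sigma maps to 0 and mean + 3*sigma to 1."""
    if not scores:
        return {}
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # Assumption: identical scores carry no ranking signal, so use the midpoint.
        return {doc_id: 0.5 for doc_id in scores}
    low, high = mu - 3 * sigma, mu + 3 * sigma
    return {
        doc_id: min(1.0, max(0.0, (score - low) / (high - low)))
        for doc_id, score in scores.items()
    }


def dbsf_fuse(result_sets: list[dict[str, float]]) -> list[tuple[str, float]]:
    """Sum the normalized scores of each document across all retrieval systems."""
    fused: dict[str, float] = {}
    for scores in result_sets:
        for doc_id, value in dbsf_normalize(scores).items():
            fused[doc_id] = fused.get(doc_id, 0.0) + value
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


dense_scores = {"a": 0.9, "b": 0.5, "c": 0.1}  # hypothetical cosine similarities
sparse_scores = {"b": 12.0, "a": 4.0}          # hypothetical BM25 scores
fused = dbsf_fuse([dense_scores, sparse_scores])
```

Unlike RRF, the size of the score gap matters here: `b` wins because its BM25 score is far above `a`'s, even though `a` ranks first in the dense results.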
#### Configuration
Both fusion algorithms are exposed via the `fusion` parameter in MCP tools:
```python
# Use RRF (default)
response = await nc_semantic_search(
query="async programming",
fusion="rrf" # Can be omitted, RRF is default
)
# Use DBSF
response = await nc_semantic_search(
query="async programming",
fusion="dbsf"
)
```
The `nc_semantic_search_answer` tool also supports the `fusion` parameter and passes it through to the underlying search.
#### Future: Configurable Weights
**Current limitation**: Neither RRF nor DBSF currently supports per-system weights (e.g., 0.8 for semantic, 0.2 for BM25). This is a Qdrant platform limitation tracked in [qdrant/qdrant#6067](https://github.com/qdrant/qdrant/issues/6067).
When Qdrant adds weight support, the `fusion` parameter can be extended to accept weight configurations:
```python
# Hypothetical future API
response = await nc_semantic_search(
query="async programming",
fusion="rrf",
fusion_weights={"dense": 0.7, "sparse": 0.3} # Not yet implemented
)
```
**Recommendation**: Start with RRF (default). If you encounter cases where keyword matches are under- or over-weighted, experiment with DBSF. Monitor [qdrant/qdrant#6067](https://github.com/qdrant/qdrant/issues/6067) for configurable weight support.
@@ -0,0 +1,380 @@
# ADR-015: Unified Provider Architecture for Embeddings and Text Generation
**Status:** Accepted
**Date:** 2025-01-16
**Deciders:** Development Team
**Related:** ADR-003 (Vector Database), ADR-008 (MCP Sampling), ADR-013 (RAG Evaluation)
## Context
Prior to this refactoring, the codebase had two separate provider systems:
1. **Embedding Providers** (`nextcloud_mcp_server/embedding/`)
- Used `EmbeddingProvider` ABC with methods: `embed()`, `embed_batch()`, `get_dimension()`
- Had auto-detection via `EmbeddingService._detect_provider()`
- Used for semantic search and vector indexing (production)
2. **LLM Providers** (`tests/rag_evaluation/llm_providers.py`)
- Used `LLMProvider` Protocol with method: `generate()`
- Had separate factory function `create_llm_provider()`
- Used only for RAG evaluation tests (not production)
This fragmentation created several problems:
### Problems with Dual Provider Systems
1. **Code Duplication**
- Ollama configuration appeared in both `embedding/service.py` and `tests/rag_evaluation/llm_providers.py`
- Similar provider detection logic in multiple places
- Separate singleton patterns for each system
2. **Limited Extensibility**
- Hard-coded provider detection in `EmbeddingService._detect_provider()`
- No support for providers that offer both capabilities (like Bedrock)
- Adding new providers required modifying multiple files
3. **Inconsistent Patterns**
- BM25 provider didn't follow `EmbeddingProvider` ABC
- Different method names across providers (`embed` vs `encode`)
- ABC vs Protocol for type checking
4. **Difficult Scaling**
- Adding Amazon Bedrock (our third provider) would exacerbate all issues
- No clear path for future providers (OpenAI, Cohere, etc.)
### Amazon Bedrock Requirements
Bedrock naturally supports **both** embeddings and text generation:
- **Embeddings**: `amazon.titan-embed-text-v1/v2`, `cohere.embed-*`
- **Text Generation**: `anthropic.claude-*`, `meta.llama3-*`, `amazon.titan-text-*`
- **Unified API**: Single `invoke_model()` method via bedrock-runtime
This made it the perfect opportunity to establish a unified provider architecture.
## Decision
We refactored the provider infrastructure to use a **unified Provider ABC** with optional capabilities:
### 1. Unified Provider Interface
**New Structure:**
```
nextcloud_mcp_server/providers/
├── __init__.py
├── base.py # Provider ABC with optional capabilities
├── registry.py # Auto-detection and factory
├── ollama.py # Supports both embedding + generation
├── anthropic.py # Generation only
├── bedrock.py # Supports both embedding + generation
└── simple.py # Embedding only (testing fallback)
```
**Base Class (`providers/base.py`):**
```python
from abc import ABC, abstractmethod


class Provider(ABC):
    @property
    @abstractmethod
    def supports_embeddings(self) -> bool:
        """Whether this provider supports embedding generation."""
        pass

    @property
    @abstractmethod
    def supports_generation(self) -> bool:
        """Whether this provider supports text generation."""
        pass

    @abstractmethod
    async def embed(self, text: str) -> list[float]:
        """Generate embedding (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        """Generate batch embeddings (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    def get_dimension(self) -> int:
        """Get embedding dimension (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        """Generate text (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    async def close(self) -> None:
        """Close provider and release resources."""
        pass
```
### 2. Provider Registry
**Auto-Detection Priority** (`providers/registry.py`):
```python
class ProviderRegistry:
    @staticmethod
    def create_provider() -> Provider:
        # Detection order:
        # 1. Bedrock (AWS_REGION or BEDROCK_*_MODEL)
        # 2. Ollama (OLLAMA_BASE_URL)
        # 3. Simple (fallback)
        ...  # detection logic omitted here
```
**Environment Variables:**
**Bedrock:**
- `AWS_REGION`: AWS region (e.g., "us-east-1")
- `AWS_ACCESS_KEY_ID`: AWS access key (optional, uses credential chain)
- `AWS_SECRET_ACCESS_KEY`: AWS secret key (optional)
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0")
- `BEDROCK_GENERATION_MODEL`: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
**Ollama:**
- `OLLAMA_BASE_URL`: Ollama API base URL (e.g., "http://localhost:11434")
- `OLLAMA_EMBEDDING_MODEL`: Model for embeddings (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL`: Model for text generation (e.g., "llama3.2:1b")
- `OLLAMA_VERIFY_SSL`: Verify SSL certificates (default: "true")
**Simple (fallback, no configuration required):**
- `SIMPLE_EMBEDDING_DIMENSION`: Embedding dimension (default: 384)
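The detection priority described above could look roughly like the following. This is a sketch under assumptions: `detect_provider_name` is a hypothetical helper, and the real registry may inspect additional variables or construct provider objects directly rather than returning a name.

```python
import os
from collections.abc import Mapping


def detect_provider_name(env: Mapping[str, str] = os.environ) -> str:
    """Return the provider selected by the documented priority order."""
    # 1. Bedrock wins if a region or either Bedrock model variable is set.
    if (
        env.get("AWS_REGION")
        or env.get("BEDROCK_EMBEDDING_MODEL")
        or env.get("BEDROCK_GENERATION_MODEL")
    ):
        return "bedrock"
    # 2. Ollama is chosen when its base URL is configured.
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"
    # 3. Otherwise fall back to the simple in-process provider.
    return "simple"
```

Passing the environment as a mapping keeps the function easy to test without mutating `os.environ`.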
### 3. Backward Compatibility
**Old Code Continues to Work:**
```python
# Old way (still works)
from nextcloud_mcp_server.embedding import get_embedding_service
service = get_embedding_service() # Returns singleton Provider
embeddings = await service.embed_batch(texts)
```
**New Way (recommended):**
```python
# New way (cleaner)
from nextcloud_mcp_server.providers import get_provider

provider = get_provider()  # Returns singleton Provider
embeddings = await provider.embed_batch(texts)

# Can also use generation if provider supports it
if provider.supports_generation:
    text = await provider.generate("prompt")
```
**Migration Path:**
- `embedding/service.py` now wraps `providers.get_provider()` for compatibility
- `tests/rag_evaluation/llm_providers.py` now uses unified providers
- Old imports still work, marked as deprecated in docstrings
### 4. Amazon Bedrock Implementation
**Features:**
- Supports both embeddings and text generation
- Model-specific request/response handling for:
  - Titan Embed (amazon.titan-embed-text-*)
  - Cohere Embed (cohere.embed-*)
  - Claude (anthropic.claude-*)
  - Llama (meta.llama3-*)
  - Titan Text (amazon.titan-text-*)
  - Mistral (mistral.*)
- Uses boto3 bedrock-runtime client
- Graceful degradation if boto3 not installed
- Async implementation matching existing patterns
**Model-Specific Handling:**
```python
# Bedrock embedding request (Titan)
{"inputText": text}

# Bedrock generation request (Claude)
{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": max_tokens,
    "temperature": 0.7,
    "messages": [{"role": "user", "content": prompt}],
}
```
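One way to organize this model-specific handling is to dispatch on the model ID prefix. The sketch below covers only the two request shapes shown above; `build_bedrock_body` is a hypothetical helper, and the formats for Cohere, Llama, Titan Text, and Mistral are deliberately left out rather than guessed.

```python
import json


def build_bedrock_body(model_id: str, text: str, max_tokens: int = 500) -> str:
    """Serialize a request body for the model families shown above."""
    if model_id.startswith("amazon.titan-embed-text"):
        payload: dict = {"inputText": text}
    elif model_id.startswith("anthropic.claude"):
        payload = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "temperature": 0.7,
            "messages": [{"role": "user", "content": text}],
        }
    else:
        # Other families need their own formats, which are not sketched here.
        raise ValueError(f"No request format sketched for model: {model_id}")
    return json.dumps(payload)
```

The returned string would typically be passed as the `body` argument of the boto3 bedrock-runtime client's `invoke_model(modelId=..., body=...)` call.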
## Consequences
### Positive
1. **Sustainable Provider Additions**
- New providers only need to implement `Provider` ABC
- Auto-detection via environment variables
- No modifications to existing code required
2. **Code Consolidation**
- Single provider interface instead of two
- Unified configuration pattern
- Eliminated duplication
3. **Better Extensibility**
- Providers can support one or both capabilities
- Clear capability detection via properties
- Registry pattern simplifies auto-detection
4. **Improved Testing**
- RAG evaluation can use any provider (Ollama, Anthropic, Bedrock)
- Comprehensive unit tests for all providers
- Mocked boto3 tests for Bedrock
5. **Production-Ready Bedrock Support**
- Full embedding and generation support
- Multiple model families supported
- AWS credential chain integration
### Neutral
1. **Optional Boto3 Dependency**
- boto3 is a dev dependency only (not required for core functionality)
- Bedrock provider gracefully fails if boto3 not installed
- Users who want Bedrock must `pip install boto3`
2. **Capability Properties**
- All providers must implement capability properties
- Methods raise `NotImplementedError` if capability not supported
- Clear error messages guide users to alternatives
### Negative
1. **Migration Effort**
- Existing code should eventually migrate to the new imports (optional for now; the old imports remain backward compatible)
- Documentation needs updating
- Users must learn new environment variables
2. **Increased Complexity**
- Provider base class has more methods (embedding + generation)
- More environment variables to configure
- Capability detection adds runtime checks
## Implementation
### Files Created
**New Provider Infrastructure:**
- `nextcloud_mcp_server/providers/__init__.py`
- `nextcloud_mcp_server/providers/base.py`
- `nextcloud_mcp_server/providers/registry.py`
- `nextcloud_mcp_server/providers/ollama.py`
- `nextcloud_mcp_server/providers/anthropic.py`
- `nextcloud_mcp_server/providers/bedrock.py`
- `nextcloud_mcp_server/providers/simple.py`
**Tests:**
- `tests/unit/providers/__init__.py`
- `tests/unit/providers/test_bedrock.py` (9 unit tests)
**Documentation:**
- `docs/ADR-015-unified-provider-architecture.md` (this file)
### Files Modified
**Backward Compatibility:**
- `nextcloud_mcp_server/embedding/service.py` - Now wraps `get_provider()`
- `tests/rag_evaluation/llm_providers.py` - Uses unified providers
**Dependencies:**
- `pyproject.toml` - Added `boto3>=1.35.0` to dev dependencies
### Testing Results
**Unit Tests:** 127 passed (including 9 new Bedrock tests)
**Type Checking:** All checks passed (ty)
**Linting:** All checks passed (ruff)
**Backward Compatibility:** Verified - existing embedding tests work
## Alternatives Considered
### Alternative 1: Keep Separate Provider Systems
**Pros:**
- No refactoring needed
- Simpler short-term
**Cons:**
- Bedrock would need to be implemented twice
- Continued code duplication
- No long-term scalability
**Decision:** Rejected - technical debt would continue to grow
### Alternative 2: Separate Embedding and Generation Providers
Use composition instead of unified interface:
```python
class CombinedProvider:
    def __init__(self, embedding: EmbeddingProvider, generation: LLMProvider):
        self.embedding = embedding
        self.generation = generation
```
**Pros:**
- Clearer separation of concerns
- Simpler individual providers
**Cons:**
- Bedrock and Ollama naturally do both - artificial separation
- More complex configuration (two providers to configure)
- More boilerplate code
**Decision:** Rejected - unified interface better matches provider capabilities
### Alternative 3: Plugin System
Dynamic provider registration via entry points:
```python
# setup.py
entry_points={
    'nextcloud_mcp.providers': [
        'ollama = nextcloud_mcp_server.providers.ollama:OllamaProvider',
        'bedrock = nextcloud_mcp_server.providers.bedrock:BedrockProvider',
    ]
}
```
**Pros:**
- Most extensible
- Third-party providers possible
**Cons:**
- Over-engineered for current needs
- Added complexity
- No immediate benefit
**Decision:** Deferred - can add later if needed
## Future Work
1. **Additional Providers**
- OpenAI (embeddings + generation)
- Cohere (embeddings + generation)
- Google Vertex AI
- Azure OpenAI
2. **Provider Features**
- Streaming generation support
- Batch API optimization (when available)
- Model-specific optimizations
- Cost tracking and metrics
3. **Configuration Improvements**
- Provider profiles (development, production)
- Model aliasing (e.g., "small", "large")
- Fallback provider chains
4. **Testing**
- Integration tests with real Bedrock endpoints
- Performance benchmarking across providers
- Cost comparison analysis
## References
- [boto3 Bedrock Runtime Documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html)
- [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
- ADR-003: Vector Database and Semantic Search
- ADR-008: MCP Sampling for Semantic Search
- ADR-013: RAG Evaluation Framework
@@ -0,0 +1,492 @@
# ADR-016: Smithery Stateless Deployment for Multi-User Public Nextcloud Instances
**Status:** Proposed
**Date:** 2025-01-22
**Deciders:** Development Team
**Related:** ADR-004 (OAuth), ADR-007 (Background Vector Sync), ADR-015 (Unified Provider)
## Context
[Smithery](https://smithery.ai) is a hosting platform and marketplace for MCP servers that provides:
- **Discovery**: Marketplace listing for MCP servers
- **Hosting**: Containerized deployment with auto-scaling
- **Authentication UI**: OAuth flow presentation for users
- **Session Configuration**: Per-user settings passed via URL parameters
- **Observability**: Usage logs and monitoring
### Current Architecture Limitations
The current nextcloud-mcp-server architecture assumes a **self-hosted deployment** with:
1. **Persistent Infrastructure**
- Qdrant vector database for semantic search
- Background sync worker for content indexing
- Refresh token storage for offline access
2. **Single-Tenant Configuration**
- Environment variables configure one Nextcloud instance
- `NEXTCLOUD_HOST`, `NEXTCLOUD_USERNAME`, `NEXTCLOUD_PASSWORD`
- Or OAuth with a single IdP
3. **Stateful Operations**
- Vector sync maintains index state across requests
- Token storage persists between sessions
### Smithery Hosting Constraints
Smithery-hosted containers are **stateless by design**:
- No persistent storage between requests
- No background workers or cron jobs
- No databases (Qdrant, Redis, etc.)
- Containers may be recycled at any time
- Configuration passed per-session via URL parameters
### Opportunity
Many users have **publicly accessible Nextcloud instances** and want to:
1. Try the MCP server without self-hosting infrastructure
2. Connect multiple users to different Nextcloud instances
3. Use basic Nextcloud tools without semantic search
4. Benefit from Smithery's discovery and OAuth UI
## Decision
Implement a **stateless deployment mode** for Smithery that:
1. **Disables stateful features** (vector sync, semantic search)
2. **Creates clients per-session** from Smithery configuration
3. **Supports multiple Nextcloud instances** via session config
4. **Provides a useful subset of tools** that work without infrastructure
### Architecture
```
┌─────────────────────────────────────────────────────────────────────────┐
│ Smithery-Hosted Stateless Mode │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ MCP Client Smithery │
│ (Cursor, Claude) Infrastructure │
│ │ │ │
│ │ 1. Connect │ │
│ ├───────────────────────────►│ │
│ │ │ │
│ │ 2. Config UI │ │
│ │◄───────────────────────────┤ User enters: │
│ │ (Smithery presents) │ - nextcloud_url │
│ │ │ - auth_mode (basic/oauth) │
│ │ │ - credentials │
│ │ 3. Tool call │ │
│ ├───────────────────────────►│ │
│ │ + session config │ │
│ │ │ │
│ │ ┌───────┴───────┐ │
│ │ │ MCP Server │ │
│ │ │ Container │ │
│ │ │ │ │
│ │ │ 4. Create │ │
│ │ │ client │ │
│ │ │ from │ │
│ │ │ config │ │
│ │ │ │ │ │
│ │ │ ▼ │ │
│ │ │ 5. Call │ │
│ │ │ Nextcloud │───────► User's Nextcloud │
│ │ │ API │ Instance │
│ │ │ │ │ │
│ │ │ ▼ │ │
│ │ 6. Response │ Return result │ │
│ │◄───────────────────┤ │ │
│ │ └───────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
```
### Session Configuration Schema
```python
from pydantic import BaseModel, Field


class SmitheryConfigSchema(BaseModel):
    """Configuration schema for Smithery session."""

    # Required: Nextcloud instance
    nextcloud_url: str = Field(
        ...,
        description="Your Nextcloud instance URL (e.g., https://cloud.example.com)",
    )

    # Authentication mode
    auth_mode: str = Field(
        "app_password",
        description="Authentication method: 'app_password' or 'oauth'",
    )

    # App Password authentication (recommended for Smithery)
    username: str | None = Field(
        None,
        description="Nextcloud username (required for app_password auth)",
    )
    app_password: str | None = Field(
        None,
        description="Nextcloud app password (Settings → Security → App passwords)",
    )

    # OAuth authentication (advanced)
    # When auth_mode='oauth', Smithery handles the OAuth flow
    # and passes the access token automatically
```
### Feature Matrix
| Feature | Self-Hosted | Smithery Stateless |
|---------|-------------|-------------------|
| **Notes** | | |
| List/Search notes | ✓ | ✓ |
| Get/Create/Update notes | ✓ | ✓ |
| Semantic search | ✓ | ✗ |
| **Calendar** | | |
| List calendars | ✓ | ✓ |
| Get/Create events | ✓ | ✓ |
| **Contacts** | | |
| List address books | ✓ | ✓ |
| Search/Get contacts | ✓ | ✓ |
| **Files (WebDAV)** | | |
| List/Download files | ✓ | ✓ |
| Upload files | ✓ | ✓ |
| Search files | ✓ | ✓ (keyword only) |
| **Deck** | | |
| List boards/cards | ✓ | ✓ |
| Create/Update cards | ✓ | ✓ |
| **Tables** | | |
| List/Query tables | ✓ | ✓ |
| Create/Update rows | ✓ | ✓ |
| **Cookbook** | | |
| List/Get recipes | ✓ | ✓ |
| **Semantic Search** | | |
| Vector search | ✓ | ✗ |
| RAG answers | ✓ | ✗ |
| **Background Sync** | | |
| Auto-indexing | ✓ | ✗ |
| Webhook sync | ✓ | ✗ |
| **Admin UI (`/app`)** | | |
| Vector sync status | ✓ | ✗ |
| Vector visualization | ✓ | ✗ |
| Webhook management | ✓ | ✗ |
| Session management | ✓ | ✗ |
### Implementation
#### 1. Deployment Mode Detection
```python
# nextcloud_mcp_server/config.py
import os
from enum import Enum


class DeploymentMode(Enum):
    SELF_HOSTED = "self_hosted"  # Full features, env-based config
    SMITHERY_STATELESS = "smithery"  # Stateless, session-based config


def get_deployment_mode() -> DeploymentMode:
    """Detect deployment mode from environment."""
    if os.getenv("SMITHERY_DEPLOYMENT") == "true":
        return DeploymentMode.SMITHERY_STATELESS
    return DeploymentMode.SELF_HOSTED
```
#### 2. Session-Based Client Factory
```python
# nextcloud_mcp_server/context.py
async def get_client(ctx: Context) -> NextcloudClient:
    """Get NextcloudClient - from session config or environment."""
    mode = get_deployment_mode()

    if mode == DeploymentMode.SMITHERY_STATELESS:
        # Create client from Smithery session config
        config = ctx.session_config
        if not config:
            raise McpError("Session configuration required")

        return NextcloudClient(
            base_url=config.nextcloud_url,
            username=config.username,
            password=config.app_password,
        )
    else:
        # Existing behavior: from environment or OAuth context
        return await _get_client_from_context(ctx)
```
#### 3. Conditional Tool Registration
```python
# nextcloud_mcp_server/app.py
def create_mcp_server(mode: DeploymentMode) -> FastMCP:
    """Create MCP server with mode-appropriate tools."""
    mcp = FastMCP("Nextcloud MCP")

    # Always register core tools
    configure_notes_tools(mcp)
    configure_calendar_tools(mcp)
    configure_contacts_tools(mcp)
    configure_webdav_tools(mcp)
    configure_deck_tools(mcp)
    configure_tables_tools(mcp)
    configure_cookbook_tools(mcp)

    # Only register stateful tools in self-hosted mode
    if mode == DeploymentMode.SELF_HOSTED:
        configure_semantic_tools(mcp)  # Requires Qdrant
        register_oauth_tools(mcp)  # Requires token storage

    return mcp
```
#### 4. Exclude Admin UI Routes
The `/app` admin UI should **not be installed** in Smithery mode because:
- **Vector sync status** - No vector sync in stateless mode
- **Vector visualization** - No Qdrant to visualize
- **Webhook management** - No webhook sync without background workers
- **Session management** - No persistent sessions to manage
```python
# nextcloud_mcp_server/app.py
def create_app(mode: DeploymentMode) -> Starlette:
    """Create Starlette app with mode-appropriate routes."""
    routes = [
        Route("/health/live", health_live, methods=["GET"]),
        Route("/health/ready", health_ready, methods=["GET"]),
    ]

    # Only mount admin UI in self-hosted mode
    if mode == DeploymentMode.SELF_HOSTED:
        browser_app = create_browser_app()
        routes.append(
            Route("/app", lambda r: RedirectResponse("/app/", status_code=307))
        )
        routes.append(Mount("/app", app=browser_app))
        logger.info("Admin UI mounted at /app")
    else:
        logger.info("Admin UI disabled in Smithery stateless mode")

    # Mount FastMCP at root
    mcp_app = create_mcp_server(mode).streamable_http_app()
    routes.append(Mount("/", app=mcp_app))

    return Starlette(routes=routes, lifespan=starlette_lifespan)
```
**Endpoints by Mode:**
| Endpoint | Self-Hosted | Smithery |
|----------|-------------|----------|
| `/mcp` | ✓ | ✓ |
| `/health/live` | ✓ | ✓ |
| `/health/ready` | ✓ | ✓ |
| `/.well-known/mcp-config` | ✓ | ✓ |
| `/app` | ✓ | ✗ |
| `/app/vector-sync/status` | ✓ | ✗ |
| `/app/vector-viz` | ✓ | ✗ |
| `/app/webhooks` | ✓ | ✗ |
#### 5. Smithery Integration Files
**smithery.yaml:**
```yaml
runtime: "container"
build:
  dockerfile: "Dockerfile.smithery"
  dockerBuildPath: "."
startCommand:
  type: "http"
  configSchema:
    type: "object"
    required: ["nextcloud_url", "username", "app_password"]
    properties:
      nextcloud_url:
        type: "string"
        title: "Nextcloud URL"
        description: "Your Nextcloud instance URL (e.g., https://cloud.example.com)"
      username:
        type: "string"
        title: "Username"
        description: "Your Nextcloud username"
      app_password:
        type: "string"
        title: "App Password"
        description: "Generate at Settings → Security → App passwords"
  exampleConfig:
    nextcloud_url: "https://cloud.example.com"
    username: "alice"
    app_password: "xxxxx-xxxxx-xxxxx-xxxxx-xxxxx"
```
**Dockerfile.smithery:**
```dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
# Copy project files
COPY pyproject.toml uv.lock ./
COPY nextcloud_mcp_server ./nextcloud_mcp_server
# Install dependencies (without vector/semantic extras)
RUN uv sync --frozen --no-dev
# Set Smithery mode
ENV SMITHERY_DEPLOYMENT=true
ENV VECTOR_SYNC_ENABLED=false
# Smithery sets PORT=8081
EXPOSE 8081
CMD ["uv", "run", "python", "-m", "nextcloud_mcp_server.smithery_main"]
```
**nextcloud_mcp_server/smithery_main.py:**
```python
"""Smithery-specific entrypoint for stateless deployment."""
import os

import uvicorn
from starlette.middleware.cors import CORSMiddleware

from nextcloud_mcp_server.app import create_mcp_server
from nextcloud_mcp_server.config import DeploymentMode


def main():
    # Force stateless mode
    os.environ["SMITHERY_DEPLOYMENT"] = "true"
    os.environ["VECTOR_SYNC_ENABLED"] = "false"

    mcp = create_mcp_server(DeploymentMode.SMITHERY_STATELESS)
    app = mcp.streamable_http_app()

    # Add CORS for browser-based clients
    app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"],
        allow_credentials=True,
        allow_methods=["GET", "POST", "OPTIONS"],
        allow_headers=["*"],
        expose_headers=["mcp-session-id", "mcp-protocol-version"],
    )

    # Smithery sets PORT environment variable
    port = int(os.environ.get("PORT", 8081))
    uvicorn.run(app, host="0.0.0.0", port=port)


if __name__ == "__main__":
    main()
```
### Security Considerations
1. **App Passwords over User Passwords**
- Smithery config encourages app passwords (revocable, scoped)
- Documentation guides users to create dedicated app passwords
- App passwords can be revoked without changing main password
2. **HTTPS Required**
- `nextcloud_url` must be HTTPS for production use
- Validation rejects HTTP URLs in Smithery mode
3. **No Credential Storage**
- Credentials exist only for request duration
- No server-side persistence of user credentials
- Smithery handles secure config transmission
4. **Scope Limitation**
- Stateless mode cannot use the `offline_access` scope
- No background operations on user's behalf
- Clear user expectation: tools work during session only
### Migration Path
Users can start with Smithery stateless mode and migrate to self-hosted:
1. **Try on Smithery** → Basic tools, no setup
2. **Self-host for semantic search** → Add Qdrant, enable vector sync
3. **Full deployment** → Background sync, webhooks, multi-user OAuth
## Consequences
### Positive
1. **Lower barrier to entry** - Users can try without infrastructure
2. **Multi-user support** - Each session connects to different Nextcloud
3. **Smithery ecosystem** - Discovery, observability, OAuth UI
4. **Clear feature tiers** - Stateless (simple) vs self-hosted (full)
### Negative
1. **No semantic search** - Key differentiator unavailable on Smithery
2. **Per-request auth** - Credentials sent with each request
3. **No offline access** - Cannot perform background operations
4. **Maintenance burden** - Two deployment modes to support
### Neutral
1. **Feature subset** - May encourage users to self-host for full features
2. **Documentation needs** - Clear guidance on mode differences required
## Alternatives Considered
### 1. External MCP Only
**Approach:** Only support self-hosted external MCP registration on Smithery.
**Rejected because:**
- Higher barrier to entry for new users
- Misses opportunity for Smithery marketplace visibility
- Users want to try before committing to infrastructure
### 2. Embedded Vector DB (SQLite-vec)
**Approach:** Use SQLite with vector extensions for per-request indexing.
**Rejected because:**
- No persistence between requests anyway
- Indexing latency too high for synchronous requests
- Complexity without benefit in stateless context
### 3. External Vector DB Service
**Approach:** Connect to Pinecone/Weaviate Cloud from Smithery container.
**Rejected because:**
- Adds external dependency and cost
- Per-user collections require complex multi-tenancy
- Sync still impossible without background workers
### 4. Hybrid: Smithery + User's Qdrant
**Approach:** User provides their own Qdrant URL in session config.
**Considered for future:**
- Could enable semantic search for advanced users
- Adds complexity to session config
- Sync still requires external trigger (manual or webhook)
## References
- [Smithery Documentation](https://smithery.ai/docs)
- [Smithery Session Configuration](https://smithery.ai/docs/build/session-config)
- [Smithery External MCPs](https://smithery.ai/docs/build/external)
- [MCP Streamable HTTP Transport](https://modelcontextprotocol.io/docs/concepts/transports)
- [Nextcloud App Passwords](https://docs.nextcloud.com/server/latest/user_manual/en/session_management.html#app-passwords)
@@ -0,0 +1,338 @@
# Amazon Bedrock Setup Guide
This guide covers how to configure the Nextcloud MCP Server to use Amazon Bedrock for embeddings and text generation.
## Prerequisites
1. **AWS Account** with access to Amazon Bedrock
2. **boto3 library** installed: `pip install boto3` or `uv sync --group dev`
3. **Model Access** - Request access to models in AWS Bedrock console
## Required AWS Permissions
### IAM Policy for Bedrock Access
The AWS IAM user or role needs the following permissions:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockInvokeModels",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*"
      ]
    }
  ]
}
```
### Minimal Permissions (Production)
For production deployments, restrict to specific models:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockEmbeddings",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
      ]
    },
    {
      "Sid": "BedrockGeneration",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
      ]
    }
  ]
}
```
### Additional Permissions (Optional)
For advanced use cases:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockListModels",
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:GetFoundationModel"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockAsyncInvoke",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModelAsync",
        "bedrock:GetAsyncInvoke",
        "bedrock:ListAsyncInvokes"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*"
      ]
    }
  ]
}
```
## Model Access
Before using Bedrock models, you must request access in the AWS Console:
1. Navigate to **Amazon Bedrock** → **Model access**
2. Click **Manage model access**
3. Select models you want to use:
- **Embeddings:** Amazon Titan Embed Text, Cohere Embed
- **Text Generation:** Anthropic Claude, Meta Llama, Amazon Titan Text
4. Click **Request model access**
5. Wait for approval (usually instant for most models)
## Supported Models
### Embedding Models
| Provider | Model ID | Dimensions | Best For |
|----------|----------|------------|----------|
| Amazon Titan | `amazon.titan-embed-text-v1` | 1,536 | General purpose |
| Amazon Titan | `amazon.titan-embed-text-v2:0` | 1,024 | Latest, improved quality |
| Cohere | `cohere.embed-english-v3` | 1,024 | English text |
| Cohere | `cohere.embed-multilingual-v3` | 1,024 | Multilingual |
### Text Generation Models
| Provider | Model ID | Context | Best For |
|----------|----------|---------|----------|
| Anthropic | `anthropic.claude-3-sonnet-20240229-v1:0` | 200K | Balanced performance |
| Anthropic | `anthropic.claude-3-haiku-20240307-v1:0` | 200K | Fast, cost-effective |
| Anthropic | `anthropic.claude-3-opus-20240229-v1:0` | 200K | Highest quality |
| Meta | `meta.llama3-8b-instruct-v1:0` | 8K | Fast, open-source |
| Meta | `meta.llama3-70b-instruct-v1:0` | 8K | High quality |
| Amazon | `amazon.titan-text-express-v1` | 8K | Fast, low cost |
| Mistral | `mistral.mistral-7b-instruct-v0:2` | 32K | Efficient |
## Configuration
### Environment Variables
**Required:**
```bash
AWS_REGION=us-east-1
```
**Optional (at least one model required):**
```bash
# For embeddings
BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
# For text generation (RAG evaluation)
BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
```
**AWS Credentials (choose one method):**
**Method 1: Environment Variables**
```bash
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```
**Method 2: AWS Credentials File** (`~/.aws/credentials`)
```ini
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```
**Method 3: IAM Role** (when running on AWS EC2/ECS/Lambda)
- No credentials needed, uses instance/task role automatically
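A quick pre-flight check of these variables can catch misconfiguration before the server starts. The sketch below encodes only the rules stated in this section (region required, at least one model, key and secret set together); `check_bedrock_env` is a hypothetical helper and does not verify that credentials or model IDs are actually valid.

```python
import os
from collections.abc import Mapping


def check_bedrock_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return configuration problems found in the environment, if any."""
    problems: list[str] = []

    if not env.get("AWS_REGION"):
        problems.append("AWS_REGION is required")

    if not (env.get("BEDROCK_EMBEDDING_MODEL") or env.get("BEDROCK_GENERATION_MODEL")):
        problems.append("set BEDROCK_EMBEDDING_MODEL and/or BEDROCK_GENERATION_MODEL")

    # Static keys are optional (credentials file or IAM role also work),
    # but a key without its secret, or the reverse, is almost certainly a mistake.
    has_key = bool(env.get("AWS_ACCESS_KEY_ID"))
    has_secret = bool(env.get("AWS_SECRET_ACCESS_KEY"))
    if has_key != has_secret:
        problems.append("AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set together")

    return problems
```

An empty result does not guarantee that requests will succeed; IAM permissions and model access are checked by AWS only at call time.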
### Docker Configuration
Add to your `docker-compose.yml`:
```yaml
services:
  mcp:
    environment:
      - AWS_REGION=us-east-1
      - BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
      - BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
```
Or use AWS credentials file volume mount:
```yaml
services:
  mcp:
    volumes:
      - ~/.aws:/root/.aws:ro
    environment:
      - AWS_REGION=us-east-1
      - BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
```
## Usage Examples
### Embeddings Only
```bash
export AWS_REGION=us-east-1
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret
uv run nextcloud-mcp-server
```
### Both Embeddings and Generation
```bash
export AWS_REGION=us-east-1
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
export BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
# For RAG evaluation with Bedrock
export RAG_EVAL_PROVIDER=bedrock
export RAG_EVAL_BEDROCK_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
uv run python -m tests.rag_evaluation.evaluate
```
### Programmatic Usage
```python
from nextcloud_mcp_server.providers import BedrockProvider

# Embeddings only
provider = BedrockProvider(
    region_name="us-east-1",
    embedding_model="amazon.titan-embed-text-v2:0",
)
embeddings = await provider.embed_batch(["text1", "text2"])

# Both capabilities
provider = BedrockProvider(
    region_name="us-east-1",
    embedding_model="amazon.titan-embed-text-v2:0",
    generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
)

# Generate embeddings
embedding = await provider.embed("query text")

# Generate text
response = await provider.generate("Write a summary", max_tokens=500)
```
## Cost Considerations
### Embedding Costs (as of Jan 2025)
| Model | Price per 1K tokens |
|-------|---------------------|
| Titan Embed Text v2 | $0.0001 |
| Cohere Embed English v3 | $0.0001 |
### Generation Costs (as of Jan 2025)
| Model | Input (per 1K tokens) | Output (per 1K tokens) |
|-------|----------------------|------------------------|
| Claude 3 Haiku | $0.00025 | $0.00125 |
| Claude 3 Sonnet | $0.003 | $0.015 |
| Claude 3 Opus | $0.015 | $0.075 |
| Llama 3 8B | $0.0003 | $0.0006 |
| Titan Text Express | $0.0002 | $0.0006 |
**Note:** Prices vary by region. Check [AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/) for current rates.
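For rough budgeting, the per-1K-token prices can be turned into a small estimator. The sketch below copies a few prices from the tables above as they are listed there; the dictionary keys and function names are hypothetical labels for this example, and actual rates should be taken from the AWS pricing page.

```python
# (input price, output price) in USD per 1K tokens, copied from the table above.
GENERATION_PRICES_PER_1K = {
    "claude-3-haiku": (0.00025, 0.00125),
    "claude-3-sonnet": (0.003, 0.015),
    "claude-3-opus": (0.015, 0.075),
}


def generation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one generation request."""
    input_price, output_price = GENERATION_PRICES_PER_1K[model]
    return input_tokens / 1000 * input_price + output_tokens / 1000 * output_price


def embedding_cost(tokens: int, price_per_1k: float = 0.0001) -> float:
    """Estimate the USD cost of embedding a number of tokens."""
    return tokens / 1000 * price_per_1k
```

At the listed rates, a Claude 3 Sonnet request with 10,000 input tokens and 2,000 output tokens comes to 10 × $0.003 + 2 × $0.015 = $0.06, and embedding one million tokens at $0.0001 per 1K tokens comes to $0.10.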
## Troubleshooting
### Error: "Executable doesn't exist" or boto3 not found
**Solution:**
```bash
uv sync --group dev # Installs boto3
```
### Error: "AccessDeniedException"
**Causes:**
1. IAM permissions missing
2. Model access not requested
3. Wrong AWS region
**Solution:**
1. Verify IAM policy includes `bedrock:InvokeModel`
2. Request model access in Bedrock console
3. Check model is available in your region
### Error: "ResourceNotFoundException"
**Cause:** Invalid model ID or model not available in region
**Solution:**
- Verify model ID matches exactly (case-sensitive)
- Check model availability in your AWS region
- Use `aws bedrock list-foundation-models` to see available models
### Error: "ThrottlingException"
**Cause:** Rate limit exceeded
**Solution:**
- Reduce request rate
- Request quota increase via AWS Support
- Use batch operations where possible
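Reducing the request rate is commonly done with retries and exponential backoff. The following is a minimal sketch, not the provider's actual behavior: `ThrottledError` is a hypothetical stand-in for the throttling error raised by the AWS SDK, and `with_backoff` and its parameters are illustrative names.

```python
import asyncio
import random


class ThrottledError(Exception):
    """Hypothetical stand-in for a Bedrock ThrottlingException."""


async def with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0):
    """Retry an async callable on throttling, using exponential backoff with jitter."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller
            # Double the delay ceiling on each attempt, capped at max_delay
            delay = min(max_delay, base_delay * 2**attempt)
            # Full jitter spreads retries out so clients do not retry in lockstep
            await asyncio.sleep(random.uniform(0, delay))


async def demo():
    attempts = 0

    async def flaky():
        # Fails twice with a throttling error, then succeeds
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise ThrottledError("rate limit exceeded")
        return "ok"

    result = await with_backoff(flaky, base_delay=0.01)
    return result, attempts


print(asyncio.run(demo()))
```

In real code the `except` clause would need to inspect the SDK's error code rather than catch a custom exception type; that mapping is left out here on purpose.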
## Security Best Practices
1. **Use IAM Roles** when running on AWS infrastructure
2. **Rotate Access Keys** regularly if using IAM users
3. **Restrict Permissions** to only required models
4. **Enable CloudTrail** for audit logging
5. **Use AWS Secrets Manager** for credential management
6. **Monitor Costs** with AWS Cost Explorer and Budgets
## Regional Availability
Amazon Bedrock is available in several AWS regions, including:
- **US East (N. Virginia)**: `us-east-1` ✅ Most models
- **US West (Oregon)**: `us-west-2` ✅ Most models
- **Asia Pacific (Singapore)**: `ap-southeast-1`
- **Asia Pacific (Tokyo)**: `ap-northeast-1`
- **Europe (Frankfurt)**: `eu-central-1`
**Note:** Model availability varies by region. Check the [AWS Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html) for current availability.
## References
- [AWS Bedrock Documentation](https://docs.aws.amazon.com/bedrock/)
- [AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)
- [boto3 Bedrock Runtime API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html)
- [Provider Architecture ADR](./ADR-015-unified-provider-architecture.md)
+1 -1
View File
@@ -243,7 +243,7 @@ If you see cardinality warnings:
The observability stack integrates at multiple layers:
1. **HTTP Layer**: `ObservabilityMiddleware` tracks all HTTP requests
2. **MCP Layer**: Tools use `@trace_mcp_tool` for span creation
2. **MCP Layer**: Tools use `@instrument_tool` for automatic metrics and trace span creation
3. **Client Layer**: `BaseNextcloudClient` tracks all API calls
4. **OAuth Layer**: Token operations are traced and metered
5. **Background Tasks**: Vector sync operations emit metrics/traces
+93
View File
@@ -0,0 +1,93 @@
# Vector Sync UI Guide
This guide covers the browser-based interface for the Nextcloud MCP Server's semantic search and vector synchronization features.
## Overview
The Vector Sync UI (`/app`) provides an interactive interface to test semantic search queries and visualize results from your Nextcloud documents. It exposes the same retrieval capabilities that LLMs use in Retrieval-Augmented Generation (RAG) workflows, powered by Alpine.js for reactive state, htmx for dynamic updates, and Plotly.js for 3D visualization.
**Supported Apps**: Notes, Files (text/PDF), Calendar (events/tasks), Contacts (CardDAV), and Deck are indexed and searchable.
## Accessing the UI
Navigate to `/app` after authentication:
- **BasicAuth mode**: `http://localhost:8000/app` (uses credentials from environment)
- **OAuth mode**: `http://localhost:8000/app` (redirects to login if not authenticated)
## Tabs
### Welcome Page
Landing page that introduces semantic search and RAG workflows. Shows authentication status, explains how vector embeddings work, and provides feature navigation. Adapts content based on whether `VECTOR_SYNC_ENABLED=true`.
### User Info
Displays authentication details and session information:
- **BasicAuth**: Username, mode badge, Nextcloud host
- **OAuth**: Username, session ID (truncated), background access status, IdP profile, revocation option
### Vector Sync Status
Real-time monitoring of document indexing:
- **Indexed Documents**: Total chunks stored in Qdrant vector database (immediately searchable)
- **Pending Documents**: Queue awaiting embedding processing
- **Status**: "✓ Idle" (green) when up-to-date, "⟳ Syncing" (orange) during processing
Auto-refreshes every 10 seconds via htmx. Check this tab after adding content to verify indexing completion.
### Vector Visualization
Interactive search interface with 3D PCA plot of semantic space.
**Search Controls**:
- **Query**: Natural language search (e.g., "health benefits of coffee")
- **Algorithm**: Semantic (Dense) for pure vector search, or BM25 Hybrid (default) combining vectors + keywords
- **Fusion** (Hybrid only): RRF (Reciprocal Rank Fusion) or DBSF (Distribution-Based Score Fusion)
- **Advanced**: Filter by document type, adjust score threshold (0.0-1.0), set result limit (max 100)
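As a rough illustration of how RRF combines the dense and keyword rankings, the sketch below fuses two ranked lists of document IDs. It is not the server's implementation: the `rrf_fuse` helper, the document IDs, and the constant `k=60` (a conventional default) are assumptions for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); higher ranks contribute more
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, best first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rankings from the dense (semantic) and BM25 (keyword) searches
dense = ["note-1", "note-2", "note-3"]
keyword = ["note-2", "note-4", "note-1"]

for doc_id, score in rrf_fuse([dense, keyword]):
    print(f"{doc_id}: {score:.5f}")
```

RRF only looks at rank positions, so it is insensitive to the very different score scales of vector similarity and BM25. DBSF instead normalizes the raw scores, which can matter when one search is much more confident than the other.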
**3D Visualization**:
The plot uses Principal Component Analysis (PCA) to reduce 768-dimensional embeddings to 3D. Documents are positioned by semantic similarity with the query point shown in red. Point size and opacity indicate relevance, and the Viridis color scale shows relative scores (yellow = highest match).
**Critical Fix**: Vectors are L2-normalized before PCA to match Qdrant's cosine distance, ensuring query points position accurately near similar documents. Without normalization, magnitude differences cause misleading spatial separation.
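The effect of normalization can be checked with a few lines of arithmetic. The sketch below uses plain Python and made-up 3-dimensional vectors (real embeddings have far more dimensions) to show that two vectors pointing in the same direction are far apart in raw Euclidean space but coincide once L2-normalized, which is what cosine distance measures. The helper names are illustrative, and the vectors are assumed to be non-zero.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = [1.0, 2.0, 2.0]
doc = [10.0, 20.0, 20.0]  # Same direction as the query, 10x the magnitude

raw_gap = squared_distance(query, doc)
unit_gap = squared_distance(l2_normalize(query), l2_normalize(doc))

print(f"cosine similarity: {cosine_similarity(query, doc):.3f}")
print(f"squared distance (raw): {raw_gap:.1f}")
print(f"squared distance (normalized): {unit_gap:.6f}")
```

For unit vectors the squared Euclidean distance equals `2 - 2 * cosine_similarity`, so a projection such as PCA that works on Euclidean geometry preserves the same neighborhoods that cosine distance defines.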
**Results List**:
Each result shows document title (clickable link to Nextcloud), excerpt, raw score, relative percentage, and document type. Click "Show Chunk" to view the matched text segment with surrounding context (up to 500 characters before/after).
## Configuration
**Required**:
```bash
VECTOR_SYNC_ENABLED=true
```
**Optional** (for browser-accessible links):
```bash
NEXTCLOUD_PUBLIC_ISSUER_URL=https://your-public-nextcloud-url.com
```
**Admin Access**: The Webhooks tab is only visible to Nextcloud admins (verified via the Provisioning API).
## Use Cases
**Testing Search Queries**: Preview results before they reach LLMs in RAG workflows. Compare semantic vs. hybrid algorithms, verify relevance scores, and validate that correct documents are retrieved. Use chunk context to see exactly which text segments match and why unexpected documents appear.
**Monitoring Indexing**: Track real-time progress after creating or modifying documents. Check if the queue is backing up (high pending count) or confirm the system is idle after bulk imports. Verify documents become searchable immediately after indexing completes.
**Algorithm Comparison**: Pure semantic search excels at conceptual queries and synonyms. BM25 hybrid combines semantic understanding with precise keyword matching for better accuracy on specific terms. Experiment with RRF vs. DBSF fusion for different score distributions.
## Troubleshooting
**Vector Sync Tab Not Visible**: Set `VECTOR_SYNC_ENABLED=true` and restart the server.
**No Search Results**: Check Vector Sync Status to confirm documents are indexed (not just pending). Try broader queries or lower the score threshold in Advanced options. Initial indexing may take time depending on document volume.
**Links to Nextcloud Apps Not Working**: Set `NEXTCLOUD_PUBLIC_ISSUER_URL` to your browser-accessible Nextcloud URL for correct link generation.
## Related Documentation
- [Configuration Guide](../configuration.md) - Environment variables and settings
- [Authentication Modes](../authentication.md) - BasicAuth vs OAuth setup
- [Installation Guide](../installation.md) - Getting started
- [ADR-008: MCP Sampling for RAG](../ADR-008-mcp-sampling-for-rag.md) - Technical details on RAG workflows
+493 -309
View File
@@ -3,6 +3,7 @@ import os
import time
from collections.abc import AsyncIterator
from contextlib import AsyncExitStack, asynccontextmanager
from contextvars import ContextVar
from dataclasses import dataclass
from typing import TYPE_CHECKING, Optional
@@ -24,6 +25,9 @@ from starlette.middleware.authentication import AuthenticationMiddleware
from starlette.middleware.cors import CORSMiddleware
from starlette.responses import JSONResponse, RedirectResponse
from starlette.routing import Mount, Route
from starlette.staticfiles import StaticFiles
from starlette.types import ASGIApp, Receive, Send
from starlette.types import Scope as StarletteScope
from nextcloud_mcp_server.auth import (
InsufficientScopeError,
@@ -35,6 +39,8 @@ from nextcloud_mcp_server.auth import (
from nextcloud_mcp_server.auth.unified_verifier import UnifiedTokenVerifier
from nextcloud_mcp_server.client import NextcloudClient
from nextcloud_mcp_server.config import (
DeploymentMode,
get_deployment_mode,
get_document_processor_config,
get_settings,
)
@@ -54,6 +60,7 @@ from nextcloud_mcp_server.server import (
configure_contacts_tools,
configure_cookbook_tools,
configure_deck_tools,
configure_news_tools,
configure_notes_tools,
configure_semantic_tools,
configure_sharing_tools,
@@ -121,6 +128,26 @@ def initialize_document_processors():
except Exception as e:
logger.warning(f"Failed to register Tesseract processor: {e}")
# Register PyMuPDF processor (high priority, local, no API required)
if "pymupdf" in config["processors"]:
pymupdf_config = config["processors"]["pymupdf"]
try:
from nextcloud_mcp_server.document_processors.pymupdf import (
PyMuPDFProcessor,
)
processor = PyMuPDFProcessor(
extract_images=pymupdf_config.get("extract_images", True),
image_dir=pymupdf_config.get("image_dir"),
)
registry.register(processor, priority=15) # Higher than unstructured
logger.info(
f"Registered PyMuPDF processor: extract_images={pymupdf_config.get('extract_images', True)}"
)
registered_count += 1
except Exception as e:
logger.warning(f"Failed to register PyMuPDF processor: {e}")
# Register custom processor
if "custom" in config["processors"]:
custom_config = config["processors"]["custom"]
@@ -217,6 +244,25 @@ def validate_pkce_support(discovery: dict, discovery_url: str) -> None:
click.echo(f"✓ PKCE support validated: {code_challenge_methods}")
@dataclass
class VectorSyncState:
    """
    Module-level state for vector sync background tasks.
    This singleton bridges the Starlette server lifespan (where background tasks run)
    and FastMCP session lifespans (where MCP tools need access to the streams).
    """
    document_send_stream: Optional[MemoryObjectSendStream] = None
    document_receive_stream: Optional[MemoryObjectReceiveStream] = None
    shutdown_event: Optional[anyio.Event] = None
    scanner_wake_event: Optional[anyio.Event] = None
# Module-level singleton for vector sync state
_vector_sync_state = VectorSyncState()
@dataclass
class AppContext:
"""Application context for BasicAuth mode."""
@@ -243,17 +289,160 @@ class OAuthAppContext:
)
@dataclass
class SmitheryAppContext:
    """Application context for Smithery stateless mode.
    ADR-016: No shared client - clients created per-request from session config.
    """
    pass  # No shared state needed - everything comes from session config
# ADR-016: Smithery config schema for container runtime
# This schema is served at /.well-known/mcp-config for Smithery discovery
# See: https://smithery.ai/docs/build/session-config
SMITHERY_CONFIG_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "https://server.smithery.ai/nextcloud-mcp-server/.well-known/mcp-config",
    "title": "Nextcloud MCP Server Configuration",
    "description": "Configuration for connecting to your Nextcloud instance via app password authentication",
    "x-query-style": "flat",  # Our schema has no nested objects, so flat style works
    "type": "object",
    "required": ["nextcloud_url", "username", "app_password"],
    "properties": {
        "nextcloud_url": {
            "type": "string",
            "title": "Nextcloud URL",
            "description": "Your Nextcloud instance URL (e.g., https://cloud.example.com). Must be publicly accessible.",
            "pattern": "^https?://.+",
        },
        "username": {
            "type": "string",
            "title": "Username",
            "description": "Your Nextcloud username",
            "minLength": 1,
        },
        "app_password": {
            "type": "string",
            "title": "App Password",
            "description": "Nextcloud app password. Generate at Settings > Security > App passwords. Do NOT use your main password.",
            "minLength": 1,
        },
    },
    "additionalProperties": False,
}
# ADR-016: Context variable to hold Smithery session config per-request
# This is set by SmitheryConfigMiddleware and accessed in context.py
_smithery_session_config: ContextVar[dict[str, str] | None] = ContextVar(
    "smithery_session_config", default=None
)
def get_smithery_session_config() -> dict | None:
    """Get the current Smithery session config from context variable.
    Used by context.py to access config extracted from URL query parameters.
    """
    return _smithery_session_config.get()
class SmitheryConfigMiddleware:
    """Middleware to extract Smithery config from URL query parameters.
    ADR-016: For container runtime, Smithery passes configuration as URL query
    parameters to the /mcp endpoint. This middleware extracts those parameters
    and stores them in a context variable for access in tools.
    Configuration parameters:
    - nextcloud_url: Nextcloud instance URL
    - username: Nextcloud username
    - app_password: Nextcloud app password
    The extracted config is stored in a ContextVar and can be accessed via
    get_smithery_session_config() in context.py.
    """
    def __init__(self, app: ASGIApp):
        self.app = app
    async def __call__(
        self, scope: StarletteScope, receive: Receive, send: Send
    ) -> None:
        if scope["type"] == "http":
            # Extract config from query parameters
            from urllib.parse import parse_qs
            query_string = scope.get("query_string", b"").decode("utf-8")
            params = parse_qs(query_string)
            # Build session config from query parameters
            # Smithery uses dot notation for nested objects, but our schema is flat
            session_config = {}
            for key in ["nextcloud_url", "username", "app_password"]:
                if key in params:
                    # parse_qs returns lists, take first value
                    session_config[key] = params[key][0]
            # Store in context variable for access by context.py
            if session_config:
                _smithery_session_config.set(session_config)
                logger.debug(
                    f"Smithery config extracted: nextcloud_url={session_config.get('nextcloud_url')}, "
                    f"username={session_config.get('username')}"
                )
        try:
            await self.app(scope, receive, send)
        finally:
            # Clear context variable after request
            _smithery_session_config.set(None)
@asynccontextmanager
async def app_lifespan_smithery(server: FastMCP) -> AsyncIterator[SmitheryAppContext]:
    """
    Manage application lifecycle for Smithery stateless mode.
    ADR-016: Minimal lifespan with no shared state.
    - No shared Nextcloud client (created per-request from session config)
    - No vector sync (disabled in Smithery mode)
    - No persistent storage (stateless deployment)
    - No document processors (not enabled in Smithery mode)
    """
    logger.info("Starting MCP server in Smithery stateless mode")
    logger.info("Clients will be created per-request from session config")
    try:
        yield SmitheryAppContext()
    finally:
        logger.info("Shutting down Smithery stateless mode")
def is_oauth_mode() -> bool:
    """
    Determine if OAuth mode should be used.
    OAuth mode is enabled when:
    - NEXTCLOUD_USERNAME and NEXTCLOUD_PASSWORD are NOT set
    - AND we are NOT in Smithery stateless mode
    - Or explicitly enabled via configuration
    Returns:
        True if OAuth mode, False if BasicAuth mode
    """
    # ADR-016: Smithery stateless mode uses per-request BasicAuth from session config
    # It's not OAuth mode even though env credentials aren't set
    deployment_mode = get_deployment_mode()
    if deployment_mode == DeploymentMode.SMITHERY_STATELESS:
        logger.info(
            "BasicAuth mode (Smithery stateless - credentials from session config)"
        )
        return False
    username = os.getenv("NEXTCLOUD_USERNAME")
    password = os.getenv("NEXTCLOUD_PASSWORD")
@@ -326,7 +515,7 @@ async def load_oauth_client_credentials(
# and the authorization server will limit them to these allowed scopes.
#
# The PRM endpoint advertises the same scopes dynamically via @require_scopes decorators.
dcr_scopes = "openid profile email notes:read notes:write calendar:read calendar:write todo:read todo:write contacts:read contacts:write cookbook:read cookbook:write deck:read deck:write tables:read tables:write files:read files:write sharing:read sharing:write"
dcr_scopes = "openid profile email notes:read notes:write calendar:read calendar:write todo:read todo:write contacts:read contacts:write cookbook:read cookbook:write deck:read deck:write tables:read tables:write files:read files:write sharing:read sharing:write news:read news:write"
# Add offline_access scope if refresh tokens are enabled
enable_offline_access = os.getenv("ENABLE_OFFLINE_ACCESS", "false").lower() in (
@@ -386,15 +575,15 @@ async def load_oauth_client_credentials(
@asynccontextmanager
async def app_lifespan_basic(server: FastMCP) -> AsyncIterator[AppContext]:
"""
Manage application lifecycle for BasicAuth mode.
Manage application lifecycle for BasicAuth mode (FastMCP session lifespan).
Creates a single Nextcloud client with basic authentication
that is shared across all requests.
that is shared across all requests within a session.
If vector sync is enabled (VECTOR_SYNC_ENABLED=true), also starts
background tasks for automatic document indexing (ADR-007).
Note: Background tasks (scanner, processor) are started at server level
in starlette_lifespan, not here. This lifespan runs per-session.
"""
logger.info("Starting MCP server in BasicAuth mode")
logger.info("Starting MCP session in BasicAuth mode")
logger.info("Creating Nextcloud client with BasicAuth")
client = NextcloudClient.from_env()
@@ -410,91 +599,20 @@ async def app_lifespan_basic(server: FastMCP) -> AsyncIterator[AppContext]:
# Initialize document processors
initialize_document_processors()
settings = get_settings()
# Check if vector sync is enabled
if settings.vector_sync_enabled:
logger.info("Vector sync enabled - starting background tasks")
# Get username from environment for BasicAuth mode
username = os.getenv("NEXTCLOUD_USERNAME")
if not username:
raise ValueError(
"NEXTCLOUD_USERNAME is required for vector sync in BasicAuth mode"
)
# Initialize Qdrant collection before starting background tasks
logger.info("Initializing Qdrant collection...")
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
try:
await get_qdrant_client() # Triggers collection creation if needed
logger.info("Qdrant collection ready")
except Exception as e:
logger.error(f"Failed to initialize Qdrant collection: {e}")
raise RuntimeError(
f"Cannot start vector sync - Qdrant initialization failed: {e}"
) from e
# Initialize shared state
send_stream, receive_stream = anyio.create_memory_object_stream(
max_buffer_size=settings.vector_sync_queue_max_size
# Yield client context - scanner runs at server level (starlette_lifespan)
# Include vector sync state from module singleton (set by starlette_lifespan)
try:
yield AppContext(
client=client,
storage=storage,
document_send_stream=_vector_sync_state.document_send_stream,
document_receive_stream=_vector_sync_state.document_receive_stream,
shutdown_event=_vector_sync_state.shutdown_event,
scanner_wake_event=_vector_sync_state.scanner_wake_event,
)
shutdown_event = anyio.Event()
scanner_wake_event = anyio.Event()
# Start background tasks using anyio TaskGroup
async with anyio.create_task_group() as tg:
# Start scanner task
await tg.start(
scanner_task,
send_stream,
shutdown_event,
scanner_wake_event,
client,
username,
)
# Start processor pool (each gets a cloned receive stream)
for i in range(settings.vector_sync_processor_workers):
await tg.start(
processor_task,
i,
receive_stream.clone(),
shutdown_event,
client,
username,
)
logger.info(
f"Background sync tasks started: 1 scanner + {settings.vector_sync_processor_workers} processors"
)
# Yield with background tasks running
try:
yield AppContext(
client=client,
storage=storage,
document_send_stream=send_stream,
document_receive_stream=receive_stream,
shutdown_event=shutdown_event,
scanner_wake_event=scanner_wake_event,
)
finally:
# Shutdown signal
logger.info("Shutting down background sync tasks")
shutdown_event.set()
# TaskGroup automatically cancels all tasks on exit
logger.info("Background sync tasks stopped")
await client.close()
else:
# No vector sync - simple lifecycle
try:
yield AppContext(client=client, storage=storage)
finally:
logger.info("Shutting down BasicAuth mode")
await client.close()
finally:
logger.info("Shutting down BasicAuth session")
await client.close()
async def setup_oauth_config():
@@ -810,7 +928,7 @@ async def setup_oauth_config():
)
def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
def get_app(transport: str = "streamable-http", enabled_apps: list[str] | None = None):
# Initialize observability (logging will be configured by uvicorn)
settings = get_settings()
@@ -837,8 +955,9 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
"OpenTelemetry tracing disabled (set OTEL_EXPORTER_OTLP_ENDPOINT to enable)"
)
# Determine authentication mode
# Determine authentication mode and deployment mode
oauth_enabled = is_oauth_mode()
deployment_mode = get_deployment_mode()
if oauth_enabled:
logger.info("Configuring MCP server for OAuth mode")
@@ -899,8 +1018,17 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
auth=auth_settings,
)
else:
logger.info("Configuring MCP server for BasicAuth mode")
mcp = FastMCP("Nextcloud MCP", lifespan=app_lifespan_basic)
# ADR-016: Use Smithery lifespan for stateless mode, BasicAuth otherwise
if deployment_mode == DeploymentMode.SMITHERY_STATELESS:
logger.info("Configuring MCP server for Smithery stateless mode")
# json_response=True returns plain JSON-RPC instead of SSE format,
# required for Smithery scanner compatibility
mcp = FastMCP(
"Nextcloud MCP", lifespan=app_lifespan_smithery, json_response=True
)
else:
logger.info("Configuring MCP server for BasicAuth mode")
mcp = FastMCP("Nextcloud MCP", lifespan=app_lifespan_basic)
@mcp.resource("nc://capabilities")
async def nc_get_capabilities():
@@ -919,6 +1047,7 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
"contacts": configure_contacts_tools,
"cookbook": configure_cookbook_tools,
"deck": configure_deck_tools,
"news": configure_news_tools,
}
# If no specific apps are specified, enable all
@@ -936,8 +1065,12 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
)
# Register semantic search tools (cross-app feature)
# ADR-016: Skip in Smithery stateless mode (no vector database)
settings = get_settings()
if settings.vector_sync_enabled:
deployment_mode = get_deployment_mode()
if deployment_mode == DeploymentMode.SMITHERY_STATELESS:
logger.info("Skipping semantic search tools (Smithery stateless mode)")
elif settings.vector_sync_enabled:
logger.info("Configuring semantic search tools (vector sync enabled)")
configure_semantic_tools(mcp)
else:
@@ -1014,180 +1147,177 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
"Dynamic tool filtering enabled for OAuth mode (JWT and Bearer tokens)"
)
if transport == "sse":
mcp_app = mcp.sse_app()
starlette_lifespan = None
elif transport in ("http", "streamable-http"):
mcp_app = mcp.streamable_http_app()
mcp_app = mcp.streamable_http_app()
@asynccontextmanager
async def starlette_lifespan(app: Starlette):
# Set OAuth context for OAuth login routes (ADR-004)
if oauth_enabled:
# Prepare OAuth config from setup_oauth_config closure variables
mcp_server_url = os.getenv(
"NEXTCLOUD_MCP_SERVER_URL", "http://localhost:8000"
)
nextcloud_resource_uri = os.getenv(
"NEXTCLOUD_RESOURCE_URI", nextcloud_host
)
discovery_url = os.getenv(
"OIDC_DISCOVERY_URL",
f"{nextcloud_host}/.well-known/openid-configuration",
)
scopes = os.getenv("NEXTCLOUD_OIDC_SCOPES", "")
@asynccontextmanager
async def starlette_lifespan(app: Starlette):
# Set OAuth context for OAuth login routes (ADR-004)
if oauth_enabled:
# Prepare OAuth config from setup_oauth_config closure variables
mcp_server_url = os.getenv(
"NEXTCLOUD_MCP_SERVER_URL", "http://localhost:8000"
)
nextcloud_resource_uri = os.getenv("NEXTCLOUD_RESOURCE_URI", nextcloud_host)
discovery_url = os.getenv(
"OIDC_DISCOVERY_URL",
f"{nextcloud_host}/.well-known/openid-configuration",
)
scopes = os.getenv("NEXTCLOUD_OIDC_SCOPES", "")
oauth_context_dict = {
"storage": refresh_token_storage,
"oauth_client": oauth_client,
"token_verifier": token_verifier, # For querying IdP userinfo endpoint
"config": {
"mcp_server_url": mcp_server_url,
"discovery_url": discovery_url,
"client_id": client_id, # From setup_oauth_config (DCR or static)
"client_secret": client_secret, # From setup_oauth_config (DCR or static)
"scopes": scopes,
"nextcloud_host": nextcloud_host,
"nextcloud_resource_uri": nextcloud_resource_uri,
"oauth_provider": oauth_provider,
},
}
app.state.oauth_context = oauth_context_dict
oauth_context_dict = {
"storage": refresh_token_storage,
"oauth_client": oauth_client,
"token_verifier": token_verifier, # For querying IdP userinfo endpoint
"config": {
"mcp_server_url": mcp_server_url,
"discovery_url": discovery_url,
"client_id": client_id, # From setup_oauth_config (DCR or static)
"client_secret": client_secret, # From setup_oauth_config (DCR or static)
"scopes": scopes,
"nextcloud_host": nextcloud_host,
"nextcloud_resource_uri": nextcloud_resource_uri,
"oauth_provider": oauth_provider,
},
}
app.state.oauth_context = oauth_context_dict
# Also set oauth_context on browser_app for session authentication
# browser_app is in the same function scope (defined later in create_app)
# We need to find it in the mounted routes
for route in app.routes:
if isinstance(route, Mount) and route.path == "/app":
route.app.state.oauth_context = oauth_context_dict
logger.info(
"OAuth context shared with browser_app for session auth"
)
break
logger.info(
f"OAuth context initialized for login routes (client_id={client_id[:16]}...)"
)
else:
# BasicAuth mode - share storage with browser_app for webhook management
from nextcloud_mcp_server.auth.storage import RefreshTokenStorage
storage = RefreshTokenStorage.from_env()
await storage.initialize()
app.state.storage = storage
# Also share with browser_app for webhook routes
for route in app.routes:
if isinstance(route, Mount) and route.path == "/app":
route.app.state.storage = storage
logger.info(
"Storage shared with browser_app for webhook management"
)
break
# Start background vector sync tasks for BasicAuth mode (ADR-007)
# For streamable-http transport, FastMCP lifespan isn't automatically triggered
# so we manually start background tasks here if vector sync is enabled
import anyio as anyio_module
settings = get_settings()
if not oauth_enabled and settings.vector_sync_enabled:
logger.info("Starting background vector sync tasks for BasicAuth mode")
# Get username from environment
username = os.getenv("NEXTCLOUD_USERNAME")
if not username:
raise ValueError(
"NEXTCLOUD_USERNAME required for vector sync in BasicAuth mode"
# Also set oauth_context on browser_app for session authentication
# browser_app is in the same function scope (defined later in create_app)
# We need to find it in the mounted routes
for route in app.routes:
if isinstance(route, Mount) and route.path == "/app":
route.app.state.oauth_context = oauth_context_dict
logger.info(
"OAuth context shared with browser_app for session auth"
)
break
# Get Nextcloud client from MCP app context
# Create client since we're outside FastMCP lifespan
client = NextcloudClient.from_env()
logger.info(
f"OAuth context initialized for login routes (client_id={client_id[:16]}...)"
)
else:
# BasicAuth mode - share storage with browser_app for webhook management
from nextcloud_mcp_server.auth.storage import RefreshTokenStorage
# Initialize Qdrant collection before starting background tasks
logger.info("Initializing Qdrant collection...")
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
storage = RefreshTokenStorage.from_env()
await storage.initialize()
try:
await get_qdrant_client() # Triggers collection creation if needed
logger.info("Qdrant collection ready")
except Exception as e:
logger.error(f"Failed to initialize Qdrant collection: {e}")
raise RuntimeError(
f"Cannot start vector sync - Qdrant initialization failed: {e}"
) from e
app.state.storage = storage
# Initialize shared state
send_stream, receive_stream = anyio_module.create_memory_object_stream(
max_buffer_size=settings.vector_sync_queue_max_size
# Also share with browser_app for webhook routes
for route in app.routes:
if isinstance(route, Mount) and route.path == "/app":
route.app.state.storage = storage
logger.info(
"Storage shared with browser_app for webhook management"
)
break
# Start background vector sync tasks for BasicAuth mode (ADR-007)
# Scanner runs at server-level (once), not per-session
import anyio as anyio_module
settings = get_settings()
if not oauth_enabled and settings.vector_sync_enabled:
logger.info("Starting background vector sync tasks for BasicAuth mode")
# Get username from environment
username = os.getenv("NEXTCLOUD_USERNAME")
if not username:
raise ValueError(
"NEXTCLOUD_USERNAME required for vector sync in BasicAuth mode"
)
shutdown_event = anyio_module.Event()
scanner_wake_event = anyio_module.Event()
# Store in app state for access from routes (ADR-007)
app.state.document_send_stream = send_stream
app.state.document_receive_stream = receive_stream
app.state.shutdown_event = shutdown_event
app.state.scanner_wake_event = scanner_wake_event
# Create client for vector sync (server-level, not per-session)
client = NextcloudClient.from_env()
# Also share with browser_app for /app route
for route in app.routes:
if isinstance(route, Mount) and route.path == "/app":
route.app.state.document_send_stream = send_stream
route.app.state.document_receive_stream = receive_stream
route.app.state.shutdown_event = shutdown_event
route.app.state.scanner_wake_event = scanner_wake_event
logger.info(
"Vector sync state shared with browser_app for /app"
)
break
# Initialize Qdrant collection before starting background tasks
logger.info("Initializing Qdrant collection...")
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
# Start background tasks using anyio TaskGroup
async with anyio_module.create_task_group() as tg:
# Start scanner task
try:
await get_qdrant_client() # Triggers collection creation if needed
logger.info("Qdrant collection ready")
except Exception as e:
logger.error(f"Failed to initialize Qdrant collection: {e}")
raise RuntimeError(
f"Cannot start vector sync - Qdrant initialization failed: {e}"
) from e
# Initialize shared state
send_stream, receive_stream = anyio_module.create_memory_object_stream(
max_buffer_size=settings.vector_sync_queue_max_size
)
shutdown_event = anyio_module.Event()
scanner_wake_event = anyio_module.Event()
# Store in app state for access from routes (ADR-007)
app.state.document_send_stream = send_stream
app.state.document_receive_stream = receive_stream
app.state.shutdown_event = shutdown_event
app.state.scanner_wake_event = scanner_wake_event
# Also store in module singleton for FastMCP session lifespans
_vector_sync_state.document_send_stream = send_stream
_vector_sync_state.document_receive_stream = receive_stream
_vector_sync_state.shutdown_event = shutdown_event
_vector_sync_state.scanner_wake_event = scanner_wake_event
logger.info("Vector sync state stored in module singleton")
# Also share with browser_app for /app route
for route in app.routes:
if isinstance(route, Mount) and route.path == "/app":
route.app.state.document_send_stream = send_stream
route.app.state.document_receive_stream = receive_stream
route.app.state.shutdown_event = shutdown_event
route.app.state.scanner_wake_event = scanner_wake_event
logger.info("Vector sync state shared with browser_app for /app")
break
# Start background tasks using anyio TaskGroup
async with anyio_module.create_task_group() as tg:
# Start scanner task
await tg.start(
scanner_task,
send_stream,
shutdown_event,
scanner_wake_event,
client,
username,
)
# Start processor pool (each gets a cloned receive stream)
for i in range(settings.vector_sync_processor_workers):
await tg.start(
scanner_task,
send_stream,
processor_task,
i,
receive_stream.clone(),
shutdown_event,
scanner_wake_event,
client,
username,
)
# Start processor pool (each gets a cloned receive stream)
for i in range(settings.vector_sync_processor_workers):
await tg.start(
processor_task,
i,
receive_stream.clone(),
shutdown_event,
client,
username,
)
logger.info(
f"Background sync tasks started: 1 scanner + "
f"{settings.vector_sync_processor_workers} processors"
)
logger.info(
f"Background sync tasks started: 1 scanner + "
f"{settings.vector_sync_processor_workers} processors"
)
# Run MCP session manager and yield
async with AsyncExitStack() as stack:
await stack.enter_async_context(mcp.session_manager.run())
try:
yield
finally:
# Shutdown signal
logger.info("Shutting down background sync tasks")
shutdown_event.set()
await client.close()
# TaskGroup automatically cancels all tasks on exit
else:
# No vector sync - just run MCP session manager
# Run MCP session manager and yield
async with AsyncExitStack() as stack:
await stack.enter_async_context(mcp.session_manager.run())
yield
try:
yield
finally:
# Shutdown signal
logger.info("Shutting down background sync tasks")
shutdown_event.set()
await client.close()
# TaskGroup automatically cancels all tasks on exit
else:
# No vector sync - just run MCP session manager
async with AsyncExitStack() as stack:
await stack.enter_async_context(mcp.session_manager.run())
yield
# Health check endpoints for Kubernetes probes
def health_live(request):
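Review note: the lifespan hunk above wires one scanner to a pool of processors through a bounded stream and signals shutdown with an event. The sketch below mirrors that shape only. It swaps in the standard-library `asyncio.Queue` for the anyio memory object stream so it runs without dependencies, and `scanner`, `processor`, and `run_sync` are hypothetical names, not the project's `scanner_task` or `processor_task`.

```python
import asyncio


async def scanner(queue: asyncio.Queue, shutdown: asyncio.Event, docs: list[str]) -> None:
    """Producer: enqueue document ids, stopping early once shutdown is set."""
    for doc_id in docs:
        if shutdown.is_set():
            break
        await queue.put(doc_id)  # waits while the bounded queue is full


async def processor(worker_id: int, queue: asyncio.Queue, processed: list) -> None:
    """Consumer: handle queued documents until the task is cancelled."""
    while True:
        doc_id = await queue.get()
        try:
            processed.append((worker_id, doc_id))  # stand-in for real indexing work
        finally:
            queue.task_done()


async def run_sync(docs: list[str], workers: int = 2, max_queue: int = 4) -> list:
    """Run one scanner and `workers` processors, then shut the pool down."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=max_queue)
    shutdown = asyncio.Event()
    processed: list = []
    pool = [asyncio.create_task(processor(i, queue, processed)) for i in range(workers)]
    try:
        await scanner(queue, shutdown, docs)
        await queue.join()  # wait until every queued item has been processed
    finally:
        shutdown.set()  # signal shutdown, then cancel the workers
        for task in pool:
            task.cancel()
        await asyncio.gather(*pool, return_exceptions=True)
    return processed
```

The bounded queue gives the same backpressure as `max_buffer_size` in the hunk: the scanner pauses when the processors fall behind instead of buffering without limit.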
@@ -1340,6 +1470,26 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
)
logger.info("Test webhook endpoint enabled: /webhooks/nextcloud")
# ADR-016: Add Smithery well-known config endpoint for container runtime discovery
if deployment_mode == DeploymentMode.SMITHERY_STATELESS:
def smithery_mcp_config(request):
"""Smithery MCP configuration endpoint.
Returns JSON Schema for Smithery's configuration UI.
This endpoint is required for Smithery container runtime discovery.
"""
return JSONResponse(SMITHERY_CONFIG_SCHEMA)
routes.append(
Route(
"/.well-known/mcp-config",
smithery_mcp_config,
methods=["GET"],
)
)
logger.info("Smithery config endpoint enabled: /.well-known/mcp-config")
# Note: Metrics endpoint is NOT exposed on main HTTP port for security reasons.
# Metrics are served on dedicated port via setup_metrics() (default: 9090)
@@ -1470,71 +1620,98 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
)
# Add user info routes (available in both BasicAuth and OAuth modes)
# These require session authentication, so we wrap them in a separate app
from nextcloud_mcp_server.auth.session_backend import SessionAuthBackend
from nextcloud_mcp_server.auth.userinfo_routes import (
revoke_session,
user_info_html,
vector_sync_status_fragment,
)
from nextcloud_mcp_server.auth.viz_routes import (
vector_visualization_html,
vector_visualization_search,
)
from nextcloud_mcp_server.auth.webhook_routes import (
disable_webhook_preset,
enable_webhook_preset,
webhook_management_pane,
)
# Create a separate Starlette app for browser routes that need session auth
# This prevents SessionAuthBackend from interfering with FastMCP's OAuth
browser_routes = [
Route("/", user_info_html, methods=["GET"]), # /app → webapp (HTML UI)
Route(
"/revoke", revoke_session, methods=["POST"], name="revoke_session_endpoint"
), # /app/revoke → revoke_session
# Vector sync status fragment (htmx polling)
Route(
"/vector-sync/status",
# ADR-016: Skip /app admin UI in Smithery stateless mode (no vector sync, webhooks)
if deployment_mode != DeploymentMode.SMITHERY_STATELESS:
# These require session authentication, so we wrap them in a separate app
from nextcloud_mcp_server.auth.session_backend import SessionAuthBackend
from nextcloud_mcp_server.auth.userinfo_routes import (
revoke_session,
user_info_html,
vector_sync_status_fragment,
methods=["GET"],
), # /app/vector-sync/status
# Vector visualization routes
Route(
"/vector-viz", vector_visualization_html, methods=["GET"]
), # /app/vector-viz
Route(
"/vector-viz/search",
)
from nextcloud_mcp_server.auth.viz_routes import (
chunk_context_endpoint,
vector_visualization_html,
vector_visualization_search,
methods=["GET"],
), # /app/vector-viz/search
# Webhook management routes (admin-only)
Route("/webhooks", webhook_management_pane, methods=["GET"]), # /app/webhooks
Route(
"/webhooks/enable/{preset_id:str}", enable_webhook_preset, methods=["POST"]
),
Route(
"/webhooks/disable/{preset_id:str}",
)
from nextcloud_mcp_server.auth.webhook_routes import (
disable_webhook_preset,
methods=["DELETE"],
),
]
enable_webhook_preset,
webhook_management_pane,
)
browser_app = Starlette(routes=browser_routes)
browser_app.add_middleware(
AuthenticationMiddleware,
backend=SessionAuthBackend(oauth_enabled=oauth_enabled),
)
# Create a separate Starlette app for browser routes that need session auth
# This prevents SessionAuthBackend from interfering with FastMCP's OAuth
browser_routes = [
Route(
"/", user_info_html, methods=["GET"]
), # /app → user info with all tabs
Route(
"/revoke",
revoke_session,
methods=["POST"],
name="revoke_session_endpoint",
), # /app/revoke → revoke_session
# Vector sync status fragment (htmx polling)
Route(
"/vector-sync/status",
vector_sync_status_fragment,
methods=["GET"],
), # /app/vector-sync/status
# Vector visualization routes
Route(
"/vector-viz", vector_visualization_html, methods=["GET"]
), # /app/vector-viz
Route(
"/vector-viz/search",
vector_visualization_search,
methods=["GET"],
), # /app/vector-viz/search
Route(
"/chunk-context",
chunk_context_endpoint,
methods=["GET"],
), # /app/chunk-context
# Webhook management routes (admin-only)
Route(
"/webhooks", webhook_management_pane, methods=["GET"]
), # /app/webhooks
Route(
"/webhooks/enable/{preset_id:str}",
enable_webhook_preset,
methods=["POST"],
),
Route(
"/webhooks/disable/{preset_id:str}",
disable_webhook_preset,
methods=["DELETE"],
),
]
# Add redirect from /app to /app/ (Starlette requires trailing slash for mounted apps)
routes.append(
Route("/app", lambda request: RedirectResponse("/app/", status_code=307))
)
# Add static files mount if directory exists
static_dir = os.path.join(os.path.dirname(__file__), "auth", "static")
if os.path.isdir(static_dir):
browser_routes.append(
Mount("/static", StaticFiles(directory=static_dir), name="static")
)
logger.info(f"Mounted static files from {static_dir}")
# Mount browser app at /app (webapp and admin routes)
routes.append(Mount("/app", app=browser_app))
logger.info("App routes with session auth: /app, /app/webhooks, /app/revoke")
browser_app = Starlette(routes=browser_routes)
browser_app.add_middleware(
AuthenticationMiddleware, # type: ignore[invalid-argument-type]
backend=SessionAuthBackend(oauth_enabled=oauth_enabled),
)
# Add redirect from /app to /app/ (Starlette requires trailing slash for mounted apps)
routes.append(
Route("/app", lambda request: RedirectResponse("/app/", status_code=307))
)
# Mount browser app at /app (webapp and admin routes)
routes.append(Mount("/app", app=browser_app))
logger.info("App routes with session auth: /app, /app/webhooks, /app/revoke")
else:
logger.info("Admin UI (/app) disabled in Smithery stateless mode")
# Mount FastMCP at root last (catch-all, handles OAuth via token_verifier)
routes.append(Mount("/", app=mcp_app))
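Review note: the comment above says FastMCP is mounted at `/` last because it is a catch-all. A simplified model of first-match prefix routing shows why the order matters. `resolve` and `ROUTES` are hypothetical and only approximate how Starlette walks its route list.

```python
def resolve(path: str, mounts: list[tuple[str, str]]) -> str:
    """Return the name of the first mount whose prefix matches the path."""
    for prefix, name in mounts:
        base = prefix.rstrip("/")
        # A mount matches paths below its prefix; "/" therefore matches everything.
        if path.startswith(base + "/"):
            return name
    return "404"


# Order used in the hunk: specific mount first, catch-all last.
ROUTES = [("/app", "browser_app"), ("/", "mcp_app")]
```

With `ROUTES` as written, `/app/webhooks` resolves to `browser_app`. If the two entries were swapped, the `/` mount would claim every path and the browser app would never be reached.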
@@ -1613,7 +1790,7 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
# Add CORS middleware to allow browser-based clients like MCP Inspector
app.add_middleware(
CORSMiddleware,
CORSMiddleware, # type: ignore[invalid-argument-type]
allow_origins=["*"], # Allow all origins for development
allow_credentials=True,
allow_methods=["*"],
@@ -1623,7 +1800,7 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
# Add observability middleware (metrics + tracing)
if settings.metrics_enabled or settings.otel_exporter_otlp_endpoint:
app.add_middleware(ObservabilityMiddleware)
app.add_middleware(ObservabilityMiddleware) # type: ignore[invalid-argument-type]
logger.info("Observability middleware enabled (metrics and/or tracing)")
# Add exception handler for scope challenges (OAuth mode only)
@@ -1654,4 +1831,11 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
logger.info("WWW-Authenticate scope challenge handler enabled")
# ADR-016: Apply SmitheryConfigMiddleware in Smithery stateless mode
# This must be the outermost middleware to extract config from URL query parameters
# before any other middleware processes the request
if deployment_mode == DeploymentMode.SMITHERY_STATELESS:
app = SmitheryConfigMiddleware(app)
logger.info("SmitheryConfigMiddleware enabled for query parameter config")
return app
Binary file not shown. Size: 18 KiB
@@ -0,0 +1,219 @@
.viz-layout {
display: flex;
flex-direction: column;
gap: 16px;
height: 100%;
min-height: 0;
overflow-y: auto;
}
.viz-card {
background: var(--color-main-background);
border-radius: 0;
padding: 16px;
box-shadow: none;
}
.viz-controls-card {
flex: 0 0 auto;
border-bottom: 1px solid var(--color-border);
padding-bottom: 16px;
}
.viz-controls-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
gap: 12px;
align-items: end;
}
@media (min-width: 768px) {
.viz-controls-grid {
grid-template-columns: 2fr 1.5fr 1.5fr auto auto;
}
}
.viz-control-group {
display: flex;
flex-direction: column;
gap: 4px;
}
.viz-control-group label {
font-weight: 500;
color: var(--color-main-text);
font-size: 13px;
}
.viz-control-group input[type="text"],
.viz-control-group input[type="number"],
.viz-control-group select {
width: 100%;
padding: 7px 10px;
border: 1px solid var(--color-border-dark);
border-radius: var(--border-radius);
font-size: 14px;
background: var(--color-main-background);
color: var(--color-main-text);
}
.viz-control-group input:focus,
.viz-control-group select:focus {
outline: none;
border-color: var(--color-primary-element);
}
.viz-control-group input[type="range"] {
width: 100%;
}
.viz-control-group select[multiple] {
min-height: 100px;
}
.viz-weight-display {
display: inline-block;
min-width: 40px;
text-align: right;
color: #666;
}
.viz-btn {
background: var(--color-primary-element);
color: white;
border: none;
padding: 7px 16px;
border-radius: var(--border-radius);
cursor: pointer;
font-size: 14px;
font-weight: 500;
white-space: nowrap;
}
.viz-btn:hover {
background: #0052a3;
}
.viz-btn-secondary {
background: #6c757d;
color: white;
border: none;
padding: 7px 16px;
border-radius: var(--border-radius);
cursor: pointer;
font-size: 14px;
white-space: nowrap;
}
.viz-btn-secondary:hover {
background: #5a6268;
}
.viz-card-plot {
flex: 0 0 auto;
display: flex;
flex-direction: column;
min-height: 500px;
height: 600px;
/* Remove horizontal padding to extend to full viewport width */
padding-left: 0;
padding-right: 0;
margin-left: -16px;
margin-right: -16px;
}
#viz-plot-container {
width: 100%;
height: 100%;
position: relative;
overflow: visible;
}
#viz-plot {
width: 100%;
height: 100%;
}
.viz-loading {
text-align: center;
padding: 40px;
color: #666;
}
.viz-loading-overlay {
position: absolute;
inset: 0;
display: flex;
align-items: center;
justify-content: center;
background: white;
color: #666;
}
.viz-no-results {
text-align: center;
padding: 40px;
color: #666;
font-style: italic;
}
.viz-advanced-section {
margin-top: 12px;
padding: 12px;
background: var(--color-background-hover);
border-radius: var(--border-radius);
border: 1px solid var(--color-border);
}
.viz-info-box {
background: var(--color-primary-element-light);
border-left: 3px solid var(--color-primary-element);
padding: 10px 12px;
margin-bottom: 16px;
font-size: 13px;
color: var(--color-main-text);
}
.chunk-toggle-btn {
background: #6c757d;
color: white;
border: none;
padding: 4px 10px;
border-radius: 3px;
cursor: pointer;
font-size: 12px;
margin-top: 6px;
}
.chunk-toggle-btn:hover {
background: #5a6268;
}
.chunk-context {
background: var(--color-background-hover);
border: 1px solid var(--color-border);
border-radius: var(--border-radius);
padding: 12px;
margin-top: 8px;
font-family: 'SFMono-Regular', 'Consolas', 'Liberation Mono', 'Menlo', monospace;
font-size: 13px;
line-height: 1.6;
white-space: pre-wrap;
word-wrap: break-word;
}
.chunk-text {
color: var(--color-text-maxcontrast);
}
.chunk-matched {
background: #fff3cd;
border: 1px solid #ffc107;
padding: 2px 4px;
border-radius: var(--border-radius);
font-weight: 500;
color: var(--color-main-text);
}
.chunk-ellipsis {
color: var(--color-text-maxcontrast);
font-style: italic;
}
/* PDF highlighted image styles */
.chunk-image-container {
margin-bottom: 16px;
border: 1px solid var(--color-border);
border-radius: var(--border-radius);
overflow: hidden;
background: #fff;
}
.chunk-image-header {
background: var(--color-background-dark);
padding: 8px 12px;
font-size: 12px;
font-weight: 500;
color: var(--color-text-maxcontrast);
border-bottom: 1px solid var(--color-border);
font-family: var(--font-face);
}
.chunk-highlighted-image {
display: block;
max-width: 100%;
height: auto;
cursor: zoom-in;
}
.chunk-highlighted-image:hover {
opacity: 0.95;
}
@@ -0,0 +1,253 @@
// Initialize vizApp for vector visualization
function vizApp() {
return {
query: '',
algorithm: 'bm25_hybrid',
fusion: 'rrf',
showAdvanced: false,
showQueryPoint: true,
docTypes: [''],
limit: 50,
scoreThreshold: 0.0,
loading: false,
results: [],
coordinates: null,
queryCoords: null,
expandedChunks: {},
chunkLoading: {},
init() {
// Set up window resize listener to resize plot
window.addEventListener('resize', () => {
if (this.coordinates && this.results.length > 0) {
Plotly.Plots.resize('viz-plot');
}
});
},
async executeSearch() {
this.loading = true;
this.results = [];
try {
const params = new URLSearchParams({
query: this.query,
algorithm: this.algorithm,
limit: this.limit,
score_threshold: this.scoreThreshold,
});
if (this.algorithm === 'bm25_hybrid') {
params.append('fusion', this.fusion);
}
const selectedTypes = this.docTypes.filter(t => t !== '');
if (selectedTypes.length > 0) {
params.append('doc_types', selectedTypes.join(','));
}
const response = await fetch(`/app/vector-viz/search?${params}`);
const data = await response.json();
if (data.success) {
this.results = data.results;
this.coordinates = data.coordinates_3d;
this.queryCoords = data.query_coords;
this.renderPlot(this.coordinates, this.queryCoords, this.results);
} else {
alert('Search failed: ' + data.error);
}
} catch (error) {
alert('Error: ' + error.message);
} finally {
this.loading = false;
}
},
updatePlot() {
// Toggle query point visibility without recreating the plot
// This preserves camera position naturally since layout is untouched
if (this.coordinates && this.queryCoords && this.results.length > 0) {
const plotDiv = document.getElementById('viz-plot');
// If plot exists, just toggle the query trace visibility
if (plotDiv && plotDiv.data && plotDiv.data.length >= 2) {
// Trace index 1 is the query point
Plotly.restyle('viz-plot', { visible: this.showQueryPoint }, [1]);
} else {
// Plot doesn't exist yet, render it
this.renderPlot(this.coordinates, this.queryCoords, this.results);
}
}
},
renderPlot(coordinates, queryCoords, results) {
// Get container dimensions before creating layout
const container = document.getElementById('viz-plot-container');
const width = container.clientWidth;
const height = container.clientHeight;
const scores = results.map(r => r.score);
// Trace 1 (index 0): Document results (always visible)
const documentTrace = {
x: coordinates.map(c => c[0]),
y: coordinates.map(c => c[1]),
z: coordinates.map(c => c[2]),
mode: 'markers',
type: 'scatter3d',
name: 'Documents',
visible: true,
customdata: results.map((r, i) => ({
title: r.title,
raw_score: r.original_score,
relative_score: r.score,
x: coordinates[i][0],
y: coordinates[i][1],
z: coordinates[i][2]
})),
hovertemplate:
'<b>%{customdata.title}</b><br>' +
'Raw Score: %{customdata.raw_score:.3f} (%{customdata.relative_score:.0%} relative)<br>' +
'(x=%{customdata.x}, y=%{customdata.y}, z=%{customdata.z})' +
'<extra></extra>',
marker: {
size: results.map(r => 4 + (Math.pow(r.score, 2) * 10)),
opacity: results.map(r => 0.3 + (r.score * 0.7)),
color: scores,
colorscale: 'Viridis',
showscale: true,
colorbar: {
title: 'Relative Score',
x: 1.02,
xanchor: 'left',
thickness: 20,
len: 0.8
},
cmin: 0,
cmax: 1
}
};
// Trace 2 (index 1): Query point (visibility controlled by toggle)
const queryTrace = {
x: [queryCoords[0]],
y: [queryCoords[1]],
z: [queryCoords[2]],
mode: 'markers',
type: 'scatter3d',
name: 'Query',
visible: this.showQueryPoint, // Initial visibility from state
hovertemplate:
'<b>Search Query</b><br>' +
`(x=${queryCoords[0]}, y=${queryCoords[1]}, z=${queryCoords[2]})` +
'<extra></extra>',
marker: {
size: 10,
color: '#ef5350', // Subdued red (Material Design Red 400)
line: {
color: '#c62828', // Darker red border (Material Design Red 800)
width: 1
}
}
};
const layout = {
title: `Vector Space (PCA 3D) - ${results.length} results`,
width: width, // Explicit width from container
height: height, // Explicit height from container
scene: {
xaxis: { title: 'PC1' },
yaxis: { title: 'PC2' },
zaxis: { title: 'PC3' },
camera: {
eye: { x: 1.5, y: 1.5, z: 1.5 }
},
// Full width for 3D scene
domain: {
x: [0, 1],
y: [0, 1]
}
},
hovermode: 'closest',
autosize: true, // Enable auto-sizing for window resizes
showlegend: false, // Hide legend
margin: { l: 0, r: 100, t: 40, b: 0 } // Right margin for colorbar
};
// Always render both traces - visibility is controlled by the visible property
const traces = [documentTrace, queryTrace];
// Enable responsive resizing
const config = {
responsive: true,
displayModeBar: true
};
// Use newPlot() with explicit dimensions - renders at correct size immediately
// Camera position will be preserved by subsequent Plotly.restyle() calls in updatePlot()
Plotly.newPlot('viz-plot', traces, layout, config);
},
getNextcloudUrl(result) {
// Use global NEXTCLOUD_BASE_URL if set; otherwise fall back to '' so links resolve relative to the current origin
const baseUrl = window.NEXTCLOUD_BASE_URL || '';
switch (result.doc_type) {
case 'note':
return `${baseUrl}/apps/notes/note/${result.id}`;
case 'file':
return `${baseUrl}/apps/files/?fileId=${result.id}`;
case 'calendar':
return `${baseUrl}/apps/calendar`;
case 'contact':
return `${baseUrl}/apps/contacts`;
case 'deck':
return `${baseUrl}/apps/deck`;
default:
return `${baseUrl}`;
}
},
hasChunkPosition(result) {
return result.chunk_start_offset != null && result.chunk_end_offset != null;
},
isChunkExpanded(resultKey) {
return this.expandedChunks[resultKey] !== undefined;
},
async toggleChunk(result) {
const resultKey = `${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`;
if (this.isChunkExpanded(resultKey)) {
delete this.expandedChunks[resultKey];
return;
}
this.chunkLoading[resultKey] = true;
try {
const params = new URLSearchParams({
doc_type: result.doc_type,
doc_id: result.id,
start: result.chunk_start_offset,
end: result.chunk_end_offset,
context: 500
});
const response = await fetch(`/app/chunk-context?${params}`);
const data = await response.json();
if (data.success) {
this.expandedChunks[resultKey] = data;
} else {
alert('Failed to load chunk: ' + data.error);
}
} catch (error) {
alert('Error loading chunk: ' + error.message);
} finally {
delete this.chunkLoading[resultKey];
}
}
};
}
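Review note: `renderPlot` scales each document marker by its relative score, with size growing quadratically and opacity linearly. The Python transcription below restates those two formulas so the ranges are easy to check. `marker_style` is a hypothetical helper and does not appear in the diff.

```python
def marker_style(score: float) -> tuple[float, float]:
    """Map a relative score in [0, 1] to (marker size, opacity) as renderPlot does."""
    size = 4 + (score ** 2) * 10   # quadratic, so high scores stand out
    opacity = 0.3 + score * 0.7    # linear, never fully transparent
    return size, opacity
```

For scores in [0, 1] the size ranges from 4 to 14 and the opacity from 0.3 to 1, so even the weakest result stays visible.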
@@ -0,0 +1,524 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="theme-color" content="#0082c9">
<title>{% block title %}Nextcloud MCP Server{% endblock %}</title>
<!-- Favicon -->
<link rel="icon" type="image/svg+xml" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' width='32' height='32' viewBox='0 0 512 512'><rect width='512' height='512' rx='80' ry='80' fill='%230082C9'/><path d='M255.9 21.04c-11.8 0-22.2 4.08-28.6 10.01-5.6 4.98-8.6 11.41-8.6 18.11 0 5.55 2.2 11.01 5.9 15.48-16.4 4.97-30.1 13.64-39 24.53 22.1-7.67 45.7-11.86 70.3-11.86 24.6 0 48.3 4.19 70.3 11.86-8.9-10.89-22.6-19.56-39-24.53 3.9-4.47 5.9-9.93 5.9-15.48 0-6.7-3-13.13-8.5-18.11-6.4-5.93-16.9-10.01-28.7-10.01zm0 20.34c5.3 0 10.1 1.27 13.6 3.52 1.7 1.16 3.4 2.43 3.4 4.27 0 1.76-1.7 3.03-3.4 4.19-3.5 2.33-8.3 3.61-13.6 3.61-5.3 0-10.1-1.28-13.6-3.61-1.6-1.16-3.3-2.43-3.3-4.19 0-1.84 1.7-3.11 3.3-4.27 3.5-2.25 8.3-3.52 13.6-3.52zm.1 48.1c-110.8 0-200.72 90.02-200.72 200.82S145.2 491 256 491s200.7-89.9 200.7-200.7c0-110.8-89.9-200.82-200.7-200.82zm0 32.62c92.9 0 168.2 75.3 168.2 168.2 0 92.8-75.3 168.2-168.2 168.2-92.9 0-168.26-75.4-168.26-168.2 0-92.9 75.36-168.2 168.26-168.2zm-8.2 6.3c-9.6.5-19 1.9-28.3 4.1l2.3 7.8c8.4-2 17.1-3.3 26-3.8v-8.1zm16.2 0v8.1c9 .5 17.7 1.8 26 3.8l2.2-7.8c-9.1-2.2-18.6-3.6-28.2-4.1zm-60 8.5c-9 3.2-17.6 7-25.8 11.6l4.1 7.1c7.7-4.3 15.6-7.9 23.9-10.8l-2.2-7.9zm103.7 0-2 7.9c8.4 2.9 16.2 6.5 23.8 10.8l4.2-7.1c-8.2-4.6-16.9-8.4-26-11.6zm-143.3 20.3c-7.5 5.4-14.6 11.4-21.1 17.9l5.8 5.8c5.9-6.1 12.5-11.7 19.5-16.6l-4.2-7.1zm182.9 0-4 7.1c6.9 4.9 13.5 10.5 19.5 16.6l5.7-5.8c-6.5-6.5-13.7-12.5-21.2-17.9zm-91.4 11.5c-37 0-67.4 28.6-70.3 64.9l15.9 4.7c.7-29.6 24.7-53.4 54.4-53.4 30.1 0 54.4 24.4 54.4 54.3 0 15-6.2 28.7-16 38.5l.1.1c1.7 2.7 3 5.6 4.1 8.6.9 3 1.7 5.7 2.3 8.6v.4c33.8-16.7 57.2-51.5 57.2-91.7 0-3.8-.2-7.3-.6-10.9-3.2-3.3-6.3-6.4-9.8-9.5 1.5 6.5 2.3 13.4 2.3 20.4 0 28.7-13 54.7-33.5 71.8 6.3-10.6 10.1-23 10.1-36.3 0-38.9-31.7-70.5-70.6-70.5zm-91.8 14.6c-3.3 3.1-6.5 6.2-9.7 9.5-.3 3.6-.5 7.1-.5 10.9 0 7.3.7 14.2 2.1 20.9l9.1 2.7c-2.1-7.5-3.1-15.4-3.1-23.6 0-7 .7-13.9 2.1-20.4zm-31.6 4c-5.8 7.1-10.9 14.6-15.4 22.6l7.1 4c4.1-7.4 
8.8-14.3 14-20.8l-5.7-5.8zm246.8 0-5.7 5.8c5.3 6.5 10 13.4 13.9 20.8l7.1-4c-4.4-8-9.5-15.5-15.3-22.6zm-269.2 37.1c-2.5 5.7-4.6 11.4-6.4 17.6l.1-.3c3.4-5 7.9-9.3 12.9-12.5l.3-.6-6.9-4.2zm291.8 0-7.2 4.2c3.2 7.3 5.7 15.1 7.6 23.1l7.9-2.1c-2.1-8.8-4.9-17.3-8.3-25.2zm-261.2 11.5c-13.4.1-25.7 9-29.7 22.5l114.8 34.2c-4.9 16.7 4.6 34.2 21.2 39.2L361.7 366c16.6 5 34.1-4.4 39.1-21l-114.6-34.4c4.9-16.5-4.7-34.1-21.3-39.1 0 0-72.4-21.5-114.8-34.3-3.1-.9-6.3-1.4-9.4-1.3zm-42.09 29.7c-.9 6.9-1.4 14-1.4 21.3 0 1.3.1 2.9.1 4.2h8.09v-4.2c0-6.5.4-12.9 1.2-19.2l-7.99-2.1zm314.59 0-7.9 2.1c.7 6.3 1.3 12.7 1.3 19.2 0 1.3 0 2.9-.2 4.2h8.2v-4.2c0-7.3-.5-14.4-1.4-21.3zm-157.3 24.7c6.3 0 11.5 5 11.5 11.3 0 6.4-5.2 11.6-11.5 11.6s-11.5-5.2-11.5-11.6c0-6.3 5.2-11.3 11.5-11.3zM98.51 307.4c1 8.2 2.89 16.4 5.09 24.3l7.9-2.1c-2.1-7.2-3.8-14.6-4.8-22.2h-8.19zm306.69 0c-1.1 7.6-2.7 15-4.8 22.2l7.8 2.1c2.2-7.9 4.1-16.1 5.2-24.3h-8.2zm-191.3 10.9c-19 13.3-31.4 35.3-31.4 60.1 0 10.4 2.3 20.4 6.2 29.7 8.8 4.9 17.9 8.8 27.6 11.7-10.8-10.7-17.5-25.2-17.5-41.4 0-19 9.3-36 23.7-46.3-3.8-4.1-6.7-8.7-8.6-13.8zM116.8 345l-7.9 2c3.1 7.6 6.8 14.7 11 21.6l6.9-4.2c-3.8-6.2-7-12.8-10-19.4zm194.8 20.5c.9 4.1 1.4 8.5 1.4 12.9 0 16.2-6.7 30.7-17.4 41.4 9.6-2.9 18.8-6.8 27.5-11.7 4-9.3 6.2-19.3 6.2-29.7 0-2.7-.2-5.2-.4-7.7l-17.3-5.2zM136 377.9l-7.1 4.1c4.7 6.2 9.7 12.1 15.3 17.3l5.7-5.5c-5.1-5-9.7-10.3-13.9-15.9zm243.9 2.3-.2.1c-2.1.3-4 .6-6.2.7h-.1c-3.6 4.5-7.3 8.8-11.5 12.8l5.8 5.5c5.5-5.2 10.5-11.1 15.2-17.3l-3-1.8zm-217.8 24-5.9 5.9c6 4.8 12.2 9.7 18.8 13.6l3.8-7.8c-5.7-2.9-11.4-6.8-16.7-11.7zm187.7 0c-5.4 4.9-11.1 8.8-16.8 11.7l3.9 7.8c6.5-3.9 12.8-8.8 18.7-13.6l-5.8-5.9zm-156.4 19.5-4.1 6.8c6.6 4 13.7 5.8 20.7 8.8l2.2-7.9c-6.5-1.9-12.7-4.8-18.8-7.7zm125.2 0c-6.2 2.9-12.5 5.8-19.1 7.7l2.3 7.9c7.2-3 14-4.8 20.7-8.8l-3.9-6.8zm-90.7 11.7-2 7.8c7.1 1 14.5 1.9 21.9 1.9v-7.7c-6.8 0-13.5-1.1-19.9-2zm55.9 0c-6.3.9-13 2-19.8 2v7.7c7.5 0 14.8-.9 22.1-1.9l-2.3-7.8z' fill='%23fff'/></svg>">
<!-- Open Sans font -->
<style>
@font-face {
font-family: 'Open Sans';
font-style: normal;
font-weight: normal;
src: local('Open Sans'), local('OpenSans');
}
@font-face {
font-family: 'Open Sans';
font-style: normal;
font-weight: bold;
src: local('Open Sans Semibold'), local('OpenSans-Semibold');
}
</style>
{% block extra_head %}{% endblock %}
<style>
/* Nextcloud App Design System */
/* CSS Variables */
:root {
/* Primary Colors */
--color-primary: #00679e;
--color-primary-element: #00679e;
--color-primary-light: #e5eff5;
--color-primary-element-light: #e5eff5;
/* Background Colors */
--color-main-background: #ffffff;
--color-background-dark: #ededed;
--color-background-hover: #f5f5f5;
/* Text Colors */
--color-main-text: #222222;
--color-text-maxcontrast: #6b6b6b;
--color-text-light: #767676;
/* Border Colors */
--color-border: #ededed;
--color-border-dark: #dbdbdb;
/* Borders & Radius */
--border-radius: 3px;
--border-radius-large: 10px;
--border-radius-pill: 100px;
/* Spacing */
--default-grid-baseline: 4px;
--default-clickable-area: 44px;
}
/* SVG Icon Styles */
.nav-icon {
width: 20px;
height: 20px;
display: inline-block;
fill: var(--color-main-text);
opacity: 0.7;
}
.app-navigation-entry.active .nav-icon {
fill: var(--color-primary-element);
opacity: 1;
}
/* General */
* {
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif;
color: var(--color-main-text);
background: var(--color-main-background);
margin: 0;
padding: 0;
}
h1, h2, h3 {
font-weight: 300;
line-height: 1.2;
}
h1 {
font-size: 32px;
margin: 0 0 20px 0;
color: var(--color-main-text);
}
h2 {
font-size: 20px;
margin: 20px 0 12px 0;
color: var(--color-main-text);
border-bottom: 1px solid var(--color-border);
padding-bottom: 8px;
}
h3 {
font-size: 16px;
margin: 16px 0 8px 0;
color: var(--color-main-text);
font-weight: 500;
}
img {
max-width: 100%;
}
/* App Header (simplified, no full menu) */
.app-header {
height: 50px;
background: var(--color-primary-element);
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
position: sticky;
top: 0;
z-index: 100;
display: flex;
align-items: center;
padding: 0 20px;
}
.app-header__brand {
color: white;
font-size: 18px;
font-weight: 600;
text-decoration: none;
display: flex;
align-items: center;
gap: 12px;
}
.app-header__brand:hover {
opacity: 0.9;
}
.app-header__logo {
height: 32px;
width: 32px;
fill: white;
}
/* App Layout */
.app-content-wrapper {
display: flex;
height: calc(100vh - 50px);
overflow: hidden;
}
/* Side Navigation */
#app-navigation {
width: 250px;
background: var(--color-main-background);
border-right: 1px solid var(--color-border);
display: flex;
flex-direction: column;
flex-shrink: 0;
transition: margin-left 0.3s ease;
}
#app-navigation.app-navigation--closed {
margin-left: -250px;
}
.app-navigation__content {
flex: 1;
overflow-y: auto;
padding: 8px;
display: flex;
flex-direction: column;
}
.app-navigation-list {
list-style: none;
padding: 0;
margin: 0;
flex: 1;
}
.app-navigation-entry {
position: relative;
margin-bottom: 2px;
}
.app-navigation-entry__wrapper {
display: flex;
align-items: center;
position: relative;
}
.app-navigation-entry-link {
display: flex;
align-items: center;
padding: 0 8px;
min-height: var(--default-clickable-area);
border-radius: var(--border-radius);
transition: background-color 100ms ease-in-out;
text-decoration: none;
color: var(--color-main-text);
flex: 1;
font-size: 14px;
}
.app-navigation-entry-link:hover {
background-color: var(--color-background-hover);
}
.app-navigation-entry.active .app-navigation-entry-link {
background-color: var(--color-primary-element-light);
font-weight: 500;
}
.app-navigation-entry-icon {
width: var(--default-clickable-area);
height: var(--default-clickable-area);
display: flex;
align-items: center;
justify-content: center;
margin-right: 0;
}
.app-navigation-entry__name {
flex: 1;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}
.app-navigation-entry__counter {
margin-left: auto;
padding: 2px 6px;
border-radius: var(--border-radius-pill);
background-color: var(--color-background-dark);
font-size: 11px;
color: var(--color-text-maxcontrast);
min-width: 20px;
text-align: center;
}
.app-navigation__settings {
list-style: none;
padding: 8px 0 0 0;
margin: 8px 0 0 0;
border-top: 1px solid var(--color-border);
flex-shrink: 0;
}
.app-navigation-toggle {
display: flex;
align-items: center;
justify-content: center;
position: fixed;
top: 60px;
left: 10px;
z-index: 110;
background: var(--color-main-background);
border: 1px solid var(--color-border);
border-radius: var(--border-radius);
padding: 8px 12px;
cursor: pointer;
box-shadow: 0 0 5px rgba(0,0,0,0.1);
transition: left 0.3s ease;
}
.app-navigation-toggle:hover {
background: var(--color-background-hover);
}
#app-navigation:not(.app-navigation--closed) ~ * .app-navigation-toggle {
left: 260px;
}
/* Main Content Area */
#app-content {
flex: 1;
overflow-y: auto;
background: var(--color-main-background);
}
.page-content {
max-width: 1000px;
margin: 0 auto;
padding: 24px;
}
.content-section {
background: var(--color-main-background);
border-radius: 0;
padding: 0;
box-shadow: none;
}
.content-section h1 {
font-size: 24px;
font-weight: 600;
margin-bottom: 24px;
}
.content-section h2 {
font-size: 18px;
font-weight: 500;
margin: 24px 0 12px 0;
border-bottom: none;
padding-bottom: 0;
}
.content-section h3 {
font-size: 16px;
font-weight: 500;
}
/* Responsive */
@media (max-width: 768px) {
#app-navigation {
position: fixed;
height: calc(100vh - 50px);
z-index: 105;
box-shadow: 2px 0 8px rgba(0,0,0,0.1);
}
.page-content {
padding: 16px;
}
}
/* Footer */
footer.page-footer {
background-color: #0F0833;
color: #ffffff;
padding: 40px 0;
margin-top: 60px;
}
footer.page-footer .bootstrap-container {
max-width: 1200px;
margin: 0 auto;
padding: 0 20px;
}
footer.page-footer h1 {
font-size: 15px;
font-weight: bold;
line-height: 1.8;
color: #ffffff;
margin-top: 20px;
}
footer.page-footer ul {
list-style-type: none;
padding-left: 0;
}
footer.page-footer li {
font-size: 13px;
line-height: 1.8;
color: #ffffff;
margin-top: 0;
}
footer.page-footer li a {
color: #ffffff;
text-decoration: none;
display: block;
padding: 4px 0;
}
footer.page-footer li a:hover {
text-decoration: underline;
}
footer.page-footer p {
font-size: 15px;
line-height: 1.8;
color: #ffffff;
}
footer.page-footer p.copyright {
color: rgba(255, 255, 255, 0.5);
font-size: 13px;
text-align: center;
margin-top: 30px;
}
/* Buttons */
.btn {
border-radius: 50px;
padding: 10px 20px;
text-decoration: none;
display: inline-block;
cursor: pointer;
border: none;
font-size: 14px;
transition: all 0.3s;
}
.btn-primary {
background: #0082C9;
border: 1px solid #0062C9;
color: #fff;
}
.btn-primary:hover {
background: #006ba3;
}
/* Tables */
table {
width: 100%;
border-collapse: collapse;
margin: 20px 0;
}
td {
padding: 12px 8px;
border-bottom: 1px solid var(--color-border);
font-size: 14px;
}
td:first-child {
width: 180px;
color: var(--color-text-maxcontrast);
font-weight: 500;
}
code {
background-color: var(--color-background-dark);
padding: 2px 6px;
border-radius: var(--border-radius);
font-family: 'SFMono-Regular', 'Consolas', 'Liberation Mono', 'Menlo', monospace;
font-size: 90%;
color: var(--color-main-text);
}
/* Badges */
.badge {
display: inline-block;
padding: 3px 8px;
border-radius: 12px;
font-size: 12px;
font-weight: bold;
text-transform: uppercase;
}
.badge-oauth {
background-color: #4caf50;
color: white;
}
.badge-basic {
background-color: #2196f3;
color: white;
}
/* Messages */
.warning {
background-color: #fff3cd;
border-left: 4px solid #ffc107;
padding: 15px;
margin: 15px 0;
color: #856404;
}
.info-message {
background-color: #e3f2fd;
border-left: 4px solid #2196f3;
padding: 15px;
margin: 15px 0;
color: #1565c0;
}
.error {
background-color: #ffebee;
border-left: 4px solid #d32f2f;
padding: 15px;
margin: 15px 0;
color: #c62828;
}
.success {
background-color: #e8f5e9;
border: 2px solid #4caf50;
padding: 30px;
border-radius: 8px;
text-align: center;
}
.success h1 {
color: #4caf50;
}
{% block extra_styles %}{% endblock %}
</style>
</head>
<body>
<!-- App Header -->
<header class="app-header">
<a href="/app" class="app-header__brand">
<svg class="app-header__logo" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512">
<path d="M255.9 21.04c-11.8 0-22.2 4.08-28.6 10.01-5.6 4.98-8.6 11.41-8.6 18.11 0 5.55 2.2 11.01 5.9 15.48-16.4 4.97-30.1 13.64-39 24.53 22.1-7.67 45.7-11.86 70.3-11.86 24.6 0 48.3 4.19 70.3 11.86-8.9-10.89-22.6-19.56-39-24.53 3.9-4.47 5.9-9.93 5.9-15.48 0-6.7-3-13.13-8.5-18.11-6.4-5.93-16.9-10.01-28.7-10.01zm0 20.34c5.3 0 10.1 1.27 13.6 3.52 1.7 1.16 3.4 2.43 3.4 4.27 0 1.76-1.7 3.03-3.4 4.19-3.5 2.33-8.3 3.61-13.6 3.61-5.3 0-10.1-1.28-13.6-3.61-1.6-1.16-3.3-2.43-3.3-4.19 0-1.84 1.7-3.11 3.3-4.27 3.5-2.25 8.3-3.52 13.6-3.52zm.1 48.1c-110.8 0-200.72 90.02-200.72 200.82S145.2 491 256 491s200.7-89.9 200.7-200.7c0-110.8-89.9-200.82-200.7-200.82zm0 32.62c92.9 0 168.2 75.3 168.2 168.2 0 92.8-75.3 168.2-168.2 168.2-92.9 0-168.26-75.4-168.26-168.2 0-92.9 75.36-168.2 168.26-168.2zm-8.2 6.3c-9.6.5-19 1.9-28.3 4.1l2.3 7.8c8.4-2 17.1-3.3 26-3.8v-8.1zm16.2 0v8.1c9 .5 17.7 1.8 26 3.8l2.2-7.8c-9.1-2.2-18.6-3.6-28.2-4.1zm-60 8.5c-9 3.2-17.6 7-25.8 11.6l4.1 7.1c7.7-4.3 15.6-7.9 23.9-10.8l-2.2-7.9zm103.7 0-2 7.9c8.4 2.9 16.2 6.5 23.8 10.8l4.2-7.1c-8.2-4.6-16.9-8.4-26-11.6zm-143.3 20.3c-7.5 5.4-14.6 11.4-21.1 17.9l5.8 5.8c5.9-6.1 12.5-11.7 19.5-16.6l-4.2-7.1zm182.9 0-4 7.1c6.9 4.9 13.5 10.5 19.5 16.6l5.7-5.8c-6.5-6.5-13.7-12.5-21.2-17.9zm-91.4 11.5c-37 0-67.4 28.6-70.3 64.9l15.9 4.7c.7-29.6 24.7-53.4 54.4-53.4 30.1 0 54.4 24.4 54.4 54.3 0 15-6.2 28.7-16 38.5l.1.1c1.7 2.7 3 5.6 4.1 8.6.9 3 1.7 5.7 2.3 8.6v.4c33.8-16.7 57.2-51.5 57.2-91.7 0-3.8-.2-7.3-.6-10.9-3.2-3.3-6.3-6.4-9.8-9.5 1.5 6.5 2.3 13.4 2.3 20.4 0 28.7-13 54.7-33.5 71.8 6.3-10.6 10.1-23 10.1-36.3 0-38.9-31.7-70.5-70.6-70.5zm-91.8 14.6c-3.3 3.1-6.5 6.2-9.7 9.5-.3 3.6-.5 7.1-.5 10.9 0 7.3.7 14.2 2.1 20.9l9.1 2.7c-2.1-7.5-3.1-15.4-3.1-23.6 0-7 .7-13.9 2.1-20.4zm-31.6 4c-5.8 7.1-10.9 14.6-15.4 22.6l7.1 4c4.1-7.4 8.8-14.3 14-20.8l-5.7-5.8zm246.8 0-5.7 5.8c5.3 6.5 10 13.4 13.9 20.8l7.1-4c-4.4-8-9.5-15.5-15.3-22.6zm-269.2 37.1c-2.5 5.7-4.6 11.4-6.4 17.6l.1-.3c3.4-5 7.9-9.3 12.9-12.5l.3-.6-6.9-4.2zm291.8 0-7.2 4.2c3.2 7.3 5.7 
15.1 7.6 23.1l7.9-2.1c-2.1-8.8-4.9-17.3-8.3-25.2zm-261.2 11.5c-13.4.1-25.7 9-29.7 22.5l114.8 34.2c-4.9 16.7 4.6 34.2 21.2 39.2L361.7 366c16.6 5 34.1-4.4 39.1-21l-114.6-34.4c4.9-16.5-4.7-34.1-21.3-39.1 0 0-72.4-21.5-114.8-34.3-3.1-.9-6.3-1.4-9.4-1.3zm-42.09 29.7c-.9 6.9-1.4 14-1.4 21.3 0 1.3.1 2.9.1 4.2h8.09v-4.2c0-6.5.4-12.9 1.2-19.2l-7.99-2.1zm314.59 0-7.9 2.1c.7 6.3 1.3 12.7 1.3 19.2 0 1.3 0 2.9-.2 4.2h8.2v-4.2c0-7.3-.5-14.4-1.4-21.3zm-157.3 24.7c6.3 0 11.5 5 11.5 11.3 0 6.4-5.2 11.6-11.5 11.6s-11.5-5.2-11.5-11.6c0-6.3 5.2-11.3 11.5-11.3zM98.51 307.4c1 8.2 2.89 16.4 5.09 24.3l7.9-2.1c-2.1-7.2-3.8-14.6-4.8-22.2h-8.19zm306.69 0c-1.1 7.6-2.7 15-4.8 22.2l7.8 2.1c2.2-7.9 4.1-16.1 5.2-24.3h-8.2zm-191.3 10.9c-19 13.3-31.4 35.3-31.4 60.1 0 10.4 2.3 20.4 6.2 29.7 8.8 4.9 17.9 8.8 27.6 11.7-10.8-10.7-17.5-25.2-17.5-41.4 0-19 9.3-36 23.7-46.3-3.8-4.1-6.7-8.7-8.6-13.8zM116.8 345l-7.9 2c3.1 7.6 6.8 14.7 11 21.6l6.9-4.2c-3.8-6.2-7-12.8-10-19.4zm194.8 20.5c.9 4.1 1.4 8.5 1.4 12.9 0 16.2-6.7 30.7-17.4 41.4 9.6-2.9 18.8-6.8 27.5-11.7 4-9.3 6.2-19.3 6.2-29.7 0-2.7-.2-5.2-.4-7.7l-17.3-5.2zM136 377.9l-7.1 4.1c4.7 6.2 9.7 12.1 15.3 17.3l5.7-5.5c-5.1-5-9.7-10.3-13.9-15.9zm243.9 2.3-.2.1c-2.1.3-4 .6-6.2.7h-.1c-3.6 4.5-7.3 8.8-11.5 12.8l5.8 5.5c5.5-5.2 10.5-11.1 15.2-17.3l-3-1.8zm-217.8 24-5.9 5.9c6 4.8 12.2 9.7 18.8 13.6l3.8-7.8c-5.7-2.9-11.4-6.8-16.7-11.7zm187.7 0c-5.4 4.9-11.1 8.8-16.8 11.7l3.9 7.8c6.5-3.9 12.8-8.8 18.7-13.6l-5.8-5.9zm-156.4 19.5-4.1 6.8c6.6 4 13.7 5.8 20.7 8.8l2.2-7.9c-6.5-1.9-12.7-4.8-18.8-7.7zm125.2 0c-6.2 2.9-12.5 5.8-19.1 7.7l2.3 7.9c7.2-3 14-4.8 20.7-8.8l-3.9-6.8zm-90.7 11.7-2 7.8c7.1 1 14.5 1.9 21.9 1.9v-7.7c-6.8 0-13.5-1.1-19.9-2zm55.9 0c-6.3.9-13 2-19.8 2v7.7c7.5 0 14.8-.9 22.1-1.9l-2.3-7.8z" fill="#fff"/>
</svg>
<span>Nextcloud MCP Server</span>
</a>
</header>
<!-- App Content Wrapper (Sidebar + Main Content) -->
{% block content %}{% endblock %}
{% block scripts %}{% endblock %}
</body>
</html>
@@ -0,0 +1,19 @@
{% extends "base.html" %}
{% block title %}{{ error_title|default('Error') }} - Nextcloud MCP Server{% endblock %}
{% block content %}
<h1>{{ error_title|default('Error') }}</h1>
<div class="error">
<strong>Error:</strong> {{ error_message }}
</div>
{% if login_url %}
<p><a href="{{ login_url }}" class="btn btn-primary">Login again</a></p>
{% endif %}
{% if back_url %}
<p><a href="{{ back_url }}" class="btn">Go Back</a></p>
{% endif %}
{% endblock %}
@@ -0,0 +1,21 @@
{% extends "base.html" %}
{% block title %}{{ success_title|default('Success') }} - Nextcloud MCP Server{% endblock %}
{% block extra_head %}
{% if redirect_url and redirect_delay %}
<meta http-equiv="refresh" content="{{ redirect_delay }};url={{ redirect_url }}">
{% endif %}
{% endblock %}
{% block content %}
<div class="success">
<h1>{{ success_title|default('✓ Success') }}</h1>
{% for message in success_messages %}
<p>{{ message }}</p>
{% endfor %}
{% if redirect_url %}
<p>Redirecting...</p>
{% endif %}
</div>
{% endblock %}
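Review note on the success template above: the `<meta http-equiv="refresh">` tag is emitted only when both `redirect_url` and `redirect_delay` are truthy, while the "Redirecting..." text depends on `redirect_url` alone. A minimal sketch of that condition follows, assuming Jinja's truthiness for `0` matches JavaScript's; `refreshMeta` is a hypothetical helper written for illustration, not part of the project.

```javascript
// Hypothetical helper mirroring the template's {% if redirect_url and redirect_delay %} guard.
// Returns the meta tag string, or null when the guard would skip it.
function refreshMeta(redirectUrl, redirectDelay) {
  if (!redirectUrl || !redirectDelay) {
    return null; // a delay of 0 is falsy, so no tag is produced
  }
  return `<meta http-equiv="refresh" content="${redirectDelay};url=${redirectUrl}">`;
}

console.log(refreshMeta('/app', 3));
console.log(refreshMeta('/app', 0));
```

If an immediate redirect (delay of 0) is ever needed, the guard would probably have to test whether `redirect_delay` is defined rather than whether it is truthy.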
@@ -0,0 +1,650 @@
{% extends "base.html" %}
{% block title %}Nextcloud MCP Server{% endblock %}
{% block extra_head %}
<!-- htmx for dynamic loading -->
<script src="https://unpkg.com/htmx.org@1.9.10"></script>
<!-- Alpine.js for state management -->
<script defer src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js"></script>
<!-- Plotly.js for vector visualization -->
<script src="https://cdn.plot.ly/plotly-3.3.0.min.js"></script>
<!-- Vector Viz static assets -->
<link rel="stylesheet" href="/app/static/vector-viz.css">
{% endblock %}
{% block extra_styles %}
/* Smooth htmx transitions */
.htmx-swapping {
opacity: 0;
transition: opacity 200ms ease-out;
}
.htmx-settling {
opacity: 1;
transition: opacity 200ms ease-in;
}
/* Logout button styling */
.logout-section {
margin-top: 20px;
padding-top: 20px;
border-top: 1px solid var(--color-border);
}
/* Welcome tab specific styles */
.hero-section {
background: linear-gradient(135deg, var(--color-primary-element) 0%, #0082c9 100%);
color: white;
padding: 60px 24px;
margin: -24px -24px 40px -24px;
border-radius: 0 0 var(--border-radius-large) var(--border-radius-large);
text-align: center;
}
.hero-section h1 {
color: white;
font-size: 36px;
margin: 0 0 16px 0;
font-weight: 600;
}
.hero-section p {
font-size: 18px;
opacity: 0.95;
max-width: 700px;
margin: 0 auto;
line-height: 1.6;
}
.feature-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
gap: 24px;
margin: 32px 0;
}
.feature-card {
background: var(--color-main-background);
border: 2px solid var(--color-border);
border-radius: var(--border-radius-large);
padding: 24px;
transition: all 0.2s;
cursor: pointer;
text-decoration: none;
color: inherit;
display: block;
}
.feature-card:hover {
border-color: var(--color-primary-element);
box-shadow: 0 4px 12px rgba(0, 103, 158, 0.15);
transform: translateY(-2px);
}
.feature-card h3 {
color: var(--color-primary-element);
font-size: 20px;
margin: 12px 0 8px 0;
font-weight: 600;
display: flex;
align-items: center;
gap: 12px;
}
.feature-card p {
color: var(--color-text-maxcontrast);
font-size: 14px;
line-height: 1.6;
margin: 8px 0 0 0;
}
.feature-icon {
width: 48px;
height: 48px;
background: var(--color-primary-element-light);
border-radius: var(--border-radius);
display: flex;
align-items: center;
justify-content: center;
margin-bottom: 8px;
}
.feature-icon svg {
width: 28px;
height: 28px;
fill: var(--color-primary-element);
}
.info-section {
background: var(--color-background-hover);
border-radius: var(--border-radius-large);
padding: 32px;
margin: 32px 0;
}
.info-section h2 {
color: var(--color-main-text);
font-size: 24px;
margin: 0 0 16px 0;
border: none;
padding: 0;
}
.info-section p {
color: var(--color-text-maxcontrast);
line-height: 1.7;
margin: 12px 0;
}
.info-section ul {
margin: 12px 0;
padding-left: 24px;
}
.info-section li {
color: var(--color-text-maxcontrast);
line-height: 1.7;
margin: 8px 0;
}
.info-section code {
background: var(--color-main-background);
padding: 2px 8px;
border-radius: var(--border-radius);
font-size: 13px;
}
.auth-status {
background: var(--color-primary-element-light);
border-left: 4px solid var(--color-primary-element);
padding: 16px 20px;
margin: 24px 0;
border-radius: var(--border-radius);
display: flex;
align-items: center;
gap: 12px;
}
.auth-status svg {
width: 24px;
height: 24px;
fill: var(--color-primary-element);
flex-shrink: 0;
}
.auth-status-text {
flex: 1;
}
.auth-status-text strong {
display: block;
color: var(--color-main-text);
font-size: 14px;
margin-bottom: 4px;
}
.auth-status-text span {
color: var(--color-text-maxcontrast);
font-size: 13px;
}
{% endblock %}
{% block content %}
<div class="app-content-wrapper" x-data="{ activeSection: 'welcome', navOpen: true }">
<!-- Side Navigation -->
<nav id="app-navigation" :class="{ 'app-navigation--closed': !navOpen }">
<div class="app-navigation__content">
<!-- Navigation List -->
<ul class="app-navigation-list">
<li class="app-navigation-entry" :class="{ 'active': activeSection === 'welcome' }">
<div class="app-navigation-entry__wrapper">
<a href="#"
@click.prevent="activeSection = 'welcome'"
class="app-navigation-entry-link">
<span class="app-navigation-entry-icon">
<svg class="nav-icon" viewBox="0 0 24 24">
<path d="M10,20V14H14V20H19V12H22L12,3L2,12H5V20H10Z" />
</svg>
</span>
<span class="app-navigation-entry__name">Welcome</span>
</a>
</div>
</li>
<li class="app-navigation-entry" :class="{ 'active': activeSection === 'user-info' }">
<div class="app-navigation-entry__wrapper">
<a href="#"
@click.prevent="activeSection = 'user-info'"
class="app-navigation-entry-link">
<span class="app-navigation-entry-icon">
<svg class="nav-icon" viewBox="0 0 24 24">
<path d="M12,4A4,4 0 0,1 16,8A4,4 0 0,1 12,12A4,4 0 0,1 8,8A4,4 0 0,1 12,4M12,14C16.42,14 20,15.79 20,18V20H4V18C4,15.79 7.58,14 12,14Z" />
</svg>
</span>
<span class="app-navigation-entry__name">User Info</span>
</a>
</div>
</li>
{% if show_vector_sync_tab %}
<li class="app-navigation-entry" :class="{ 'active': activeSection === 'vector-sync' }">
<div class="app-navigation-entry__wrapper">
<a href="#"
@click.prevent="activeSection = 'vector-sync'"
class="app-navigation-entry-link">
<span class="app-navigation-entry-icon">
<svg class="nav-icon" viewBox="0 0 24 24">
<path d="M12,18A6,6 0 0,1 6,12C6,11 6.25,10.03 6.7,9.2L5.24,7.74C4.46,8.97 4,10.43 4,12A8,8 0 0,0 12,20V23L16,19L12,15M12,4V1L8,5L12,9V6A6,6 0 0,1 18,12C18,13 17.75,13.97 17.3,14.8L18.76,16.26C19.54,15.03 20,13.57 20,12A8,8 0 0,0 12,4Z" />
</svg>
</span>
<span class="app-navigation-entry__name">Vector Sync</span>
</a>
</div>
</li>
<li class="app-navigation-entry" :class="{ 'active': activeSection === 'vector-viz' }">
<div class="app-navigation-entry__wrapper">
<a href="#"
@click.prevent="activeSection = 'vector-viz'"
class="app-navigation-entry-link">
<span class="app-navigation-entry-icon">
<svg class="nav-icon" viewBox="0 0 24 24">
<path d="M22,21H2V3H4V19H6V10H10V19H12V6H16V19H18V14H22V21Z" />
</svg>
</span>
<span class="app-navigation-entry__name">Vector Viz</span>
</a>
</div>
</li>
{% endif %}
{% if show_webhooks_tab %}
<li class="app-navigation-entry" :class="{ 'active': activeSection === 'webhooks' }">
<div class="app-navigation-entry__wrapper">
<a href="#"
@click.prevent="activeSection = 'webhooks'"
class="app-navigation-entry-link">
<span class="app-navigation-entry-icon">
<svg class="nav-icon" viewBox="0 0 24 24">
<path d="M10.59,13.41C11,13.8 11,14.44 10.59,14.83C10.2,15.22 9.56,15.22 9.17,14.83C7.22,12.88 7.22,9.71 9.17,7.76V7.76L12.71,4.22C14.66,2.27 17.83,2.27 19.78,4.22C21.73,6.17 21.73,9.34 19.78,11.29L18.29,12.78C18.3,11.96 18.17,11.14 17.89,10.36L18.36,9.88C19.54,8.71 19.54,6.81 18.36,5.64C17.19,4.46 15.29,4.46 14.12,5.64L10.59,9.17C9.41,10.34 9.41,12.24 10.59,13.41M13.41,9.17C13.8,8.78 14.44,8.78 14.83,9.17C16.78,11.12 16.78,14.29 14.83,16.24V16.24L11.29,19.78C9.34,21.73 6.17,21.73 4.22,19.78C2.27,17.83 2.27,14.66 4.22,12.71L5.71,11.22C5.7,12.04 5.83,12.86 6.11,13.65L5.64,14.12C4.46,15.29 4.46,17.19 5.64,18.36C6.81,19.54 8.71,19.54 9.88,18.36L13.41,14.83C14.59,13.66 14.59,11.76 13.41,10.59C13,10.2 13,9.56 13.41,9.17Z" />
</svg>
</span>
<span class="app-navigation-entry__name">Webhooks</span>
</a>
</div>
</li>
{% endif %}
</ul>
<!-- Settings/Logout at bottom -->
{% if logout_url %}
<ul class="app-navigation__settings">
<li class="app-navigation-entry">
<div class="app-navigation-entry__wrapper">
<a href="{{ logout_url }}" class="app-navigation-entry-link">
<span class="app-navigation-entry-icon">
<svg class="nav-icon" viewBox="0 0 24 24">
<path d="M16,17V14H9V10H16V7L21,12L16,17M14,2A2,2 0 0,1 16,4V6H14V4H5V20H14V18H16V20A2,2 0 0,1 14,22H5A2,2 0 0,1 3,20V4A2,2 0 0,1 5,2H14Z" />
</svg>
</span>
<span class="app-navigation-entry__name">Logout</span>
</a>
</div>
</li>
</ul>
{% endif %}
</div>
<!-- Toggle Button (mobile) -->
<button @click="navOpen = !navOpen"
class="app-navigation-toggle"
aria-label="Toggle navigation"
:aria-expanded="navOpen.toString()">
</button>
</nav>
<!-- Main Content Area -->
<main id="app-content">
<div class="page-content">
<!-- Welcome Section -->
<div x-show="activeSection === 'welcome'">
<!-- Hero Section -->
<div class="hero-section">
<h1>Welcome to Nextcloud MCP Server</h1>
<p>
Interactive user interface for semantic search and document retrieval.
Test queries, visualize results, and explore your Nextcloud content using RAG workflows.
</p>
</div>
<!-- Authentication Status -->
<div class="auth-status">
<svg viewBox="0 0 24 24">
<path d="M12,4A4,4 0 0,1 16,8A4,4 0 0,1 12,12A4,4 0 0,1 8,8A4,4 0 0,1 12,4M12,14C16.42,14 20,15.79 20,18V20H4V18C4,15.79 7.58,14 12,14Z" />
</svg>
<div class="auth-status-text">
<strong>Authenticated as: {{ username }}</strong>
<span>Authentication mode: <code>{{ auth_mode }}</code></span>
</div>
</div>
{% if vector_sync_enabled %}
<!-- Vector Sync Enabled Content -->
<div class="info-section">
<h2>About Semantic Search</h2>
<p>
This interface provides access to <strong>semantic search</strong> capabilities powered by vector embeddings.
Unlike traditional keyword search, semantic search understands the <em>meaning</em> of your queries and finds
conceptually similar content across your Nextcloud apps.
</p>
<p>
<strong>How it works:</strong>
</p>
<ul>
<li>Documents from Notes, Calendar, Files, Contacts, and Deck are indexed into a vector database</li>
<li>Each document chunk is converted to a 768-dimensional vector embedding that captures semantic meaning</li>
<li>Queries are also converted to embeddings and matched against document vectors using similarity search</li>
<li>Results can be retrieved using pure semantic search or hybrid BM25 search combining keywords and semantics</li>
</ul>
</div>
<div class="info-section">
<h2>RAG Workflow Integration</h2>
<p>
This UI allows you to <strong>test the same queries that Large Language Models (LLMs) would use</strong> in a
Retrieval-Augmented Generation (RAG) workflow. When an AI assistant needs to answer questions about your data:
</p>
<ul>
<li><strong>Step 1:</strong> The assistant converts your question into a search query</li>
<li><strong>Step 2:</strong> The MCP server retrieves relevant document chunks using semantic search</li>
<li><strong>Step 3:</strong> Retrieved context is passed to the LLM to generate an informed answer</li>
</ul>
<!-- RAG Workflow Diagram -->
<div style="background: var(--color-main-background); border: 2px solid var(--color-primary-element); border-radius: var(--border-radius-large); padding: 24px; margin: 24px 0; overflow-x: auto;">
<div style="text-align: center; font-weight: 600; margin-bottom: 20px; color: var(--color-primary-element); font-size: 16px;">
MCP Sampling RAG Workflow
</div>
<!-- Four-component bidirectional flow -->
<div style="max-width: 1000px; margin: 0 auto;">
<div style="display: grid; grid-template-columns: 0.7fr auto 1fr auto 1fr auto 0.9fr; gap: 10px; align-items: center;">
<!-- User -->
<div style="background: var(--color-background-hover); border: 2px solid var(--color-border); border-radius: var(--border-radius-large); padding: 14px; text-align: center;">
<div style="font-size: 26px; margin-bottom: 5px;">👤</div>
<div style="font-weight: 600; color: var(--color-main-text); font-size: 12px;">User</div>
<div style="font-size: 9px; color: var(--color-text-maxcontrast); font-style: italic; margin-top: 5px; line-height: 1.2;">
"What are the health<br>benefits of coffee?"
</div>
</div>
<!-- Arrow User <-> Client -->
<div style="text-align: center;">
<div style="font-size: 20px; color: var(--color-text-maxcontrast);">&harr;</div>
</div>
<!-- MCP Client + LLM (combined) -->
<div style="background: var(--color-primary-element-light); border: 2px solid var(--color-primary-element); border-radius: var(--border-radius-large); padding: 12px; text-align: center;">
<div style="font-weight: 600; color: var(--color-primary-element); font-size: 13px; margin-bottom: 8px;">MCP Client + LLM</div>
<div style="background: var(--color-main-background); border-radius: var(--border-radius); padding: 8px; margin-bottom: 6px;">
<div style="font-size: 9px; color: var(--color-text-maxcontrast);">(Claude Code)</div>
</div>
<div style="background: var(--color-main-background); border-radius: var(--border-radius); padding: 8px; border: 2px solid var(--color-primary-element);">
<div style="font-size: 16px; margin-bottom: 2px;">🧠</div>
<div style="font-weight: 600; color: var(--color-main-text); font-size: 10px;">Client's LLM</div>
<div style="font-size: 8px; color: var(--color-text-maxcontrast);">(Claude)</div>
</div>
<div style="margin-top: 8px; font-size: 8px; color: var(--color-text-maxcontrast); line-height: 1.2;">
<strong>Enables RAG:</strong><br>
Receives context,<br>
generates answer
</div>
</div>
<!-- Arrow Client <-> Server -->
<div style="text-align: center;">
<div style="font-size: 20px; color: var(--color-primary-element);">&harr;</div>
<div style="font-size: 7px; color: var(--color-text-maxcontrast); margin-top: 2px; font-weight: 600; line-height: 1.1;">
Query +<br>
Sampling
</div>
</div>
<!-- MCP Server -->
<div style="background: var(--color-primary-element-light); border: 2px solid var(--color-primary-element); border-radius: var(--border-radius-large); padding: 12px; text-align: center;">
<div style="font-weight: 600; color: var(--color-primary-element); font-size: 13px; margin-bottom: 8px;">MCP Server</div>
<div style="background: var(--color-main-background); border-radius: var(--border-radius); padding: 7px; margin-bottom: 5px;">
<div style="font-weight: 600; color: var(--color-main-text); font-size: 9px; margin-bottom: 2px;">1. Semantic Search</div>
<div style="font-size: 7px; color: var(--color-text-maxcontrast); line-height: 1.2;">
Vector embeddings<br>
BM25 Hybrid + RRF
</div>
</div>
<div style="background: var(--color-main-background); border-radius: var(--border-radius); padding: 7px; margin-bottom: 5px;">
<div style="font-weight: 600; color: var(--color-main-text); font-size: 9px; margin-bottom: 2px;">2. Retrieve Context</div>
<div style="font-size: 7px; color: var(--color-text-maxcontrast); line-height: 1.2;">
Top relevant docs<br>
with scores
</div>
</div>
<div style="background: var(--color-main-background); border-radius: var(--border-radius); padding: 7px; margin-bottom: 5px;">
<div style="font-weight: 600; color: var(--color-main-text); font-size: 9px; margin-bottom: 2px;">3. Format Response</div>
<div style="font-size: 7px; color: var(--color-text-maxcontrast); line-height: 1.2;">
Document chunks<br>
with citations
</div>
</div>
<div style="background: var(--color-main-background); border-radius: var(--border-radius); padding: 7px;">
<div style="font-weight: 600; color: var(--color-main-text); font-size: 9px; margin-bottom: 2px;">4. Send to LLM</div>
<div style="font-size: 7px; color: var(--color-text-maxcontrast); line-height: 1.2;">
Via MCP sampling<br>
for answer generation
</div>
</div>
</div>
<!-- Arrow Server <-> Nextcloud -->
<div style="text-align: center;">
<div style="font-size: 20px; color: var(--color-primary-element);">&harr;</div>
<div style="font-size: 7px; color: var(--color-text-maxcontrast); margin-top: 2px; font-weight: 600; line-height: 1.1;">
Retrieve
</div>
</div>
<!-- Nextcloud -->
<div style="background: var(--color-background-hover); border: 2px solid var(--color-border); border-radius: var(--border-radius-large); padding: 12px; text-align: center; position: relative;">
<img src="/app/static/nextcloud-logo.png" alt="Nextcloud" style="width: 40px; height: 40px; margin-bottom: 6px;" />
<div style="font-weight: 600; color: var(--color-main-text); font-size: 12px; margin-bottom: 4px;">Nextcloud</div>
<div style="font-size: 8px; color: var(--color-text-maxcontrast); line-height: 1.2;">
Notes, Calendar,<br>
Files, Contacts,<br>
Deck
</div>
</div>
</div>
<!-- Explanation below diagram -->
<div style="margin-top: 24px; padding: 16px; background: var(--color-background-hover); border-radius: var(--border-radius); border-left: 4px solid var(--color-primary-element);">
<div style="font-size: 12px; color: var(--color-main-text); line-height: 1.6;">
<strong>How RAG works via MCP Sampling:</strong>
</div>
<ol style="margin: 8px 0 0 0; padding-left: 20px; font-size: 11px; color: var(--color-text-maxcontrast); line-height: 1.6;">
<li>The user asks a question through the MCP Client</li>
<li>The Client sends the query to the MCP Server</li>
<li>The Server retrieves relevant document context from Nextcloud</li>
<li><strong>The Server sends the context back to the Client's LLM</strong> (MCP Sampling)</li>
<li>The Client's LLM generates an answer with citations from the retrieved context</li>
<li>The answer is returned to the user</li>
</ol>
<div style="margin-top: 8px; font-size: 10px; color: var(--color-text-maxcontrast); font-style: italic;">
The server has no LLM; it only retrieves context. The client's existing LLM is reused for answer generation.
</div>
</div>
</div>
</div>
<p style="margin-top: 16px;">
<strong>Key Point:</strong> The MCP server retrieves context but doesn't generate answers itself.
Through <strong>MCP sampling</strong>, it asks the client's LLM to generate responses, giving users
full control over which model is used and keeping answer generation on the client side.
</p>
<p>
By using this interface, you can preview search results, understand relevance scores, and verify
that the system retrieves the right information before it reaches the LLM.
</p>
</div>
<!-- Feature Cards -->
<h2>Available Features</h2>
<div class="feature-grid">
<a href="#" @click.prevent="activeSection = 'user-info'" class="feature-card">
<div class="feature-icon">
<svg viewBox="0 0 24 24">
<path d="M12,4A4,4 0 0,1 16,8A4,4 0 0,1 12,12A4,4 0 0,1 8,8A4,4 0 0,1 12,4M12,14C16.42,14 20,15.79 20,18V20H4V18C4,15.79 7.58,14 12,14Z" />
</svg>
</div>
<h3>User Information</h3>
<p>
View your authentication details, session information, and IdP profile.
Manage background access permissions.
</p>
</a>
<a href="#" @click.prevent="activeSection = 'vector-sync'" class="feature-card">
<div class="feature-icon">
<svg viewBox="0 0 24 24">
<path d="M12,18A6,6 0 0,1 6,12C6,11 6.25,10.03 6.7,9.2L5.24,7.74C4.46,8.97 4,10.43 4,12A8,8 0 0,0 12,20V23L16,19L12,15M12,4V1L8,5L12,9V6A6,6 0 0,1 18,12C18,13 17.75,13.97 17.3,14.8L18.76,16.26C19.54,15.03 20,13.57 20,12A8,8 0 0,0 12,4Z" />
</svg>
</div>
<h3>Vector Sync Status</h3>
<p>
Monitor real-time indexing progress with metrics for indexed documents, pending queue,
and synchronization status.
</p>
</a>
<a href="#" @click.prevent="activeSection = 'vector-viz'" class="feature-card">
<div class="feature-icon">
<svg viewBox="0 0 24 24">
<path d="M22,21H2V3H4V19H6V10H10V19H12V6H16V19H18V14H22V21Z" />
</svg>
</div>
<h3>Vector Visualization</h3>
<p>
Interactive search interface with 2D PCA visualization. Compare algorithms,
view relevance scores, and explore matched document chunks.
</p>
</a>
</div>
{% else %}
<!-- Vector Sync Disabled Content -->
<div class="warning">
<h3 style="margin-top: 0;">Vector Sync is Disabled</h3>
<p>
Semantic search and vector visualization features are currently disabled.
To enable these features, set <code>VECTOR_SYNC_ENABLED=true</code> in your environment configuration.
</p>
<p style="margin-bottom: 0;">
<strong>Learn more:</strong>
<a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/configuration.md" target="_blank" rel="noopener noreferrer" style="color: inherit; text-decoration: underline;">
Configuration Guide
</a>
</p>
</div>
<!-- Limited Feature Card -->
<h2>Available Features</h2>
<div class="feature-grid">
<a href="#" @click.prevent="activeSection = 'user-info'" class="feature-card">
<div class="feature-icon">
<svg viewBox="0 0 24 24">
<path d="M12,4A4,4 0 0,1 16,8A4,4 0 0,1 12,12A4,4 0 0,1 8,8A4,4 0 0,1 12,4M12,14C16.42,14 20,15.79 20,18V20H4V18C4,15.79 7.58,14 12,14Z" />
</svg>
</div>
<h3>User Information</h3>
<p>
View your authentication details, session information, and IdP profile.
Manage background access permissions.
</p>
</a>
</div>
{% endif %}
<!-- Documentation Section -->
<div class="info-section" style="margin-top: 40px;">
<h2>Documentation</h2>
<p>
For detailed information about configuration, authentication modes, and advanced features,
please refer to the project documentation:
</p>
<ul>
<li><a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/installation.md" target="_blank" rel="noopener noreferrer">Installation Guide</a></li>
<li><a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/configuration.md" target="_blank" rel="noopener noreferrer">Configuration Options</a></li>
<li><a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/authentication.md" target="_blank" rel="noopener noreferrer">Authentication Modes</a></li>
{% if vector_sync_enabled %}
<li><a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/user-guide/vector-sync-ui.md" target="_blank" rel="noopener noreferrer">Vector Sync UI Guide</a></li>
{% endif %}
</ul>
</div>
</div>
<!-- User Info Section -->
<div x-show="activeSection === 'user-info'">
<div class="content-section">
<h1>User Information</h1>
{{ user_info_tab_html|safe }}
</div>
</div>
{% if show_vector_sync_tab %}
<!-- Vector Sync Section -->
<div x-show="activeSection === 'vector-sync'">
<div class="content-section">
<h1>Vector Sync Status</h1>
{{ vector_sync_tab_html|safe }}
</div>
</div>
<!-- Vector Viz Section -->
<div x-show="activeSection === 'vector-viz'">
<div class="content-section">
<h1>Vector Visualization</h1>
<div hx-get="/app/vector-viz" hx-trigger="load" hx-swap="outerHTML">
<p style="color: #999;">Loading vector visualization...</p>
</div>
</div>
</div>
{% endif %}
{% if show_webhooks_tab %}
<!-- Webhooks Section -->
<div x-show="activeSection === 'webhooks'">
<div class="content-section">
<h1>Webhook Management</h1>
{{ webhooks_tab_html|safe }}
</div>
</div>
{% endif %}
</div>
</main>
</div>
<script>
// Set global Nextcloud base URL for use in external JS
window.NEXTCLOUD_BASE_URL = {{ nextcloud_host_for_links|tojson }};
</script>
<script src="/app/static/vector-viz.js"></script>
{% endblock %}
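Review note on the welcome template above: it describes hybrid retrieval as BM25 plus dense vectors, and the diagram names RRF as the fusion step. A minimal sketch of reciprocal rank fusion follows. Everything in it is an assumption made for illustration: the function name `reciprocalRankFusion`, the constant `k = 60` (a commonly used default), and the document ids. The project may well delegate fusion to its vector database rather than implement it by hand.

```javascript
// Reciprocal rank fusion (RRF): each ranked list contributes 1 / (k + rank)
// to a document's fused score, where rank is the 1-based position in that list.
function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical result ids from a dense (semantic) ranking and a BM25 ranking.
const dense = ['note_1', 'file_7', 'note_3'];
const bm25 = ['file_7', 'note_9', 'note_1'];

const fused = reciprocalRankFusion([dense, bm25]);
console.log(fused.map((r) => r.id)); // file_7, note_1, note_9, note_3
```

Documents that appear in both rankings accumulate two contributions, which is why `file_7` and `note_1` rise above ids found by only one retriever. RRF uses ranks only, so the raw BM25 and cosine scores never need to be put on a common scale.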
@@ -0,0 +1,180 @@
<div x-data="vizApp()">
<div class="viz-layout">
<!-- Top: Search Controls -->
<div class="viz-card viz-controls-card">
<form @submit.prevent="executeSearch">
<div class="viz-controls-grid">
<div class="viz-control-group">
<label>Search Query</label>
<input type="text" x-model="query" placeholder="Enter search query..." required />
</div>
<div class="viz-control-group">
<label>Algorithm</label>
<select x-model="algorithm">
<option value="semantic">Semantic (Dense)</option>
<option value="bm25_hybrid" selected>BM25 Hybrid</option>
</select>
</div>
<div class="viz-control-group">
<label>Fusion</label>
<select x-model="fusion" :disabled="algorithm !== 'bm25_hybrid'" :style="algorithm !== 'bm25_hybrid' ? 'opacity: 0.5; cursor: not-allowed;' : ''">
<option value="rrf" selected>RRF</option>
<option value="dbsf">DBSF</option>
</select>
</div>
<div class="viz-control-group">
<label>&nbsp;</label>
<button type="submit" class="viz-btn">Search</button>
</div>
<div class="viz-control-group">
<label>&nbsp;</label>
<button type="button" class="viz-btn-secondary" @click="showAdvanced = !showAdvanced">
<span x-text="showAdvanced ? 'Hide' : 'Advanced'"></span>
</button>
</div>
</div>
<!-- Advanced Options (Collapsible) -->
<div x-show="showAdvanced" style="margin-top: 16px;">
<div class="viz-controls-grid" style="grid-template-columns: repeat(auto-fit, minmax(150px, 1fr));">
<div class="viz-control-group">
<label>Document Types</label>
<div style="display: grid; grid-template-columns: 1fr 1fr; gap: 8px; font-size: 13px;">
<label style="display: flex; align-items: center; cursor: pointer; font-weight: normal;">
<input type="checkbox" x-model="docTypes" value="" style="margin-right: 4px;">
<span>All</span>
</label>
<label style="display: flex; align-items: center; cursor: pointer; font-weight: normal;">
<input type="checkbox" x-model="docTypes" value="note" style="margin-right: 4px;">
<span>Notes</span>
</label>
<label style="display: flex; align-items: center; cursor: pointer; font-weight: normal;">
<input type="checkbox" x-model="docTypes" value="file" style="margin-right: 4px;">
<span>Files</span>
</label>
<label style="display: flex; align-items: center; cursor: pointer; font-weight: normal;">
<input type="checkbox" x-model="docTypes" value="calendar" style="margin-right: 4px;">
<span>Calendar</span>
</label>
<label style="display: flex; align-items: center; cursor: pointer; font-weight: normal;">
<input type="checkbox" x-model="docTypes" value="contact" style="margin-right: 4px;">
<span>Contacts</span>
</label>
<label style="display: flex; align-items: center; cursor: pointer; font-weight: normal;">
<input type="checkbox" x-model="docTypes" value="deck" style="margin-right: 4px;">
<span>Deck</span>
</label>
</div>
</div>
<div class="viz-control-group">
<label>Score Threshold</label>
<input type="number" x-model.number="scoreThreshold" min="0" max="1" step="any" />
</div>
<div class="viz-control-group">
<label>Result Limit</label>
<input type="number" x-model.number="limit" min="1" max="1000" />
</div>
<div class="viz-control-group">
<label>Display Options</label>
<label style="display: flex; align-items: center; cursor: pointer; font-weight: normal; margin-top: 4px;">
<input type="checkbox" x-model="showQueryPoint" @change="updatePlot()" style="margin-right: 6px;">
<span>Show Query Point</span>
</label>
</div>
</div>
</div>
</form>
</div>
<!-- Plot -->
<div class="viz-card viz-card-plot">
<div id="viz-plot-container">
<div x-show="loading" class="viz-loading-overlay" x-transition.opacity.duration.200ms>
Executing search and computing PCA projection...
</div>
<div id="viz-plot" x-show="!loading" x-transition.opacity.duration.200ms></div>
</div>
</div>
<!-- Results -->
<div class="viz-card" style="flex: 0 0 auto;">
<h3 style="margin-top: 0;">Search Results (<span x-text="loading ? '...' : results.length"></span>)</h3>
<div x-show="loading" class="viz-loading" x-transition.opacity.duration.200ms>
Loading results...
</div>
<div x-show="!loading && results.length === 0" class="viz-no-results" x-transition.opacity.duration.200ms>
No results found. Try a different query or adjust your search parameters.
</div>
<template x-if="!loading && results.length > 0">
<div x-transition.opacity.duration.200ms>
<template x-for="result in results" :key="`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`">
<div style="padding: 12px; border-bottom: 1px solid #eee;">
<a :href="getNextcloudUrl(result)" target="_blank" rel="noopener noreferrer" style="font-weight: 500; color: #0066cc; text-decoration: none;">
<span x-text="result.title"></span>
</a>
<div style="font-size: 14px; color: #666; margin-top: 4px;"
x-text="result.excerpt.length > 200 ? result.excerpt.substring(0, 200) + '...' : result.excerpt"></div>
<div style="font-size: 12px; color: #999; margin-top: 4px;">
Raw Score: <span x-text="result.original_score.toFixed(3)"></span>
(<span x-text="(result.score * 100).toFixed(0)"></span>% relative) |
Type: <span x-text="result.doc_type"></span>
</div>
<!-- Show Chunk button (only if chunk position is available) -->
<template x-if="hasChunkPosition(result)">
<button
class="chunk-toggle-btn"
@click="toggleChunk(result)"
x-text="isChunkExpanded(`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`) ? 'Hide Chunk' : 'Show Chunk'"
></button>
</template>
<!-- Chunk context (expanded inline) -->
<template x-if="isChunkExpanded(`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`)">
<div class="chunk-context" x-transition.opacity.duration.200ms>
<template x-if="chunkLoading[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]">
<div style="color: #666; font-style: italic;">Loading chunk...</div>
</template>
<template x-if="!chunkLoading[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]">
<div>
<!-- Highlighted page image for PDFs -->
<template x-if="expandedChunks[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]?.highlighted_page_image">
<div class="chunk-image-container">
<div class="chunk-image-header">
<span>Page <span x-text="expandedChunks[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]?.page_number"></span></span>
</div>
<img
:src="'data:image/png;base64,' + expandedChunks[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]?.highlighted_page_image"
:alt="'Page ' + expandedChunks[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]?.page_number"
class="chunk-highlighted-image"
/>
</div>
</template>
<!-- Text context -->
<template x-if="expandedChunks[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]?.has_more_before">
<span class="chunk-ellipsis">...</span>
</template>
<span class="chunk-text" x-text="expandedChunks[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]?.before_context"></span><span class="chunk-matched" x-text="expandedChunks[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]?.chunk_text"></span><span class="chunk-text" x-text="expandedChunks[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]?.after_context"></span><template x-if="expandedChunks[`${result.doc_type}_${result.id}_${result.chunk_start_offset || 0}`]?.has_more_after">
<span class="chunk-ellipsis">...</span>
</template>
</div>
</template>
</div>
</template>
</div>
</template>
</div>
</template>
</div><!-- Search Results -->
</div><!-- .viz-layout -->
</div><!-- x-data="vizApp()" -->
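Review note: the results list in the template above prints a raw score next to a "% relative" value. The sketch below shows one way such a relative score could be derived, assuming min-max normalization within a single result set. The function name `relative_scores` and the tie-handling rule are illustrative assumptions, not taken from this changeset.

```python
from typing import Sequence


def relative_scores(raw_scores: Sequence[float]) -> list[float]:
    """Min-max normalize raw scores to the 0-1 range within one result set.

    Hypothetical helper: the project may normalize differently.
    """
    if not raw_scores:
        return []
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # Every result scored the same; treat each one as a top match
        # instead of dividing by zero.
        return [1.0 for _ in raw_scores]
    return [(s - lo) / (hi - lo) for s in raw_scores]


raw = [0.82, 0.41, 0.41, 0.10]
for original, relative in zip(raw, relative_scores(raw)):
    # Same formatting as the template: three decimals for the raw score,
    # a whole-number percentage for the relative one.
    print(f"Raw Score: {original:.3f} ({relative * 100:.0f}% relative)")
```

A relative score only ranks results against each other within one query, which is presumably why the template keeps the raw score visible as well.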
@@ -0,0 +1,392 @@
{% extends "base.html" %}
{% block title %}Welcome - Nextcloud MCP Server{% endblock %}
{% block extra_head %}
<!-- Alpine.js for interactive elements -->
<script defer src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js"></script>
{% endblock %}
{% block extra_styles %}
/* Welcome page specific styles */
.hero-section {
background: linear-gradient(135deg, var(--color-primary-element) 0%, #0082c9 100%);
color: white;
padding: 60px 24px;
margin: -24px -24px 40px -24px;
border-radius: 0 0 var(--border-radius-large) var(--border-radius-large);
text-align: center;
}
.hero-section h1 {
color: white;
font-size: 36px;
margin: 0 0 16px 0;
font-weight: 600;
}
.hero-section p {
font-size: 18px;
opacity: 0.95;
max-width: 700px;
margin: 0 auto;
line-height: 1.6;
}
.feature-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
gap: 24px;
margin: 32px 0;
}
.feature-card {
background: var(--color-main-background);
border: 2px solid var(--color-border);
border-radius: var(--border-radius-large);
padding: 24px;
transition: all 0.2s;
cursor: pointer;
text-decoration: none;
color: inherit;
display: block;
}
.feature-card:hover {
border-color: var(--color-primary-element);
box-shadow: 0 4px 12px rgba(0, 103, 158, 0.15);
transform: translateY(-2px);
}
.feature-card h3 {
color: var(--color-primary-element);
font-size: 20px;
margin: 12px 0 8px 0;
font-weight: 600;
display: flex;
align-items: center;
gap: 12px;
}
.feature-card p {
color: var(--color-text-maxcontrast);
font-size: 14px;
line-height: 1.6;
margin: 8px 0 0 0;
}
.feature-icon {
width: 48px;
height: 48px;
background: var(--color-primary-element-light);
border-radius: var(--border-radius);
display: flex;
align-items: center;
justify-content: center;
margin-bottom: 8px;
}
.feature-icon svg {
width: 28px;
height: 28px;
fill: var(--color-primary-element);
}
.info-section {
background: var(--color-background-hover);
border-radius: var(--border-radius-large);
padding: 32px;
margin: 32px 0;
}
.info-section h2 {
color: var(--color-main-text);
font-size: 24px;
margin: 0 0 16px 0;
border: none;
padding: 0;
}
.info-section p {
color: var(--color-text-maxcontrast);
line-height: 1.7;
margin: 12px 0;
}
.info-section ul {
margin: 12px 0;
padding-left: 24px;
}
.info-section li {
color: var(--color-text-maxcontrast);
line-height: 1.7;
margin: 8px 0;
}
.info-section code {
background: var(--color-main-background);
padding: 2px 8px;
border-radius: var(--border-radius);
font-size: 13px;
}
.auth-status {
background: var(--color-primary-element-light);
border-left: 4px solid var(--color-primary-element);
padding: 16px 20px;
margin: 24px 0;
border-radius: var(--border-radius);
display: flex;
align-items: center;
gap: 12px;
}
.auth-status svg {
width: 24px;
height: 24px;
fill: var(--color-primary-element);
flex-shrink: 0;
}
.auth-status-text {
flex: 1;
}
.auth-status-text strong {
display: block;
color: var(--color-main-text);
font-size: 14px;
margin-bottom: 4px;
}
.auth-status-text span {
color: var(--color-text-maxcontrast);
font-size: 13px;
}
{% endblock %}
{% block content %}
<div class="app-content-wrapper">
<!-- Main Content Area -->
<main id="app-content">
<div class="page-content">
<!-- Hero Section -->
<div class="hero-section">
<h1>Welcome to Nextcloud MCP Server</h1>
<p>
Interactive user interface for semantic search and document retrieval.
Test queries, visualize results, and explore your Nextcloud content using RAG workflows.
</p>
</div>
<!-- Authentication Status -->
<div class="auth-status">
<svg viewBox="0 0 24 24">
<path d="M12,4A4,4 0 0,1 16,8A4,4 0 0,1 12,12A4,4 0 0,1 8,8A4,4 0 0,1 12,4M12,14C16.42,14 20,15.79 20,18V20H4V18C4,15.79 7.58,14 12,14Z" />
</svg>
<div class="auth-status-text">
<strong>Authenticated as: {{ username }}</strong>
<span>Authentication mode: <code>{{ auth_mode }}</code></span>
</div>
</div>
{% if vector_sync_enabled %}
<!-- Vector Sync Enabled Content -->
<div class="info-section">
<h2>About Semantic Search</h2>
<p>
This interface provides access to <strong>semantic search</strong> capabilities powered by vector embeddings.
Unlike traditional keyword search, semantic search understands the <em>meaning</em> of your queries and finds
conceptually similar content across your Nextcloud apps.
</p>
<p>
<strong>How it works:</strong>
</p>
<ul>
<li>Documents from Notes, Calendar, Files, Contacts, and Deck are indexed into a vector database</li>
<li>Each document chunk is converted to a 768-dimensional vector embedding that captures semantic meaning</li>
<li>Queries are also converted to embeddings and matched against document vectors using similarity search</li>
<li>Results can be retrieved using pure semantic search or hybrid BM25 search combining keywords and semantics</li>
</ul>
</div>
<div class="info-section">
<h2>RAG Workflow Integration</h2>
<p>
This UI allows you to <strong>test the same queries that Large Language Models (LLMs) would use</strong> in a
Retrieval-Augmented Generation (RAG) workflow. When an AI assistant needs to answer questions about your data:
</p>
<ul>
<li><strong>Step 1:</strong> The assistant converts your question into a search query</li>
<li><strong>Step 2:</strong> The MCP server retrieves relevant document chunks using semantic search</li>
<li><strong>Step 3:</strong> Retrieved context is passed to the LLM to generate an informed answer</li>
</ul>
<!-- RAG Workflow Diagram -->
<div style="background: var(--color-main-background); border: 2px solid var(--color-primary-element); border-radius: var(--border-radius-large); padding: 24px; margin: 24px 0; font-family: 'SFMono-Regular', 'Consolas', 'Liberation Mono', 'Menlo', monospace; font-size: 13px; line-height: 1.8; overflow-x: auto;">
<div style="text-align: center; font-weight: 600; margin-bottom: 16px; color: var(--color-primary-element); font-size: 14px;">
MCP Sampling RAG Workflow
</div>
<pre style="margin: 0; color: var(--color-main-text);">
┌─────────────────┐
<strong>MCP Client</strong> │ User asks: "What are health benefits of coffee?"
│ (Claude Code) │
└────────┬────────┘
│ (1) User question
┌────────────────────────────────────────────────────────────────────────┐
<strong>Nextcloud MCP Server</strong>
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ <strong>nc_semantic_search_answer</strong> Tool (MCP Sampling-enabled) │ │
│ │ │ │
│ │ (2) Semantic Search │ │
│ │ ┌────────────────────────────────────────────────────────┐ │ │
│ │ │ Query: "health benefits of coffee" │ │ │
│ │ │ → Convert to 768D vector embedding │ │ │
│ │ │ → Search Qdrant (BM25 Hybrid + RRF fusion) │ │ │
│ │ │ → Retrieve top 5 relevant document chunks │ │ │
│ │ └────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ (3) Construct Prompt with Context │ │
│ │ ┌────────────────────────────────────────────────────────┐ │ │
│ │ │ "What are health benefits of coffee? │ │ │
│ │ │ │ │ │
│ │ │ Documents: │ │ │
│ │ │ - [MED-2155] Effects of habitual coffee consumption...│ │ │
│ │ │ - [MED-1646] Beverage consumption guidance... │ │ │
│ │ │ - [MED-1627] Coffee and depression risk... │ │ │
│ │ │ ... │ │ │
│ │ │ │ │ │
│ │ │ Provide answer with citations." │ │ │
│ │ └────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ (4) MCP Sampling Request │ │
│ │ ─────────────────────────────────────────────────────────────> │ │
│ └──────────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────────┘
│ Sampling request with prompt + context
┌─────────────────┐
<strong>MCP Client</strong> │ (5) Client's LLM generates answer using retrieved context
│ (Claude) │ → "Coffee consumption (2-3 cups/day) is associated with
└────────┬────────┘ reduced risk of type 2 diabetes, cardiovascular disease,
│ and improved liver health (Document 1, 2)..."
│ (6) Answer with citations
┌─────────────────┐
│ User │ Receives comprehensive answer with source citations
└─────────────────┘</pre>
</div>
<p style="margin-top: 16px;">
<strong>Key Point:</strong> The MCP server retrieves context but doesn't generate answers itself.
Through <strong>MCP sampling</strong>, it asks the client's LLM to generate the response, giving users
full control over which model is used and keeping answer generation on the client side.
</p>
<p>
By using this interface, you can preview search results, understand relevance scores, and verify
that the system retrieves the right information before it reaches the LLM.
</p>
</div>
<!-- Feature Cards -->
<h2>Available Features</h2>
<div class="feature-grid">
<a href="/app/user-info" class="feature-card">
<div class="feature-icon">
<svg viewBox="0 0 24 24">
<path d="M12,4A4,4 0 0,1 16,8A4,4 0 0,1 12,12A4,4 0 0,1 8,8A4,4 0 0,1 12,4M12,14C16.42,14 20,15.79 20,18V20H4V18C4,15.79 7.58,14 12,14Z" />
</svg>
</div>
<h3>User Information</h3>
<p>
View your authentication details, session information, and IdP profile.
Manage background access permissions.
</p>
</a>
<a href="/app/user-info#vector-sync" class="feature-card">
<div class="feature-icon">
<svg viewBox="0 0 24 24">
<path d="M12,18A6,6 0 0,1 6,12C6,11 6.25,10.03 6.7,9.2L5.24,7.74C4.46,8.97 4,10.43 4,12A8,8 0 0,0 12,20V23L16,19L12,15M12,4V1L8,5L12,9V6A6,6 0 0,1 18,12C18,13 17.75,13.97 17.3,14.8L18.76,16.26C19.54,15.03 20,13.57 20,12A8,8 0 0,0 12,4Z" />
</svg>
</div>
<h3>Vector Sync Status</h3>
<p>
Monitor real-time indexing progress with metrics for indexed documents, pending queue,
and synchronization status.
</p>
</a>
<a href="/app/user-info#vector-viz" class="feature-card">
<div class="feature-icon">
<svg viewBox="0 0 24 24">
<path d="M22,21H2V3H4V19H6V10H10V19H12V6H16V19H18V14H22V21Z" />
</svg>
</div>
<h3>Vector Visualization</h3>
<p>
Interactive search interface with 2D PCA visualization. Compare algorithms,
view relevance scores, and explore matched document chunks.
</p>
</a>
</div>
{% else %}
<!-- Vector Sync Disabled Content -->
<div class="warning">
<h3 style="margin-top: 0;">Vector Sync is Disabled</h3>
<p>
Semantic search and vector visualization features are currently disabled.
To enable these features, set <code>VECTOR_SYNC_ENABLED=true</code> in your environment configuration.
</p>
<p style="margin-bottom: 0;">
<strong>Learn more:</strong>
<a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/configuration.md" target="_blank" style="color: inherit; text-decoration: underline;">
Configuration Guide
</a>
</p>
</div>
<!-- Limited Feature Card -->
<h2>Available Features</h2>
<div class="feature-grid">
<a href="/app/user-info" class="feature-card">
<div class="feature-icon">
<svg viewBox="0 0 24 24">
<path d="M12,4A4,4 0 0,1 16,8A4,4 0 0,1 12,12A4,4 0 0,1 8,8A4,4 0 0,1 12,4M12,14C16.42,14 20,15.79 20,18V20H4V18C4,15.79 7.58,14 12,14Z" />
</svg>
</div>
<h3>User Information</h3>
<p>
View your authentication details, session information, and IdP profile.
Manage background access permissions.
</p>
</a>
</div>
{% endif %}
<!-- Documentation Section -->
<div class="info-section" style="margin-top: 40px;">
<h2>Documentation</h2>
<p>
For detailed information about configuration, authentication modes, and advanced features,
please refer to the project documentation:
</p>
<ul>
<li><a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/installation.md" target="_blank">Installation Guide</a></li>
<li><a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/configuration.md" target="_blank">Configuration Options</a></li>
<li><a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/authentication.md" target="_blank">Authentication Modes</a></li>
{% if vector_sync_enabled %}
<li><a href="https://github.com/cbcoutinho/nextcloud-mcp-server/blob/master/docs/user-guide/vector-sync-ui.md" target="_blank">Vector Sync UI Guide</a></li>
{% endif %}
</ul>
</div>
</div>
</main>
</div>
{% endblock %}
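Review note: the workflow diagram in the welcome template mentions "BM25 Hybrid + RRF fusion". As a rough illustration of what reciprocal rank fusion does, the sketch below merges two ranked lists of document IDs. The function name `rrf_fuse` is hypothetical, and `k = 60` is only the commonly cited default; this changeset does not show which constant is used, and the fusion may well be delegated to the vector database rather than done in Python.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Sequence


def rrf_fuse(
    rankings: Iterable[Sequence[Hashable]], k: int = 60
) -> list[tuple[Hashable, float]]:
    """Combine several ranked lists with reciprocal rank fusion.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Higher totals rank first.
    """
    scores: defaultdict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, best first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative inputs: one ranking from dense (semantic) retrieval,
# one from sparse (BM25) retrieval.
dense_ranking = ["a", "b", "c"]
sparse_ranking = ["b", "d", "a"]
for doc_id, fused in rrf_fuse([dense_ranking, sparse_ranking]):
    print(doc_id, round(fused, 5))
```

Because RRF works on ranks rather than raw scores, it can combine BM25 and cosine-similarity results without having to put their scores on a common scale.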
@@ -303,10 +303,13 @@ class UnifiedTokenVerifier(TokenVerifier):
try:
# Introspection requires client authentication
client_id = self.settings.oidc_client_id
client_secret = self.settings.oidc_client_secret
assert client_id is not None and client_secret is not None
response = await self.http_client.post(
self.introspection_uri,
data={"token": token},
auth=(self.settings.oidc_client_id, self.settings.oidc_client_secret),
auth=(client_id, client_secret),
)
if response.status_code == 200:
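Review note: the hunk above narrows `client_id` and `client_secret` with `assert`. Since `assert` statements are removed when Python runs with `-O`, an explicit check may be a safer way to satisfy the type checker. A minimal sketch follows; the error class and helper name are hypothetical and not part of the project.

```python
from typing import Optional


class MissingClientCredentialsError(RuntimeError):
    """Hypothetical error type used only for this illustration."""


def require_client_credentials(
    client_id: Optional[str], client_secret: Optional[str]
) -> tuple[str, str]:
    """Return the credentials as non-optional strings or raise.

    The explicit `is None` check narrows the types for a type checker
    and still runs when assertions are disabled.
    """
    if client_id is None or client_secret is None:
        raise MissingClientCredentialsError(
            "OIDC client_id and client_secret are required for token introspection"
        )
    return client_id, client_secret


# Usage: the returned tuple can be passed directly as an `auth=` pair.
auth = require_client_credentials("my-client", "my-secret")
print(auth[0])
```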
+86 -531
@@ -9,24 +9,38 @@ For OAuth mode: Requires browser-based OAuth login to establish session.
import logging
import os
from pathlib import Path
from typing import Any
import httpx
from jinja2 import Environment, FileSystemLoader
from starlette.authentication import requires
from starlette.requests import Request
from starlette.responses import HTMLResponse, JSONResponse
from nextcloud_mcp_server.client import NextcloudClient
logger = logging.getLogger(__name__)
# Setup Jinja2 environment for templates
_template_dir = Path(__file__).parent / "templates"
_jinja_env = Environment(loader=FileSystemLoader(_template_dir))
async def _get_authenticated_client_for_userinfo(request: Request) -> httpx.AsyncClient:
"""Get an authenticated HTTP client for user info page operations.
async def _get_authenticated_client_for_userinfo(request: Request) -> NextcloudClient:
"""Get an authenticated Nextcloud client for user info page operations.
This is a shared helper for authenticated routes that need to access
Nextcloud APIs. It handles both BasicAuth and OAuth authentication modes.
Args:
request: Starlette request object
Returns:
Authenticated httpx.AsyncClient
Authenticated NextcloudClient
Raises:
RuntimeError: If credentials/session not configured
"""
oauth_ctx = getattr(request.app.state, "oauth_context", None)
@@ -39,11 +53,15 @@ async def _get_authenticated_client_for_userinfo(request: Request) -> httpx.Asyn
if not all([nextcloud_host, username, password]):
raise RuntimeError("BasicAuth credentials not configured")
assert nextcloud_host is not None # Type narrowing for type checker
return httpx.AsyncClient(
from httpx import BasicAuth
assert nextcloud_host is not None
assert username is not None
assert password is not None
return NextcloudClient(
base_url=nextcloud_host,
auth=(username, password),
timeout=30.0,
username=username,
auth=BasicAuth(username, password),
)
# OAuth mode - get token from session
@@ -58,15 +76,14 @@ async def _get_authenticated_client_for_userinfo(request: Request) -> httpx.Asyn
raise RuntimeError("No access token found in session")
access_token = token_data["access_token"]
username = token_data.get("username")
nextcloud_host = oauth_ctx.get("config", {}).get("nextcloud_host", "")
if not nextcloud_host:
raise RuntimeError("Nextcloud host not configured")
if not nextcloud_host or not username:
raise RuntimeError("Nextcloud host or username not configured")
return httpx.AsyncClient(
base_url=nextcloud_host,
headers={"Authorization": f"Bearer {access_token}"},
timeout=30.0,
return NextcloudClient.from_token(
base_url=nextcloud_host, token=access_token, username=username
)
@@ -417,10 +434,10 @@ async def user_info_html(request: Request) -> HTMLResponse:
try:
from nextcloud_mcp_server.auth.permissions import is_nextcloud_admin
# Get authenticated HTTP client
http_client = await _get_authenticated_client_for_userinfo(request)
is_admin = await is_nextcloud_admin(request, http_client)
await http_client.aclose()
# Get authenticated Nextcloud client
nc_client = await _get_authenticated_client_for_userinfo(request)
is_admin = await is_nextcloud_admin(request, nc_client._client)
await nc_client.close()
except Exception as e:
logger.warning(f"Failed to check admin status: {e}")
# Default to not admin if check fails
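Review note: in the hunk above, `nc_client.close()` appears to be skipped whenever the admin check raises, because the call sits after the awaited check inside the same `try` body. The sketch below shows the `try`/`finally` shape that would close the client on both paths. `FakeClient` and `check_admin` are stand-ins invented for this example; they are not the project's `NextcloudClient` or `is_nextcloud_admin`.

```python
import asyncio


class FakeClient:
    """Stand-in client that only records whether close() was awaited."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def check_admin(client: FakeClient, fail: bool) -> bool:
    """Hypothetical admin check that can raise, like a failed HTTP call."""
    if fail:
        raise ConnectionError("simulated network failure")
    return True


async def is_admin_safely(client: FakeClient, fail: bool = False) -> bool:
    try:
        return await check_admin(client, fail)
    except Exception:
        # Same fallback as the handler above: default to non-admin.
        return False
    finally:
        # Runs on success and on failure, so the client is always released.
        await client.close()


client = FakeClient()
print(asyncio.run(is_admin_safely(client, fail=True)), client.closed)
```

If the real client supports `async with`, a context manager would express the same guarantee more compactly; whether it does is not visible in this diff.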
@@ -431,51 +448,14 @@ async def user_info_html(request: Request) -> HTMLResponse:
oauth_ctx = getattr(request.app.state, "oauth_context", None)
login_url = str(request.url_for("oauth_login")) if oauth_ctx else "/oauth/login"
error_html = f"""
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Error - Nextcloud MCP Server</title>
<style>
body {{
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif;
max-width: 800px;
margin: 50px auto;
padding: 20px;
background-color: #f5f5f5;
}}
.container {{
background: white;
border-radius: 8px;
padding: 30px;
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
}}
h1 {{
color: #d32f2f;
margin-top: 0;
}}
.error {{
background-color: #ffebee;
border-left: 4px solid #d32f2f;
padding: 15px;
margin: 20px 0;
}}
</style>
</head>
<body>
<div class="container">
<h1>Error Retrieving User Info</h1>
<div class="error">
<strong>Error:</strong> {user_context["error"]}
</div>
<p><a href="{login_url}">Login again</a></p>
</div>
</body>
</html>
"""
return HTMLResponse(content=error_html)
template = _jinja_env.get_template("error.html")
return HTMLResponse(
content=template.render(
error_title="Error Retrieving User Info",
error_message=user_context["error"],
login_url=login_url,
)
)
# Build HTML response
auth_mode = user_context.get("auth_mode", "unknown")
@@ -654,410 +634,26 @@ async def user_info_html(request: Request) -> HTMLResponse:
</div>
"""
html_content = f"""
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Nextcloud MCP Server</title>
# Check if vector sync is enabled (needed for Welcome tab)
vector_sync_enabled = os.getenv("VECTOR_SYNC_ENABLED", "false").lower() == "true"
<!-- htmx for dynamic loading -->
<script src="https://unpkg.com/htmx.org@1.9.10"></script>
<!-- Alpine.js for tab state management -->
<script defer src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js"></script>
<!-- Plotly.js for vector visualization -->
<script src="https://cdn.plot.ly/plotly-2.27.0.min.js"></script>
<!-- Vector visualization app (Alpine.js component) -->
<script>
function vizApp() {{
return {{
query: '',
algorithm: 'hybrid',
showAdvanced: false,
docTypes: [''], // Default to "All Types"
limit: 50,
scoreThreshold: 0.7,
semanticWeight: 0.5,
keywordWeight: 0.3,
fuzzyWeight: 0.2,
loading: false,
results: [],
async executeSearch() {{
this.loading = true;
this.results = [];
try {{
const params = new URLSearchParams({{
query: this.query,
algorithm: this.algorithm,
limit: this.limit,
score_threshold: this.scoreThreshold,
semantic_weight: this.semanticWeight,
keyword_weight: this.keywordWeight,
fuzzy_weight: this.fuzzyWeight,
}});
// Add doc_types parameter (filter out empty string for "All Types")
const selectedTypes = this.docTypes.filter(t => t !== '');
if (selectedTypes.length > 0) {{
params.append('doc_types', selectedTypes.join(','));
}}
const response = await fetch(`/app/vector-viz/search?${{params}}`);
const data = await response.json();
if (data.success) {{
this.results = data.results;
this.renderPlot(data.coordinates_2d, data.results);
}} else {{
alert('Search failed: ' + data.error);
}}
}} catch (error) {{
alert('Error: ' + error.message);
}} finally {{
this.loading = false;
}}
}},
renderPlot(coordinates, results) {{
// Calculate score range for auto-scaling
const scores = results.map(r => r.score);
const minScore = Math.min(...scores);
const maxScore = Math.max(...scores);
const trace = {{
x: coordinates.map(c => c[0]),
y: coordinates.map(c => c[1]),
mode: 'markers',
type: 'scatter',
text: results.map(r => `${{r.title}}<br>Score: ${{r.score.toFixed(3)}}`),
marker: {{
// Multi-channel encoding: size + opacity + color for visual hierarchy
// Power scaling (score^2) amplifies visual differences dramatically
// score=0.0 → 6px, score=0.5 → 9.5px, score=1.0 → 20px
size: results.map(r => 6 + (Math.pow(r.score, 2) * 14)),
// Linear opacity scaling (0.2-1.0 range keeps all points visible)
opacity: results.map(r => 0.2 + (r.score * 0.8)),
// Color gradient shows score
color: scores,
colorscale: 'Viridis',
showscale: true,
colorbar: {{ title: 'Relative Score' }},
// Scores are normalized 0-1 within result set
cmin: 0,
cmax: 1
}}
}};
const layout = {{
title: `Vector Space (PCA 2D) - ${{results.length}} results`,
xaxis: {{ title: 'PC1' }},
yaxis: {{ title: 'PC2' }},
hovermode: 'closest',
height: 600
}};
Plotly.newPlot('viz-plot', [trace], layout);
}},
getNextcloudUrl(result) {{
// Generate Nextcloud URL based on document type
// Use the actual Nextcloud host (port 8080), not the MCP server
const baseUrl = '{nextcloud_host_for_links}';
switch (result.doc_type) {{
case 'note':
return `${{baseUrl}}/apps/notes/note/${{result.id}}`;
case 'file':
return `${{baseUrl}}/apps/files/?fileId=${{result.id}}`;
case 'calendar':
return `${{baseUrl}}/apps/calendar`;
case 'contact':
return `${{baseUrl}}/apps/contacts`;
case 'deck':
return `${{baseUrl}}/apps/deck`;
default:
return `${{baseUrl}}`;
}}
}}
}}
}}
</script>
<style>
body {{
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif;
max-width: 900px;
margin: 50px auto;
padding: 20px;
background-color: #f5f5f5;
}}
.container {{
background: white;
border-radius: 8px;
padding: 30px;
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
min-height: calc(100vh - 200px);
}}
h1 {{
color: #0082c9;
margin-top: 0;
border-bottom: 2px solid #0082c9;
padding-bottom: 10px;
}}
h2 {{
color: #333;
margin-top: 20px;
border-bottom: 1px solid #e0e0e0;
padding-bottom: 5px;
}}
/* Tab navigation */
.tabs {{
display: flex;
gap: 0;
margin: 20px 0 0 0;
border-bottom: 2px solid #e0e0e0;
}}
.tab {{
padding: 12px 24px;
cursor: pointer;
background: transparent;
border: none;
font-size: 14px;
font-weight: 500;
color: #666;
border-bottom: 2px solid transparent;
margin-bottom: -2px;
transition: all 0.2s;
}}
.tab:hover {{
color: #0082c9;
background-color: #f5f5f5;
}}
.tab.active {{
color: #0082c9;
border-bottom-color: #0082c9;
}}
/* Tab content - use grid to overlay panes */
.tab-content {{
padding: 20px 0;
display: grid;
}}
/* Tab panes - all occupy the same grid cell to overlay */
.tab-pane {{
grid-area: 1 / 1;
}}
/* Tables */
table {{
width: 100%;
border-collapse: collapse;
margin: 15px 0;
}}
td {{
padding: 10px;
border-bottom: 1px solid #e0e0e0;
}}
td:first-child {{
width: 200px;
color: #666;
}}
code {{
background-color: #f5f5f5;
padding: 2px 6px;
border-radius: 3px;
font-family: 'Courier New', monospace;
}}
/* Badges */
.badge {{
display: inline-block;
padding: 3px 8px;
border-radius: 12px;
font-size: 12px;
font-weight: bold;
text-transform: uppercase;
}}
.badge-oauth {{
background-color: #4caf50;
color: white;
}}
.badge-basic {{
background-color: #2196f3;
color: white;
}}
/* Messages */
.warning {{
background-color: #fff3cd;
border-left: 4px solid #ffc107;
padding: 15px;
margin: 15px 0;
color: #856404;
}}
.info-message {{
background-color: #e3f2fd;
border-left: 4px solid #2196f3;
padding: 15px;
margin: 15px 0;
color: #1565c0;
}}
/* Buttons */
.button {{
display: inline-block;
padding: 10px 20px;
background-color: #d32f2f;
color: white;
text-decoration: none;
border-radius: 4px;
transition: background-color 0.3s;
border: none;
cursor: pointer;
font-size: 14px;
}}
.button:hover {{
background-color: #b71c1c;
}}
.button-primary {{
background-color: #0082c9;
}}
.button-primary:hover {{
background-color: #006ba3;
}}
/* Logout section */
.logout {{
margin-top: 30px;
padding-top: 20px;
border-top: 1px solid #e0e0e0;
}}
/* Smooth htmx content swaps */
.htmx-swapping {{
opacity: 0;
transition: opacity 200ms ease-out;
}}
/* Smooth htmx content settling */
.htmx-settling {{
opacity: 1;
transition: opacity 200ms ease-in;
}}
</style>
</head>
<body>
<div class="container" x-data="{{ activeTab: 'user-info' }}">
<h1>Nextcloud MCP Server</h1>
<!-- Tab Navigation -->
<div class="tabs">
<button
class="tab"
:class="activeTab === 'user-info' ? 'active' : ''"
@click="activeTab = 'user-info'">
User Info
</button>
{
""
if not show_vector_sync_tab
else '''
<button
class="tab"
:class="activeTab === 'vector-sync' ? 'active' : ''"
@click="activeTab = 'vector-sync'">
Vector Sync
</button>
'''
}
{
""
if not show_vector_sync_tab
else '''
<button
class="tab"
:class="activeTab === 'vector-viz' ? 'active' : ''"
@click="activeTab = 'vector-viz'">
Vector Viz
</button>
'''
}
{
""
if not show_webhooks_tab
else '''
<button
class="tab"
:class="activeTab === 'webhooks' ? 'active' : ''"
@click="activeTab = 'webhooks'">
Webhooks
</button>
'''
}
</div>
<!-- Tab Content -->
<div class="tab-content">
<!-- User Info Tab -->
<div class="tab-pane" x-show="activeTab === 'user-info'" x-transition.opacity.duration.150ms>
{user_info_tab_html}
</div>
{
""
if not show_vector_sync_tab
else f'''
<!-- Vector Sync Tab -->
<div class="tab-pane" x-show="activeTab === 'vector-sync'" x-transition.opacity.duration.150ms>
{vector_sync_tab_html}
</div>
'''
}
{
""
if not show_vector_sync_tab
else '''
<!-- Vector Viz Tab -->
<div class="tab-pane" x-show="activeTab === 'vector-viz'" x-transition.opacity.duration.150ms>
<div hx-get="/app/vector-viz" hx-trigger="load" hx-swap="outerHTML">
<p style="color: #999;">Loading vector visualization...</p>
</div>
</div>
'''
}
{
""
if not show_webhooks_tab
else f'''
<!-- Webhooks Tab (admin-only, loaded dynamically) -->
<div class="tab-pane" x-show="activeTab === 'webhooks'" x-transition.opacity.duration.150ms>
{webhooks_tab_html}
</div>
'''
}
</div>
{
f'<div class="logout"><a href="{logout_url}" class="button">Logout</a></div>'
if auth_mode == "oauth"
else ""
}
</div>
</body>
</html>
"""
return HTMLResponse(content=html_content)
# Render template
template = _jinja_env.get_template("user_info.html")
return HTMLResponse(
content=template.render(
user_info_tab_html=user_info_tab_html,
vector_sync_tab_html=vector_sync_tab_html,
webhooks_tab_html=webhooks_tab_html,
show_vector_sync_tab=show_vector_sync_tab,
show_webhooks_tab=show_webhooks_tab,
logout_url=logout_url if auth_mode == "oauth" else None,
nextcloud_host_for_links=nextcloud_host_for_links,
# Additional context for Welcome tab
vector_sync_enabled=vector_sync_enabled,
username=username,
auth_mode=auth_mode,
)
)
@requires("authenticated", redirect="oauth_login")
@@ -1077,17 +673,12 @@ async def revoke_session(request: Request) -> HTMLResponse:
oauth_ctx = getattr(request.app.state, "oauth_context", None)
if not oauth_ctx:
template = _jinja_env.get_template("error.html")
return HTMLResponse(
"""
<!DOCTYPE html>
<html>
<head><title>Error</title></head>
<body>
<h1>Error</h1>
<p>OAuth mode not enabled</p>
</body>
</html>
""",
content=template.render(
error_title="Error",
error_message="OAuth mode not enabled",
),
status_code=400,
)
@@ -1095,17 +686,12 @@ async def revoke_session(request: Request) -> HTMLResponse:
session_id = request.cookies.get("mcp_session")
if not storage or not session_id:
template = _jinja_env.get_template("error.html")
return HTMLResponse(
"""
<!DOCTYPE html>
<html>
<head><title>Error</title></head>
<body>
<h1>Error</h1>
<p>Session not found</p>
</body>
</html>
""",
content=template.render(
error_title="Error",
error_message="Session not found",
),
status_code=400,
)
@@ -1118,57 +704,26 @@ async def revoke_session(request: Request) -> HTMLResponse:
# Redirect back to user page
user_page_url = str(request.url_for("user_info_html"))
template = _jinja_env.get_template("success.html")
return HTMLResponse(
f"""
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="refresh" content="2;url={user_page_url}">
<title>Background Access Revoked</title>
<style>
body {{
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
max-width: 600px;
margin: 50px auto;
padding: 20px;
text-align: center;
}}
.success {{
background-color: #e8f5e9;
border: 2px solid #4caf50;
padding: 30px;
border-radius: 8px;
}}
h1 {{
color: #4caf50;
}}
</style>
</head>
<body>
<div class="success">
<h1>✓ Background Access Revoked</h1>
<p>Your refresh token has been deleted successfully.</p>
<p>Browser session remains active.</p>
<p>Redirecting back to user page...</p>
</div>
</body>
</html>
"""
content=template.render(
success_title="✓ Background Access Revoked",
success_messages=[
"Your refresh token has been deleted successfully.",
"Browser session remains active.",
],
redirect_url=user_page_url,
redirect_delay=2,
)
)
except Exception as e:
logger.error(f"Failed to revoke background access: {e}")
template = _jinja_env.get_template("error.html")
return HTMLResponse(
f"""
<!DOCTYPE html>
<html>
<head><title>Error</title></head>
<body>
<h1>Error</h1>
<p>Failed to revoke background access: {e}</p>
</body>
</html>
""",
content=template.render(
error_title="Error",
error_message=f"Failed to revoke background access: {e}",
),
status_code=500,
)
+427 -370
@@ -1,35 +1,42 @@
"""Vector visualization routes for testing search algorithms.
Provides a web UI for users to test different search algorithms on their own
indexed documents and visualize results in 2D space using PCA.
indexed documents and visualize results in 3D space using PCA.
All processing happens server-side following ADR-012:
- Search execution via shared search/algorithms.py
- PCA dimensionality reduction (768-dim → 2D)
- Only 2D coordinates + metadata sent to client
- Bandwidth-efficient (2 floats per doc vs 768)
- Query embedding generation
- PCA dimensionality reduction (768-dim → 3D)
- Only 3D coordinates + metadata sent to client
- Bandwidth-efficient (3 floats per doc vs 768)
"""
import logging
import time
from pathlib import Path
import numpy as np
from jinja2 import Environment, FileSystemLoader
from starlette.authentication import requires
from starlette.requests import Request
from starlette.responses import HTMLResponse, JSONResponse
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.observability.tracing import trace_operation
from nextcloud_mcp_server.search import (
FuzzySearchAlgorithm,
HybridSearchAlgorithm,
KeywordSearchAlgorithm,
BM25HybridSearchAlgorithm,
SemanticSearchAlgorithm,
)
from nextcloud_mcp_server.vector.pca import PCA
from nextcloud_mcp_server.vector.placeholder import get_placeholder_filter
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
logger = logging.getLogger(__name__)
# Setup Jinja2 environment for templates
_template_dir = Path(__file__).parent / "templates"
_jinja_env = Environment(loader=FileSystemLoader(_template_dir))
@requires("authenticated", redirect="oauth_login")
async def vector_visualization_html(request: Request) -> HTMLResponse:
@@ -65,284 +72,28 @@ async def vector_visualization_html(request: Request) -> HTMLResponse:
else "unknown"
)
html_content = f"""
<style>
.viz-card {{
background: white;
border-radius: 8px;
padding: 20px;
margin-bottom: 20px;
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
}}
.viz-controls {{
margin-bottom: 20px;
}}
.viz-control-row {{
display: grid;
grid-template-columns: 2fr 1fr auto;
gap: 12px;
margin-bottom: 12px;
align-items: end;
}}
.viz-control-group {{
margin-bottom: 15px;
}}
.viz-control-group label {{
display: block;
margin-bottom: 5px;
font-weight: 500;
color: #333;
}}
.viz-control-group input[type="text"],
.viz-control-group input[type="number"],
.viz-control-group select {{
width: 100%;
padding: 8px 12px;
border: 1px solid #ddd;
border-radius: 4px;
font-size: 14px;
}}
.viz-control-group input[type="range"] {{
width: 100%;
}}
.viz-control-group select[multiple] {{
min-height: 100px;
}}
.viz-weight-display {{
display: inline-block;
min-width: 40px;
text-align: right;
color: #666;
}}
.viz-btn {{
background: #0066cc;
color: white;
border: none;
padding: 10px 20px;
border-radius: 4px;
cursor: pointer;
font-size: 14px;
font-weight: 500;
}}
.viz-btn:hover {{
background: #0052a3;
}}
.viz-btn-secondary {{
background: #6c757d;
color: white;
border: none;
padding: 6px 12px;
border-radius: 4px;
cursor: pointer;
font-size: 13px;
margin-bottom: 12px;
}}
.viz-btn-secondary:hover {{
background: #5a6268;
}}
#viz-plot-container {{
width: 100%;
height: 600px;
position: relative;
}}
#viz-plot {{
width: 100%;
height: 100%;
}}
.viz-loading {{
text-align: center;
padding: 40px;
color: #666;
}}
.viz-loading-overlay {{
position: absolute;
inset: 0;
display: flex;
align-items: center;
justify-content: center;
background: white;
color: #666;
}}
.viz-no-results {{
text-align: center;
padding: 40px;
color: #666;
font-style: italic;
}}
.viz-advanced-section {{
margin-top: 16px;
padding: 16px;
background: #f8f9fa;
border-radius: 4px;
border: 1px solid #dee2e6;
}}
.viz-advanced-grid {{
display: grid;
grid-template-columns: 1fr 1fr;
gap: 20px;
}}
.viz-info-box {{
background: #e3f2fd;
border-left: 4px solid #2196f3;
padding: 12px;
margin-bottom: 20px;
font-size: 14px;
}}
</style>
<div x-data="vizApp()">
<div class="viz-card">
<h2>Vector Visualization</h2>
<div class="viz-info-box">
Testing search algorithms on your indexed documents. User: <strong>{username}</strong>
</div>
<form @submit.prevent="executeSearch">
<div class="viz-controls">
<!-- Main Controls -->
<div class="viz-control-group">
<label>Search Query</label>
<input type="text" x-model="query" placeholder="Enter search query..." required />
</div>
<div class="viz-control-row">
<div class="viz-control-group" style="margin-bottom: 0;">
<label>Algorithm</label>
<select x-model="algorithm">
<option value="semantic">Semantic (Vector Similarity)</option>
<option value="keyword">Keyword (Token Matching)</option>
<option value="fuzzy">Fuzzy (Character Overlap)</option>
<option value="hybrid" selected>Hybrid (RRF Fusion)</option>
</select>
</div>
<div style="display: flex; align-items: flex-end;">
<button type="submit" class="viz-btn" style="width: 100%;">Search & Visualize</button>
</div>
<div style="display: flex; align-items: flex-end;">
<button type="button" class="viz-btn-secondary" @click="showAdvanced = !showAdvanced" style="white-space: nowrap;">
<span x-text="showAdvanced ? 'Hide Advanced' : 'Advanced'"></span>
</button>
</div>
</div>
<!-- Advanced Options (Collapsible) -->
<div class="viz-advanced-section" x-show="showAdvanced" x-transition.opacity.duration.200ms>
<h3 style="margin-top: 0; margin-bottom: 16px; font-size: 16px;">Advanced Options</h3>
<div class="viz-advanced-grid">
<div class="viz-control-group">
<label>Document Types</label>
<select x-model="docTypes" multiple>
<option value="">All Types (cross-app search)</option>
<option value="note">Notes</option>
<option value="file">Files</option>
<option value="calendar">Calendar Events</option>
<option value="contact">Contacts</option>
<option value="deck">Deck Cards</option>
</select>
<small style="color: #666; display: block; margin-top: 4px;">
Hold Ctrl/Cmd to select multiple
</small>
</div>
<div>
<div class="viz-control-group">
<label>Score Threshold (Semantic/Hybrid)</label>
<input type="number" x-model.number="scoreThreshold" min="0" max="1" step="0.1" />
</div>
<div class="viz-control-group">
<label>Result Limit</label>
<input type="number" x-model.number="limit" min="1" max="100" />
</div>
</div>
</div>
<!-- Hybrid Weights (only when hybrid selected) -->
<div x-show="algorithm === 'hybrid'" style="margin-top: 16px; padding: 12px; background: #e9ecef; border-radius: 4px;">
<label style="margin-bottom: 12px; display: block;">Hybrid Algorithm Weights</label>
<div style="margin-bottom: 8px;">
<label style="display: inline-block; width: 100px; font-weight: normal;">Semantic:</label>
<input type="range" x-model.number="semanticWeight" min="0" max="1" step="0.1" style="width: 200px; display: inline-block;">
<span class="viz-weight-display" x-text="semanticWeight.toFixed(1)"></span>
</div>
<div style="margin-bottom: 8px;">
<label style="display: inline-block; width: 100px; font-weight: normal;">Keyword:</label>
<input type="range" x-model.number="keywordWeight" min="0" max="1" step="0.1" style="width: 200px; display: inline-block;">
<span class="viz-weight-display" x-text="keywordWeight.toFixed(1)"></span>
</div>
<div>
<label style="display: inline-block; width: 100px; font-weight: normal;">Fuzzy:</label>
<input type="range" x-model.number="fuzzyWeight" min="0" max="1" step="0.1" style="width: 200px; display: inline-block;">
<span class="viz-weight-display" x-text="fuzzyWeight.toFixed(1)"></span>
</div>
</div>
</div>
</div>
</form>
</div>
<div class="viz-card">
<div id="viz-plot-container">
<div x-show="loading" class="viz-loading-overlay" x-transition.opacity.duration.200ms>
Executing search and computing PCA projection...
</div>
<div id="viz-plot" x-show="!loading" x-transition.opacity.duration.200ms></div>
</div>
</div>
<div class="viz-card">
<h3>Search Results (<span x-text="loading ? '...' : results.length"></span>)</h3>
<div x-show="loading" class="viz-loading" x-transition.opacity.duration.200ms>
Loading results...
</div>
<div x-show="!loading && results.length === 0" class="viz-no-results" x-transition.opacity.duration.200ms>
No results found. Try a different query or adjust your search parameters.
</div>
<template x-if="!loading && results.length > 0">
<div x-transition.opacity.duration.200ms>
<template x-for="result in results" :key="result.id">
<div style="padding: 12px; border-bottom: 1px solid #eee;">
<a :href="getNextcloudUrl(result)" target="_blank" style="font-weight: 500; color: #0066cc; text-decoration: none;">
<span x-text="result.title"></span>
</a>
<div style="font-size: 14px; color: #666; margin-top: 4px;" x-text="result.excerpt"></div>
<div style="font-size: 12px; color: #999; margin-top: 4px;">
Score: <span x-text="result.score.toFixed(3)"></span> |
Type: <span x-text="result.doc_type"></span>
</div>
</div>
</template>
</div>
</template>
</div>
</div>
"""
# Load and render template
template = _jinja_env.get_template("vector_viz.html")
html_content = template.render(username=username)
return HTMLResponse(content=html_content)
@requires("authenticated", redirect="oauth_login")
async def vector_visualization_search(request: Request) -> JSONResponse:
"""Execute server-side search and return 2D coordinates + results.
"""Execute server-side search and return 3D coordinates + results.
All processing happens server-side:
1. Execute search via shared algorithm module
2. Fetch matching vectors from Qdrant
3. Apply PCA reduction (768-dim → 2D)
4. Return coordinates + metadata only
2. Generate query embedding
3. Fetch matching vectors from Qdrant
4. Apply PCA reduction (768-dim → 3D) to query + documents
5. Return coordinates + metadata only
Args:
request: Starlette request with query parameters
Returns:
JSON response with coordinates_2d and results
JSON response with coordinates_3d and results (including query point)
"""
settings = get_settings()
@@ -365,12 +116,10 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
# Parse query parameters
query = request.query_params.get("query", "")
algorithm = request.query_params.get("algorithm", "hybrid")
algorithm = request.query_params.get("algorithm", "bm25_hybrid")
limit = int(request.query_params.get("limit", "50"))
score_threshold = float(request.query_params.get("score_threshold", "0.7"))
semantic_weight = float(request.query_params.get("semantic_weight", "0.5"))
keyword_weight = float(request.query_params.get("keyword_weight", "0.3"))
fuzzy_weight = float(request.query_params.get("fuzzy_weight", "0.2"))
score_threshold = float(request.query_params.get("score_threshold", "0.0"))
fusion = request.query_params.get("fusion", "rrf") # Default to RRF
# Parse doc_types (comma-separated list, None = all types)
doc_types_param = request.query_params.get("doc_types", "")
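The `doc_types` handling is cut off by the hunk boundary here, so the exact parsing code is not visible. As a minimal sketch, assuming the parameter is a plain comma-separated string and that an empty value means "all types", the parsing might look like the following. The helper name `parse_doc_types` is hypothetical and does not appear in the diff.

```python
from __future__ import annotations


def parse_doc_types(param: str) -> list[str] | None:
    """Split a comma-separated doc_types value (hypothetical helper).

    Returns None when no type is given, which the search code treats
    as "search all indexed types".
    """
    # Drop surrounding whitespace and ignore empty segments such as "note,,file"
    types = [part.strip() for part in param.split(",") if part.strip()]
    return types or None
```

Returning `None` rather than an empty list keeps the "all types" case explicit, although the handler also accepts an empty list for that case.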
@@ -378,7 +127,7 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
logger.info(
f"Viz search: user={username}, query='{query}', "
f"algorithm={algorithm}, limit={limit}, doc_types={doc_types}"
f"algorithm={algorithm}, fusion={fusion}, limit={limit}, doc_types={doc_types}"
)
try:
@@ -391,19 +140,16 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
_get_authenticated_client_for_userinfo,
)
async with await _get_authenticated_client_for_userinfo(request) as http_client: # noqa: F841
with trace_operation("vector_viz.get_auth_client"):
auth_client_ctx = await _get_authenticated_client_for_userinfo(request)
async with auth_client_ctx as nc_client: # noqa: F841
# Create search algorithm (no client needed - verification removed)
if algorithm == "semantic":
search_algo = SemanticSearchAlgorithm(score_threshold=score_threshold)
elif algorithm == "keyword":
search_algo = KeywordSearchAlgorithm()
elif algorithm == "fuzzy":
search_algo = FuzzySearchAlgorithm()
elif algorithm == "hybrid":
search_algo = HybridSearchAlgorithm(
semantic_weight=semantic_weight,
keyword_weight=keyword_weight,
fuzzy_weight=fuzzy_weight,
elif algorithm == "bm25_hybrid":
search_algo = BM25HybridSearchAlgorithm(
score_threshold=score_threshold, fusion=fusion
)
else:
return JSONResponse(
@@ -417,24 +163,40 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
all_results = []
if doc_types is None or len(doc_types) == 0:
# Cross-app search - search all indexed types
unverified_results = await search_algo.search(
query=query,
user_id=username,
limit=limit * 2, # Buffer for verification filtering
doc_type=None, # Search all types
score_threshold=score_threshold,
)
all_results.extend(unverified_results)
else:
# Search each document type and combine
for doc_type in doc_types:
with trace_operation(
"vector_viz.search_execute",
attributes={
"search.algorithm": algorithm,
"search.limit": limit * 2,
"search.doc_type": "all",
},
):
unverified_results = await search_algo.search(
query=query,
user_id=username,
limit=limit * 2, # Buffer for verification filtering
doc_type=doc_type,
doc_type=None, # Search all types
score_threshold=score_threshold,
)
all_results.extend(unverified_results)
else:
# Search each document type and combine
for doc_type in doc_types:
with trace_operation(
"vector_viz.search_execute",
attributes={
"search.algorithm": algorithm,
"search.limit": limit * 2,
"search.doc_type": doc_type,
},
):
unverified_results = await search_algo.search(
query=query,
user_id=username,
limit=limit * 2, # Buffer for verification filtering
doc_type=doc_type,
score_threshold=score_threshold,
)
all_results.extend(unverified_results)
# Sort by score before verification
all_results.sort(key=lambda r: r.score, reverse=True)
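The score handling around this point (sort, truncate to `limit`, then min-max rescale while keeping the raw score) can be sketched in isolation. This is an illustration only: the `Hit` dataclass and `rank_and_rescale` are hypothetical stand-ins for the real search result type and the inline code in the handler.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for a search result."""

    id: int
    score: float
    original_score: float | None = None


def rank_and_rescale(hits: list[Hit], limit: int) -> list[Hit]:
    """Keep the top `limit` hits and rescale their scores to [0, 1]."""
    top = sorted(hits, key=lambda h: h.score, reverse=True)[:limit]
    if not top:
        return top
    lo = min(h.score for h in top)
    hi = max(h.score for h in top)
    # Fall back to 1.0 so identical scores do not divide by zero
    span = hi - lo if hi > lo else 1.0
    for h in top:
        h.original_score = h.score  # keep the raw algorithm score
        h.score = (h.score - lo) / span  # rescaled value for visual encoding
    return top
```

One consequence of the fallback: when every score in the result set is identical (including the single-result case), all rescaled scores come out as 0.0, not 1.0.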
@@ -445,78 +207,87 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
search_results = all_results[:limit]
search_duration = time.perf_counter() - search_start
# Normalize scores relative to this result set for better visualization
# Store original scores and normalize for visualization
# (best result = 1.0, worst result = 0.0 within THIS result set)
# This makes visual encoding meaningful regardless of RRF normalization
if search_results:
scores = [r.score for r in search_results]
min_score, max_score = min(scores), max(scores)
score_range = max_score - min_score if max_score > min_score else 1.0
with trace_operation(
"vector_viz.score_normalize",
attributes={"normalize.num_results": len(search_results)},
):
if search_results:
scores = [r.score for r in search_results]
min_score, max_score = min(scores), max(scores)
score_range = max_score - min_score if max_score > min_score else 1.0
logger.info(
f"Normalizing scores for viz: original range [{min_score:.3f}, {max_score:.3f}] "
f"→ [0.0, 1.0]"
)
logger.info(
f"Normalizing scores for viz: original range [{min_score:.3f}, {max_score:.3f}] "
f"→ [0.0, 1.0]"
)
# Rescale each result's score to 0-1 within this result set
for r in search_results:
r.score = (r.score - min_score) / score_range
# Store original score and rescale to 0-1 for visualization
for r in search_results:
# Store original score before normalization
r.original_score = r.score
# Rescale for visual encoding
r.score = (r.score - min_score) / score_range
if not search_results:
return JSONResponse(
{
"success": True,
"results": [],
"coordinates_2d": [],
"coordinates_3d": [],
"query_coords": [],
"message": "No results found",
}
)
# Fetch vectors for matching results from Qdrant
# Fetch vectors for specific matching chunks from Qdrant using batch retrieve
vector_fetch_start = time.perf_counter()
qdrant_client = await get_qdrant_client()
doc_ids = [r.id for r in search_results]
# Retrieve vectors for the matching documents
from qdrant_client.models import FieldCondition, Filter, MatchAny
with trace_operation("vector_viz.get_qdrant_client"):
qdrant_client = await get_qdrant_client()
points_response = await qdrant_client.scroll(
collection_name=settings.get_collection_name(),
scroll_filter=Filter(
must=[
FieldCondition(
key="doc_id",
match=MatchAny(any=[str(doc_id) for doc_id in doc_ids]),
),
FieldCondition(
key="user_id",
match={"value": username},
),
]
),
limit=len(doc_ids) * 2, # Account for multiple chunks per doc
with_vectors=True,
with_payload=["doc_id"], # Need doc_id to map vectors to results
)
chunk_vectors_map = {} # Map (doc_id, chunk_start, chunk_end) -> vector
points = points_response[0]
# Collect point IDs from search results for batch retrieval
# point_id is the Qdrant internal ID returned by search algorithms
point_ids = [r.point_id for r in search_results if r.point_id]
if not points:
return JSONResponse(
{
"success": True,
"results": [],
"coordinates_2d": [],
"message": "No vectors found for results",
}
)
if point_ids:
# Single batch retrieve call instead of N sequential scroll calls
# For 50 results this is 1 HTTP request instead of 50, removing most per-request latency
with trace_operation(
"vector_viz.vector_retrieve",
attributes={"retrieve.num_points": len(point_ids)},
):
points_response = await qdrant_client.retrieve(
collection_name=settings.get_collection_name(),
ids=point_ids,
with_vectors=["dense"],
with_payload=["doc_id", "chunk_start_offset", "chunk_end_offset"],
)
# Build chunk_vectors_map from batch response
for point in points_response:
if point.vector is not None:
# Extract dense vector (handle both named and unnamed vectors)
if isinstance(point.vector, dict):
vector = point.vector.get("dense")
else:
vector = point.vector
if vector is not None and point.payload:
doc_id = point.payload.get("doc_id")
chunk_start = point.payload.get("chunk_start_offset")
chunk_end = point.payload.get("chunk_end_offset")
chunk_key = (doc_id, chunk_start, chunk_end)
chunk_vectors_map[chunk_key] = vector
# Extract vectors
vectors = np.array([p.vector for p in points if p.vector is not None])
vector_fetch_duration = time.perf_counter() - vector_fetch_start
if len(vectors) < 2:
# Not enough points for PCA
if len(chunk_vectors_map) < 2:
# Not enough chunks for PCA
return JSONResponse(
{
"success": True,
@@ -530,35 +301,149 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
}
for r in search_results
],
"coordinates_2d": [[0, 0]] * len(search_results),
"message": "Not enough vectors for PCA",
"coordinates_3d": [[0, 0, 0]] * len(search_results),
"query_coords": [0, 0, 0],
"message": "Not enough chunks for PCA",
}
)
# Apply PCA dimensionality reduction (768-dim → 2D)
# Detect embedding dimension from first available vector
embedding_dim = None
for vector in chunk_vectors_map.values():
if vector is not None:
embedding_dim = len(vector)
break
if embedding_dim is None:
return JSONResponse(
{
"success": False,
"error": "Could not determine embedding dimension",
},
status_code=500,
)
logger.info(f"Detected embedding dimension: {embedding_dim}")
# Build chunk vectors array in search_results order (1:1 mapping)
chunk_vectors = []
for result in search_results:
chunk_key = (result.id, result.chunk_start_offset, result.chunk_end_offset)
if chunk_key in chunk_vectors_map:
chunk_vectors.append(chunk_vectors_map[chunk_key])
else:
# Chunk not found in vectors (shouldn't happen)
logger.warning(
f"Chunk {chunk_key} not found in fetched vectors, using zero vector"
)
# Use zero vector as fallback
chunk_vectors.append(np.zeros(embedding_dim))
chunk_vectors = np.array(chunk_vectors)
# Reuse query embedding from search algorithm (avoids redundant embedding call)
query_embed_start = time.perf_counter()
if search_algo.query_embedding is not None:
query_embedding = search_algo.query_embedding
logger.info(
f"Reusing query embedding from search algorithm "
f"(dimension={len(query_embedding)})"
)
else:
# Fallback: generate embedding if not available from search
from nextcloud_mcp_server.embedding.service import get_embedding_service
embedding_service = get_embedding_service()
query_embedding = await embedding_service.embed(query)
logger.info(f"Generated query embedding (dimension={len(query_embedding)})")
query_embed_duration = time.perf_counter() - query_embed_start
# Combine query vector with chunk vectors for PCA
# Query will be the last point in the array
all_vectors = np.vstack([chunk_vectors, np.array([query_embedding])])
# Normalize vectors to unit length (L2 normalization)
# This is critical because Qdrant uses COSINE distance, which only measures
# vector direction (angle), not magnitude. PCA works on Euclidean geometry, which
# depends on both direction and magnitude. For unit vectors, squared Euclidean
# distance equals 2 * (1 - cosine similarity), so distances between normalized
# vectors are a monotonic function of cosine distance.
norms = np.linalg.norm(all_vectors, axis=1, keepdims=True)
# Check for zero-norm vectors (can happen with empty/corrupted embeddings)
zero_norm_mask = norms[:, 0] < 1e-10
if zero_norm_mask.any():
zero_indices = np.where(zero_norm_mask)[0]
logger.warning(
f"Found {zero_norm_mask.sum()} zero-norm vectors at indices {zero_indices.tolist()}. "
"Replacing with small epsilon to avoid division by zero."
)
# Replace zero norms with small epsilon to avoid NaN
norms[zero_norm_mask] = 1e-10
all_vectors_normalized = all_vectors / norms
logger.info(
f"Normalized vectors: query_norm={norms[-1][0]:.3f}, "
f"doc_norm_range=[{norms[:-1].min():.3f}, {norms[:-1].max():.3f}]"
)
# Apply PCA dimensionality reduction (768-dim → 3D) on normalized vectors
# Run in thread pool to avoid blocking the event loop (CPU-bound)
pca_start = time.perf_counter()
pca = PCA(n_components=2)
coords_2d = pca.fit_transform(vectors)
def _compute_pca(vectors: np.ndarray) -> tuple[np.ndarray, PCA]:
pca = PCA(n_components=3)
coords = pca.fit_transform(vectors)
return coords, pca
import anyio
with trace_operation(
"vector_viz.pca_compute",
attributes={
"pca.num_vectors": len(all_vectors_normalized),
"pca.embedding_dim": embedding_dim,
},
):
coords_3d, pca = await anyio.to_thread.run_sync( # type: ignore[attr-defined]
lambda: _compute_pca(all_vectors_normalized)
)
pca_duration = time.perf_counter() - pca_start
# After fit, these attributes are guaranteed to be set
assert pca.explained_variance_ratio_ is not None
# Check for NaN values in PCA output (numerical instability)
nan_mask = np.isnan(coords_3d)
if nan_mask.any():
nan_rows = np.where(nan_mask.any(axis=1))[0]
logger.error(
f"Found NaN values in PCA output at {len(nan_rows)} points: {nan_rows.tolist()[:10]}. "
"Replacing NaN with 0.0 to prevent JSON serialization error."
)
# Replace NaN with 0 to allow JSON serialization
coords_3d = np.nan_to_num(coords_3d, nan=0.0)
# Split query coords from chunk coords
# Round to 2 decimal places for cleaner display
query_coords_3d = [
round(float(x), 2) for x in coords_3d[-1]
] # Last point is query
chunk_coords_3d = coords_3d[:-1] # All but last are chunks
logger.info(
f"PCA explained variance: PC1={pca.explained_variance_ratio_[0]:.3f}, "
f"PC2={pca.explained_variance_ratio_[1]:.3f}"
f"PC2={pca.explained_variance_ratio_[1]:.3f}, "
f"PC3={pca.explained_variance_ratio_[2]:.3f}"
)
logger.info(
f"Embedding stats: chunks={len(chunk_vectors)}, "
f"query_dim={len(query_embedding)}, chunk_vector_dim={chunk_vectors.shape[1] if chunk_vectors.size > 0 else 0}"
)
# Map results to coordinates (use first chunk per document)
result_coords = []
seen_doc_ids = set()
for point, coord in zip(points, coords_2d):
if point.payload:
doc_id = int(point.payload.get("doc_id", 0))
if doc_id not in seen_doc_ids and doc_id in doc_ids:
seen_doc_ids.add(doc_id)
result_coords.append(coord.tolist())
# Coordinates already match search_results order (1:1 mapping)
result_coords = [
[round(float(x), 2) for x in coord] for coord in chunk_coords_3d
]
# Build response
response_results = [
@@ -567,7 +452,12 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
"doc_type": r.doc_type,
"title": r.title,
"excerpt": r.excerpt,
"score": r.score,
"score": r.score, # Normalized score for visual encoding (0-1)
"original_score": getattr(
r, "original_score", r.score
), # Raw score from algorithm
"chunk_start_offset": r.chunk_start_offset,
"chunk_end_offset": r.chunk_end_offset,
}
for r in search_results
]
@@ -580,26 +470,30 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
f"Viz search timing: total={total_duration * 1000:.1f}ms, "
f"search={search_duration * 1000:.1f}ms ({search_duration / total_duration * 100:.1f}%), "
f"vector_fetch={vector_fetch_duration * 1000:.1f}ms ({vector_fetch_duration / total_duration * 100:.1f}%), "
f"query_embed={query_embed_duration * 1000:.1f}ms ({query_embed_duration / total_duration * 100:.1f}%), "
f"pca={pca_duration * 1000:.1f}ms ({pca_duration / total_duration * 100:.1f}%), "
f"results={len(search_results)}, vectors={len(vectors)}"
f"results={len(search_results)}, chunk_vectors={len(chunk_vectors)}"
)
return JSONResponse(
{
"success": True,
"results": response_results,
"coordinates_2d": result_coords[: len(search_results)],
"coordinates_3d": result_coords[: len(search_results)],
"query_coords": query_coords_3d,
"pca_variance": {
"pc1": float(pca.explained_variance_ratio_[0]),
"pc2": float(pca.explained_variance_ratio_[1]),
"pc3": float(pca.explained_variance_ratio_[2]),
},
"timing": {
"total_ms": round(total_duration * 1000, 2),
"search_ms": round(search_duration * 1000, 2),
"vector_fetch_ms": round(vector_fetch_duration * 1000, 2),
"query_embed_ms": round(query_embed_duration * 1000, 2),
"pca_ms": round(pca_duration * 1000, 2),
"num_results": len(search_results),
"num_vectors": len(vectors),
"num_chunk_vectors": len(chunk_vectors),
},
}
)
@@ -610,3 +504,166 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
{"success": False, "error": str(e)},
status_code=500,
)
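The normalization step in this function relies on a relationship between Euclidean and cosine distance for unit-length vectors. A small pure-Python check of that relationship is sketched below. The helper names are illustrative and not part of the codebase; the epsilon guard mirrors the `1e-10` replacement used for zero-norm vectors.

```python
import math


def l2_normalize(v, eps=1e-10):
    """Scale a vector to unit length, guarding against zero norm."""
    norm = math.sqrt(sum(x * x for x in v))
    norm = max(norm, eps)  # avoids division by zero for all-zero vectors
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def squared_euclidean(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = [3.0, 4.0, 0.0]
b = [1.0, 0.0, 2.0]
ua, ub = l2_normalize(a), l2_normalize(b)

# For unit vectors: ||ua - ub||^2 == 2 * (1 - cos(a, b))
lhs = squared_euclidean(ua, ub)
rhs = 2 * (1 - cosine_similarity(a, b))
```

The identity holds in the full embedding space. After projecting to three principal components the distances are only approximations, to the extent that the retained components capture the variance.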
@requires("authenticated", redirect="oauth_login")
async def chunk_context_endpoint(request: Request) -> JSONResponse:
"""Fetch chunk text with surrounding context for visualization.
This endpoint retrieves the matched chunk along with surrounding text
to provide context for the search result. Used by the viz pane to
display chunks inline.
Query parameters:
doc_type: Document type (e.g., "note")
doc_id: Document ID
start: Chunk start offset (character position)
end: Chunk end offset (character position)
context: Characters of context before/after (default: 500)
Returns:
JSON with chunk_text, before_context, after_context, and flags
"""
try:
# Get query parameters
doc_type = request.query_params.get("doc_type")
doc_id = request.query_params.get("doc_id")
start_str = request.query_params.get("start")
end_str = request.query_params.get("end")
context_chars = int(request.query_params.get("context", "500"))
# Validate required parameters
if not all([doc_type, doc_id, start_str, end_str]):
return JSONResponse(
{
"success": False,
"error": "Missing required parameters: doc_type, doc_id, start, end",
},
status_code=400,
)
# Type assertions - we validated these above
assert doc_type is not None
assert doc_id is not None
assert start_str is not None
assert end_str is not None
start = int(start_str)
end = int(end_str)
# Convert doc_id to int (all document types use int IDs)
doc_id_int = int(doc_id)
# Get authenticated Nextcloud client
from nextcloud_mcp_server.auth.userinfo_routes import (
_get_authenticated_client_for_userinfo,
)
from nextcloud_mcp_server.search.context import get_chunk_with_context
# Use context expansion module to fetch chunk with surrounding context
async with await _get_authenticated_client_for_userinfo(request) as nc_client:
chunk_context = await get_chunk_with_context(
nc_client=nc_client,
user_id=request.user.display_name, # User ID from auth
doc_id=doc_id_int,
doc_type=doc_type,
chunk_start=start,
chunk_end=end,
context_chars=context_chars,
)
# Check if context expansion succeeded
if chunk_context is None:
return JSONResponse(
{
"success": False,
"error": f"Failed to fetch chunk context for {doc_type} {doc_id}",
},
status_code=404,
)
logger.info(
f"Fetched chunk context for {doc_type}_{doc_id}: "
f"chunk_len={len(chunk_context.chunk_text)}, "
f"before_len={len(chunk_context.before_context)}, "
f"after_len={len(chunk_context.after_context)}"
)
# For PDF files, also fetch the highlighted page image from Qdrant
highlighted_page_image = None
page_number = None
if doc_type == "file":
try:
from qdrant_client.models import FieldCondition, Filter, MatchValue
settings = get_settings()
qdrant_client = await get_qdrant_client()
username = request.user.display_name
# Query for this specific chunk's highlighted image
points_response = await qdrant_client.scroll(
collection_name=settings.get_collection_name(),
scroll_filter=Filter(
must=[
get_placeholder_filter(),
FieldCondition(
key="doc_id", match=MatchValue(value=doc_id_int)
),
FieldCondition(
key="user_id", match=MatchValue(value=username)
),
FieldCondition(
key="chunk_start_offset", match=MatchValue(value=start)
),
FieldCondition(
key="chunk_end_offset", match=MatchValue(value=end)
),
]
),
limit=1,
with_vectors=False,
with_payload=["highlighted_page_image", "page_number"],
)
points = points_response[0]
if points and points[0].payload:
highlighted_page_image = points[0].payload.get(
"highlighted_page_image"
)
page_number = points[0].payload.get("page_number")
if highlighted_page_image:
logger.info(
f"Found highlighted image for chunk: "
f"page={page_number}, image_size={len(highlighted_page_image)}"
)
except Exception as e:
logger.warning(f"Failed to fetch highlighted image: {e}")
# Return response compatible with frontend expectations
response_data: dict = {
"success": True,
"chunk_text": chunk_context.chunk_text,
"before_context": chunk_context.before_context,
"after_context": chunk_context.after_context,
"has_more_before": chunk_context.has_before_truncation,
"has_more_after": chunk_context.has_after_truncation,
}
# Add image data if available
if highlighted_page_image:
response_data["highlighted_page_image"] = highlighted_page_image
response_data["page_number"] = page_number
return JSONResponse(response_data)
except ValueError as e:
logger.error(f"Invalid parameter format: {e}")
return JSONResponse(
{"success": False, "error": f"Invalid parameter format: {e}"},
status_code=400,
)
except Exception as e:
logger.error(f"Chunk context error: {e}", exc_info=True)
return JSONResponse(
{"success": False, "error": str(e)},
status_code=500,
)
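The endpoint above delegates to `get_chunk_with_context`, whose implementation is not part of this diff. As a rough sketch of what offset-based context expansion could look like, assuming the document text is already available as a string, consider the following. `ChunkContext` here is a hypothetical mirror of the fields the response uses, and the meaning of the truncation flags (more text exists beyond the returned context) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ChunkContext:
    """Hypothetical mirror of the fields read by the endpoint."""

    chunk_text: str
    before_context: str
    after_context: str
    has_before_truncation: bool
    has_after_truncation: bool


def slice_with_context(text: str, start: int, end: int, context_chars: int = 500) -> ChunkContext:
    """Cut a chunk out of `text` with up to `context_chars` on each side."""
    before_start = max(0, start - context_chars)
    after_end = min(len(text), end + context_chars)
    return ChunkContext(
        chunk_text=text[start:end],
        before_context=text[before_start:start],
        after_context=text[end:after_end],
        # Assumed meaning: the document continues beyond the returned context
        has_before_truncation=before_start > 0,
        has_after_truncation=after_end < len(text),
    )
```

Clamping both window edges to the document bounds keeps the slices valid for chunks near the start or end of a document.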
@@ -139,6 +139,7 @@ async def _get_authenticated_client(request: Request) -> httpx.AsyncClient:
raise RuntimeError("BasicAuth credentials not configured")
assert nextcloud_host is not None # Type narrowing for type checker
assert username is not None and password is not None # Type narrowing
return httpx.AsyncClient(
base_url=nextcloud_host,
auth=(username, password),
+2 -2
@@ -29,9 +29,9 @@ from .app import get_app
@click.option(
"--transport",
"-t",
default="sse",
default="streamable-http",
show_default=True,
type=click.Choice(["sse", "streamable-http", "http"]),
type=click.Choice(["streamable-http", "http"]),
help="MCP transport protocol",
)
@click.option(
+67
@@ -18,6 +18,7 @@ from .contacts import ContactsClient
from .cookbook import CookbookClient
from .deck import DeckClient
from .groups import GroupsClient
from .news import NewsClient
from .notes import NotesClient
from .sharing import SharingClient
from .tables import TablesClient
@@ -81,6 +82,7 @@ class NextcloudClient:
self.contacts = ContactsClient(self._client, username)
self.cookbook = CookbookClient(self._client, username)
self.deck = DeckClient(self._client, username)
self.news = NewsClient(self._client, username)
self.users = UsersClient(self._client, username)
self.groups = GroupsClient(self._client, username)
self.sharing = SharingClient(self._client, username)
@@ -130,10 +132,75 @@ class NextcloudClient:
all_notes = self.notes.get_all_notes()
return await self._notes_search.search_notes(all_notes, query)
async def find_files_by_tag(
self, tag_name: str, mime_type_filter: str | None = None
) -> list[dict]:
"""Find files by system tag name, optionally filtered by MIME type.
This method coordinates tag lookup and file retrieval via WebDAV:
1. Look up the tag ID by name
2. Get all files with that tag (via REPORT with full metadata)
3. Optionally filter by MIME type
Args:
tag_name: Name of the system tag to search for (e.g., "vector-index")
mime_type_filter: Optional MIME type filter (e.g., "application/pdf")
Returns:
List of file dictionaries with WebDAV properties (path, size, content_type, etc.)
Raises:
RuntimeError: If tag lookup or file query fails
Examples:
# Find all files with "vector-index" tag
files = await nc_client.find_files_by_tag("vector-index")
# Find only PDFs with the tag
pdfs = await nc_client.find_files_by_tag("vector-index", "application/pdf")
"""
# Look up tag by name using WebDAV
tag = await self.webdav.get_tag_by_name(tag_name)
if not tag:
logger.debug(f"Tag '{tag_name}' not found, returning empty list")
return []
# Get files with this tag (returns full file info from REPORT)
files = await self.webdav.get_files_by_tag(tag["id"])
if not files:
logger.debug(f"No files found with tag '{tag_name}'")
return []
logger.debug(f"Found {len(files)} files with tag '{tag_name}'")
# Apply MIME type filter if specified
if mime_type_filter:
filtered_files = [
f
for f in files
if (f.get("content_type") or "").startswith(mime_type_filter)
]
logger.info(
f"Returning {len(filtered_files)} files with tag '{tag_name}' (filtered by {mime_type_filter})"
)
return filtered_files
logger.info(f"Returning {len(files)} files with tag '{tag_name}'")
return files
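The MIME filter in `find_files_by_tag` is a prefix match, so a filter such as "image/" selects a whole family of types. A minimal standalone sketch of that step, assuming file dictionaries shaped like those returned by `get_files_by_tag`; the helper name `filter_by_mime_prefix` and the sample data are hypothetical and not part of the client:

```python
from typing import Any, Optional


def filter_by_mime_prefix(
    files: list[dict[str, Any]], mime_type_filter: Optional[str]
) -> list[dict[str, Any]]:
    """Keep only files whose content_type starts with the given prefix."""
    if not mime_type_filter:
        return files
    # `or ""` guards against a content_type key that is present but None.
    return [
        f
        for f in files
        if (f.get("content_type") or "").startswith(mime_type_filter)
    ]


# Hypothetical sample data for illustration only.
sample = [
    {"path": "/a.pdf", "content_type": "application/pdf"},
    {"path": "/b.png", "content_type": "image/png"},
    {"path": "/dir", "content_type": None},
]
pdfs = filter_by_mime_prefix(sample, "application/pdf")
images = filter_by_mime_prefix(sample, "image/")
```

Because the match is by prefix, passing a full type such as "application/pdf" also matches any longer type string that begins with it.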
def _get_webdav_base_path(self) -> str:
"""Helper to get the base WebDAV path for the authenticated user."""
return f"/remote.php/dav/files/{self.username}"
async def __aenter__(self):
"""Async context manager entry."""
return self
async def __aexit__(self, exc_type, exc_val, exc_tb):
"""Async context manager exit - closes all clients."""
await self.close()
return False # Don't suppress exceptions
async def close(self):
"""Close the HTTP client and CalDAV client."""
await self._client.aclose()
+385
@@ -0,0 +1,385 @@
"""Client for Nextcloud News app operations."""
import logging
from enum import IntEnum
from typing import Any
from .base import BaseNextcloudClient
logger = logging.getLogger(__name__)
class NewsItemType(IntEnum):
"""Type constants for News API item queries."""
FEED = 0 # Single feed
FOLDER = 1 # Folder and its feeds
STARRED = 2 # All starred items
ALL = 3 # All items
class NewsClient(BaseNextcloudClient):
"""Client for Nextcloud News app operations."""
app_name = "news"
API_BASE = "/apps/news/api/v1-3"
# --- Folders ---
async def get_folders(self) -> list[dict[str, Any]]:
"""Get all folders."""
response = await self._make_request("GET", f"{self.API_BASE}/folders")
return response.json().get("folders", [])
async def create_folder(self, name: str) -> dict[str, Any]:
"""Create a new folder.
Args:
name: Folder name
Returns:
Created folder data
Raises:
HTTPStatusError: 409 if folder name already exists,
422 if name is empty
"""
response = await self._make_request(
"POST", f"{self.API_BASE}/folders", json={"name": name}
)
folders = response.json().get("folders", [])
return folders[0] if folders else {}
async def rename_folder(self, folder_id: int, name: str) -> None:
"""Rename a folder.
Args:
folder_id: Folder ID
name: New folder name
Raises:
HTTPStatusError: 404 if folder not found, 409 if name exists
"""
await self._make_request(
"PUT", f"{self.API_BASE}/folders/{folder_id}", json={"name": name}
)
async def delete_folder(self, folder_id: int) -> None:
"""Delete a folder and all its feeds/items.
Args:
folder_id: Folder ID
Raises:
HTTPStatusError: 404 if folder not found
"""
await self._make_request("DELETE", f"{self.API_BASE}/folders/{folder_id}")
async def mark_folder_read(self, folder_id: int, newest_item_id: int) -> None:
"""Mark all items in a folder as read.
Args:
folder_id: Folder ID
newest_item_id: ID of newest item to mark read (prevents marking
items user hasn't seen yet)
Raises:
HTTPStatusError: 404 if folder not found
"""
await self._make_request(
"POST",
f"{self.API_BASE}/folders/{folder_id}/read",
json={"newestItemId": newest_item_id},
)
# --- Feeds ---
async def get_feeds(self) -> dict[str, Any]:
"""Get all feeds with metadata.
Returns:
Dict with keys:
- feeds: List of feed objects
- starredCount: Number of starred items
- newestItemId: ID of newest item (omitted if no items)
"""
response = await self._make_request("GET", f"{self.API_BASE}/feeds")
return response.json()
async def create_feed(
self, url: str, folder_id: int | None = None
) -> dict[str, Any]:
"""Subscribe to a new feed.
Args:
url: Feed URL
folder_id: Optional folder ID (None for root)
Returns:
Created feed data
Raises:
HTTPStatusError: 409 if feed already exists, 422 if URL is invalid
"""
body: dict[str, Any] = {"url": url}
if folder_id is not None:
body["folderId"] = folder_id
response = await self._make_request("POST", f"{self.API_BASE}/feeds", json=body)
data = response.json()
feeds = data.get("feeds", [])
return feeds[0] if feeds else {}
async def delete_feed(self, feed_id: int) -> None:
"""Unsubscribe from a feed (deletes all items).
Args:
feed_id: Feed ID
Raises:
HTTPStatusError: 404 if feed not found
"""
await self._make_request("DELETE", f"{self.API_BASE}/feeds/{feed_id}")
async def move_feed(self, feed_id: int, folder_id: int | None) -> None:
"""Move a feed to a different folder.
Args:
feed_id: Feed ID
folder_id: Target folder ID (None for root)
Raises:
HTTPStatusError: 404 if feed not found
"""
await self._make_request(
"POST",
f"{self.API_BASE}/feeds/{feed_id}/move",
json={"folderId": folder_id},
)
async def rename_feed(self, feed_id: int, title: str) -> None:
"""Rename a feed.
Args:
feed_id: Feed ID
title: New feed title
Raises:
HTTPStatusError: 404 if feed not found
"""
await self._make_request(
"POST",
f"{self.API_BASE}/feeds/{feed_id}/rename",
json={"feedTitle": title},
)
async def mark_feed_read(self, feed_id: int, newest_item_id: int) -> None:
"""Mark all items in a feed as read.
Args:
feed_id: Feed ID
newest_item_id: ID of newest item to mark read
Raises:
HTTPStatusError: 404 if feed not found
"""
await self._make_request(
"POST",
f"{self.API_BASE}/feeds/{feed_id}/read",
json={"newestItemId": newest_item_id},
)
# --- Items ---
async def get_items(
self,
batch_size: int = 50,
offset: int = 0,
type_: int = NewsItemType.ALL,
id_: int = 0,
get_read: bool = True,
oldest_first: bool = False,
) -> list[dict[str, Any]]:
"""Get items (articles) with filtering.
Args:
batch_size: Number of items to return (-1 for all)
offset: Item ID to start after (for pagination)
type_: Item type filter (NewsItemType)
id_: Feed/folder ID (ignored for STARRED/ALL types)
get_read: Include read items
oldest_first: Sort oldest first instead of newest
Returns:
List of item objects
"""
params: dict[str, Any] = {
"batchSize": batch_size,
"offset": offset,
"type": type_,
"id": id_,
"getRead": str(get_read).lower(),
"oldestFirst": str(oldest_first).lower(),
}
response = await self._make_request(
"GET", f"{self.API_BASE}/items", params=params
)
return response.json().get("items", [])
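The `offset` parameter of `get_items` enables ID-based pagination. The sketch below shows one way a caller might page through all items. It runs against an in-memory stand-in rather than a real server: `fake_get_items`, `fetch_all`, and the sample data are hypothetical, and the offset semantics (0 starts at the newest item, any other value returns only lower IDs) are an assumption taken from the docstring above.

```python
import asyncio
from typing import Any

# Fake in-memory "server" holding item IDs 12..1, newest first.
_ITEMS = [{"id": i, "title": f"Item {i}"} for i in range(12, 0, -1)]


async def fake_get_items(batch_size: int = 50, offset: int = 0) -> list[dict[str, Any]]:
    """Hypothetical stand-in for NewsClient.get_items (newest first)."""
    # Assumption: offset=0 means "start at the newest item"; otherwise only
    # items with an ID lower than the offset are returned.
    older = [item for item in _ITEMS if offset == 0 or item["id"] < offset]
    return older[:batch_size]


async def fetch_all(batch_size: int = 5) -> list[dict[str, Any]]:
    """Page through all items by feeding the lowest seen ID back as offset."""
    collected: list[dict[str, Any]] = []
    offset = 0
    while True:
        batch = await fake_get_items(batch_size=batch_size, offset=offset)
        collected.extend(batch)
        if len(batch) < batch_size:
            break  # a short or empty batch means nothing older is left
        offset = min(item["id"] for item in batch)
    return collected


all_items = asyncio.run(fetch_all())
```

Stopping on a short batch costs one extra empty request when the total is an exact multiple of the batch size, which keeps the loop simple.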
async def get_item(self, item_id: int) -> dict[str, Any]:
"""Get a specific item by ID.
Args:
item_id: Item ID
Returns:
Item data
Raises:
HTTPStatusError: 404 if item not found
"""
response = await self._make_request("GET", f"{self.API_BASE}/items/{item_id}")
return response.json()
async def get_updated_items(
self,
last_modified: int,
type_: int = NewsItemType.ALL,
id_: int = 0,
) -> list[dict[str, Any]]:
"""Get items modified since a timestamp (for delta sync).
Args:
last_modified: Unix timestamp (seconds or microseconds)
type_: Item type filter
id_: Feed/folder ID
Returns:
List of modified items (includes deleted items)
"""
params: dict[str, Any] = {
"lastModified": last_modified,
"type": type_,
"id": id_,
}
response = await self._make_request(
"GET", f"{self.API_BASE}/items/updated", params=params
)
return response.json().get("items", [])
async def mark_item_read(self, item_id: int) -> None:
"""Mark a single item as read.
Args:
item_id: Item ID
Raises:
HTTPStatusError: 404 if item not found
"""
await self._make_request("POST", f"{self.API_BASE}/items/{item_id}/read")
async def mark_item_unread(self, item_id: int) -> None:
"""Mark a single item as unread.
Args:
item_id: Item ID
Raises:
HTTPStatusError: 404 if item not found
"""
await self._make_request("POST", f"{self.API_BASE}/items/{item_id}/unread")
async def star_item(self, item_id: int) -> None:
"""Star (favorite) a single item.
Args:
item_id: Item ID
Raises:
HTTPStatusError: 404 if item not found
"""
await self._make_request("POST", f"{self.API_BASE}/items/{item_id}/star")
async def unstar_item(self, item_id: int) -> None:
"""Unstar a single item.
Args:
item_id: Item ID
Raises:
HTTPStatusError: 404 if item not found
"""
await self._make_request("POST", f"{self.API_BASE}/items/{item_id}/unstar")
async def mark_items_read(self, item_ids: list[int]) -> None:
"""Mark multiple items as read.
Args:
item_ids: List of item IDs
"""
await self._make_request(
"POST", f"{self.API_BASE}/items/read/multiple", json={"itemIds": item_ids}
)
async def mark_items_unread(self, item_ids: list[int]) -> None:
"""Mark multiple items as unread.
Args:
item_ids: List of item IDs
"""
await self._make_request(
"POST",
f"{self.API_BASE}/items/unread/multiple",
json={"itemIds": item_ids},
)
async def star_items(self, item_ids: list[int]) -> None:
"""Star multiple items.
Args:
item_ids: List of item IDs
"""
await self._make_request(
"POST", f"{self.API_BASE}/items/star/multiple", json={"itemIds": item_ids}
)
async def unstar_items(self, item_ids: list[int]) -> None:
"""Unstar multiple items.
Args:
item_ids: List of item IDs
"""
await self._make_request(
"POST",
f"{self.API_BASE}/items/unstar/multiple",
json={"itemIds": item_ids},
)
async def mark_all_read(self, newest_item_id: int) -> None:
"""Mark all items as read.
Args:
newest_item_id: ID of newest item to mark read
"""
await self._make_request(
"POST", f"{self.API_BASE}/items/read", json={"newestItemId": newest_item_id}
)
# --- Status ---
async def get_status(self) -> dict[str, Any]:
"""Get News app status and configuration.
Returns:
Dict with version and warnings
"""
response = await self._make_request("GET", f"{self.API_BASE}/status")
return response.json()
async def get_version(self) -> str:
"""Get News app version.
Returns:
Version string (e.g., "25.0.0")
"""
response = await self._make_request("GET", f"{self.API_BASE}/version")
return response.json().get("version", "")
+585
@@ -821,6 +821,20 @@ class WebDAVClient(BaseNextcloudClient):
item["file_id"] = int(value) if value else None
elif tag == "favorite":
item["is_favorite"] = value == "1"
elif tag == "tags":
# Tags can be comma-separated or have multiple child elements
if value:
# Handle comma-separated tags
item["tags"] = [
t.strip() for t in value.split(",") if t.strip()
]
else:
# Check for child tag elements (alternative format)
tag_elements = child.findall(".//{http://owncloud.org/ns}tag")
if tag_elements:
item["tags"] = [t.text for t in tag_elements if t.text]
else:
item["tags"] = []
elif tag == "permissions":
item["permissions"] = value
elif tag == "size":
@@ -948,3 +962,574 @@ class WebDAVClient(BaseNextcloudClient):
properties=properties,
limit=limit,
)
async def find_by_tag(
self, tag_name: str, scope: str = "", limit: Optional[int] = None
) -> List[Dict[str, Any]]:
"""Find files by tag name.
DEPRECATED: Use NextcloudClient.find_files_by_tag() instead, which uses
the WebDAV system tags endpoints (PROPFIND plus REPORT) rather than WebDAV SEARCH.
Args:
tag_name: Tag to filter by (e.g., "vector-index")
scope: Directory path to search in (empty string for user root)
limit: Maximum number of results to return
Returns:
List of files/directories with the specified tag
Examples:
# Find all files tagged with "vector-index"
results = await find_by_tag("vector-index")
# Find tagged files in a specific folder
results = await find_by_tag("vector-index", scope="Documents")
"""
# Use LIKE for tag matching since tags can be comma-separated
where_conditions = f"""
<d:like>
<d:prop>
<oc:tags/>
</d:prop>
<d:literal>%{tag_name}%</d:literal>
</d:like>
"""
# Request tag property along with standard properties
properties = [
"displayname",
"getcontentlength",
"getcontenttype",
"getlastmodified",
"resourcetype",
"getetag",
"fileid",
"tags",
]
return await self.search_files(
scope=scope,
where_conditions=where_conditions,
properties=properties,
limit=limit,
)
async def _get_file_info_by_id(self, file_id: int) -> Dict[str, Any]:
"""Get file information by Nextcloud file ID using WebDAV.
Args:
file_id: Nextcloud internal file ID
Returns:
File information dictionary with path, size, content_type, etc.
Raises:
HTTPStatusError: If file not found or request fails
"""
# Nextcloud allows accessing files by ID via special meta endpoint
meta_path = f"/remote.php/dav/meta/{file_id}/"
propfind_body = """<?xml version="1.0"?>
<d:propfind xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns">
<d:prop>
<d:displayname/>
<d:getcontentlength/>
<d:getcontenttype/>
<d:getlastmodified/>
<d:resourcetype/>
<d:getetag/>
<oc:fileid/>
</d:prop>
</d:propfind>"""
headers = {"Depth": "0", "Content-Type": "text/xml", "OCS-APIRequest": "true"}
response = await self._make_request(
"PROPFIND", meta_path, content=propfind_body, headers=headers
)
response.raise_for_status()
# Parse the XML response
root = ET.fromstring(response.content)
responses = root.findall(".//{DAV:}response")
if not responses:
raise RuntimeError(f"File ID {file_id} not found")
response_elem = responses[0]
href = response_elem.find(".//{DAV:}href")
if href is None:
raise RuntimeError(f"No href in response for file ID {file_id}")
propstat = response_elem.find(".//{DAV:}propstat")
if propstat is None:
raise RuntimeError(f"No propstat for file ID {file_id}")
prop = propstat.find(".//{DAV:}prop")
if prop is None:
raise RuntimeError(f"No prop for file ID {file_id}")
# Extract file path from displayname or construct from file ID
displayname_elem = prop.find(".//{DAV:}displayname")
name = (
displayname_elem.text if displayname_elem is not None else f"file_{file_id}"
)
# Get file properties
size_elem = prop.find(".//{DAV:}getcontentlength")
size = int(size_elem.text) if size_elem is not None and size_elem.text else 0
content_type_elem = prop.find(".//{DAV:}getcontenttype")
content_type = content_type_elem.text if content_type_elem is not None else None
modified_elem = prop.find(".//{DAV:}getlastmodified")
modified = modified_elem.text if modified_elem is not None else None
etag_elem = prop.find(".//{DAV:}getetag")
etag = (
etag_elem.text.strip('"')
if etag_elem is not None and etag_elem.text
else None
)
# Check if it's a directory
resourcetype = prop.find(".//{DAV:}resourcetype")
is_directory = (
resourcetype is not None
and resourcetype.find(".//{DAV:}collection") is not None
)
# The meta endpoint does not expose the file's real path, so build a
# placeholder path from the display name. Callers that need the real path
# must resolve it separately via WebDAV.
file_info = {
"name": name,
"path": f"/{name}", # Placeholder - caller should use WebDAV to get real path if needed
"size": size,
"content_type": content_type,
"last_modified": modified,
"etag": etag,
"is_directory": is_directory,
"file_id": file_id,
}
logger.debug(f"Retrieved file info for ID {file_id}: {name}")
return file_info
async def get_tag_by_name(self, tag_name: str) -> dict[str, Any] | None:
"""Get a system tag by its name via WebDAV.
Args:
tag_name: Name of the tag to find (case-sensitive)
Returns:
Tag dictionary if found, None otherwise
"""
# Use WebDAV PROPFIND to list all systemtags
propfind_body = """<?xml version="1.0"?>
<d:propfind xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns">
<d:prop>
<oc:id/>
<oc:display-name/>
<oc:user-visible/>
<oc:user-assignable/>
</d:prop>
</d:propfind>"""
response = await self._client.request(
"PROPFIND",
"/remote.php/dav/systemtags/",
headers={"Depth": "1"},
content=propfind_body,
)
response.raise_for_status()
# Parse XML response
root = ET.fromstring(response.content)
ns = {
"d": "DAV:",
"oc": "http://owncloud.org/ns",
}
for response_elem in root.findall("d:response", ns):
href = response_elem.find("d:href", ns)
if href is None or href.text == "/remote.php/dav/systemtags/":
# Skip the collection itself
continue
propstat = response_elem.find("d:propstat", ns)
if propstat is None:
continue
prop = propstat.find("d:prop", ns)
if prop is None:
continue
# Extract tag properties
tag_id_elem = prop.find("oc:id", ns)
display_name_elem = prop.find("oc:display-name", ns)
user_visible_elem = prop.find("oc:user-visible", ns)
user_assignable_elem = prop.find("oc:user-assignable", ns)
if display_name_elem is not None and display_name_elem.text == tag_name:
tag_info = {
"id": int(tag_id_elem.text)
if tag_id_elem is not None and tag_id_elem.text is not None
else None,
"name": display_name_elem.text,
"userVisible": user_visible_elem.text.lower() == "true"
if user_visible_elem is not None and user_visible_elem.text
else True,
"userAssignable": user_assignable_elem.text.lower() == "true"
if user_assignable_elem is not None and user_assignable_elem.text
else True,
}
logger.debug(f"Found tag '{tag_name}' with ID {tag_info['id']}")
return tag_info
logger.debug(f"Tag '{tag_name}' not found")
return None
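The parsing loop in `get_tag_by_name` can be tried in isolation. The following sketch runs the same kind of namespace-aware lookup over a hand-written multistatus document. The sample XML and the helper name `find_tag` are assumptions for illustration; a real server response carries more detail.

```python
import xml.etree.ElementTree as ET
from typing import Any, Optional

NS = {"d": "DAV:", "oc": "http://owncloud.org/ns"}

# Hand-written sample of a Depth: 1 PROPFIND response (illustrative only).
SAMPLE = """<?xml version="1.0"?>
<d:multistatus xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns">
  <d:response>
    <d:href>/remote.php/dav/systemtags/</d:href>
    <d:propstat><d:prop/></d:propstat>
  </d:response>
  <d:response>
    <d:href>/remote.php/dav/systemtags/42</d:href>
    <d:propstat>
      <d:prop>
        <oc:id>42</oc:id>
        <oc:display-name>vector-index</oc:display-name>
        <oc:user-visible>true</oc:user-visible>
        <oc:user-assignable>false</oc:user-assignable>
      </d:prop>
    </d:propstat>
  </d:response>
</d:multistatus>"""


def find_tag(xml_text: str, tag_name: str) -> Optional[dict[str, Any]]:
    """Return the first tag whose display name matches exactly, else None."""
    root = ET.fromstring(xml_text)
    for resp in root.findall("d:response", NS):
        prop = resp.find("d:propstat/d:prop", NS)
        if prop is None:
            continue
        # The collection entry has no display name, so it never matches.
        if prop.findtext("oc:display-name", default="", namespaces=NS) != tag_name:
            continue
        tag_id = prop.findtext("oc:id", default="", namespaces=NS)

        def flag(name: str) -> bool:
            # A missing element defaults to "true", mirroring the client code.
            text = prop.findtext(name, default="true", namespaces=NS)
            return text.strip().lower() == "true"

        return {
            "id": int(tag_id) if tag_id else None,
            "name": tag_name,
            "userVisible": flag("oc:user-visible"),
            "userAssignable": flag("oc:user-assignable"),
        }
    return None


tag = find_tag(SAMPLE, "vector-index")
```

The comparison is an exact, case-sensitive string match, which is why a lookup for "Vector-Index" would return `None` against this sample.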
async def get_files_by_tag(self, tag_id: int) -> list[dict[str, Any]]:
"""Get all files tagged with a specific system tag via WebDAV REPORT.
Args:
tag_id: Numeric ID of the tag
Returns:
List of file info dictionaries with path, size, content_type, etc.
"""
# Use WebDAV REPORT method with systemtag filter, requesting all properties
report_body = f"""<?xml version="1.0"?>
<oc:filter-files xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns" xmlns:nc="http://nextcloud.org/ns">
<d:prop>
<oc:fileid/>
<d:displayname/>
<d:getcontentlength/>
<d:getcontenttype/>
<d:getlastmodified/>
<d:getetag/>
</d:prop>
<oc:filter-rules>
<oc:systemtag>{tag_id}</oc:systemtag>
</oc:filter-rules>
</oc:filter-files>"""
response = await self._client.request(
"REPORT",
f"{self._get_webdav_base_path()}/",
content=report_body,
)
response.raise_for_status()
# Parse XML response
root = ET.fromstring(response.content)
ns = {
"d": "DAV:",
"oc": "http://owncloud.org/ns",
}
files = []
for response_elem in root.findall("d:response", ns):
# Extract href (file path)
href_elem = response_elem.find("d:href", ns)
if href_elem is None or not href_elem.text:
continue
propstat = response_elem.find("d:propstat", ns)
if propstat is None:
continue
prop = propstat.find("d:prop", ns)
if prop is None:
continue
# Extract all properties
fileid_elem = prop.find("oc:fileid", ns)
displayname_elem = prop.find("d:displayname", ns)
contentlength_elem = prop.find("d:getcontentlength", ns)
contenttype_elem = prop.find("d:getcontenttype", ns)
lastmodified_elem = prop.find("d:getlastmodified", ns)
etag_elem = prop.find("d:getetag", ns)
if fileid_elem is None or not fileid_elem.text:
continue
# Decode href path and extract the file path
from urllib.parse import unquote
href_path = unquote(href_elem.text)
# Remove WebDAV prefix to get user-relative path
webdav_prefix = f"/remote.php/dav/files/{self.username}/"
file_path = href_path.replace(webdav_prefix, "/", 1)
# Parse last modified timestamp
last_modified_timestamp = None
if lastmodified_elem is not None and lastmodified_elem.text:
from email.utils import parsedate_to_datetime
try:
dt = parsedate_to_datetime(lastmodified_elem.text)
last_modified_timestamp = int(dt.timestamp())
except Exception:
pass
file_info = {
"id": int(fileid_elem.text),
"path": file_path,
"name": displayname_elem.text
if displayname_elem is not None
else file_path.split("/")[-1],
"size": int(contentlength_elem.text)
if contentlength_elem is not None and contentlength_elem.text
else 0,
"content_type": contenttype_elem.text
if contenttype_elem is not None
else "",
"last_modified": lastmodified_elem.text
if lastmodified_elem is not None
else None,
"last_modified_timestamp": last_modified_timestamp,
"etag": etag_elem.text if etag_elem is not None else None,
}
files.append(file_info)
logger.debug(f"Found {len(files)} files with tag ID {tag_id}")
return files
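Two small conversions in `get_files_by_tag` are easy to get wrong: turning a URL-encoded `href` into a user-relative path, and turning the `getlastmodified` HTTP date into a Unix timestamp. A minimal sketch of both, using only the standard library; the helper names and sample values are hypothetical:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional
from urllib.parse import unquote


def href_to_user_path(href: str, username: str) -> str:
    """Decode a WebDAV href and strip the per-user files prefix."""
    prefix = f"/remote.php/dav/files/{username}/"
    decoded = unquote(href)
    # Strip the prefix only when it is at the start of the path.
    if decoded.startswith(prefix):
        return "/" + decoded[len(prefix):]
    return decoded


def http_date_to_timestamp(value: str) -> Optional[int]:
    """Parse an RFC 1123 style date into a Unix timestamp, or None."""
    try:
        return int(parsedate_to_datetime(value).timestamp())
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None


path = href_to_user_path("/remote.php/dav/files/alice/Docs/My%20File.pdf", "alice")
ts = http_date_to_timestamp("Mon, 08 Dec 2025 06:23:14 GMT")
# The same instant built directly, for comparison.
reference = int(datetime(2025, 12, 8, 6, 23, 14, tzinfo=timezone.utc).timestamp())
```

Checking for the prefix at the start of the string avoids rewriting a path that happens to contain the same prefix deeper in its hierarchy.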
async def get_file_info(self, path: str) -> dict[str, Any] | None:
"""Get file info including file ID via WebDAV PROPFIND.
Args:
path: Path to the file (relative to user's files directory)
Returns:
File info dictionary with id, name, size, content_type, etc.
Returns None if file not found.
"""
webdav_path = f"{self._get_webdav_base_path()}/{path.lstrip('/')}"
propfind_body = """<?xml version="1.0"?>
<d:propfind xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns">
<d:prop>
<oc:fileid/>
<d:displayname/>
<d:getcontentlength/>
<d:getcontenttype/>
<d:getlastmodified/>
<d:getetag/>
<d:resourcetype/>
</d:prop>
</d:propfind>"""
try:
response = await self._client.request(
"PROPFIND",
webdav_path,
headers={"Depth": "0"},
content=propfind_body,
)
response.raise_for_status()
except HTTPStatusError as e:
if e.response.status_code == 404:
logger.debug(f"File not found: {path}")
return None
raise
# Parse XML response
root = ET.fromstring(response.content)
ns = {
"d": "DAV:",
"oc": "http://owncloud.org/ns",
}
response_elem = root.find("d:response", ns)
if response_elem is None:
return None
propstat = response_elem.find("d:propstat", ns)
if propstat is None:
return None
prop = propstat.find("d:prop", ns)
if prop is None:
return None
# Extract properties
fileid_elem = prop.find("oc:fileid", ns)
displayname_elem = prop.find("d:displayname", ns)
contentlength_elem = prop.find("d:getcontentlength", ns)
contenttype_elem = prop.find("d:getcontenttype", ns)
lastmodified_elem = prop.find("d:getlastmodified", ns)
etag_elem = prop.find("d:getetag", ns)
resourcetype_elem = prop.find("d:resourcetype", ns)
is_directory = (
resourcetype_elem is not None
and resourcetype_elem.find("d:collection", ns) is not None
)
file_info = {
"id": int(fileid_elem.text)
if fileid_elem is not None and fileid_elem.text is not None
else None,
"path": path,
"name": displayname_elem.text
if displayname_elem is not None
else path.split("/")[-1],
"size": int(contentlength_elem.text)
if contentlength_elem is not None and contentlength_elem.text
else 0,
"content_type": contenttype_elem.text
if contenttype_elem is not None
else "",
"last_modified": lastmodified_elem.text
if lastmodified_elem is not None
else None,
"etag": etag_elem.text.strip('"')
if etag_elem is not None and etag_elem.text
else None,
"is_directory": is_directory,
}
logger.debug(f"Got file info for '{path}': id={file_info['id']}")
return file_info
async def create_tag(
self,
name: str,
user_visible: bool = True,
user_assignable: bool = True,
) -> dict[str, Any]:
"""Create a system tag via WebDAV.
Args:
name: Name of the tag to create
user_visible: Whether the tag is visible to users
user_assignable: Whether users can assign this tag
Returns:
Tag dictionary with id, name, userVisible, userAssignable
Raises:
HTTPStatusError: If tag creation fails (409 if already exists)
"""
# Use WebDAV POST with JSON body to create tag
response = await self._client.post(
"/remote.php/dav/systemtags/",
headers={"Content-Type": "application/json"},
json={
"name": name,
"userVisible": user_visible,
"userAssignable": user_assignable,
},
)
response.raise_for_status()
# Extract tag ID from Content-Location header (e.g., /remote.php/dav/systemtags/42)
content_location = response.headers.get("Content-Location", "")
tag_id = None
if content_location:
# Extract the numeric ID from the path
try:
tag_id = int(content_location.rstrip("/").split("/")[-1])
except (ValueError, IndexError):
pass
tag_info = {
"id": tag_id,
"name": name,
"userVisible": user_visible,
"userAssignable": user_assignable,
}
logger.info(f"Created tag '{name}' with ID {tag_info['id']}")
return tag_info
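The ID extraction from the `Content-Location` header in `create_tag` can be checked on its own. A minimal sketch, assuming header values of the form shown in the comment above (for example `/remote.php/dav/systemtags/42`); the function name is hypothetical:

```python
from typing import Optional


def tag_id_from_content_location(content_location: str) -> Optional[int]:
    """Return the trailing numeric path segment, or None if there is none."""
    if not content_location:
        return None
    # Ignore a trailing slash, then take the last path segment.
    last_segment = content_location.rstrip("/").split("/")[-1]
    try:
        return int(last_segment)
    except ValueError:
        return None


tag_id = tag_id_from_content_location("/remote.php/dav/systemtags/42")
```

Returning `None` instead of raising matches the client's behavior: the tag is still reported as created, only without a known ID.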
async def get_or_create_tag(
self,
name: str,
user_visible: bool = True,
user_assignable: bool = True,
) -> dict[str, Any]:
"""Get a tag by name, creating it if it doesn't exist.
Args:
name: Name of the tag
user_visible: Whether the tag is visible to users (for creation)
user_assignable: Whether users can assign this tag (for creation)
Returns:
Tag dictionary with id, name, userVisible, userAssignable
"""
# First try to get existing tag
existing_tag = await self.get_tag_by_name(name)
if existing_tag:
logger.debug(f"Tag '{name}' already exists with ID {existing_tag['id']}")
return existing_tag
# Create new tag
try:
return await self.create_tag(name, user_visible, user_assignable)
except HTTPStatusError as e:
if e.response.status_code == 409:
# Tag was created between our check and creation, fetch it
existing_tag = await self.get_tag_by_name(name)
if existing_tag:
return existing_tag
raise
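`get_or_create_tag` follows a check, create, re-read-on-conflict pattern to tolerate a concurrent writer. The sketch below reproduces that control flow against an in-memory fake so the race can be simulated deterministically. `Conflict`, `FakeTagStore`, and `get_or_create` are hypothetical stand-ins, not part of the client.

```python
import asyncio
from typing import Optional


class Conflict(Exception):
    """Hypothetical stand-in for an HTTP 409 response."""


class FakeTagStore:
    """In-memory tag store that simulates a concurrent writer."""

    def __init__(self) -> None:
        self.tags: dict[str, int] = {}
        self.lookups = 0

    async def get(self, name: str) -> Optional[int]:
        self.lookups += 1
        if self.lookups == 1:
            # Simulated race: the first lookup misses, and another writer
            # creates the tag immediately afterwards.
            self.tags.setdefault(name, 7)
            return None
        return self.tags.get(name)

    async def create(self, name: str) -> int:
        if name in self.tags:
            raise Conflict(name)
        self.tags[name] = len(self.tags) + 1
        return self.tags[name]


async def get_or_create(store: FakeTagStore, name: str) -> int:
    existing = await store.get(name)
    if existing is not None:
        return existing
    try:
        return await store.create(name)
    except Conflict:
        # The tag appeared between our check and our create: read it again.
        existing = await store.get(name)
        if existing is not None:
            return existing
        raise


store = FakeTagStore()
tag_id = asyncio.run(get_or_create(store, "vector-index"))
```

Re-raising when the second lookup also misses keeps an unexpected conflict visible instead of silently returning nothing.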
async def assign_tag_to_file(self, file_id: int, tag_id: int) -> bool:
"""Assign a system tag to a file.
Args:
file_id: Numeric file ID
tag_id: Numeric tag ID
Returns:
True if tag was assigned successfully (or already assigned)
Raises:
HTTPStatusError: If tag assignment fails
"""
response = await self._client.request(
"PUT",
f"/remote.php/dav/systemtags-relations/files/{file_id}/{tag_id}",
headers={"Content-Length": "0"},
content=b"",
)
# 201 = Created (new assignment), 409 = Conflict (already assigned)
if response.status_code in (201, 409):
logger.info(f"Tagged file {file_id} with tag {tag_id}")
return True
response.raise_for_status()
return True
async def remove_tag_from_file(self, file_id: int, tag_id: int) -> bool:
"""Remove a system tag from a file.
Args:
file_id: Numeric file ID
tag_id: Numeric tag ID
Returns:
True if tag was removed successfully (or wasn't assigned)
Raises:
HTTPStatusError: If tag removal fails
"""
response = await self._client.request(
"DELETE",
f"/remote.php/dav/systemtags-relations/files/{file_id}/{tag_id}",
)
# 204 = No Content (removed), 404 = Not Found (wasn't assigned)
if response.status_code in (204, 404):
logger.info(f"Removed tag {tag_id} from file {file_id}")
return True
response.raise_for_status()
return True
+82 -10
@@ -2,8 +2,37 @@ import logging
import logging.config
import os
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional
class DeploymentMode(Enum):
"""Deployment mode for the MCP server.
SELF_HOSTED: Full features, environment-based configuration.
Supports vector sync, semantic search, admin UI.
SMITHERY_STATELESS: Stateless mode for Smithery hosting.
Session-based configuration, no persistent storage.
Excludes semantic search, vector sync, admin UI.
"""
SELF_HOSTED = "self_hosted"
SMITHERY_STATELESS = "smithery"
def get_deployment_mode() -> DeploymentMode:
"""Detect deployment mode from environment.
Returns:
DeploymentMode.SMITHERY_STATELESS if SMITHERY_DEPLOYMENT=true,
otherwise DeploymentMode.SELF_HOSTED (default).
"""
if os.getenv("SMITHERY_DEPLOYMENT", "false").lower() == "true":
return DeploymentMode.SMITHERY_STATELESS
return DeploymentMode.SELF_HOSTED
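The detection rule in `get_deployment_mode` is a case-insensitive comparison against the literal string "true". A minimal sketch that takes the environment as a parameter so the rule can be exercised without touching the process environment; `detect_mode` and its `env` parameter are hypothetical:

```python
import os
from enum import Enum
from typing import Mapping


class DeploymentMode(Enum):
    SELF_HOSTED = "self_hosted"
    SMITHERY_STATELESS = "smithery"


def detect_mode(env: Mapping[str, str] = os.environ) -> DeploymentMode:
    """Return SMITHERY_STATELESS only when SMITHERY_DEPLOYMENT is 'true'."""
    if env.get("SMITHERY_DEPLOYMENT", "false").lower() == "true":
        return DeploymentMode.SMITHERY_STATELESS
    return DeploymentMode.SELF_HOSTED


default_mode = detect_mode({})
smithery_mode = detect_mode({"SMITHERY_DEPLOYMENT": "TRUE"})
```

Under this rule, values such as "1" or "yes" do not enable Smithery mode and fall back to self-hosted.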
LOGGING_CONFIG = {
"version": 1,
"disable_existing_loggers": False,
@@ -102,6 +131,14 @@ def get_document_processor_config() -> dict[str, Any]:
"lang": os.getenv("TESSERACT_LANG", "eng"),
}
# PyMuPDF configuration (local PDF processing)
if os.getenv("ENABLE_PYMUPDF", "true").lower() == "true": # Enabled by default
config["processors"]["pymupdf"] = {
"extract_images": os.getenv("PYMUPDF_EXTRACT_IMAGES", "true").lower()
== "true",
"image_dir": os.getenv("PYMUPDF_IMAGE_DIR"), # None = use temp directory
}
# Custom processor (via HTTP API)
if os.getenv("ENABLE_CUSTOM_PROCESSOR", "false").lower() == "true":
custom_url = os.getenv("CUSTOM_PROCESSOR_URL")
@@ -180,9 +217,14 @@ class Settings:
ollama_embedding_model: str = "nomic-embed-text"
ollama_verify_ssl: bool = True
# OpenAI settings (for embeddings)
openai_api_key: Optional[str] = None
openai_base_url: Optional[str] = None
openai_embedding_model: str = "text-embedding-3-small"
# Document chunking settings (for vector embeddings)
document_chunk_size: int = 512 # Words per chunk
document_chunk_overlap: int = 50 # Overlapping words between chunks
document_chunk_size: int = 2048 # Characters per chunk
document_chunk_overlap: int = 200 # Overlapping characters between chunks
# Observability settings
metrics_enabled: bool = True
@@ -227,10 +269,10 @@ class Settings:
f"Overlap should be 10-20% of chunk size for optimal results."
)
if self.document_chunk_size < 100:
if self.document_chunk_size < 512:
logger.warning(
f"DOCUMENT_CHUNK_SIZE is set to {self.document_chunk_size} words, which is quite small. "
f"Smaller chunks may lose context. Consider using at least 256 words."
f"DOCUMENT_CHUNK_SIZE is set to {self.document_chunk_size} characters, which is quite small. "
f"Smaller chunks may lose context. Consider using at least 1024 characters."
)
if self.document_chunk_overlap < 0:
@@ -238,6 +280,29 @@ class Settings:
f"DOCUMENT_CHUNK_OVERLAP ({self.document_chunk_overlap}) cannot be negative."
)
def get_embedding_model_name(self) -> str:
"""
Get the active embedding model name based on provider priority.
Priority order (same as ProviderRegistry):
1. OpenAI - if OPENAI_API_KEY is set
2. Ollama - if OLLAMA_BASE_URL is set
3. Simple - fallback (returns "simple-384")
Returns:
Active embedding model name
"""
# Check OpenAI first (higher priority than Ollama in registry)
if self.openai_api_key:
return self.openai_embedding_model
# Check Ollama
if self.ollama_base_url:
return self.ollama_embedding_model
# Fallback to simple provider indicator
return "simple-384"
def get_collection_name(self) -> str:
"""
Get Qdrant collection name.
@@ -253,8 +318,9 @@ class Settings:
Format: {deployment-id}-{model-name}
Examples:
- "my-deployment-nomic-embed-text" (OTEL_SERVICE_NAME set)
- "mcp-container-all-minilm" (hostname fallback)
- "my-deployment-nomic-embed-text" (Ollama)
- "my-deployment-text-embedding-3-small" (OpenAI)
- "mcp-container-openai-text-embedding-3-small" (hostname fallback, model "openai/text-embedding-3-small")
Returns:
Collection name string
@@ -274,7 +340,7 @@ class Settings:
# Sanitize deployment ID and model name
deployment_id = deployment_id.lower().replace(" ", "-").replace("_", "-")
model_name = self.ollama_embedding_model.replace("/", "-").replace(":", "-")
model_name = self.get_embedding_model_name().replace("/", "-").replace(":", "-")
return f"{deployment_id}-{model_name}"
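Taken together, `get_embedding_model_name` and `get_collection_name` derive the collection name from the active provider. The sketch below restates both steps as free functions to show how the provider priority and the sanitization interact. The function names and sample inputs are hypothetical; the defaults are the ones shown in the settings above.

```python
from typing import Optional


def embedding_model_name(
    openai_api_key: Optional[str],
    ollama_base_url: Optional[str],
    openai_model: str = "text-embedding-3-small",
    ollama_model: str = "nomic-embed-text",
) -> str:
    """Pick the model name by provider priority: OpenAI, Ollama, simple."""
    if openai_api_key:
        return openai_model
    if ollama_base_url:
        return ollama_model
    return "simple-384"


def collection_name(deployment_id: str, model_name: str) -> str:
    """Build '{deployment-id}-{model-name}' with the same sanitization."""
    deployment_id = deployment_id.lower().replace(" ", "-").replace("_", "-")
    model_name = model_name.replace("/", "-").replace(":", "-")
    return f"{deployment_id}-{model_name}"


# OpenAI wins when both providers are configured.
both = collection_name(
    "My_Deployment", embedding_model_name("sk-test", "http://ollama:11434")
)
# A namespaced model name has its slash replaced by a hyphen.
namespaced = collection_name("mcp-container", "openai/text-embedding-3-small")
```

Since the model name is part of the collection name, switching providers yields a different collection name rather than reusing the existing one.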
@@ -334,9 +400,15 @@ def get_settings() -> Settings:
ollama_base_url=os.getenv("OLLAMA_BASE_URL"),
ollama_embedding_model=os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text"),
ollama_verify_ssl=os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true",
# OpenAI settings
openai_api_key=os.getenv("OPENAI_API_KEY"),
openai_base_url=os.getenv("OPENAI_BASE_URL"),
openai_embedding_model=os.getenv(
"OPENAI_EMBEDDING_MODEL", "text-embedding-3-small"
),
# Document chunking settings
document_chunk_size=int(os.getenv("DOCUMENT_CHUNK_SIZE", "512")),
document_chunk_overlap=int(os.getenv("DOCUMENT_CHUNK_OVERLAP", "50")),
document_chunk_size=int(os.getenv("DOCUMENT_CHUNK_SIZE", "2048")),
document_chunk_overlap=int(os.getenv("DOCUMENT_CHUNK_OVERLAP", "200")),
# Observability settings
metrics_enabled=os.getenv("METRICS_ENABLED", "true").lower() == "true",
metrics_port=int(os.getenv("METRICS_PORT", "9090")),
+110 -8
@@ -1,21 +1,37 @@
"""Helper functions for accessing context in MCP tools."""
import logging
from httpx import BasicAuth
from mcp.server.fastmcp import Context
from nextcloud_mcp_server.client import NextcloudClient
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.config import (
DeploymentMode,
get_deployment_mode,
get_settings,
)
logger = logging.getLogger(__name__)
async def get_client(ctx: Context) -> NextcloudClient:
"""
Get the appropriate Nextcloud client based on authentication mode.
ADR-005 compliant implementation supporting two modes:
1. BasicAuth mode: Returns shared client from lifespan context
2. Multi-audience mode (ENABLE_TOKEN_EXCHANGE=false, default):
Token already contains both MCP and Nextcloud audiences - use directly
3. Token exchange mode (ENABLE_TOKEN_EXCHANGE=true):
Exchange MCP token for Nextcloud token via RFC 8693
ADR-016 compliant implementation supporting three deployment modes:
1. Smithery stateless mode (SMITHERY_DEPLOYMENT=true):
Create client from session configuration (nextcloud_url, username, app_password)
No persistent state - client created per-request from Smithery session config.
2. BasicAuth mode: Returns shared client from lifespan context
3. OAuth mode:
a. Multi-audience mode (ENABLE_TOKEN_EXCHANGE=false, default):
Token already contains both MCP and Nextcloud audiences - use directly
b. Token exchange mode (ENABLE_TOKEN_EXCHANGE=true):
Exchange MCP token for Nextcloud token via RFC 8693
SECURITY: Token passthrough has been REMOVED. All OAuth modes validate
proper token audiences per MCP Security Best Practices specification.
@@ -24,7 +40,7 @@ async def get_client(ctx: Context) -> NextcloudClient:
by the MCP server via @require_scopes decorator, not by the IdP.
This function automatically detects the authentication mode by checking
the type of the lifespan context.
the deployment mode and type of the lifespan context.
Args:
ctx: MCP request context
@@ -34,6 +50,7 @@ async def get_client(ctx: Context) -> NextcloudClient:
Raises:
AttributeError: If context doesn't contain expected data
ValueError: If Smithery mode but session config is missing required fields
Example:
```python
@@ -43,6 +60,12 @@ async def get_client(ctx: Context) -> NextcloudClient:
return await client.capabilities()
```
"""
deployment_mode = get_deployment_mode()
# ADR-016: Smithery stateless mode - create client from session config
if deployment_mode == DeploymentMode.SMITHERY_STATELESS:
return _get_client_from_session_config(ctx)
settings = get_settings()
lifespan_ctx = ctx.request_context.lifespan_context
@@ -75,3 +98,82 @@ async def get_client(ctx: Context) -> NextcloudClient:
f"Lifespan context does not have 'client' or 'nextcloud_host' attribute. "
f"Type: {type(lifespan_ctx)}"
)
def _get_client_from_session_config(ctx: Context) -> NextcloudClient:
"""
Create NextcloudClient from Smithery session configuration.
ADR-016: In Smithery stateless mode, each request includes session config
with the user's Nextcloud credentials. This function creates a fresh client
for each request - no state is persisted between requests.
For container runtime, config is extracted from URL query parameters by
SmitheryConfigMiddleware and stored in a context variable.
Expected session config fields (from Smithery configSchema):
- nextcloud_url: str - Nextcloud instance URL (required)
- username: str - Nextcloud username (required)
- app_password: str - Nextcloud app password (required)
Args:
ctx: MCP request context (not used directly for Smithery config)
Returns:
NextcloudClient configured with session credentials
Raises:
ValueError: If required session config fields are missing
"""
# ADR-016: Get session config from context variable (set by SmitheryConfigMiddleware)
from nextcloud_mcp_server.app import get_smithery_session_config
session_config = get_smithery_session_config()
if session_config is None:
raise ValueError(
"Session configuration required in Smithery mode. "
"Ensure nextcloud_url, username, and app_password are provided as URL query parameters."
)
# Extract required fields - config is always a dict from SmitheryConfigMiddleware
nextcloud_url = session_config.get("nextcloud_url")
username = session_config.get("username")
app_password = session_config.get("app_password")
# Validate required fields
missing_fields = []
if not nextcloud_url:
missing_fields.append("nextcloud_url")
if not username:
missing_fields.append("username")
if not app_password:
missing_fields.append("app_password")
if missing_fields:
raise ValueError(
f"Missing required session config fields: {', '.join(missing_fields)}. "
f"Configure these in the Smithery connection settings."
)
# Type assertions after validation (for type checker)
# These are guaranteed to be str after the missing_fields check above
assert nextcloud_url is not None
assert username is not None
assert app_password is not None
# Validate URL format
if not nextcloud_url.startswith(("http://", "https://")):
raise ValueError(
f"Invalid nextcloud_url: {nextcloud_url}. "
f"Must start with http:// or https://"
)
logger.debug(f"Creating Smithery client for {nextcloud_url} as {username}")
# Create client with session credentials using BasicAuth
return NextcloudClient(
base_url=nextcloud_url,
username=username,
auth=BasicAuth(username, app_password),
)
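The validation order in `_get_client_from_session_config` (missing config, then missing fields, then URL scheme) can be exercised in isolation. The following is a minimal sketch under stated assumptions: `validate_session_config` is a hypothetical helper, not part of this diff, and it returns a plain tuple instead of building a `NextcloudClient`.

```python
from typing import Any, Optional

REQUIRED_FIELDS = ("nextcloud_url", "username", "app_password")


def validate_session_config(
    session_config: Optional[dict[str, Any]],
) -> tuple[str, str, str]:
    """Return (nextcloud_url, username, app_password) or raise ValueError.

    Hypothetical helper that mirrors the checks in the hunk above.
    """
    if session_config is None:
        raise ValueError("Session configuration required in Smithery mode.")

    # Empty strings count as missing, matching the `if not value` checks above.
    missing = [name for name in REQUIRED_FIELDS if not session_config.get(name)]
    if missing:
        raise ValueError(
            f"Missing required session config fields: {', '.join(missing)}."
        )

    nextcloud_url = str(session_config["nextcloud_url"])
    if not nextcloud_url.startswith(("http://", "https://")):
        raise ValueError(
            f"Invalid nextcloud_url: {nextcloud_url}. "
            "Must start with http:// or https://"
        )

    return (
        nextcloud_url,
        str(session_config["username"]),
        str(session_config["app_password"]),
    )
```

Collecting all missing field names before raising means the caller sees every problem in one error rather than fixing them one at a time.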
@@ -1,12 +1,18 @@
"""Document processing plugins for extracting text from various file formats."""
from .base import DocumentProcessor, ProcessingResult, ProcessorError
from .pymupdf import PyMuPDFProcessor
from .registry import ProcessorRegistry, get_registry
# Register processors at module initialization
_registry = get_registry()
_registry.register(PyMuPDFProcessor(), priority=10)
__all__ = [
"DocumentProcessor",
"ProcessingResult",
"ProcessorError",
"ProcessorRegistry",
"get_registry",
"PyMuPDFProcessor",
]
@@ -0,0 +1,253 @@
"""Document processor using PyMuPDF (fitz) library."""
import logging
import pathlib
import tempfile
from collections.abc import Awaitable, Callable
from typing import Any, Optional
# NOTE: Do NOT call pymupdf.layout.activate() here!
# It changes the behavior of pymupdf4llm.to_markdown() when page_chunks=True,
# causing it to return a string instead of a list[dict].
# See: https://github.com/pymupdf/pymupdf4llm/issues/323
import pymupdf
import pymupdf4llm
from .base import DocumentProcessor, ProcessingResult, ProcessorError
logger = logging.getLogger(__name__)
class PyMuPDFProcessor(DocumentProcessor):
"""Document processor using PyMuPDF library for PDF processing.
PyMuPDF (fitz) is a fast, local PDF processing library that extracts text,
metadata, and images without requiring external API calls.
Features:
- Fast text extraction with layout preservation
- PDF metadata extraction (title, author, creation date, page count)
- Image extraction for future multimodal support
- Page number tracking for precise citations
"""
SUPPORTED_TYPES = {
"application/pdf",
}
def __init__(
self,
extract_images: bool = True,
image_dir: Optional[str | pathlib.Path] = None,
):
"""Initialize PyMuPDF processor.
Args:
extract_images: Whether to extract embedded images from PDFs
image_dir: Directory to store extracted images (defaults to temp directory)
"""
self.extract_images = extract_images
if image_dir is None:
self.image_dir = pathlib.Path(tempfile.gettempdir()) / "pdf-images"
else:
self.image_dir = pathlib.Path(image_dir)
# Create image directory if it doesn't exist
if self.extract_images:
self.image_dir.mkdir(exist_ok=True, parents=True)
logger.info(
f"Initialized PyMuPDFProcessor with image extraction to {self.image_dir}"
)
else:
logger.info("Initialized PyMuPDFProcessor without image extraction")
@property
def name(self) -> str:
return "pymupdf"
@property
def supported_mime_types(self) -> set[str]:
return self.SUPPORTED_TYPES
async def process(
self,
content: bytes,
content_type: str,
filename: Optional[str] = None,
options: Optional[dict[str, Any]] = None,
progress_callback: Optional[
Callable[[float, Optional[float], Optional[str]], Awaitable[None]]
] = None,
) -> ProcessingResult:
"""Process a PDF document and extract text, metadata, and images.
Args:
content: PDF document bytes
content_type: MIME type (should be application/pdf)
filename: Optional filename for better error messages
options: Processing options (currently unused)
progress_callback: Optional callback for progress updates
Returns:
ProcessingResult with extracted text and metadata
Raises:
ProcessorError: If PDF processing fails
"""
import anyio
try:
if progress_callback:
await progress_callback(0, 100, "Opening PDF document")
# Open document and extract metadata in thread
doc = await anyio.to_thread.run_sync( # type: ignore[attr-defined]
lambda: pymupdf.open("pdf", content)
)
metadata = self._extract_metadata(doc, filename)
metadata["file_size"] = len(content)
page_count = doc.page_count
if progress_callback:
await progress_callback(10, 100, f"Extracting {page_count} pages")
# Prepare image directory if needed
pdf_image_dir = None
if self.extract_images:
pdf_id = filename.replace("/", "_") if filename else "unknown"
pdf_image_dir = self.image_dir / pdf_id
pdf_image_dir.mkdir(exist_ok=True, parents=True)
# Extract all pages in a single call with page_chunks=True
def do_extract() -> list[dict[str, Any]]:
# When page_chunks=True, to_markdown returns list[dict] not str
return pymupdf4llm.to_markdown( # type: ignore[return-value]
doc,
write_images=self.extract_images,
image_path=pdf_image_dir if self.extract_images else None,
page_chunks=True,
)
page_chunks: list[dict[str, Any]] = await anyio.to_thread.run_sync( # type: ignore[attr-defined]
do_extract
)
if progress_callback:
await progress_callback(90, 100, "Building result")
# Extract page texts and build boundaries from chunks
page_texts: list[str] = []
page_boundaries: list[dict[str, Any]] = []
current_offset = 0
for chunk in page_chunks:
text = chunk.get("text", "")
page_num = chunk.get("metadata", {}).get("page", len(page_texts) + 1)
page_texts.append(text)
page_boundaries.append(
{
"page": page_num,
"start_offset": current_offset,
"end_offset": current_offset + len(text),
}
)
current_offset += len(text)
# Collect image paths
image_paths = []
if pdf_image_dir and pdf_image_dir.exists():
image_paths = [str(p) for p in pdf_image_dir.glob("*")]
# Build final text and metadata
md_text = "".join(page_texts)
metadata["has_images"] = len(image_paths) > 0
if image_paths:
metadata["image_count"] = len(image_paths)
metadata["image_paths"] = image_paths
metadata["page_boundaries"] = page_boundaries
# Close document
doc.close()
if progress_callback:
await progress_callback(100, 100, "Processing complete")
logger.info(
f"Successfully processed PDF {filename or '<bytes>'}: "
f"{metadata['page_count']} pages, {len(md_text)} chars, "
f"{metadata.get('image_count', 0)} images"
)
return ProcessingResult(
text=md_text,
metadata=metadata,
processor=self.name,
success=True,
)
except Exception as e:
error_msg = f"Failed to process PDF {filename or '<bytes>'}: {e}"
logger.error(error_msg, exc_info=True)
raise ProcessorError(error_msg) from e
def _extract_metadata(
self, doc: pymupdf.Document, filename: Optional[str]
) -> dict[str, Any]:
"""Extract metadata from PDF document.
Args:
doc: Opened PyMuPDF document
filename: Optional filename
Returns:
Dictionary with PDF metadata
"""
metadata: dict[str, Any] = {}
# Basic document info
metadata["page_count"] = doc.page_count
# PyMuPDF already reports the format string (e.g. "PDF 1.7") in doc.metadata
metadata["format"] = (doc.metadata or {}).get("format") or "PDF"
if filename:
metadata["filename"] = filename
# Extract PDF metadata dictionary
pdf_metadata = doc.metadata
if pdf_metadata:
# Standard PDF metadata fields
if pdf_metadata.get("title"):
metadata["title"] = pdf_metadata["title"]
if pdf_metadata.get("author"):
metadata["author"] = pdf_metadata["author"]
if pdf_metadata.get("subject"):
metadata["subject"] = pdf_metadata["subject"]
if pdf_metadata.get("keywords"):
metadata["keywords"] = pdf_metadata["keywords"]
if pdf_metadata.get("creator"):
metadata["creator"] = pdf_metadata["creator"]
if pdf_metadata.get("producer"):
metadata["producer"] = pdf_metadata["producer"]
if pdf_metadata.get("creationDate"):
metadata["creation_date"] = pdf_metadata["creationDate"]
if pdf_metadata.get("modDate"):
metadata["modification_date"] = pdf_metadata["modDate"]
return metadata
async def health_check(self) -> bool:
"""Check if PyMuPDF is available and working.
Returns:
True if processor is ready to use
"""
try:
# Try to create a simple PDF in memory
test_doc = pymupdf.open()
test_doc.close()
return True
except Exception as e:
logger.error(f"PyMuPDF health check failed: {e}")
return False
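The `page_boundaries` metadata built in `process()` stores half-open character ranges into the concatenated text, so a chunk offset can be mapped back to a page. A minimal sketch, assuming the boundaries are contiguous and ordered as in the loop above; `build_page_boundaries` and `page_for_offset` are hypothetical names used only for illustration.

```python
from bisect import bisect_right
from typing import Any, Optional


def build_page_boundaries(page_texts: list[str]) -> list[dict[str, Any]]:
    """Build [start_offset, end_offset) ranges, one per page, 1-based pages."""
    boundaries: list[dict[str, Any]] = []
    offset = 0
    for page_num, text in enumerate(page_texts, start=1):
        boundaries.append(
            {
                "page": page_num,
                "start_offset": offset,
                "end_offset": offset + len(text),
            }
        )
        offset += len(text)
    return boundaries


def page_for_offset(
    boundaries: list[dict[str, Any]], offset: int
) -> Optional[int]:
    """Return the page containing a character offset, or None if out of range."""
    starts = [b["start_offset"] for b in boundaries]
    # bisect_right picks the last page whose start is <= offset, which also
    # skips empty pages that share a start offset with the following page.
    idx = bisect_right(starts, offset) - 1
    if idx < 0:
        return None
    candidate = boundaries[idx]
    if candidate["start_offset"] <= offset < candidate["end_offset"]:
        return candidate["page"]
    return None
```

Because pages are joined without a separator, the end offset of one page equals the start offset of the next, and the half-open convention keeps each character on exactly one page.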
+9 -2
@@ -1,6 +1,13 @@
"""Embedding service package for generating vector embeddings."""
from .service import EmbeddingService, get_embedding_service
from .bm25_provider import BM25SparseEmbeddingProvider
from .service import EmbeddingService, get_bm25_service, get_embedding_service
from .simple_provider import SimpleEmbeddingProvider
__all__ = ["EmbeddingService", "get_embedding_service", "SimpleEmbeddingProvider"]
__all__ = [
"EmbeddingService",
"get_embedding_service",
"BM25SparseEmbeddingProvider",
"get_bm25_service",
"SimpleEmbeddingProvider",
]
@@ -0,0 +1,98 @@
"""BM25 sparse embedding provider using FastEmbed."""
import logging
from typing import Any
from fastembed import SparseTextEmbedding
logger = logging.getLogger(__name__)
class BM25SparseEmbeddingProvider:
"""
BM25 sparse embedding provider for hybrid search.
Uses FastEmbed's BM25 model to generate sparse vectors for keyword-based
retrieval. These sparse vectors are combined with dense semantic vectors
in Qdrant using Reciprocal Rank Fusion (RRF) for hybrid search.
Unlike dense embeddings which have fixed dimensions, sparse embeddings
have variable-length vectors with (index, value) pairs representing
term frequencies in the BM25 vocabulary.
"""
def __init__(self, model_name: str = "Qdrant/bm25"):
"""
Initialize BM25 sparse embedding provider.
Args:
model_name: FastEmbed BM25 model name (default: Qdrant/bm25)
"""
self.model_name = model_name
logger.info(f"Initializing BM25 sparse embedding provider: {model_name}")
# Initialize FastEmbed sparse embedding model
self.model = SparseTextEmbedding(model_name=model_name)
logger.info(f"BM25 sparse embedding model loaded: {model_name}")
def encode(self, text: str) -> dict[str, Any]:
"""
Generate BM25 sparse embedding for a single text (synchronous).
Note: For async contexts, prefer encode_async() to avoid blocking the event loop.
Args:
text: Input text to encode
Returns:
Dictionary with 'indices' and 'values' keys for Qdrant sparse vector
"""
# FastEmbed returns a generator, take first result
sparse_embedding = next(iter(self.model.embed([text])))
return {
"indices": sparse_embedding.indices.tolist(),
"values": sparse_embedding.values.tolist(),
}
async def encode_async(self, text: str) -> dict[str, Any]:
"""
Generate BM25 sparse embedding for a single text (async).
Runs CPU-bound BM25 encoding in thread pool to avoid blocking the event loop.
Args:
text: Input text to encode
Returns:
Dictionary with 'indices' and 'values' keys for Qdrant sparse vector
"""
import anyio
# Run CPU-bound BM25 encoding in thread pool
return await anyio.to_thread.run_sync(lambda: self.encode(text)) # type: ignore[attr-defined]
async def encode_batch(self, texts: list[str]) -> list[dict[str, Any]]:
"""
Generate BM25 sparse embeddings for multiple texts (batched).
Args:
texts: List of texts to encode
Returns:
List of dictionaries with 'indices' and 'values' for each text
"""
import anyio
# Run CPU-bound BM25 encoding in thread pool to avoid blocking event loop
sparse_embeddings = await anyio.to_thread.run_sync( # type: ignore[attr-defined]
lambda: list(self.model.embed(texts))
)
return [
{
"indices": emb.indices.tolist(),
"values": emb.values.tolist(),
}
for emb in sparse_embeddings
]
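To illustrate the `{"indices": [...], "values": [...]}` shape that the provider returns, here is a toy sparse encoder. It is explicitly not BM25 and does not use FastEmbed: it hashes whitespace tokens into a fixed index space and stores raw term counts. The function name, the CRC32 hashing, and the `vocab_size` default are all assumptions made for the example.

```python
import zlib
from collections import Counter
from typing import Any


def toy_sparse_encode(text: str, vocab_size: int = 2**20) -> dict[str, Any]:
    """Encode text as a sparse vector of hashed term counts (illustration only)."""
    counts = Counter(text.lower().split())
    weights: dict[int, float] = {}
    for token, count in counts.items():
        # CRC32 is deterministic across runs, unlike Python's built-in hash().
        index = zlib.crc32(token.encode("utf-8")) % vocab_size
        # Colliding tokens share an index, so their counts are summed.
        weights[index] = weights.get(index, 0.0) + float(count)
    indices = sorted(weights)
    return {"indices": indices, "values": [weights[i] for i in indices]}
```

The vector length depends on the number of distinct terms in the input, which is the variable-length property the class docstring contrasts with fixed-dimension dense embeddings.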
+33 -42
@@ -1,56 +1,30 @@
"""Embedding service with provider detection."""
"""Embedding service with provider detection.
DEPRECATED: This module is maintained for backward compatibility.
New code should use nextcloud_mcp_server.providers.get_provider() directly.
"""
import logging
import os
from .base import EmbeddingProvider
from .ollama_provider import OllamaEmbeddingProvider
from .simple_provider import SimpleEmbeddingProvider
from nextcloud_mcp_server.providers import get_provider
from .bm25_provider import BM25SparseEmbeddingProvider
logger = logging.getLogger(__name__)
class EmbeddingService:
"""Unified embedding service with automatic provider detection."""
"""
Unified embedding service with automatic provider detection.
DEPRECATED: This class wraps the new unified provider infrastructure
for backward compatibility. New code should use
nextcloud_mcp_server.providers.get_provider() directly.
"""
def __init__(self):
"""Initialize embedding service with auto-detected provider."""
self.provider = self._detect_provider()
def _detect_provider(self) -> EmbeddingProvider:
"""
Auto-detect available embedding provider.
Checks environment variables in order:
1. OLLAMA_BASE_URL - Use Ollama provider (production)
2. OPENAI_API_KEY - Use OpenAI provider (future)
3. Fallback to SimpleEmbeddingProvider (testing/development)
Returns:
Configured embedding provider
"""
# Ollama provider (production)
ollama_url = os.getenv("OLLAMA_BASE_URL")
if ollama_url:
logger.info(f"Using Ollama embedding provider: {ollama_url}")
return OllamaEmbeddingProvider(
base_url=ollama_url,
model=os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text"),
verify_ssl=os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true",
)
# OpenAI provider (future implementation)
# openai_key = os.getenv("OPENAI_API_KEY")
# if openai_key:
# return OpenAIEmbeddingProvider(api_key=openai_key)
# Fallback to simple provider for development/testing
logger.warning(
"No embedding provider configured (OLLAMA_BASE_URL or OPENAI_API_KEY not set). "
"Using SimpleEmbeddingProvider for testing/development. "
"For production, configure an external embedding service."
)
return SimpleEmbeddingProvider(dimension=384)
self.provider = get_provider()
async def embed(self, text: str) -> list[float]:
"""
@@ -109,3 +83,20 @@ def get_embedding_service() -> EmbeddingService:
if _embedding_service is None:
_embedding_service = EmbeddingService()
return _embedding_service
# BM25 sparse embedding singleton
_bm25_service: BM25SparseEmbeddingProvider | None = None
def get_bm25_service() -> BM25SparseEmbeddingProvider:
"""
Get singleton BM25 sparse embedding service instance.
Returns:
Global BM25SparseEmbeddingProvider instance
"""
global _bm25_service
if _bm25_service is None:
_bm25_service = BM25SparseEmbeddingProvider()
return _bm25_service
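`get_bm25_service()` follows a lazy module-level singleton pattern: the model is only loaded on first use. The sketch below shows the same pattern with a stand-in class. The lock is an addition that is not in this diff; it only matters if the accessor may be called from several threads at once, and `ExpensiveModel` and `get_model` are hypothetical names.

```python
import threading
from typing import Optional


class ExpensiveModel:
    """Stand-in for an object that is costly to construct."""

    instances_created = 0

    def __init__(self) -> None:
        ExpensiveModel.instances_created += 1


_model: Optional[ExpensiveModel] = None
_model_lock = threading.Lock()


def get_model() -> ExpensiveModel:
    """Return the shared instance, creating it on first call."""
    global _model
    if _model is None:
        # Double-checked locking: re-test inside the lock so two threads
        # racing on the first call do not both construct the model.
        with _model_lock:
            if _model is None:
                _model = ExpensiveModel()
    return _model
```

Repeated calls return the same object, so the construction cost is paid once per process.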
+170
@@ -0,0 +1,170 @@
"""Pydantic models for Nextcloud News app responses."""
from typing import List
from pydantic import BaseModel, ConfigDict, Field
from .base import BaseResponse
class NewsFolder(BaseModel):
"""Model for a News folder."""
model_config = ConfigDict(populate_by_name=True)
id: int = Field(description="Folder ID")
name: str = Field(description="Folder name")
class NewsFeed(BaseModel):
"""Model for a News feed (RSS/Atom subscription)."""
model_config = ConfigDict(populate_by_name=True)
id: int = Field(description="Feed ID")
url: str = Field(description="Feed URL")
title: str = Field(description="Feed title")
favicon_link: str | None = Field(
None, alias="faviconLink", description="Favicon URL"
)
link: str | None = Field(None, description="Website link")
added: int = Field(description="Unix timestamp when feed was added")
folder_id: int | None = Field(
None, alias="folderId", description="Parent folder ID"
)
unread_count: int = Field(
0, alias="unreadCount", description="Number of unread items"
)
ordering: int = Field(
0, description="Feed ordering (0=default, 1=oldest, 2=newest)"
)
pinned: bool = Field(False, description="Whether feed is pinned to top")
update_error_count: int = Field(
0, alias="updateErrorCount", description="Consecutive update failures"
)
last_update_error: str | None = Field(
None, alias="lastUpdateError", description="Last update error message"
)
@property
def has_errors(self) -> bool:
"""Check if feed has update errors."""
return self.update_error_count > 0
class NewsItem(BaseModel):
"""Model for a News item (article) with full content."""
model_config = ConfigDict(populate_by_name=True)
id: int = Field(description="Item ID")
guid: str = Field(description="Globally unique identifier")
guid_hash: str = Field(alias="guidHash", description="MD5 hash of GUID")
url: str | None = Field(None, description="Article URL")
title: str = Field(description="Article title")
author: str | None = Field(None, description="Article author")
pub_date: int | None = Field(
None, alias="pubDate", description="Publication timestamp"
)
body: str | None = Field(None, description="Article content (HTML)")
enclosure_mime: str | None = Field(
None, alias="enclosureMime", description="Enclosure MIME type"
)
enclosure_link: str | None = Field(
None, alias="enclosureLink", description="Enclosure URL"
)
media_thumbnail: str | None = Field(
None, alias="mediaThumbnail", description="Media thumbnail URL"
)
media_description: str | None = Field(
None, alias="mediaDescription", description="Media description"
)
feed_id: int = Field(alias="feedId", description="Parent feed ID")
unread: bool = Field(True, description="Whether item is unread")
starred: bool = Field(False, description="Whether item is starred")
rtl: bool = Field(False, description="Right-to-left text")
last_modified: int = Field(
alias="lastModified", description="Last modification timestamp"
)
fingerprint: str | None = Field(
None, description="Content fingerprint for deduplication"
)
content_hash: str | None = Field(
None, alias="contentHash", description="Content hash"
)
class NewsItemSummary(BaseModel):
"""Lightweight model for News item list responses."""
model_config = ConfigDict(populate_by_name=True)
id: int = Field(description="Item ID")
title: str = Field(description="Article title")
feed_id: int = Field(alias="feedId", description="Parent feed ID")
unread: bool = Field(True, description="Whether item is unread")
starred: bool = Field(False, description="Whether item is starred")
pub_date: int | None = Field(
None, alias="pubDate", description="Publication timestamp"
)
url: str | None = Field(None, description="Article URL")
author: str | None = Field(None, description="Article author")
class NewsStatus(BaseModel):
"""Model for News app status."""
version: str = Field(description="News app version")
warnings: dict = Field(default_factory=dict, description="Configuration warnings")
# --- Response Models ---
class ListFoldersResponse(BaseResponse):
"""Response model for listing folders."""
results: List[NewsFolder] = Field(description="List of folders")
total_count: int = Field(description="Total number of folders")
class ListFeedsResponse(BaseResponse):
"""Response model for listing feeds."""
results: List[NewsFeed] = Field(description="List of feeds")
starred_count: int = Field(0, description="Number of starred items")
newest_item_id: int | None = Field(None, description="ID of newest item")
total_count: int = Field(description="Total number of feeds")
class ListItemsResponse(BaseResponse):
"""Response model for listing items."""
results: List[NewsItemSummary] = Field(description="List of items")
total_count: int = Field(description="Number of items returned")
has_more: bool = Field(False, description="Whether more items exist")
oldest_id: int | None = Field(None, description="Oldest item ID (for pagination)")
class GetItemResponse(BaseResponse):
"""Response model for getting a single item."""
item: NewsItem = Field(description="Full item details")
class FeedHealthResponse(BaseResponse):
"""Response model for feed health status."""
feed_id: int = Field(description="Feed ID")
title: str = Field(description="Feed title")
url: str = Field(description="Feed URL")
has_errors: bool = Field(description="Whether feed has update errors")
error_count: int = Field(description="Number of consecutive errors")
last_error: str | None = Field(None, description="Last error message")
class GetStatusResponse(BaseResponse):
"""Response model for app status."""
version: str = Field(description="News app version")
warnings: dict = Field(default_factory=dict, description="Configuration warnings")
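The `FeedHealthResponse` fields can be derived from the camelCase keys that `NewsFeed` declares as aliases (`updateErrorCount`, `lastUpdateError`). A minimal sketch using plain dictionaries rather than the Pydantic models, so it runs without third-party packages; `feed_health` is a hypothetical helper and the sample payload is invented for illustration.

```python
from typing import Any


def feed_health(feed: dict[str, Any]) -> dict[str, Any]:
    """Summarize update-error state for one raw feed payload."""
    # A missing or null counter is treated as zero consecutive failures.
    error_count = int(feed.get("updateErrorCount") or 0)
    return {
        "feed_id": feed["id"],
        "title": feed["title"],
        "url": feed["url"],
        "has_errors": error_count > 0,
        "error_count": error_count,
        "last_error": feed.get("lastUpdateError"),
    }


sample_feed = {
    "id": 7,
    "title": "Example Blog",
    "url": "https://blog.example.com/feed.xml",
    "updateErrorCount": 3,
    "lastUpdateError": "HTTP 503",
}
health = feed_health(sample_feed)
```

This mirrors the `has_errors` property on `NewsFeed`, which is true whenever the consecutive failure count is above zero.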
+38 -2
@@ -10,7 +10,7 @@ from .base import BaseResponse
class SemanticSearchResult(BaseModel):
"""Model for semantic search results with additional metadata."""
id: int = Field(description="Document ID")
id: int = Field(description="Document ID (int for all document types)")
doc_type: str = Field(
description="Document type (note, calendar_event, deck_card, etc.)"
)
@@ -19,9 +19,45 @@ class SemanticSearchResult(BaseModel):
default="", description="Document category (notes) or location (calendar)"
)
excerpt: str = Field(description="Excerpt from matching chunk")
score: float = Field(description="Semantic similarity score (0-1)")
score: float = Field(
description=(
"Relevance score (≥ 0.0, higher is better). "
"Score range depends on fusion method: "
"RRF produces scores in [0.0, 1.0], "
"DBSF can exceed 1.0 (sum of normalized scores from multiple systems)"
)
)
chunk_index: int = Field(description="Index of matching chunk in document")
total_chunks: int = Field(description="Total number of chunks in document")
chunk_start_offset: Optional[int] = Field(
default=None, description="Character position where chunk starts in document"
)
chunk_end_offset: Optional[int] = Field(
default=None, description="Character position where chunk ends in document"
)
page_number: Optional[int] = Field(
default=None, description="Page number for PDF documents"
)
# Context expansion fields (optional, populated when include_context=True)
has_context_expansion: bool = Field(
default=False, description="Whether context expansion was performed"
)
marked_text: Optional[str] = Field(
default=None,
description="Full text with position markers around matched chunk",
)
before_context: Optional[str] = Field(
default=None, description="Text before the matched chunk"
)
after_context: Optional[str] = Field(
default=None, description="Text after the matched chunk"
)
has_before_truncation: Optional[bool] = Field(
default=None, description="Whether before_context was truncated"
)
has_after_truncation: Optional[bool] = Field(
default=None, description="Whether after_context was truncated"
)
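The `score` description refers to Reciprocal Rank Fusion. As a worked illustration, the textbook RRF formula sums `1 / (k + rank)` over every ranking in which a document appears. This sketch assumes the conventional constant `k = 60` and 1-based ranks; the exact constant and scaling used by Qdrant may differ, and `rrf_fuse` is a hypothetical name.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes less the further down a document appears.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_ranking = ["a", "b", "c"]   # e.g. from semantic (dense) search
sparse_ranking = ["b", "d", "a"]  # e.g. from BM25 (sparse) search
fused = rrf_fuse([dense_ranking, sparse_ranking])
```

Documents that appear near the top of both lists win, and because only ranks are used, the raw dense and sparse scores never need to be on the same scale.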
class SemanticSearchResponse(BaseResponse):
@@ -37,7 +37,7 @@ class HealthCheckFilter(logging.Filter):
"""
# Check if the log message contains health check endpoints
message = record.getMessage()
return not any(
health_check = any(
endpoint in message
for endpoint in [
"/health/live",
@@ -47,6 +47,8 @@ class HealthCheckFilter(logging.Filter):
]
)
return not health_check
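For context, a `logging.Filter` drops a record when `filter()` returns a falsy value. The sketch below reproduces the health-check filtering idea with a standalone class. The class name and the `/health/ready` endpoint are assumptions, since the hunk elides part of the endpoint list.

```python
import logging

# "/health/live" appears in the diff; "/health/ready" is assumed for illustration.
HEALTH_ENDPOINTS = ("/health/live", "/health/ready")


class EndpointNoiseFilter(logging.Filter):
    """Suppress log records that mention health check endpoints."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        is_health_check = any(endpoint in message for endpoint in HEALTH_ENDPOINTS)
        # Returning False tells the logging machinery to discard the record.
        return not is_health_check


access_logger = logging.getLogger("example.access")
access_logger.addFilter(EndpointNoiseFilter())
```

Naming the intermediate boolean, as the diff does, makes the inverted return value easier to read than a bare `return not any(...)`.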
class TraceContextFormatter(JsonFormatter):
"""
@@ -58,7 +60,7 @@ class TraceContextFormatter(JsonFormatter):
def add_fields(
self,
log_record: dict[str, Any],
log_data: dict[str, Any],
record: logging.LogRecord,
message_dict: dict[str, Any],
) -> None:
@@ -66,28 +68,28 @@ class TraceContextFormatter(JsonFormatter):
Add custom fields to the log record, including trace context.
Args:
log_record: Dictionary to be serialized as JSON
log_data: Dictionary to be serialized as JSON
record: LogRecord instance
message_dict: Dictionary of extra fields from log call
"""
# Call parent to add standard fields
super().add_fields(log_record, record, message_dict)
super().add_fields(log_data, record, message_dict)
# Add trace context if available
trace_context = get_trace_context()
if trace_context:
log_record["trace_id"] = trace_context.get("trace_id")
log_record["span_id"] = trace_context.get("span_id")
log_data["trace_id"] = trace_context.get("trace_id")
log_data["span_id"] = trace_context.get("span_id")
# Add standard fields with consistent naming
log_record["timestamp"] = self.formatTime(record)
log_record["level"] = record.levelname
log_record["logger"] = record.name
log_record["message"] = record.getMessage()
log_data["timestamp"] = self.formatTime(record)
log_data["level"] = record.levelname
log_data["logger"] = record.name
log_data["message"] = record.getMessage()
# Include exception info if present
if record.exc_info:
log_record["exception"] = self.formatException(record.exc_info)
log_data["exception"] = self.formatException(record.exc_info)
class TraceContextTextFormatter(logging.Formatter):
+37 -14
@@ -404,10 +404,11 @@ def update_vector_sync_queue_size(size: int) -> None:
def instrument_tool(func):
"""
Decorator to automatically instrument MCP tool functions with metrics.
Decorator to automatically instrument MCP tool functions with metrics and tracing.
Wraps async tool functions to record execution time and success/error status.
Compatible with @mcp.tool() and @require_scopes() decorators.
Wraps async tool functions to record execution time, success/error status, and
create OpenTelemetry trace spans. Compatible with @mcp.tool() and @require_scopes()
decorators.
Usage:
@mcp.tool()
@@ -420,24 +421,46 @@ def instrument_tool(func):
func: The async function to instrument
Returns:
Wrapped function with metrics instrumentation
Wrapped function with metrics and tracing instrumentation
"""
import functools
import time
from nextcloud_mcp_server.observability.tracing import trace_operation
@functools.wraps(func)
async def wrapper(*args, **kwargs):
tool_name = func.__name__
start_time = time.time()
try:
result = await func(*args, **kwargs)
duration = time.time() - start_time
record_tool_call(tool_name, duration, "success")
return result
except Exception as e:
duration = time.time() - start_time
record_tool_call(tool_name, duration, "error")
record_tool_error(tool_name, type(e).__name__)
raise
# Extract tool arguments for tracing (sanitize sensitive fields)
# kwargs contains the actual arguments passed to the tool
tool_args = {
k: v
for k, v in kwargs.items()
if k not in ("password", "token", "secret", "api_key", "etag", "ctx")
}
# Create trace span with metrics collection
with trace_operation(
f"mcp.tool.{tool_name}",
attributes={
"mcp.tool.name": tool_name,
"mcp.tool.args": str(tool_args)[:500]
if tool_args
else None, # Limit to 500 chars
},
record_exception=True,
):
try:
result = await func(*args, **kwargs)
duration = time.time() - start_time
record_tool_call(tool_name, duration, "success")
return result
except Exception as e:
duration = time.time() - start_time
record_tool_call(tool_name, duration, "error")
record_tool_error(tool_name, type(e).__name__)
raise
return wrapper
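The argument sanitization and success/error bookkeeping in `instrument_tool` can be tried without Prometheus or OpenTelemetry. This is a simplified sketch: it appends to an in-memory `CALLS` list in place of `record_tool_call` and `trace_operation`, and `instrument`, `CALLS`, and `search_notes` are hypothetical names.

```python
import asyncio
import functools
import time
from typing import Any

SENSITIVE_KEYS = {"password", "token", "secret", "api_key", "etag", "ctx"}
CALLS: list[dict[str, Any]] = []


def instrument(func):
    """Record name, status, duration, and sanitized kwargs of an async call."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        # Drop sensitive keyword arguments before they reach any telemetry.
        safe_args = {k: v for k, v in kwargs.items() if k not in SENSITIVE_KEYS}
        entry = {"tool": func.__name__, "args": str(safe_args)[:500]}
        start = time.perf_counter()
        try:
            result = await func(*args, **kwargs)
        except Exception as exc:
            entry.update(status="error", error=type(exc).__name__)
            raise
        else:
            entry["status"] = "success"
            return result
        finally:
            # Runs on both paths, so every call is recorded exactly once.
            entry["duration"] = time.perf_counter() - start
            CALLS.append(entry)

    return wrapper


@instrument
async def search_notes(query: str, token: str = "") -> str:
    return f"results for {query}"


asyncio.run(search_notes(query="meeting notes", token="s3cret"))
```

Only keyword arguments are filtered here, as in the diff, so a secret passed positionally would not be inspected at all.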
@@ -0,0 +1,20 @@
"""Unified provider infrastructure for embeddings and text generation."""
from .anthropic import AnthropicProvider
from .base import Provider
from .bedrock import BedrockProvider
from .ollama import OllamaProvider
from .openai import OpenAIProvider
from .registry import get_provider, reset_provider
from .simple import SimpleProvider
__all__ = [
"Provider",
"OllamaProvider",
"OpenAIProvider",
"AnthropicProvider",
"SimpleProvider",
"BedrockProvider",
"get_provider",
"reset_provider",
]
@@ -0,0 +1,99 @@
"""Unified Anthropic provider for text generation."""
import logging
from anthropic import AsyncAnthropic
from .base import Provider
logger = logging.getLogger(__name__)
class AnthropicProvider(Provider):
"""
Anthropic provider for text generation.
Supports Claude models via the Anthropic API.
Note: Anthropic doesn't provide embedding models, only text generation.
"""
def __init__(
self, api_key: str, generation_model: str = "claude-3-5-sonnet-20241022"
):
"""
Initialize Anthropic provider.
Args:
api_key: Anthropic API key
generation_model: Model name (e.g., "claude-3-5-sonnet-20241022")
"""
self.client = AsyncAnthropic(api_key=api_key)
self.model = generation_model
logger.info(f"Initialized Anthropic provider (model={self.model})")
@property
def supports_embeddings(self) -> bool:
"""Whether this provider supports embedding generation."""
return False
@property
def supports_generation(self) -> bool:
"""Whether this provider supports text generation."""
return True
async def embed(self, text: str) -> list[float]:
"""
Generate embedding vector for text.
Raises:
NotImplementedError: Anthropic doesn't provide embedding models
"""
raise NotImplementedError(
"Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
)
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
"""
Generate embeddings for multiple texts.
Raises:
NotImplementedError: Anthropic doesn't provide embedding models
"""
raise NotImplementedError(
"Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
)
def get_dimension(self) -> int:
"""
Get embedding dimension.
Raises:
NotImplementedError: Anthropic doesn't provide embedding models
"""
raise NotImplementedError(
"Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
)
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
"""
Generate text using Anthropic API.
Args:
prompt: The prompt to generate from
max_tokens: Maximum tokens to generate
Returns:
Generated text
"""
message = await self.client.messages.create(
model=self.model,
max_tokens=max_tokens,
temperature=0.7,
messages=[{"role": "user", "content": prompt}],
)
return message.content[0].text
async def close(self) -> None:
"""Close the client (no-op for Anthropic SDK)."""
pass
+91
@@ -0,0 +1,91 @@
"""Unified provider interface for embeddings and text generation."""
from abc import ABC, abstractmethod
class Provider(ABC):
"""
Unified base class for LLM providers.
Providers can support embeddings, text generation, or both.
Use capability properties to determine what features are available.
"""
@property
@abstractmethod
def supports_embeddings(self) -> bool:
"""Whether this provider supports embedding generation."""
pass
@property
@abstractmethod
def supports_generation(self) -> bool:
"""Whether this provider supports text generation."""
pass
@abstractmethod
async def embed(self, text: str) -> list[float]:
"""
Generate embedding vector for text.
Args:
text: Input text to embed
Returns:
Vector embedding as list of floats
Raises:
NotImplementedError: If provider doesn't support embeddings
"""
pass
@abstractmethod
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
"""
Generate embeddings for multiple texts (optimized).
Args:
texts: List of texts to embed
Returns:
List of vector embeddings
Raises:
NotImplementedError: If provider doesn't support embeddings
"""
pass
@abstractmethod
def get_dimension(self) -> int:
"""
Get embedding dimension for this provider.
Returns:
Vector dimension (e.g., 768 for nomic-embed-text)
Raises:
NotImplementedError: If provider doesn't support embeddings
"""
pass
@abstractmethod
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
"""
Generate text from a prompt.
Args:
prompt: The prompt to generate from
max_tokens: Maximum tokens to generate
Returns:
Generated text
Raises:
NotImplementedError: If provider doesn't support generation
"""
pass
@abstractmethod
async def close(self) -> None:
"""Close the provider and release resources."""
pass
+397
@@ -0,0 +1,397 @@
"""Amazon Bedrock provider for embeddings and text generation."""
import json
import logging
from typing import Any
try:
import boto3
from botocore.exceptions import BotoCoreError, ClientError
BOTO3_AVAILABLE = True
except ImportError:
BOTO3_AVAILABLE = False
from .base import Provider
logger = logging.getLogger(__name__)
class BedrockProvider(Provider):
"""
Amazon Bedrock provider supporting both embeddings and text generation.
Uses AWS Bedrock Runtime API with boto3. Supports various model families:
- Embeddings: amazon.titan-embed-text-v1, amazon.titan-embed-text-v2, cohere.embed-*
- Text Generation: anthropic.claude-*, meta.llama3-*, amazon.titan-text-*, mistral.*, etc.
Requires AWS credentials configured via:
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION)
- AWS credentials file (~/.aws/credentials)
- IAM role (when running on AWS)
"""
def __init__(
self,
region_name: str | None = None,
embedding_model: str | None = None,
generation_model: str | None = None,
aws_access_key_id: str | None = None,
aws_secret_access_key: str | None = None,
):
"""
Initialize Bedrock provider.
Args:
region_name: AWS region (e.g., "us-east-1"). Defaults to AWS_REGION env var.
embedding_model: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0").
None disables embeddings.
generation_model: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0").
None disables generation.
aws_access_key_id: AWS access key (optional, uses default credential chain if not provided)
aws_secret_access_key: AWS secret key (optional, uses default credential chain if not provided)
Raises:
ImportError: If boto3 is not installed
"""
if not BOTO3_AVAILABLE:
raise ImportError(
"boto3 is required for Bedrock provider. Install with: pip install boto3"
)
self.embedding_model = embedding_model
self.generation_model = generation_model
self._dimension: int | None = None # Detected dynamically
# Initialize bedrock-runtime client
client_kwargs: dict[str, Any] = {}
if region_name:
client_kwargs["region_name"] = region_name
if aws_access_key_id:
client_kwargs["aws_access_key_id"] = aws_access_key_id
if aws_secret_access_key:
client_kwargs["aws_secret_access_key"] = aws_secret_access_key
self.client = boto3.client("bedrock-runtime", **client_kwargs)
logger.info(
f"Initialized Bedrock provider in region {region_name or 'default'} "
f"(embedding_model={embedding_model}, generation_model={generation_model})"
)
@property
def supports_embeddings(self) -> bool:
"""Whether this provider supports embedding generation."""
return self.embedding_model is not None
@property
def supports_generation(self) -> bool:
"""Whether this provider supports text generation."""
return self.generation_model is not None
def _create_embedding_request(self, text: str) -> dict[str, Any]:
"""
Create model-specific embedding request payload.
Args:
text: Input text to embed
Returns:
Request payload dict for the embedding model
"""
if not self.embedding_model:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
# Titan Embed models
if self.embedding_model.startswith("amazon.titan-embed"):
return {"inputText": text}
# Cohere Embed models
elif self.embedding_model.startswith("cohere.embed"):
return {"texts": [text], "input_type": "search_document"}
# Unknown model - try Titan format as default
else:
logger.warning(
f"Unknown embedding model format for {self.embedding_model}, "
"using Titan format as default"
)
return {"inputText": text}
def _parse_embedding_response(self, response: dict[str, Any]) -> list[float]:
"""
Parse model-specific embedding response.
Args:
response: Raw response from Bedrock
Returns:
Embedding vector as list of floats
"""
# Titan Embed models
if self.embedding_model and self.embedding_model.startswith(
"amazon.titan-embed"
):
return response["embedding"]
# Cohere Embed models
elif self.embedding_model and self.embedding_model.startswith("cohere.embed"):
return response["embeddings"][0]
# Unknown model - try Titan format as default
else:
logger.warning(
f"Unknown embedding response format for {self.embedding_model}, "
"trying Titan format"
)
return response.get("embedding", response.get("embeddings", [None])[0])
async def embed(self, text: str) -> list[float]:
"""
Generate embedding vector for text.
Args:
text: Input text to embed
Returns:
Vector embedding as list of floats
Raises:
NotImplementedError: If embeddings not enabled (no embedding_model)
ClientError: If Bedrock API call fails
"""
if not self.supports_embeddings:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
try:
request_body = self._create_embedding_request(text)
response = self.client.invoke_model(
modelId=self.embedding_model,
body=json.dumps(request_body),
accept="application/json",
contentType="application/json",
)
response_body = json.loads(response["body"].read())
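embedding = self._parse_embedding_response(response_body)
# Cache the dimension so get_dimension() works after any successful embed,
# as its error message promises.
if self._dimension is None:
self._dimension = len(embedding)
return embedding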
except (BotoCoreError, ClientError) as e:
logger.error(f"Bedrock embedding error: {e}")
raise
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
"""
Generate embeddings for multiple texts.
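Note: Requests are sent sequentially, and each boto3 call is synchronous,
so it blocks the event loop until it returns. A future optimization could
run the calls in worker threads (e.g., asyncio.to_thread) to overlap them.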
Args:
texts: List of texts to embed
Returns:
List of vector embeddings
Raises:
NotImplementedError: If embeddings not enabled (no embedding_model)
ClientError: If Bedrock API call fails
"""
if not self.supports_embeddings:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
embeddings = []
for text in texts:
embedding = await self.embed(text)
embeddings.append(embedding)
return embeddings
async def _detect_dimension(self):
"""
Detect embedding dimension by generating a test embedding.
"""
if self._dimension is None and self.supports_embeddings:
logger.debug(
f"Detecting embedding dimension for model {self.embedding_model}..."
)
test_embedding = await self.embed("test")
self._dimension = len(test_embedding)
logger.info(
f"Detected embedding dimension: {self._dimension} "
f"for model {self.embedding_model}"
)
def get_dimension(self) -> int:
"""
Get embedding dimension.
Returns:
Vector dimension for the configured embedding model
Raises:
NotImplementedError: If embeddings not enabled (no embedding_model)
RuntimeError: If dimension not detected yet (call _detect_dimension first)
"""
if not self.supports_embeddings:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
if self._dimension is None:
raise RuntimeError(
f"Embedding dimension not detected yet for model {self.embedding_model}. "
"Call _detect_dimension() first or generate an embedding."
)
return self._dimension
def _create_generation_request(
self, prompt: str, max_tokens: int
) -> dict[str, Any]:
"""
Create model-specific text generation request payload.
Args:
prompt: The prompt to generate from
max_tokens: Maximum tokens to generate
Returns:
Request payload dict for the generation model
"""
if not self.generation_model:
raise NotImplementedError(
"Text generation not supported - no generation_model configured"
)
# Anthropic Claude models
if self.generation_model.startswith("anthropic.claude"):
return {
"anthropic_version": "bedrock-2023-05-31",
"max_tokens": max_tokens,
"temperature": 0.7,
"messages": [{"role": "user", "content": prompt}],
}
# Meta Llama models
elif self.generation_model.startswith("meta.llama"):
return {"prompt": prompt, "max_gen_len": max_tokens, "temperature": 0.7}
# Amazon Titan Text models
elif self.generation_model.startswith("amazon.titan-text"):
return {
"inputText": prompt,
"textGenerationConfig": {
"maxTokenCount": max_tokens,
"temperature": 0.7,
},
}
# Mistral models
elif self.generation_model.startswith("mistral"):
return {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.7}
# Unknown model - try Claude format as default
else:
logger.warning(
f"Unknown generation model format for {self.generation_model}, "
"using Claude format as default"
)
return {
"anthropic_version": "bedrock-2023-05-31",
"max_tokens": max_tokens,
"temperature": 0.7,
"messages": [{"role": "user", "content": prompt}],
}
def _parse_generation_response(self, response: dict[str, Any]) -> str:
"""
Parse model-specific text generation response.
Args:
response: Raw response from Bedrock
Returns:
Generated text
"""
# Anthropic Claude models
if self.generation_model and self.generation_model.startswith(
"anthropic.claude"
):
return response["content"][0]["text"]
# Meta Llama models
elif self.generation_model and self.generation_model.startswith("meta.llama"):
return response["generation"]
# Amazon Titan Text models
elif self.generation_model and self.generation_model.startswith(
"amazon.titan-text"
):
return response["results"][0]["outputText"]
# Mistral models
elif self.generation_model and self.generation_model.startswith("mistral"):
return response["outputs"][0]["text"]
# Unknown model - try common response fields
else:
logger.warning(
f"Unknown generation response format for {self.generation_model}, "
"trying common fields"
)
# Try common response field names
for field in ["text", "generation", "outputText", "completion"]:
if field in response:
return response[field]
# Last resort: return JSON string
return json.dumps(response)
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
"""
Generate text from a prompt.
Args:
prompt: The prompt to generate from
max_tokens: Maximum tokens to generate
Returns:
Generated text
Raises:
NotImplementedError: If generation not enabled (no generation_model)
ClientError: If Bedrock API call fails
"""
if not self.supports_generation:
raise NotImplementedError(
"Text generation not supported - no generation_model configured"
)
try:
request_body = self._create_generation_request(prompt, max_tokens)
response = self.client.invoke_model(
modelId=self.generation_model,
body=json.dumps(request_body),
accept="application/json",
contentType="application/json",
)
response_body = json.loads(response["body"].read())
text = self._parse_generation_response(response_body)
return text
except (BotoCoreError, ClientError) as e:
logger.error(f"Bedrock generation error: {e}")
raise
async def close(self) -> None:
"""Close the client (no-op for boto3 clients)."""
pass
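Because boto3 clients are synchronous, the sequential loop in embed_batch holds the event loop for the full duration of every request. One possible way to overlap the calls is sketched below, assuming Python 3.9+ for `asyncio.to_thread`; `blocking_invoke` is a hypothetical stand-in for the real `invoke_model` round trip, and `max_concurrency` is an illustrative parameter, not an existing setting.

```python
import asyncio
import time


def blocking_invoke(text: str) -> list[float]:
    """Hypothetical stand-in for a synchronous boto3 invoke_model call."""
    time.sleep(0.05)  # simulate network latency
    return [float(len(text))]


async def embed_batch_concurrent(
    texts: list[str], max_concurrency: int = 4
) -> list[list[float]]:
    """Run blocking calls in worker threads, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed_one(text: str) -> list[float]:
        async with semaphore:
            # to_thread moves the blocking call off the event loop thread.
            return await asyncio.to_thread(blocking_invoke, text)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(embed_one(t) for t in texts))


if __name__ == "__main__":
    print(asyncio.run(embed_batch_concurrent(["a", "bb", "ccc"])))
```

The semaphore bounds the number of in-flight requests, which is one way to stay under service throttling limits; the right bound depends on the account and model.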
+234
@@ -0,0 +1,234 @@
"""Unified Ollama provider for embeddings and text generation."""
import logging
import httpx
from .base import Provider
logger = logging.getLogger(__name__)
class OllamaProvider(Provider):
"""
Ollama provider supporting both embeddings and text generation.
Supports TLS, SSL verification, and automatic model loading.
"""
def __init__(
self,
base_url: str,
embedding_model: str | None = None,
generation_model: str | None = None,
verify_ssl: bool = True,
timeout: httpx.Timeout | None = None,
):
"""
Initialize Ollama provider.
Args:
base_url: Ollama API base URL (e.g., https://ollama.internal.example.com:443)
embedding_model: Model for embeddings (e.g., "nomic-embed-text"). None disables embeddings.
generation_model: Model for text generation (e.g., "llama3.2:1b"). None disables generation.
verify_ssl: Verify SSL certificates (default: True)
timeout: HTTP timeout configuration
"""
self.base_url = base_url.rstrip("/")
self.embedding_model = embedding_model
self.generation_model = generation_model
self.verify_ssl = verify_ssl
if timeout is None:
timeout = httpx.Timeout(timeout=120, connect=5)
self.client = httpx.AsyncClient(verify=verify_ssl, timeout=timeout)
self._dimension: int | None = None # Detected dynamically for embeddings
logger.info(
f"Initialized Ollama provider: {base_url} "
f"(embedding_model={embedding_model}, generation_model={generation_model}, "
f"verify_ssl={verify_ssl})"
)
# Pre-check and auto-load models
if embedding_model:
self._check_model_is_loaded(embedding_model, autoload=True)
if generation_model:
self._check_model_is_loaded(generation_model, autoload=True)
@property
def supports_embeddings(self) -> bool:
"""Whether this provider supports embedding generation."""
return self.embedding_model is not None
@property
def supports_generation(self) -> bool:
"""Whether this provider supports text generation."""
return self.generation_model is not None
async def embed(self, text: str) -> list[float]:
"""
Generate embedding vector for text.
Args:
text: Input text to embed
Returns:
Vector embedding as list of floats
Raises:
NotImplementedError: If embeddings not enabled (no embedding_model)
"""
if not self.supports_embeddings:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
response = await self.client.post(
f"{self.base_url}/api/embeddings",
json={"model": self.embedding_model, "prompt": text},
)
response.raise_for_status()
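embedding = response.json()["embedding"]
# Cache the dimension so get_dimension() works after any successful embed,
# as its error message promises.
if self._dimension is None:
self._dimension = len(embedding)
return embedding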
async def embed_batch(
self, texts: list[str], batch_size: int = 32
) -> list[list[float]]:
"""
Generate embeddings for multiple texts using Ollama's batch API.
Uses /api/embed endpoint with array input for efficient batch processing.
Conservative batch size (32) prevents quality degradation observed in
Ollama issue #6262 with larger batches.
Note: Ollama processes batches serially, not in parallel.
Args:
texts: List of texts to embed
batch_size: Maximum texts per batch (default: 32)
Returns:
List of vector embeddings
Raises:
NotImplementedError: If embeddings not enabled (no embedding_model)
"""
if not self.supports_embeddings:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
all_embeddings = []
for i in range(0, len(texts), batch_size):
batch = texts[i : i + batch_size]
response = await self.client.post(
f"{self.base_url}/api/embed",
json={"model": self.embedding_model, "input": batch},
)
response.raise_for_status()
all_embeddings.extend(response.json()["embeddings"])
return all_embeddings
async def _detect_dimension(self):
"""
Detect embedding dimension by generating a test embedding.
This method queries the model to determine the actual dimension
instead of relying on hardcoded values.
"""
if self._dimension is None and self.supports_embeddings:
logger.debug(
f"Detecting embedding dimension for model {self.embedding_model}..."
)
test_embedding = await self.embed("test")
self._dimension = len(test_embedding)
logger.info(
f"Detected embedding dimension: {self._dimension} "
f"for model {self.embedding_model}"
)
def get_dimension(self) -> int:
"""
Get embedding dimension.
Returns:
Vector dimension for the configured embedding model
Raises:
NotImplementedError: If embeddings not enabled (no embedding_model)
RuntimeError: If dimension not detected yet (call _detect_dimension first)
"""
if not self.supports_embeddings:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
if self._dimension is None:
raise RuntimeError(
f"Embedding dimension not detected yet for model {self.embedding_model}. "
"Call _detect_dimension() first or generate an embedding."
)
return self._dimension
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
"""
Generate text from a prompt.
Args:
prompt: The prompt to generate from
max_tokens: Maximum tokens to generate
Returns:
Generated text
Raises:
NotImplementedError: If generation not enabled (no generation_model)
"""
if not self.supports_generation:
raise NotImplementedError(
"Text generation not supported - no generation_model configured"
)
response = await self.client.post(
f"{self.base_url}/api/generate",
json={
"model": self.generation_model,
"prompt": prompt,
"stream": False,
"options": {
"num_predict": max_tokens,
"temperature": 0.7,
},
},
)
response.raise_for_status()
data = response.json()
return data["response"]
def _check_model_is_loaded(self, model: str, autoload: bool = True):
"""
Check whether the model is available locally in Ollama (listed by /api/tags),
optionally pulling it if it is missing.
Args:
model: Model name to check
autoload: Whether to automatically pull the model if it is not available
"""
# Honor verify_ssl here too: the module-level httpx helpers do not share
# the settings of self.client.
response = httpx.get(f"{self.base_url}/api/tags", verify=self.verify_ssl)
response.raise_for_status()
models = [m["name"] for m in response.json().get("models", [])]
logger.info("Ollama has the following models available: %s", models)
if (model not in models) and autoload:
logger.warning(
"Model '%s' not yet available in ollama, attempting to pull now...",
model,
)
response = httpx.post(
f"{self.base_url}/api/pull",
json={"model": model},
verify=self.verify_ssl,
)
response.raise_for_status()
async def close(self) -> None:
"""Close HTTP client."""
await self.client.aclose()
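The membership test in _check_model_is_loaded compares the configured name against the names returned by /api/tags. Assuming Ollama reports names with an explicit tag (for example "nomic-embed-text:latest"), an untagged name such as the default "nomic-embed-text" would never match and would trigger a pull on every start. A minimal sketch of a tag-aware comparison follows; `is_model_available` is a hypothetical helper, not part of this change.

```python
def is_model_available(model: str, installed: list[str]) -> bool:
    """Return True if `model` matches an installed name.

    Assumption: a name without an explicit tag refers to the ':latest' tag.
    """
    wanted = model if ":" in model else f"{model}:latest"
    return model in installed or wanted in installed


if __name__ == "__main__":
    installed = ["nomic-embed-text:latest", "llama3.2:1b"]
    print(is_model_available("nomic-embed-text", installed))
    print(is_model_available("llama3.2:3b", installed))
```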
+271
@@ -0,0 +1,271 @@
"""Unified OpenAI provider for embeddings and text generation.
Supports:
- OpenAI's standard API
- GitHub Models API (models.github.ai)
- Any OpenAI-compatible API via base_url override
"""
import logging
from functools import wraps
import anyio
from openai import AsyncOpenAI, RateLimitError
from .base import Provider
logger = logging.getLogger(__name__)
# Rate limit retry configuration
MAX_RETRIES = 5
INITIAL_RETRY_DELAY = 2.0 # seconds
MAX_RETRY_DELAY = 60.0 # seconds
def retry_on_rate_limit(func):
"""Decorator to retry on OpenAI rate limit errors with exponential backoff."""
@wraps(func)
async def wrapper(*args, **kwargs):
retry_delay = INITIAL_RETRY_DELAY
last_error: Exception | None = None
for attempt in range(1, MAX_RETRIES + 1):
try:
return await func(*args, **kwargs)
except RateLimitError as e:
last_error = e
if attempt < MAX_RETRIES:
logger.warning(
f"Rate limit hit (attempt {attempt}/{MAX_RETRIES}), "
f"retrying in {retry_delay:.1f}s..."
)
await anyio.sleep(retry_delay)
retry_delay = min(retry_delay * 2, MAX_RETRY_DELAY)
logger.error(f"Rate limit exceeded after {MAX_RETRIES} attempts")
raise last_error # type: ignore[misc]
return wrapper
# Well-known embedding dimensions for OpenAI models
OPENAI_EMBEDDING_DIMENSIONS: dict[str, int] = {
"text-embedding-3-small": 1536,
"text-embedding-3-large": 3072,
"text-embedding-ada-002": 1536,
# GitHub Models API uses openai/ prefix
"openai/text-embedding-3-small": 1536,
"openai/text-embedding-3-large": 3072,
}
class OpenAIProvider(Provider):
"""
OpenAI provider supporting both embeddings and text generation.
Works with:
- OpenAI's standard API (api.openai.com)
- GitHub Models API (models.github.ai)
- Any OpenAI-compatible API (via base_url)
"""
def __init__(
self,
api_key: str,
base_url: str | None = None,
embedding_model: str | None = None,
generation_model: str | None = None,
timeout: float = 120.0,
):
"""
Initialize OpenAI provider.
Args:
api_key: OpenAI API key (or GITHUB_TOKEN for GitHub Models)
base_url: Base URL override (e.g., "https://models.github.ai/inference")
embedding_model: Model for embeddings (e.g., "text-embedding-3-small").
None disables embeddings.
generation_model: Model for text generation (e.g., "gpt-4o-mini").
None disables generation.
timeout: HTTP timeout in seconds (default: 120)
"""
self.embedding_model = embedding_model
self.generation_model = generation_model
self._dimension: int | None = None
# Initialize async client
self.client = AsyncOpenAI(
api_key=api_key,
base_url=base_url,
timeout=timeout,
)
# Try to get known dimension without API call
if embedding_model and embedding_model in OPENAI_EMBEDDING_DIMENSIONS:
self._dimension = OPENAI_EMBEDDING_DIMENSIONS[embedding_model]
logger.info(
f"Initialized OpenAI provider: base_url={base_url or 'default'} "
f"(embedding_model={embedding_model}, generation_model={generation_model}, "
f"dimension={self._dimension})"
)
@property
def supports_embeddings(self) -> bool:
"""Whether this provider supports embedding generation."""
return self.embedding_model is not None
@property
def supports_generation(self) -> bool:
"""Whether this provider supports text generation."""
return self.generation_model is not None
@retry_on_rate_limit
async def embed(self, text: str) -> list[float]:
"""
Generate embedding vector for text.
Args:
text: Input text to embed
Returns:
Vector embedding as list of floats
Raises:
NotImplementedError: If embeddings not enabled (no embedding_model)
"""
if not self.supports_embeddings:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
assert self.embedding_model is not None # Type narrowing
response = await self.client.embeddings.create(
input=text,
model=self.embedding_model,
)
embedding = response.data[0].embedding
# Update dimension if not set
if self._dimension is None:
self._dimension = len(embedding)
logger.info(
f"Detected embedding dimension: {self._dimension} "
f"for model {self.embedding_model}"
)
return embedding
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
"""
Generate embeddings for multiple texts by sending several inputs per embeddings request.
OpenAI supports up to 2048 inputs per request.
Args:
texts: List of texts to embed
Returns:
List of vector embeddings
Raises:
NotImplementedError: If embeddings not enabled (no embedding_model)
"""
if not self.supports_embeddings:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
if not texts:
return []
# OpenAI supports batches up to 2048, but use smaller batches for safety
batch_size = 100
all_embeddings: list[list[float]] = []
for i in range(0, len(texts), batch_size):
batch = texts[i : i + batch_size]
# Use helper method with retry logic for each batch
batch_embeddings = await self._embed_batch_request(batch)
all_embeddings.extend(batch_embeddings)
# Update dimension if not set
if self._dimension is None and batch_embeddings:
self._dimension = len(batch_embeddings[0])
logger.info(
f"Detected embedding dimension: {self._dimension} "
f"for model {self.embedding_model}"
)
return all_embeddings
@retry_on_rate_limit
async def _embed_batch_request(self, batch: list[str]) -> list[list[float]]:
"""Make a single batch embedding request with retry logic."""
assert self.embedding_model is not None # Type narrowing
response = await self.client.embeddings.create(
input=batch,
model=self.embedding_model,
)
# Sort by index to maintain order
sorted_data = sorted(response.data, key=lambda x: x.index)
return [item.embedding for item in sorted_data]
def get_dimension(self) -> int:
"""
Get embedding dimension.
Returns:
Vector dimension for the configured embedding model
Raises:
NotImplementedError: If embeddings not enabled (no embedding_model)
RuntimeError: If dimension not detected yet (call embed first)
"""
if not self.supports_embeddings:
raise NotImplementedError(
"Embedding not supported - no embedding_model configured"
)
if self._dimension is None:
raise RuntimeError(
f"Embedding dimension not detected yet for model {self.embedding_model}. "
"Call embed() first or use a known model."
)
return self._dimension
@retry_on_rate_limit
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
"""
Generate text from a prompt.
Args:
prompt: The prompt to generate from
max_tokens: Maximum tokens to generate
Returns:
Generated text
Raises:
NotImplementedError: If generation not enabled (no generation_model)
"""
if not self.supports_generation:
raise NotImplementedError(
"Text generation not supported - no generation_model configured"
)
assert self.generation_model is not None  # Type narrowing
response = await self.client.chat.completions.create(
model=self.generation_model,
messages=[{"role": "user", "content": prompt}],
max_tokens=max_tokens,
temperature=0.7,
)
return response.choices[0].message.content or ""
async def close(self) -> None:
"""Close HTTP client."""
await self.client.close()
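The retry decorator in this file sleeps between attempts, doubling the delay each time up to a cap. The sketch below reproduces only that schedule, using the same constants, to show how long a caller may wait before the final error is raised; `backoff_delays` is a hypothetical helper written for illustration.

```python
MAX_RETRIES = 5
INITIAL_RETRY_DELAY = 2.0  # seconds
MAX_RETRY_DELAY = 60.0  # seconds


def backoff_delays(
    max_retries: int = MAX_RETRIES,
    initial: float = INITIAL_RETRY_DELAY,
    cap: float = MAX_RETRY_DELAY,
) -> list[float]:
    """Return the delays slept between attempts (one fewer than the attempts)."""
    delays: list[float] = []
    delay = initial
    for _ in range(max_retries - 1):
        delays.append(delay)
        delay = min(delay * 2, cap)  # exponential growth, clamped to the cap
    return delays


if __name__ == "__main__":
    schedule = backoff_delays()
    print(schedule, sum(schedule))
```

With the defaults, the schedule is 2, 4, 8, and 16 seconds, so a request that is rate limited on every attempt fails after roughly 30 seconds of waiting plus request time. The 60-second cap only takes effect with more than six attempts.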
+156
@@ -0,0 +1,156 @@
"""Provider registry and factory for auto-detection and instantiation."""
import logging
import os
from .base import Provider
from .bedrock import BedrockProvider
from .ollama import OllamaProvider
from .openai import OpenAIProvider
from .simple import SimpleProvider
logger = logging.getLogger(__name__)
class ProviderRegistry:
"""
Registry for provider auto-detection and instantiation.
Checks environment variables in priority order and creates appropriate provider:
1. Bedrock (AWS_REGION, BEDROCK_EMBEDDING_MODEL, or BEDROCK_GENERATION_MODEL)
2. OpenAI (OPENAI_API_KEY)
3. Ollama (OLLAMA_BASE_URL)
4. Simple (fallback for testing/development)
"""
@staticmethod
def create_provider() -> Provider:
"""
Auto-detect and create provider based on environment variables.
Priority order:
1. Bedrock - if AWS_REGION, BEDROCK_EMBEDDING_MODEL, or BEDROCK_GENERATION_MODEL is set
2. OpenAI - if OPENAI_API_KEY is set
3. Ollama - if OLLAMA_BASE_URL is set
4. Simple - fallback for testing/development
Returns:
Provider instance
Environment Variables:
Bedrock:
- AWS_REGION: AWS region (e.g., "us-east-1")
- AWS_ACCESS_KEY_ID: AWS access key (optional, uses credential chain)
- AWS_SECRET_ACCESS_KEY: AWS secret key (optional)
- BEDROCK_EMBEDDING_MODEL: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0")
- BEDROCK_GENERATION_MODEL: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
OpenAI:
- OPENAI_API_KEY: OpenAI API key (or GITHUB_TOKEN for GitHub Models)
- OPENAI_BASE_URL: Base URL override (e.g., "https://models.github.ai/inference")
- OPENAI_EMBEDDING_MODEL: Model for embeddings (default: "text-embedding-3-small")
- OPENAI_GENERATION_MODEL: Model for text generation (e.g., "gpt-4o-mini")
Ollama:
- OLLAMA_BASE_URL: Ollama API base URL (e.g., "http://localhost:11434")
- OLLAMA_EMBEDDING_MODEL: Model for embeddings (default: "nomic-embed-text")
- OLLAMA_GENERATION_MODEL: Model for text generation (e.g., "llama3.2:1b")
- OLLAMA_VERIFY_SSL: Verify SSL certificates (default: "true")
Simple (no configuration needed, fallback):
- SIMPLE_EMBEDDING_DIMENSION: Embedding dimension (default: 384)
"""
# 1. Check for Bedrock
aws_region = os.getenv("AWS_REGION")
bedrock_embedding_model = os.getenv("BEDROCK_EMBEDDING_MODEL")
bedrock_generation_model = os.getenv("BEDROCK_GENERATION_MODEL")
if aws_region or bedrock_embedding_model or bedrock_generation_model:
logger.info(
f"Using Bedrock provider: region={aws_region}, "
f"embedding_model={bedrock_embedding_model}, "
f"generation_model={bedrock_generation_model}"
)
return BedrockProvider(
region_name=aws_region,
embedding_model=bedrock_embedding_model,
generation_model=bedrock_generation_model,
aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
)
# 2. Check for OpenAI
openai_api_key = os.getenv("OPENAI_API_KEY")
if openai_api_key:
base_url = os.getenv("OPENAI_BASE_URL")
embedding_model = os.getenv(
"OPENAI_EMBEDDING_MODEL", "text-embedding-3-small"
)
generation_model = os.getenv("OPENAI_GENERATION_MODEL")
logger.info(
f"Using OpenAI provider: base_url={base_url or 'default'}, "
f"embedding_model={embedding_model}, "
f"generation_model={generation_model}"
)
return OpenAIProvider(
api_key=openai_api_key,
base_url=base_url,
embedding_model=embedding_model,
generation_model=generation_model,
)
# 3. Check for Ollama (local LLM)
ollama_url = os.getenv("OLLAMA_BASE_URL")
if ollama_url:
embedding_model = os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text")
generation_model = os.getenv("OLLAMA_GENERATION_MODEL")
verify_ssl = os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true"
logger.info(
f"Using Ollama provider: {ollama_url}, "
f"embedding_model={embedding_model}, "
f"generation_model={generation_model}"
)
return OllamaProvider(
base_url=ollama_url,
embedding_model=embedding_model,
generation_model=generation_model,
verify_ssl=verify_ssl,
)
# 4. Fallback to Simple provider for development/testing
dimension = int(os.getenv("SIMPLE_EMBEDDING_DIMENSION", "384"))
logger.warning(
"No provider configured (AWS_REGION, OPENAI_API_KEY, OLLAMA_BASE_URL not set). "
"Using SimpleProvider for testing/development. "
"For production, configure Bedrock, OpenAI, or Ollama."
)
return SimpleProvider(dimension=dimension)
# Singleton instance
_provider: Provider | None = None
def get_provider() -> Provider:
"""
Get singleton provider instance.
Returns:
Global Provider instance (auto-detected on first call)
"""
global _provider
if _provider is None:
_provider = ProviderRegistry.create_provider()
return _provider
def reset_provider():
"""
Reset singleton provider instance.
Useful for testing or reconfiguration.
"""
global _provider
_provider = None
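The detection order in create_provider can be expressed as a pure function over a mapping, which makes the priority rules easy to test without touching os.environ. This is a sketch only; `detect_provider_name` is a hypothetical helper, and the returned strings are illustrative labels rather than identifiers used by the registry.

```python
def detect_provider_name(env: dict[str, str]) -> str:
    """Mirror the registry's priority order: Bedrock, OpenAI, Ollama, Simple."""
    if (
        env.get("AWS_REGION")
        or env.get("BEDROCK_EMBEDDING_MODEL")
        or env.get("BEDROCK_GENERATION_MODEL")
    ):
        return "bedrock"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"
    return "simple"


if __name__ == "__main__":
    print(detect_provider_name({"OLLAMA_BASE_URL": "http://localhost:11434"}))
    print(detect_provider_name({"AWS_REGION": "us-east-1", "OPENAI_API_KEY": "sk-test"}))
```

One consequence worth noting: AWS_REGION alone selects Bedrock, so an environment that sets it for unrelated AWS tooling will never reach the OpenAI or Ollama branches.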
+149
@@ -0,0 +1,149 @@
"""Simple in-process embedding provider for testing.
This provider uses log-scaled term-frequency (TF) weighting with feature hashing to
generate deterministic embeddings without requiring external services. There is no
IDF component, because no corpus statistics are available. Suitable for testing
but not for production use.
"""
import hashlib
import math
import re
from collections import Counter
from .base import Provider
class SimpleProvider(Provider):
"""Simple deterministic embedding provider using feature hashing.
This implementation:
- Tokenizes text into words
- Uses feature hashing to map words to fixed-size vectors
- Applies log-scaled TF weighting (log(1 + count)), with no IDF term
- Normalizes vectors to unit length
Not suitable for production but good for testing semantic search infrastructure.
Only supports embeddings, not text generation.
"""
def __init__(self, dimension: int = 384):
"""Initialize simple embedding provider.
Args:
dimension: Embedding dimension (default: 384)
"""
self.dimension = dimension
@property
def supports_embeddings(self) -> bool:
"""Whether this provider supports embedding generation."""
return True
@property
def supports_generation(self) -> bool:
"""Whether this provider supports text generation."""
return False
def _tokenize(self, text: str) -> list[str]:
"""Tokenize text into lowercase words.
Args:
text: Input text
Returns:
List of lowercase word tokens
"""
# Simple word tokenization
text = text.lower()
words = re.findall(r"\b\w+\b", text)
return words
def _hash_word(self, word: str) -> int:
"""Hash word to dimension index.
Args:
word: Word to hash
Returns:
Index in range [0, dimension)
"""
hash_bytes = hashlib.md5(word.encode()).digest()
hash_int = int.from_bytes(hash_bytes[:4], byteorder="big")
return hash_int % self.dimension
def _embed_single(self, text: str) -> list[float]:
"""Generate embedding for single text.
Args:
text: Input text
Returns:
Normalized embedding vector
"""
tokens = self._tokenize(text)
if not tokens:
return [0.0] * self.dimension
# Count term frequencies
term_freq = Counter(tokens)
# Initialize vector
vector = [0.0] * self.dimension
# Apply TF weighting with feature hashing
for word, count in term_freq.items():
idx = self._hash_word(word)
# Simple TF weighting: log(1 + count)
vector[idx] += math.log1p(count)
# Normalize to unit length
norm = math.sqrt(sum(x * x for x in vector))
if norm > 0:
vector = [x / norm for x in vector]
return vector
async def embed(self, text: str) -> list[float]:
"""Generate embedding vector for text.
Args:
text: Input text to embed
Returns:
Vector embedding as list of floats
"""
return self._embed_single(text)
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
"""Generate embeddings for multiple texts.
Args:
texts: List of texts to embed
Returns:
List of vector embeddings
"""
return [self._embed_single(text) for text in texts]
def get_dimension(self) -> int:
"""Get embedding dimension.
Returns:
Vector dimension
"""
return self.dimension
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
"""
Generate text from a prompt.
Raises:
NotImplementedError: Simple provider doesn't support text generation
"""
raise NotImplementedError(
"Text generation not supported by Simple provider - use Ollama, Anthropic, or Bedrock"
)
async def close(self) -> None:
"""Close the provider (no-op for simple provider)."""
pass
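The hashing-based embedding above can be tried in isolation. The following is a minimal standalone sketch, not the provider class itself: `hash_embed`, `cosine`, and the default `dimension=384` are hypothetical names and values chosen for illustration. It follows the same steps (lowercase tokenization, MD5 feature hashing, `log1p` term-frequency weighting, L2 normalization) and shows that the dot product of two outputs is their cosine similarity.

```python
import hashlib
import math
import re
from collections import Counter


def hash_embed(text: str, dimension: int = 384) -> list[float]:
    """Feature-hashed TF embedding (dimension=384 is an assumed default)."""
    tokens = re.findall(r"\b\w+\b", text.lower())
    vector = [0.0] * dimension
    if not tokens:
        return vector  # empty input maps to the zero vector
    for word, count in Counter(tokens).items():
        digest = hashlib.md5(word.encode()).digest()
        # The first 4 bytes of the digest select the bucket for this word
        idx = int.from_bytes(digest[:4], byteorder="big") % dimension
        vector[idx] += math.log1p(count)
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else vector


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    a = hash_embed("qdrant hybrid search")
    b = hash_embed("hybrid search with qdrant")
    print(f"similarity: {cosine(a, b):.3f}")
```

Different words can hash to the same bucket, so similarities are approximate; a larger dimension lowers the collision rate at the cost of bigger vectors.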
+8 -14
@@ -1,13 +1,11 @@
"""Search algorithms module for unified multi-algorithm search.
"""Search algorithms module for BM25 hybrid search.
This module provides a unified interface for different search algorithms:
- Semantic search (vector similarity)
- Keyword search (token-based matching)
- Fuzzy search (character overlap)
- Hybrid search (RRF fusion of multiple algorithms)
This module provides BM25 hybrid search combining:
- Dense semantic vectors (vector similarity via embeddings)
- Sparse BM25 vectors (keyword-based retrieval)
All algorithms share the same interface and can be used interchangeably by both
MCP tools and the visualization pane.
Results are fused using Qdrant's native Reciprocal Rank Fusion (RRF) for
optimal relevance across both semantic and keyword queries.
"""
from nextcloud_mcp_server.search.algorithms import (
@@ -16,9 +14,7 @@ from nextcloud_mcp_server.search.algorithms import (
SearchResult,
get_indexed_doc_types,
)
from nextcloud_mcp_server.search.fuzzy import FuzzySearchAlgorithm
from nextcloud_mcp_server.search.hybrid import HybridSearchAlgorithm
from nextcloud_mcp_server.search.keyword import KeywordSearchAlgorithm
from nextcloud_mcp_server.search.bm25_hybrid import BM25HybridSearchAlgorithm
from nextcloud_mcp_server.search.semantic import SemanticSearchAlgorithm
__all__ = [
@@ -27,7 +23,5 @@ __all__ = [
"SearchResult",
"get_indexed_doc_types",
"SemanticSearchAlgorithm",
"KeywordSearchAlgorithm",
"FuzzySearchAlgorithm",
"HybridSearchAlgorithm",
"BM25HybridSearchAlgorithm",
]
+40 -8
@@ -83,6 +83,7 @@ async def get_indexed_doc_types(user_id: str) -> set[str]:
from qdrant_client.models import FieldCondition, Filter, MatchValue
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.vector.placeholder import get_placeholder_filter
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
logger = logging.getLogger(__name__)
@@ -97,15 +98,18 @@ async def get_indexed_doc_types(user_id: str) -> set[str]:
scroll_results, _next_offset = await qdrant_client.scroll(
collection_name=collection,
scroll_filter=Filter(
must=[FieldCondition(key="user_id", match=MatchValue(value=user_id))]
must=[
get_placeholder_filter(), # Exclude placeholders from doc_type discovery
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
]
),
limit=1000, # Sample size to discover types
with_payload=["doc_type"],
with_vectors=False, # Don't need vectors for type discovery
)
doc_types = {
point.payload.get("doc_type")
doc_types: set[str] = {
str(point.payload.get("doc_type"))
for point in scroll_results
if point.payload.get("doc_type")
}
@@ -123,12 +127,20 @@ class SearchResult:
"""A single search result with metadata and score.
Attributes:
id: Document ID
id: Document ID (int for all document types)
doc_type: Document type (note, file, calendar, contact, etc.)
title: Document title
excerpt: Content excerpt showing match context
score: Relevance score (0.0-1.0, higher is better)
score: Relevance score (≥ 0.0, higher is better)
- RRF fusion: scores in [0.0, 1.0]
- DBSF fusion: scores can exceed 1.0 (sum of normalized scores)
metadata: Additional algorithm-specific metadata
chunk_start_offset: Character position where chunk starts (None if not available)
chunk_end_offset: Character position where chunk ends (None if not available)
page_number: Page number for PDF documents (None for other doc types)
chunk_index: Zero-based index of this chunk in the document
total_chunks: Total number of chunks in the document
point_id: Qdrant point ID for batch vector retrieval (None if not from Qdrant)
"""
id: int
@@ -137,11 +149,24 @@ class SearchResult:
excerpt: str
score: float
metadata: dict[str, Any] | None = None
chunk_start_offset: int | None = None
chunk_end_offset: int | None = None
page_number: int | None = None
chunk_index: int = 0
total_chunks: int = 1
point_id: str | None = None
def __post_init__(self):
"""Validate score is in valid range."""
if not 0.0 <= self.score <= 1.0:
raise ValueError(f"Score must be between 0.0 and 1.0, got {self.score}")
"""Validate score is non-negative.
Note: Different fusion methods produce different score ranges:
- RRF (Reciprocal Rank Fusion): Bounded to [0.0, 1.0]
- DBSF (Distribution-Based Score Fusion): Unbounded (can exceed 1.0)
DBSF sums normalized scores from multiple systems, so scores can be
1.5, 2.0, etc. when multiple systems agree a document is highly relevant.
"""
if self.score < 0.0:
raise ValueError(f"Score must be non-negative, got {self.score}")
class SearchAlgorithm(ABC):
@@ -149,8 +174,15 @@ class SearchAlgorithm(ABC):
All search algorithms must implement the search() method with consistent
interface, allowing them to be used interchangeably.
Attributes:
query_embedding: The query embedding generated during the last search.
Available after search() completes for algorithms that use embeddings.
Can be reused by callers to avoid redundant embedding generation.
"""
query_embedding: list[float] | None = None
@abstractmethod
async def search(
self,
+255
@@ -0,0 +1,255 @@
"""BM25 hybrid search algorithm using Qdrant native RRF fusion."""
import logging
from typing import Any
from qdrant_client import models
from qdrant_client.models import FieldCondition, Filter, MatchValue
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.embedding import get_bm25_service, get_embedding_service
from nextcloud_mcp_server.observability.metrics import record_qdrant_operation
from nextcloud_mcp_server.observability.tracing import trace_operation
from nextcloud_mcp_server.search.algorithms import SearchAlgorithm, SearchResult
from nextcloud_mcp_server.vector.placeholder import get_placeholder_filter
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
logger = logging.getLogger(__name__)
class BM25HybridSearchAlgorithm(SearchAlgorithm):
"""
Hybrid search combining dense semantic vectors with BM25 sparse vectors.
Uses Qdrant's native Reciprocal Rank Fusion (RRF) to automatically merge
results from both dense (semantic) and sparse (BM25 keyword) searches.
This provides the best of both worlds: semantic understanding for conceptual
queries and precise keyword matching for specific terms, acronyms, and codes.
The fusion happens efficiently in the database using the prefetch mechanism,
eliminating the need for application-layer result merging.
"""
def __init__(self, score_threshold: float = 0.0, fusion: str = "rrf"):
"""
Initialize BM25 hybrid search algorithm.
Args:
score_threshold: Minimum fusion score (default: 0.0 to allow fusion scoring)
Note: RRF scores fall in [0.0, 1.0]; DBSF scores can exceed 1.0
fusion: Fusion algorithm to use: "rrf" (Reciprocal Rank Fusion, default)
or "dbsf" (Distribution-Based Score Fusion)
Raises:
ValueError: If fusion is not "rrf" or "dbsf"
"""
if fusion not in ("rrf", "dbsf"):
raise ValueError(
f"Invalid fusion algorithm '{fusion}'. Must be 'rrf' or 'dbsf'"
)
self.score_threshold = score_threshold
self.fusion = models.Fusion.RRF if fusion == "rrf" else models.Fusion.DBSF
self.fusion_name = fusion
@property
def name(self) -> str:
return "bm25_hybrid"
@property
def requires_vector_db(self) -> bool:
return True
async def search(
self,
query: str,
user_id: str,
limit: int = 10,
doc_type: str | None = None,
**kwargs: Any,
) -> list[SearchResult]:
"""
Execute hybrid search using dense + sparse vectors with Qdrant-native fusion (RRF or DBSF).
Returns unverified results from Qdrant. Access verification should be
performed separately at the final output stage using verify_search_results().
Deduplicates by (doc_id, doc_type, chunk_start_offset, chunk_end_offset)
to show multiple chunks from the same document while avoiding duplicate chunks.
Args:
query: Natural language or keyword search query
user_id: User ID for filtering
limit: Maximum results to return
doc_type: Optional document type filter
**kwargs: Additional parameters (score_threshold override)
Returns:
List of unverified SearchResult objects ranked by fusion score (RRF or DBSF)
Raises:
McpError: If vector sync is not enabled or search fails
"""
settings = get_settings()
score_threshold = kwargs.get("score_threshold", self.score_threshold)
logger.info(
f"BM25 hybrid search: query='{query}', user={user_id}, "
f"limit={limit}, score_threshold={score_threshold}, doc_type={doc_type}, "
f"fusion={self.fusion_name}"
)
# Generate dense embedding for semantic search
with trace_operation("search.get_embedding_service"):
embedding_service = get_embedding_service()
with trace_operation("search.dense_embedding"):
dense_embedding = await embedding_service.embed(query)
# Store for reuse by callers (e.g., viz_routes PCA visualization)
self.query_embedding = dense_embedding
logger.debug(f"Generated dense embedding (dimension={len(dense_embedding)})")
# Generate sparse embedding for BM25 keyword search
with trace_operation("search.get_bm25_service"):
bm25_service = get_bm25_service()
with trace_operation("search.sparse_embedding_bm25"):
sparse_embedding = await bm25_service.encode_async(query)
logger.debug(
f"Generated sparse embedding "
f"({len(sparse_embedding['indices'])} non-zero terms)"
)
# Build Qdrant filter
filter_conditions = [
get_placeholder_filter(), # Always exclude placeholders from user-facing queries
FieldCondition(
key="user_id",
match=MatchValue(value=user_id),
),
]
# Add doc_type filter if specified
if doc_type:
filter_conditions.append(
FieldCondition(
key="doc_type",
match=MatchValue(value=doc_type),
)
)
query_filter = Filter(must=filter_conditions)
# Execute hybrid search with Qdrant native fusion (RRF or DBSF)
with trace_operation("search.get_qdrant_client"):
qdrant_client = await get_qdrant_client()
try:
# Use prefetch to run both dense and sparse searches
# Qdrant merges the prefetched results using the configured fusion (RRF or DBSF)
with trace_operation(
"search.qdrant_query",
attributes={"query.limit": limit * 2, "query.fusion": self.fusion_name},
):
search_response = await qdrant_client.query_points(
collection_name=settings.get_collection_name(),
prefetch=[
# Dense semantic search
models.Prefetch(
query=dense_embedding,
using="dense",
limit=limit * 2, # Get extra for deduplication
filter=query_filter,
),
# Sparse BM25 search
models.Prefetch(
query=models.SparseVector(
indices=sparse_embedding["indices"],
values=sparse_embedding["values"],
),
using="sparse",
limit=limit * 2, # Get extra for deduplication
filter=query_filter,
),
],
# Fusion query (RRF or DBSF based on initialization)
query=models.FusionQuery(fusion=self.fusion),
limit=limit * 2, # Get extra for deduplication
score_threshold=score_threshold,
with_payload=True,
with_vectors=False, # Don't return vectors to save bandwidth
)
record_qdrant_operation("search", "success")
except Exception:
record_qdrant_operation("search", "error")
raise
logger.info(
f"Qdrant {self.fusion_name.upper()} fusion returned {len(search_response.points)} results "
f"(before deduplication)"
)
if search_response.points:
# Log top 3 fusion scores to help with threshold tuning
top_scores = [p.score for p in search_response.points[:3]]
logger.debug(
f"Top 3 {self.fusion_name.upper()} fusion scores: {top_scores}"
)
# Deduplicate by (doc_id, doc_type, chunk_start, chunk_end)
# This allows multiple chunks from same doc, but removes duplicate chunks
with trace_operation(
"search.deduplicate",
attributes={"dedupe.num_points": len(search_response.points)},
):
seen_chunks = set()
results = []
for result in search_response.points:
if result.payload is None:
continue
# doc_id is expected to be an int for all document types (see SearchResult.id)
doc_id = result.payload["doc_id"]
doc_type = result.payload.get("doc_type", "note")
chunk_start = result.payload.get("chunk_start_offset")
chunk_end = result.payload.get("chunk_end_offset")
chunk_key = (doc_id, doc_type, chunk_start, chunk_end)
# Skip if we've already seen this exact chunk
if chunk_key in seen_chunks:
continue
seen_chunks.add(chunk_key)
# Return unverified results (verification happens at output stage)
results.append(
SearchResult(
id=doc_id,
doc_type=doc_type,
title=result.payload.get("title", "Untitled"),
excerpt=result.payload.get("excerpt", ""),
score=result.score, # Fusion score (RRF or DBSF)
metadata={
"chunk_index": result.payload.get("chunk_index"),
"total_chunks": result.payload.get("total_chunks"),
"search_method": f"bm25_hybrid_{self.fusion_name}",
},
chunk_start_offset=result.payload.get("chunk_start_offset"),
chunk_end_offset=result.payload.get("chunk_end_offset"),
page_number=result.payload.get("page_number"),
chunk_index=result.payload.get("chunk_index", 0),
total_chunks=result.payload.get("total_chunks", 1),
point_id=str(result.id), # Qdrant point ID for batch retrieval
)
)
if len(results) >= limit:
break
logger.info(f"Returning {len(results)} unverified results after deduplication")
if results:
result_details = [
f"{r.doc_type}_{r.id} (score={r.score:.3f}, title='{r.title}')"
for r in results[:5] # Show top 5
]
logger.debug(f"Top results: {', '.join(result_details)}")
return results
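To make the fusion step less abstract, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It uses the textbook formulation, `score(d) = sum(1 / (k + rank))` over every ranking that contains `d`, with the commonly used `k = 60`. This is an assumption for illustration only: the function name `rrf_fuse` and the sample IDs are hypothetical, and Qdrant's internal constant and rank origin may differ from what is shown.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists; k=60 is a conventional default, not Qdrant's documented value."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1)
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the dense and sparse prefetches
dense = ["note_1", "note_2", "file_7"]
sparse = ["file_7", "note_1", "note_9"]
fused = rrf_fuse([dense, sparse])

if __name__ == "__main__":
    for doc_id, score in fused:
        print(f"{doc_id}: {score:.4f}")
```

Only ranks enter the formula, so the dense and sparse scores never need to be on the same scale. Documents found by both searches (`note_1`, `file_7`) outrank those found by only one.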
+598
@@ -0,0 +1,598 @@
"""Context expansion for search results.
Provides utilities to expand matched chunks with surrounding context and
position markers for better visualization and understanding of search results.
"""
import logging
from dataclasses import dataclass
from nextcloud_mcp_server.client import NextcloudClient
logger = logging.getLogger(__name__)
async def _get_chunk_from_qdrant(
user_id: str, doc_id: int, doc_type: str, chunk_start: int, chunk_end: int
) -> str | None:
"""Retrieve full chunk text from Qdrant payload.
This avoids re-fetching and re-parsing documents by using the cached
chunk content already stored in Qdrant.
Args:
user_id: User ID who owns the document
doc_id: Document ID
doc_type: Document type (e.g., "note", "file")
chunk_start: Character offset where chunk starts
chunk_end: Character offset where chunk ends
Returns:
Full chunk text from Qdrant excerpt field, or None if not found
"""
try:
from qdrant_client.models import FieldCondition, Filter, MatchValue
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
qdrant_client = await get_qdrant_client()
settings = get_settings()
# Query for the specific chunk
scroll_result = await qdrant_client.scroll(
collection_name=settings.get_collection_name(),
scroll_filter=Filter(
must=[
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
FieldCondition(key="doc_id", match=MatchValue(value=doc_id)),
FieldCondition(key="doc_type", match=MatchValue(value=doc_type)),
FieldCondition(
key="chunk_start_offset", match=MatchValue(value=chunk_start)
),
FieldCondition(
key="chunk_end_offset", match=MatchValue(value=chunk_end)
),
]
),
limit=1,
with_payload=["excerpt"],
with_vectors=False,
)
if scroll_result[0]:
point = scroll_result[0][0]
excerpt = point.payload.get("excerpt")
if excerpt:
logger.debug(
f"Retrieved chunk from Qdrant for {doc_type} {doc_id}: "
f"{len(excerpt)} chars"
)
return str(excerpt)
logger.debug(
f"Chunk not found in Qdrant for {doc_type} {doc_id}, "
f"chunk [{chunk_start}:{chunk_end}]. Will fall back to document fetch."
)
return None
except Exception as e:
logger.error(
f"Error querying Qdrant for chunk: {e}. Falling back to document fetch.",
exc_info=True,
)
return None
async def _get_chunk_by_index_from_qdrant(
user_id: str, doc_id: int, doc_type: str, chunk_index: int
) -> str | None:
"""Retrieve chunk text by chunk_index from Qdrant payload.
Used to fetch adjacent chunks for context expansion.
Args:
user_id: User ID who owns the document
doc_id: Document ID
doc_type: Document type (e.g., "note", "file")
chunk_index: Zero-based chunk index in document
Returns:
Full chunk text from Qdrant excerpt field, or None if not found
"""
try:
from qdrant_client.models import FieldCondition, Filter, MatchValue
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
qdrant_client = await get_qdrant_client()
settings = get_settings()
# Query for chunk by index
scroll_result = await qdrant_client.scroll(
collection_name=settings.get_collection_name(),
scroll_filter=Filter(
must=[
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
FieldCondition(key="doc_id", match=MatchValue(value=doc_id)),
FieldCondition(key="doc_type", match=MatchValue(value=doc_type)),
FieldCondition(
key="chunk_index", match=MatchValue(value=chunk_index)
),
]
),
limit=1,
with_payload=["excerpt"],
with_vectors=False,
)
if scroll_result[0]:
point = scroll_result[0][0]
excerpt = point.payload.get("excerpt")
if excerpt:
logger.debug(
f"Retrieved adjacent chunk {chunk_index} from Qdrant for "
f"{doc_type} {doc_id}: {len(excerpt)} chars"
)
return str(excerpt)
return None
except Exception as e:
logger.debug(
f"Could not retrieve adjacent chunk {chunk_index} for "
f"{doc_type} {doc_id}: {e}"
)
return None
async def _get_file_path_from_qdrant(
user_id: str, file_id: int, chunk_start: int, chunk_end: int
) -> str | None:
"""Resolve file_id to file_path by querying Qdrant payload.
Args:
user_id: User ID who owns the file
file_id: Numeric file ID
chunk_start: Character offset where chunk starts
chunk_end: Character offset where chunk ends
Returns:
File path string, or None if not found in Qdrant
"""
try:
from qdrant_client.models import FieldCondition, Filter, MatchValue
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
qdrant_client = await get_qdrant_client()
settings = get_settings()
# Query for the specific chunk
scroll_result = await qdrant_client.scroll(
collection_name=settings.get_collection_name(),
scroll_filter=Filter(
must=[
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
FieldCondition(key="doc_id", match=MatchValue(value=file_id)),
FieldCondition(key="doc_type", match=MatchValue(value="file")),
FieldCondition(
key="chunk_start_offset", match=MatchValue(value=chunk_start)
),
FieldCondition(
key="chunk_end_offset", match=MatchValue(value=chunk_end)
),
]
),
limit=1,
with_payload=["file_path"],
with_vectors=False,
)
if scroll_result[0]:
point = scroll_result[0][0]
file_path = point.payload.get("file_path")
if file_path:
logger.debug(f"Resolved file_id {file_id} to file_path {file_path}")
return str(file_path)
logger.warning(
f"Could not find file_path in Qdrant for file_id {file_id}, "
f"chunk [{chunk_start}:{chunk_end}]"
)
return None
except Exception as e:
logger.error(f"Error querying Qdrant for file_path: {e}", exc_info=True)
return None
@dataclass
class ChunkContext:
"""Expanded chunk with surrounding context and position markers.
Attributes:
chunk_text: The matched chunk text
before_context: Text before the chunk (up to context_chars)
after_context: Text after the chunk (up to context_chars)
chunk_start_offset: Character position where chunk starts in document
chunk_end_offset: Character position where chunk ends in document
page_number: Page number for PDFs (None for other doc types)
chunk_index: Zero-based chunk index (N in "chunk N of M")
total_chunks: Total number of chunks in document
marked_text: Full text with position markers around the chunk
has_before_truncation: True if before_context was truncated
has_after_truncation: True if after_context was truncated
"""
chunk_text: str
before_context: str
after_context: str
chunk_start_offset: int
chunk_end_offset: int
page_number: int | None
chunk_index: int
total_chunks: int
marked_text: str
has_before_truncation: bool
has_after_truncation: bool
async def get_chunk_with_context(
nc_client: NextcloudClient,
user_id: str,
doc_id: str | int,
doc_type: str,
chunk_start: int,
chunk_end: int,
page_number: int | None = None,
chunk_index: int = 0,
total_chunks: int = 1,
context_chars: int = 300,
) -> ChunkContext | None:
"""Fetch chunk with surrounding context.
First tries to retrieve the chunk from Qdrant (fast, cached). If that fails
(e.g., legacy data with truncated excerpts), falls back to fetching and
parsing the full document (slower, for PDFs especially).
Args:
nc_client: Authenticated Nextcloud client
user_id: User ID who owns the document
doc_id: Document ID (int for notes/files)
doc_type: Type of document ("note", "file", etc.)
chunk_start: Character offset where chunk starts
chunk_end: Character offset where chunk ends
page_number: Optional page number for PDFs
chunk_index: Zero-based chunk index in document
total_chunks: Total number of chunks in document
context_chars: Number of characters to include before/after chunk
Returns:
ChunkContext with expanded context and markers, or None if document
cannot be retrieved
"""
# Convert doc_id to int for Qdrant query
doc_id_int = (
int(doc_id)
if isinstance(doc_id, str) and doc_id.isdigit()
else (doc_id if isinstance(doc_id, int) else None)
)
# Try to get chunk from Qdrant first (fast path)
if doc_id_int is not None:
chunk_text = await _get_chunk_from_qdrant(
user_id, doc_id_int, doc_type, chunk_start, chunk_end
)
if chunk_text:
logger.info(
f"Retrieved chunk from Qdrant cache for {doc_type} {doc_id} "
f"(avoids document re-fetch/re-parse)"
)
# Fetch adjacent chunks for context expansion
# Get chunk overlap from config to remove duplicate text
from nextcloud_mcp_server.config import get_settings
settings = get_settings()
chunk_overlap = settings.document_chunk_overlap
before_context = ""
after_context = ""
has_before_truncation = False
has_after_truncation = False
# Fetch previous chunk if not first chunk
if chunk_index > 0:
before_chunk = await _get_chunk_by_index_from_qdrant(
user_id, doc_id_int, doc_type, chunk_index - 1
)
if before_chunk:
# Remove overlap: the last chunk_overlap chars of previous chunk
# overlap with the first chunk_overlap chars of current chunk
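before_context = (
# Slice by length so that chunk_overlap == 0 keeps the whole chunk
# (before_chunk[:-0] would evaluate to an empty string)
before_chunk[: len(before_chunk) - chunk_overlap]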
if len(before_chunk) > chunk_overlap
else ""
)
# Truncate if requested context_chars < remaining length
if before_context and len(before_context) > context_chars:
before_context = before_context[-context_chars:]
has_before_truncation = True
else:
# Could not fetch previous chunk, but we're not at start
has_before_truncation = True
# Fetch next chunk if not last chunk
if chunk_index < total_chunks - 1:
after_chunk = await _get_chunk_by_index_from_qdrant(
user_id, doc_id_int, doc_type, chunk_index + 1
)
if after_chunk:
# Remove overlap: the first chunk_overlap chars of next chunk
# overlap with the last chunk_overlap chars of current chunk
after_context = (
after_chunk[chunk_overlap:]
if len(after_chunk) > chunk_overlap
else ""
)
# Truncate if requested context_chars < remaining length
if after_context and len(after_context) > context_chars:
after_context = after_context[:context_chars]
has_after_truncation = True
else:
# Could not fetch next chunk, but we're not at end
has_after_truncation = True
marked_text = _insert_position_markers(
before_context=before_context,
chunk_text=chunk_text,
after_context=after_context,
page_number=page_number,
chunk_index=chunk_index,
total_chunks=total_chunks,
has_before_truncation=has_before_truncation,
has_after_truncation=has_after_truncation,
)
return ChunkContext(
chunk_text=chunk_text,
before_context=before_context,
after_context=after_context,
chunk_start_offset=chunk_start,
chunk_end_offset=chunk_end,
page_number=page_number,
chunk_index=chunk_index,
total_chunks=total_chunks,
marked_text=marked_text,
has_before_truncation=has_before_truncation,
has_after_truncation=has_after_truncation,
)
# Fallback: Fetch full document and extract chunk with context
# This path is taken for:
# 1. Legacy data with truncated excerpts in Qdrant
# 2. Failed Qdrant queries
logger.info(
f"Falling back to document fetch for {doc_type} {doc_id} "
f"(Qdrant cache miss, possibly legacy data)"
)
# For files, retrieve file_path from Qdrant payload
resolved_doc_id = doc_id
if doc_type == "file" and isinstance(doc_id, int):
file_path = await _get_file_path_from_qdrant(
user_id, doc_id, chunk_start, chunk_end
)
if not file_path:
logger.warning(
f"Could not resolve file_id {doc_id} to file_path from Qdrant"
)
return None
resolved_doc_id = file_path
logger.debug(f"Resolved file_id {doc_id} to file_path {file_path}")
# Fetch full document text
full_text = await _fetch_document_text(nc_client, resolved_doc_id, doc_type)
if full_text is None:
logger.warning(
f"Could not fetch document text for {doc_type} {doc_id}, "
"skipping context expansion"
)
return None
# Validate offsets
if chunk_start < 0 or chunk_end > len(full_text) or chunk_start >= chunk_end:
logger.warning(
f"Invalid chunk offsets for {doc_type} {doc_id}: "
f"start={chunk_start}, end={chunk_end}, doc_len={len(full_text)}"
)
return None
# Extract chunk text
chunk_text = full_text[chunk_start:chunk_end]
# Calculate context boundaries
context_start = max(0, chunk_start - context_chars)
context_end = min(len(full_text), chunk_end + context_chars)
# Extract context
before_context = full_text[context_start:chunk_start]
after_context = full_text[chunk_end:context_end]
# Check for truncation
has_before_truncation = context_start > 0
has_after_truncation = context_end < len(full_text)
# Create marked text with position markers
marked_text = _insert_position_markers(
before_context=before_context,
chunk_text=chunk_text,
after_context=after_context,
page_number=page_number,
chunk_index=chunk_index,
total_chunks=total_chunks,
has_before_truncation=has_before_truncation,
has_after_truncation=has_after_truncation,
)
return ChunkContext(
chunk_text=chunk_text,
before_context=before_context,
after_context=after_context,
chunk_start_offset=chunk_start,
chunk_end_offset=chunk_end,
page_number=page_number,
chunk_index=chunk_index,
total_chunks=total_chunks,
marked_text=marked_text,
has_before_truncation=has_before_truncation,
has_after_truncation=has_after_truncation,
)
async def _fetch_document_text(
nc_client: NextcloudClient, doc_id: str | int, doc_type: str
) -> str | None:
"""Fetch full text content of a document.
Args:
nc_client: Authenticated Nextcloud client
doc_id: Document ID (note ID or file path)
doc_type: Type of document ("note", "file", etc.)
Returns:
Full document text, or None if document cannot be retrieved
"""
try:
if doc_type == "note":
# Fetch note by ID
note = await nc_client.notes.get_note(note_id=int(doc_id))
# Reconstruct full content as indexed: title + "\n\n" + content
# This ensures chunk offsets align with indexed content structure
title = note.get("title", "")
content = note.get("content", "")
return f"{title}\n\n{content}"
elif doc_type == "file":
# Fetch file content via WebDAV
try:
file_path = str(doc_id)
file_content, content_type = await nc_client.webdav.read_file(file_path)
# Check if it's a PDF (by content type or file extension)
is_pdf = (
content_type and "pdf" in content_type.lower()
) or file_path.lower().endswith(".pdf")
if is_pdf:
# Extract text from PDF using PyMuPDF
# IMPORTANT: Use pymupdf4llm.to_markdown() to match indexing extraction
# This ensures character offsets align between indexed chunks and retrieval
import pymupdf
import pymupdf4llm
logger.debug(f"Extracting text from PDF: {file_path}")
pdf_doc = pymupdf.open(stream=file_content, filetype="pdf")
text_parts = []
# Extract each page as markdown (same as indexing)
for page_num in range(pdf_doc.page_count):
page_md = pymupdf4llm.to_markdown(
pdf_doc,
pages=[page_num],
write_images=False, # Don't need images for context
page_chunks=False,
)
text_parts.append(page_md)
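# Capture the page count before closing; a closed document can no longer be queried
page_count = pdf_doc.page_count
pdf_doc.close()
# Join pages (no separator - matches indexing)
full_text = "".join(text_parts)
logger.debug(
f"Extracted {len(full_text)} characters from "
f"{page_count} pages in {file_path}"
)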
return full_text
else:
# Assume it's a text file, decode to string
logger.debug(f"Decoding text file: {file_path}")
return file_content.decode("utf-8", errors="replace")
except Exception as e:
logger.error(
f"Error fetching file content for {doc_id}: {e}", exc_info=True
)
return None
else:
logger.warning(f"Unsupported doc_type for context expansion: {doc_type}")
return None
except Exception as e:
logger.error(f"Error fetching document {doc_type} {doc_id}: {e}", exc_info=True)
return None
def _insert_position_markers(
before_context: str,
chunk_text: str,
after_context: str,
page_number: int | None,
chunk_index: int,
total_chunks: int,
has_before_truncation: bool,
has_after_truncation: bool,
) -> str:
"""Insert position markers around matched chunk.
Creates markdown-formatted text with visual markers indicating chunk
boundaries and metadata.
Args:
before_context: Text before chunk
chunk_text: The matched chunk
after_context: Text after chunk
page_number: Optional page number
chunk_index: Zero-based chunk index
total_chunks: Total chunks in document
has_before_truncation: Whether before_context is truncated
has_after_truncation: Whether after_context is truncated
Returns:
Formatted text with position markers
"""
# Build position metadata
position_parts = []
if page_number is not None:
position_parts.append(f"Page {page_number}")
position_parts.append(f"Chunk {chunk_index + 1} of {total_chunks}")
position_metadata = ", ".join(position_parts)
# Build marked text
parts = []
# Add truncation indicator for before context
if has_before_truncation:
parts.append("**[...]**\n\n")
# Add before context if present
if before_context:
parts.append(before_context)
# Add chunk start marker
parts.append(f"\n\n🔍 **MATCHED CHUNK START** ({position_metadata})\n\n")
# Add chunk text
parts.append(chunk_text)
# Add chunk end marker
parts.append("\n\n🔍 **MATCHED CHUNK END**\n\n")
# Add after context if present
if after_context:
parts.append(after_context)
# Add truncation indicator for after context
if has_after_truncation:
parts.append("\n\n**[...]**")
return "".join(parts)
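The overlap handling in the Qdrant fast path can be exercised on its own. The sketch below is an illustrative reconstruction rather than the module's code: `trim_before` and `trim_after` are hypothetical helpers, and the sample chunks assume a fixed chunk size of 10 characters with an overlap of 2. It removes the text shared with the matched chunk from each neighbor, then truncates to `context_chars` and reports whether truncation happened.

```python
def trim_before(prev_chunk: str, overlap: int, context_chars: int) -> tuple[str, bool]:
    """Return (context, truncated) taken from the chunk preceding the match."""
    if len(prev_chunk) <= overlap:
        return "", False
    # Drop the tail that is repeated at the start of the matched chunk.
    # Slicing by length keeps the whole chunk when overlap is 0.
    context = prev_chunk[: len(prev_chunk) - overlap]
    if len(context) > context_chars:
        return context[-context_chars:], True  # keep the part nearest the match
    return context, False


def trim_after(next_chunk: str, overlap: int, context_chars: int) -> tuple[str, bool]:
    """Return (context, truncated) taken from the chunk following the match."""
    if len(next_chunk) <= overlap:
        return "", False
    # Drop the head that is repeated at the end of the matched chunk
    context = next_chunk[overlap:]
    if len(context) > context_chars:
        return context[:context_chars], True
    return context, False


# Assumed layout: 10-character chunks with a 2-character overlap
document = "abcdefghijklmnopqrstuvwxyz"
prev_chunk, matched, next_chunk = document[0:10], document[8:18], document[16:26]

before, before_cut = trim_before(prev_chunk, overlap=2, context_chars=5)
after, after_cut = trim_after(next_chunk, overlap=2, context_chars=5)

if __name__ == "__main__":
    print(before + "[" + matched + "]" + after)
```

With these inputs the stitched text is a contiguous slice of the original document, which is the property the overlap removal is meant to preserve.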
-219
@@ -1,219 +0,0 @@
"""Fuzzy search algorithm using character overlap matching on Qdrant payload."""
import logging
from typing import Any
from qdrant_client.models import FieldCondition, Filter, MatchValue
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.search.algorithms import SearchAlgorithm, SearchResult
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
logger = logging.getLogger(__name__)
class FuzzySearchAlgorithm(SearchAlgorithm):
"""Fuzzy search using simple character-based similarity.
Implements character overlap matching with configurable threshold:
- Compares character sets between query and text
- Requires configurable % character overlap to match (default: 70%)
- Tolerant to typos and minor variations
"""
def __init__(self, threshold: float = 0.7):
"""Initialize fuzzy search algorithm.
Args:
threshold: Minimum character overlap ratio (0-1, default: 0.7)
"""
if not 0.0 <= threshold <= 1.0:
raise ValueError(f"Threshold must be between 0.0 and 1.0, got {threshold}")
self.threshold = threshold
@property
def name(self) -> str:
return "fuzzy"
async def search(
self,
query: str,
user_id: str,
limit: int = 10,
doc_type: str | None = None,
**kwargs: Any,
) -> list[SearchResult]:
"""Execute fuzzy search using character overlap on Qdrant payload.
Queries Qdrant for all indexed documents, then scores based on character
overlap in title and excerpt fields. Returns unverified results - access
verification should be performed separately at the final output stage.
Args:
query: Search query
user_id: User ID for filtering
limit: Maximum results to return
doc_type: Optional document type filter (None = all types)
**kwargs: Additional parameters (threshold override)
Returns:
List of unverified SearchResult objects ranked by character overlap score
"""
settings = get_settings()
threshold = kwargs.get("threshold", self.threshold)
logger.info(
f"Fuzzy search: query='{query}', user={user_id}, "
f"limit={limit}, threshold={threshold}, doc_type={doc_type}"
)
# Build Qdrant filter
filter_conditions = [
FieldCondition(key="user_id", match=MatchValue(value=user_id))
]
if doc_type:
filter_conditions.append(
FieldCondition(key="doc_type", match=MatchValue(value=doc_type))
)
# Scroll through Qdrant to get all matching documents
qdrant_client = await get_qdrant_client()
collection = settings.get_collection_name()
all_points = []
offset = None
# Scroll through all points matching filter
while True:
scroll_result, next_offset = await qdrant_client.scroll(
collection_name=collection,
scroll_filter=Filter(must=filter_conditions),
limit=100, # Batch size
offset=offset,
with_payload=["doc_id", "doc_type", "title", "excerpt", "chunk_index"],
with_vectors=False, # Don't need vectors
)
all_points.extend(scroll_result)
if next_offset is None:
break
offset = next_offset
logger.debug(f"Retrieved {len(all_points)} points from Qdrant for fuzzy search")
# Deduplicate by (doc_id, doc_type) - keep first chunk
seen_docs = {}
for point in all_points:
doc_id = int(point.payload["doc_id"])
dtype = point.payload.get("doc_type", "note")
doc_key = (doc_id, dtype)
chunk_idx = point.payload.get("chunk_index", 0)
if doc_key not in seen_docs or chunk_idx == 0:
seen_docs[doc_key] = point
logger.debug(f"Deduplicated to {len(seen_docs)} unique documents")
# Score each document based on fuzzy matches
scored_results = []
query_lower = query.lower()
for doc_key, point in seen_docs.items():
doc_id, dtype = doc_key
title = point.payload.get("title", "")
excerpt = point.payload.get("excerpt", "")
# Check title match
title_score = self._calculate_char_overlap(query_lower, title.lower())
# Check excerpt match
excerpt_score = self._calculate_char_overlap(query_lower, excerpt.lower())
# Use best score
best_score = max(title_score, excerpt_score)
if best_score >= threshold:
match_location = "title" if title_score >= excerpt_score else "excerpt"
scored_results.append(
{
"doc_id": doc_id,
"doc_type": dtype,
"title": title,
"excerpt": excerpt
if excerpt_score >= title_score
else f"Title match: {title}",
"score": best_score,
"match_location": match_location,
}
)
# Sort by score (descending) and limit
scored_results.sort(key=lambda x: x["score"], reverse=True)
top_results = scored_results[:limit]
# Return unverified results (verification happens at output stage)
final_results = []
for result in top_results:
final_results.append(
SearchResult(
id=result["doc_id"],
doc_type=result["doc_type"],
title=result["title"],
excerpt=result["excerpt"],
score=result["score"],
metadata={"match_location": result["match_location"]},
)
)
logger.info(f"Fuzzy search returned {len(final_results)} unverified results")
if final_results:
result_details = [
f"{r.doc_type}_{r.id} (score={r.score:.3f}, title='{r.title}')"
for r in final_results[:5]
]
logger.debug(f"Top fuzzy results: {', '.join(result_details)}")
return final_results
def _calculate_char_overlap(self, query: str, text: str) -> float:
"""Calculate the fraction of distinct query characters that also appear in text.
Order and repetition are ignored, so an anagram of the query scores 1.0.
Args:
query: Query string (normalized)
text: Text to compare (normalized)
Returns:
Overlap ratio (0.0-1.0)
"""
if not query or not text:
return 0.0
# Convert to character sets
query_chars = set(query)
text_chars = set(text)
# Calculate overlap
overlap = query_chars & text_chars
overlap_ratio = len(overlap) / len(query_chars)
return overlap_ratio
def _extract_excerpt(self, content: str, max_length: int = 200) -> str:
"""Extract excerpt from content.
Args:
content: Full document content
max_length: Maximum excerpt length
Returns:
Excerpt string
"""
if not content:
return ""
excerpt = content[:max_length].strip()
if len(content) > max_length:
excerpt += "..."
return excerpt
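The character-overlap score used by the fuzzy algorithm can be illustrated in isolation. The sketch below is a standalone re-statement of the set-based ratio (the free function `char_overlap` is a hypothetical name, not part of the module). It shows that the measure tolerates typos, but also that it ignores character order and that longer texts tend to match almost any query.

```python
def char_overlap(query: str, text: str) -> float:
    """Fraction of distinct query characters that also occur in text."""
    if not query or not text:
        return 0.0
    query_chars = set(query)
    return len(query_chars & set(text)) / len(query_chars)


# A typo keeps most characters, so the score stays high.
typo_score = char_overlap("nextclod", "nextcloud notes")

# An anagram shares every character, so it scores a perfect match
# even though the two words are unrelated.
anagram_score = char_overlap("listen", "silent")

# A long text covers most of the alphabet, so unrelated queries pass too.
pangram_score = char_overlap("zebra", "the quick brown fox jumps over the lazy dog")

# Only half of the query's distinct characters occur in the text.
partial_score = char_overlap("abcd", "ab")
```

With the default threshold of 0.7, the first three cases would all count as matches, which suggests the threshold alone may not filter out false positives on long excerpts.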
-278
@@ -1,278 +0,0 @@
"""Hybrid search algorithm using Reciprocal Rank Fusion (RRF)."""
import logging
from collections import defaultdict
from typing import Any
import anyio
from nextcloud_mcp_server.search.algorithms import SearchAlgorithm, SearchResult
from nextcloud_mcp_server.search.fuzzy import FuzzySearchAlgorithm
from nextcloud_mcp_server.search.keyword import KeywordSearchAlgorithm
from nextcloud_mcp_server.search.semantic import SemanticSearchAlgorithm
logger = logging.getLogger(__name__)
class HybridSearchAlgorithm(SearchAlgorithm):
"""Hybrid search combining multiple algorithms using Reciprocal Rank Fusion.
Implements RRF from ADR-003 to combine results from:
- Semantic search (vector similarity)
- Keyword search (token matching)
- Fuzzy search (character overlap)
RRF formula: score = weight / (k + rank)
where k=60 (standard value) and rank is 1-indexed position.
"""
DEFAULT_RRF_K = 60 # Standard RRF constant
def __init__(
self,
semantic_weight: float = 0.5,
keyword_weight: float = 0.3,
fuzzy_weight: float = 0.2,
rrf_k: int = DEFAULT_RRF_K,
):
"""Initialize hybrid search with algorithm weights.
Args:
semantic_weight: Weight for semantic results (default: 0.5)
keyword_weight: Weight for keyword results (default: 0.3)
fuzzy_weight: Weight for fuzzy results (default: 0.2)
rrf_k: RRF constant for rank decay (default: 60)
Raises:
ValueError: If weights are invalid
"""
# Validate weights
if semantic_weight < 0 or keyword_weight < 0 or fuzzy_weight < 0:
raise ValueError("Weights must be non-negative")
total_weight = semantic_weight + keyword_weight + fuzzy_weight
if total_weight > 1.0:
raise ValueError(f"Weights sum to {total_weight:.2f}, must be ≤1.0")
if total_weight == 0.0:
raise ValueError("At least one weight must be > 0")
self.semantic_weight = semantic_weight
self.keyword_weight = keyword_weight
self.fuzzy_weight = fuzzy_weight
self.rrf_k = rrf_k
self.total_weight = total_weight
# Initialize sub-algorithms
self.semantic = SemanticSearchAlgorithm()
self.keyword = KeywordSearchAlgorithm()
self.fuzzy = FuzzySearchAlgorithm()
@property
def name(self) -> str:
return "hybrid"
@property
def requires_vector_db(self) -> bool:
# Requires vector DB if semantic search has non-zero weight
return self.semantic_weight > 0
async def search(
self,
query: str,
user_id: str,
limit: int = 10,
doc_type: str | None = None,
**kwargs: Any,
) -> list[SearchResult]:
"""Execute hybrid search using RRF to combine algorithms.
Returns unverified results from combined algorithms. Access verification
should be performed separately at the final output stage.
Args:
query: Search query
user_id: User ID for filtering
limit: Maximum results to return
doc_type: Optional document type filter
**kwargs: Additional parameters passed to sub-algorithms
Returns:
List of unverified SearchResult objects ranked by RRF combined score
"""
logger.info(
f"Hybrid search: query='{query}', user={user_id}, limit={limit}, "
f"weights=(semantic={self.semantic_weight}, keyword={self.keyword_weight}, "
f"fuzzy={self.fuzzy_weight})"
)
# Prepare algorithm configurations for parallel execution
algo_configs = []
if self.semantic_weight > 0:
algo_configs.append(
(
"semantic",
self.semantic.search,
query,
user_id,
limit * 2,
doc_type,
kwargs,
)
)
if self.keyword_weight > 0:
algo_configs.append(
(
"keyword",
self.keyword.search,
query,
user_id,
limit * 2,
doc_type,
kwargs,
)
)
if self.fuzzy_weight > 0:
algo_configs.append(
(
"fuzzy",
self.fuzzy.search,
query,
user_id,
limit * 2,
doc_type,
kwargs,
)
)
# Pre-allocate results list and extract algorithm names
results_list = [None] * len(algo_configs)
algo_names = [name for name, *_ in algo_configs]
async def search_one(
index: int,
search_func,
query_arg: str,
user_id_arg: str,
limit_arg: int,
doc_type_arg: str | None,
kwargs_arg: dict,
):
"""Execute one search algorithm and store result at index."""
result = await search_func(
query_arg, user_id_arg, limit_arg, doc_type_arg, **kwargs_arg
)
results_list[index] = result
# Execute searches in parallel using anyio task group
async with anyio.create_task_group() as tg:
for idx, (name, search_func, q, uid, lim, dt, kw) in enumerate(
algo_configs
):
tg.start_soon(search_one, idx, search_func, q, uid, lim, dt, kw)
# Build results dict
algo_results = {}
for algo_name, results in zip(algo_names, results_list):
algo_results[algo_name] = results
logger.debug(f"{algo_name} returned {len(results)} results")
# Combine using RRF
combined_results = self._reciprocal_rank_fusion(
algo_results,
{
"semantic": self.semantic_weight,
"keyword": self.keyword_weight,
"fuzzy": self.fuzzy_weight,
},
limit,
)
logger.info(f"Hybrid search returned {len(combined_results)} combined results")
if combined_results:
result_details = [
f"{r.doc_type}_{r.id} (score={r.score:.3f}, title='{r.title}')"
for r in combined_results[:5]
]
logger.debug(f"Top hybrid results: {', '.join(result_details)}")
return combined_results
def _reciprocal_rank_fusion(
self,
algo_results: dict[str, list[SearchResult]],
weights: dict[str, float],
limit: int,
) -> list[SearchResult]:
"""Combine multiple ranked result lists using RRF.
Args:
algo_results: Dict of algorithm_name -> ranked results
weights: Dict of algorithm_name -> weight (0-1)
limit: Maximum results to return
Returns:
Combined and re-ranked results
"""
# Track RRF scores per document
rrf_scores: dict[tuple[int, str], float] = defaultdict(float)
# Track best result object for each document
best_results: dict[tuple[int, str], SearchResult] = {}
for algo_name, results in algo_results.items():
weight = weights.get(algo_name, 0.0)
if weight == 0:
continue
for rank, result in enumerate(results, start=1):
doc_key = (result.id, result.doc_type)
# RRF formula: weight / (k + rank)
rrf_score = weight / (self.rrf_k + rank)
rrf_scores[doc_key] += rrf_score
# Track best result object (prefer higher original scores)
if doc_key not in best_results:
best_results[doc_key] = result
elif result.score > best_results[doc_key].score:
best_results[doc_key] = result
# Sort by combined RRF score
sorted_docs = sorted(
rrf_scores.items(),
key=lambda x: x[1],
reverse=True,
)[:limit]
# Calculate normalization factor to scale RRF scores to 0-1 range
# Theoretical max RRF score = total_weight / (rrf_k + 1)
# Normalization factor = (rrf_k + 1) / total_weight
normalization_factor = (self.rrf_k + 1) / self.total_weight
# Build final results with normalized RRF scores
final_results = []
for doc_key, rrf_score in sorted_docs:
result = best_results[doc_key]
# Normalize RRF score to 0-1 range for better user comprehension
normalized_score = rrf_score * normalization_factor
# Create new result with normalized score
# Copy the original metadata (so the source result is not mutated) and add RRF details
metadata = dict(result.metadata or {})
metadata["rrf_score_raw"] = rrf_score # Original RRF score
metadata["original_score"] = result.score # Original algorithm score
metadata["normalization_factor"] = normalization_factor
final_results.append(
SearchResult(
id=result.id,
doc_type=result.doc_type,
title=result.title,
excerpt=result.excerpt,
score=normalized_score, # Use normalized score (0-1 range)
metadata=metadata,
)
)
return final_results
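The weighted RRF formula and the normalization step can be checked with a small worked example. This is a minimal sketch that assumes plain string IDs in place of `SearchResult` objects; `rrf_fuse` is a hypothetical helper, not a function from the module, and it assumes at least one positive weight.

```python
from collections import defaultdict


def rrf_fuse(
    rankings: dict[str, list[str]],
    weights: dict[str, float],
    k: int = 60,
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists with weighted RRF and scale scores to 0-1."""
    scores: dict[str, float] = defaultdict(float)
    for name, ranked_ids in rankings.items():
        weight = weights.get(name, 0.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # A document ranked first by every algorithm scores total_weight / (k + 1),
    # so multiplying by the inverse maps that best case to 1.0.
    total_weight = sum(weights.values())  # assumed to be > 0
    factor = (k + 1) / total_weight
    fused = [(doc_id, score * factor) for doc_id, score in scores.items()]
    return sorted(fused, key=lambda item: item[1], reverse=True)


fused = rrf_fuse(
    rankings={
        "semantic": ["note_1", "note_2"],
        "keyword": ["note_1", "note_3"],
        "fuzzy": ["note_1"],
    },
    weights={"semantic": 0.5, "keyword": 0.3, "fuzzy": 0.2},
)
```

Here `note_1` is ranked first by all three algorithms and lands at roughly 1.0 after normalization, while `note_2` (second in semantic only) scores about `0.5 * 61 / 62`. Because k is large relative to typical ranks, normalized scores depend mostly on which algorithms returned a document and only weakly on its rank.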
-277
@@ -1,277 +0,0 @@
"""Keyword search algorithm using token-based matching on Qdrant payload (ADR-001)."""
import logging
from typing import Any
from qdrant_client.models import FieldCondition, Filter, MatchValue
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.search.algorithms import SearchAlgorithm, SearchResult
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
logger = logging.getLogger(__name__)
class KeywordSearchAlgorithm(SearchAlgorithm):
"""Keyword search using token-based matching with weighted scoring.
Implements token-based search from ADR-001:
- Title matches weighted 3x higher than content matches
- Case-insensitive token matching
- Relevance scoring based on the fraction of query tokens matched in title and excerpt
"""
# Weighting constants from ADR-001
TITLE_WEIGHT = 3.0
CONTENT_WEIGHT = 1.0
@property
def name(self) -> str:
return "keyword"
async def search(
self,
query: str,
user_id: str,
limit: int = 10,
doc_type: str | None = None,
**kwargs: Any,
) -> list[SearchResult]:
"""Execute keyword search using token matching on Qdrant payload.
Queries Qdrant for all indexed documents, then scores based on token
matches in title and excerpt fields. Returns unverified results - access
verification should be performed separately at the final output stage.
Args:
query: Search query to tokenize and match
user_id: User ID for filtering
limit: Maximum results to return
doc_type: Optional document type filter (None = all types)
**kwargs: Additional parameters (unused)
Returns:
List of unverified SearchResult objects ranked by keyword match score
"""
settings = get_settings()
logger.info(
f"Keyword search: query='{query}', user={user_id}, "
f"limit={limit}, doc_type={doc_type}"
)
# Tokenize query
query_tokens = self._process_query(query)
logger.debug(f"Query tokens: {query_tokens}")
# Build Qdrant filter
filter_conditions = [
FieldCondition(key="user_id", match=MatchValue(value=user_id))
]
if doc_type:
filter_conditions.append(
FieldCondition(key="doc_type", match=MatchValue(value=doc_type))
)
# Scroll through Qdrant to get all matching documents
# We need title and excerpt from payload for token matching
qdrant_client = await get_qdrant_client()
collection = settings.get_collection_name()
all_points = []
offset = None
# Scroll through all points matching filter
while True:
scroll_result, next_offset = await qdrant_client.scroll(
collection_name=collection,
scroll_filter=Filter(must=filter_conditions),
limit=100, # Batch size
offset=offset,
with_payload=[
"doc_id",
"doc_type",
"title",
"excerpt",
"chunk_index",
"total_chunks",
],
with_vectors=False, # Don't need vectors for keyword search
)
all_points.extend(scroll_result)
if next_offset is None:
break
offset = next_offset
logger.debug(
f"Retrieved {len(all_points)} points from Qdrant for keyword search"
)
# Deduplicate by (doc_id, doc_type) - keep the first chunk (chunk_index 0) when present
seen_docs = {}
for point in all_points:
doc_id = int(point.payload["doc_id"])
dtype = point.payload.get("doc_type", "note")
doc_key = (doc_id, dtype)
# Keep first chunk (chunk_index=0) as it has the most relevant content
chunk_idx = point.payload.get("chunk_index", 0)
if doc_key not in seen_docs or chunk_idx == 0:
seen_docs[doc_key] = point
logger.debug(f"Deduplicated to {len(seen_docs)} unique documents")
# Score each document based on keyword matches
scored_results = []
for doc_key, point in seen_docs.items():
doc_id, dtype = doc_key
title = point.payload.get("title", "")
excerpt = point.payload.get("excerpt", "")
# Calculate keyword match score
score = self._calculate_score(query_tokens, title, excerpt)
if score > 0: # Only include matches
scored_results.append(
{
"doc_id": doc_id,
"doc_type": dtype,
"title": title,
"excerpt": excerpt,
"score": score,
}
)
# Sort by score (descending) and limit
scored_results.sort(key=lambda x: x["score"], reverse=True)
top_results = scored_results[:limit]
# Return unverified results (verification happens at output stage)
final_results = []
for result in top_results:
final_results.append(
SearchResult(
id=result["doc_id"],
doc_type=result["doc_type"],
title=result["title"],
excerpt=result["excerpt"],
score=result["score"],
metadata={},
)
)
logger.info(f"Keyword search returned {len(final_results)} unverified results")
if final_results:
result_details = [
f"{r.doc_type}_{r.id} (score={r.score:.3f}, title='{r.title}')"
for r in final_results[:5]
]
logger.debug(f"Top keyword results: {', '.join(result_details)}")
return final_results
def _process_query(self, query: str) -> list[str]:
"""Tokenize and normalize query.
Args:
query: Raw query string
Returns:
List of normalized tokens
"""
# Convert to lowercase and split into tokens
tokens = query.lower().split()
# Drop single-character tokens
tokens = [token for token in tokens if len(token) > 1]
return tokens
def _calculate_score(
self, query_tokens: list[str], title: str, content: str
) -> float:
"""Calculate relevance score based on token matches.
Args:
query_tokens: List of query tokens
title: Document title
content: Document content
Returns:
Relevance score (0.0-1.0)
"""
if not query_tokens:
return 0.0
# Process title and content
title_tokens = title.lower().split()
content_tokens = content.lower().split()
score = 0.0
# Count matches in title
title_matches = sum(1 for qt in query_tokens if qt in title_tokens)
if query_tokens: # Avoid division by zero
title_match_ratio = title_matches / len(query_tokens)
score += self.TITLE_WEIGHT * title_match_ratio
# Count matches in content
content_matches = sum(1 for qt in query_tokens if qt in content_tokens)
if query_tokens:
content_match_ratio = content_matches / len(query_tokens)
score += self.CONTENT_WEIGHT * content_match_ratio
# Normalize score to 0-1 range
# Max score would be TITLE_WEIGHT + CONTENT_WEIGHT if all tokens match everywhere
max_score = self.TITLE_WEIGHT + self.CONTENT_WEIGHT
normalized_score = min(score / max_score, 1.0)
return normalized_score
def _extract_excerpt(
self, content: str, query_tokens: list[str], max_length: int = 200
) -> str:
"""Extract excerpt showing match context.
Args:
content: Full document content
query_tokens: Query tokens to find
max_length: Maximum excerpt length in characters
Returns:
Excerpt string with context around matches
"""
if not content:
return ""
content_lower = content.lower()
# Find first occurrence of any query token
first_match_pos = -1
for token in query_tokens:
pos = content_lower.find(token)
if pos != -1:
if first_match_pos == -1 or pos < first_match_pos:
first_match_pos = pos
if first_match_pos == -1:
# No matches found, return beginning
return content[:max_length].strip() + (
"..." if len(content) > max_length else ""
)
# Extract context around match
start = max(0, first_match_pos - max_length // 2)
end = min(len(content), first_match_pos + max_length // 2)
excerpt = content[start:end].strip()
# Add ellipsis if truncated
if start > 0:
excerpt = "..." + excerpt
if end < len(content):
excerpt = excerpt + "..."
return excerpt
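The weighted token scoring can be reproduced outside the class to see how title and content matches combine. The sketch below is a standalone approximation (`keyword_score` is a hypothetical free function). It uses sets for membership tests, which gives the same result as the list lookups above because each query token is counted at most once per field.

```python
TITLE_WEIGHT = 3.0
CONTENT_WEIGHT = 1.0


def keyword_score(query: str, title: str, content: str) -> float:
    """Weighted fraction of query tokens found in title and content (0-1)."""
    tokens = [t for t in query.lower().split() if len(t) > 1]
    if not tokens:
        return 0.0
    title_tokens = set(title.lower().split())
    content_tokens = set(content.lower().split())
    title_ratio = sum(t in title_tokens for t in tokens) / len(tokens)
    content_ratio = sum(t in content_tokens for t in tokens) / len(tokens)
    raw = TITLE_WEIGHT * title_ratio + CONTENT_WEIGHT * content_ratio
    return raw / (TITLE_WEIGHT + CONTENT_WEIGHT)


# All query tokens in the title only: 3.0 / 4.0.
title_only = keyword_score("meeting notes", "Meeting notes", "agenda for monday")

# All query tokens in the content only: 1.0 / 4.0.
content_only = keyword_score("meeting notes", "Agenda", "meeting notes for monday")

# Whitespace tokenization keeps punctuation, so "notes:" does not match "notes".
punctuated = keyword_score("notes", "Notes:", "")
```

The last case shows a limitation of splitting on whitespace alone: tokens with attached punctuation are treated as different words.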
@@ -0,0 +1,907 @@
"""PDF chunk highlighting utilities for vector visualization.
This module provides utilities to generate highlighted page images showing
matched chunks and their context from semantic search results.
The highlighting uses character offsets to precisely locate chunks within
PDF documents, ensuring accurate highlighting even when text formatting
varies between indexing and rendering.
"""
import logging
import re
from typing import Optional
import pymupdf
import pymupdf4llm
logger = logging.getLogger(__name__)
class PDFHighlighter:
"""Generate highlighted page images from PDF chunks."""
# Color definitions (RGB, 0-1 range)
COLORS = {
"yellow": [1, 1, 0],
"red": [1, 0, 0],
"green": [0, 1, 0],
"blue": [0, 0, 1],
"orange": [1, 0.5, 0],
"pink": [1, 0, 1],
"gray": [0.7, 0.7, 0.7],
"light_blue": [0.7, 0.9, 1.0],
"light_green": [0.7, 1.0, 0.7],
}
@staticmethod
def strip_markdown(text: str) -> str:
"""Remove markdown formatting to improve search accuracy.
Args:
text: Text with potential markdown formatting
Returns:
Plain text with markdown removed
"""
# Remove bold/italic markers
text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
text = re.sub(r"\*(.+?)\*", r"\1", text)
text = re.sub(r"__(.+?)__", r"\1", text)
text = re.sub(r"_(.+?)_", r"\1", text)
# Remove headers
text = re.sub(r"^#+\s+", "", text, flags=re.MULTILINE)
# Remove inline code
text = re.sub(r"`(.+?)`", r"\1", text)
return text.strip()
@staticmethod
def extract_pdf_text_with_boundaries(
pdf_doc: pymupdf.Document,
) -> tuple[str, list[dict]]:
"""Extract full document text with page boundary tracking.
Uses pymupdf4llm.to_markdown() for consistency with indexing.
IMPORTANT: Must use write_images=True to match PyMuPDFProcessor behavior!
Even though we don't need the images, we need the image references in the
markdown text to maintain consistent character offsets with indexing.
Args:
pdf_doc: Open PyMuPDF document
Returns:
Tuple of (full_text, page_boundaries) where page_boundaries is a list of:
{"page": 1, "start_offset": 0, "end_offset": 1234}
"""
import tempfile
from pathlib import Path
page_boundaries = []
text_parts = []
current_offset = 0
# Use temp directory for image output (images are discarded after extraction)
temp_dir = Path(tempfile.mkdtemp(prefix="pdf_highlight_"))
for page_idx in range(pdf_doc.page_count):
page_md = pymupdf4llm.to_markdown(
pdf_doc,
pages=[page_idx],
write_images=True, # Must match indexing! Otherwise offsets misalign
image_path=temp_dir,
page_chunks=False,
)
page_boundaries.append(
{
"page": page_idx + 1, # 1-indexed
"start_offset": current_offset,
"end_offset": current_offset + len(page_md),
}
)
text_parts.append(page_md)
current_offset += len(page_md)
full_text = "".join(text_parts)
# Clean up temp directory and extracted images
import shutil
try:
shutil.rmtree(temp_dir)
except Exception as e:
logger.warning(f"Failed to clean up temp directory {temp_dir}: {e}")
return full_text, page_boundaries
@staticmethod
def find_chunk_page(
chunk_start_offset: int,
chunk_end_offset: int,
page_boundaries: list[dict],
) -> Optional[dict]:
"""Find the page that contains the largest share of a given chunk.
Args:
chunk_start_offset: Chunk start position in full document
chunk_end_offset: Chunk end position in full document
page_boundaries: Page boundary list from extract_pdf_text_with_boundaries()
Returns:
Dict with keys: page_num, overlap_chars, page_relative_start, page_relative_end
or None if chunk not found on any page
"""
chunk_pages = []
for boundary in page_boundaries:
page_start = boundary["start_offset"]
page_end = boundary["end_offset"]
# Check if chunk overlaps with this page
if chunk_start_offset < page_end and chunk_end_offset > page_start:
overlap_start = max(chunk_start_offset, page_start)
overlap_end = min(chunk_end_offset, page_end)
overlap_chars = overlap_end - overlap_start
chunk_pages.append(
{
"page_num": boundary["page"],
"overlap_chars": overlap_chars,
"page_relative_start": overlap_start - page_start,
"page_relative_end": overlap_end - page_start,
}
)
if not chunk_pages:
return None
# Return page with maximum overlap
return max(chunk_pages, key=lambda p: p["overlap_chars"])
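The interval-overlap logic behind the page lookup can be exercised with made-up offsets. This is a simplified sketch that returns only the page number and overlap size; `page_with_most_overlap` and the sample boundaries are hypothetical.

```python
from typing import Optional


def page_with_most_overlap(
    start: int, end: int, boundaries: list[dict]
) -> Optional[dict]:
    """Return the page whose offset range overlaps [start, end) the most."""
    best = None
    for boundary in boundaries:
        # Overlap of two half-open intervals; non-positive means no overlap.
        overlap = min(end, boundary["end_offset"]) - max(start, boundary["start_offset"])
        if overlap > 0 and (best is None or overlap > best["overlap_chars"]):
            best = {"page_num": boundary["page"], "overlap_chars": overlap}
    return best


boundaries = [
    {"page": 1, "start_offset": 0, "end_offset": 1000},
    {"page": 2, "start_offset": 1000, "end_offset": 2500},
]

# The chunk spans offsets 900-1300: 100 characters on page 1, 300 on page 2.
hit = page_with_most_overlap(900, 1300, boundaries)

# A chunk past the end of the document overlaps no page.
miss = page_with_most_overlap(3000, 3100, boundaries)
```

A chunk that straddles a page break is therefore attributed to a single page, so the part on the other page would not be highlighted.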
@staticmethod
def highlight_chunk_by_word_positions(
page: pymupdf.Page,
chunk_text: str,
color: str = "yellow",
search_region: tuple[float, float, float, float] | None = None,
) -> int:
"""Highlight chunk using word-position matching.
This method matches words from the chunk to their positions on the PDF page,
avoiding text search mismatches between markdown-formatted text and raw PDF text.
Args:
page: PyMuPDF page object
chunk_text: Text to highlight (may contain markdown)
color: Color name from COLORS dict
search_region: Optional (x0, y0, x1, y1) bounding box to constrain search.
If provided, only words within this region are considered.
Returns:
Number of highlight rectangles added
"""
# Tokenize chunk into words (alphanumeric only, lowercase)
chunk_words = re.findall(
r"\w+", PDFHighlighter.strip_markdown(chunk_text).lower()
)
if not chunk_words:
logger.warning("No words found in chunk text")
return 0
# Get all words from page with positions
# Format: (x0, y0, x1, y1, "word", block_no, line_no, word_no)
try:
page_words = page.get_text("words")
except Exception as e:
logger.error(f"Failed to extract words from page: {e}")
return 0
if not page_words:
logger.warning("No words found on page")
return 0
# Filter words by search region if provided
if search_region:
rx0, ry0, rx1, ry1 = search_region
# Allow some tolerance (10 points) for words near region boundary
tolerance = 10
page_words = [
w
for w in page_words
if (
w[0] >= rx0 - tolerance
and w[2] <= rx1 + tolerance
and w[1] >= ry0 - tolerance
and w[3] <= ry1 + tolerance
)
]
logger.debug(
f"Filtered to {len(page_words)} words in region "
f"({rx0:.0f}, {ry0:.0f}, {rx1:.0f}, {ry1:.0f})"
)
if not page_words:
logger.warning("No words found in search region")
return 0
# Find matching word sequence - use FIRST match, not longest
# This ensures we highlight the actual chunk location, not similar text elsewhere
matches = []
# Find candidate starting positions: page words that match the first chunk word
first_chunk_word = chunk_words[0] if chunk_words else ""
candidate_starts = []
for i, pw in enumerate(page_words):
page_word = pw[4].lower()
# Check if this could be the start of the chunk
if (
first_chunk_word == page_word
or first_chunk_word in page_word
or page_word in first_chunk_word
):
candidate_starts.append(i)
# Try each candidate start position and take the FIRST good match
for start_pos in candidate_starts:
current_matches = []
chunk_idx = 0
skip_count = 0
max_skips = 3 # Allow some formatting differences
for page_idx in range(start_pos, len(page_words)):
if chunk_idx >= len(chunk_words):
break
page_word = page_words[page_idx][4].lower()
chunk_word = chunk_words[chunk_idx]
# Check for match (allow partial matches for flexibility)
if (
chunk_word == page_word
or chunk_word in page_word
or page_word in chunk_word
):
current_matches.append(page_words[page_idx])
chunk_idx += 1
skip_count = 0
elif skip_count < max_skips:
# Allow skipping some words (formatting, punctuation)
skip_count += 1
continue
else:
break
# Accept if we matched at least 50% of chunk words
if len(current_matches) >= len(chunk_words) * 0.5:
matches = current_matches
logger.debug(
f"Found match at position {start_pos}: "
f"{len(matches)}/{len(chunk_words)} words"
)
break # Take FIRST match, not best/longest
if not matches:
logger.debug(f"No word matches found (chunk has {len(chunk_words)} words)")
return 0
logger.debug(
f"Matched {len(matches)} words out of {len(chunk_words)} chunk words"
)
# Build rectangles from matched words
rects = [pymupdf.Rect(w[0], w[1], w[2], w[3]) for w in matches]
# Check if matches are contiguous (not scattered across the page)
# Scattered matches indicate false positives from common words
if len(rects) > 1:
# Sort by vertical position then horizontal
sorted_matches = sorted(matches, key=lambda w: (round(w[1]), w[0]))
# Check for large vertical gaps (more than ~2 lines apart)
# A typical line height is 12-20 points
max_line_gap = 50 # Points - allows for ~2-3 lines gap
prev_y = sorted_matches[0][1]
large_gaps = 0
for match in sorted_matches[1:]:
y_gap = match[1] - prev_y
if y_gap > max_line_gap:
large_gaps += 1
prev_y = match[1]
# If matches are scattered (many large gaps), reject this match
# A chunk should be mostly contiguous text
if large_gaps > len(matches) * 0.3: # More than 30% have gaps
logger.debug(
f"Rejecting scattered matches: {large_gaps} large gaps "
f"out of {len(matches)} matches"
)
return 0
# Merge adjacent rectangles on the same line for cleaner highlighting
merged_rects = []
sorted_rects = sorted(rects, key=lambda r: (round(r.y0), r.x0))
current_rect = None
for rect in sorted_rects:
if current_rect is None:
current_rect = rect
elif abs(rect.y0 - current_rect.y0) < 5: # Same line (within 5 points)
current_rect = current_rect | rect # Union
else:
merged_rects.append(current_rect)
current_rect = rect
if current_rect:
merged_rects.append(current_rect)
# Add highlights
rgb = PDFHighlighter.COLORS.get(color, PDFHighlighter.COLORS["yellow"])
for rect in merged_rects:
highlight = page.add_highlight_annot(rect)
highlight.set_colors({"stroke": rgb})
highlight.set_info(
content="Chunk from semantic search",
title="PDF Highlighter (word-position)",
)
highlight.update()
return len(merged_rects)
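The same-line merge step can be tried without PyMuPDF by treating rectangles as plain tuples. The sketch below assumes `(x0, y0, x1, y1)` tuples and replaces the `Rect` union operator with explicit min/max; `merge_same_line` is a hypothetical helper.

```python
from typing import Optional

Rect = tuple[float, float, float, float]  # (x0, y0, x1, y1)


def merge_same_line(rects: list[Rect], y_tolerance: float = 5.0) -> list[Rect]:
    """Union rectangles whose top edges lie within y_tolerance of each other."""
    merged: list[Rect] = []
    current: Optional[Rect] = None
    for rect in sorted(rects, key=lambda r: (round(r[1]), r[0])):
        if current is None:
            current = rect
        elif abs(rect[1] - current[1]) < y_tolerance:
            # Same line: grow the current rectangle to cover both.
            current = (
                min(current[0], rect[0]),
                min(current[1], rect[1]),
                max(current[2], rect[2]),
                max(current[3], rect[3]),
            )
        else:
            merged.append(current)
            current = rect
    if current is not None:
        merged.append(current)
    return merged


word_rects = [
    (10.0, 100.0, 40.0, 112.0),   # first word on line 1
    (45.0, 101.0, 80.0, 113.0),   # second word on line 1, 1pt lower
    (10.0, 120.0, 60.0, 132.0),   # first word on line 2
]
line_rects = merge_same_line(word_rects)
```

Merging per line keeps the number of highlight annotations close to the number of text lines rather than the number of matched words.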
@staticmethod
def find_unique_phrase(
text: str, min_len: int = 30, max_len: int = 80
) -> str | None:
"""Pick a phrase from text to use as an anchor for location search.
Returns the first sentence that is at least min_len characters long,
truncated to max_len. Falls back to the first max_len characters of the
text. Uniqueness on the page is not checked.
Args:
text: Source text to extract phrase from
min_len: Minimum phrase length
max_len: Maximum phrase length
Returns:
A phrase likely to be unique, or None if not found
"""
clean_text = PDFHighlighter.strip_markdown(text).strip()
if not clean_text:
return None
# Try first sentence (often unique due to context)
sentences = re.split(r"[.!?]\s+", clean_text)
for sentence in sentences:
sentence = sentence.strip()
if min_len <= len(sentence) <= max_len:
return sentence
elif len(sentence) > max_len:
return sentence[:max_len]
# Fallback: first N chars
if len(clean_text) >= min_len:
return clean_text[:max_len]
return clean_text if clean_text else None
@staticmethod
def _find_chunk_bbox(
page: pymupdf.Page,
chunk_text: str,
page_relative_start: int,
page_relative_end: int,
page_text_length: int,
) -> tuple[float, float, float, float] | None:
"""Find bounding box for a chunk without modifying the page.
Returns (x0, y0, x1, y1) in page coordinates, or None if not found.
"""
page_rect = page.rect
# Strip markdown for searching
search_text = PDFHighlighter.strip_markdown(chunk_text)
# Try to find chunk location using text search
anchor_rect = None
search_phrases = []
# Build search phrases from chunk text
sentences = re.split(r"[.!?]\s+", search_text)
for sentence in sentences[:3]:
sentence = sentence.strip()
if len(sentence) >= 20:
search_phrases.append(sentence[:80])
if len(sentence) >= 40:
search_phrases.append(sentence[:40])
# Also try first N characters
if len(search_text) >= 30:
search_phrases.append(search_text[:60])
search_phrases.append(search_text[:30])
for phrase in search_phrases:
if not phrase:
continue
rects = page.search_for(phrase.strip())
if rects:
anchor_rect = rects[0]
break
if not anchor_rect:
return None
# Calculate chunk height based on character count
chunk_chars = len(search_text)
estimated_lines = max(1, chunk_chars / 60)
estimated_height = estimated_lines * 14
# Build bounding box
return (
page_rect.x0 + 30, # Left margin
anchor_rect.y0 - 5, # Start slightly above anchor
page_rect.x1 - 30, # Right margin
min(anchor_rect.y0 + estimated_height + 10, page_rect.y1 - 30),
)
@staticmethod
def highlight_chunk_on_page(
page: pymupdf.Page,
chunk_text: str,
color: str = "yellow",
page_relative_start: int | None = None,
page_relative_end: int | None = None,
page_text_length: int | None = None,
) -> int:
"""Add bounding box highlight to a PDF page for the given chunk text.
Uses text search to find the chunk's location on the page, then draws
a bounding box around that region. Returns 0 if text search fails; the
character offset parameters are accepted but not currently used.
Args:
page: PyMuPDF page object
chunk_text: Text to highlight (may contain markdown)
color: Color name from COLORS dict
page_relative_start: Character offset where chunk starts on page (optional)
page_relative_end: Character offset where chunk ends on page (optional)
page_text_length: Total character length of page text (optional)
Returns:
Number of highlights added (1 for bounding box, 0 if failed)
"""
page_rect = page.rect
rgb = PDFHighlighter.COLORS.get(color, PDFHighlighter.COLORS["yellow"])
# Strip markdown for searching
search_text = PDFHighlighter.strip_markdown(chunk_text)
# Try to find chunk location using text search
# Search for progressively shorter phrases until we find a match
anchor_rect = None
search_phrases = []
# Build search phrases from chunk text
sentences = re.split(r"[.!?]\s+", search_text)
for sentence in sentences[:3]: # Try first 3 sentences
sentence = sentence.strip()
if len(sentence) >= 20:
search_phrases.append(sentence[:80])
if len(sentence) >= 40:
search_phrases.append(sentence[:40])
# Also try first N characters
if len(search_text) >= 30:
search_phrases.append(search_text[:60])
search_phrases.append(search_text[:30])
for phrase in search_phrases:
if not phrase:
continue
rects = page.search_for(phrase.strip())
if rects:
anchor_rect = rects[0] # Use first match
logger.debug(f"Found chunk anchor using phrase: '{phrase[:30]}...'")
break
if not anchor_rect:
page_num = page.number + 1 if page.number is not None else "unknown"
logger.warning(f"Could not find chunk text on page {page_num}")
return 0
# Calculate chunk height based on character count
# Estimate ~60 chars per line, ~14pt line height
chunk_chars = len(search_text)
estimated_lines = max(1, chunk_chars / 60) # ~60 chars per line typical
estimated_height = estimated_lines * 14 # ~14pt per line
# Build bounding box starting from anchor
chunk_rect = pymupdf.Rect(
page_rect.x0 + 30, # Left margin
anchor_rect.y0 - 5, # Start slightly above anchor
page_rect.x1 - 30, # Right margin
min(
anchor_rect.y0 + estimated_height + 10, page_rect.y1 - 30
), # Estimated bottom
)
# Draw a visible rectangle around the chunk region
shape = page.new_shape()
shape.draw_rect(chunk_rect)
shape.finish(
color=rgb, # Border color
fill=None, # No fill (transparent)
width=2.5, # Border width
dashes="[4 2]", # Dashed line
)
shape.commit()
# Add semi-transparent fill for visibility
fill_shape = page.new_shape()
fill_shape.draw_rect(chunk_rect)
fill_shape.finish(
color=None, # No border
fill=[1, 1, 0.7], # Light yellow fill
fill_opacity=0.15, # Very transparent
)
fill_shape.commit()
logger.debug(
f"Added bounding box at y={chunk_rect.y0:.0f}-{chunk_rect.y1:.0f} "
f"(estimated {estimated_lines:.1f} lines)"
)
return 1
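The anchor search above can be read as two small pure steps: build candidate phrases from most to least specific, then estimate the box height from the character count. The following is a minimal sketch of those two steps only. The function names `build_search_phrases` and `estimate_height` are hypothetical and do not exist in the diff; the thresholds are copied from the method above.

```python
import re


def build_search_phrases(text: str) -> list[str]:
    """Return candidate anchor phrases in the order they should be tried."""
    phrases: list[str] = []
    # The first few sentences are usually distinctive enough to locate the chunk.
    for sentence in re.split(r"[.!?]\s+", text)[:3]:
        sentence = sentence.strip()
        if len(sentence) >= 20:
            phrases.append(sentence[:80])
        if len(sentence) >= 40:
            phrases.append(sentence[:40])
    # Fall back to raw prefixes of the chunk text.
    if len(text) >= 30:
        phrases.append(text[:60])
        phrases.append(text[:30])
    return phrases


def estimate_height(
    char_count: int, chars_per_line: int = 60, line_height_pt: float = 14.0
) -> float:
    """Estimate the chunk height in points from its character count."""
    lines = max(1, char_count / chars_per_line)
    return lines * line_height_pt
```

Keeping these steps free of PyMuPDF objects would make the heuristics unit-testable without a PDF fixture; only the `page.search_for()` call needs a real page.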
@staticmethod
def highlight_chunk(
pdf_bytes: bytes,
chunk_start_offset: int,
chunk_end_offset: int,
stored_page_number: Optional[int] = None,
color: str = "yellow",
zoom: float = 2.0,
) -> Optional[tuple[bytes, int, int]]:
"""Generate PNG image of PDF page with highlighted chunk.
This is the main entry point for highlighting. It:
1. Extracts document text with page boundaries
2. Finds which page contains the chunk
3. Extracts chunk text using character offsets
4. Highlights the chunk on the page
5. Renders page to PNG
Args:
pdf_bytes: PDF file bytes
chunk_start_offset: Chunk start position (document-level)
chunk_end_offset: Chunk end position (document-level)
stored_page_number: Page number from metadata (optional, for validation)
color: Highlight color name
zoom: Rendering zoom factor (2.0 = 144 DPI)
Returns:
Tuple of (png_bytes, page_number, highlight_count) or None if failed
"""
import tempfile
from pathlib import Path
temp_pdf_path = None
try:
# Write PDF to temp file with consistent name "pdf.pdf"
# This ensures image references match indexing (e.g., pdf-0001.png)
# Different temp filenames would cause different markdown text lengths!
temp_dir = Path(tempfile.mkdtemp(prefix="pdf_highlight_"))
temp_pdf_path = temp_dir / "pdf.pdf"
temp_pdf_path.write_bytes(pdf_bytes)
# Open PDF from temp file
doc = pymupdf.open(temp_pdf_path)
# Extract text with page boundaries
full_text, page_boundaries = (
PDFHighlighter.extract_pdf_text_with_boundaries(doc)
)
# Find which page contains the chunk
chunk_page_info = PDFHighlighter.find_chunk_page(
chunk_start_offset, chunk_end_offset, page_boundaries
)
if not chunk_page_info:
logger.error("Chunk not found on any page")
doc.close()
return None
page_num = chunk_page_info["page_num"]
# Log if page differs from stored metadata
if stored_page_number and stored_page_number != page_num:
logger.info(
f"Chunk primarily on page {page_num}, metadata says {stored_page_number}"
)
# Extract page text
page_boundary = page_boundaries[page_num - 1]
page_start = page_boundary["start_offset"]
page_end = page_boundary["end_offset"]
page_text = full_text[page_start:page_end]
# Extract chunk text using page-relative offsets
page_relative_start = chunk_page_info["page_relative_start"]
page_relative_end = chunk_page_info["page_relative_end"]
chunk_text = page_text[page_relative_start:page_relative_end]
# Calculate page text length for region estimation
page_text_length = page_end - page_start
logger.debug(
f"Extracted {len(chunk_text)} chars on page {page_num} "
f"(offsets {page_relative_start}-{page_relative_end} of {page_text_length})"
)
# Get page and add highlights
page = doc[page_num - 1]
highlight_count = PDFHighlighter.highlight_chunk_on_page(
page,
chunk_text,
color,
page_relative_start=page_relative_start,
page_relative_end=page_relative_end,
page_text_length=page_text_length,
)
if highlight_count == 0:
logger.warning("No highlights added")
doc.close()
return None
# Render page to PNG
mat = pymupdf.Matrix(zoom, zoom)
pix = page.get_pixmap(matrix=mat, alpha=False)
png_bytes = pix.tobytes("png")
doc.close()
logger.info(
f"Generated {len(png_bytes):,} byte image with {highlight_count} highlights"
)
return (png_bytes, page_num, highlight_count)
except Exception as e:
logger.error(f"Error highlighting chunk: {e}", exc_info=True)
return None
finally:
# Clean up temp directory and PDF file
if temp_pdf_path and temp_pdf_path.parent.exists():
try:
import shutil
shutil.rmtree(temp_pdf_path.parent)
except Exception as e:
logger.warning(
f"Failed to delete temp directory {temp_pdf_path.parent}: {e}"
)
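`find_chunk_page` is not shown in this part of the diff, so the sketch below is an assumption about its behaviour, not its implementation. It picks the page with the largest character overlap (suggested by the "Chunk primarily on page" log message) and returns the same keys that `highlight_chunk` reads. The name `locate_chunk` is hypothetical; the boundary format follows the `{"page": 1, "start_offset": 0, "end_offset": 1234}` shape documented for `highlight_chunks_batch`.

```python
from typing import Optional


def locate_chunk(start: int, end: int, page_boundaries: list[dict]) -> Optional[dict]:
    """Map document-level offsets to the page holding most of the chunk."""
    best = None
    best_overlap = 0
    for boundary in page_boundaries:
        # Overlap is the length of the intersection of [start, end) and the page span.
        overlap = min(end, boundary["end_offset"]) - max(start, boundary["start_offset"])
        if overlap > best_overlap:
            best, best_overlap = boundary, overlap
    if best is None:
        return None  # The chunk does not intersect any page.
    page_start = best["start_offset"]
    return {
        "page_num": best["page"],
        # Clip to the page, then shift to page-relative coordinates.
        "page_relative_start": max(start, page_start) - page_start,
        "page_relative_end": min(end, best["end_offset"]) - page_start,
    }
```

For a chunk that straddles two pages, only the portion on the chosen page is described by the returned offsets.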
@staticmethod
def highlight_chunks_batch(
pdf_bytes: bytes,
chunks: list[tuple[int, int, int, int | None, str]],
page_boundaries: list[dict],
full_text: str,
color: str = "yellow",
zoom: float = 2.0,
) -> dict[int, tuple[bytes, int, int]]:
"""Generate highlighted images for multiple chunks.
Opens PDF once for rendering, uses pre-computed page boundaries from the
document processor. This ensures consistent character offsets between
chunking and highlighting.
Args:
pdf_bytes: PDF file bytes
chunks: List of (chunk_index, start_offset, end_offset, stored_page_number, chunk_text)
The chunk_index is used as the key in the returned dict.
chunk_text is the actual text content of the chunk.
page_boundaries: Pre-computed page boundaries from document processor.
Each entry: {"page": 1, "start_offset": 0, "end_offset": 1234}
full_text: Full document text for extracting page-relative portions.
color: Highlight color name
zoom: Rendering zoom factor (2.0 = 144 DPI)
Returns:
Dict mapping chunk_index to (png_bytes, page_number, highlight_count)
Chunks that fail to highlight are omitted from the result.
"""
import shutil
import tempfile
from collections import defaultdict
from pathlib import Path
results: dict[int, tuple[bytes, int, int]] = {}
if not chunks:
return results
temp_pdf_path = None
try:
# Write PDF to temp file
temp_dir = Path(tempfile.mkdtemp(prefix="pdf_highlight_batch_"))
temp_pdf_path = temp_dir / "pdf.pdf"
temp_pdf_path.write_bytes(pdf_bytes)
# Open PDF once (only for rendering, not text extraction)
doc = pymupdf.open(temp_pdf_path)
logger.debug(
f"Batch highlighting: {len(chunks)} chunks, "
f"{len(page_boundaries)} pages"
)
# Group chunks by their target page for efficient rendering
# We'll render each page only once with all its highlights
chunks_by_page: dict[int, list[tuple[int, dict, str, int]]] = defaultdict(list)
for chunk_tuple in chunks:
# Unpack chunk tuple - chunk_text is now passed directly
chunk_index, start_offset, end_offset, stored_page_num, chunk_text = (
chunk_tuple
)
# Find which page contains this chunk
chunk_page_info = PDFHighlighter.find_chunk_page(
start_offset, end_offset, page_boundaries
)
if not chunk_page_info:
logger.warning(f"Chunk {chunk_index}: not found on any page")
continue
page_num = chunk_page_info["page_num"]
# Log if page differs from stored metadata
if stored_page_num and stored_page_num != page_num:
logger.debug(
f"Chunk {chunk_index}: found on page {page_num}, "
f"metadata says {stored_page_num}"
)
# Extract page-relative portion of chunk text
# This is critical for cross-page chunks where the start
# of the chunk might be on a different page
page_boundary = page_boundaries[page_num - 1]
page_start = page_boundary["start_offset"]
page_end = page_boundary["end_offset"]
page_text_length = page_end - page_start
# Calculate what portion of the chunk appears on this page
chunk_start_on_page = max(start_offset, page_start)
chunk_end_on_page = min(end_offset, page_end)
# Extract just the text that appears on this page
page_relative_text = full_text[chunk_start_on_page:chunk_end_on_page]
chunks_by_page[page_num].append(
(chunk_index, chunk_page_info, page_relative_text, page_text_length)
)
logger.debug(
f"Chunks distributed across {len(chunks_by_page)} unique pages"
)
# OPTIMIZATION: Render each page ONCE, then draw highlights using PIL
# This avoids expensive page.get_pixmap() calls per chunk
from io import BytesIO
from PIL import Image, ImageDraw
# PIL color for bounding box (RGB tuple)
rgb = PDFHighlighter.COLORS.get(color, PDFHighlighter.COLORS["yellow"])
pil_color = tuple(int(c * 255) for c in rgb)
fill_color = (255, 255, 178, 38) # Light yellow with alpha
for page_num, page_chunks in chunks_by_page.items():
page = doc[page_num - 1]
# Render page ONCE to get base image (most expensive operation)
mat = pymupdf.Matrix(zoom, zoom)
base_pix = page.get_pixmap(matrix=mat, alpha=False)
base_png = base_pix.tobytes("png")
# Convert to PIL Image for fast highlight drawing
base_image = Image.open(BytesIO(base_png)).convert("RGBA")
page_rect = page.rect
logger.debug(
f"Page {page_num}: rendered once, processing {len(page_chunks)} chunks"
)
for (
chunk_index,
chunk_page_info,
chunk_text,
page_text_length,
) in page_chunks:
try:
# Find chunk bounding box using text search
bbox = PDFHighlighter._find_chunk_bbox(
page,
chunk_text,
chunk_page_info["page_relative_start"],
chunk_page_info["page_relative_end"],
page_text_length,
)
if bbox is None:
logger.warning(f"Chunk {chunk_index}: could not find bbox")
continue
# Copy base image for this chunk
chunk_image = base_image.copy()
# Scale bbox coordinates to pixmap coordinates
scale_x = base_pix.width / page_rect.width
scale_y = base_pix.height / page_rect.height
pil_bbox = (
int(bbox[0] * scale_x),
int(bbox[1] * scale_y),
int(bbox[2] * scale_x),
int(bbox[3] * scale_y),
)
# Create transparent overlay for fill (proper alpha blending)
overlay = Image.new("RGBA", chunk_image.size, (0, 0, 0, 0))
overlay_draw = ImageDraw.Draw(overlay)
overlay_draw.rectangle(pil_bbox, fill=fill_color)
# Alpha composite the overlay onto the chunk image
chunk_image = Image.alpha_composite(chunk_image, overlay)
# Draw border on top (solid, not transparent)
border_draw = ImageDraw.Draw(chunk_image)
border_draw.rectangle(pil_bbox, outline=pil_color, width=3)
# Convert back to PNG bytes
output = BytesIO()
chunk_image.convert("RGB").save(output, format="PNG")
png_bytes = output.getvalue()
results[chunk_index] = (png_bytes, page_num, 1)
logger.debug(
f"Chunk {chunk_index}: {len(png_bytes):,} bytes, "
f"page {page_num}, bbox {pil_bbox}"
)
except Exception as e:
logger.error(f"Chunk {chunk_index}: error - {e}")
continue
doc.close()
logger.info(
f"Batch highlighted {len(results)}/{len(chunks)} chunks successfully"
)
return results
except Exception as e:
logger.error(f"Error in batch highlighting: {e}", exc_info=True)
return results
finally:
# Clean up temp directory
if temp_pdf_path and temp_pdf_path.parent.exists():
try:
shutil.rmtree(temp_pdf_path.parent)
except Exception as e:
logger.warning(f"Failed to clean up temp dir: {e}")
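The batch path draws on the rendered image, so the bounding box has to be converted from PDF points to pixels. A minimal sketch of that conversion as a standalone helper follows. `scale_bbox` is a hypothetical name, and the example page size (595 x 842 pt, roughly A4) is an assumption for illustration.

```python
def scale_bbox(
    bbox: tuple[float, float, float, float],
    page_size: tuple[float, float],
    image_size: tuple[int, int],
) -> tuple[int, int, int, int]:
    """Convert a (x0, y0, x1, y1) box in PDF points to pixel coordinates."""
    scale_x = image_size[0] / page_size[0]
    scale_y = image_size[1] / page_size[1]
    return (
        int(bbox[0] * scale_x),
        int(bbox[1] * scale_y),
        int(bbox[2] * scale_x),
        int(bbox[3] * scale_y),
    )


# With zoom=2.0 a 595x842 pt page renders to 1190x1684 px, so coordinates double.
pixel_box = scale_bbox((30, 100, 565, 200), (595, 842), (1190, 1684))
```

Deriving the scale from the rendered image size rather than from `zoom` directly keeps the conversion correct even if the renderer rounds the image dimensions.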
+28 -8
@@ -9,6 +9,7 @@ from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.embedding import get_embedding_service
from nextcloud_mcp_server.observability.metrics import record_qdrant_operation
from nextcloud_mcp_server.search.algorithms import SearchAlgorithm, SearchResult
from nextcloud_mcp_server.vector.placeholder import get_placeholder_filter
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
logger = logging.getLogger(__name__)
@@ -50,6 +51,9 @@ class SemanticSearchAlgorithm(SearchAlgorithm):
Returns unverified results from Qdrant. Access verification should be
performed separately at the final output stage using verify_search_results().
Deduplicates by (doc_id, doc_type, chunk_start_offset, chunk_end_offset)
to show multiple chunks from the same document while avoiding duplicate chunks.
Args:
query: Natural language search query
user_id: User ID for filtering
@@ -74,16 +78,19 @@ class SemanticSearchAlgorithm(SearchAlgorithm):
# Generate embedding for query
embedding_service = get_embedding_service()
query_embedding = await embedding_service.embed(query)
# Store for reuse by callers (e.g., viz_routes PCA visualization)
self.query_embedding = query_embedding
logger.debug(
f"Generated embedding for query (dimension={len(query_embedding)})"
)
# Build Qdrant filter
filter_conditions = [
get_placeholder_filter(), # Always exclude placeholders from user-facing queries
FieldCondition(
key="user_id",
match=MatchValue(value=user_id),
)
),
]
# Add doc_type filter if specified
@@ -101,6 +108,7 @@ class SemanticSearchAlgorithm(SearchAlgorithm):
search_response = await qdrant_client.query_points(
collection_name=settings.get_collection_name(),
query=query_embedding,
using="dense", # Use named dense vector (BM25 hybrid collections)
query_filter=Filter(must=filter_conditions),
limit=limit * 2, # Get extra for deduplication
score_threshold=score_threshold,
@@ -122,20 +130,26 @@ class SemanticSearchAlgorithm(SearchAlgorithm):
top_scores = [p.score for p in search_response.points[:3]]
logger.debug(f"Top 3 similarity scores: {top_scores}")
# Deduplicate by (doc_id, doc_type) - multiple chunks per document
seen_docs = set()
# Deduplicate by (doc_id, doc_type, chunk_start, chunk_end)
# This allows multiple chunks from same doc, but removes duplicate chunks
seen_chunks = set()
results = []
for result in search_response.points:
doc_id = int(result.payload["doc_id"])
if result.payload is None:
continue
# doc_id can be int (notes) or str (files - file paths)
doc_id = result.payload["doc_id"]
doc_type = result.payload.get("doc_type", "note")
doc_key = (doc_id, doc_type)
chunk_start = result.payload.get("chunk_start_offset")
chunk_end = result.payload.get("chunk_end_offset")
chunk_key = (doc_id, doc_type, chunk_start, chunk_end)
# Skip if we've already seen this document
if doc_key in seen_docs:
# Skip if we've already seen this exact chunk
if chunk_key in seen_chunks:
continue
seen_docs.add(doc_key)
seen_chunks.add(chunk_key)
# Return unverified results (verification happens at output stage)
results.append(
@@ -149,6 +163,12 @@ class SemanticSearchAlgorithm(SearchAlgorithm):
"chunk_index": result.payload.get("chunk_index"),
"total_chunks": result.payload.get("total_chunks"),
},
chunk_start_offset=result.payload.get("chunk_start_offset"),
chunk_end_offset=result.payload.get("chunk_end_offset"),
page_number=result.payload.get("page_number"),
chunk_index=result.payload.get("chunk_index", 0),
total_chunks=result.payload.get("total_chunks", 1),
point_id=str(result.id), # Qdrant point ID for batch retrieval
)
)
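The change from document-level to chunk-level deduplication in this file can be sketched in isolation. The helper below is hypothetical (`dedupe_chunks` is not part of the diff) and works on plain payload dicts rather than Qdrant points; the key fields and the `"note"` default mirror the hunk above.

```python
from typing import Optional


def dedupe_chunks(payloads: list[Optional[dict]]) -> list[dict]:
    """Keep the first occurrence of each (doc_id, doc_type, start, end) chunk."""
    seen: set[tuple] = set()
    unique: list[dict] = []
    for payload in payloads:
        if payload is None:
            continue  # Points without a payload cannot be keyed.
        key = (
            payload["doc_id"],  # int for notes, str (file path) for files
            payload.get("doc_type", "note"),
            payload.get("chunk_start_offset"),
            payload.get("chunk_end_offset"),
        )
        if key in seen:
            continue
        seen.add(key)
        unique.append(payload)
    return unique
```

Two chunks from the same note survive because their offsets differ, while an exact repeat of a chunk is dropped.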
+2
@@ -2,6 +2,7 @@ from .calendar import configure_calendar_tools
from .contacts import configure_contacts_tools
from .cookbook import configure_cookbook_tools
from .deck import configure_deck_tools
from .news import configure_news_tools
from .notes import configure_notes_tools
from .semantic import configure_semantic_tools
from .sharing import configure_sharing_tools
@@ -13,6 +14,7 @@ __all__ = [
"configure_contacts_tools",
"configure_cookbook_tools",
"configure_deck_tools",
"configure_news_tools",
"configure_notes_tools",
"configure_semantic_tools",
"configure_sharing_tools",
+360
@@ -0,0 +1,360 @@
"""MCP tools for Nextcloud News app."""
import logging
from httpx import HTTPStatusError, RequestError
from mcp.server.fastmcp import Context, FastMCP
from mcp.shared.exceptions import McpError
from mcp.types import ErrorData
from nextcloud_mcp_server.auth import require_scopes
from nextcloud_mcp_server.client.news import NewsItemType
from nextcloud_mcp_server.context import get_client
from nextcloud_mcp_server.models.news import (
FeedHealthResponse,
GetItemResponse,
GetStatusResponse,
ListFeedsResponse,
ListFoldersResponse,
ListItemsResponse,
NewsFeed,
NewsFolder,
NewsItem,
NewsItemSummary,
)
from nextcloud_mcp_server.observability.metrics import instrument_tool
logger = logging.getLogger(__name__)
def configure_news_tools(mcp: FastMCP):
"""Configure News app MCP tools."""
@mcp.tool()
@require_scopes("news:read")
@instrument_tool
async def nc_news_list_folders(ctx: Context) -> ListFoldersResponse:
"""List all News folders (requires news:read scope)."""
client = await get_client(ctx)
try:
folders_data = await client.news.get_folders()
folders = [NewsFolder(**f) for f in folders_data]
return ListFoldersResponse(results=folders, total_count=len(folders))
except RequestError as e:
raise McpError(
ErrorData(code=-1, message=f"Network error listing folders: {str(e)}")
)
except HTTPStatusError as e:
raise McpError(
ErrorData(
code=-1,
message=f"Failed to list folders: {e.response.status_code}",
)
)
@mcp.tool()
@require_scopes("news:read")
@instrument_tool
async def nc_news_list_feeds(ctx: Context) -> ListFeedsResponse:
"""List all News feeds with metadata (requires news:read scope).
Returns feeds with unread counts, error status, and overall starred count.
"""
client = await get_client(ctx)
try:
data = await client.news.get_feeds()
feeds = [NewsFeed(**f) for f in data.get("feeds", [])]
return ListFeedsResponse(
results=feeds,
starred_count=data.get("starredCount", 0),
newest_item_id=data.get("newestItemId"),
total_count=len(feeds),
)
except RequestError as e:
raise McpError(
ErrorData(code=-1, message=f"Network error listing feeds: {str(e)}")
)
except HTTPStatusError as e:
raise McpError(
ErrorData(
code=-1,
message=f"Failed to list feeds: {e.response.status_code}",
)
)
@mcp.tool()
@require_scopes("news:read")
@instrument_tool
async def nc_news_list_items(
ctx: Context,
feed_id: int | None = None,
folder_id: int | None = None,
starred_only: bool = False,
unread_only: bool = False,
limit: int = 50,
offset: int = 0,
) -> ListItemsResponse:
"""List News items (articles) with optional filtering (requires news:read scope).
Args:
feed_id: Filter by specific feed ID
folder_id: Filter by specific folder ID
starred_only: Return only starred items
unread_only: Return only unread items
limit: Maximum number of items to return (default 50, -1 for all)
offset: Item ID to start after (for pagination)
Returns:
ListItemsResponse with items, count, and pagination info
"""
client = await get_client(ctx)
# Determine item type filter
type_ = NewsItemType.ALL
id_ = 0
if starred_only:
type_ = NewsItemType.STARRED
elif feed_id is not None:
type_ = NewsItemType.FEED
id_ = feed_id
elif folder_id is not None:
type_ = NewsItemType.FOLDER
id_ = folder_id
try:
items_data = await client.news.get_items(
batch_size=limit,
offset=offset,
type_=type_,
id_=id_,
get_read=not unread_only,
)
items = [NewsItemSummary(**i) for i in items_data]
# Determine pagination info
oldest_id = min((i.id for i in items), default=None)
has_more = len(items) == limit and limit > 0
return ListItemsResponse(
results=items,
total_count=len(items),
has_more=has_more,
oldest_id=oldest_id,
)
except RequestError as e:
raise McpError(
ErrorData(code=-1, message=f"Network error listing items: {str(e)}")
)
except HTTPStatusError as e:
raise McpError(
ErrorData(
code=-1,
message=f"Failed to list items: {e.response.status_code}",
)
)
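A caller could page through `nc_news_list_items` by feeding `oldest_id` back in as the next `offset`. The sketch below assumes that items come back newest first and that `offset` returns only items with a lower ID; both are assumptions about the News API, not something this diff confirms. `fetch_all` and `make_fake_fetcher` are hypothetical names, and the fake fetcher stands in for the real tool call.

```python
import asyncio


def make_fake_fetcher(ids: list[int]):
    """Build a stand-in for the tool call; ids must be sorted newest first."""

    async def fetch_page(limit: int, offset: int) -> dict:
        pool = [i for i in ids if offset == 0 or i < offset]
        results = pool[:limit]
        return {
            "results": results,
            "has_more": len(results) == limit and limit > 0,
            "oldest_id": min(results, default=None),
        }

    return fetch_page


async def fetch_all(fetch_page, limit: int = 50) -> list[int]:
    """Collect every item by following oldest_id until has_more is False."""
    items: list[int] = []
    offset = 0
    while True:
        page = await fetch_page(limit=limit, offset=offset)
        items.extend(page["results"])
        if not page["has_more"] or page["oldest_id"] is None:
            break
        offset = page["oldest_id"]
    return items


all_ids = asyncio.run(fetch_all(make_fake_fetcher([5, 4, 3, 2, 1]), limit=2))
```

Because `has_more` is inferred from `len(items) == limit`, a result set whose size is an exact multiple of `limit` costs one extra, empty request before the loop stops.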
@mcp.tool()
@require_scopes("news:read")
@instrument_tool
async def nc_news_get_item(item_id: int, ctx: Context) -> GetItemResponse:
"""Get a specific News item by ID with full content (requires news:read scope).
Args:
item_id: Item ID
Returns:
GetItemResponse with full item details including HTML body
"""
client = await get_client(ctx)
try:
item_data = await client.news.get_item(item_id)
item = NewsItem(**item_data)
return GetItemResponse(item=item)
except ValueError as e:
raise McpError(ErrorData(code=-1, message=str(e)))
except RequestError as e:
raise McpError(
ErrorData(
code=-1, message=f"Network error getting item {item_id}: {str(e)}"
)
)
except HTTPStatusError as e:
if e.response.status_code == 404:
raise McpError(ErrorData(code=-1, message=f"Item {item_id} not found"))
raise McpError(
ErrorData(
code=-1,
message=f"Failed to get item {item_id}: {e.response.status_code}",
)
)
@mcp.tool()
@require_scopes("news:read")
@instrument_tool
async def nc_news_get_starred_items(
ctx: Context, limit: int = 50, offset: int = 0
) -> ListItemsResponse:
"""Get starred (favorited) News items (requires news:read scope).
Convenience method for retrieving user's starred articles.
Args:
limit: Maximum number of items to return (default 50, -1 for all)
offset: Item ID to start after (for pagination)
Returns:
ListItemsResponse with starred items
"""
client = await get_client(ctx)
try:
items_data = await client.news.get_items(
batch_size=limit,
offset=offset,
type_=NewsItemType.STARRED,
get_read=True, # Include read starred items
)
items = [NewsItemSummary(**i) for i in items_data]
oldest_id = min((i.id for i in items), default=None)
has_more = len(items) == limit and limit > 0
return ListItemsResponse(
results=items,
total_count=len(items),
has_more=has_more,
oldest_id=oldest_id,
)
except RequestError as e:
raise McpError(
ErrorData(
code=-1, message=f"Network error getting starred items: {str(e)}"
)
)
except HTTPStatusError as e:
raise McpError(
ErrorData(
code=-1,
message=f"Failed to get starred items: {e.response.status_code}",
)
)
@mcp.tool()
@require_scopes("news:read")
@instrument_tool
async def nc_news_get_unread_items(
ctx: Context, limit: int = 50, offset: int = 0
) -> ListItemsResponse:
"""Get unread News items (requires news:read scope).
Convenience method for retrieving unread articles across all feeds.
Args:
limit: Maximum number of items to return (default 50, -1 for all)
offset: Item ID to start after (for pagination)
Returns:
ListItemsResponse with unread items
"""
client = await get_client(ctx)
try:
items_data = await client.news.get_items(
batch_size=limit,
offset=offset,
type_=NewsItemType.ALL,
get_read=False, # Only unread items
)
items = [NewsItemSummary(**i) for i in items_data]
oldest_id = min((i.id for i in items), default=None)
has_more = len(items) == limit and limit > 0
return ListItemsResponse(
results=items,
total_count=len(items),
has_more=has_more,
oldest_id=oldest_id,
)
except RequestError as e:
raise McpError(
ErrorData(
code=-1, message=f"Network error getting unread items: {str(e)}"
)
)
except HTTPStatusError as e:
raise McpError(
ErrorData(
code=-1,
message=f"Failed to get unread items: {e.response.status_code}",
)
)
@mcp.tool()
@require_scopes("news:read")
@instrument_tool
async def nc_news_get_feed_health(feed_id: int, ctx: Context) -> FeedHealthResponse:
"""Get health status for a specific feed (requires news:read scope).
Returns error count and last error message if the feed has update issues.
Args:
feed_id: Feed ID to check
Returns:
FeedHealthResponse with error status
"""
client = await get_client(ctx)
try:
data = await client.news.get_feeds()
for feed_data in data.get("feeds", []):
if feed_data.get("id") == feed_id:
feed = NewsFeed(**feed_data)
return FeedHealthResponse(
feed_id=feed.id,
title=feed.title,
url=feed.url,
has_errors=feed.has_errors,
error_count=feed.update_error_count,
last_error=feed.last_update_error,
)
raise McpError(ErrorData(code=-1, message=f"Feed {feed_id} not found"))
except RequestError as e:
raise McpError(
ErrorData(
code=-1,
message=f"Network error getting feed health: {str(e)}",
)
)
except HTTPStatusError as e:
raise McpError(
ErrorData(
code=-1,
message=f"Failed to get feed health: {e.response.status_code}",
)
)
@mcp.tool()
@require_scopes("news:read")
@instrument_tool
async def nc_news_get_status(ctx: Context) -> GetStatusResponse:
"""Get News app status and version (requires news:read scope).
Returns version information and any configuration warnings.
"""
client = await get_client(ctx)
try:
status_data = await client.news.get_status()
return GetStatusResponse(
version=status_data.get("version", "unknown"),
warnings=status_data.get("warnings", {}),
)
except RequestError as e:
raise McpError(
ErrorData(code=-1, message=f"Network error getting status: {str(e)}")
)
except HTTPStatusError as e:
raise McpError(
ErrorData(
code=-1,
message=f"Failed to get status: {e.response.status_code}",
)
)
+166 -100
@@ -1,7 +1,6 @@
"""Semantic search MCP tools using vector database."""
import logging
from typing import Literal
import anyio
from httpx import RequestError
@@ -26,12 +25,8 @@ from nextcloud_mcp_server.models.semantic import (
from nextcloud_mcp_server.observability.metrics import (
instrument_tool,
)
from nextcloud_mcp_server.search import (
FuzzySearchAlgorithm,
HybridSearchAlgorithm,
KeywordSearchAlgorithm,
SemanticSearchAlgorithm,
)
from nextcloud_mcp_server.search.bm25_hybrid import BM25HybridSearchAlgorithm
from nextcloud_mcp_server.search.context import get_chunk_with_context
logger = logging.getLogger(__name__)
@@ -47,36 +42,38 @@ def configure_semantic_tools(mcp: FastMCP):
ctx: Context,
limit: int = 10,
doc_types: list[str] | None = None,
score_threshold: float = 0.7,
algorithm: Literal["semantic", "keyword", "fuzzy", "hybrid"] = "hybrid",
semantic_weight: float = 0.5,
keyword_weight: float = 0.3,
fuzzy_weight: float = 0.2,
score_threshold: float = 0.0,
fusion: str = "rrf",
include_context: bool = False,
context_chars: int = 300,
) -> SemanticSearchResponse:
"""
Search Nextcloud content using configurable algorithms with cross-app support.
Search Nextcloud content using BM25 hybrid search with cross-app support.
Supports multiple search algorithms with client-configurable weighting:
- semantic: Vector similarity search (requires VECTOR_SYNC_ENABLED=true)
- keyword: Token-based matching (title matches weighted 3x)
- fuzzy: Character overlap matching (typo-tolerant)
- hybrid: Combines all algorithms using Reciprocal Rank Fusion (default)
Uses Qdrant's native hybrid search combining:
- Dense semantic vectors: For conceptual similarity and natural language queries
- BM25 sparse vectors: For precise keyword matching, acronyms, and specific terms
Document types are queried from the vector database to determine what's
actually indexed. Currently only "note" documents are fully supported.
Results are automatically fused using the selected fusion algorithm in the
database for optimal relevance. This provides the best of both semantic
understanding and keyword precision.
Requires VECTOR_SYNC_ENABLED=true. Currently only "note" documents are
fully supported for indexing.
Args:
query: Natural language search query
query: Natural language or keyword search query
limit: Maximum number of results to return (default: 10)
doc_types: Document types to search (e.g., ["note", "file"]). None = search all indexed types (default)
score_threshold: Minimum similarity score for semantic/hybrid (0-1, default: 0.7)
algorithm: Search algorithm to use (default: "hybrid")
semantic_weight: Weight for semantic results in hybrid mode (default: 0.5)
keyword_weight: Weight for keyword results in hybrid mode (default: 0.3)
fuzzy_weight: Weight for fuzzy results in hybrid mode (default: 0.2)
score_threshold: Minimum fusion score (0-1, default: 0.0)
fusion: Fusion algorithm: "rrf" (Reciprocal Rank Fusion, default) or "dbsf" (Distribution-Based Score Fusion)
RRF: Good general-purpose fusion using reciprocal ranks
DBSF: Uses distribution-based normalization, may better balance different score ranges
include_context: Whether to expand results with surrounding context (default: False)
context_chars: Number of characters to include before/after matched chunk (default: 300)
Returns:
SemanticSearchResponse with matching documents and relevance scores
SemanticSearchResponse with matching documents ranked by fusion scores
"""
from nextcloud_mcp_server.config import get_settings
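For readers unfamiliar with the `fusion="rrf"` option described in the docstring, Reciprocal Rank Fusion can be illustrated in a few lines. This is a textbook sketch and not Qdrant's implementation: the fusion runs inside the database, and the constant `k=60` is a common choice in the literature that may differ from the value Qdrant uses. `rrf_fuse` is a hypothetical name.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists by summing 1 / (k + rank) for each appearance."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


dense_ranking = ["a", "b", "c"]   # e.g. order from the dense vector search
sparse_ranking = ["b", "d", "a"]  # e.g. order from the BM25 sparse search
fused = rrf_fuse([dense_ranking, sparse_ranking])
```

RRF only looks at ranks, so it is insensitive to the different score scales of dense and sparse retrieval. It also means fused scores are small (well below 1), which is consistent with the default `score_threshold` dropping from 0.7 to 0.0 in this change.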
@@ -85,42 +82,24 @@ def configure_semantic_tools(mcp: FastMCP):
username = client.username
logger.info(
f"Search: query='{query}', user={username}, algorithm={algorithm}, "
f"limit={limit}, score_threshold={score_threshold}"
f"BM25 hybrid search: query='{query}', user={username}, "
f"limit={limit}, score_threshold={score_threshold}, fusion={fusion}"
)
# Check that vector sync is enabled
if not settings.vector_sync_enabled:
raise McpError(
ErrorData(
code=-1,
message="BM25 hybrid search requires VECTOR_SYNC_ENABLED=true",
)
)
try:
# Create appropriate algorithm instance
if algorithm == "semantic":
if not settings.vector_sync_enabled:
raise McpError(
ErrorData(
code=-1,
message="Semantic search requires VECTOR_SYNC_ENABLED=true",
)
)
search_algo = SemanticSearchAlgorithm(score_threshold=score_threshold)
elif algorithm == "keyword":
search_algo = KeywordSearchAlgorithm()
elif algorithm == "fuzzy":
search_algo = FuzzySearchAlgorithm()
elif algorithm == "hybrid":
if semantic_weight > 0 and not settings.vector_sync_enabled:
raise McpError(
ErrorData(
code=-1,
message="Hybrid search with semantic component requires VECTOR_SYNC_ENABLED=true",
)
)
search_algo = HybridSearchAlgorithm(
semantic_weight=semantic_weight,
keyword_weight=keyword_weight,
fuzzy_weight=fuzzy_weight,
)
else:
raise McpError(
ErrorData(code=-1, message=f"Unknown algorithm: {algorithm}")
)
# Create BM25 hybrid search algorithm with specified fusion
search_algo = BM25HybridSearchAlgorithm(
score_threshold=score_threshold, fusion=fusion
)
# Execute search across requested document types
# If doc_types is None, search all indexed types (cross-app search)
@@ -154,18 +133,16 @@ def configure_semantic_tools(mcp: FastMCP):
# Sort combined results by score
all_results.sort(key=lambda r: r.score, reverse=True)
# Deduplicate results (hybrid search may return same doc from dense + sparse)
# Qdrant already filters by user_id for multi-tenant isolation
# Sampling tool will verify access when fetching full content
seen = set()
unique_results = []
for result in all_results:
key = (result.id, result.doc_type)
if key not in seen:
seen.add(key)
unique_results.append(result)
search_results = unique_results[:limit] # Final limit after deduplication
# Note: BM25HybridSearchAlgorithm already deduplicates at chunk level
# (doc_id, doc_type, chunk_start, chunk_end), which allows multiple
# chunks from the same document while preventing duplicate chunks.
# No additional deduplication needed here - multiple chunks per document
# are valuable for RAG contexts.
# Qdrant already filters by user_id for multi-tenant isolation.
# Sampling tool will verify access when fetching full content.
search_results = all_results[
:limit
] # Final limit after chunk-level dedup in algorithm
# Convert SearchResult objects to SemanticSearchResult for response
results = []
@@ -184,16 +161,108 @@ def configure_semantic_tools(mcp: FastMCP):
total_chunks=r.metadata.get("total_chunks", 1)
if r.metadata
else 1,
chunk_start_offset=r.chunk_start_offset,
chunk_end_offset=r.chunk_end_offset,
page_number=r.page_number,
)
)
logger.info(f"Returning {len(results)} results from {algorithm} search")
# Expand results with surrounding context if requested
if include_context and results:
logger.info(
f"Expanding {len(results)} results with context "
f"(context_chars={context_chars})"
)
# Fetch context for all results in parallel
# Limit concurrent requests to prevent connection pool exhaustion
max_concurrent = 20
semaphore = anyio.Semaphore(max_concurrent)
expanded_results = [None] * len(results)
async def fetch_context(index: int, result: SemanticSearchResult):
"""Fetch context for a single result (parallel with semaphore)."""
async with semaphore:
# Only expand if we have valid chunk offsets
if (
result.chunk_start_offset is None
or result.chunk_end_offset is None
):
# Keep result as-is without context expansion
expanded_results[index] = result
return
try:
chunk_context = await get_chunk_with_context(
nc_client=client,
user_id=username,
doc_id=result.id,
doc_type=result.doc_type,
chunk_start=result.chunk_start_offset,
chunk_end=result.chunk_end_offset,
page_number=result.page_number,
chunk_index=result.chunk_index,
total_chunks=result.total_chunks,
context_chars=context_chars,
)
if chunk_context:
# Create new result with context fields populated
expanded_results[index] = SemanticSearchResult(
id=result.id,
doc_type=result.doc_type,
title=result.title,
category=result.category,
excerpt=result.excerpt,
score=result.score,
chunk_index=result.chunk_index,
total_chunks=result.total_chunks,
chunk_start_offset=result.chunk_start_offset,
chunk_end_offset=result.chunk_end_offset,
page_number=result.page_number,
# Context expansion fields
has_context_expansion=True,
marked_text=chunk_context.marked_text,
before_context=chunk_context.before_context,
after_context=chunk_context.after_context,
has_before_truncation=chunk_context.has_before_truncation,
has_after_truncation=chunk_context.has_after_truncation,
)
logger.debug(
f"Expanded context for {result.doc_type} {result.id}"
)
else:
# Context expansion failed, keep original result
expanded_results[index] = result
logger.debug(
f"Failed to expand context for {result.doc_type} {result.id}, "
"keeping original result"
)
except Exception as e:
# Context expansion failed, keep original result
expanded_results[index] = result
logger.warning(
f"Error expanding context for {result.doc_type} {result.id}: {e}"
)
# Run all context fetches in parallel using anyio task group
async with anyio.create_task_group() as tg:
for idx, result in enumerate(results):
tg.start_soon(fetch_context, idx, result)
# Replace results with expanded versions
results = [r for r in expanded_results if r is not None]
logger.info(
f"Context expansion completed: {len(results)} results with context"
)
logger.info(f"Returning {len(results)} results from BM25 hybrid search")
return SemanticSearchResponse(
results=results,
query=query,
total_found=len(results),
search_method=algorithm,
search_method=f"bm25_hybrid_{fusion}",
)
except ValueError as e:
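The context-expansion step in this hunk bounds concurrency with a semaphore and writes each result back to its original index, so output order matches input order and a failed lookup falls back to the unexpanded result. A minimal sketch of that pattern follows. It assumes `asyncio` as a stand-in for the anyio task group used in the diff, and `fetch_all` and `fake_expand` are hypothetical names introduced only for illustration.

```python
import asyncio


async def fetch_all(items, fetch_one, max_concurrent=20):
    """Run fetch_one over items concurrently, keeping input order.

    Hypothetical helper: asyncio stands in for the anyio task group
    used in the diff, and fetch_one is any coroutine function.
    """
    semaphore = asyncio.Semaphore(max_concurrent)
    results = [None] * len(items)

    async def worker(index, item):
        async with semaphore:  # at most max_concurrent fetches in flight
            try:
                results[index] = await fetch_one(item)
            except Exception:
                # Mirror the fallback in the diff: keep the original item
                results[index] = item

    await asyncio.gather(*(worker(i, item) for i, item in enumerate(items)))
    return results


async def fake_expand(text):
    # Stand-in for a network call; fails for one input to show the fallback
    await asyncio.sleep(0)
    if text == "bad":
        raise RuntimeError("lookup failed")
    return text.upper()


expanded = asyncio.run(fetch_all(["a", "bad", "c"], fake_expand, max_concurrent=2))
```

Writing into a pre-sized list by index avoids sorting afterwards, and catching the exception inside the worker keeps one failure from cancelling its siblings.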
@@ -225,6 +294,9 @@ def configure_semantic_tools(mcp: FastMCP):
limit: int = 5,
score_threshold: float = 0.7,
max_answer_tokens: int = 500,
fusion: str = "rrf",
include_context: bool = False,
context_chars: int = 300,
) -> SamplingSearchResponse:
"""
Semantic search with LLM-generated answer using MCP sampling.
@@ -249,6 +321,9 @@ def configure_semantic_tools(mcp: FastMCP):
limit: Maximum number of documents to retrieve (default: 5)
score_threshold: Minimum similarity score 0-1 (default: 0.7)
max_answer_tokens: Maximum tokens for generated answer (default: 500)
fusion: Fusion algorithm: "rrf" (Reciprocal Rank Fusion, default) or "dbsf" (Distribution-Based Score Fusion)
include_context: Whether to expand results with surrounding context (default: False)
context_chars: Number of characters to include before/after matched chunk (default: 300)
Returns:
SamplingSearchResponse containing:
@@ -260,27 +335,6 @@ def configure_semantic_tools(mcp: FastMCP):
Note: Requires MCP client to support sampling. If sampling is unavailable,
the tool gracefully degrades to returning documents with an explanation.
The client may prompt the user to approve the sampling request.
Examples:
>>> # Query about objectives across multiple apps
>>> result = await nc_semantic_search_answer(
... query="What are my Q1 2025 project goals?",
... ctx=ctx
... )
>>> print(result.generated_answer)
"Based on Document 1 (note: Project Kickoff), Document 2 (calendar event:
Q1 Planning Meeting), and Document 3 (deck card: Implement semantic search),
your main goals are: 1) Improve semantic search accuracy by 20%,
2) Deploy new embedding model, 3) Reduce indexing latency..."
>>> # Query about appointments
>>> result = await nc_semantic_search_answer(
... query="When is my next dentist appointment?",
... ctx=ctx,
... limit=10
... )
>>> len(result.sources) # Calendar events and related notes
3
"""
# 1. Retrieve relevant documents via existing semantic search
search_response = await nc_semantic_search(
@@ -288,6 +342,9 @@ def configure_semantic_tools(mcp: FastMCP):
ctx=ctx,
limit=limit,
score_threshold=score_threshold,
fusion=fusion,
include_context=include_context,
context_chars=context_chars,
)
# 2. Handle no results case - don't waste a sampling call
@@ -442,9 +499,11 @@ def configure_semantic_tools(mcp: FastMCP):
)
# 6. Request LLM completion via MCP sampling with timeout
# Note: 5 minute timeout to accommodate slower local LLMs (e.g., Ollama)
sampling_timeout_seconds = 300
try:
with anyio.fail_after(30):
with anyio.fail_after(sampling_timeout_seconds):
sampling_result = await ctx.session.create_message(
messages=[
SamplingMessage(
@@ -491,14 +550,14 @@ def configure_semantic_tools(mcp: FastMCP):
except TimeoutError:
logger.warning(
f"Sampling request timed out after 30 seconds for query: '{query}', "
f"Sampling request timed out after {sampling_timeout_seconds} seconds for query: '{query}', "
f"returning search results only"
)
return SamplingSearchResponse(
query=query,
generated_answer=(
f"[Sampling request timed out]\n\n"
f"The answer generation took too long (>30s). "
f"The answer generation took too long (>{sampling_timeout_seconds}s). "
f"Found {len(accessible_results)} relevant documents. "
f"Please review the sources below or try a simpler query."
),
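The timeout handling in this hunk wraps the sampling call in a deadline and returns the search results with an explanatory message when the deadline passes. The sketch below shows the same shape under stated assumptions: `asyncio.wait_for` stands in for `anyio.fail_after`, and `answer_with_timeout`, `slow_llm`, and `fast_llm` are hypothetical names, not part of the server.

```python
import asyncio


async def answer_with_timeout(generate, timeout_seconds, fallback):
    """Await generate(), returning fallback if it exceeds the timeout.

    Hypothetical helper; asyncio.wait_for stands in for anyio.fail_after.
    """
    try:
        return await asyncio.wait_for(generate(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return fallback


async def slow_llm():
    await asyncio.sleep(1)  # pretend the model needs a full second
    return "generated answer"


async def fast_llm():
    await asyncio.sleep(0)
    return "generated answer"


timed_out = asyncio.run(answer_with_timeout(slow_llm, 0.01, "[timed out]"))
completed = asyncio.run(answer_with_timeout(fast_llm, 1.0, "[timed out]"))
```

Keeping the timeout in a single named value, as the diff does with `sampling_timeout_seconds`, means the log line and the user-facing message cannot drift from the actual limit.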
@@ -618,15 +677,22 @@ def configure_semantic_tools(mcp: FastMCP):
# Get Qdrant client and query indexed count
indexed_count = 0
try:
from qdrant_client.models import Filter
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.vector.placeholder import (
get_placeholder_filter,
)
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
settings = get_settings()
qdrant_client = await get_qdrant_client()
# Count documents in collection
# Count documents in collection, excluding placeholders
# Placeholders are zero-vector points used to track processing state
count_result = await qdrant_client.count(
collection_name=settings.get_collection_name()
collection_name=settings.get_collection_name(),
count_filter=Filter(must=[get_placeholder_filter()]),
)
indexed_count = count_result.count
-14
@@ -64,20 +64,6 @@ def configure_webdav_tools(mcp: FastMCP):
- Text files are decoded to UTF-8
- Documents (PDF, DOCX, etc.) are parsed and text is extracted
- Other binary files are base64 encoded
Examples:
# Read a text file
result = await nc_webdav_read_file("Documents/readme.txt")
logger.info(result['content']) # Decoded text content
# Read a PDF document (automatically parsed)
result = await nc_webdav_read_file("Documents/report.pdf")
logger.info(result['content']) # Extracted text from PDF
logger.info(result['parsing_metadata']) # Document parsing info
# Read a binary file
result = await nc_webdav_read_file("Images/photo.jpg")
logger.info(result['encoding']) # 'base64'
"""
client = await get_client(ctx)
content, content_type = await client.webdav.read_file(path)
+60
@@ -0,0 +1,60 @@
"""Smithery-specific entrypoint for stateless deployment.
ADR-016: This entrypoint is used when deploying on Smithery's hosting platform.
It configures the server for stateless operation with per-session authentication.
Features disabled in Smithery mode:
- Vector sync / semantic search (no persistent storage)
- Admin UI at /app (no webhooks, no vector viz)
- OAuth provisioning tools (no token storage)
Features enabled:
- Core Nextcloud tools (notes, calendar, contacts, files, deck, tables, cookbook)
- Per-session app password authentication via Smithery configSchema
- Health check endpoints (/health/live, /health/ready)
"""
import logging
import os
import uvicorn
from nextcloud_mcp_server.config import setup_logging
logger = logging.getLogger(__name__)
def main():
"""Start the MCP server in Smithery stateless mode."""
# Setup logging first
setup_logging()
# Force stateless mode environment variables
os.environ["SMITHERY_DEPLOYMENT"] = "true"
os.environ["VECTOR_SYNC_ENABLED"] = "false"
logger.info("Starting Nextcloud MCP Server in Smithery stateless mode")
# Import app after setting environment variables
from nextcloud_mcp_server.app import get_app
# Create the app with streamable-http transport (required for Smithery)
app = get_app(transport="streamable-http")
# Smithery sets PORT environment variable
port = int(os.environ.get("PORT", 8081))
logger.info(f"Listening on port {port}")
uvicorn.run(
app,
host="0.0.0.0",
port=port,
log_level="info",
# Disable access log for cleaner output
access_log=False,
)
if __name__ == "__main__":
main()
+70 -24
@@ -1,51 +1,97 @@
"""Document chunking for large texts."""
"""Document chunking for large texts using LangChain text splitters."""
import logging
from dataclasses import dataclass
from langchain_text_splitters import RecursiveCharacterTextSplitter
logger = logging.getLogger(__name__)
class DocumentChunker:
"""Chunk large documents for optimal embedding."""
@dataclass
class ChunkWithPosition:
"""A text chunk with its character position in the original document."""
def __init__(self, chunk_size: int = 512, overlap: int = 50):
text: str
start_offset: int # Character position where chunk starts
end_offset: int # Character position where chunk ends (exclusive)
page_number: int | None = None # Page number for PDF chunks (optional)
metadata: dict | None = None # Additional processor-specific metadata (optional)
class DocumentChunker:
"""Chunk large documents for optimal embedding using LangChain text splitters.
Uses RecursiveCharacterTextSplitter, which tries to preserve semantic
boundaries by splitting on paragraph breaks, then line breaks, then spaces,
before resorting to character-level splitting.
"""
def __init__(self, chunk_size: int = 2048, overlap: int = 200):
"""
Initialize document chunker.
Args:
chunk_size: Number of words per chunk (default: 512)
overlap: Number of overlapping words between chunks (default: 50)
chunk_size: Number of characters per chunk (default: 2048)
overlap: Number of overlapping characters between chunks (default: 200)
"""
self.chunk_size = chunk_size
self.overlap = overlap
def chunk_text(self, content: str) -> list[str]:
"""
Split text into overlapping chunks.
# Initialize LangChain RecursiveCharacterTextSplitter
# With its default separators it splits hierarchically:
# - Paragraphs (\n\n)
# - Lines (\n)
# - Words (spaces)
# - Characters (last resort)
# Sentence punctuation is not a default separator, so a sentence longer than
# chunk_size can still be split at a space
self.splitter = RecursiveCharacterTextSplitter(
chunk_size=chunk_size,
chunk_overlap=overlap,
add_start_index=True, # Enable position tracking
strip_whitespace=True,
)
Uses simple word-based chunking with configurable overlap to preserve
context across chunk boundaries.
async def chunk_text(self, content: str) -> list[ChunkWithPosition]:
"""
Split text into overlapping chunks with position tracking.
Uses LangChain's RecursiveCharacterTextSplitter to create chunks that
split at paragraph and line breaks before resorting to word or
character-level splitting, so sentences stay intact where they fit within
chunk_size. Records the character position of each chunk to enable
precise document retrieval.
Args:
content: Text content to chunk
Returns:
List of text chunks (may be single item if content is small)
List of chunks with their character positions in the original content
"""
# Simple word-based chunking
words = content.split()
import anyio
if len(words) <= self.chunk_size:
return [content]
# Handle empty content - return single empty chunk for backward compatibility
if not content:
return [ChunkWithPosition(text="", start_offset=0, end_offset=0)]
chunks = []
start = 0
# Run CPU-bound text splitting in thread pool to avoid blocking event loop
docs = await anyio.to_thread.run_sync( # type: ignore[attr-defined]
self.splitter.create_documents,
[content],
)
while start < len(words):
end = start + self.chunk_size
chunk_words = words[start:end]
chunks.append(" ".join(chunk_words))
start = end - self.overlap
# Convert LangChain Documents to ChunkWithPosition objects
chunks = [
ChunkWithPosition(
text=doc.page_content,
start_offset=doc.metadata.get("start_index", 0),
end_offset=doc.metadata.get("start_index", 0) + len(doc.page_content),
)
for doc in docs
]
logger.debug(f"Chunked document into {len(chunks)} chunks ({len(words)} words)")
logger.debug(
f"Chunked document into {len(chunks)} chunks "
f"(chunk_size={self.chunk_size}, overlap={self.overlap})"
)
return chunks
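The point of `ChunkWithPosition` is that slicing the original content with a chunk's offsets gives back the chunk text. The sketch below illustrates that invariant with a deliberately simplified fixed-width chunker. It is not the LangChain splitter: it ignores separators entirely, and `Chunk` and `chunk_with_offsets` are hypothetical names used only here.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    start_offset: int  # inclusive
    end_offset: int    # exclusive


def chunk_with_offsets(content: str, chunk_size: int = 20, overlap: int = 5) -> list[Chunk]:
    """Fixed-width character chunking that records offsets.

    Simplified stand-in: unlike RecursiveCharacterTextSplitter it ignores
    separators and cuts purely by character count.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(content):
        end = min(start + chunk_size, len(content))
        chunks.append(Chunk(content[start:end], start, end))
        if end == len(content):
            break
        start = end - overlap  # step back so neighbours share `overlap` chars
    return chunks


text = "The quick brown fox jumps over the lazy dog near the river bank."
chunks = chunk_with_offsets(text)
```

With an exclusive `end_offset`, `text[c.start_offset:c.end_offset]` reproduces each chunk, which is what lets a later step map chunks onto page boundaries or fetch surrounding context.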
@@ -0,0 +1,49 @@
"""HTML to Markdown conversion utilities for vector sync."""
import logging
from markdownify import markdownify as md
logger = logging.getLogger(__name__)
def html_to_markdown(html_content: str | None) -> str:
"""Convert HTML content to Markdown, preserving semantic structure.
This function converts HTML (typically from RSS/Atom feed items) to Markdown
for better text embedding. Markdown preserves:
- Heading hierarchy (important for document structure)
- Lists (bullet and numbered)
- Links (as [text](url))
- Bold/italic emphasis
- Paragraphs and line breaks
Args:
html_content: HTML string to convert (may be None or empty)
Returns:
Markdown string, or empty string if input is None/empty
Example:
>>> html_to_markdown("<h1>Title</h1><p>Content with <b>bold</b>.</p>")
'# Title\\n\\nContent with **bold**.'
"""
if not html_content:
return ""
try:
markdown = md(
html_content,
heading_style="ATX", # Use # style headings
strip=["script", "style", "iframe", "noscript"], # Remove unsafe elements
bullets="-", # Use - for unordered lists
code_language="", # Don't add language hints to code blocks
)
return markdown.strip()
except Exception as e:
logger.warning(f"Failed to convert HTML to Markdown: {e}")
# Fallback: strip all HTML tags as a last resort
import re
text = re.sub(r"<[^>]+>", " ", html_content)
return " ".join(text.split()) # Normalize whitespace
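The fallback branch above is plain standard-library code, so its behaviour is easy to check in isolation. The sketch below copies the two lines into a hypothetical `strip_tags` helper to show what the fallback does and does not handle.

```python
import re


def strip_tags(html: str) -> str:
    """Crude tag stripper mirroring the fallback branch above."""
    text = re.sub(r"<[^>]+>", " ", html)  # replace each tag with a space
    return " ".join(text.split())         # collapse runs of whitespace


plain = strip_tags("<h1>Title</h1><p>Content with <b>bold</b>.</p>")
entities = strip_tags("Fish &amp; chips")
```

Because every tag becomes a space, inline markup leaves a gap before punctuation (`bold .`), and HTML entities such as `&amp;` are passed through undecoded. That is acceptable for a last-resort path but worth knowing when inspecting indexed text.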
+306
@@ -0,0 +1,306 @@
"""Placeholder point management for Qdrant state tracking.
Placeholders are zero-vector points stored in Qdrant to track document processing
state. They prevent duplicate work by marking documents as "in-flight" during the
gap between scanner queuing and processor completion.
Architecture:
- Scanner writes placeholders when queuing documents for processing
- Processor deletes placeholders and writes real vectors after processing
- All user-facing queries exclude placeholders by requiring is_placeholder: False
Placeholders contain:
- Zero vectors (dimension from embedding service)
- is_placeholder: True flag (for filtering)
- status: "pending", "processing", "completed", "failed"
- modified_at, etag from source document
- queued_at timestamp
"""
import logging
import time
import uuid
from qdrant_client.models import FieldCondition, Filter, MatchValue, PointStruct
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.embedding import get_embedding_service
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
logger = logging.getLogger(__name__)
def _generate_placeholder_id(doc_type: str, doc_id: str | int) -> str:
"""Generate deterministic UUID for placeholder point.
Args:
doc_type: Document type (note, file, etc.)
doc_id: Document ID
Returns:
UUID string for point ID
"""
point_name = f"{doc_type}:{doc_id}:placeholder"
return str(uuid.uuid5(uuid.NAMESPACE_DNS, point_name))
async def write_placeholder_point(
doc_id: str | int,
doc_type: str,
user_id: str,
modified_at: int,
etag: str = "",
file_path: str | None = None,
) -> None:
"""Write a placeholder point to Qdrant to mark document as queued.
This should be called by the scanner BEFORE queuing a document for processing.
The placeholder prevents duplicate work if the scanner runs again before
processing completes.
Args:
doc_id: Document ID (int for notes/files)
doc_type: Document type (note, file, etc.)
user_id: User ID who owns the document
modified_at: Document modification timestamp
etag: Document ETag (if available)
file_path: File path (for files only)
Raises:
Exception: If Qdrant write fails
"""
try:
qdrant_client = await get_qdrant_client()
settings = get_settings()
embedding_service = get_embedding_service()
# Get dimension dynamically (never hardcode)
dimension = embedding_service.get_dimension()
# Create zero vectors
zero_dense = [0.0] * dimension
# Create empty sparse vector for placeholders
# Use models.SparseVector with empty indices/values
from qdrant_client import models
empty_sparse = models.SparseVector(indices=[], values=[])
# Generate deterministic point ID
point_id = _generate_placeholder_id(doc_type, doc_id)
# Build payload
payload = {
"user_id": user_id,
"doc_id": doc_id,
"doc_type": doc_type,
"is_placeholder": True,
"status": "pending",
"modified_at": modified_at,
"etag": etag,
"queued_at": int(time.time()),
}
# Add file_path for files
if doc_type == "file" and file_path:
payload["file_path"] = file_path
# Create placeholder point
point = PointStruct(
id=point_id,
vector={
"dense": zero_dense,
"sparse": empty_sparse, # Empty sparse vector for placeholders
},
payload=payload,
)
# Upsert to Qdrant
await qdrant_client.upsert(
collection_name=settings.get_collection_name(),
points=[point],
wait=True,
)
logger.debug(
f"Wrote placeholder for {doc_type}_{doc_id} (user={user_id}, "
f"modified_at={modified_at})"
)
except Exception as e:
logger.error(
f"Failed to write placeholder for {doc_type}_{doc_id}: {e}",
exc_info=True,
)
raise
async def query_document_metadata(
doc_id: str | int,
doc_type: str,
user_id: str,
) -> dict | None:
"""Query Qdrant for existing document entry (placeholder or real).
Returns the payload of the first matching point, which could be:
- A placeholder (is_placeholder: True)
- A real indexed document (is_placeholder: False or missing)
- None if document not in Qdrant
Args:
doc_id: Document ID
doc_type: Document type
user_id: User ID
Returns:
Payload dict if found, None otherwise
"""
try:
qdrant_client = await get_qdrant_client()
settings = get_settings()
# Query for any entry matching doc_id, doc_type, user_id
scroll_result = await qdrant_client.scroll(
collection_name=settings.get_collection_name(),
scroll_filter=Filter(
must=[
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
FieldCondition(key="doc_id", match=MatchValue(value=doc_id)),
FieldCondition(key="doc_type", match=MatchValue(value=doc_type)),
]
),
limit=1,
with_payload=True,
with_vectors=False,
)
if scroll_result[0]:
point = scroll_result[0][0]
return dict(point.payload)
return None
except Exception as e:
logger.warning(f"Error querying document metadata for {doc_type}_{doc_id}: {e}")
return None
async def delete_placeholder_point(
doc_id: str | int,
doc_type: str,
user_id: str,
) -> None:
"""Delete a placeholder point from Qdrant.
This should be called by the processor BEFORE writing real vectors.
We delete the placeholder to avoid duplicates, then write the real chunks.
Args:
doc_id: Document ID
doc_type: Document type
user_id: User ID
Raises:
Exception: If Qdrant delete fails
"""
try:
qdrant_client = await get_qdrant_client()
settings = get_settings()
# Delete by filter; the is_placeholder condition leaves real chunk points untouched
await qdrant_client.delete(
collection_name=settings.get_collection_name(),
points_selector=Filter(
must=[
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
FieldCondition(key="doc_id", match=MatchValue(value=doc_id)),
FieldCondition(key="doc_type", match=MatchValue(value=doc_type)),
FieldCondition(key="is_placeholder", match=MatchValue(value=True)),
]
),
)
logger.debug(f"Deleted placeholder for {doc_type}_{doc_id} (user={user_id})")
except Exception as e:
logger.error(
f"Failed to delete placeholder for {doc_type}_{doc_id}: {e}",
exc_info=True,
)
raise
async def update_placeholder_status(
doc_id: str | int,
doc_type: str,
user_id: str,
status: str,
) -> None:
"""Update the status field of a placeholder point.
Status values:
- "pending": Queued for processing
- "processing": Currently being processed
- "completed": Processing completed successfully
- "failed": Processing failed
Args:
doc_id: Document ID
doc_type: Document type
user_id: User ID
status: New status value
Raises:
Exception: If Qdrant update fails
"""
try:
qdrant_client = await get_qdrant_client()
settings = get_settings()
# Update payload using set_payload
await qdrant_client.set_payload(
collection_name=settings.get_collection_name(),
payload={"status": status},
points=Filter(
must=[
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
FieldCondition(key="doc_id", match=MatchValue(value=doc_id)),
FieldCondition(key="doc_type", match=MatchValue(value=doc_type)),
FieldCondition(key="is_placeholder", match=MatchValue(value=True)),
]
),
)
logger.debug(
f"Updated placeholder status for {doc_type}_{doc_id} to '{status}' "
f"(user={user_id})"
)
except Exception as e:
logger.warning(
f"Failed to update placeholder status for {doc_type}_{doc_id}: {e}"
)
# Don't raise - status updates are non-critical
def get_placeholder_filter() -> FieldCondition:
"""Get a filter condition to exclude placeholders from queries.
Add this to all user-facing search/visualization queries to ensure
placeholders are never returned to users.
Returns:
FieldCondition that matches only points with is_placeholder: False
Example:
Filter(
must=[
get_placeholder_filter(), # Exclude placeholders
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
]
)
"""
return FieldCondition(
key="is_placeholder",
match=MatchValue(value=False),
)
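Placeholder points get a deterministic ID, so writing a placeholder twice for the same document overwrites the same point instead of creating a duplicate. The sketch below reproduces the naming scheme from `_generate_placeholder_id`. The `chunk_id` helper is an assumption: the processor diff shows the `...:chunk:{i}` name and mentions uuid5 with the DNS namespace, but the exact call is not visible in this section.

```python
import uuid
from typing import Union


def placeholder_id(doc_type: str, doc_id: Union[str, int]) -> str:
    """Same naming scheme as _generate_placeholder_id in the diff."""
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, f"{doc_type}:{doc_id}:placeholder"))


def chunk_id(doc_type: str, doc_id: Union[str, int], index: int) -> str:
    """Assumed scheme for real chunk points (uuid5 over '...:chunk:{index}')."""
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, f"{doc_type}:{doc_id}:chunk:{index}"))


first = placeholder_id("note", 42)
second = placeholder_id("note", 42)
as_string = placeholder_id("note", "42")
real_chunk = chunk_id("note", 42, 0)
```

Since the ID is built from a formatted string, an integer `42` and the string `"42"` map to the same point, while the `:placeholder` suffix keeps placeholder IDs distinct from chunk IDs for the same document.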
+428 -25
@@ -15,7 +15,7 @@ from qdrant_client.models import FieldCondition, Filter, MatchValue, PointStruct
from nextcloud_mcp_server.client import NextcloudClient
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.embedding import get_embedding_service
from nextcloud_mcp_server.embedding import get_bm25_service, get_embedding_service
from nextcloud_mcp_server.observability.metrics import (
record_qdrant_operation,
record_vector_sync_processing,
@@ -23,12 +23,50 @@ from nextcloud_mcp_server.observability.metrics import (
)
from nextcloud_mcp_server.observability.tracing import trace_operation
from nextcloud_mcp_server.vector.document_chunker import DocumentChunker
from nextcloud_mcp_server.vector.placeholder import delete_placeholder_point
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
from nextcloud_mcp_server.vector.scanner import DocumentTask
logger = logging.getLogger(__name__)
def assign_page_numbers(chunks, page_boundaries):
"""Assign page numbers to chunks based on page boundaries.
Each chunk gets the page number where most of its content appears.
For chunks spanning multiple pages, assigns the page containing the
majority of the chunk's characters.
Args:
chunks: List of ChunkWithPosition objects
page_boundaries: List of dicts with {page, start_offset, end_offset}
Returns:
None (modifies chunks in place)
"""
if not page_boundaries:
return
for chunk in chunks:
# Find which page(s) this chunk overlaps with
max_overlap = 0
assigned_page = None
for boundary in page_boundaries:
# Calculate overlap between chunk and page
overlap_start = max(chunk.start_offset, boundary["start_offset"])
overlap_end = min(chunk.end_offset, boundary["end_offset"])
overlap = max(0, overlap_end - overlap_start)
# Assign to page with maximum overlap
if overlap > max_overlap:
max_overlap = overlap
assigned_page = boundary["page"]
if assigned_page is not None:
chunk.page_number = assigned_page
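A small worked example may help with the overlap rule in `assign_page_numbers`. The sketch below restates the same max-overlap logic as a pure function; `Chunk` and `page_for` are hypothetical names, and the boundaries are made-up values chosen so the arithmetic is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    start_offset: int
    end_offset: int  # exclusive
    page_number: Optional[int] = None


def page_for(chunk, page_boundaries):
    """Return the page with the largest character overlap, or None."""
    best_page, best_overlap = None, 0
    for boundary in page_boundaries:
        start = max(chunk.start_offset, boundary["start_offset"])
        end = min(chunk.end_offset, boundary["end_offset"])
        overlap = max(0, end - start)
        if overlap > best_overlap:  # strict comparison: ties keep the earlier page
            best_page, best_overlap = boundary["page"], overlap
    return best_page


pages = [
    {"page": 1, "start_offset": 0, "end_offset": 100},
    {"page": 2, "start_offset": 100, "end_offset": 250},
]
spanning = Chunk(start_offset=80, end_offset=140)  # 20 chars on page 1, 40 on page 2
tied = Chunk(start_offset=90, end_offset=110)      # 10 chars on each page
outside = Chunk(start_offset=300, end_offset=350)  # beyond the last boundary
```

Here `spanning` lands on page 2, `tied` stays on page 1 because the comparison is strict, and `outside` gets no page at all, which is the case the "NO page numbers assigned" diagnostic later in the diff is meant to surface.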
async def processor_task(
worker_id: int,
receive_stream: MemoryObjectReceiveStream[DocumentTask],
@@ -218,30 +256,313 @@ async def _index_document(
settings = get_settings()
# Fetch document content
if doc_task.doc_type == "note":
document = await nc_client.notes.get_note(int(doc_task.doc_id))
content = f"{document['title']}\n\n{document['content']}"
title = document["title"]
etag = document.get("etag", "")
else:
raise ValueError(f"Unsupported doc_type: {doc_task.doc_type}")
with trace_operation(
"vector_sync.fetch_content",
attributes={
"vector_sync.doc_type": doc_task.doc_type,
"vector_sync.doc_id": doc_task.doc_id,
},
):
if doc_task.doc_type == "note":
document = await nc_client.notes.get_note(int(doc_task.doc_id))
content = f"{document['title']}\n\n{document['content']}"
title = document["title"]
etag = document.get("etag", "")
file_metadata = {} # No file-specific metadata for notes
file_path = None # Notes don't have file paths
content_bytes = None # Notes don't have binary content
content_type = None
elif doc_task.doc_type == "news_item":
from nextcloud_mcp_server.vector.html_processor import html_to_markdown
item = await nc_client.news.get_item(int(doc_task.doc_id))
# Convert HTML body to Markdown for better embedding
body_markdown = html_to_markdown(item.get("body", ""))
# Build content: title + URL + body
item_title = item.get("title", "")
item_url = item.get("url", "")
feed_title = item.get("feedTitle", "")
# Structure content for embedding
content_parts = [item_title]
if feed_title:
content_parts.append(f"Source: {feed_title}")
if item_url:
content_parts.append(f"URL: {item_url}")
content_parts.append("") # Blank line
content_parts.append(body_markdown)
content = "\n".join(content_parts)
title = item_title
etag = item.get("guidHash", "")
# Store news-specific metadata for later use in payload
file_metadata = {
"feed_id": item.get("feedId"),
"feed_title": feed_title,
"author": item.get("author"),
"pub_date": item.get("pubDate"),
"starred": item.get("starred", False),
"unread": item.get("unread", True),
"url": item_url,
"guid_hash": item.get("guidHash"),
"enclosure_link": item.get("enclosureLink"),
"enclosure_mime": item.get("enclosureMime"),
}
file_path = None
content_bytes = None
content_type = None
elif doc_task.doc_type == "file":
# For files, doc_id is now the numeric file ID; file_path comes from DocumentTask
if not doc_task.file_path:
raise ValueError(
f"File path required for file indexing but not provided (file_id={doc_task.doc_id})"
)
file_path = doc_task.file_path
# Read file content via WebDAV
content_bytes, content_type = await nc_client.webdav.read_file(file_path)
else:
raise ValueError(f"Unsupported doc_type: {doc_task.doc_type}")
# Process file content (text extraction)
if doc_task.doc_type == "file":
# Type narrowing: content_bytes and content_type are set for files
assert content_bytes is not None
assert content_type is not None
assert file_path is not None
with trace_operation(
"vector_sync.document_process",
attributes={
"vector_sync.content_type": content_type,
"vector_sync.file_size": len(content_bytes),
},
):
# Use document processor registry to extract text
from nextcloud_mcp_server.document_processors import get_registry
registry = get_registry()
try:
result = await registry.process(
content=content_bytes,
content_type=content_type,
filename=file_path,
)
content = result.text
file_metadata = result.metadata
title = file_metadata.get("title") or file_path.split("/")[-1]
etag = "" # WebDAV read_file doesn't return etag
# Diagnostic: Log page boundary information if available
if "page_boundaries" in file_metadata:
page_boundaries = file_metadata["page_boundaries"]
logger.info(
f"Page boundaries for {file_path}: "
f"{len(page_boundaries)} pages, text length: {len(content)}"
)
# Log first 3 page boundaries for debugging
for boundary in page_boundaries[:3]:
logger.debug(
f" Page {boundary['page']}: "
f"offsets [{boundary['start_offset']}:{boundary['end_offset']}]"
)
# Verify last boundary matches text length
if page_boundaries:
last_boundary = page_boundaries[-1]
if last_boundary["end_offset"] != len(content):
logger.warning(
f"Text length mismatch: content={len(content)}, "
f"last_boundary_end={last_boundary['end_offset']}"
)
else:
logger.debug(f"No page_boundaries in metadata for {file_path}")
except Exception as e:
logger.error(f"Failed to process file {file_path}: {e}")
raise
# Tokenize and chunk (using configured chunk size and overlap)
chunker = DocumentChunker(
chunk_size=settings.document_chunk_size,
overlap=settings.document_chunk_overlap,
)
chunks = chunker.chunk_text(content)
with trace_operation(
"vector_sync.chunk_text",
attributes={
"vector_sync.input_chars": len(content),
"vector_sync.chunk_size": settings.document_chunk_size,
"vector_sync.overlap": settings.document_chunk_overlap,
},
):
chunker = DocumentChunker(
chunk_size=settings.document_chunk_size,
overlap=settings.document_chunk_overlap,
)
chunks = await chunker.chunk_text(content)
# Generate embeddings (I/O bound - external API call)
embedding_service = get_embedding_service()
embeddings = await embedding_service.embed_batch(chunks)
# Assign page numbers to chunks if page boundaries are available (PDFs)
page_boundaries = file_metadata.get("page_boundaries")
if doc_task.doc_type == "file" and page_boundaries is not None:
with trace_operation(
"vector_sync.assign_page_numbers",
attributes={
"vector_sync.chunk_count": len(chunks),
"vector_sync.page_count": len(page_boundaries),
},
):
assign_page_numbers(chunks, page_boundaries)
# Diagnostic: Verify page number assignment
assigned_count = sum(1 for c in chunks if c.page_number is not None)
logger.info(
f"Assigned page numbers to {assigned_count}/{len(chunks)} chunks "
f"for {file_path}"
)
# Log first 3 chunks to see their page assignments
for i, chunk in enumerate(chunks[:3]):
logger.debug(
f" Chunk {i}: page={chunk.page_number}, "
f"offsets=[{chunk.start_offset}:{chunk.end_offset}]"
)
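# Warning if NO page numbers were assigned (guard against an empty chunk list,
# since the message indexes chunks[0] and chunks[-1])
if chunks and assigned_count == 0: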
logger.warning(
f"NO page numbers assigned! "
f"Text length: {len(content)}, "
f"Chunks: {len(chunks)}, "
f"Chunk offset range: [{chunks[0].start_offset}:{chunks[-1].end_offset}], "
f"Page boundaries: {len(page_boundaries)} pages, "
f"First boundary: {page_boundaries[0] if page_boundaries else 'None'}"
)
# Extract chunk texts for embedding
chunk_texts = [chunk.text for chunk in chunks]
# Initialize results containers
dense_embeddings: list = []
sparse_embeddings: list = []
chunk_images: dict[int, dict] = {}
# Determine if we need PDF highlighting
is_pdf = doc_task.doc_type == "file" and content_type == "application/pdf"
# Define async tasks for parallel execution
async def generate_dense_embeddings():
"""Generate dense embeddings (I/O bound - external API call)."""
nonlocal dense_embeddings
with trace_operation(
"vector_sync.embed_dense",
attributes={
"vector_sync.chunk_count": len(chunk_texts),
"vector_sync.total_chars": sum(len(t) for t in chunk_texts),
},
):
embedding_service = get_embedding_service()
dense_embeddings = await embedding_service.embed_batch(chunk_texts)
async def generate_sparse_embeddings():
"""Generate sparse embeddings (BM25 for keyword matching)."""
nonlocal sparse_embeddings
with trace_operation(
"vector_sync.embed_sparse",
attributes={
"vector_sync.chunk_count": len(chunk_texts),
},
):
bm25_service = get_bm25_service()
sparse_embeddings = await bm25_service.encode_batch(chunk_texts)
async def generate_highlights():
"""Generate highlighted page images for PDF chunks (CPU-bound)."""
nonlocal chunk_images
if not is_pdf:
return
# Type narrowing: content_bytes is set for PDF files
assert content_bytes is not None
with trace_operation(
"vector_sync.generate_highlights",
attributes={
"vector_sync.chunk_count": len(chunks),
"vector_sync.pdf_size": len(content_bytes),
},
):
import base64
from nextcloud_mcp_server.search.pdf_highlighter import PDFHighlighter
# Build chunk data for batch processing
# Format: (chunk_index, start_offset, end_offset, page_number, chunk_text)
chunk_data: list[tuple[int, int, int, int | None, str]] = [
(i, chunk.start_offset, chunk.end_offset, chunk.page_number, chunk.text)
for i, chunk in enumerate(chunks)
if chunk.page_number is not None
]
# Get pre-computed page boundaries from document processor
page_boundaries = file_metadata.get("page_boundaries")
if not page_boundaries:
logger.warning("No page boundaries available, skipping highlighting")
return
logger.info(
f"Batch generating highlighted page images for {len(chunk_data)} PDF chunks"
)
# Run CPU-bound highlighting in thread pool
# Pass pre-computed page boundaries and full text to avoid re-processing the PDF
batch_results = await anyio.to_thread.run_sync( # type: ignore[attr-defined]
lambda: PDFHighlighter.highlight_chunks_batch(
pdf_bytes=content_bytes,
chunks=chunk_data,
page_boundaries=page_boundaries,
full_text=content,
color="yellow",
zoom=2.0,
)
)
# Convert results to storage format
for chunk_index, (
png_bytes,
actual_page_num,
highlight_count,
) in batch_results.items():
image_base64 = base64.b64encode(png_bytes).decode("utf-8")
chunk_images[chunk_index] = {
"image": image_base64,
"page": actual_page_num,
"highlights": highlight_count,
"size": len(png_bytes),
}
logger.info(
f"Generated {len(chunk_images)}/{len(chunks)} highlighted page images "
f"(avg {sum(img['size'] for img in chunk_images.values()) // max(len(chunk_images), 1):,} bytes)"
)
# Run all embedding/highlighting operations in parallel
# - Dense embeddings: I/O bound (API call)
# - Sparse embeddings: CPU bound (local BM25)
# - Highlighting: CPU bound (PyMuPDF rendering, runs in thread pool)
with trace_operation(
"vector_sync.parallel_processing",
attributes={
"vector_sync.is_pdf": is_pdf,
"vector_sync.chunk_count": len(chunks),
},
):
async with anyio.create_task_group() as tg:
tg.start_soon(generate_dense_embeddings)
tg.start_soon(generate_sparse_embeddings)
tg.start_soon(generate_highlights)
# Prepare Qdrant points
indexed_at = int(time.time())
points = []
for i, (chunk, embedding) in enumerate(zip(chunks, embeddings)):
for i, (chunk, dense_emb, sparse_emb) in enumerate(
zip(chunks, dense_embeddings, sparse_embeddings)
):
# Generate deterministic UUID for point ID
# Using uuid5 with DNS namespace and combining doc info
point_name = f"{doc_task.doc_type}:{doc_task.doc_id}:chunk:{i}"
@@ -250,28 +571,110 @@ async def _index_document(
points.append(
PointStruct(
id=point_id,
vector=embedding,
vector={
"dense": dense_emb,
"sparse": sparse_emb,
},
payload={
"user_id": doc_task.user_id,
"doc_id": doc_task.doc_id,
"doc_type": doc_task.doc_type,
"is_placeholder": False, # Real indexed document (not placeholder)
"title": title,
"excerpt": chunk[:200],
"excerpt": chunk.text, # Full chunk text (up to chunk_size, default 2048 chars)
"indexed_at": indexed_at,
"modified_at": doc_task.modified_at,
"etag": etag,
"chunk_index": i,
"total_chunks": len(chunks),
"chunk_start_offset": chunk.start_offset,
"chunk_end_offset": chunk.end_offset,
"metadata_version": 2, # v2 includes position metadata
# File-specific metadata (PDF, etc.)
**(
{
"file_path": file_path, # Store file path for retrieval
"mime_type": content_type, # From WebDAV response
"file_size": file_metadata.get("file_size"),
"page_number": chunk.page_number,
"page_count": file_metadata.get("page_count"),
"author": file_metadata.get("author"),
"creation_date": file_metadata.get("creation_date"),
"has_images": file_metadata.get("has_images", False),
"image_count": file_metadata.get("image_count", 0),
}
if doc_task.doc_type == "file"
else {}
),
# News item-specific metadata
**(
{
"feed_id": file_metadata.get("feed_id"),
"feed_title": file_metadata.get("feed_title"),
"author": file_metadata.get("author"),
"pub_date": file_metadata.get("pub_date"),
"starred": file_metadata.get("starred"),
"unread": file_metadata.get("unread"),
"url": file_metadata.get("url"),
"guid_hash": file_metadata.get("guid_hash"),
"enclosure_link": file_metadata.get("enclosure_link"),
"enclosure_mime": file_metadata.get("enclosure_mime"),
}
if doc_task.doc_type == "news_item"
else {}
),
# Highlighted page image (PDF only)
**(
{
"highlighted_page_image": chunk_images[i]["image"],
"highlighted_page_number": chunk_images[i]["page"],
"highlight_count": chunk_images[i]["highlights"],
}
if i in chunk_images
else {}
),
},
)
)
# Upsert to Qdrant
await qdrant_client.upsert(
collection_name=settings.get_collection_name(),
points=points,
wait=True,
)
# Delete placeholder before writing real vectors
# This prevents duplicates and cleans up the placeholder state
try:
await delete_placeholder_point(
doc_id=doc_task.doc_id,
doc_type=doc_task.doc_type,
user_id=doc_task.user_id,
)
except Exception as e:
# Log but don't fail indexing if placeholder deletion fails
logger.warning(
f"Failed to delete placeholder for {doc_task.doc_type}_{doc_task.doc_id}: {e}"
)
# Upsert to Qdrant in batches to avoid timeout with large payloads
# Each batch is limited to avoid WriteTimeout when sending large image payloads
BATCH_SIZE = 10 # ~2MB per batch with images
with trace_operation(
"vector_sync.qdrant_upsert",
attributes={
"vector_sync.point_count": len(points),
"vector_sync.collection": settings.get_collection_name(),
"vector_sync.images_count": len(chunk_images),
"vector_sync.batch_size": BATCH_SIZE,
},
):
for batch_start in range(0, len(points), BATCH_SIZE):
batch_end = min(batch_start + BATCH_SIZE, len(points))
batch = points[batch_start:batch_end]
await qdrant_client.upsert(
collection_name=settings.get_collection_name(),
points=batch,
wait=True,
)
if batch_end < len(points):
logger.debug(
f"Upserted batch {batch_start // BATCH_SIZE + 1}/{(len(points) + BATCH_SIZE - 1) // BATCH_SIZE}"
)
logger.info(
f"Indexed {doc_task.doc_type}_{doc_task.doc_id} for {doc_task.user_id} "
+30 -13
@@ -2,7 +2,7 @@
import logging
from qdrant_client import AsyncQdrantClient
from qdrant_client import AsyncQdrantClient, models
from qdrant_client.models import Distance, VectorParams
from nextcloud_mcp_server.config import get_settings
@@ -84,45 +84,62 @@ async def get_qdrant_client() -> AsyncQdrantClient:
f"Collection '{collection_name}' found, validating dimensions..."
)
collection_info = await _qdrant_client.get_collection(collection_name)
actual_dimension = collection_info.config.params.vectors.size
# Handle both named vectors (dict) and legacy single vector
vectors = collection_info.config.params.vectors
if isinstance(vectors, dict):
actual_dimension = vectors["dense"].size
else:
actual_dimension = vectors.size
# Validate dimension matches
if actual_dimension != expected_dimension:
embedding_model = settings.get_embedding_model_name()
raise ValueError(
f"Dimension mismatch for collection '{collection_name}':\n"
f" Expected: {expected_dimension} (from embedding model '{settings.ollama_embedding_model}')\n"
f" Expected: {expected_dimension} (from embedding model '{embedding_model}')\n"
f" Found: {actual_dimension}\n"
f"This usually means you changed the embedding model.\n"
f"Solutions:\n"
f" 1. Delete the old collection: Collection will be recreated with new dimensions\n"
f" 2. Set QDRANT_COLLECTION to use a different collection name\n"
f" 3. Revert OLLAMA_EMBEDDING_MODEL to the original model"
f" 3. Revert to the original embedding model"
)
logger.info(
f"Using existing Qdrant collection: {collection_name} "
f"(dimension={actual_dimension}, model={settings.ollama_embedding_model})"
f"(dimension={actual_dimension}, model={settings.get_embedding_model_name()})"
)
else:
# Collection doesn't exist - create it
embedding_model = settings.get_embedding_model_name()
logger.info(
f"Collection '{collection_name}' not found, creating with "
f"dimension={expected_dimension}, model={settings.ollama_embedding_model}..."
f"dimension={expected_dimension}, model={embedding_model}..."
)
await _qdrant_client.create_collection(
collection_name=collection_name,
vectors_config=VectorParams(
size=expected_dimension,
distance=Distance.COSINE,
),
vectors_config={
"dense": VectorParams(
size=expected_dimension,
distance=Distance.COSINE,
),
},
sparse_vectors_config={
"sparse": models.SparseVectorParams(
index=models.SparseIndexParams(
on_disk=False,
)
),
},
)
logger.info(
f"Created Qdrant collection: {collection_name}\n"
f" Dimension: {expected_dimension}\n"
f" Model: {settings.ollama_embedding_model}\n"
f" Dense vector dimension: {expected_dimension}\n"
f" Dense embedding model: {embedding_model}\n"
f" Sparse vectors: BM25 (for hybrid search)\n"
f" Distance: COSINE\n"
f"Background sync will index all documents with this embedding model."
f"Background sync will index all documents with dense + sparse vectors."
)
return _qdrant_client
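The dimension check above has to cope with two collection layouts: named vectors (a dict keyed by vector name) and the legacy single unnamed vector. A minimal sketch of that branch follows. `FakeVectorParams` and `resolve_dense_dimension` are hypothetical names used only for illustration, and the explicit error for a missing `"dense"` entry is an assumption added here, not something the diff does.

```python
from dataclasses import dataclass


@dataclass
class FakeVectorParams:
    """Stand-in for a vector config object; only `size` is modelled."""

    size: int


def resolve_dense_dimension(vectors, dense_name: str = "dense") -> int:
    """Return the dense vector size for named or legacy single-vector layouts."""
    if isinstance(vectors, dict):
        # Named vectors: the dense embedding lives under a well-known key.
        if dense_name not in vectors:
            raise KeyError(
                f"collection has named vectors {sorted(vectors)} "
                f"but no '{dense_name}' entry"
            )
        return vectors[dense_name].size
    # Legacy layout: a single unnamed vector config with a `size` attribute.
    return vectors.size


if __name__ == "__main__":
    print(resolve_dense_dimension({"dense": FakeVectorParams(768)}))
    print(resolve_dense_dimension(FakeVectorParams(384)))
```

Keeping the branch in one small function would make the mismatch error path testable without a running Qdrant instance.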
+455 -15
@@ -4,6 +4,7 @@ Periodically scans enabled users' content and queues changed documents for proce
"""
import logging
import os
import time
from dataclasses import dataclass
@@ -16,6 +17,10 @@ from nextcloud_mcp_server.client import NextcloudClient
from nextcloud_mcp_server.config import get_settings
from nextcloud_mcp_server.observability.metrics import record_vector_sync_scan
from nextcloud_mcp_server.observability.tracing import trace_operation
from nextcloud_mcp_server.vector.placeholder import (
query_document_metadata,
write_placeholder_point,
)
from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
logger = logging.getLogger(__name__)
@@ -26,10 +31,11 @@ class DocumentTask:
"""Document task for processing queue."""
user_id: str
doc_id: str
doc_id: int | str # int for files/notes, str for news items and legacy IDs
doc_type: str # "note", "file", "news_item", "calendar"
operation: str # "index" or "delete"
modified_at: int
file_path: str | None = None # File path for files (when doc_id is file_id)
# Track documents potentially deleted (grace period before actual deletion)
@@ -182,8 +188,9 @@ async def scan_user_documents(
f"[SCAN-{scan_id}] Using pruneBefore={prune_before} to optimize data transfer"
)
# Get indexed state from Qdrant first (for incremental sync)
indexed_docs = {}
# For deletion tracking, get all doc_ids in Qdrant (incremental sync only)
# Note: indexed_at is no longer bulk-queried; each document is checked individually
indexed_doc_ids = set()
if not initial_sync:
qdrant_client = await get_qdrant_client()
scroll_result = await qdrant_client.scroll(
@@ -194,17 +201,18 @@ async def scan_user_documents(
FieldCondition(key="doc_type", match=MatchValue(value="note")),
]
),
with_payload=["doc_id", "indexed_at"],
with_payload=["doc_id"],
with_vectors=False,
limit=10000,
)
indexed_docs = {
point.payload["doc_id"]: point.payload["indexed_at"]
for point in scroll_result[0]
indexed_doc_ids = {
point.payload["doc_id"]
for point in (scroll_result[0] or [])
if point.payload is not None
}
logger.debug(f"Found {len(indexed_docs)} indexed documents in Qdrant")
logger.debug(f"Found {len(indexed_doc_ids)} indexed documents in Qdrant")
# Stream notes from Nextcloud and process immediately
note_count = 0
@@ -218,7 +226,14 @@ async def scan_user_documents(
modified_at = note.get("modified", 0)
if initial_sync:
# Send everything on first sync
# Send everything on first sync - write placeholder first
await write_placeholder_point(
doc_id=doc_id,
doc_type="note",
user_id=user_id,
modified_at=modified_at,
etag=note.get("etag", ""),
)
await send_stream.send(
DocumentTask(
user_id=user_id,
@@ -230,9 +245,7 @@ async def scan_user_documents(
)
queued += 1
else:
# Incremental sync: compare with indexed state
indexed_at = indexed_docs.get(doc_id)
# Incremental sync: check if document exists and compare modified_at
# If document reappeared, remove from potentially_deleted
doc_key = (user_id, doc_id)
if doc_key in _potentially_deleted:
@@ -241,8 +254,48 @@ async def scan_user_documents(
)
del _potentially_deleted[doc_key]
# Query Qdrant for existing entry (placeholder or real)
existing_metadata = await query_document_metadata(
doc_id=doc_id, doc_type="note", user_id=user_id
)
# Send if never indexed or modified since last index
if indexed_at is None or modified_at > indexed_at:
# Compare against stored modified_at (not indexed_at!)
needs_indexing = False
if existing_metadata is None:
# Never seen before
needs_indexing = True
elif existing_metadata.get("modified_at", 0) < modified_at:
# Document modified since last indexing
needs_indexing = True
elif existing_metadata.get("is_placeholder", False):
# Placeholder exists - check if it's stale (processing may have failed)
# Only requeue if placeholder is older than 5x scan interval
# (Large PDFs can take 3-4 minutes to process)
queued_at = existing_metadata.get("queued_at", 0)
placeholder_age = time.time() - queued_at
stale_threshold = get_settings().vector_sync_scan_interval * 5
if placeholder_age > stale_threshold:
logger.debug(
f"Found stale placeholder for note {doc_id} "
f"(age={placeholder_age:.1f}s), requeuing"
)
needs_indexing = True
else:
logger.debug(
f"Skipping note {doc_id} with recent placeholder "
f"(age={placeholder_age:.1f}s < {stale_threshold:.1f}s)"
)
if needs_indexing:
# Write placeholder before queuing
await write_placeholder_point(
doc_id=doc_id,
doc_type="note",
user_id=user_id,
modified_at=modified_at,
etag=note.get("etag", ""),
)
await send_stream.send(
DocumentTask(
user_id=user_id,
@@ -270,7 +323,7 @@ async def scan_user_documents(
) # Allow 1.5 scan intervals
current_time = time.time()
for doc_id in indexed_docs:
for doc_id in indexed_doc_ids:
if doc_id not in nextcloud_doc_ids:
doc_key = (user_id, doc_id)
@@ -309,7 +362,394 @@ async def scan_user_documents(
)
_potentially_deleted[doc_key] = current_time
# Scan tagged PDF files (after notes)
# Get indexed file IDs from Qdrant (for deletion tracking)
indexed_file_ids = set()
if not initial_sync:
file_scroll_result = await qdrant_client.scroll(
collection_name=settings.get_collection_name(),
scroll_filter=Filter(
must=[
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
FieldCondition(key="doc_type", match=MatchValue(value="file")),
]
),
limit=10000, # Reasonable limit for file count
with_payload=["doc_id"],
with_vectors=False,
)
indexed_file_ids = {
point.payload["doc_id"]
for point in (file_scroll_result[0] or [])
if point.payload is not None
}
logger.debug(f"Found {len(indexed_file_ids)} indexed files in Qdrant")
# Scan for tagged PDF files
file_count = 0
file_queued = 0
nextcloud_file_ids = set()
try:
# Find files with vector-index tag using OCS Tags API
settings = get_settings()
tag_name = os.getenv("VECTOR_SYNC_PDF_TAG", "vector-index")
# Use NextcloudClient.find_files_by_tag() which uses proper OCS API
# and filters by PDF MIME type
tagged_files = await nc_client.find_files_by_tag(
tag_name, mime_type_filter="application/pdf"
)
for file_info in tagged_files:
# Files are already filtered by MIME type in find_files_by_tag()
file_count += 1
file_id = file_info["id"] # Use numeric file ID, not path
file_path = file_info["path"] # Keep path for logging
nextcloud_file_ids.add(file_id)
# Use last_modified timestamp if available, otherwise use current time
modified_at = file_info.get("last_modified_timestamp", int(time.time()))
if isinstance(file_info.get("last_modified"), str):
# Parse RFC 2822 date format if needed
from email.utils import parsedate_to_datetime
try:
dt = parsedate_to_datetime(file_info["last_modified"])
modified_at = int(dt.timestamp())
except (ValueError, KeyError):
pass
if initial_sync:
# Send everything on first sync - write placeholder first
await write_placeholder_point(
doc_id=file_id,
doc_type="file",
user_id=user_id,
modified_at=modified_at,
file_path=file_path,
)
await send_stream.send(
DocumentTask(
user_id=user_id,
doc_id=file_id, # Use numeric file ID
doc_type="file",
operation="index",
modified_at=modified_at,
file_path=file_path, # Pass file path for content retrieval
)
)
file_queued += 1
else:
# Incremental sync: check if file exists and compare modified_at
# If file reappeared, remove from potentially_deleted
file_key = (user_id, file_id)
if file_key in _potentially_deleted:
logger.debug(
f"File {file_path} (ID: {file_id}) reappeared, removing from deletion grace period"
)
del _potentially_deleted[file_key]
# Query Qdrant for existing entry (placeholder or real)
existing_metadata = await query_document_metadata(
doc_id=file_id, doc_type="file", user_id=user_id
)
# Send if never indexed or modified since last index
# Compare against stored modified_at (not indexed_at!)
needs_indexing = False
if existing_metadata is None:
# Never seen before
needs_indexing = True
elif existing_metadata.get("modified_at", 0) < modified_at:
# File modified since last indexing
needs_indexing = True
elif existing_metadata.get("is_placeholder", False):
# Placeholder exists - check if it's stale (processing may have failed)
# Only requeue if placeholder is older than 5x scan interval
# (Large PDFs can take 3-4 minutes to process)
queued_at = existing_metadata.get("queued_at", 0)
placeholder_age = time.time() - queued_at
stale_threshold = get_settings().vector_sync_scan_interval * 5
if placeholder_age > stale_threshold:
logger.debug(
f"Found stale placeholder for file {file_path} (ID: {file_id}) "
f"(age={placeholder_age:.1f}s), requeuing"
)
needs_indexing = True
else:
logger.debug(
f"Skipping file {file_path} (ID: {file_id}) with recent placeholder "
f"(age={placeholder_age:.1f}s < {stale_threshold:.1f}s)"
)
if needs_indexing:
# Write placeholder before queuing
await write_placeholder_point(
doc_id=file_id,
doc_type="file",
user_id=user_id,
modified_at=modified_at,
file_path=file_path,
)
await send_stream.send(
DocumentTask(
user_id=user_id,
doc_id=file_id, # Use numeric file ID
doc_type="file",
operation="index",
modified_at=modified_at,
file_path=file_path, # Pass file path for content retrieval
)
)
file_queued += 1
logger.info(
f"[SCAN-{scan_id}] Found {file_count} tagged PDFs for {user_id}"
)
record_vector_sync_scan(file_count)
# Check for deleted files (not initial sync)
if not initial_sync:
for file_id in indexed_file_ids:
if file_id not in nextcloud_file_ids:
file_key = (user_id, file_id)
if file_key in _potentially_deleted:
# Check if grace period elapsed
first_missing_time = _potentially_deleted[file_key]
time_missing = current_time - first_missing_time
if time_missing >= grace_period:
# Grace period elapsed, send for deletion
logger.info(
f"File ID {file_id} missing for {time_missing:.1f}s "
f"(>{grace_period:.1f}s grace period), sending deletion"
)
await send_stream.send(
DocumentTask(
user_id=user_id,
doc_id=file_id, # Use numeric file ID
doc_type="file",
operation="delete",
modified_at=0,
)
)
file_queued += 1
del _potentially_deleted[file_key]
else:
# First time missing, add to grace period tracking
logger.debug(
f"File ID {file_id} missing for first time, starting grace period"
)
_potentially_deleted[file_key] = current_time
except Exception as e:
logger.warning(f"Failed to scan tagged files for {user_id}: {e}")
queued += file_queued
# Scan News items (all items from the user's feeds)
news_queued = 0
try:
news_queued = await scan_news_items(
user_id=user_id,
send_stream=send_stream,
nc_client=nc_client,
initial_sync=initial_sync,
scan_id=scan_id,
)
queued += news_queued
except Exception as e:
logger.warning(f"Failed to scan news items for {user_id}: {e}")
if queued > 0:
logger.info(f"Sent {queued} documents for incremental sync: {user_id}")
logger.info(
f"Sent {queued} documents ({file_queued} files, {news_queued} news items) for incremental sync: {user_id}"
)
else:
logger.debug(f"No changes detected for {user_id}")
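The grace-period deletion tracking used for notes and files above can be sketched as a pure function. This is a simplified illustration, not the project's code: `find_expired_deletions` is a hypothetical helper, and it keys the tracker by `doc_id` alone where the scanner uses `(user_id, doc_id)` tuples in the module-level `_potentially_deleted` dict.

```python
def find_expired_deletions(
    indexed_ids: set,
    current_ids: set,
    potentially_deleted: dict,
    now: float,
    grace_period: float,
) -> list:
    """Return IDs missing for at least grace_period; track newly missing ones."""
    # Documents that reappeared are no longer deletion candidates.
    for doc_id in current_ids:
        potentially_deleted.pop(doc_id, None)

    expired = []
    # key=str keeps the order deterministic even with mixed int/str IDs.
    for doc_id in sorted(indexed_ids - current_ids, key=str):
        first_missing = potentially_deleted.get(doc_id)
        if first_missing is None:
            # First scan where the document is missing: start the clock.
            potentially_deleted[doc_id] = now
        elif now - first_missing >= grace_period:
            # Missing long enough: report for deletion and stop tracking.
            expired.append(doc_id)
            del potentially_deleted[doc_id]
    return expired


if __name__ == "__main__":
    tracker: dict = {}
    print(find_expired_deletions({1, 2, 3}, {1, 2}, tracker, 100.0, 90.0))
    print(find_expired_deletions({1, 2, 3}, {1, 2}, tracker, 190.0, 90.0))
```

The two-scan requirement means a document that is briefly absent from one listing is not deleted from the index straight away.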
async def scan_news_items(
user_id: str,
send_stream: MemoryObjectSendStream[DocumentTask],
nc_client: NextcloudClient,
initial_sync: bool,
scan_id: int,
) -> int:
"""
Scan user's News items and queue changed items for indexing.
Indexes all items from the user's feeds. The News app's auto-purge
feature (default: 200 items per feed) naturally limits the total
number of items, making explicit filtering unnecessary.
Args:
user_id: User to scan
send_stream: Stream to send changed documents to processors
nc_client: Authenticated Nextcloud client
initial_sync: If True, send all documents (first-time sync)
scan_id: Scan identifier for logging
Returns:
Number of items queued for processing
"""
from nextcloud_mcp_server.client.news import NewsItemType
settings = get_settings()
queued = 0
# Get indexed news item IDs from Qdrant (for deletion tracking)
indexed_item_ids: set[str] = set()
if not initial_sync:
qdrant_client = await get_qdrant_client()
scroll_result = await qdrant_client.scroll(
collection_name=settings.get_collection_name(),
scroll_filter=Filter(
must=[
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
FieldCondition(key="doc_type", match=MatchValue(value="news_item")),
]
),
with_payload=["doc_id"],
with_vectors=False,
limit=10000,
)
indexed_item_ids = {
point.payload["doc_id"]
for point in (scroll_result[0] or [])
if point.payload is not None
}
logger.debug(f"Found {len(indexed_item_ids)} indexed news items in Qdrant")
# Fetch all items (News app caps at ~200 per feed via auto-purge)
all_items = await nc_client.news.get_items(
batch_size=-1,
type_=NewsItemType.ALL,
get_read=True,
)
logger.debug(f"[SCAN-{scan_id}] Found {len(all_items)} news items")
item_count = len(all_items)
nextcloud_item_ids: set[str] = set()
for item in all_items:
doc_id = str(item["id"])
nextcloud_item_ids.add(doc_id)
# Use lastModified timestamp (microseconds in News API)
modified_at = item.get("lastModified", 0)
# Convert to seconds if needed (News API uses microseconds)
if modified_at > 10000000000: # > year 2286 in seconds
modified_at = modified_at // 1000000
if initial_sync:
# Send everything on first sync - write placeholder first
await write_placeholder_point(
doc_id=doc_id,
doc_type="news_item",
user_id=user_id,
modified_at=modified_at,
)
await send_stream.send(
DocumentTask(
user_id=user_id,
doc_id=doc_id,
doc_type="news_item",
operation="index",
modified_at=modified_at,
)
)
queued += 1
else:
# Incremental sync: check if item exists and compare modified_at
doc_key = (user_id, doc_id)
if doc_key in _potentially_deleted:
logger.debug(
f"News item {doc_id} reappeared, removing from deletion grace period"
)
del _potentially_deleted[doc_key]
# Query Qdrant for existing entry
existing_metadata = await query_document_metadata(
doc_id=doc_id, doc_type="news_item", user_id=user_id
)
needs_indexing = False
if existing_metadata is None:
needs_indexing = True
elif existing_metadata.get("modified_at", 0) < modified_at:
needs_indexing = True
elif existing_metadata.get("is_placeholder", False):
queued_at = existing_metadata.get("queued_at", 0)
placeholder_age = time.time() - queued_at
stale_threshold = settings.vector_sync_scan_interval * 5
if placeholder_age > stale_threshold:
logger.debug(
f"Found stale placeholder for news item {doc_id} "
f"(age={placeholder_age:.1f}s), requeuing"
)
needs_indexing = True
if needs_indexing:
await write_placeholder_point(
doc_id=doc_id,
doc_type="news_item",
user_id=user_id,
modified_at=modified_at,
)
await send_stream.send(
DocumentTask(
user_id=user_id,
doc_id=doc_id,
doc_type="news_item",
operation="index",
modified_at=modified_at,
)
)
queued += 1
logger.info(
f"[SCAN-{scan_id}] Found {item_count} news items for {user_id}"
)
record_vector_sync_scan(item_count)
# Check for deleted items (not initial sync)
# Items are treated as deleted once they no longer appear in the News API response
if not initial_sync:
grace_period = settings.vector_sync_scan_interval * 1.5
current_time = time.time()
for doc_id in indexed_item_ids:
if doc_id not in nextcloud_item_ids:
doc_key = (user_id, doc_id)
if doc_key in _potentially_deleted:
first_missing_time = _potentially_deleted[doc_key]
time_missing = current_time - first_missing_time
if time_missing >= grace_period:
logger.info(
f"News item {doc_id} missing for {time_missing:.1f}s "
f"(>{grace_period:.1f}s grace period), sending deletion"
)
await send_stream.send(
DocumentTask(
user_id=user_id,
doc_id=doc_id,
doc_type="news_item",
operation="delete",
modified_at=0,
)
)
queued += 1
del _potentially_deleted[doc_key]
else:
logger.debug(
f"News item {doc_id} missing for first time, starting grace period"
)
_potentially_deleted[doc_key] = current_time
return queued
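The same requeue decision appears three times in this file (notes, files, news items). A minimal sketch of it as one pure function is shown below, assuming the payload keys used in the diff (`modified_at`, `is_placeholder`, `queued_at`) and the 5x scan-interval staleness threshold. The name `needs_indexing` as a function and its `now` parameter are hypothetical, introduced here only to make the logic testable.

```python
import time
from typing import Any, Mapping, Optional


def needs_indexing(
    existing: Optional[Mapping[str, Any]],
    modified_at: int,
    scan_interval: float,
    now: Optional[float] = None,
) -> bool:
    """Decide whether a document should be (re)queued for indexing."""
    if existing is None:
        # Never seen before.
        return True
    if existing.get("modified_at", 0) < modified_at:
        # Source document changed since the stored version.
        return True
    if existing.get("is_placeholder", False):
        # A placeholder means processing was queued but has not finished.
        # Requeue only when it is older than 5x the scan interval.
        current = time.time() if now is None else now
        placeholder_age = current - existing.get("queued_at", 0)
        return placeholder_age > scan_interval * 5
    return False


if __name__ == "__main__":
    stale = {"modified_at": 100, "is_placeholder": True, "queued_at": 1000}
    print(needs_indexing(stale, modified_at=100, scan_interval=60, now=1400))
```

Passing `now` explicitly avoids depending on the wall clock in tests; production callers could omit it.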
+15 -5
@@ -1,6 +1,6 @@
[project]
name = "nextcloud-mcp-server"
version = "0.35.0"
version = "0.49.0"
description = "Model Context Protocol (MCP) server for Nextcloud integration - enables AI assistants to interact with Nextcloud data"
authors = [
{name = "Chris Coutinho", email = "chris@coutinho.io"}
@@ -10,9 +10,9 @@ license = {text = "AGPL-3.0-only"}
requires-python = ">=3.11"
keywords = ["nextcloud", "mcp", "model-context-protocol", "llm", "ai", "claude", "webdav", "caldav", "carddav"]
dependencies = [
"mcp[cli] (>=1.21,<1.22)",
"mcp[cli] (>=1.23,<1.24)",
"httpx (>=0.28.1,<0.29.0)",
"pillow (>=12.0.0,<12.1.0)",
"pillow (>=10.3.0,<12.0.0)", # Compatible with fastembed
"icalendar (>=6.0.0,<7.0.0)",
"pythonvcard4>=0.2.0",
"pydantic>=2.11.4",
@@ -22,6 +22,9 @@ dependencies = [
"aiosqlite>=0.20.0", # Async SQLite for refresh token storage
"authlib>=1.6.5",
"qdrant-client>=1.7.0",
"fastembed>=0.7.3", # BM25 sparse vector embeddings for hybrid search
"anthropic>=0.42.0", # For RAG evaluation with Anthropic LLMs
"boto3>=1.35.0", # For Amazon Bedrock provider (optional)
# Observability dependencies
"prometheus-client>=0.21.0", # Prometheus metrics
"opentelemetry-api>=1.28.2", # OpenTelemetry API
@@ -31,6 +34,13 @@ dependencies = [
"opentelemetry-instrumentation-logging>=0.49b2", # Logging integration
"opentelemetry-exporter-otlp-proto-grpc>=1.28.2", # OTLP gRPC exporter
"python-json-logger>=3.2.0", # Structured JSON logging
"jinja2>=3.1.6",
"langchain-text-splitters>=1.0.0",
"markdownify>=0.14.1", # HTML to Markdown conversion for News items
"pymupdf>=1.26.6",
"pymupdf4llm>=0.2.2",
"pymupdf-layout>=1.26.6",
"openai>=2.8.1",
]
classifiers = [
"Development Status :: 4 - Beta",
@@ -102,9 +112,8 @@ module-root = ""
[dependency-groups]
dev = [
"anthropic>=0.42.0", # For RAG evaluation with Anthropic LLMs
"commitizen>=4.8.2",
"datasets>=3.3.0", # For BeIR nfcorpus dataset loading
"datasets>=3.3.0", # For BeIR nfcorpus dataset loading
"ipython>=9.2.0",
"playwright>=1.49.1",
"pytest>=8.3.5",
@@ -119,6 +128,7 @@ dev = [
[project.scripts]
nextcloud-mcp-server = "nextcloud_mcp_server.cli:run"
smithery-main = "nextcloud_mcp_server.smithery_main:main"
[[tool.uv.index]]
name = "testpypi"
+7 -1
@@ -4,5 +4,11 @@
"config:best-practices",
"mergeConfidence:all-badges"
],
"dependencyDashboard": true
"dependencyDashboard": true,
"packageRules": [
{
"matchPackageNames": ["pillow"],
"allowedVersions": "<12.0.0"
}
]
}
+38
@@ -0,0 +1,38 @@
# Smithery configuration for Nextcloud MCP Server
# See: https://smithery.ai/docs/build/configuration
# ADR-016: Stateless deployment mode for multi-user public Nextcloud instances
runtime: "container"
build:
dockerfile: "Dockerfile.smithery"
dockerBuildPath: "."
startCommand:
type: "http"
configSchema:
type: "object"
required:
- "nextcloud_url"
- "username"
- "app_password"
properties:
nextcloud_url:
type: "string"
title: "Nextcloud URL"
description: "Your Nextcloud instance URL (e.g., https://cloud.example.com). Must be publicly accessible."
pattern: "^https?://.+"
username:
type: "string"
title: "Username"
description: "Your Nextcloud username"
minLength: 1
app_password:
type: "string"
title: "App Password"
description: "Nextcloud app password. Generate at Settings > Security > App passwords. Do NOT use your main password."
minLength: 1
exampleConfig:
nextcloud_url: "https://cloud.example.com"
username: "alice"
app_password: "xxxxx-xxxxx-xxxxx-xxxxx-xxxxx"
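The constraints in the `configSchema` above (three required string fields, a URL pattern, `minLength: 1`) can be checked by hand as in the sketch below. `validate_config` is a hypothetical helper written for illustration; it is not part of the project or of Smithery, which would normally apply the schema itself.

```python
import re

REQUIRED_FIELDS = ("nextcloud_url", "username", "app_password")
URL_PATTERN = re.compile(r"^https?://.+")  # same pattern as the schema


def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the config passes."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not isinstance(config.get(field), str):
            errors.append(f"{field}: required string is missing")

    url = config.get("nextcloud_url")
    if isinstance(url, str) and not URL_PATTERN.search(url):
        errors.append("nextcloud_url: must start with http:// or https://")

    for field in ("username", "app_password"):
        value = config.get(field)
        if isinstance(value, str) and len(value) < 1:
            errors.append(f"{field}: must not be empty")
    return errors


if __name__ == "__main__":
    example = {
        "nextcloud_url": "https://cloud.example.com",
        "username": "alice",
        "app_password": "xxxxx-xxxxx-xxxxx-xxxxx-xxxxx",
    }
    print(validate_config(example))
```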
+219
@@ -480,3 +480,222 @@ def create_mock_table_row_ocs_response(
ocs_response = {"ocs": {"meta": {"status": "ok"}, "data": row_data}}
return create_mock_response(status_code=200, json_data=ocs_response)
# ============================================================================
# News Mock Response Helpers
# ============================================================================
def create_mock_news_folders_response(
folders: list[dict] | None = None,
) -> httpx.Response:
"""Create a mock response for News folders list.
Args:
folders: List of folder dictionaries. If None, returns empty list.
Returns:
Mock httpx.Response with folders data
"""
if folders is None:
folders = []
return create_mock_response(status_code=200, json_data={"folders": folders})
def create_mock_news_folder_response(
folder_id: int = 1,
name: str = "Test Folder",
**kwargs,
) -> httpx.Response:
"""Create a mock response for a News folder.
Args:
folder_id: Folder ID
name: Folder name
**kwargs: Additional folder fields
Returns:
Mock httpx.Response with folder data
"""
folder_data = {
"id": folder_id,
"name": name,
**kwargs,
}
return create_mock_response(status_code=200, json_data={"folders": [folder_data]})
def create_mock_news_feeds_response(
feeds: list[dict] | None = None,
starred_count: int = 0,
newest_item_id: int | None = None,
) -> httpx.Response:
"""Create a mock response for News feeds list.
Args:
feeds: List of feed dictionaries. If None, returns empty list.
starred_count: Number of starred items
newest_item_id: ID of newest item
Returns:
Mock httpx.Response with feeds data
"""
if feeds is None:
feeds = []
data = {
"feeds": feeds,
"starredCount": starred_count,
}
if newest_item_id is not None:
data["newestItemId"] = newest_item_id
return create_mock_response(status_code=200, json_data=data)
def create_mock_news_feed_response(
feed_id: int = 1,
url: str = "https://example.com/feed",
title: str = "Test Feed",
favicon_link: str | None = None,
folder_id: int | None = None,
unread_count: int = 0,
**kwargs,
) -> httpx.Response:
"""Create a mock response for a News feed.
Args:
feed_id: Feed ID
url: Feed URL
title: Feed title
favicon_link: Favicon URL
folder_id: Parent folder ID
unread_count: Number of unread items
**kwargs: Additional feed fields
Returns:
Mock httpx.Response with feed data
"""
feed_data = {
"id": feed_id,
"url": url,
"title": title,
"faviconLink": favicon_link,
"folderId": folder_id,
"unreadCount": unread_count,
"link": kwargs.get("link", "https://example.com"),
"added": kwargs.get("added", 1700000000),
"updateErrorCount": kwargs.get("updateErrorCount", 0),
"lastUpdateError": kwargs.get("lastUpdateError"),
**{
k: v
for k, v in kwargs.items()
if k not in ["link", "added", "updateErrorCount", "lastUpdateError"]
},
}
return create_mock_response(status_code=200, json_data={"feeds": [feed_data]})
def create_mock_news_items_response(
items: list[dict] | None = None,
) -> httpx.Response:
"""Create a mock response for News items list.
Args:
items: List of item dictionaries. If None, returns empty list.
Returns:
Mock httpx.Response with items data
"""
if items is None:
items = []
return create_mock_response(status_code=200, json_data={"items": items})
def create_mock_news_item(
item_id: int = 1,
feed_id: int = 1,
title: str = "Test Article",
body: str = "<p>Test content</p>",
url: str = "https://example.com/article",
author: str | None = "Test Author",
pub_date: int = 1700000000,
unread: bool = True,
starred: bool = False,
**kwargs,
) -> dict:
"""Create a mock News item dictionary.
Args:
item_id: Item ID
feed_id: Parent feed ID
title: Article title
body: Article body (HTML)
url: Article URL
author: Article author
pub_date: Publication timestamp (Unix)
unread: Whether item is unread
starred: Whether item is starred
**kwargs: Additional item fields
Returns:
Item dictionary
"""
return {
"id": item_id,
"feedId": feed_id,
"title": title,
"body": body,
"url": url,
"author": author,
"pubDate": pub_date,
"unread": unread,
"starred": starred,
"guid": kwargs.get("guid", f"guid-{item_id}"),
"guidHash": kwargs.get("guidHash", f"hash-{item_id}"),
"lastModified": kwargs.get("lastModified", pub_date * 1000000),
"enclosureLink": kwargs.get("enclosureLink"),
"enclosureMime": kwargs.get("enclosureMime"),
"fingerprint": kwargs.get("fingerprint", f"fp-{item_id}"),
"contentHash": kwargs.get("contentHash", f"ch-{item_id}"),
**{
k: v
for k, v in kwargs.items()
if k
not in [
"guid",
"guidHash",
"lastModified",
"enclosureLink",
"enclosureMime",
"fingerprint",
"contentHash",
]
},
}
def create_mock_news_status_response(
version: str = "25.0.0",
warnings: dict | None = None,
) -> httpx.Response:
"""Create a mock response for News status.
Args:
version: News app version
warnings: Warning messages
Returns:
Mock httpx.Response with status data
"""
data = {
"version": version,
"warnings": warnings or {},
}
return create_mock_response(status_code=200, json_data=data)
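`create_mock_news_item` above defaults `lastModified` to `pub_date * 1000000`, i.e. microseconds, which is the unit the scanner assumes before converting to seconds. The conversion can be sketched as follows. `to_unix_seconds` and `MICROSECOND_THRESHOLD` are hypothetical names; the threshold value and the integer division mirror the scanner code in this diff.

```python
MICROSECOND_THRESHOLD = 10_000_000_000  # seconds values above this would be past year 2286


def to_unix_seconds(timestamp: int) -> int:
    """Normalise a timestamp that may be in microseconds to whole seconds."""
    if timestamp > MICROSECOND_THRESHOLD:
        # Too large to be a plausible seconds value: treat as microseconds.
        return timestamp // 1_000_000
    return timestamp


if __name__ == "__main__":
    pub_date = 1_700_000_000
    print(to_unix_seconds(pub_date * 1_000_000))
    print(to_unix_seconds(pub_date))
```

With the mock's default, the normalised `lastModified` equals `pub_date`, so tests can compare against `pub_date` directly.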
+561
@@ -0,0 +1,561 @@
"""Unit tests for NewsClient API methods."""
import logging
import httpx
import pytest
from nextcloud_mcp_server.client.news import NewsClient, NewsItemType
from tests.client.conftest import (
create_mock_error_response,
create_mock_news_feed_response,
create_mock_news_feeds_response,
create_mock_news_folder_response,
create_mock_news_folders_response,
create_mock_news_item,
create_mock_news_items_response,
create_mock_news_status_response,
create_mock_response,
)
logger = logging.getLogger(__name__)
# Mark all tests in this module as unit tests
pytestmark = pytest.mark.unit
# ============================================================================
# Folder Tests
# ============================================================================
async def test_news_api_get_folders(mocker):
"""Test that get_folders correctly parses the API response."""
mock_response = create_mock_news_folders_response(
folders=[
{"id": 1, "name": "Tech"},
{"id": 2, "name": "News"},
]
)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
folders = await client.get_folders()
assert len(folders) == 2
assert folders[0]["id"] == 1
assert folders[0]["name"] == "Tech"
assert folders[1]["name"] == "News"
mock_make_request.assert_called_once_with("GET", "/apps/news/api/v1-3/folders")
async def test_news_api_create_folder(mocker):
"""Test that create_folder correctly creates a folder."""
mock_response = create_mock_news_folder_response(folder_id=3, name="New Folder")
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
folder = await client.create_folder(name="New Folder")
assert folder["id"] == 3
assert folder["name"] == "New Folder"
mock_make_request.assert_called_once_with(
"POST", "/apps/news/api/v1-3/folders", json={"name": "New Folder"}
)
async def test_news_api_rename_folder(mocker):
"""Test that rename_folder makes the correct API call."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.rename_folder(folder_id=1, name="Renamed")
mock_make_request.assert_called_once_with(
"PUT", "/apps/news/api/v1-3/folders/1", json={"name": "Renamed"}
)
async def test_news_api_delete_folder(mocker):
"""Test that delete_folder makes the correct API call."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.delete_folder(folder_id=1)
mock_make_request.assert_called_once_with("DELETE", "/apps/news/api/v1-3/folders/1")
# ============================================================================
# Feed Tests
# ============================================================================
async def test_news_api_get_feeds(mocker):
"""Test that get_feeds correctly parses the API response."""
mock_response = create_mock_news_feeds_response(
feeds=[
{"id": 1, "url": "https://example.com/feed1", "title": "Feed 1"},
{"id": 2, "url": "https://example.com/feed2", "title": "Feed 2"},
],
starred_count=5,
newest_item_id=100,
)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
result = await client.get_feeds()
assert len(result["feeds"]) == 2
assert result["starredCount"] == 5
assert result["newestItemId"] == 100
mock_make_request.assert_called_once_with("GET", "/apps/news/api/v1-3/feeds")
async def test_news_api_create_feed(mocker):
"""Test that create_feed correctly creates a feed."""
mock_response = create_mock_news_feed_response(
feed_id=10, url="https://example.com/new-feed", title="New Feed"
)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
feed = await client.create_feed(url="https://example.com/new-feed")
assert feed["id"] == 10
assert feed["url"] == "https://example.com/new-feed"
mock_make_request.assert_called_once_with(
"POST",
"/apps/news/api/v1-3/feeds",
json={"url": "https://example.com/new-feed"},
)
async def test_news_api_create_feed_with_folder(mocker):
"""Test that create_feed correctly creates a feed in a folder."""
mock_response = create_mock_news_feed_response(
feed_id=10, url="https://example.com/feed", folder_id=5
)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
feed = await client.create_feed(url="https://example.com/feed", folder_id=5)
assert feed["folderId"] == 5
mock_make_request.assert_called_once_with(
"POST",
"/apps/news/api/v1-3/feeds",
json={"url": "https://example.com/feed", "folderId": 5},
)
async def test_news_api_delete_feed(mocker):
"""Test that delete_feed makes the correct API call."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.delete_feed(feed_id=10)
mock_make_request.assert_called_once_with("DELETE", "/apps/news/api/v1-3/feeds/10")
async def test_news_api_move_feed(mocker):
"""Test that move_feed makes the correct API call."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.move_feed(feed_id=10, folder_id=5)
mock_make_request.assert_called_once_with(
"POST", "/apps/news/api/v1-3/feeds/10/move", json={"folderId": 5}
)
async def test_news_api_rename_feed(mocker):
"""Test that rename_feed makes the correct API call."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.rename_feed(feed_id=10, title="Renamed Feed")
mock_make_request.assert_called_once_with(
"POST",
"/apps/news/api/v1-3/feeds/10/rename",
json={"feedTitle": "Renamed Feed"},
)
# ============================================================================
# Item Tests
# ============================================================================
async def test_news_api_get_items(mocker):
"""Test that get_items correctly parses the API response."""
items = [
create_mock_news_item(item_id=1, title="Article 1"),
create_mock_news_item(item_id=2, title="Article 2"),
]
mock_response = create_mock_news_items_response(items=items)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
result = await client.get_items()
assert len(result) == 2
assert result[0]["title"] == "Article 1"
assert result[1]["title"] == "Article 2"
# Verify default parameters
call_args = mock_make_request.call_args
assert call_args[0] == ("GET", "/apps/news/api/v1-3/items")
params = call_args[1]["params"]
assert params["batchSize"] == 50
assert params["type"] == NewsItemType.ALL
async def test_news_api_get_items_starred(mocker):
"""Test that get_items with STARRED type filters correctly."""
items = [create_mock_news_item(item_id=1, starred=True)]
mock_response = create_mock_news_items_response(items=items)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
result = await client.get_items(type_=NewsItemType.STARRED)
assert len(result) == 1
assert result[0]["starred"] is True
call_args = mock_make_request.call_args
params = call_args[1]["params"]
assert params["type"] == NewsItemType.STARRED
async def test_news_api_get_items_unread_only(mocker):
"""Test that get_items with get_read=False filters correctly."""
items = [create_mock_news_item(item_id=1, unread=True)]
mock_response = create_mock_news_items_response(items=items)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
result = await client.get_items(get_read=False)
assert len(result) == 1
call_args = mock_make_request.call_args
params = call_args[1]["params"]
assert params["getRead"] == "false"
async def test_news_api_get_item(mocker):
"""Test that get_item fetches a single item by ID."""
item = create_mock_news_item(item_id=123, title="Single Item")
mock_response = create_mock_response(status_code=200, json_data=item)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
result = await client.get_item(item_id=123)
assert result["id"] == 123
assert result["title"] == "Single Item"
mock_make_request.assert_called_once_with("GET", "/apps/news/api/v1-3/items/123")
async def test_news_api_get_updated_items(mocker):
"""Test that get_updated_items correctly calls the updated endpoint."""
items = [create_mock_news_item(item_id=1)]
mock_response = create_mock_news_items_response(items=items)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
result = await client.get_updated_items(last_modified=1700000000)
assert len(result) == 1
call_args = mock_make_request.call_args
assert call_args[0] == ("GET", "/apps/news/api/v1-3/items/updated")
params = call_args[1]["params"]
assert params["lastModified"] == 1700000000
async def test_news_api_mark_item_read(mocker):
"""Test that mark_item_read makes the correct API call."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.mark_item_read(item_id=123)
mock_make_request.assert_called_once_with(
"POST", "/apps/news/api/v1-3/items/123/read"
)
async def test_news_api_mark_item_unread(mocker):
"""Test that mark_item_unread makes the correct API call."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.mark_item_unread(item_id=123)
mock_make_request.assert_called_once_with(
"POST", "/apps/news/api/v1-3/items/123/unread"
)
async def test_news_api_star_item(mocker):
"""Test that star_item makes the correct API call."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.star_item(item_id=123)
mock_make_request.assert_called_once_with(
"POST", "/apps/news/api/v1-3/items/123/star"
)
async def test_news_api_unstar_item(mocker):
"""Test that unstar_item makes the correct API call."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.unstar_item(item_id=123)
mock_make_request.assert_called_once_with(
"POST", "/apps/news/api/v1-3/items/123/unstar"
)
async def test_news_api_mark_items_read_multiple(mocker):
"""Test that mark_items_read makes the correct API call for multiple items."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.mark_items_read(item_ids=[1, 2, 3])
mock_make_request.assert_called_once_with(
"POST", "/apps/news/api/v1-3/items/read/multiple", json={"itemIds": [1, 2, 3]}
)
async def test_news_api_star_items_multiple(mocker):
"""Test that star_items makes the correct API call for multiple items."""
mock_response = create_mock_response(status_code=200, json_data={})
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
await client.star_items(item_ids=[1, 2, 3])
mock_make_request.assert_called_once_with(
"POST", "/apps/news/api/v1-3/items/star/multiple", json={"itemIds": [1, 2, 3]}
)
# ============================================================================
# Status Tests
# ============================================================================
async def test_news_api_get_status(mocker):
"""Test that get_status correctly parses the API response."""
mock_response = create_mock_news_status_response(
version="25.0.0",
warnings={"improperlyConfiguredCron": False},
)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
status = await client.get_status()
assert status["version"] == "25.0.0"
assert "warnings" in status
mock_make_request.assert_called_once_with("GET", "/apps/news/api/v1-3/status")
async def test_news_api_get_version(mocker):
"""Test that get_version correctly parses the API response."""
mock_response = create_mock_response(
status_code=200, json_data={"version": "25.0.0"}
)
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(
NewsClient, "_make_request", return_value=mock_response
)
client = NewsClient(mock_client, "testuser")
version = await client.get_version()
assert version == "25.0.0"
mock_make_request.assert_called_once_with("GET", "/apps/news/api/v1-3/version")
# ============================================================================
# Error Handling Tests
# ============================================================================
async def test_news_api_create_folder_conflict(mocker):
"""Test that create_folder raises HTTPStatusError on 409 conflict."""
error_response = create_mock_error_response(409, "Folder name already exists")
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(NewsClient, "_make_request")
mock_make_request.side_effect = httpx.HTTPStatusError(
"409 Conflict",
request=httpx.Request("POST", "http://test.local"),
response=error_response,
)
client = NewsClient(mock_client, "testuser")
with pytest.raises(httpx.HTTPStatusError) as excinfo:
await client.create_folder(name="Existing Folder")
assert excinfo.value.response.status_code == 409
async def test_news_api_delete_feed_not_found(mocker):
"""Test that delete_feed raises HTTPStatusError on 404."""
error_response = create_mock_error_response(404, "Feed not found")
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(NewsClient, "_make_request")
mock_make_request.side_effect = httpx.HTTPStatusError(
"404 Not Found",
request=httpx.Request("DELETE", "http://test.local"),
response=error_response,
)
client = NewsClient(mock_client, "testuser")
with pytest.raises(httpx.HTTPStatusError) as excinfo:
await client.delete_feed(feed_id=999999)
assert excinfo.value.response.status_code == 404
async def test_news_api_create_feed_invalid_url(mocker):
"""Test that create_feed raises HTTPStatusError on 422 for invalid URL."""
error_response = create_mock_error_response(422, "Invalid feed URL")
mock_client = mocker.AsyncMock(spec=httpx.AsyncClient)
mock_make_request = mocker.patch.object(NewsClient, "_make_request")
mock_make_request.side_effect = httpx.HTTPStatusError(
"422 Unprocessable Entity",
request=httpx.Request("POST", "http://test.local"),
response=error_response,
)
client = NewsClient(mock_client, "testuser")
with pytest.raises(httpx.HTTPStatusError) as excinfo:
await client.create_feed(url="not-a-valid-url")
assert excinfo.value.response.status_code == 422
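The tests above share one pattern: patch `_make_request` on the class, call a client method, then assert on the recorded call. A minimal, self-contained sketch of that pattern follows, using only the standard library. `TinyNewsClient` is a hypothetical stand-in written for this illustration; it is not the real `NewsClient`, and only the request plumbing is modeled.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class TinyNewsClient:
    """Hypothetical stand-in for NewsClient (assumption, not the real class)."""

    async def _make_request(self, method, url, **kwargs):
        # A unit test should never reach this point.
        raise RuntimeError("network access is not expected in unit tests")

    async def delete_folder(self, folder_id):
        await self._make_request(
            "DELETE", f"/apps/news/api/v1-3/folders/{folder_id}"
        )


async def check_delete_folder():
    # Patching on the class replaces the coroutine with an AsyncMock, so the
    # mock records the call without `self` and nothing touches the network.
    with patch.object(
        TinyNewsClient, "_make_request", new_callable=AsyncMock
    ) as mock_request:
        await TinyNewsClient().delete_folder(folder_id=1)
    mock_request.assert_called_once_with(
        "DELETE", "/apps/news/api/v1-3/folders/1"
    )
    return mock_request.call_count


calls = asyncio.run(check_delete_folder())
```

Because the mock is installed on the class rather than on an instance, the assertion sees only the method and URL arguments, which matches the shape of the `assert_called_once_with` checks in the tests above.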
+8 -56
@@ -9,7 +9,6 @@ import pytest
from httpx import HTTPStatusError
from mcp import ClientSession
from mcp.client.session import RequestContext
from mcp.client.sse import sse_client
from mcp.client.streamable_http import streamablehttp_client
from mcp.types import ElicitRequestParams, ElicitResult, ErrorData
@@ -114,6 +113,7 @@ async def create_mcp_client_session(
token: str | None = None,
client_name: str = "MCP",
elicitation_callback: Any = None,
sampling_callback: Any = None,
) -> AsyncGenerator[ClientSession, Any]:
"""
Factory function to create an MCP client session with proper lifecycle management.
@@ -133,6 +133,8 @@ async def create_mcp_client_session(
client_name: Client name for logging (e.g., "OAuth MCP (Playwright)")
elicitation_callback: Optional callback for handling elicitation requests.
Should match signature: async def callback(context: RequestContext, params: ElicitRequestParams) -> ElicitResult | ErrorData
sampling_callback: Optional callback for handling sampling (LLM generation) requests.
Should match signature: async def callback(context: RequestContext, params: CreateMessageRequestParams) -> CreateMessageResult | ErrorData
Yields:
Initialized MCP ClientSession
@@ -156,52 +158,10 @@ async def create_mcp_client_session(
_,
):
async with ClientSession(
read_stream, write_stream, elicitation_callback=elicitation_callback
) as session:
await session.initialize()
logger.info(f"{client_name} client session initialized successfully")
yield session
# Cleanup happens automatically in LIFO order - no exception suppression needed
logger.debug(f"{client_name} client session cleaned up successfully")
async def create_mcp_client_session_sse(
url: str,
token: str | None = None,
client_name: str = "MCP",
elicitation_callback: Any = None,
) -> AsyncGenerator[ClientSession, Any]:
"""
Factory function to create an MCP client session using SSE transport.
Similar to create_mcp_client_session but uses SSE transport instead of streamable-http.
Uses native async context managers to ensure correct LIFO cleanup order.
Args:
url: MCP server URL (e.g., "http://localhost:8000/sse")
token: Optional OAuth access token for Bearer authentication
client_name: Client name for logging (e.g., "Basic MCP (SSE)")
elicitation_callback: Optional callback for handling elicitation requests
Yields:
Initialized MCP ClientSession
Note:
SSE transport is being deprecated in favor of streamable-http.
This function exists for compatibility testing only.
"""
logger.info(f"Creating SSE client for {client_name}")
# Prepare headers with OAuth token if provided
headers = {"Authorization": f"Bearer {token}"} if token else None
# Use native async with - Python ensures LIFO cleanup
# Cleanup order will be: ClientSession.__aexit__ -> sse_client.__aexit__
# Note: sse_client yields only (read_stream, write_stream), not 3 values like streamablehttp_client
async with sse_client(url, headers=headers) as (read_stream, write_stream):
async with ClientSession(
read_stream, write_stream, elicitation_callback=elicitation_callback
read_stream,
write_stream,
elicitation_callback=elicitation_callback,
sampling_callback=sampling_callback,
) as session:
await session.initialize()
logger.info(f"{client_name} client session initialized successfully")
@@ -249,18 +209,10 @@ async def nc_client(anyio_backend) -> AsyncGenerator[NextcloudClient, Any]:
@pytest.fixture(scope="session")
async def nc_mcp_client(anyio_backend) -> AsyncGenerator[ClientSession, Any]:
"""
Fixture to create an MCP client session for integration tests using SSE transport.
Fixture to create an MCP client session for integration tests using streamable-http.
Uses anyio pytest plugin for proper async fixture handling.
Note: SSE transport is being deprecated. This fixture uses SSE for compatibility testing.
"""
# async for session in create_mcp_client_session_sse(
# url="http://localhost:8000/sse", client_name="Basic MCP (SSE)"
# ):
# yield session
async for session in create_mcp_client_session(
url="http://localhost:8000/mcp",
client_name="Basic MCP (HTTP)",
+26
@@ -0,0 +1,26 @@
"""Pytest configuration for integration tests.
This conftest.py provides hooks and fixtures specific to integration tests,
including the --provider flag for RAG tests.
"""
# Valid provider names
VALID_PROVIDERS = ["openai", "ollama", "anthropic", "bedrock"]
def pytest_addoption(parser):
"""Add --provider command line option for RAG tests."""
parser.addoption(
"--provider",
action="store",
default=None,
choices=VALID_PROVIDERS,
help=f"LLM provider for RAG tests: {', '.join(VALID_PROVIDERS)}",
)
def pytest_configure(config):
"""Configure custom markers."""
config.addinivalue_line(
"markers", "rag: mark test as RAG integration test (requires --provider flag)"
)
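The `rag` marker description says these tests require the `--provider` flag, but the hooks shown here only register the option and the marker. One way this requirement could be enforced is sketched below. It is an assumption, not part of this diff: `rag_skip_reason` is a hypothetical helper, and the `pytest_collection_modifyitems` hook is one possible place to apply it.

```python
VALID_PROVIDERS = ["openai", "ollama", "anthropic", "bedrock"]


def rag_skip_reason(provider, keywords):
    """Return a skip reason for a test, or None if the test should run.

    Hypothetical helper: `keywords` is any container of marker names.
    """
    if "rag" in keywords and provider is None:
        return "RAG tests require --provider (one of: {})".format(
            ", ".join(VALID_PROVIDERS)
        )
    return None


def pytest_collection_modifyitems(config, items):
    """Skip rag-marked tests when no --provider was given (sketch)."""
    # Imported lazily so the helper above stays importable without pytest.
    import pytest

    provider = config.getoption("--provider")
    for item in items:
        reason = rag_skip_reason(provider, item.keywords)
        if reason:
            item.add_marker(pytest.mark.skip(reason=reason))
```

Keeping the decision in a plain function makes it checkable without running a pytest session; the hook only translates the result into a skip marker.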

Some files were not shown because too many files have changed in this diff