- Increase sampling timeout from 30s to 300s in semantic.py to accommodate
slower local LLMs like Ollama
- Refactor RAG integration tests to support multiple providers (ollama,
openai, anthropic, bedrock)
- Remove unnecessary embedding_provider fixture since MCP server handles
embeddings internally
- Add --provider flag via tests/integration/conftest.py
- Add provider_fixtures.py with factory functions for generation providers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The refactor in fafeaf3 moved background tasks to Starlette server lifespan
but broke nc_get_vector_sync_status because it still looked for streams in
FastMCP's AppContext (lifespan_context).
Add VectorSyncState module singleton to bridge the lifespans:
- starlette_lifespan sets the singleton when starting background tasks
- app_lifespan_basic reads from singleton and includes in AppContext
- MCP tools can now access document_receive_stream for pending count
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The refactor in fafeaf3 moved background tasks to Starlette server lifespan
but broke nc_get_vector_sync_status because it still looked for streams in
FastMCP's AppContext (lifespan_context).
Add VectorSyncState module singleton to bridge the lifespans:
- starlette_lifespan sets the singleton when starting background tasks
- app_lifespan_basic reads from singleton and includes in AppContext
- MCP tools can now access document_receive_stream for pending count
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Move scanner/processor tasks from FastMCP session lifespan to Starlette
server lifespan (correct architecture: background tasks run once at
server level, not per-session)
- Change default CLI transport from SSE to streamable-http
- Remove SSE transport option from CLI (SSE is deprecated)
- Remove SSE client session factory from test fixtures
- Add tracing instrumentation to BM25 hybrid search operations for
better observability
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Change create_tag() to use WebDAV POST instead of OCS API which
returned 404 in some Nextcloud versions
- Add llm_judge() helper that evaluates system output against ground
truth with simple TRUE/FALSE prompt
- Replace keyword-based assertions in RAG tests with LLM judge for
more flexible semantic evaluation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add get_file_info() to get file info including file ID via PROPFIND
- Add create_tag() to create system tags via OCS API
- Add get_or_create_tag() for idempotent tag creation
- Add assign_tag_to_file() to assign tags to files via WebDAV
- Add remove_tag_from_file() to remove tags from files
Also refactors RAG evaluation:
- Add indexed_manual_pdf fixture using existing nc_client/nc_mcp_client
- Remove manual tag creation steps from workflow (now handled by fixture)
- Add comprehensive unit tests for new WebDAV methods
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Adds a manually-triggered GitHub Actions workflow for RAG evaluation:
- Builds Nextcloud User Manual PDF from documentation source
- Uploads PDF to Nextcloud via WebDAV
- Tags file with 'vector-index' for vector sync indexing
- Waits for vector sync to complete
- Runs RAG integration tests with OpenAI/GitHub Models API
Inputs:
- embedding_model: OpenAI embedding model (default: openai/text-embedding-3-small)
- generation_model: OpenAI generation model (default: openai/gpt-4o-mini)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Adds OpenAI provider to the unified provider architecture (ADR-015),
supporting:
- OpenAI API (api.openai.com)
- GitHub Models API (models.github.ai/inference)
- OpenAI-compatible endpoints (Fireworks, Together, etc.)
Features:
- Embedding support with text-embedding-3-small/large models
- Text generation via chat completions API
- Automatic retry with exponential backoff for rate limits
- Provider auto-detection in registry (priority after Bedrock)
Environment variables:
- OPENAI_API_KEY: API key (required)
- OPENAI_BASE_URL: Base URL override (optional)
- OPENAI_EMBEDDING_MODEL: Embedding model (default: text-embedding-3-small)
- OPENAI_GENERATION_MODEL: Generation model (default: gpt-4o-mini)
Also adds:
- Integration tests for RAG pipeline with MCP sampling
- MCP client sampling support for integration tests
- Ground truth Q&A pairs for Nextcloud User Manual
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The Smithery scanner was reporting "0 tools" despite the server returning
valid tool definitions. Root cause: the server was returning SSE-formatted
responses (event: message\ndata: {...}) which the scanner couldn't parse.
Changes:
- Add json_response=True to FastMCP for Smithery stateless mode
- Clean up verbose docstring examples in semantic.py and webdav.py
The MCP spec allows both SSE and plain JSON responses for HTTP transport.
Setting json_response=True returns Content-Type: application/json with
plain JSON-RPC instead of text/event-stream with SSE format.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Replace sequential Qdrant scroll calls with batch retrieve
(50 HTTP requests → 1 request, ~50x faster vector fetch)
- Add point_id to SearchResult to enable batch retrieval by Qdrant point ID
- Reuse query embedding from search algorithm in viz_routes
(eliminates redundant embedding call, saves ~30ms)
- Make BM25 encode() async with thread pool to avoid blocking event loop
(~4.4s was blocking, now properly async)
- Run PCA computation in thread pool to avoid blocking event loop
(~1.2s was blocking, now properly async)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Reorganize README to promote Smithery as the fastest way to get started:
- Quick Start now features Smithery one-click deployment
- Docker instructions moved to separate "Docker (Self-Hosted)" section
- Added note about Smithery's stateless mode limitations
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
ADR-016: For container runtime deployment, Smithery does not auto-generate
the .well-known/mcp-config endpoint like it does for Python CLI runtime.
Changes:
- Remove [tool.smithery] from pyproject.toml (not used in container mode)
- Remove smithery_server.py (Python CLI runtime specific)
- Add .well-known/mcp-config endpoint to return JSON Schema config
- Add SmitheryConfigMiddleware to extract config from URL query params
- Use ContextVar to pass session config to tool handlers
The container runtime passes config as URL query parameters to /mcp:
GET /mcp?nextcloud_url=...&username=...&app_password=...
Tested:
- All 164 unit tests passing
- Docker container builds successfully
- .well-known/mcp-config returns valid JSON Schema
- Health endpoints working
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Adds support for Smithery hosted deployment with stateless operation:
- Add DeploymentMode enum with SELF_HOSTED and SMITHERY_STATELESS modes
- Add get_deployment_mode() to detect mode from SMITHERY_DEPLOYMENT env var
- Update get_client() to create per-request clients from session config
- Add conditional tool registration (skip semantic search in Smithery mode)
- Add conditional /app admin UI mounting (skip in Smithery mode)
- Create smithery.yaml with configSchema for user credentials
- Create Dockerfile.smithery for minimal stateless container
- Create smithery_main.py entrypoint for Smithery deployment
In Smithery mode:
- Users provide nextcloud_url, username, app_password via session config
- Each request creates a fresh NextcloudClient (no state between requests)
- Semantic search tools are disabled (no vector database)
- Admin UI (/app) is disabled (no webhooks, vector viz)
All existing self-hosted functionality remains unchanged.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add architecture decision record for supporting Smithery-hosted MCP
server in a stateless mode for multi-user public Nextcloud instances.
Key decisions:
- New SMITHERY_STATELESS deployment mode alongside SELF_HOSTED
- Session-based configuration (nextcloud_url, username, app_password)
- Feature subset excluding semantic search and background sync
- Admin UI (/app) excluded in Smithery mode
- Per-request client creation from session config
This enables users to try the MCP server without self-hosting
infrastructure while supporting multiple Nextcloud instances.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Drawing directly with ImageDraw on RGBA mode doesn't blend alpha
properly. Use Image.alpha_composite() with a transparent overlay
to achieve correct semi-transparent highlight fills.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>