feat(vector): Support multiple embedding models with auto-generated collection names
This PR enables safe switching between embedding models and multi-server
deployments by implementing auto-generated Qdrant collection names based on
deployment ID and model name.
## Problem
Previously, all deployments used a single hardcoded collection name
"nextcloud_content", which caused two critical issues:
1. **Dimension mismatches when switching models**: Changing
OLLAMA_EMBEDDING_MODEL (e.g., nomic-embed-text at 768D → all-minilm at
384D) would cause runtime errors as vectors couldn't be inserted into a
collection with incompatible dimensions.
2. **Collection collisions in multi-server setups**: Multiple MCP servers
sharing a single Qdrant instance would overwrite each other's data,
making horizontal scaling impossible.
## Solution
### Auto-Generated Collection Naming
Collections are now automatically named using the pattern:
\`{deployment-id}-{model-name}\`
**Deployment ID**: Uses \`OTEL_SERVICE_NAME\` if configured (and not default
value), otherwise falls back to \`hostname\` for simple Docker deployments.
**Model Name**: From \`OLLAMA_EMBEDDING_MODEL\` with path separators sanitized.
**Examples**:
- \`my-mcp-server-nomic-embed-text\` (with OTEL_SERVICE_NAME=my-mcp-server)
- \`mcp-container-all-minilm\` (simple Docker, hostname=mcp-container)
**Override**: Users can still set \`QDRANT_COLLECTION\` explicitly to bypass
auto-generation for backward compatibility.
### Dimension Validation
Added startup validation that checks collection dimensions match the
embedding service. If a mismatch is detected, the server fails fast with a
clear error message explaining:
- Expected vs actual dimensions
- Likely cause (model change)
- Solutions (delete collection, use different name, or revert model)
### Improved Sampling Error Handling
Enhanced MCP sampling rejection handling to treat user rejections as normal
behavior rather than errors:
- **User rejections** ("rejected", "denied") → INFO log, no traceback
- **Unsupported clients** → INFO log, no traceback
- **Other MCP errors** → WARNING log, no traceback
- **Unexpected errors** → ERROR log WITH traceback
This aligns with the MCP specification where clients SHOULD prompt users for
approval/denial of sampling requests.
## Changes
### Core Implementation
- **nextcloud_mcp_server/config.py**: Added \`get_collection_name()\` method
with deployment ID detection and model name sanitization
- **nextcloud_mcp_server/vector/qdrant_client.py**: Dimension validation on
collection open with helpful error messages
- **nextcloud_mcp_server/vector/{scanner,processor}.py**: Updated to use
\`get_collection_name()\`
- **nextcloud_mcp_server/auth/userinfo_routes.py**: Vector sync status uses
\`get_collection_name()\`
- **nextcloud_mcp_server/server/semantic.py**:
- Updated semantic search tools to use \`get_collection_name()\`
- Improved sampling rejection error handling (McpError vs Exception)
### Documentation
- **docs/semantic-search-architecture.md**: New comprehensive architecture
document (557 lines) covering background sync, semantic search flow, RAG
implementation, and deployment modes
- **docs/configuration.md**: Added detailed "Qdrant Collection Naming"
section with examples and multi-server deployment guidance
- **docker-compose.yml**: Added comments explaining collection naming behavior
- **README.md**: Updated semantic search descriptions to clarify
experimental status, Notes-only support, and infrastructure requirements
## Migration Guide
**For existing single-server deployments:**
Option 1 (Recommended): Use explicit collection name for continuity
\`\`\`bash
QDRANT_COLLECTION=nextcloud_content # Keep existing collection
\`\`\`
Option 2: Allow auto-generation and re-embed
\`\`\`bash
# Remove QDRANT_COLLECTION override
# New collection will be created based on deployment ID + model
# Requires re-embedding all documents (may take time)
\`\`\`
**For new multi-server deployments:**
Set unique OTEL service names per server:
\`\`\`bash
# Server 1
OTEL_SERVICE_NAME=mcp-prod
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# → Collection: "mcp-prod-nomic-embed-text"
# Server 2
OTEL_SERVICE_NAME=mcp-staging
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# → Collection: "mcp-staging-nomic-embed-text"
\`\`\`
## Benefits
✅ **Safe model switching**: Each model gets its own collection, preventing
dimension mismatch errors
✅ **Multi-server support**: Multiple MCP servers can share one Qdrant
instance without conflicts
✅ **Clear ownership**: Collection names show which deployment and model owns
the data
✅ **Better error messages**: Dimension validation provides actionable
guidance
✅ **Backward compatible**: Existing deployments can continue using
\`QDRANT_COLLECTION\` override
## Testing
Validated with:
- Single-server deployments (default hostname-based naming)
- Multi-server deployments (OTEL service name-based naming)
- Model switching scenarios (dimension validation)
- Collection override scenarios (backward compatibility)
Next steps: Testing various Ollama embedding models to investigate optimal
chunk sizes and performance characteristics.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
@@ -2,286 +2,134 @@
|
||||
|
||||
[](https://github.com/cbcoutinho/nextcloud-mcp-server/pkgs/container/nextcloud-mcp-server)
|
||||
|
||||
**Enable AI assistants to interact with your Nextcloud instance.**
|
||||
**A production-ready MCP server that connects AI assistants to your Nextcloud instance.**
|
||||
|
||||
The Nextcloud MCP (Model Context Protocol) server allows Large Language Models like Claude, GPT, and Gemini to interact with your Nextcloud data through a secure API. Create notes, manage calendars, organize contacts, work with files, and more - all through natural language.
|
||||
Enable Large Language Models like Claude, GPT, and Gemini to interact with your Nextcloud data through a secure API. Create notes, manage calendars, organize contacts, work with files, and more - all through natural language conversations.
|
||||
|
||||
This is a **dedicated standalone MCP server** designed for external MCP clients like Claude Code and IDEs. It runs independently of Nextcloud (Docker, VM, Kubernetes, or local) and provides deep CRUD operations across Nextcloud apps.
|
||||
|
||||
> [!NOTE]
|
||||
> **Nextcloud has two ways to enable AI access:** Nextcloud provides [Context Agent](https://github.com/nextcloud/context_agent), an AI agent backend that powers the [Assistant](https://github.com/nextcloud/assistant) app and allows AI to interact with Nextcloud apps like Calendar, Talk, and Contacts. Context Agent runs as an ExApp inside Nextcloud and also _[exposes an MCP server](https://docs.nextcloud.com/server/stable/admin_manual/ai/app_context_agent.html#using-nextcloud-mcp-server)_ for external MCP clients.
|
||||
>
|
||||
> This project (Nextcloud MCP Server) is a **dedicated standalone MCP server** designed specifically for external MCP clients like Claude Code and IDEs, with deep CRUD operations and OAuth support. It does not require any additional AI-features to be enabled in Nextcloud beyond the apps that you intend to interact with.
|
||||
|
||||
### High-level Comparison: Nextcloud MCP Server vs. Nextcloud AI Stack
|
||||
|
||||
| Aspect | **Nextcloud MCP Server**<br/>(This Project) | **Nextcloud AI Stack**<br/>(Assistant + Context Agent) |
|
||||
|--------|---------------------------------------------|--------------------------------------------------------|
|
||||
| **Purpose** | External MCP client access to Nextcloud | AI assistance within Nextcloud UI |
|
||||
| **Deployment** | Standalone (Docker, VM, K8s) | Inside Nextcloud (ExApp via AppAPI) |
|
||||
| **Primary Users** | Claude Code, IDEs, external developers | Nextcloud end users via Assistant app |
|
||||
| **Authentication** | OAuth2/OIDC or Basic Auth | Session-based (integrated) |
|
||||
| **Notes Support** | ✅ Full CRUD + keyword search (7 tools) | ❌ Not implemented |
|
||||
| **Semantic Search** | ✅ Multi-app vector search (2+ tools) | ❌ Not implemented |
|
||||
| **Calendar** | ✅ Full CalDAV + tasks (20+ tools) | ✅ Events, free/busy, tasks (4 tools) |
|
||||
| **Contacts** | ✅ Full CardDAV (8 tools) | ✅ Find person, current user (2 tools) |
|
||||
| **Files (WebDAV)** | ✅ Full filesystem access (12 tools) | ✅ Read, folder tree, sharing (3 tools) |
|
||||
| **Document Processing** | ✅ OCR with progress (PDF, DOCX, images) | ❌ Not implemented |
|
||||
| **Deck** | ✅ Full project management (15 tools) | ✅ Basic board/card ops (2 tools) |
|
||||
| **Tables** | ✅ Row operations (5 tools) | ❌ Not implemented |
|
||||
| **Cookbook** | ✅ Full recipe management (13 tools) | ❌ Not implemented |
|
||||
| **Talk** | ❌ Not implemented | ✅ Messages, conversations (4 tools) |
|
||||
| **Mail** | ❌ Not implemented | ✅ Send email (2 tools) |
|
||||
| **AI Features** | ❌ Not implemented | ✅ Image gen, transcription, doc gen (4 tools) |
|
||||
| **Web/Maps** | ❌ Not implemented | ✅ Search, weather, transit (5 tools) |
|
||||
| **MCP Resources** | ✅ Structured data URIs | ❌ Not supported |
|
||||
| **External MCP** | ❌ Pure server | ✅ Consumes external MCP servers |
|
||||
| **Safety Model** | Client-controlled | Built-in safe/dangerous distinction |
|
||||
| **Best For** | • Deep CRUD operations<br/>• External integrations<br/>• OAuth security<br/>• IDE/editor integration | • AI-driven actions in Nextcloud UI<br/>• Multi-service orchestration<br/>• User task automation<br/>• MCP aggregation hub |
|
||||
|
||||
See our [detailed comparison](docs/comparison-context-agent.md) for architecture diagrams, workflow examples, and guidance on when to use each approach.
|
||||
|
||||
Want to see another Nextcloud app supported? [Open an issue](https://github.com/cbcoutinho/nextcloud-mcp-server/issues) or contribute a pull request!
|
||||
|
||||
### Authentication
|
||||
|
||||
| Mode | Security | Best For |
|
||||
|------|----------|----------|
|
||||
| **OAuth2/OIDC** ⚠️ **Experimental** | 🔒 High | Testing, evaluation (requires patch for app-specific APIs) |
|
||||
| **Basic Auth** ✅ | Lower | Development, testing, production |
|
||||
|
||||
> [!IMPORTANT]
|
||||
> **OAuth is experimental** and requires a manual patch to the `user_oidc` app for full functionality:
|
||||
> - **Required patch**: `user_oidc` app needs modifications for Bearer token support ([issue #1221](https://github.com/nextcloud/user_oidc/issues/1221))
|
||||
> - **Impact**: Without the patch, most app-specific APIs (Notes, Calendar, Contacts, Deck, etc.) will fail with 401 errors
|
||||
> - **What works without patches**: OAuth flow, PKCE support (with `oidc` v1.10.0+), OCS APIs
|
||||
> - **Production use**: Wait for upstream patch to be merged into official releases
|
||||
>
|
||||
> See [OAuth Upstream Status](docs/oauth-upstream-status.md) for detailed information on required patches and workarounds.
|
||||
|
||||
OAuth2/OIDC provides secure, per-user authentication with access tokens. See [Authentication Guide](docs/authentication.md) for details.
|
||||
> **Looking for AI features inside Nextcloud?** Nextcloud also provides [Context Agent](https://github.com/nextcloud/context_agent), which powers the Assistant app and runs as an ExApp inside Nextcloud. See [docs/comparison-context-agent.md](docs/comparison-context-agent.md) for a detailed comparison of use cases.
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Install
|
||||
Get up and running in 60 seconds using Docker:
|
||||
|
||||
```bash
|
||||
# Clone the repository
|
||||
git clone https://github.com/cbcoutinho/nextcloud-mcp-server.git
|
||||
cd nextcloud-mcp-server
|
||||
|
||||
# Install with uv (recommended)
|
||||
uv sync
|
||||
|
||||
# Or using Docker
|
||||
docker pull ghcr.io/cbcoutinho/nextcloud-mcp-server:latest
|
||||
|
||||
# Or deploy to Kubernetes with Helm
|
||||
helm repo add nextcloud-mcp https://cbcoutinho.github.io/nextcloud-mcp-server
|
||||
helm repo update
|
||||
helm install nextcloud-mcp nextcloud-mcp/nextcloud-mcp-server \
|
||||
--set nextcloud.host=https://cloud.example.com \
|
||||
--set auth.basic.username=myuser \
|
||||
--set auth.basic.password=mypassword
|
||||
```
|
||||
|
||||
See [Installation Guide](docs/installation.md) for detailed instructions, or [Helm Chart README](charts/nextcloud-mcp-server/README.md) for Kubernetes deployment.
|
||||
|
||||
### 2. Configure
|
||||
|
||||
Create a `.env` file:
|
||||
|
||||
```bash
|
||||
# Copy the sample
|
||||
cp env.sample .env
|
||||
```
|
||||
|
||||
**For Basic Auth (recommended for most users):**
|
||||
```dotenv
|
||||
# 1. Create a minimal configuration
|
||||
cat > .env << EOF
|
||||
NEXTCLOUD_HOST=https://your.nextcloud.instance.com
|
||||
NEXTCLOUD_USERNAME=your_username
|
||||
NEXTCLOUD_PASSWORD=your_app_password
|
||||
```
|
||||
EOF
|
||||
|
||||
**For OAuth (experimental - requires patches):**
|
||||
```dotenv
|
||||
NEXTCLOUD_HOST=https://your.nextcloud.instance.com
|
||||
```
|
||||
|
||||
See [Configuration Guide](docs/configuration.md) for all options.
|
||||
|
||||
### 3. Set Up Authentication
|
||||
|
||||
**Basic Auth Setup (recommended):**
|
||||
1. Create an app password in Nextcloud (Settings → Security → Devices & sessions)
|
||||
2. Add credentials to `.env` file
|
||||
3. Start the server
|
||||
|
||||
**OAuth Setup (experimental):**
|
||||
1. Install Nextcloud OIDC apps (`oidc` v1.10.0+ + `user_oidc`)
|
||||
2. **Apply required patch** to `user_oidc` app for Bearer token support (see [OAuth Upstream Status](docs/oauth-upstream-status.md))
|
||||
3. Enable dynamic client registration or create an OIDC client with id & secret
|
||||
4. Configure Bearer token validation in `user_oidc`
|
||||
5. Start the server
|
||||
|
||||
See [OAuth Quick Start](docs/quickstart-oauth.md) for 5-minute setup or [OAuth Setup Guide](docs/oauth-setup.md) for detailed instructions.
|
||||
|
||||
### 4. Run the Server
|
||||
|
||||
```bash
|
||||
# Load environment variables
|
||||
export $(grep -v '^#' .env | xargs)
|
||||
|
||||
# Start with Basic Auth (default)
|
||||
uv run nextcloud-mcp-server
|
||||
|
||||
# Or start with OAuth (experimental - requires patches)
|
||||
uv run nextcloud-mcp-server --oauth
|
||||
|
||||
# Or with Docker
|
||||
# 2. Start the server
|
||||
docker run -p 127.0.0.1:8000:8000 --env-file .env --rm \
|
||||
ghcr.io/cbcoutinho/nextcloud-mcp-server:latest
|
||||
|
||||
# 3. Test the connection
|
||||
curl http://127.0.0.1:8000/health
|
||||
```
|
||||
|
||||
The server starts on `http://127.0.0.1:8000` by default.
|
||||
**Next Steps:**
|
||||
- Create an app password in Nextcloud: Settings → Security → Devices & sessions
|
||||
- Connect your MCP client (Claude Desktop, IDEs, `mcp dev`, etc.)
|
||||
- See [docs/installation.md](docs/installation.md) for other deployment options (local, Kubernetes)
|
||||
|
||||
See [Running the Server](docs/running.md) for more options.
|
||||
## Key Features
|
||||
|
||||
### 5. Connect an MCP Client
|
||||
- **90+ MCP Tools** - Comprehensive API coverage across 8 Nextcloud apps
|
||||
- **MCP Resources** - Structured data URIs for browsing Nextcloud data
|
||||
- **Semantic Search (Experimental)** - Optional vector-powered search for Notes (requires Qdrant + Ollama)
|
||||
- **Document Processing** - OCR and text extraction from PDFs, DOCX, images with progress notifications
|
||||
- **Flexible Deployment** - Docker, Kubernetes (Helm), VM, or local installation
|
||||
- **Production-Ready Auth** - Basic Auth with app passwords (recommended) or OAuth2/OIDC (experimental)
|
||||
- **Multiple Transports** - SSE, HTTP, and streamable-http support
|
||||
|
||||
Test with MCP Inspector:
|
||||
## Supported Apps
|
||||
|
||||
```bash
|
||||
uv run mcp dev
|
||||
```
|
||||
| App | Tools | Capabilities |
|
||||
|-----|-------|--------------|
|
||||
| **Notes** | 7 | Full CRUD, keyword search, semantic search |
|
||||
| **Calendar** | 20+ | Events, todos (tasks), recurring events, attendees, availability |
|
||||
| **Contacts** | 8 | Full CardDAV support, address books |
|
||||
| **Files (WebDAV)** | 12 | Filesystem access, OCR/document processing |
|
||||
| **Deck** | 15 | Boards, stacks, cards, labels, assignments |
|
||||
| **Cookbook** | 13 | Recipe management, URL import (schema.org) |
|
||||
| **Tables** | 5 | Row operations on Nextcloud Tables |
|
||||
| **Sharing** | 10+ | Create and manage shares |
|
||||
| **Semantic Search** | 2+ | Vector search for Notes (experimental, opt-in, requires infrastructure) |
|
||||
|
||||
Or connect from:
|
||||
- Claude Desktop
|
||||
- Any MCP-compatible client
|
||||
Want to see another Nextcloud app supported? [Open an issue](https://github.com/cbcoutinho/nextcloud-mcp-server/issues) or contribute a pull request!
|
||||
|
||||
## Authentication
|
||||
|
||||
> [!IMPORTANT]
|
||||
> **OAuth2/OIDC is experimental** and requires a manual patch to the `user_oidc` app:
|
||||
> - **Required patch**: Bearer token support ([issue #1221](https://github.com/nextcloud/user_oidc/issues/1221))
|
||||
> - **Impact**: Without the patch, most app-specific APIs fail with 401 errors
|
||||
> - **Recommendation**: Use Basic Auth for production until upstream patches are merged
|
||||
>
|
||||
> See [docs/oauth-upstream-status.md](docs/oauth-upstream-status.md) for patch status and workarounds.
|
||||
|
||||
**Recommended:** Basic Auth with app-specific passwords provides secure, production-ready authentication. See [docs/authentication.md](docs/authentication.md) for setup details and OAuth configuration.
|
||||
|
||||
### Authentication Modes
|
||||
|
||||
The server supports two authentication modes:
|
||||
|
||||
**Single-User Mode (BasicAuth):**
|
||||
- One set of credentials shared by all MCP clients
|
||||
- Simple setup: username + app password in environment variables
|
||||
- All clients access Nextcloud as the same user
|
||||
- Best for: Personal use, development, single-user deployments
|
||||
|
||||
**Multi-User Mode (OAuth):**
|
||||
- Each MCP client authenticates separately with their own Nextcloud account
|
||||
- Per-user scopes and permissions (clients only see tools they're authorized for)
|
||||
- More secure: tokens expire, credentials never shared with server
|
||||
- Best for: Teams, multi-user deployments, production environments with multiple users
|
||||
|
||||
See [docs/authentication.md](docs/authentication.md) for detailed setup instructions.
|
||||
|
||||
## Semantic Search
|
||||
|
||||
The server provides an experimental RAG pipeline to enable _Semantic Search_ that enables MCP clients to find information in Nextcloud based on **meaning** rather than just keywords. Instead of matching "machine learning" only when those exact words appear, it understands that "neural networks," "AI models," and "deep learning" are semantically related concepts.
|
||||
|
||||
**Example:**
|
||||
- **Keyword search**: Query "car" only finds notes containing "car"
|
||||
- **Semantic search**: Query "car" also finds notes about "automobile," "vehicle," "sedan," "transportation"
|
||||
|
||||
This enables natural language queries and helps discover related content across your Nextcloud notes.
|
||||
|
||||
> [!NOTE]
|
||||
> **Semantic Search is experimental and opt-in:**
|
||||
> - Disabled by default (`VECTOR_SYNC_ENABLED=false`)
|
||||
> - Currently supports Notes app only (multi-app support planned)
|
||||
> - Requires additional infrastructure: vector database + embedding service
|
||||
> - Answer generation (`nc_semantic_search_answer`) requires MCP client sampling support
|
||||
>
|
||||
> See [docs/semantic-search-architecture.md](docs/semantic-search-architecture.md) for architecture details and [docs/configuration.md](docs/configuration.md) for setup instructions.
|
||||
|
||||
## Documentation
|
||||
|
||||
### Getting Started
|
||||
- **[Installation](docs/installation.md)** - Install the server
|
||||
- **[Configuration](docs/configuration.md)** - Environment variables and settings
|
||||
- **[Authentication](docs/authentication.md)** - OAuth vs BasicAuth
|
||||
- **[Running the Server](docs/running.md)** - Start and manage the server
|
||||
- **[Installation](docs/installation.md)** - Docker, Kubernetes, local, or VM deployment
|
||||
- **[Configuration](docs/configuration.md)** - Environment variables and advanced options
|
||||
- **[Authentication](docs/authentication.md)** - Basic Auth vs OAuth2/OIDC setup
|
||||
- **[Running the Server](docs/running.md)** - Start, manage, and troubleshoot
|
||||
|
||||
### Architecture
|
||||
- **[Comparison with Context Agent](docs/comparison-context-agent.md)** - How this MCP server differs from Nextcloud's Context Agent
|
||||
### Features
|
||||
- **[App Documentation](docs/)** - Notes, Calendar, Contacts, WebDAV, Deck, Cookbook, Tables
|
||||
- **[Document Processing](docs/configuration.md#document-processing)** - OCR and text extraction setup
|
||||
- **[Semantic Search Architecture](docs/semantic-search-architecture.md)** - Experimental vector search (Notes only, opt-in)
|
||||
|
||||
### OAuth Documentation (Experimental)
|
||||
- **[OAuth Quick Start](docs/quickstart-oauth.md)** - 5-minute setup guide
|
||||
- **[OAuth Setup Guide](docs/oauth-setup.md)** - Detailed setup instructions
|
||||
- **[OAuth Architecture](docs/oauth-architecture.md)** - How OAuth works
|
||||
- **[OAuth Troubleshooting](docs/oauth-troubleshooting.md)** - OAuth-specific issues
|
||||
- **[Upstream Status](docs/oauth-upstream-status.md)** - **Required patches and PRs** ⚠️
|
||||
|
||||
### Reference
|
||||
### Advanced Topics
|
||||
- **[OAuth Architecture](docs/oauth-architecture.md)** - How OAuth works (experimental)
|
||||
- **[OAuth Quick Start](docs/quickstart-oauth.md)** - 5-minute OAuth setup
|
||||
- **[OAuth Setup Guide](docs/oauth-setup.md)** - Detailed OAuth configuration
|
||||
- **[Troubleshooting](docs/troubleshooting.md)** - Common issues and solutions
|
||||
|
||||
### App-Specific Documentation
|
||||
- [Notes API](docs/notes.md)
|
||||
- [Calendar (CalDAV)](docs/calendar.md)
|
||||
- [Contacts (CardDAV)](docs/contacts.md)
|
||||
- [Cookbook](docs/cookbook.md)
|
||||
- [Deck](docs/deck.md)
|
||||
- [Tables](docs/table.md)
|
||||
- [WebDAV](docs/webdav.md)
|
||||
|
||||
## MCP Tools & Resources
|
||||
|
||||
The server exposes Nextcloud functionality through MCP tools (for actions) and resources (for data browsing).
|
||||
|
||||
### Tools
|
||||
|
||||
The server provides 90+ tools across 8 Nextcloud apps. When using OAuth, tools are dynamically filtered based on your granted scopes.
|
||||
|
||||
For a complete list of all supported OAuth scopes and their descriptions, see [OAuth Scopes Documentation](docs/oauth-architecture.md#oauth-scopes).
|
||||
|
||||
#### Available Tool Categories
|
||||
|
||||
| App | Tools | Read Scope | Write Scope | Operations |
|
||||
|-----|-------|-----------|-------------|------------|
|
||||
| **Notes** | 7 | `notes:read` | `notes:write` | Create, read, update, delete, search notes (keyword search) |
|
||||
| **Calendar** | 20+ | `calendar:read` `todo:read` | `calendar:write` `todo:write` | Events, todos (tasks), calendars, recurring events, attendees |
|
||||
| **Contacts** | 8 | `contacts:read` | `contacts:write` | Create, read, update, delete contacts and address books |
|
||||
| **Files (WebDAV)** | 12 | `files:read` | `files:write` | List, read, upload, delete, move files; **OCR/document processing** |
|
||||
| **Deck** | 15 | `deck:read` | `deck:write` | Boards, stacks, cards, labels, assignments |
|
||||
| **Cookbook** | 13 | `cookbook:read` | `cookbook:write` | Recipes, import from URLs, search, categories |
|
||||
| **Tables** | 5 | `tables:read` | `tables:write` | Row operations on Nextcloud Tables |
|
||||
| **Sharing** | 10+ | `sharing:read` | `sharing:write` | Create, manage, delete shares |
|
||||
| **Semantic Search** | 2+ | `semantic:read` | `semantic:write` | Vector-powered semantic search across **all apps** (notes, calendar, deck, files, contacts), background indexing |
|
||||
|
||||
#### Document Processing (Optional)
|
||||
|
||||
The WebDAV file reading tool (`nc_webdav_read_file`) supports **automatic text extraction** from documents and images:
|
||||
|
||||
**Supported Formats:**
|
||||
- **Documents**: PDF, DOCX, PPTX, XLSX, RTF, ODT, EPUB
|
||||
- **Images**: PNG, JPEG, TIFF, BMP (with OCR)
|
||||
- **Email**: EML, MSG files
|
||||
|
||||
**Features:**
|
||||
- **Progress Notifications**: Long-running OCR operations (up to 120s) send progress updates every 10 seconds to prevent client timeouts
|
||||
- **Pluggable Architecture**: Multiple processor backends (Unstructured.io, Tesseract, custom HTTP APIs)
|
||||
- **Automatic Detection**: Files are processed based on MIME type
|
||||
- **Graceful Fallback**: Returns base64-encoded content if processing fails
|
||||
|
||||
**Configuration:**
|
||||
```dotenv
|
||||
# Enable document processing (optional)
|
||||
ENABLE_DOCUMENT_PROCESSING=true
|
||||
|
||||
# Unstructured.io processor (cloud/API-based, supports many formats)
|
||||
ENABLE_UNSTRUCTURED=true
|
||||
UNSTRUCTURED_API_URL=http://localhost:8002
|
||||
UNSTRUCTURED_STRATEGY=auto # auto, fast, or hi_res
|
||||
UNSTRUCTURED_LANGUAGES=eng,deu
|
||||
PROGRESS_INTERVAL=10 # Progress update interval in seconds
|
||||
|
||||
# Tesseract processor (local OCR, images only)
|
||||
ENABLE_TESSERACT=false
|
||||
TESSERACT_LANG=eng
|
||||
|
||||
# Custom HTTP processor
|
||||
ENABLE_CUSTOM_PROCESSOR=false
|
||||
CUSTOM_PROCESSOR_URL=http://localhost:9000/process
|
||||
CUSTOM_PROCESSOR_TYPES=application/pdf,image/jpeg
|
||||
```
|
||||
|
||||
**Example Usage:**
|
||||
```
|
||||
AI: "Read the contents of Documents/report.pdf"
|
||||
→ Uses nc_webdav_read_file tool with automatic OCR processing
|
||||
→ Returns extracted text with parsing metadata
|
||||
→ Sends progress updates during long operations
|
||||
```
|
||||
|
||||
See [env.sample](env.sample) for complete configuration options.
|
||||
|
||||
**Example Tools:**
|
||||
- `nc_notes_create_note` - Create a new note
|
||||
- `nc_cookbook_import_recipe` - Import recipes from URLs with schema.org metadata
|
||||
- `deck_create_card` - Create a Deck card
|
||||
- `nc_calendar_create_event` - Create a calendar event
|
||||
- `nc_calendar_create_todo` - Create a CalDAV task/todo
|
||||
- `nc_contacts_create_contact` - Create a contact
|
||||
- `nc_webdav_upload_file` - Upload a file to Nextcloud
|
||||
- And 80+ more...
|
||||
|
||||
> [!TIP]
|
||||
> **OAuth Scope Filtering**: When connecting via OAuth, MCP clients will only see tools for which you've granted access. For example, granting only `notes:read` and `notes:write` will show 7 Notes tools instead of all 90+ tools. See [OAuth Scopes Documentation](docs/oauth-architecture.md#oauth-scopes) for the complete scope reference, or [OAuth Troubleshooting - Limited Scopes](docs/oauth-troubleshooting.md#limited-scopes---only-seeing-notes-tools) if you're only seeing a subset of tools.
|
||||
>
|
||||
> **Known Issue**: Claude Code and some other MCP clients may only request/grant Notes scopes during initial connection. Track progress at [#234](https://github.com/cbcoutinho/nextcloud-mcp-server/issues/234).
|
||||
|
||||
### Resources
|
||||
Resources provide read-only access to Nextcloud data:
|
||||
- `nc://capabilities` - Server capabilities
|
||||
- `cookbook://version` - Cookbook app version info
|
||||
- `nc://Deck/boards/{board_id}` - Deck board data
|
||||
- `notes://settings` - Notes app settings
|
||||
- And more...
|
||||
|
||||
Run `uv run nextcloud-mcp-server --help` to see all available options.
|
||||
- **[Comparison with Context Agent](docs/comparison-context-agent.md)** - When to use each approach
|
||||
|
||||
## Examples
|
||||
|
||||
@@ -291,45 +139,31 @@ AI: "Create a note called 'Meeting Notes' with today's agenda"
|
||||
→ Uses nc_notes_create_note tool
|
||||
```
|
||||
|
||||
### Manage Recipes
|
||||
### Import Recipes
|
||||
```
|
||||
AI: "Import the recipe from this URL: https://www.example.com/recipe/chocolate-cake"
|
||||
→ Uses nc_cookbook_import_recipe tool to extract schema.org metadata
|
||||
AI: "Import the recipe from https://www.example.com/recipe/chocolate-cake"
|
||||
→ Uses nc_cookbook_import_recipe tool with schema.org metadata extraction
|
||||
```
|
||||
|
||||
### Manage Calendar
|
||||
### Schedule Meetings
|
||||
```
|
||||
AI: "Schedule a team meeting for next Tuesday at 2pm"
|
||||
→ Uses nc_calendar_create_event tool
|
||||
```
|
||||
|
||||
### Organize Files
|
||||
### Manage Files
|
||||
```
|
||||
AI: "Create a folder called 'Project X' and move all PDFs there"
|
||||
→ Uses WebDAV tools (nc_webdav_create_directory, nc_webdav_move)
|
||||
→ Uses nc_webdav_create_directory and nc_webdav_move tools
|
||||
```
|
||||
|
||||
### Project Management
|
||||
### Semantic Search (Experimental, Opt-in)
|
||||
```
|
||||
AI: "Create a new Deck board for Q1 planning with Todo, In Progress, and Done stacks"
|
||||
→ Uses deck_create_board and deck_create_stack tools
|
||||
AI: "Find notes related to machine learning concepts"
|
||||
→ Uses nc_semantic_search to find semantically similar notes (requires Qdrant + Ollama setup)
|
||||
```
|
||||
|
||||
## Transport Protocols
|
||||
|
||||
The server supports multiple MCP transport protocols:
|
||||
|
||||
- **streamable-http** (recommended) - Modern streaming protocol
|
||||
- **sse** (default, deprecated) - Server-Sent Events for backward compatibility
|
||||
- **http** - Standard HTTP protocol
|
||||
|
||||
```bash
|
||||
# Use streamable-http (recommended)
|
||||
uv run nextcloud-mcp-server --transport streamable-http
|
||||
```
|
||||
|
||||
> [!WARNING]
|
||||
> SSE transport is deprecated and will be removed in a future MCP specification version. Please migrate to `streamable-http`.
|
||||
**Note:** For AI-generated answers with citations, use `nc_semantic_search_answer` (requires MCP client with sampling support).
|
||||
|
||||
## Contributing
|
||||
|
||||
@@ -337,17 +171,17 @@ Contributions are welcome!
|
||||
|
||||
- Report bugs or request features: [GitHub Issues](https://github.com/cbcoutinho/nextcloud-mcp-server/issues)
|
||||
- Submit improvements: [Pull Requests](https://github.com/cbcoutinho/nextcloud-mcp-server/pulls)
|
||||
- Read [CLAUDE.md](CLAUDE.md) for development guidelines
|
||||
- Development guidelines: [CLAUDE.md](CLAUDE.md)
|
||||
|
||||
## Security
|
||||
|
||||
[](https://mseep.ai/app/cbcoutinho-nextcloud-mcp-server)
|
||||
|
||||
This project takes security seriously:
|
||||
- OAuth2/OIDC support (experimental - requires upstream patches)
|
||||
- Basic Auth with app-specific passwords (recommended)
|
||||
- No credential storage with OAuth mode
|
||||
- Production-ready Basic Auth with app-specific passwords
|
||||
- OAuth2/OIDC support (experimental, requires upstream patches)
|
||||
- Per-user access tokens
|
||||
- No credential storage in OAuth mode
|
||||
- Regular security assessments
|
||||
|
||||
Found a security issue? Please report it privately to the maintainers.
|
||||
|
||||
+14
-7
@@ -88,20 +88,27 @@ services:
|
||||
- VECTOR_SYNC_SCAN_INTERVAL=10
|
||||
- VECTOR_SYNC_PROCESSOR_WORKERS=1
|
||||
|
||||
- LOG_FORMAT=json
|
||||
- LOG_FORMAT=text
|
||||
|
||||
# Qdrant configuration (three modes):
|
||||
# 1. Network mode: Set QDRANT_URL=http://qdrant:6333 (requires qdrant service)
|
||||
# 2. In-memory mode: Set QDRANT_LOCATION=:memory: (default if nothing set)
|
||||
# 3. Persistent local: Set QDRANT_LOCATION=/app/data/qdrant (stored in mcp-data volume)
|
||||
- QDRANT_LOCATION=/app/data/qdrant
|
||||
# - QDRANT_URL=http://qdrant:6333 # Uncomment for network mode
|
||||
# - QDRANT_API_KEY=${QDRANT_API_KEY:-my_secret_api_key} # Only for network mode
|
||||
#- QDRANT_LOCATION=/app/data/qdrant
|
||||
- QDRANT_URL=http://qdrant:6333 # Uncomment for network mode
|
||||
- QDRANT_API_KEY=${QDRANT_API_KEY:-my_secret_api_key} # Only for network mode
|
||||
|
||||
# Collection naming: Auto-generated as {deployment-id}-{model-name}
|
||||
# - Deployment ID: OTEL_SERVICE_NAME (if set) or hostname (fallback)
|
||||
# - Model name: OLLAMA_EMBEDDING_MODEL
|
||||
# - Example: "nextcloud-mcp-server-nomic-embed-text"
|
||||
# - Changing models creates new collection (requires re-embedding)
|
||||
# - Set QDRANT_COLLECTION to override auto-generation:
|
||||
- QDRANT_COLLECTION=nextcloud_content
|
||||
|
||||
# Ollama configuration (optional - uses SimpleEmbeddingProvider if not set)
|
||||
# - OLLAMA_BASE_URL=http://your-ollama-endpoint:port
|
||||
# - OLLAMA_EMBEDDING_MODEL=nomic-embed-text
|
||||
- OLLAMA_BASE_URL=https://ollama.internal.coutinho.io:443
|
||||
- OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Changing this creates new collection
|
||||
# - OLLAMA_VERIFY_SSL=false
|
||||
|
||||
mcp-oauth:
|
||||
@@ -207,7 +214,7 @@ services:
|
||||
- keycloak-oauth-storage:/app/.oauth
|
||||
|
||||
qdrant:
|
||||
image: qdrant/qdrant:latest
|
||||
image: qdrant/qdrant:v1.15.5
|
||||
restart: always
|
||||
ports:
|
||||
- 127.0.0.1:6333:6333 # REST API
|
||||
|
||||
@@ -178,6 +178,111 @@ VECTOR_SYNC_ENABLED=true
|
||||
- Requires separate Qdrant service
|
||||
- More complex deployment
|
||||
|
||||
### Qdrant Collection Naming
|
||||
|
||||
Collection names are automatically generated to include the embedding model, ensuring safe model switching and preventing dimension mismatches.
|
||||
|
||||
#### Auto-Generated Naming (Default)
|
||||
|
||||
**Format:** `{deployment-id}-{model-name}`
|
||||
|
||||
**Components:**
|
||||
- **Deployment ID:** `OTEL_SERVICE_NAME` (if configured) or `hostname` (fallback)
|
||||
- **Model name:** `OLLAMA_EMBEDDING_MODEL`
|
||||
|
||||
**Examples:**
|
||||
|
||||
```bash
|
||||
# With OTEL service name configured
|
||||
OTEL_SERVICE_NAME=my-mcp-server
|
||||
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
|
||||
# → Collection: "my-mcp-server-nomic-embed-text"
|
||||
|
||||
# Simple Docker deployment (OTEL not configured)
|
||||
# hostname=mcp-container
|
||||
OLLAMA_EMBEDDING_MODEL=all-minilm
|
||||
# → Collection: "mcp-container-all-minilm"
|
||||
```
|
||||
|
||||
#### Switching Embedding Models
|
||||
|
||||
When you change `OLLAMA_EMBEDDING_MODEL`, a new collection is automatically created:
|
||||
|
||||
```bash
|
||||
# Initial setup
|
||||
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
|
||||
# Collection: "my-server-nomic-embed-text" (768 dimensions)
|
||||
|
||||
# Change model
|
||||
OLLAMA_EMBEDDING_MODEL=all-minilm
|
||||
# Collection: "my-server-all-minilm" (384 dimensions)
|
||||
# → New collection created, full re-embedding occurs
|
||||
```
|
||||
|
||||
**Important:**
|
||||
- **Collections are mutually exclusive** - vectors cannot be shared between different embedding models
|
||||
- **Switching models requires re-embedding** all documents (may take time for large note collections)
|
||||
- **Old collection remains** in Qdrant and can be deleted manually if no longer needed
|
||||
|
||||
#### Explicit Override
|
||||
|
||||
Set `QDRANT_COLLECTION` to use a specific collection name:
|
||||
|
||||
```bash
|
||||
QDRANT_COLLECTION=my-custom-collection # Bypasses auto-generation
|
||||
```
|
||||
|
||||
**Use cases:**
|
||||
- Backward compatibility with existing deployments
|
||||
- Custom naming schemes
|
||||
- Sharing a collection across deployments (advanced)
|
||||
|
||||
#### Multi-Server Deployments
|
||||
|
||||
Each server should have a unique deployment ID to avoid collection collisions:
|
||||
|
||||
```bash
|
||||
# Server 1 (Production)
|
||||
OTEL_SERVICE_NAME=mcp-prod
|
||||
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
|
||||
# → Collection: "mcp-prod-nomic-embed-text"
|
||||
|
||||
# Server 2 (Staging)
|
||||
OTEL_SERVICE_NAME=mcp-staging
|
||||
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
|
||||
# → Collection: "mcp-staging-nomic-embed-text"
|
||||
|
||||
# Server 3 (Different model)
|
||||
OTEL_SERVICE_NAME=mcp-experimental
|
||||
OLLAMA_EMBEDDING_MODEL=bge-large
|
||||
# → Collection: "mcp-experimental-bge-large"
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- Multiple MCP servers can share one Qdrant instance safely
|
||||
- No naming collisions between deployments
|
||||
- Clear collection ownership (can see which deployment and model)
|
||||
|
||||
#### Dimension Validation
|
||||
|
||||
The server validates collection dimensions on startup:
|
||||
|
||||
```
|
||||
Dimension mismatch for collection 'my-server-nomic-embed-text':
|
||||
Expected: 384 (from embedding model 'all-minilm')
|
||||
Found: 768
|
||||
This usually means you changed the embedding model.
|
||||
Solutions:
|
||||
1. Delete the old collection: Collection will be recreated with new dimensions
|
||||
2. Set QDRANT_COLLECTION to use a different collection name
|
||||
3. Revert OLLAMA_EMBEDDING_MODEL to the original model
|
||||
```
|
||||
|
||||
**What this prevents:**
|
||||
- Runtime errors from dimension mismatches
|
||||
- Data corruption in Qdrant
|
||||
- Confusing error messages during indexing
|
||||
|
||||
### Vector Sync Configuration
|
||||
|
||||
Control background indexing behavior:
|
||||
|
||||
@@ -0,0 +1,874 @@
|
||||
# Semantic Search Architecture
|
||||
|
||||
This document explains the architecture of the semantic search feature in the Nextcloud MCP Server, including background synchronization, vector search, and optional AI-generated answers via MCP sampling.
|
||||
|
||||
> [!IMPORTANT]
|
||||
> **Status: Experimental**
|
||||
> - Disabled by default (`VECTOR_SYNC_ENABLED=false`)
|
||||
> - Currently supports **Notes app only** (multi-app architecture ready, additional apps planned)
|
||||
> - Requires additional infrastructure (Qdrant vector database + Ollama embedding service)
|
||||
> - RAG answer generation requires MCP client sampling support
|
||||
|
||||
## Overview
|
||||
|
||||
### What is Semantic Search?
|
||||
|
||||
**Semantic search** finds information based on **meaning** rather than exact keyword matches. It uses vector embeddings to understand that "car" and "automobile" are similar, or that "bread recipe" matches "how to bake bread."
|
||||
|
||||
**Traditional keyword search:**
|
||||
```
|
||||
Query: "machine learning"
|
||||
Matches: Only notes containing "machine learning" exactly
|
||||
Misses: Notes with "neural networks", "AI models", "deep learning"
|
||||
```
|
||||
|
||||
**Semantic search:**
|
||||
```
|
||||
Query: "machine learning"
|
||||
Matches: Notes about machine learning, neural networks, AI, deep learning, etc.
|
||||
Understanding: Semantic similarity via vector embeddings
|
||||
```
|
||||
|
||||
### Why It Matters
|
||||
|
||||
Semantic search enables:
|
||||
- **Natural language queries** - Ask questions in plain language
|
||||
- **Conceptual discovery** - Find related content even with different terminology
|
||||
- **Cross-reference insights** - Connect ideas across your knowledge base
|
||||
- **AI-powered answers** - Generate summaries with citations (optional, requires MCP sampling)
|
||||
|
||||
### Current Support
|
||||
|
||||
- **Supported Apps**: Notes (fully implemented)
|
||||
- **Planned Apps**: Calendar events, Calendar tasks, Deck cards, Files (with text extraction), Contacts
|
||||
- **Architecture**: Multi-app plugin system ready, awaiting implementation
|
||||
|
||||
## System Components
|
||||
|
||||
```mermaid
|
||||
graph TB
|
||||
subgraph "MCP Client"
|
||||
Client[Claude Desktop, IDEs, etc.]
|
||||
end
|
||||
|
||||
subgraph "Nextcloud MCP Server"
|
||||
MCP[MCP Server]
|
||||
Scanner[Background Scanner<br/>Hourly Change Detection]
|
||||
Queue[Document Queue]
|
||||
Processor[Embedding Processors<br/>Concurrent Workers]
|
||||
end
|
||||
|
||||
subgraph "Infrastructure"
|
||||
Qdrant[(Qdrant<br/>Vector Database)]
|
||||
Ollama[Ollama<br/>Embedding Service]
|
||||
NC[Nextcloud<br/>Notes API, CalDAV, etc.]
|
||||
end
|
||||
|
||||
Client <-->|MCP Protocol| MCP
|
||||
Scanner -->|Fetch Changes| NC
|
||||
Scanner -->|Enqueue Documents| Queue
|
||||
Queue -->|Process Batch| Processor
|
||||
Processor -->|Generate Embeddings| Ollama
|
||||
Processor -->|Store Vectors| Qdrant
|
||||
MCP -->|Search Queries| Qdrant
|
||||
MCP -->|Verify Access| NC
|
||||
```
|
||||
|
||||
**Component Roles:**
|
||||
|
||||
- **MCP Server**: Exposes semantic search tools (`nc_semantic_search`, `nc_semantic_search_answer`, `nc_get_vector_sync_status`)
|
||||
- **Background Scanner**: Discovers changed documents every hour using ETag-based change detection
|
||||
- **Document Queue**: Holds pending documents for embedding generation
|
||||
- **Embedding Processors**: Generate vector embeddings via Ollama (concurrent workers)
|
||||
- **Qdrant Vector Database**: Stores document vectors with metadata and user_id filtering
|
||||
- **Ollama Embedding Service**: Converts text to 768-dimensional vectors (default: `nomic-embed-text` model)
|
||||
- **Nextcloud APIs**: Source of truth for documents and access control verification
|
||||
|
||||
## How It Works: Background Synchronization
|
||||
|
||||
Background synchronization runs automatically when `VECTOR_SYNC_ENABLED=true`, discovering changes and indexing documents without user intervention.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant Timer
|
||||
participant Scanner
|
||||
participant NC as Nextcloud API
|
||||
participant Queue
|
||||
participant Processor
|
||||
participant Ollama
|
||||
participant Qdrant
|
||||
|
||||
Timer->>Scanner: Trigger (hourly)
|
||||
Scanner->>NC: Fetch all notes<br/>(Notes API)
|
||||
NC-->>Scanner: Notes with ETags
|
||||
Scanner->>Qdrant: Check indexed documents
|
||||
Qdrant-->>Scanner: Existing ETags
|
||||
Scanner->>Scanner: Identify changes<br/>(new/modified/deleted)
|
||||
Scanner->>Queue: Enqueue changed docs
|
||||
|
||||
loop Continuous Processing
|
||||
Processor->>Queue: Fetch batch
|
||||
Queue-->>Processor: Documents
|
||||
Processor->>Ollama: Generate embeddings
|
||||
Ollama-->>Processor: 768-dim vectors
|
||||
Processor->>Qdrant: Upsert vectors<br/>(with user_id, doc_type)
|
||||
end
|
||||
```
|
||||
|
||||
### Scanner Behavior
|
||||
|
||||
**Hourly Trigger:**
|
||||
- Runs every hour (configurable)
|
||||
- Fetches all notes from Nextcloud Notes API
|
||||
- Compares ETags with Qdrant's indexed state
|
||||
- Enqueues new/modified documents
|
||||
|
||||
**Change Detection:**
|
||||
- **New documents**: No entry in Qdrant → enqueue for indexing
|
||||
- **Modified documents**: ETag mismatch → enqueue for re-indexing
|
||||
- **Deleted documents**: In Qdrant but not in Nextcloud → delete from Qdrant
|
||||
|
||||
**Multi-App Plugin Architecture:**
|
||||
```python
|
||||
# Each app implements DocumentScanner interface
|
||||
class NotesScanner(DocumentScanner):
|
||||
async def scan(self) -> list[Document]:
|
||||
# Fetch notes, detect changes, return documents
|
||||
```
|
||||
|
||||
Currently only `NotesScanner` is implemented. Future: `CalendarScanner`, `DeckScanner`, `FilesScanner`, etc.
|
||||
|
||||
### Queue Processing
|
||||
|
||||
**Document Queue:**
|
||||
- In-memory FIFO queue (not persistent across restarts)
|
||||
- Holds documents pending embedding generation
|
||||
- Batch processing for efficiency
|
||||
|
||||
**Processor Pool:**
|
||||
- Concurrent workers using `anyio.TaskGroup`
|
||||
- Process documents in parallel (default: 4 workers)
|
||||
- Each worker: fetch document → generate embedding → store in Qdrant
|
||||
|
||||
**Backpressure Handling:**
|
||||
- Queue size limits prevent memory exhaustion
|
||||
- Slow consumers (Ollama) naturally pace the system
|
||||
|
||||
### Vector Storage
|
||||
|
||||
**Qdrant Collection Schema:**
|
||||
```
|
||||
{
|
||||
"id": "note_123",
|
||||
"vector": [768 dimensions],
|
||||
"payload": {
|
||||
"user_id": "alice",
|
||||
"doc_type": "note",
|
||||
"doc_id": "123",
|
||||
"title": "Machine Learning Notes",
|
||||
"content": "Neural networks are...",
|
||||
"etag": "abc123",
|
||||
"last_modified": "2025-01-15T10:30:00Z"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Key Fields:**
|
||||
- `user_id`: Multi-tenancy filtering (each user's vectors isolated)
|
||||
- `doc_type`: App identifier ("note", "event", "card", etc.)
|
||||
- `etag`: Change detection for incremental updates
|
||||
|
||||
### Collection Naming and Model Switching
|
||||
|
||||
**Auto-generated collection names:**
|
||||
- **Format:** `{deployment-id}-{model-name}`
|
||||
- **Deployment ID:** `OTEL_SERVICE_NAME` (if configured) or `hostname` (fallback)
|
||||
- **Model name:** `OLLAMA_EMBEDDING_MODEL`
|
||||
- **Example:** `"my-mcp-server-nomic-embed-text"`, `"mcp-container-all-minilm"`
|
||||
|
||||
**Why model-based naming:**
|
||||
- Ensures each embedding model gets its own collection
|
||||
- Prevents dimension mismatches when switching models
|
||||
- Enables safe model experimentation (new model = new collection)
|
||||
- Supports multi-server deployments (different deployment IDs)
|
||||
|
||||
**Switching embedding models:**
|
||||
|
||||
Collections are **mutually exclusive** - vectors from one embedding model cannot be used with another. When you change the embedding model:
|
||||
|
||||
1. **New collection is created** with the new model's dimensions
|
||||
2. **Full re-embedding occurs** - scanner processes all documents again
|
||||
3. **Old collection remains** - can be deleted manually if no longer needed
|
||||
4. **Dimension validation** - server fails fast if collection dimension doesn't match model
|
||||
|
||||
**Example workflow:**
|
||||
```bash
|
||||
# Start with nomic-embed-text (768 dimensions)
|
||||
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
|
||||
# Collection: "my-server-nomic-embed-text"
|
||||
# → Scanner indexes 1000 notes → 1000 vectors in collection
|
||||
|
||||
# Switch to all-minilm (384 dimensions)
|
||||
OLLAMA_EMBEDDING_MODEL=all-minilm
|
||||
# Collection: "my-server-all-minilm"
|
||||
# → Scanner detects 0 indexed documents → re-embeds 1000 notes
|
||||
# → Old collection "my-server-nomic-embed-text" still exists in Qdrant
|
||||
```
|
||||
|
||||
**Re-embedding performance:**
|
||||
- CPU-only: 1-5 notes/second
|
||||
- With GPU: 50-200 notes/second
|
||||
- 1000 notes: 3-16 minutes (CPU) or 5-20 seconds (GPU)
|
||||
|
||||
**Multi-server deployments:**
|
||||
|
||||
Multiple MCP servers can share one Qdrant instance safely:
|
||||
|
||||
```bash
|
||||
# Server 1 (Production)
|
||||
OTEL_SERVICE_NAME=mcp-prod
|
||||
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
|
||||
# → Collection: "mcp-prod-nomic-embed-text"
|
||||
|
||||
# Server 2 (Staging with different model)
|
||||
OTEL_SERVICE_NAME=mcp-staging
|
||||
OLLAMA_EMBEDDING_MODEL=all-minilm
|
||||
# → Collection: "mcp-staging-all-minilm"
|
||||
```
|
||||
|
||||
Each deployment gets its own collection - no naming collisions or dimension conflicts.
|
||||
|
||||
## How It Works: Semantic Search
|
||||
|
||||
Semantic search converts user queries into vectors and finds similar documents using cosine similarity.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant User
|
||||
participant MCP as MCP Server
|
||||
participant Ollama
|
||||
participant Qdrant
|
||||
participant NC as Nextcloud API
|
||||
|
||||
User->>MCP: nc_semantic_search("machine learning")
|
||||
MCP->>MCP: Check OAuth scope<br/>(semantic:read)
|
||||
MCP->>Ollama: Generate query embedding
|
||||
Ollama-->>MCP: Query vector (768-dim)
|
||||
MCP->>Qdrant: Search similar vectors<br/>(filter: user_id=alice)
|
||||
Qdrant-->>MCP: Top K results<br/>(with similarity scores)
|
||||
|
||||
loop For each result
|
||||
MCP->>NC: Verify access<br/>(fetch note by ID)
|
||||
alt Access granted
|
||||
NC-->>MCP: Note metadata
|
||||
else Access denied (404/401)
|
||||
MCP->>MCP: Filter out result
|
||||
end
|
||||
end
|
||||
|
||||
MCP-->>User: Search results<br/>(with scores, excerpts)
|
||||
```
|
||||
|
||||
### Dual-Phase Authorization
|
||||
|
||||
**Phase 1: OAuth Scope Check**
|
||||
- Verify user has `semantic:read` scope
|
||||
- Rejects unauthorized users immediately
|
||||
|
||||
**Phase 2: Per-Document Verification**
|
||||
- For each search result, fetch document via app API (Notes, Calendar, etc.)
|
||||
- If fetch succeeds (200 OK), user has access
|
||||
- If fetch fails (404 Not Found, 401 Unauthorized), filter out result
|
||||
- **Security**: Prevents information leakage from vector search alone
|
||||
|
||||
**Rationale:**
|
||||
- Vector database doesn't know about sharing, permissions changes, or deleted documents
|
||||
- App APIs are source of truth for access control
|
||||
- Verification ensures users only see documents they can access
|
||||
|
||||
### Search Flow
|
||||
|
||||
1. **Query Embedding**: Convert user query to 768-dimensional vector via Ollama
|
||||
2. **Vector Search**: Find top K similar vectors in Qdrant (cosine similarity)
|
||||
3. **User Filtering**: Qdrant pre-filters by `user_id` (multi-tenancy)
|
||||
4. **Access Verification**: Fetch each document via app API to verify current access
|
||||
5. **Result Ranking**: Return results sorted by similarity score
|
||||
6. **Response**: Include document excerpts, metadata, and similarity scores
|
||||
|
||||
### Performance
|
||||
|
||||
- **Query latency**: 50-200ms typical (embedding + vector search + verification)
|
||||
- **Accuracy**: Depends on embedding model quality (`nomic-embed-text` recommended)
|
||||
- **Scalability**: Qdrant handles millions of vectors efficiently
|
||||
|
||||
## How It Works: RAG with MCP Sampling (Optional)
|
||||
|
||||
The `nc_semantic_search_answer` tool generates AI-powered answers with citations using **MCP sampling** - requesting the MCP client's LLM to generate text.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant User
|
||||
participant MCP as MCP Server
|
||||
participant Client as MCP Client<br/>(Claude Desktop)
|
||||
participant LLM as Client's LLM<br/>(Claude, GPT, etc.)
|
||||
|
||||
User->>MCP: nc_semantic_search_answer("What are my Q1 goals?")
|
||||
MCP->>MCP: Semantic search<br/>(find relevant notes)
|
||||
MCP->>MCP: Construct prompt<br/>(query + documents + instructions)
|
||||
MCP->>Client: Sampling request<br/>(MCP Protocol)
|
||||
Client->>User: Prompt for approval<br/>(optional, client-controlled)
|
||||
User-->>Client: Approve
|
||||
Client->>LLM: Generate answer<br/>(with context)
|
||||
LLM-->>Client: Answer with citations
|
||||
Client-->>MCP: Sampling response
|
||||
MCP-->>User: Generated answer<br/>(with source documents)
|
||||
```
|
||||
|
||||
### MCP Sampling Architecture
|
||||
|
||||
**Why MCP Sampling?**
|
||||
- **No server-side LLM**: MCP server has no API keys, doesn't call LLMs directly
|
||||
- **Client controls everything**: Which model, who pays, user approval prompts
|
||||
- **Privacy**: Documents stay with the client's LLM provider, not a third-party
|
||||
- **Flexibility**: Works with any MCP client that supports sampling (Claude Desktop, future clients)
|
||||
|
||||
**Prompt Construction:**
|
||||
```
|
||||
User Query: {query}
|
||||
|
||||
Relevant Documents:
|
||||
1. Document: {title} (Note)
|
||||
Content: {excerpt}
|
||||
|
||||
2. Document: {title} (Note)
|
||||
Content: {excerpt}
|
||||
|
||||
Instructions:
|
||||
- Provide a comprehensive answer to the user's query
|
||||
- Use the documents above as context
|
||||
- Include citations: "According to Document 1 (title)..."
|
||||
- If documents don't contain enough information, say so
|
||||
```
|
||||
|
||||
**Graceful Fallback:**
|
||||
```python
|
||||
try:
|
||||
result = await ctx.session.create_message(...)
|
||||
return answer_with_citations
|
||||
except Exception as e:
|
||||
# Fallback: Return documents without generated answer
|
||||
return SearchResponse(
|
||||
generated_answer=f"[Sampling unavailable: {e}]",
|
||||
sources=search_results
|
||||
)
|
||||
```
|
||||
|
||||
**Client Support:**
|
||||
- **Requires**: MCP client with sampling capability
|
||||
- **Known support**: Claude Desktop (as of Claude 3.5+)
|
||||
- **Graceful degradation**: Returns raw documents if sampling unavailable
|
||||
|
||||
## Authentication & Security
|
||||
|
||||
### OAuth Scopes
|
||||
|
||||
**`semantic:read`** - Search permission
|
||||
- Allows using `nc_semantic_search` and `nc_semantic_search_answer` tools
|
||||
- Does NOT grant access to documents (verified via app APIs)
|
||||
- Required for any semantic search operation
|
||||
|
||||
**`semantic:write`** - Sync control permission
|
||||
- Allows enabling/disabling background sync (`provision_vector_sync`, `deprovision_vector_sync`)
|
||||
- Controls whether user's documents are indexed
|
||||
- Currently not implemented in OAuth mode (BasicAuth only)
|
||||
|
||||
### Dual-Phase Authorization Pattern
|
||||
|
||||
**Phase 1: Scope Check** (semantic:read)
|
||||
- Verifies user authorized to search
|
||||
- Prevents unauthorized vector database access
|
||||
|
||||
**Phase 2: Document Verification** (app-specific APIs)
|
||||
- For each search result, fetch via Notes API, CalDAV, etc.
|
||||
- If user can fetch → include in results
|
||||
- If user cannot fetch (404/401) → filter out
|
||||
- **Security**: Vector search cannot leak documents user shouldn't see
|
||||
|
||||
**Example Scenario:**
|
||||
1. Alice creates note "Secret Project X"
|
||||
2. Background sync indexes note with `user_id=alice`
|
||||
3. Bob searches for "project"
|
||||
4. Vector search finds "Secret Project X" (vector similarity)
|
||||
5. Qdrant filters by `user_id=bob` → no match (Alice's note excluded)
|
||||
6. Even if Bob somehow got the doc_id, Phase 2 verification would fail (404 Not Found)
|
||||
|
||||
### Offline Access for Background Sync
|
||||
|
||||
**Why needed:**
|
||||
- Background scanner runs hourly without user interaction
|
||||
- Requires valid access tokens to fetch documents from Nextcloud APIs
|
||||
- User's session token expires after hours/days
|
||||
|
||||
**OAuth Mode (ADR-004 Flow 2):**
|
||||
- User explicitly provisions offline access via `provision_nextcloud_access` tool
|
||||
- Server requests `offline_access` scope → receives refresh token
|
||||
- Refresh token stored securely (database, encrypted)
|
||||
- Background sync uses refresh tokens to obtain access tokens
|
||||
|
||||
**BasicAuth Mode:**
|
||||
- Username/password stored in environment variables
|
||||
- Always available for background operations
|
||||
- Simpler but less secure (credentials never expire)
|
||||
|
||||
## Deployment Modes
|
||||
|
||||
### Authentication Modes
|
||||
|
||||
| Mode | Security | Offline Access | Background Sync | Best For |
|
||||
|------|----------|----------------|-----------------|----------|
|
||||
| **BasicAuth** | Lower (credentials in env) | Always available | ✅ Works immediately | Single-user, development, testing |
|
||||
| **OAuth** | Higher (tokens, scopes) | User must provision | ⚠️ Not yet implemented | Multi-user, production |
|
||||
|
||||
**BasicAuth:**
|
||||
- Set `NEXTCLOUD_USERNAME` and `NEXTCLOUD_PASSWORD`
|
||||
- Background sync works immediately when `VECTOR_SYNC_ENABLED=true`
|
||||
- Credentials stored in `.env` file (secure server access required)
|
||||
|
||||
**OAuth:**
|
||||
- Client authenticates with `semantic:read` scope
|
||||
- User must explicitly provision offline access (future: `provision_vector_sync` tool)
|
||||
- Background sync only works for users who provisioned access
|
||||
- More secure: tokens expire, user controls access
|
||||
|
||||
### Qdrant Deployment Modes
|
||||
|
||||
| Mode | Configuration | Persistence | Scalability | Best For |
|
||||
|------|---------------|-------------|-------------|----------|
|
||||
| **In-Memory** (default) | `QDRANT_LOCATION=:memory:` | ❌ Lost on restart | Single instance | Testing, development |
|
||||
| **Persistent Local** | `QDRANT_LOCATION=/data/qdrant` | ✅ Survives restarts | Single instance | Small deployments |
|
||||
| **Network** | `QDRANT_URL=http://qdrant:6333` | ✅ Dedicated service | ✅ Horizontal scaling | Production |
|
||||
|
||||
**In-Memory Mode:**
|
||||
```bash
|
||||
VECTOR_SYNC_ENABLED=true
|
||||
# QDRANT_LOCATION not set → defaults to :memory:
|
||||
```
|
||||
- Fastest startup
|
||||
- No disk I/O
|
||||
- **Warning**: All vectors lost when server restarts (must re-index)
|
||||
|
||||
**Persistent Local Mode:**
|
||||
```bash
|
||||
VECTOR_SYNC_ENABLED=true
|
||||
QDRANT_LOCATION=/var/lib/qdrant
|
||||
```
|
||||
- Vectors survive restarts
|
||||
- Single server only (no distributed setup)
|
||||
- Disk I/O for durability
|
||||
|
||||
**Network Mode (Recommended for Production):**
|
||||
```bash
|
||||
VECTOR_SYNC_ENABLED=true
|
||||
QDRANT_URL=http://qdrant:6333
|
||||
QDRANT_API_KEY=secret # optional
|
||||
```
|
||||
- Dedicated Qdrant service (Docker, Kubernetes)
|
||||
- Horizontal scaling (multiple MCP servers → one Qdrant)
|
||||
- High availability options
|
||||
|
||||
### Embedding Service Options
|
||||
|
||||
| Service | Configuration | Cost | Performance | Best For |
|
||||
|---------|---------------|------|-------------|----------|
|
||||
| **Ollama** (recommended) | `OLLAMA_BASE_URL=http://ollama:11434` | Free (self-hosted) | Fast (local GPU) | Production, development |
|
||||
| **OpenAI** (future) | `OPENAI_API_KEY=sk-...` | Paid (API) | Fast (cloud) | Cloud deployments |
|
||||
| **Fallback** | No config | Free | Slow (random) | Testing only (not production) |
|
||||
|
||||
**Ollama Setup (Recommended):**
|
||||
```bash
|
||||
# docker-compose.yml
|
||||
services:
|
||||
ollama:
|
||||
image: ollama/ollama
|
||||
volumes:
|
||||
- ollama-data:/root/.ollama
|
||||
ports:
|
||||
- "11434:11434"
|
||||
|
||||
# Pull embedding model
|
||||
docker compose exec ollama ollama pull nomic-embed-text
|
||||
```
|
||||
|
||||
**Environment Configuration:**
|
||||
```bash
|
||||
OLLAMA_BASE_URL=http://ollama:11434
|
||||
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # 768-dimensional vectors
|
||||
```
|
||||
|
||||
**Model Options:**
|
||||
- `nomic-embed-text` (default): 768-dim, optimized for semantic search
|
||||
- `all-minilm`: Smaller, faster, slightly less accurate
|
||||
- `mxbai-embed-large`: Larger, more accurate, slower
|
||||
|
||||
## Configuration Overview
|
||||
|
||||
### Key Environment Variables
|
||||
|
||||
**Enable Semantic Search:**
|
||||
```bash
|
||||
VECTOR_SYNC_ENABLED=true # Default: false (opt-in)
|
||||
```
|
||||
|
||||
**Qdrant Vector Database:**
|
||||
```bash
|
||||
# In-memory mode (default if VECTOR_SYNC_ENABLED=true)
|
||||
# QDRANT_LOCATION not set → uses :memory:
|
||||
|
||||
# Persistent local mode
|
||||
QDRANT_LOCATION=/var/lib/qdrant
|
||||
|
||||
# Network mode (production)
|
||||
QDRANT_URL=http://qdrant:6333
|
||||
QDRANT_API_KEY=secret # optional
|
||||
```
|
||||
|
||||
**Ollama Embedding Service:**
|
||||
```bash
|
||||
OLLAMA_BASE_URL=http://ollama:11434
|
||||
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Default
|
||||
```
|
||||
|
||||
**Scanner Configuration:**
|
||||
```bash
|
||||
VECTOR_SYNC_INTERVAL=3600 # Scan interval in seconds (default: 1 hour)
|
||||
```
|
||||
|
||||
### Resource Requirements
|
||||
|
||||
**Qdrant:**
|
||||
- **Memory**: ~100-200 MB base + ~1 KB per vector (1M vectors ≈ 1 GB)
|
||||
- **Disk**: Persistent mode only, ~200 bytes per vector
|
||||
- **CPU**: Low (indexing) to moderate (search)
|
||||
|
||||
**Ollama:**
|
||||
- **Memory**: 2-4 GB for `nomic-embed-text` model
|
||||
- **CPU**: High during embedding generation, idle otherwise
|
||||
- **GPU**: Optional but recommended (10-100x faster)
|
||||
|
||||
**MCP Server:**
|
||||
- **Memory**: +50-100 MB for background sync workers
|
||||
- **CPU**: Moderate during scanning/processing, low otherwise
|
||||
|
||||
### Trade-offs
|
||||
|
||||
| Consideration | In-Memory Qdrant | Persistent Qdrant | Network Qdrant |
|
||||
|---------------|------------------|-------------------|----------------|
|
||||
| Setup complexity | ✅ Minimal | ✅ Easy | ⚠️ Requires separate service |
|
||||
| Durability | ❌ Lost on restart | ✅ Survives restarts | ✅ Survives restarts |
|
||||
| Scalability | ❌ Single instance | ❌ Single instance | ✅ Horizontal scaling |
|
||||
| Performance | ✅ Fastest | ✅ Fast | ⚠️ Network latency |
|
||||
|
||||
## Operational Behavior
|
||||
|
||||
### What Happens When VECTOR_SYNC_ENABLED=true
|
||||
|
||||
**Immediate (Server Startup):**
|
||||
1. MCP server connects to Qdrant (creates collection if needed)
|
||||
2. MCP server connects to Ollama (verifies embedding model available)
|
||||
3. Background scanner starts (schedules hourly runs)
|
||||
4. Document queue and processors initialize
|
||||
|
||||
**First Scan (Within 1 hour):**
|
||||
1. Scanner fetches all notes from Nextcloud
|
||||
2. Compares with Qdrant (likely empty on first run)
|
||||
3. Enqueues all notes for indexing
|
||||
4. Processors generate embeddings (may take minutes for large note collections)
|
||||
5. Vectors stored in Qdrant with user_id filtering
|
||||
|
||||
**Hourly Thereafter:**
|
||||
1. Scanner fetches all notes
|
||||
2. Identifies new/modified/deleted notes (ETag comparison)
|
||||
3. Enqueues changes only
|
||||
4. Incremental updates processed
|
||||
|
||||
### Performance Expectations
|
||||
|
||||
**Embedding Generation:**
|
||||
- **Without GPU**: 1-5 notes/second (CPU-bound)
|
||||
- **With GPU**: 50-200 notes/second (highly parallel)
|
||||
- **Initial indexing**: 100 notes ≈ 20-100 seconds (CPU), 1-2 seconds (GPU)
|
||||
|
||||
**Search Query:**
|
||||
- **Embedding generation**: 50-100ms
|
||||
- **Vector search**: 10-50ms (depends on collection size)
|
||||
- **Access verification**: 20-100ms per document (Nextcloud API calls)
|
||||
- **Total latency**: 100-300ms typical
|
||||
|
||||
**Resource Usage:**
|
||||
- **Idle**: Minimal (background scanner sleeps)
|
||||
- **Scanning**: Moderate CPU (ETag checks, API calls)
|
||||
- **Processing**: High CPU/GPU (embedding generation)
|
||||
- **Searching**: Low to moderate (depends on query frequency)
|
||||
|
||||
### Background Sync Behavior
|
||||
|
||||
**Scanner Triggers:**
|
||||
- Hourly (configurable via `VECTOR_SYNC_INTERVAL`)
|
||||
- Manual trigger via `nc_trigger_vector_sync` (future)
|
||||
|
||||
**Queue Processing:**
|
||||
- Continuous (workers always running)
|
||||
- Batch processing (fetch 10 documents at a time)
|
||||
- Concurrent workers (4 by default)
|
||||
|
||||
**Error Handling:**
|
||||
- Individual document failures logged but don't stop scanning
|
||||
- Retries for transient errors (network timeouts, rate limits)
|
||||
- Failed documents skipped, re-attempted on next scan
|
||||
|
||||
**What Gets Indexed:**
|
||||
- **Notes**: All notes accessible to the authenticated user
|
||||
- **Future**: Calendar events, tasks, deck cards, files with text extraction, contacts
|
||||
|
||||
## Monitoring & Observability
|
||||
|
||||
### MCP Tools
|
||||
|
||||
**`nc_get_vector_sync_status`** - Check sync status
|
||||
```python
|
||||
{
|
||||
"total_documents": 1234,
|
||||
"indexed_documents": 1200,
|
||||
"pending_documents": 34,
|
||||
"sync_enabled": true,
|
||||
"last_scan": "2025-01-15T14:30:00Z",
|
||||
"status": "syncing" # idle | syncing | error
|
||||
}
|
||||
```
|
||||
|
||||
**Interpreting Status:**
|
||||
- `idle`: No pending work, last scan completed successfully
|
||||
- `syncing`: Currently processing documents
|
||||
- `error`: Last scan failed (check logs)
|
||||
|
||||
### Logs to Check
|
||||
|
||||
**Scanner Logs:**
|
||||
```
|
||||
[INFO] Vector sync scanner started (interval: 3600s)
|
||||
[INFO] Scanning notes: found 150 documents
|
||||
[INFO] Changes detected: 5 new, 2 modified, 1 deleted
|
||||
[INFO] Enqueued 7 documents for processing
|
||||
```
|
||||
|
||||
**Processor Logs:**
|
||||
```
|
||||
[INFO] Processing document: note_123
|
||||
[DEBUG] Generated embedding (768 dimensions)
|
||||
[INFO] Stored vector in Qdrant: note_123
|
||||
```
|
||||
|
||||
**Error Logs:**
|
||||
```
|
||||
[ERROR] Failed to generate embedding for note_123: Connection timeout
|
||||
[WARN] Qdrant connection lost, retrying...
|
||||
[ERROR] Ollama embedding failed: Model not found
|
||||
```
|
||||
|
||||
**Log Locations:**
|
||||
- **Docker**: `docker compose logs mcp`
|
||||
- **Local**: stdout (redirect to file if needed)
|
||||
- **Kubernetes**: `kubectl logs -f deployment/nextcloud-mcp-server`
|
||||
|
||||
### Metrics to Monitor
|
||||
|
||||
**Indexing Progress:**
|
||||
- Total documents vs indexed documents
|
||||
- Pending queue size
|
||||
- Processing rate (docs/second)
|
||||
|
||||
**Search Performance:**
|
||||
- Query latency (p50, p95, p99)
|
||||
- Results per query
|
||||
- Verification overhead (API calls per query)
|
||||
|
||||
**Resource Usage:**
|
||||
- Qdrant memory/disk usage
|
||||
- Ollama CPU/GPU usage
|
||||
- MCP server memory
|
||||
|
||||
For detailed observability setup, see [docs/observability.md](observability.md).
|
||||
|
||||
## Troubleshooting from Architecture Perspective
|
||||
|
||||
### Documents Not Appearing in Search
|
||||
|
||||
**Diagnosis Flow:**
|
||||
1. Check sync status: `nc_get_vector_sync_status`
|
||||
- `sync_enabled: false` → Enable with `VECTOR_SYNC_ENABLED=true`
|
||||
- `status: error` → Check scanner logs for failures
|
||||
2. Check queue size:
|
||||
- `pending_documents > 0` → Processing in progress, wait
|
||||
- `pending_documents == 0` but `indexed_documents` low → Scan hasn't run yet (wait up to 1 hour)
|
||||
3. Check Qdrant:
|
||||
- Connection errors in logs → Verify `QDRANT_URL` or `QDRANT_LOCATION`
|
||||
- Collection empty → First scan hasn't completed
|
||||
4. Check Ollama:
|
||||
- Embedding errors in logs → Verify `OLLAMA_BASE_URL`
|
||||
- Model not found → Pull model: `ollama pull nomic-embed-text`
|
||||
|
||||
**Common Causes:**
|
||||
- Sync disabled (default): Enable `VECTOR_SYNC_ENABLED=true`
|
||||
- Ollama not running: Start Ollama service
|
||||
- Qdrant not accessible: Check network/URL
|
||||
- First scan in progress: Wait up to 1 hour + processing time
|
||||
|
||||
### Slow Search Performance
|
||||
|
||||
**Diagnosis:**
|
||||
1. **Query embedding slow (>500ms)**:
|
||||
- Ollama overloaded or CPU-bound
|
||||
- Solution: Use GPU, upgrade CPU, or reduce concurrent requests
|
||||
2. **Vector search slow (>200ms)**:
|
||||
- Large collection (millions of vectors)
|
||||
- Solution: Use network Qdrant with SSDs, add indexing
|
||||
3. **Verification slow (>500ms)**:
|
||||
- Many results to verify (10+ documents)
|
||||
- Nextcloud API slow or overloaded
|
||||
- Solution: Reduce `limit` parameter, optimize Nextcloud
|
||||
|
||||
**Performance Tuning:**
|
||||
- Reduce search `limit` (default: 10 results)
|
||||
- Use network Qdrant for large collections
|
||||
- Enable Ollama GPU acceleration
|
||||
- Check Nextcloud API response times
|
||||
|
||||
### Background Sync Stopped
|
||||
|
||||
**Diagnosis:**
|
||||
1. Check logs for errors:
|
||||
- Authentication failures (401/403) → Token expired (OAuth) or credentials invalid (BasicAuth)
|
||||
- Connection timeouts → Network issues with Nextcloud/Qdrant/Ollama
|
||||
- Rate limiting (429) → Reduce scan frequency
|
||||
2. Check `nc_get_vector_sync_status`:
|
||||
- `status: error` → See logs for details
|
||||
- `last_scan` timestamp old (>2 hours) → Scanner may have crashed
|
||||
3. Verify services:
|
||||
- Qdrant accessible: `curl http://qdrant:6333/`
|
||||
- Ollama accessible: `curl http://ollama:11434/api/tags`
|
||||
- Nextcloud accessible: Check API health
|
||||
|
||||
**OAuth Mode (Future):**
|
||||
- Offline access token expired → Re-provision via `provision_vector_sync`
|
||||
- User deprovisioned access → Sync stops intentionally
|
||||
|
||||
### Out of Memory
|
||||
|
||||
**Diagnosis:**
|
||||
1. Check Qdrant mode:
|
||||
- In-memory mode with large collection → Switch to persistent or network mode
|
||||
2. Check embedding batch size:
|
||||
- Too many documents processed simultaneously → Reduce worker count
|
||||
3. Check Ollama memory:
|
||||
- Large models loaded → Use smaller embedding model
|
||||
|
||||
**Solutions:**
|
||||
- Use persistent or network Qdrant (frees server memory)
|
||||
- Reduce concurrent processor workers
|
||||
- Use smaller embedding model (`all-minilm` instead of `nomic-embed-text`)
|
||||
- Increase server memory allocation
|
||||
|
||||
## Limitations & Future Work
|
||||
|
||||
### Current Limitations
|
||||
|
||||
1. **Notes App Only**
|
||||
- Architecture supports multiple apps (plugin system ready)
|
||||
- Only `NotesScanner` and `NotesProcessor` implemented
|
||||
- Future: Calendar, Deck, Files, Contacts
|
||||
|
||||
2. **MCP Sampling Support**
|
||||
- `nc_semantic_search_answer` requires client sampling capability
|
||||
- Not all MCP clients support sampling yet
|
||||
- Graceful fallback: Returns documents without generated answer
|
||||
|
||||
3. **OAuth Background Sync**
|
||||
- User-controlled background jobs not yet implemented
|
||||
- Currently works in BasicAuth mode only
|
||||
- Future: Users opt-in via `provision_vector_sync` tool
|
||||
|
||||
4. **No Incremental Updates**
|
||||
- Document changes trigger full re-embedding
|
||||
- Cannot update just modified paragraphs
|
||||
- Future: Paragraph-level chunking and incremental updates
|
||||
|
||||
5. **No Query Caching**
|
||||
- Each search generates new query embedding
|
||||
- Repeated queries re-search Qdrant
|
||||
- Future: Cache recent query embeddings and results
|
||||
|
||||
6. **Single Embedding Model**
|
||||
- Uses one model for all documents and queries
|
||||
- Cannot customize per app or user
|
||||
- Future: App-specific or user-selected models
|
||||
|
||||
### Future Enhancements
|
||||
|
||||
**Multi-App Support** (In Progress):
|
||||
- Scanner plugins for Calendar, Deck, Files, Contacts
|
||||
- Unified vector search across all apps
|
||||
- App-specific metadata in vector payloads
|
||||
|
||||
**User-Controlled Sync (OAuth Mode)**:
|
||||
- `provision_vector_sync` and `deprovision_vector_sync` tools
|
||||
- Per-user background job scheduling
|
||||
- User dashboard for sync status and controls
|
||||
|
||||
**Advanced Search Features**:
|
||||
- Hybrid search (vector + keyword combined)
|
||||
- Filtering by date range, app type, tags
|
||||
- Aggregations and faceted search
|
||||
- Search result explanations (why this matched)
|
||||
|
||||
**Performance Optimizations**:
|
||||
- Query caching for repeated searches
|
||||
- Incremental document updates (paragraph-level)
|
||||
- Batch query processing
|
||||
- Qdrant HNSW indexing tuning
|
||||
|
||||
**Embedding Improvements**:
|
||||
- Support for OpenAI embeddings (ada-002, text-embedding-3)
|
||||
- Multi-language embedding models
|
||||
- Fine-tuned models for Nextcloud content
|
||||
- Paragraph-level chunking for long documents
|
||||
|
||||
## References
|
||||
|
||||
### Architecture Decision Records (ADRs)
|
||||
|
||||
- **[ADR-003: Vector Database Semantic Search](ADR-003-vector-database-semantic-search.md)** - Qdrant selection rationale, embedding strategy, hybrid search (superseded by ADR-007 but technical decisions remain valid)
|
||||
- **[ADR-007: Background Vector Sync Job Management](ADR-007-background-vector-sync-job-management.md)** - Current implementation, Scanner-Queue-Processor architecture, plugin system
|
||||
- **[ADR-008: MCP Sampling for Semantic Search](ADR-008-mcp-sampling-for-semantic-search.md)** - RAG with MCP sampling, client-server separation, prompt construction
|
||||
- **[ADR-009: Semantic Search OAuth Scope](ADR-009-semantic-search-oauth-scope.md)** - OAuth scope model, dual-phase authorization, security rationale
|
||||
|
||||
### Configuration & Setup
|
||||
|
||||
- **[Configuration Guide](configuration.md)** - Environment variables, Qdrant setup, Ollama setup, detailed configuration options
|
||||
- **[Installation Guide](installation.md)** - Deployment options (Docker, Kubernetes, local)
|
||||
- **[Running the Server](running.md)** - Starting the server, transport options, testing
|
||||
|
||||
### Monitoring & Troubleshooting
|
||||
|
||||
- **[Observability Guide](observability.md)** - Logging, metrics, tracing, debugging
|
||||
- **[Troubleshooting](troubleshooting.md)** - General issues and solutions
|
||||
|
||||
### Related Documentation
|
||||
|
||||
- **[OAuth Architecture](oauth-architecture.md)** - OAuth flows, scopes, token management
|
||||
- **[Comparison with Context Agent](comparison-context-agent.md)** - When to use Nextcloud MCP Server vs Context Agent
|
||||
|
||||
---
|
||||
|
||||
**Questions or Issues?**
|
||||
- [Open an issue](https://github.com/cbcoutinho/nextcloud-mcp-server/issues)
|
||||
- [Contribute improvements](https://github.com/cbcoutinho/nextcloud-mcp-server/pulls)
|
||||
@@ -43,14 +43,17 @@ async def _get_processing_status(request: Request) -> dict[str, Any] | None:
|
||||
return None
|
||||
|
||||
try:
|
||||
# Get document queue from app state
|
||||
document_queue = getattr(request.app.state, "document_queue", None)
|
||||
if document_queue is None:
|
||||
logger.debug("document_queue not available in app state")
|
||||
# Get document receive stream from app state
|
||||
document_receive_stream = getattr(
|
||||
request.app.state, "document_receive_stream", None
|
||||
)
|
||||
if document_receive_stream is None:
|
||||
logger.debug("document_receive_stream not available in app state")
|
||||
return None
|
||||
|
||||
# Get pending count from queue
|
||||
pending_count = document_queue.qsize()
|
||||
# Get pending count from stream statistics
|
||||
stats = document_receive_stream.statistics()
|
||||
pending_count = stats.current_buffer_used
|
||||
|
||||
# Get Qdrant client and query indexed count
|
||||
indexed_count = 0
|
||||
@@ -63,7 +66,7 @@ async def _get_processing_status(request: Request) -> dict[str, Any] | None:
|
||||
|
||||
# Count documents in collection
|
||||
count_result = await qdrant_client.count(
|
||||
collection_name=settings.qdrant_collection
|
||||
collection_name=settings.get_collection_name()
|
||||
)
|
||||
indexed_count = count_result.count
|
||||
|
||||
|
||||
@@ -209,6 +209,46 @@ class Settings:
|
||||
"API key is only relevant for network mode and will be ignored."
|
||||
)
|
||||
|
||||
def get_collection_name(self) -> str:
|
||||
"""
|
||||
Get Qdrant collection name.
|
||||
|
||||
Auto-generates from deployment ID + model name unless explicitly set.
|
||||
Deployment ID uses OTEL_SERVICE_NAME if configured, otherwise hostname.
|
||||
|
||||
This enables:
|
||||
- Safe embedding model switching (new model → new collection)
|
||||
- Multi-server deployments (unique deployment IDs)
|
||||
- Clear collection naming (shows deployment and model)
|
||||
|
||||
Format: {deployment-id}-{model-name}
|
||||
|
||||
Examples:
|
||||
- "my-deployment-nomic-embed-text" (OTEL_SERVICE_NAME set)
|
||||
- "mcp-container-all-minilm" (hostname fallback)
|
||||
|
||||
Returns:
|
||||
Collection name string
|
||||
"""
|
||||
import socket
|
||||
|
||||
# Use explicit override if user configured non-default value
|
||||
if self.qdrant_collection != "nextcloud_content":
|
||||
return self.qdrant_collection
|
||||
|
||||
# Determine deployment ID (OTEL service name or hostname fallback)
|
||||
if self.otel_service_name != "nextcloud-mcp-server": # Non-default
|
||||
deployment_id = self.otel_service_name
|
||||
else:
|
||||
# Fallback to hostname for simple Docker deployments without OTEL config
|
||||
deployment_id = socket.gethostname()
|
||||
|
||||
# Sanitize deployment ID and model name
|
||||
deployment_id = deployment_id.lower().replace(" ", "-").replace("_", "-")
|
||||
model_name = self.ollama_embedding_model.replace("/", "-").replace(":", "-")
|
||||
|
||||
return f"{deployment_id}-{model_name}"
|
||||
|
||||
|
||||
def get_settings() -> Settings:
|
||||
"""Get application settings from environment variables.
|
||||
|
||||
@@ -68,17 +68,25 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
client = await get_client(ctx)
|
||||
username = client.username
|
||||
|
||||
logger.info(
|
||||
f"Semantic search: query='{query}', user={username}, "
|
||||
f"limit={limit}, score_threshold={score_threshold}"
|
||||
)
|
||||
|
||||
try:
|
||||
# Generate embedding for query
|
||||
embedding_service = get_embedding_service()
|
||||
query_embedding = await embedding_service.embed(query)
|
||||
logger.debug(
|
||||
f"Generated embedding for query (dimension={len(query_embedding)})"
|
||||
)
|
||||
|
||||
# Search Qdrant with user filtering
|
||||
# Note: Currently only searching notes (doc_type="note")
|
||||
# Future: Remove doc_type filter to search all apps
|
||||
qdrant_client = await get_qdrant_client()
|
||||
search_response = await qdrant_client.query_points(
|
||||
collection_name=settings.qdrant_collection,
|
||||
collection_name=settings.get_collection_name(),
|
||||
query=query_embedding,
|
||||
query_filter=Filter(
|
||||
must=[
|
||||
@@ -98,6 +106,15 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
with_vectors=False, # Don't return vectors to save bandwidth
|
||||
)
|
||||
|
||||
logger.info(
|
||||
f"Qdrant returned {len(search_response.points)} results "
|
||||
f"(before deduplication and access verification)"
|
||||
)
|
||||
if search_response.points:
|
||||
# Log top 3 scores to help with threshold tuning
|
||||
top_scores = [p.score for p in search_response.points[:3]]
|
||||
logger.debug(f"Top 3 similarity scores: {top_scores}")
|
||||
|
||||
# Deduplicate by document ID (multiple chunks per document)
|
||||
seen_doc_ids = set()
|
||||
results = []
|
||||
@@ -137,9 +154,14 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
except HTTPStatusError as e:
|
||||
if e.response.status_code == 403:
|
||||
# User lost access, skip this document
|
||||
logger.debug(f"Skipping note {doc_id}: access denied (403)")
|
||||
continue
|
||||
elif e.response.status_code == 404:
|
||||
# Document was deleted but not yet removed from vector DB
|
||||
logger.debug(
|
||||
f"Skipping note {doc_id}: not found (404), "
|
||||
f"likely deleted after indexing"
|
||||
)
|
||||
continue
|
||||
else:
|
||||
# Log other errors but continue processing
|
||||
@@ -148,6 +170,16 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
)
|
||||
continue
|
||||
|
||||
logger.info(
|
||||
f"Returning {len(results)} results after deduplication and access verification"
|
||||
)
|
||||
if results:
|
||||
result_details = [
|
||||
f"note_{r.id} (score={r.score:.3f}, title='{r.title}')"
|
||||
for r in results[:5] # Show top 5
|
||||
]
|
||||
logger.debug(f"Top results: {', '.join(result_details)}")
|
||||
|
||||
return SemanticSearchResponse(
|
||||
results=results,
|
||||
query=query,
|
||||
@@ -259,7 +291,47 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
success=True,
|
||||
)
|
||||
|
||||
# 3. Construct context from retrieved documents
|
||||
# 3. Check if client supports sampling
|
||||
from mcp.types import ClientCapabilities, SamplingCapability
|
||||
|
||||
client_has_sampling = ctx.session.check_client_capability(
|
||||
ClientCapabilities(sampling=SamplingCapability())
|
||||
)
|
||||
|
||||
# Log capability check result for debugging
|
||||
logger.info(
|
||||
f"Sampling capability check: client_has_sampling={client_has_sampling}, "
|
||||
f"query='{query}'"
|
||||
)
|
||||
if hasattr(ctx.session, "_client_params") and ctx.session._client_params:
|
||||
client_caps = ctx.session._client_params.capabilities
|
||||
logger.debug(
|
||||
f"Client advertised capabilities: "
|
||||
f"roots={client_caps.roots is not None}, "
|
||||
f"sampling={client_caps.sampling is not None}, "
|
||||
f"experimental={client_caps.experimental is not None}"
|
||||
)
|
||||
|
||||
if not client_has_sampling:
|
||||
logger.info(
|
||||
f"Client does not support sampling (query: '{query}'), "
|
||||
f"returning {len(search_response.results)} documents"
|
||||
)
|
||||
return SamplingSearchResponse(
|
||||
query=query,
|
||||
generated_answer=(
|
||||
f"[Sampling not supported by client]\n\n"
|
||||
f"Your MCP client doesn't support answer generation. "
|
||||
f"Found {search_response.total_found} relevant documents. "
|
||||
f"Please review the sources below."
|
||||
),
|
||||
sources=search_response.results,
|
||||
total_found=search_response.total_found,
|
||||
search_method="semantic_sampling_unsupported",
|
||||
success=True,
|
||||
)
|
||||
|
||||
# 4. Construct context from retrieved documents
|
||||
context_parts = []
|
||||
for idx, result in enumerate(search_response.results, 1):
|
||||
context_parts.append(
|
||||
@@ -273,7 +345,7 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
|
||||
context = "\n".join(context_parts)
|
||||
|
||||
# 4. Construct prompt - reuse user's query, add context and instructions
|
||||
# 5. Construct prompt - reuse user's query, add context and instructions
|
||||
prompt = (
|
||||
f"{query}\n\n"
|
||||
f"Here are relevant documents from Nextcloud (notes, calendar events, deck cards, files, contacts):\n\n"
|
||||
@@ -282,31 +354,35 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
f"Cite the document numbers when referencing specific information."
|
||||
)
|
||||
|
||||
logger.debug(
|
||||
f"Requesting sampling for query: {query} "
|
||||
f"({len(search_response.results)} documents retrieved)"
|
||||
logger.info(
|
||||
f"Initiating sampling request: query_length={len(query)}, "
|
||||
f"documents={len(search_response.results)}, "
|
||||
f"prompt_length={len(prompt)}, max_tokens={max_answer_tokens}"
|
||||
)
|
||||
|
||||
# 5. Request LLM completion via MCP sampling
|
||||
try:
|
||||
sampling_result = await ctx.session.create_message(
|
||||
messages=[
|
||||
SamplingMessage(
|
||||
role="user",
|
||||
content=TextContent(type="text", text=prompt),
|
||||
)
|
||||
],
|
||||
max_tokens=max_answer_tokens,
|
||||
temperature=0.7,
|
||||
model_preferences=ModelPreferences(
|
||||
hints=[ModelHint(name="claude-3-5-sonnet")],
|
||||
intelligencePriority=0.8,
|
||||
speedPriority=0.5,
|
||||
),
|
||||
include_context="thisServer",
|
||||
)
|
||||
# 6. Request LLM completion via MCP sampling with timeout
|
||||
import anyio
|
||||
|
||||
# 6. Extract answer from sampling response
|
||||
try:
|
||||
with anyio.fail_after(30):
|
||||
sampling_result = await ctx.session.create_message(
|
||||
messages=[
|
||||
SamplingMessage(
|
||||
role="user",
|
||||
content=TextContent(type="text", text=prompt),
|
||||
)
|
||||
],
|
||||
max_tokens=max_answer_tokens,
|
||||
temperature=0.7,
|
||||
model_preferences=ModelPreferences(
|
||||
hints=[ModelHint(name="claude-3-5-sonnet")],
|
||||
intelligencePriority=0.8,
|
||||
speedPriority=0.5,
|
||||
),
|
||||
include_context="thisServer",
|
||||
)
|
||||
|
||||
# 7. Extract answer from sampling response
|
||||
if sampling_result.content.type == "text":
|
||||
generated_answer = sampling_result.content.text
|
||||
else:
|
||||
@@ -318,7 +394,8 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
|
||||
logger.info(
|
||||
f"Sampling successful: model={sampling_result.model}, "
|
||||
f"stop_reason={sampling_result.stopReason}"
|
||||
f"stop_reason={sampling_result.stopReason}, "
|
||||
f"answer_length={len(generated_answer)}"
|
||||
)
|
||||
|
||||
return SamplingSearchResponse(
|
||||
@@ -332,23 +409,78 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
success=True,
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
# Fallback: Return documents without generated answer
|
||||
except TimeoutError:
|
||||
logger.warning(
|
||||
f"Sampling failed ({type(e).__name__}: {e}), "
|
||||
f"Sampling request timed out after 30 seconds for query: '{query}', "
|
||||
f"returning search results only"
|
||||
)
|
||||
return SamplingSearchResponse(
|
||||
query=query,
|
||||
generated_answer=(
|
||||
f"[Sampling request timed out]\n\n"
|
||||
f"The answer generation took too long (>30s). "
|
||||
f"Found {search_response.total_found} relevant documents. "
|
||||
f"Please review the sources below or try a simpler query."
|
||||
),
|
||||
sources=search_response.results,
|
||||
total_found=search_response.total_found,
|
||||
search_method="semantic_sampling_timeout",
|
||||
success=True,
|
||||
)
|
||||
|
||||
except McpError as e:
|
||||
# Expected MCP protocol errors (user rejection, unsupported, etc.)
|
||||
error_msg = str(e)
|
||||
|
||||
if "rejected" in error_msg.lower() or "denied" in error_msg.lower():
|
||||
# User explicitly declined - this is normal, not an error
|
||||
logger.info(f"User declined sampling request for query: '{query}'")
|
||||
search_method = "semantic_sampling_user_declined"
|
||||
user_message = "User declined to generate an answer"
|
||||
elif "not supported" in error_msg.lower():
|
||||
# Client doesn't support sampling - also normal
|
||||
logger.info(f"Sampling not supported by client for query: '{query}'")
|
||||
search_method = "semantic_sampling_unsupported"
|
||||
user_message = "Sampling not supported by this client"
|
||||
else:
|
||||
# Other MCP protocol errors
|
||||
logger.warning(
|
||||
f"MCP error during sampling for query '{query}': {error_msg}"
|
||||
)
|
||||
search_method = "semantic_sampling_mcp_error"
|
||||
user_message = f"Sampling unavailable: {error_msg}"
|
||||
|
||||
return SamplingSearchResponse(
|
||||
query=query,
|
||||
generated_answer=(
|
||||
f"[Sampling unavailable: {str(e)}]\n\n"
|
||||
f"[{user_message}]\n\n"
|
||||
f"Found {search_response.total_found} relevant documents. "
|
||||
f"Please review the sources below."
|
||||
),
|
||||
sources=search_response.results,
|
||||
total_found=search_response.total_found,
|
||||
search_method="semantic_sampling_fallback",
|
||||
search_method=search_method,
|
||||
success=True,
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
# Truly unexpected errors - these SHOULD have tracebacks
|
||||
logger.error(
|
||||
f"Unexpected error during sampling for query '{query}': "
|
||||
f"{type(e).__name__}: {e}",
|
||||
exc_info=True,
|
||||
)
|
||||
|
||||
return SamplingSearchResponse(
|
||||
query=query,
|
||||
generated_answer=(
|
||||
f"[Unexpected error during sampling]\n\n"
|
||||
f"Found {search_response.total_found} relevant documents. "
|
||||
f"Please review the sources below."
|
||||
),
|
||||
sources=search_response.results,
|
||||
total_found=search_response.total_found,
|
||||
search_method="semantic_sampling_error",
|
||||
success=True,
|
||||
)
|
||||
|
||||
@@ -413,7 +545,7 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
|
||||
# Count documents in collection
|
||||
count_result = await qdrant_client.count(
|
||||
collection_name=settings.qdrant_collection
|
||||
collection_name=settings.get_collection_name()
|
||||
)
|
||||
indexed_count = count_result.count
|
||||
|
||||
|
||||
@@ -100,7 +100,7 @@ async def process_document(doc_task: DocumentTask, nc_client: NextcloudClient):
|
||||
# Handle deletion
|
||||
if doc_task.operation == "delete":
|
||||
await qdrant_client.delete(
|
||||
collection_name=settings.qdrant_collection,
|
||||
collection_name=settings.get_collection_name(),
|
||||
points_selector=Filter(
|
||||
must=[
|
||||
FieldCondition(
|
||||
@@ -209,7 +209,7 @@ async def _index_document(
|
||||
|
||||
# Upsert to Qdrant
|
||||
await qdrant_client.upsert(
|
||||
collection_name=settings.qdrant_collection,
|
||||
collection_name=settings.get_collection_name(),
|
||||
points=points,
|
||||
wait=True,
|
||||
)
|
||||
|
||||
@@ -59,30 +59,57 @@ async def get_qdrant_client() -> AsyncQdrantClient:
|
||||
logger.warning("No Qdrant mode configured, defaulting to :memory:")
|
||||
_qdrant_client = AsyncQdrantClient(":memory:")
|
||||
|
||||
# Ensure collection exists
|
||||
collection_name = settings.qdrant_collection
|
||||
# Get collection name (auto-generated from deployment ID + model)
|
||||
collection_name = settings.get_collection_name()
|
||||
|
||||
# Import here to avoid circular dependency
|
||||
from nextcloud_mcp_server.embedding import get_embedding_service
|
||||
|
||||
embedding_service = get_embedding_service()
|
||||
dimension = embedding_service.get_dimension()
|
||||
expected_dimension = embedding_service.get_dimension()
|
||||
|
||||
try:
|
||||
await _qdrant_client.get_collection(collection_name)
|
||||
logger.info(f"Using existing Qdrant collection: {collection_name}")
|
||||
except Exception:
|
||||
# Get existing collection
|
||||
collection_info = await _qdrant_client.get_collection(collection_name)
|
||||
actual_dimension = collection_info.config.params.vectors.size
|
||||
|
||||
# Validate dimension matches
|
||||
if actual_dimension != expected_dimension:
|
||||
raise ValueError(
|
||||
f"Dimension mismatch for collection '{collection_name}':\n"
|
||||
f" Expected: {expected_dimension} (from embedding model '{settings.ollama_embedding_model}')\n"
|
||||
f" Found: {actual_dimension}\n"
|
||||
f"This usually means you changed the embedding model.\n"
|
||||
f"Solutions:\n"
|
||||
f" 1. Delete the old collection: Collection will be recreated with new dimensions\n"
|
||||
f" 2. Set QDRANT_COLLECTION to use a different collection name\n"
|
||||
f" 3. Revert OLLAMA_EMBEDDING_MODEL to the original model"
|
||||
)
|
||||
|
||||
logger.info(
|
||||
f"Using existing Qdrant collection: {collection_name} "
|
||||
f"(dimension={actual_dimension}, model={settings.ollama_embedding_model})"
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
# Check if it's a dimension mismatch error (re-raise it)
|
||||
if isinstance(e, ValueError):
|
||||
raise
|
||||
|
||||
# Collection doesn't exist, create it
|
||||
await _qdrant_client.create_collection(
|
||||
collection_name=collection_name,
|
||||
vectors_config=VectorParams(
|
||||
size=dimension,
|
||||
size=expected_dimension,
|
||||
distance=Distance.COSINE,
|
||||
),
|
||||
)
|
||||
logger.info(
|
||||
f"Created Qdrant collection: {collection_name} "
|
||||
f"(dimension={dimension}, distance=COSINE)"
|
||||
f"Created Qdrant collection: {collection_name}\n"
|
||||
f" Dimension: {expected_dimension}\n"
|
||||
f" Model: {settings.ollama_embedding_model}\n"
|
||||
f" Distance: COSINE\n"
|
||||
f"Background sync will index all documents with this embedding model."
|
||||
)
|
||||
|
||||
return _qdrant_client
|
||||
|
||||
@@ -96,7 +96,7 @@ async def scan_user_documents(
|
||||
nc_client: Authenticated Nextcloud client
|
||||
initial_sync: If True, send all documents (first-time sync)
|
||||
"""
|
||||
logger.info(f"Scanning documents for user: {user_id}")
|
||||
logger.debug(f"Scanning documents for user: {user_id}")
|
||||
|
||||
# Fetch all notes from Nextcloud
|
||||
notes = [note async for note in nc_client.notes.get_all_notes()]
|
||||
@@ -127,7 +127,7 @@ async def scan_user_documents(
|
||||
# Get indexed state from Qdrant
|
||||
qdrant_client = await get_qdrant_client()
|
||||
scroll_result = await qdrant_client.scroll(
|
||||
collection_name=get_settings().qdrant_collection,
|
||||
collection_name=get_settings().get_collection_name(),
|
||||
scroll_filter=Filter(
|
||||
must=[
|
||||
FieldCondition(key="user_id", match=MatchValue(value=user_id)),
|
||||
|
||||
Reference in New Issue
Block a user