Files
nextcloud-mcp-server/README.md
T
Chris Coutinho e575c8e57b feat(vector): Support multiple embedding models with auto-generated collection names
This PR enables safe switching between embedding models and multi-server
deployments by implementing auto-generated Qdrant collection names based on
deployment ID and model name.

## Problem

Previously, all deployments used a single hardcoded collection name
"nextcloud_content", which caused two critical issues:

1. **Dimension mismatches when switching models**: Changing
   OLLAMA_EMBEDDING_MODEL (e.g., nomic-embed-text at 768D → all-minilm at
   384D) would cause runtime errors as vectors couldn't be inserted into a
   collection with incompatible dimensions.

2. **Collection collisions in multi-server setups**: Multiple MCP servers
   sharing a single Qdrant instance would overwrite each other's data,
   making horizontal scaling impossible.

## Solution

### Auto-Generated Collection Naming

Collections are now automatically named using the pattern:
\`{deployment-id}-{model-name}\`

**Deployment ID**: Uses \`OTEL_SERVICE_NAME\` if configured (and not default
value), otherwise falls back to \`hostname\` for simple Docker deployments.

**Model Name**: From \`OLLAMA_EMBEDDING_MODEL\` with path separators sanitized.

**Examples**:
- \`my-mcp-server-nomic-embed-text\` (with OTEL_SERVICE_NAME=my-mcp-server)
- \`mcp-container-all-minilm\` (simple Docker, hostname=mcp-container)

**Override**: Users can still set \`QDRANT_COLLECTION\` explicitly to bypass
auto-generation for backward compatibility.

### Dimension Validation

Added startup validation that checks collection dimensions match the
embedding service. If a mismatch is detected, the server fails fast with a
clear error message explaining:
- Expected vs actual dimensions
- Likely cause (model change)
- Solutions (delete collection, use different name, or revert model)

### Improved Sampling Error Handling

Enhanced MCP sampling rejection handling to treat user rejections as normal
behavior rather than errors:

- **User rejections** ("rejected", "denied") → INFO log, no traceback
- **Unsupported clients** → INFO log, no traceback
- **Other MCP errors** → WARNING log, no traceback
- **Unexpected errors** → ERROR log WITH traceback

This aligns with the MCP specification where clients SHOULD prompt users for
approval/denial of sampling requests.

## Changes

### Core Implementation

- **nextcloud_mcp_server/config.py**: Added \`get_collection_name()\` method
  with deployment ID detection and model name sanitization
- **nextcloud_mcp_server/vector/qdrant_client.py**: Dimension validation on
  collection open with helpful error messages
- **nextcloud_mcp_server/vector/{scanner,processor}.py**: Updated to use
  \`get_collection_name()\`
- **nextcloud_mcp_server/auth/userinfo_routes.py**: Vector sync status uses
  \`get_collection_name()\`
- **nextcloud_mcp_server/server/semantic.py**:
  - Updated semantic search tools to use \`get_collection_name()\`
  - Improved sampling rejection error handling (McpError vs Exception)

### Documentation

- **docs/semantic-search-architecture.md**: New comprehensive architecture
  document (557 lines) covering background sync, semantic search flow, RAG
  implementation, and deployment modes
- **docs/configuration.md**: Added detailed "Qdrant Collection Naming"
  section with examples and multi-server deployment guidance
- **docker-compose.yml**: Added comments explaining collection naming behavior
- **README.md**: Updated semantic search descriptions to clarify
  experimental status, Notes-only support, and infrastructure requirements

## Migration Guide

**For existing single-server deployments:**

Option 1 (Recommended): Use explicit collection name for continuity
\`\`\`bash
QDRANT_COLLECTION=nextcloud_content  # Keep existing collection
\`\`\`

Option 2: Allow auto-generation and re-embed
\`\`\`bash
# Remove QDRANT_COLLECTION override
# New collection will be created based on deployment ID + model
# Requires re-embedding all documents (may take time)
\`\`\`

**For new multi-server deployments:**

Set unique OTEL service names per server:
\`\`\`bash
# Server 1
OTEL_SERVICE_NAME=mcp-prod
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# → Collection: "mcp-prod-nomic-embed-text"

# Server 2
OTEL_SERVICE_NAME=mcp-staging
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# → Collection: "mcp-staging-nomic-embed-text"
\`\`\`

## Benefits

 **Safe model switching**: Each model gets its own collection, preventing
   dimension mismatch errors
 **Multi-server support**: Multiple MCP servers can share one Qdrant
   instance without conflicts
 **Clear ownership**: Collection names show which deployment and model owns
   the data
 **Better error messages**: Dimension validation provides actionable
   guidance
 **Backward compatible**: Existing deployments can continue using
   \`QDRANT_COLLECTION\` override

## Testing

Validated with:
- Single-server deployments (default hostname-based naming)
- Multi-server deployments (OTEL service name-based naming)
- Model switching scenarios (dimension validation)
- Collection override scenarios (backward compatibility)

Next steps: Testing various Ollama embedding models to investigate optimal
chunk sizes and performance characteristics.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 01:18:30 +01:00

202 lines
9.1 KiB
Markdown

# Nextcloud MCP Server
[![Docker Image](https://img.shields.io/badge/docker-ghcr.io/cbcoutinho/nextcloud--mcp--server-blue)](https://github.com/cbcoutinho/nextcloud-mcp-server/pkgs/container/nextcloud-mcp-server)
**A production-ready MCP server that connects AI assistants to your Nextcloud instance.**
Enable Large Language Models like Claude, GPT, and Gemini to interact with your Nextcloud data through a secure API. Create notes, manage calendars, organize contacts, work with files, and more - all through natural language conversations.
This is a **dedicated standalone MCP server** designed for external MCP clients like Claude Code and IDEs. It runs independently of Nextcloud (Docker, VM, Kubernetes, or local) and provides deep CRUD operations across Nextcloud apps.
> [!NOTE]
> **Looking for AI features inside Nextcloud?** Nextcloud also provides [Context Agent](https://github.com/nextcloud/context_agent), which powers the Assistant app and runs as an ExApp inside Nextcloud. See [docs/comparison-context-agent.md](docs/comparison-context-agent.md) for a detailed comparison of use cases.
## Quick Start
Get up and running in 60 seconds using Docker:
```bash
# 1. Create a minimal configuration
cat > .env << EOF
NEXTCLOUD_HOST=https://your.nextcloud.instance.com
NEXTCLOUD_USERNAME=your_username
NEXTCLOUD_PASSWORD=your_app_password
EOF
# 2. Start the server
docker run -p 127.0.0.1:8000:8000 --env-file .env --rm \
ghcr.io/cbcoutinho/nextcloud-mcp-server:latest
# 3. Test the connection
curl http://127.0.0.1:8000/health
```
**Next Steps:**
- Create an app password in Nextcloud: Settings → Security → Devices & sessions
- Connect your MCP client (Claude Desktop, IDEs, `mcp dev`, etc.)
- See [docs/installation.md](docs/installation.md) for other deployment options (local, Kubernetes)
## Key Features
- **90+ MCP Tools** - Comprehensive API coverage across 8 Nextcloud apps
- **MCP Resources** - Structured data URIs for browsing Nextcloud data
- **Semantic Search (Experimental)** - Optional vector-powered search for Notes (requires Qdrant + Ollama)
- **Document Processing** - OCR and text extraction from PDFs, DOCX, images with progress notifications
- **Flexible Deployment** - Docker, Kubernetes (Helm), VM, or local installation
- **Production-Ready Auth** - Basic Auth with app passwords (recommended) or OAuth2/OIDC (experimental)
- **Multiple Transports** - SSE, HTTP, and streamable-http support
## Supported Apps
| App | Tools | Capabilities |
|-----|-------|--------------|
| **Notes** | 7 | Full CRUD, keyword search, semantic search |
| **Calendar** | 20+ | Events, todos (tasks), recurring events, attendees, availability |
| **Contacts** | 8 | Full CardDAV support, address books |
| **Files (WebDAV)** | 12 | Filesystem access, OCR/document processing |
| **Deck** | 15 | Boards, stacks, cards, labels, assignments |
| **Cookbook** | 13 | Recipe management, URL import (schema.org) |
| **Tables** | 5 | Row operations on Nextcloud Tables |
| **Sharing** | 10+ | Create and manage shares |
| **Semantic Search** | 2+ | Vector search for Notes (experimental, opt-in, requires infrastructure) |
Want to see another Nextcloud app supported? [Open an issue](https://github.com/cbcoutinho/nextcloud-mcp-server/issues) or contribute a pull request!
## Authentication
> [!IMPORTANT]
> **OAuth2/OIDC is experimental** and requires a manual patch to the `user_oidc` app:
> - **Required patch**: Bearer token support ([issue #1221](https://github.com/nextcloud/user_oidc/issues/1221))
> - **Impact**: Without the patch, most app-specific APIs fail with 401 errors
> - **Recommendation**: Use Basic Auth for production until upstream patches are merged
>
> See [docs/oauth-upstream-status.md](docs/oauth-upstream-status.md) for patch status and workarounds.
**Recommended:** Basic Auth with app-specific passwords provides secure, production-ready authentication. See [docs/authentication.md](docs/authentication.md) for setup details and OAuth configuration.
### Authentication Modes
The server supports two authentication modes:
**Single-User Mode (BasicAuth):**
- One set of credentials shared by all MCP clients
- Simple setup: username + app password in environment variables
- All clients access Nextcloud as the same user
- Best for: Personal use, development, single-user deployments
**Multi-User Mode (OAuth):**
- Each MCP client authenticates separately with their own Nextcloud account
- Per-user scopes and permissions (clients only see tools they're authorized for)
- More secure: tokens expire, credentials never shared with server
- Best for: Teams, multi-user deployments, production environments with multiple users
See [docs/authentication.md](docs/authentication.md) for detailed setup instructions.
## Semantic Search
The server provides an experimental RAG pipeline to enable _Semantic Search_ that enables MCP clients to find information in Nextcloud based on **meaning** rather than just keywords. Instead of matching "machine learning" only when those exact words appear, it understands that "neural networks," "AI models," and "deep learning" are semantically related concepts.
**Example:**
- **Keyword search**: Query "car" only finds notes containing "car"
- **Semantic search**: Query "car" also finds notes about "automobile," "vehicle," "sedan," "transportation"
This enables natural language queries and helps discover related content across your Nextcloud notes.
> [!NOTE]
> **Semantic Search is experimental and opt-in:**
> - Disabled by default (`VECTOR_SYNC_ENABLED=false`)
> - Currently supports Notes app only (multi-app support planned)
> - Requires additional infrastructure: vector database + embedding service
> - Answer generation (`nc_semantic_search_answer`) requires MCP client sampling support
>
> See [docs/semantic-search-architecture.md](docs/semantic-search-architecture.md) for architecture details and [docs/configuration.md](docs/configuration.md) for setup instructions.
## Documentation
### Getting Started
- **[Installation](docs/installation.md)** - Docker, Kubernetes, local, or VM deployment
- **[Configuration](docs/configuration.md)** - Environment variables and advanced options
- **[Authentication](docs/authentication.md)** - Basic Auth vs OAuth2/OIDC setup
- **[Running the Server](docs/running.md)** - Start, manage, and troubleshoot
### Features
- **[App Documentation](docs/)** - Notes, Calendar, Contacts, WebDAV, Deck, Cookbook, Tables
- **[Document Processing](docs/configuration.md#document-processing)** - OCR and text extraction setup
- **[Semantic Search Architecture](docs/semantic-search-architecture.md)** - Experimental vector search (Notes only, opt-in)
### Advanced Topics
- **[OAuth Architecture](docs/oauth-architecture.md)** - How OAuth works (experimental)
- **[OAuth Quick Start](docs/quickstart-oauth.md)** - 5-minute OAuth setup
- **[OAuth Setup Guide](docs/oauth-setup.md)** - Detailed OAuth configuration
- **[Troubleshooting](docs/troubleshooting.md)** - Common issues and solutions
- **[Comparison with Context Agent](docs/comparison-context-agent.md)** - When to use each approach
## Examples
### Create a Note
```
AI: "Create a note called 'Meeting Notes' with today's agenda"
→ Uses nc_notes_create_note tool
```
### Import Recipes
```
AI: "Import the recipe from https://www.example.com/recipe/chocolate-cake"
→ Uses nc_cookbook_import_recipe tool with schema.org metadata extraction
```
### Schedule Meetings
```
AI: "Schedule a team meeting for next Tuesday at 2pm"
→ Uses nc_calendar_create_event tool
```
### Manage Files
```
AI: "Create a folder called 'Project X' and move all PDFs there"
→ Uses nc_webdav_create_directory and nc_webdav_move tools
```
### Semantic Search (Experimental, Opt-in)
```
AI: "Find notes related to machine learning concepts"
→ Uses nc_semantic_search to find semantically similar notes (requires Qdrant + Ollama setup)
```
**Note:** For AI-generated answers with citations, use `nc_semantic_search_answer` (requires MCP client with sampling support).
## Contributing
Contributions are welcome!
- Report bugs or request features: [GitHub Issues](https://github.com/cbcoutinho/nextcloud-mcp-server/issues)
- Submit improvements: [Pull Requests](https://github.com/cbcoutinho/nextcloud-mcp-server/pulls)
- Development guidelines: [CLAUDE.md](CLAUDE.md)
## Security
[![MseeP.ai Security Assessment](https://mseep.net/pr/cbcoutinho-nextcloud-mcp-server-badge.png)](https://mseep.ai/app/cbcoutinho-nextcloud-mcp-server)
This project takes security seriously:
- Production-ready Basic Auth with app-specific passwords
- OAuth2/OIDC support (experimental, requires upstream patches)
- Per-user access tokens
- No credential storage in OAuth mode
- Regular security assessments
Found a security issue? Please report it privately to the maintainers.
## License
This project is licensed under the AGPL-3.0 License. See [LICENSE](./LICENSE) for details.
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=cbcoutinho/nextcloud-mcp-server&type=Date)](https://www.star-history.com/#cbcoutinho/nextcloud-mcp-server&Date)
## References
- [Model Context Protocol](https://github.com/modelcontextprotocol)
- [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk)
- [Nextcloud](https://nextcloud.com/)