Add nc_notes_semantic_search_answer tool that combines semantic search with MCP sampling to generate natural language answers from retrieved Nextcloud Notes. This enables Retrieval-Augmented Generation (RAG) patterns without requiring a server-side LLM.
Key features:
- Client-side LLM generation via ctx.session.create_message()
- Graceful fallback when sampling unavailable
- Proper source citations in generated answers
- No results optimization (skips sampling when no docs found)
- Comprehensive unit and integration tests
Implementation details:
- SamplingSearchResponse model with generated_answer and sources
- Fixed prompt template with document context and citation instructions
- Model preferences hint Claude Sonnet for balanced performance
- Falls back to returning documents without answer on sampling failure
Updates:
- Add ADR-008 documenting sampling architecture decision
- Add MCP sampling pattern guidance to CLAUDE.md
- Update README.md and docs/notes.md (7 → 9 tools)
- Add 4 unit tests and 6 integration tests
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Coding Conventions
async/await Patterns
- Use anyio + asyncio hybrid - Both libraries are available
  - pytest runs in anyio mode (anyio_mode = "auto" in pyproject.toml)
  - asyncio used in auth modules (refresh_token_storage.py, token_exchange.py, token_broker.py)
  - anyio used in calendar.py, client_registration.py, app.py
- Prefer standard async/await syntax without explicit library imports when possible
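As a minimal sketch of the last point: a coroutine written with plain async/await needs no import from either library, and only the entry point chooses an event loop. The names fetch_title and main and the sample data are hypothetical; asyncio.run is used here only to make the sketch executable.

```python
import asyncio


async def fetch_title(note: dict[str, str]) -> str:
    """Library-agnostic coroutine: nothing here imports anyio or asyncio."""
    return note["title"].strip()


async def main() -> list[str]:
    # Hypothetical sample data, not taken from the project.
    notes = [{"title": " Groceries "}, {"title": "Ideas"}]
    return [await fetch_title(note) for note in notes]


if __name__ == "__main__":
    # Only the entry point decides which event loop runs the coroutines.
    print(asyncio.run(main()))
```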
Type Hints
- Use Python 3.10+ union syntax: str | None instead of Optional[str]
- Use lowercase generics: dict[str, Any] instead of Dict[str, Any]
- Type all function signatures - Parameters and return types
- No explicit type checker configured - Ruff handles linting only
Code Quality
- Run ruff before committing: uv run ruff check, then uv run ruff format
- Ruff configuration in pyproject.toml (extends select: ["I"] for import sorting)
Error Handling
- Use custom decorators: @retry_on_429 for rate limiting (see base_client.py)
- Standard exceptions: HTTPStatusError from httpx, McpError for MCP-specific errors
- Logging patterns:
  - logger.debug() for expected 404s and normal operations
  - logger.warning() for retries and non-critical issues
  - logger.error() for actual errors
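To illustrate how a retry decorator and the logging levels above can fit together, here is a minimal sketch. It is not the project's @retry_on_429 from base_client.py: the decorator name retry_on_rate_limit, the RateLimitedError class (a stand-in for an httpx HTTPStatusError with status 429), and the fixed delay are all assumptions made so the example runs with the standard library only.

```python
import asyncio
import functools
import logging
from typing import Any

logger = logging.getLogger(__name__)


class RateLimitedError(Exception):
    """Hypothetical stand-in for an HTTP 429 response error."""


def retry_on_rate_limit(max_retries: int = 3, delay: float = 0.01):
    """Retry an async callable when it raises RateLimitedError."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, max_retries + 1):
                try:
                    return await func(*args, **kwargs)
                except RateLimitedError:
                    if attempt == max_retries:
                        # Out of attempts: this is an actual error.
                        logger.error("Giving up after %d attempts", attempt)
                        raise
                    # A retry is a non-critical issue, so log a warning.
                    logger.warning("Rate limited, retry %d/%d", attempt, max_retries)
                    await asyncio.sleep(delay)

        return wrapper

    return decorator
```

A production version would more likely back off exponentially and honor a Retry-After header; the fixed delay keeps the sketch short.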
Testing Patterns
- Use existing fixtures from tests/conftest.py (2888 lines of test infrastructure)
- Session-scoped fixtures handle anyio/pytest-asyncio incompatibility
- Mocked unit tests use mocker.AsyncMock(spec=httpx.AsyncClient)
- pytest-timeout: 180s default per test
- Mark tests appropriately: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.oauth, @pytest.mark.smoke
Architectural Patterns
- Base classes: BaseNextcloudClient for all API clients
- Pydantic responses: All MCP tools return Pydantic models inheriting from BaseResponse
- Decorators: @require_scopes, @require_provisioning for access control
- Context pattern: await get_client(ctx) to access authenticated NextcloudClient (async!)
- FastMCP decorators: @mcp.tool(), @mcp.resource()
- Token acquisition: get_client() handles both pass-through and token exchange modes
  - Pass-through (default): Simple, stateless (ENABLE_TOKEN_EXCHANGE=false)
  - Token exchange (opt-in): RFC 8693 delegation (ENABLE_TOKEN_EXCHANGE=true)
Project Structure
- nextcloud_mcp_server/client/ - HTTP clients for Nextcloud APIs
- nextcloud_mcp_server/server/ - MCP tool/resource definitions
- nextcloud_mcp_server/auth/ - OAuth/OIDC authentication
- nextcloud_mcp_server/models/ - Pydantic response models
- tests/ - Layered test suite (unit, smoke, integration, load)
Development Commands (Quick Reference)
Testing
# Fast feedback (recommended)
uv run pytest tests/unit/ -v # Unit tests (~5s)
uv run pytest -m smoke -v # Smoke tests (~30-60s)
# Integration tests
uv run pytest -m "integration and not oauth" -v # Without OAuth (~2-3min)
uv run pytest -m oauth -v # OAuth only (~3min)
uv run pytest # Full suite (~4-5min)
# Coverage
uv run pytest --cov
# Specific tests after changes
uv run pytest tests/server/test_mcp.py -k "notes" -v
uv run pytest tests/client/notes/test_notes_api.py -v
Important: After code changes, rebuild the correct container:
- Single-user tests: docker-compose up --build -d mcp
- OAuth tests: docker-compose up --build -d mcp-oauth
- Keycloak tests: docker-compose up --build -d mcp-keycloak
Running the Server
# Local development
export $(grep -v '^#' .env | xargs)
mcp run --transport sse nextcloud_mcp_server.app:mcp
# Docker development (rebuilds after code changes)
docker-compose up --build -d mcp # Single-user (port 8000)
docker-compose up --build -d mcp-oauth # Nextcloud OAuth (port 8001)
docker-compose up --build -d mcp-keycloak # Keycloak OAuth (port 8002)
Environment Setup
uv sync # Install dependencies
uv sync --group dev # Install with dev dependencies
Load Testing
# Quick test (default: 10 workers, 30 seconds)
uv run python -m tests.load.benchmark
# Custom concurrency and duration
uv run python -m tests.load.benchmark -c 20 -d 60
# Export results for analysis
uv run python -m tests.load.benchmark --output results.json --verbose
Expected Performance: 50-200 RPS for mixed workload, p50 <100ms, p95 <500ms, p99 <1000ms.
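One way to check a set of measured latencies against these targets is sketched below. The helper names and the nearest-rank percentile method are assumptions; the sketch does not read the benchmark's results.json, whose format is not described here.

```python
import math


def percentile(samples_ms: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of data at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Latency targets from the expected-performance line, in milliseconds.
THRESHOLDS_MS = {50: 100.0, 95: 500.0, 99: 1000.0}


def check_latencies(samples_ms: list[float]) -> dict[int, bool]:
    """Map each percentile to True when it is below its target."""
    return {
        pct: percentile(samples_ms, pct) < limit
        for pct, limit in THRESHOLDS_MS.items()
    }
```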
Database Inspection
Credentials: root/password, nextcloud/password, database: nextcloud
# Connect to database
docker compose exec db mariadb -u root -ppassword nextcloud
# Check OAuth clients
docker compose exec db mariadb -u root -ppassword nextcloud -e \
"SELECT id, name, token_type FROM oc_oidc_clients ORDER BY id DESC LIMIT 10;"
# Check OAuth client scopes
docker compose exec db mariadb -u root -ppassword nextcloud -e \
"SELECT c.id, c.name, s.scope FROM oc_oidc_clients c LEFT JOIN oc_oidc_client_scopes s ON c.id = s.client_id WHERE c.name LIKE '%MCP%';"
# Check OAuth access tokens
docker compose exec db mariadb -u root -ppassword nextcloud -e \
"SELECT id, client_id, user_id, created_at FROM oc_oidc_access_tokens ORDER BY created_at DESC LIMIT 10;"
Important Tables:
- oc_oidc_clients - OAuth client registrations (DCR)
- oc_oidc_client_scopes - Client allowed scopes
- oc_oidc_access_tokens - Issued access tokens
- oc_oidc_authorization_codes - Authorization codes
- oc_oidc_registration_tokens - RFC 7592 registration tokens
- oc_oidc_redirect_uris - Redirect URIs
Architecture Quick Reference
For detailed architecture, see:
- docs/comparison-context-agent.md - Overall architecture
- docs/oauth-architecture.md - OAuth integration patterns
- docs/ADR-004-progressive-consent.md - Progressive consent implementation
Core Components:
- nextcloud_mcp_server/app.py - FastMCP server entry point
- nextcloud_mcp_server/client/ - HTTP clients (Notes, Calendar, Contacts, Tables, WebDAV)
- nextcloud_mcp_server/server/ - MCP tool/resource definitions
- nextcloud_mcp_server/auth/ - OAuth/OIDC authentication
Supported Apps: Notes, Calendar (CalDAV + VTODO tasks), Contacts (CardDAV), Tables, WebDAV, Deck, Cookbook
Key Patterns:
- NextcloudClient orchestrates all app-specific clients
- BaseNextcloudClient provides common HTTP functionality + retry logic
- MCP tools use context pattern: get_client(ctx) → NextcloudClient
- All operations are async using httpx
Progressive Consent Architecture (ADR-004)
Important: Progressive consent is a mechanism for granting access, not a feature flag. The architecture is always present in OAuth mode. Whether provisioning tools are available is controlled by ENABLE_OFFLINE_ACCESS.
What is Progressive Consent?
- Dual OAuth flow architecture that separates client authentication (Flow 1) from resource provisioning (Flow 2)
- Flow 1: MCP client authenticates directly to IdP with resource scopes (notes:, calendar:, etc.)
  - Token audience: "mcp-server"
  - Client receives resource-scoped token for MCP session
- Flow 2: Server explicitly provisions Nextcloud access via separate login (only when ENABLE_OFFLINE_ACCESS=true)
  - Server requests: openid, profile, email, offline_access
  - Token audience: "nextcloud"
  - Server receives refresh token for offline access
  - Client never sees this token
- Provides clear separation between session tokens and offline access tokens
Modes:
- Pass-through mode (ENABLE_OFFLINE_ACCESS=false, default):
  - No Flow 2 provisioning
  - Server uses client's token to access Nextcloud (pass-through)
  - No provisioning tools available
  - Suitable for stateless, client-driven operations
- Offline access mode (ENABLE_OFFLINE_ACCESS=true):
  - Flow 2 provisioning available
  - Server stores refresh tokens for background operations
  - Provisioning tools available: provision_nextcloud_access, check_logged_in
  - Suitable for background jobs and server-initiated operations
When to use OAuth mode:
- Multi-user deployments
- Background jobs requiring offline access (with ENABLE_OFFLINE_ACCESS=true)
- Enhanced security with separate authorization contexts
- Explicit user control over resource access
When to use BasicAuth instead:
- Simple single-user deployments
- Local development and testing
Key features:
- No scope escalation - client gets exactly what it requests
- User explicitly authorizes via provision_nextcloud_access tool
- Clear security boundaries between MCP session and Nextcloud access
MCP Response Patterns (CRITICAL)
Never return raw List[Dict] from MCP tools - FastMCP mangles them into dicts with numeric string keys.
Correct Pattern:
- Client methods return List[Dict] (raw data)
- MCP tools convert to Pydantic models and wrap in response object
- Response models inherit from BaseResponse, include results field + metadata
Reference implementations:
- nextcloud_mcp_server/models/notes.py:80 - SearchNotesResponse
- nextcloud_mcp_server/models/webdav.py:113 - SearchFilesResponse
- nextcloud_mcp_server/server/{notes,webdav}.py - Tool examples
Testing: Extract data["results"] from MCP responses, not data directly.
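The wrap-then-extract pattern can be sketched without any project code. In the example below, standard-library dataclasses stand in for the Pydantic models so that it runs without dependencies; NoteSummary, SearchNotesResult, and wrap_notes are hypothetical names, not the project's SearchNotesResponse.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class NoteSummary:
    id: int
    title: str


@dataclass
class SearchNotesResult:
    """Stand-in for a BaseResponse subclass: a results field plus metadata."""

    results: list[NoteSummary] = field(default_factory=list)
    total_found: int = 0
    success: bool = True


def wrap_notes(raw: list[dict[str, Any]]) -> SearchNotesResult:
    """Convert raw client dicts into typed items and wrap them in a response object."""
    items = [NoteSummary(id=r["id"], title=r["title"]) for r in raw]
    return SearchNotesResult(results=items, total_found=len(items))


# What a test sees once the response has been serialised to JSON.
payload = json.dumps(asdict(wrap_notes([{"id": 1, "title": "Groceries"}])))
data = json.loads(payload)
titles = [item["title"] for item in data["results"]]  # read data["results"], not data
```

Because the list lives under a named field, it survives serialisation as a list, and the test reads it back through that field.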
MCP Sampling for RAG (ADR-008)
What is MCP Sampling?
MCP sampling allows servers to request LLM completions from their clients. This enables Retrieval-Augmented Generation (RAG) patterns where the server retrieves context and the client's LLM generates answers.
When to use sampling:
- Generating natural language answers from retrieved documents
- Synthesizing information from multiple sources
- Creating summaries with citations
Implementation Pattern (see ADR-008 for details):
from mcp.types import ModelHint, ModelPreferences, SamplingMessage, TextContent
@mcp.tool()
@require_scopes("notes:read")
async def nc_notes_semantic_search_answer(
    query: str, ctx: Context, limit: int = 5, max_answer_tokens: int = 500
) -> SamplingSearchResponse:
    # 1. Retrieve documents
    search_response = await nc_notes_semantic_search(query, ctx, limit)
    # 2. Check for no results (don't waste sampling call)
    if not search_response.results:
        return SamplingSearchResponse(
            query=query,
            generated_answer="No relevant documents found.",
            sources=[], total_found=0, success=True
        )
    # 3. Construct prompt with retrieved context
    prompt = f"{query}\n\nDocuments:\n{format_sources(search_response.results)}\n\nProvide answer with citations."
    # 4. Request LLM completion via sampling
    try:
        result = await ctx.session.create_message(
            messages=[SamplingMessage(role="user", content=TextContent(type="text", text=prompt))],
            max_tokens=max_answer_tokens,
            temperature=0.7,
            model_preferences=ModelPreferences(
                hints=[ModelHint(name="claude-3-5-sonnet")],
                intelligencePriority=0.8,
                speedPriority=0.5,
            ),
            include_context="thisServer",
        )
        return SamplingSearchResponse(
            query=query,
            generated_answer=result.content.text,
            sources=search_response.results,
            model_used=result.model,
            stop_reason=result.stopReason,
            success=True
        )
    except Exception as e:
        # Fallback: Return documents without generated answer
        return SamplingSearchResponse(
            query=query,
            generated_answer=f"[Sampling unavailable: {e}]\n\nFound {len(search_response.results)} documents.",
            sources=search_response.results,
            search_method="semantic_sampling_fallback",
            success=True
        )
Key Points:
- No server-side LLM: Server has no API keys, client controls which model is used
- Graceful degradation: Tool always returns useful results even if sampling fails
- User control: MCP clients SHOULD prompt users to approve sampling requests
- No results optimization: Skip sampling call when no documents found
- Fixed prompts: Prompts are not user-configurable to avoid injection risks
Reference: See nc_notes_semantic_search_answer in nextcloud_mcp_server/server/notes.py:517 and ADR-008 for complete implementation.
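The pattern above calls format_sources without showing it. A minimal sketch of what such a helper could look like follows; it is an assumption, not the implementation in notes.py. In particular, it treats each result as a plain dict with title and excerpt keys, whereas the real results are response model objects whose fields are not listed here.

```python
def format_sources(results: list[dict[str, str]], max_chars: int = 200) -> str:
    """Render retrieved documents as numbered entries the model can cite as [1], [2], ..."""
    blocks = []
    for index, doc in enumerate(results, start=1):
        # Truncate each excerpt so the prompt stays bounded (max_chars is an assumed limit).
        excerpt = doc.get("excerpt", "")[:max_chars]
        blocks.append(f"[{index}] {doc.get('title', 'Untitled')}\n{excerpt}")
    return "\n\n".join(blocks)


def build_prompt(query: str, results: list[dict[str, str]]) -> str:
    """Fixed template: only the query and the retrieved documents vary."""
    return (
        f"{query}\n\nDocuments:\n{format_sources(results)}\n\n"
        "Provide answer with citations."
    )
```

Numbering the documents gives the model stable labels to cite, and keeping the template fixed matches the point above that prompts are not user-configurable.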
Testing Best Practices (MANDATORY)
Always Run Tests
- Run tests to completion before considering any task complete
- Rebuild the correct container after code changes (see Development Commands above)
- If tests require modifications, ask for permission before proceeding
Use Existing Fixtures
See tests/conftest.py for 2888 lines of test infrastructure:
- nc_mcp_client - MCP client for tool/resource testing (uses mcp container)
- nc_mcp_oauth_client - MCP client for OAuth testing (uses mcp-oauth container)
- nc_client - Direct NextcloudClient for setup/cleanup
- temporary_note, temporary_addressbook, temporary_contact - Auto-cleanup
Writing Mocked Unit Tests
For client-layer response parsing tests, use mocked HTTP responses:
async def test_notes_api_get_note(mocker):
    """Test that get_note correctly parses the API response."""
    mock_response = create_mock_note_response(
        note_id=123, title="Test Note", content="Test content",
        category="Test", etag="abc123"
    )
    mock_make_request = mocker.patch.object(
        NotesClient, "_make_request", return_value=mock_response
    )
    client = NotesClient(mocker.AsyncMock(spec=httpx.AsyncClient), "testuser")
    note = await client.get_note(note_id=123)
    assert note["id"] == 123
    mock_make_request.assert_called_once_with("GET", "/apps/notes/api/v1/notes/123")
Mock helpers in tests/conftest.py: create_mock_response(), create_mock_note_response(), create_mock_error_response()
When to use: Response parsing, error handling, request parameter building
When NOT to use: CalDAV/CardDAV/WebDAV protocols, OAuth flows, end-to-end MCP testing
OAuth Testing
OAuth tests use Playwright browser automation to complete flows programmatically.
Test Environment:
- Three MCP containers: mcp (single-user), mcp-oauth (Nextcloud OIDC), mcp-keycloak (external IdP)
- OAuth tests require NEXTCLOUD_HOST, NEXTCLOUD_USERNAME, NEXTCLOUD_PASSWORD environment variables
- Playwright configuration: --browser firefox --headed for debugging
- Install browsers: uv run playwright install firefox
OAuth fixtures: nc_oauth_client, nc_mcp_oauth_client, alice_oauth_token, bob_oauth_token, etc.
Shared OAuth Client: All test users authenticate using a single OAuth client (created via DCR, deleted at session end via RFC 7592). Matches production behavior.
Run OAuth tests:
uv run pytest -m oauth -v # All OAuth tests
uv run pytest tests/server/oauth/ --browser firefox -v
uv run pytest tests/server/oauth/test_oauth_core.py --browser firefox --headed -v
Keycloak OAuth Testing
Validates ADR-002 architecture for external identity providers and offline access patterns.
Architecture: MCP Client → Keycloak (OAuth) → MCP Server → Nextcloud user_oidc (validates token) → APIs
Setup:
docker-compose up -d keycloak app mcp-keycloak
curl http://localhost:8888/realms/nextcloud-mcp/.well-known/openid-configuration
docker compose exec app php occ user_oidc:provider keycloak
Credentials: admin/admin (Keycloak realm: nextcloud-mcp)
For detailed Keycloak setup, see:
- docs/oauth-setup.md - OAuth configuration
- docs/ADR-002-vector-sync-authentication.md - Offline access architecture
- docs/audience-validation-setup.md - Token audience validation
- docs/keycloak-multi-client-validation.md - Realm-level validation
Integration Testing with Docker
Nextcloud: docker compose exec app php occ ... for occ commands
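MariaDB: docker compose exec db mariadb -u [user] -p[password] [database] for queries (no space between -p and the password, as in the Database Inspection commands; with a space, the client prompts for a password and treats the next argument as the database name)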
For detailed setup, see:
- docs/installation.md - Installation guide
- docs/configuration.md - Configuration options
- docs/authentication.md - Authentication modes
- docs/running.md - Running the server