Update ADRs to reflect that vector database and semantic search support multiple Nextcloud apps (notes, calendar, deck, files, contacts) rather than being notes-specific. Introduce semantic:read/write OAuth scopes to replace app-specific scope requirements for cross-app search.

Changes:
- ADR-007: Add plugin architecture (DocumentScanner, DocumentProcessor, DocumentVerifier) for multi-app vector sync
- ADR-008: Rename tools from nc_notes_semantic_* to nc_semantic_*, update scope from notes:read to semantic:read
- ADR-009: NEW - Document decision to use generic semantic:read scope with dual-phase authorization instead of requiring all app scopes
- oauth-architecture.md: Add semantic:read/write scope documentation
- README.md: Move semantic search to dedicated section separate from Notes

This is a breaking change that correctly positions semantic search as a cross-app capability before broader adoption. Existing deployments will need to re-authenticate with the new semantic:read scope.

Relates to user request to decouple vector database from notes-only model and establish proper OAuth scope boundaries for multi-app semantic search.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
ADR-009: Generic semantic:read OAuth Scope for Multi-App Vector Search
Status: Proposed
Date: 2025-01-11
Depends On: ADR-007 (Background Vector Sync), ADR-008 (MCP Sampling for Semantic Search)
Context
ADR-007 established a background vector synchronization architecture that indexes content from multiple Nextcloud apps (notes, calendar events, deck cards, files, contacts) into a unified vector database. ADR-008 introduced semantic search tools (nc_semantic_search, nc_semantic_search_answer) that query this vector database and use MCP sampling to generate natural language answers.
The question is: What OAuth scopes should protect semantic search operations?
Option 1: App-Specific Scopes
Require users to have scopes for each app they want to search:
@mcp.tool()
@require_scopes("notes:read", "calendar:read", "deck:read", "files:read", "contacts:read")
async def nc_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Search across all indexed apps"""
Advantages:
- Granular control - users explicitly consent to searching each app
- Aligns with app-specific authorization model
- Clear security boundary - can only search apps you can access
Disadvantages:
- Brittle user experience: If a user grants only notes:read but the tool requires all 5 scopes, the tool becomes invisible/unusable
- All-or-nothing enforcement: Can't search notes alone - must grant all scopes or none
- Poor progressive consent: User can't start with notes search and later add calendar
- Scope inflation: Every new app adds another required scope
- Mismatched semantics: User thinks "I want to search my notes" but must grant calendar, deck, files, contacts just to make the tool appear
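The all-or-nothing failure can be made concrete with a minimal sketch. It assumes a tool is listed only when every required scope has been granted (AND semantics); the helpers `tool_is_visible` and `missing_scopes` are hypothetical names for illustration, not the server's actual scope-checking code.

```python
# Hypothetical illustration only: not the server's real scope check.
REQUIRED_SCOPES = frozenset(
    {"notes:read", "calendar:read", "deck:read", "files:read", "contacts:read"}
)


def tool_is_visible(granted: set[str], required: frozenset[str] = REQUIRED_SCOPES) -> bool:
    """Return True only when every required scope was granted (AND semantics)."""
    return required <= granted


def missing_scopes(granted: set[str], required: frozenset[str] = REQUIRED_SCOPES) -> list[str]:
    """List the scopes that are still missing, sorted for stable output."""
    return sorted(required - granted)


notes_only = {"notes:read"}
print(tool_is_visible(notes_only))  # the tool stays hidden
print(missing_scopes(notes_only))   # four unrelated scopes are still required
```

A user who only wants to search notes is blocked by four scopes that have nothing to do with that intent, which is the mismatch described in the last bullet.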
Option 2: Single Generic Scope (Chosen)
Introduce a new semantic search-specific scope:
@mcp.tool()
@require_scopes("semantic:read")
async def nc_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Search across all indexed apps"""
Advantages:
- Simple authorization: One scope grants semantic search capability
- Progressive enablement: User grants semantic:read, searches notes initially, then enables calendar indexing later
- Logical grouping: Semantic search is a cross-app feature, deserving its own scope
- Future-proof: New apps can be added to vector sync without changing OAuth scopes
- Matches user mental model: "I want semantic search" → grant semantic:read (not "I want semantic search" → grant 5 unrelated app scopes)
Considerations:
- User could search apps they can't directly access via app-specific tools
  - Mitigation: Dual-phase authorization (Phase 1: scope check passes with semantic:read, Phase 2: verify user can access each returned document via app-specific permissions)
- Less granular than app-specific scopes
  - Counterpoint: Semantic search is inherently cross-app - forcing per-app authorization defeats its purpose
Option 3: Hybrid Approach (Rejected)
Support both: semantic search works with either semantic:read OR all app-specific scopes:
@mcp.tool()
@require_scopes("semantic:read", alternative_scopes=["notes:read", "calendar:read", ...])
async def nc_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Search across all indexed apps"""
Rejected Because:
- Adds complexity to scope validation logic
- Unclear to users which scopes they should grant
- Alternative scopes still suffer from all-or-nothing problem
- No significant benefit over Option 2 with dual-phase authorization
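To illustrate the third point, here is a rough sketch of the OR-of-AND check the hybrid approach would need. The semantics of `alternative_scopes` are an assumption (the primary scope passes on its own, otherwise all alternatives are required), and `hybrid_scopes_satisfied` is a hypothetical helper name.

```python
# Hypothetical sketch: the ADR does not define how alternative_scopes is evaluated.
def hybrid_scopes_satisfied(granted: set[str], primary: str, alternatives: list[str]) -> bool:
    """Pass if the primary scope is granted, or if ALL alternative scopes are."""
    if primary in granted:
        return True
    # An empty alternatives list must not grant access by accident.
    return bool(alternatives) and all(scope in granted for scope in alternatives)


APP_SCOPES = ["notes:read", "calendar:read", "deck:read", "files:read", "contacts:read"]

# Granting the generic scope is enough.
print(hybrid_scopes_satisfied({"semantic:read"}, "semantic:read", APP_SCOPES))
# Two of five app scopes still fail: the alternative path is all-or-nothing.
print(hybrid_scopes_satisfied({"notes:read", "calendar:read"}, "semantic:read", APP_SCOPES))
# Only the complete set of app scopes passes without semantic:read.
print(hybrid_scopes_satisfied(set(APP_SCOPES), "semantic:read", APP_SCOPES))
```

Under these assumed semantics the second path reproduces the Option 1 problem, so the extra branch adds validation logic without helping the user who granted a partial set.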
Decision
We will introduce two new OAuth scopes specifically for semantic search operations:
- semantic:read: Query vector database, perform semantic search, generate answers
- semantic:write: Enable/disable background vector synchronization, manage indexing settings
These scopes are independent of app-specific scopes (notes:read, calendar:read, etc.).
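The independence of the two scope families can be sketched with a simplified stand-in for the decorator. This is not the project's actual `require_scopes` implementation: `FakeContext`, `InsufficientScopeError`, and the `required_scopes` attribute are hypothetical names, and the real MCP `Context` is replaced by a small dataclass that only carries granted scopes.

```python
import asyncio
import functools
from dataclasses import dataclass, field


class InsufficientScopeError(PermissionError):
    """Raised when the caller's token lacks a required scope (hypothetical)."""


@dataclass
class FakeContext:
    """Stand-in for the MCP Context; it only carries the granted scopes."""
    granted_scopes: set[str] = field(default_factory=set)


def require_scopes(*scopes: str):
    """Simplified decorator: record required scopes and enforce them per call."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, ctx: FakeContext, **kwargs):
            missing = set(scopes) - ctx.granted_scopes
            if missing:
                raise InsufficientScopeError(f"missing scopes: {sorted(missing)}")
            return await func(*args, ctx=ctx, **kwargs)
        wrapper.required_scopes = frozenset(scopes)  # metadata for scope discovery
        return wrapper
    return decorator


@require_scopes("semantic:write")
async def nc_enable_vector_sync(ctx: FakeContext) -> str:
    return "sync enabled"


async def demo() -> tuple[str, str]:
    ok = await nc_enable_vector_sync(ctx=FakeContext({"semantic:read", "semantic:write"}))
    try:
        # App scopes and semantic:read do not imply semantic:write.
        await nc_enable_vector_sync(ctx=FakeContext({"notes:read", "semantic:read"}))
        denied = "unexpectedly allowed"
    except InsufficientScopeError as exc:
        denied = str(exc)
    return ok, denied


print(asyncio.run(demo()))
```

In this sketch a token holding notes:read and semantic:read is still rejected by a semantic:write tool, which is the separation the decision describes.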
Tool Scope Assignments
Read Operations:
@mcp.tool()
@require_scopes("semantic:read")
async def nc_semantic_search(query: str, ctx: Context, limit: int = 10, score_threshold: float = 0.7) -> SemanticSearchResponse:
    """Semantic search across all indexed Nextcloud apps"""

@mcp.tool()
@require_scopes("semantic:read")
async def nc_semantic_search_answer(query: str, ctx: Context, limit: int = 5, max_answer_tokens: int = 500) -> SamplingSearchResponse:
    """Semantic search with LLM-generated answer via MCP sampling"""

@mcp.tool()
@require_scopes("semantic:read")
async def nc_get_vector_sync_status(ctx: Context) -> VectorSyncStatusResponse:
    """Get current vector synchronization status (indexed count, pending count, status)"""
Write Operations:
@mcp.tool()
@require_scopes("semantic:write")
async def nc_enable_vector_sync(ctx: Context) -> VectorSyncResponse:
    """Enable background vector synchronization for this user"""

@mcp.tool()
@require_scopes("semantic:write")
async def nc_disable_vector_sync(ctx: Context) -> VectorSyncResponse:
    """Disable background vector synchronization"""
Dual-Phase Authorization
To ensure users can only access documents they have permission to view, semantic search implements dual-phase authorization:
Phase 1: Scope Check (MCP Server)
- User must have semantic:read scope to call semantic search tools
- This grants permission to query the vector database
Phase 2: Document Verification (Per-Result Filtering)
- For each returned document, verify user has access via app-specific permissions
- Uses DocumentVerifier interface per app:
  - Notes: Call /apps/notes/api/v1/notes/{id} - if 404/403, exclude from results
  - Calendar: Call /remote.php/dav/calendars/username/calendar/event.ics - if 404/403, exclude
  - Deck: Call /apps/deck/api/v1.0/boards/{board_id}/stacks/{stack_id}/cards/{card_id} - if 404/403, exclude
  - Files: Call /remote.php/dav/files/username/path with PROPFIND - if 404/403, exclude
  - Contacts: Call /remote.php/dav/addressbooks/username/addressbook/contact.vcf - if 404/403, exclude
This two-phase approach ensures:
- Semantic search is a distinct capability (like "global search") requiring explicit consent
- Results are filtered to only include documents the user can access
- No privilege escalation - users can't discover content they shouldn't see
Implementation: See ADR-007 Phase 3 (Document Verification) and DocumentVerifier interface.
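As a rough sketch of Phase 2, the filter below drops every hit whose source app does not confirm access. The `can_access` method, the `SearchHit` type, and the in-memory `StatusCodeVerifier` are assumptions for illustration; the real DocumentVerifier interface is defined in ADR-007 and would issue the HTTP calls listed above. The sketch treats only 2xx responses as accessible, which is stricter than "exclude on 404/403" and is also an assumption.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SearchHit:
    app: str      # source app, e.g. "notes" or "calendar"
    doc_id: str
    score: float


class DocumentVerifier(Protocol):
    """Assumed shape of the per-app verifier; see ADR-007 for the real one."""

    async def can_access(self, user_id: str, doc_id: str) -> bool: ...


class StatusCodeVerifier:
    """Simulates an app API by mapping (user, doc) to an HTTP status code."""

    def __init__(self, statuses: dict[tuple[str, str], int]) -> None:
        self._statuses = statuses

    async def can_access(self, user_id: str, doc_id: str) -> bool:
        # Unknown documents behave like a 404. Only 2xx counts as accessible,
        # which excludes 403/404 and also fails closed on other errors.
        status = self._statuses.get((user_id, doc_id), 404)
        return 200 <= status < 300


async def filter_accessible(
    user_id: str,
    hits: list[SearchHit],
    verifiers: dict[str, DocumentVerifier],
) -> list[SearchHit]:
    """Phase 2: keep only hits the user can access in the source app."""

    async def check(hit: SearchHit) -> bool:
        verifier = verifiers.get(hit.app)
        # Fail closed: an app without a registered verifier yields no results.
        return verifier is not None and await verifier.can_access(user_id, hit.doc_id)

    allowed = await asyncio.gather(*(check(hit) for hit in hits))
    return [hit for hit, ok in zip(hits, allowed) if ok]


VERIFIERS: dict[str, DocumentVerifier] = {
    "notes": StatusCodeVerifier({("alice", "1"): 200, ("alice", "2"): 403}),
    "calendar": StatusCodeVerifier({("alice", "evt"): 200}),
}
HITS = [
    SearchHit("notes", "1", 0.91),
    SearchHit("notes", "2", 0.88),       # 403: access was revoked
    SearchHit("calendar", "evt", 0.80),
    SearchHit("deck", "9", 0.75),        # no verifier registered for deck
]

visible = asyncio.run(filter_accessible("alice", HITS, VERIFIERS))
print([(hit.app, hit.doc_id) for hit in visible])
```

Running the checks concurrently keeps per-result verification from multiplying latency by the number of hits, and failing closed means a stale vector index cannot leak a document whose access has since been revoked.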
Scope Discovery
The new scopes will be:
- Advertised via PRM endpoint (/.well-known/oauth-protected-resource/mcp)
- Dynamically discovered from @require_scopes decorators on semantic search tools
- Documented in OAuth architecture (oauth-architecture.md)
- Included in default client registration scopes
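One way decorator-based discovery could feed the PRM document is sketched below. It assumes the decorator records its scopes on a `required_scopes` attribute (a hypothetical name) and that the PRM response follows OAuth 2.0 Protected Resource Metadata, which defines the `resource`, `authorization_servers`, and `scopes_supported` fields. The URLs and the `nc_notes_get_note` tool are placeholders.

```python
import json


def require_scopes(*scopes: str):
    """Minimal stand-in that only records the scopes on the function."""
    def decorator(func):
        func.required_scopes = frozenset(scopes)  # hypothetical attribute name
        return func
    return decorator


@require_scopes("semantic:read")
def nc_semantic_search() -> None: ...


@require_scopes("semantic:write")
def nc_enable_vector_sync() -> None: ...


@require_scopes("notes:read")
def nc_notes_get_note() -> None: ...  # placeholder app-specific tool


def discover_scopes(tools) -> list[str]:
    """Union of all scopes declared on the registered tools, sorted."""
    found: set[str] = set()
    for tool in tools:
        found |= getattr(tool, "required_scopes", frozenset())
    return sorted(found)


def build_prm(resource: str, authorization_servers: list[str], tools) -> dict:
    """Assemble a protected-resource metadata document from the tool registry."""
    return {
        "resource": resource,
        "authorization_servers": authorization_servers,
        "scopes_supported": discover_scopes(tools),
    }


TOOLS = [nc_semantic_search, nc_enable_vector_sync, nc_notes_get_note]
prm = build_prm("https://mcp.example.com/mcp", ["https://cloud.example.com"], TOOLS)
print(json.dumps(prm, indent=2))
```

Because the advertised list is derived from the decorators, adding the semantic tools is enough to publish semantic:read and semantic:write without editing the endpoint by hand.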
Consequences
Benefits
User Experience:
- Simple authorization: one scope for semantic search capability
- Progressive enablement: grant semantic:read, enable indexing for apps later
- Natural mental model: "semantic search" is a distinct feature deserving its own scope
Security:
- Dual-phase authorization prevents privilege escalation
- Users explicitly consent to cross-app search capability
- Per-document verification ensures users only see accessible content
Maintainability:
- Adding new apps to vector sync doesn't require OAuth scope changes
- Clear separation between app access (notes:read) and search capability (semantic:read)
- Logical grouping of related operations (search, sync status, enable/disable)
Future-Proof:
- Can add new document types without breaking existing OAuth flows
- Supports future semantic features (recommendations, clustering) under same scope
- Aligns with potential future Nextcloud semantic capabilities
Trade-offs
Less Granular Than App-Specific Scopes:
- User can't grant "semantic search notes only"
- Semantic search is all-or-nothing across enabled apps
- Mitigation: Dual-phase verification ensures users only see documents they can access
New Scope to Learn:
- Users must understand semantic:read is distinct from app scopes
- MCP clients must present scope clearly during consent
- Mitigation: Clear scope descriptions in OAuth consent UI and documentation
Backend Complexity:
- Requires dual-phase authorization implementation
- DocumentVerifier interface needed for each app
- Benefit: Enforces proper security regardless of scope model
Migration Impact
Breaking Change: Existing deployments using notes-specific semantic search will break.
Before (OLD - Breaking):
@mcp.tool()
@require_scopes("notes:read")
async def nc_notes_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Semantic search notes"""
After (NEW):
@mcp.tool()
@require_scopes("semantic:read")
async def nc_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Semantic search across all apps"""
Migration Path:
- Deploy server with new semantic:read scope
- Users re-authenticate, granting semantic:read scope
- Semantic search tools become visible/usable again
- No data loss: Vector database and indexed documents remain unchanged
Backward Compatibility: None. This is an intentional breaking change to correct the scope model before broader adoption.
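A deployment could detect which sessions need the re-authentication step with a check along these lines. This is a sketch under the assumption that granted scopes are available as a space-delimited string, the usual OAuth 2.0 representation; `needs_reauthentication` and the example scope strings (including `openid`) are hypothetical.

```python
# Assumption: the token's granted scopes arrive as one space-delimited string.
REQUIRED_SCOPE = "semantic:read"


def needs_reauthentication(token_scope: str, required: str = REQUIRED_SCOPE) -> bool:
    """Return True when a token predates the new scope and must be reissued."""
    return required not in token_scope.split()


legacy_token_scope = "openid notes:read notes:write"
migrated_token_scope = "openid notes:read semantic:read"

print(needs_reauthentication(legacy_token_scope))    # token issued before the change
print(needs_reauthentication(migrated_token_scope))  # token issued after re-authentication
```

Splitting on whitespace before the membership test avoids false positives from substring matches, for example a scope named semantic:readonly.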
Alternatives Considered
Keep Notes-Specific Scopes
Approach: Continue using notes:read for semantic search, even when searching other apps.
Rejected Because:
- Semantically incorrect - searching calendar events is not "reading notes"
- Confuses users - why does searching calendar require notes:read?
- Doesn't scale - what scope for multi-app search?
Create Per-App Semantic Scopes
Approach: Introduce notes:semantic, calendar:semantic, deck:semantic, etc.
Rejected Because:
- Scope proliferation - doubles the number of scopes
- Defeats purpose of unified vector search
- Users would need to grant 5+ scopes for cross-app search
- No clear benefit over dual-phase authorization with semantic:read
Require All App Scopes (Already Rejected in Option 1)
Approach: Require notes:read AND calendar:read AND deck:read AND files:read AND contacts:read
Rejected Because: Unusable UX (see Option 1 disadvantages above)
Related Decisions
ADR-007: Background Vector Sync provides the indexing architecture that semantic scopes protect. The DocumentVerifier interface from ADR-007 Phase 3 implements dual-phase authorization.
ADR-008: MCP Sampling for semantic search uses semantic:read to protect the sampling-enhanced search tool.
ADR-004: Progressive Consent architecture supports users granting semantic:read initially, then enabling per-app indexing via semantic:write (nc_enable_vector_sync with app selection).
Implementation Checklist
- Create ADR-009 document (this file)
- Update oauth-architecture.md to document semantic:read and semantic:write scopes ✅
- Update README.md to show Semantic Search as separate tool category ✅
- Update ADR-007 to reference semantic:* scopes instead of sync:* ✅
- Update ADR-008 to use semantic:read instead of notes:read ✅
- Implement DocumentVerifier interface for all apps (notes, calendar, deck, files, contacts)
- Update semantic search tools to use @require_scopes("semantic:read")
- Update vector sync tools to use @require_scopes("semantic:write")
- Add dual-phase authorization to semantic search implementation
- Test OAuth flow with semantic:read scope
- Update scope discovery in PRM endpoint
- Document migration path for existing deployments