feat: implement semantic search tool and fix vector sync issues (ADR-007 Phase 3)

Completes the ADR-007 implementation by adding user-facing semantic search functionality. Previous phases implemented scanner and processor for background indexing; this adds the query interface.

Changes:
- Add nc_notes_semantic_search MCP tool for natural language queries
- Fix Qdrant point IDs to use UUIDs instead of strings (was causing 400 errors)
- Reduce scan interval default from 1 hour to 5 minutes for faster updates
- Add SemanticSearchResult and SemanticSearchNotesResponse models
- Implement dual-phase authorization (Qdrant filter + Nextcloud API verification)

The semantic search enables finding notes by meaning rather than exact keywords, using vector embeddings to understand query intent. Point ID fix resolves critical bug where all document indexing failed with "invalid point ID" errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
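
The dual-phase authorization mentioned in the commit message is not visible in this diff, so the following is only a minimal sketch of the idea under stated assumptions. `Hit`, `authorize_hits`, and the `can_access` callback are hypothetical names, not the project's API. Phase 1 is simulated here as an in-memory ownership check (in the commit it is described as a Qdrant filter), and phase 2 stands in for the Nextcloud API verification.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Hit:
    """Hypothetical search hit: note ID, owning user, similarity score."""
    note_id: int
    user_id: str
    score: float


def authorize_hits(
    hits: Iterable[Hit],
    user_id: str,
    can_access: Callable[[int], bool],
    limit: int = 10,
) -> List[Hit]:
    """Return at most `limit` hits that pass both authorization phases."""
    allowed: List[Hit] = []
    # Walk the hits from best to worst score.
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        # Phase 1 (assumption: stands in for the vector-store filter):
        # drop anything indexed for a different user.
        if hit.user_id != user_id:
            continue
        # Phase 2 (assumption: stands in for the Nextcloud API check):
        # confirm the note is still accessible right now.
        if can_access(hit.note_id):
            allowed.append(hit)
        if len(allowed) == limit:
            break
    return allowed


hits = [Hit(1, "alice", 0.9), Hit(2, "bob", 0.8), Hit(3, "alice", 0.7)]
still_accessible = {1}  # pretend note 3 was deleted after indexing
result = authorize_hits(hits, "alice", lambda note_id: note_id in still_accessible)
```

The second phase matters because the index can lag behind the source of truth: a note deleted or unshared after indexing would still match the stored `user_id` until the next scan.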
@@ -6,6 +6,7 @@ Processes documents from queue: fetches content, generates embeddings, stores in
 import asyncio
 import logging
 import time
+import uuid
 
 import anyio
 from httpx import HTTPStatusError
@@ -187,9 +188,14 @@ async def _index_document(
     points = []
 
     for i, (chunk, embedding) in enumerate(zip(chunks, embeddings)):
+        # Generate deterministic UUID for point ID
+        # Using uuid5 with DNS namespace and combining doc info
+        point_name = f"{doc_task.doc_type}:{doc_task.doc_id}:chunk:{i}"
+        point_id = str(uuid.uuid5(uuid.NAMESPACE_DNS, point_name))
+
         points.append(
             PointStruct(
-                id=f"{doc_task.doc_type}_{doc_task.doc_id}_{i}",
+                id=point_id,
                 vector=embedding,
                 payload={
                     "user_id": doc_task.user_id,
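
To illustrate why the change above keeps re-indexing idempotent, here is a small standalone sketch of the same `uuid5` scheme. The helper name `make_point_id` and the sample values are assumptions made for illustration; only the name format and the `uuid.NAMESPACE_DNS` namespace are taken from the diff.

```python
import uuid


def make_point_id(doc_type: str, doc_id: str, chunk_index: int) -> str:
    """Derive a stable UUID string from document identity and chunk index.

    Hypothetical helper mirroring the name format used in the diff.
    """
    point_name = f"{doc_type}:{doc_id}:chunk:{chunk_index}"
    # uuid5 hashes (namespace, name) with SHA-1, so the same inputs
    # always produce the same UUID across runs and machines.
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, point_name))


first = make_point_id("note", "42", 0)
again = make_point_id("note", "42", 0)
other = make_point_id("note", "42", 1)
```

Because the ID depends only on the document type, document ID, and chunk index, indexing the same chunk again should overwrite the existing point rather than create a duplicate, while different chunks get different IDs.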