Commit Graph

24 Commits

Author SHA1 Message Date
Chris Coutinho 327d843f64 feat: Implement per-chunk vector visualization with context expansion
Major improvements to vector visualization page:
- Refactor PCA to display individual chunks instead of averaged documents
- Add context expansion module for fetching surrounding text from notes and PDFs
- Update deduplication to use (doc_id, doc_type, chunk_start, chunk_end) keys
- Fix Alpine.js rendering with chunk-specific keys including offsets
- Refactor authentication helper to return NextcloudClient for better reuse
- Add async context manager support to NextcloudClient
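
As a rough illustration of the deduplication key described above, the sketch below keeps the first search hit per (doc_id, doc_type, chunk_start, chunk_end) tuple. The function name `dedupe_chunks` and the dict-shaped results are assumptions made for this example, not the repository's actual code.

```python
from typing import Any


def dedupe_chunks(results: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the first result for each chunk key, preserving input order.

    Hypothetical helper: assumes each result is a dict carrying the four
    key fields named in the commit message.
    """
    seen: set[tuple[Any, Any, Any, Any]] = set()
    unique: list[dict[str, Any]] = []
    for result in results:
        key = (
            result["doc_id"],
            result["doc_type"],
            result["chunk_start"],
            result["chunk_end"],
        )
        if key in seen:
            continue  # same chunk already kept
        seen.add(key)
        unique.append(result)
    return unique
```

Keying on the chunk offsets rather than on doc_id alone is what lets two different chunks of the same document both survive.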

Technical details:
- viz_routes.py: Fetch specific chunk vectors instead of averaging per document
- context.py: New module supporting both notes and PDF text extraction via PyMuPDF
- search algorithms: Extract page_number, chunk_index, total_chunks from Qdrant
- vector-viz.js/html: Use chunk positions in expansion tracking keys

This enables users to see which specific chunks match their query
and view them with surrounding context in the PCA visualization.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 11:22:20 +01:00
Chris Coutinho b8010270c1 fix: Add async/await, PDF metadata, and type safety fixes
This commit addresses multiple issues with async operations, PDF metadata
extraction, and type safety in document processing and search.

## Async/Await Fixes
- processor.py:259 - Added await for chunker.chunk_text(content)
- processor.py:270 - Added await for bm25_service.encode_batch(chunk_texts)
- tests/unit/test_document_chunker.py - Converted all 12 test methods to async

## PDF Metadata Enhancement
- pymupdf.py:143 - Added file_size metadata extraction
- pymupdf.py:145-206 - Refactored to extract text page-by-page
  - Manually loop through pages instead of using page_chunks=True
  - Generate page_boundaries metadata for precise page tracking
  - Works around pymupdf.layout.activate() breaking page_chunks=True
- processor.py:32-66 - Added assign_page_numbers() helper function
  - Assigns page numbers to chunks based on overlap with page boundaries
  - Handles chunks spanning multiple pages
- processor.py:298-300 - Call assign_page_numbers() for PDF files
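
One possible reading of the overlap-based assignment is sketched below: each chunk gets the page with which it shares the most characters. The `Chunk` dataclass, the `(page, start, end)` tuple layout for page boundaries, and the largest-overlap rule for chunks spanning several pages are all assumptions for illustration; the real assign_page_numbers() in processor.py may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    """Hypothetical chunk record with character offsets into the full text."""
    start: int
    end: int
    page_number: Optional[int] = None


def assign_page_numbers_sketch(
    chunks: list[Chunk],
    page_boundaries: list[tuple[int, int, int]],
) -> list[Chunk]:
    """Assign each chunk the page it overlaps most.

    page_boundaries holds (page, start_offset, end_offset) tuples with
    half-open offsets; this layout is an assumption of the sketch.
    """
    for chunk in chunks:
        best_page: Optional[int] = None
        best_overlap = 0
        for page, page_start, page_end in page_boundaries:
            # Length of the intersection of the two half-open ranges.
            overlap = min(chunk.end, page_end) - max(chunk.start, page_start)
            if overlap > best_overlap:
                best_page, best_overlap = page, overlap
        chunk.page_number = best_page  # stays None if nothing overlaps
    return chunks
```

A chunk that straddles a page break is attributed to the page holding most of its text, and ties go to the earlier page.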

## Type Safety Fixes
- bm25_hybrid.py:184 - Removed int() conversion of doc_id
- semantic.py:131 - Removed int() conversion of doc_id
- viz_routes.py:275 - Removed int() conversion of doc_id
- Added comments documenting that doc_id can be int (notes) or str (file paths)
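
To see why the int() conversions had to go, consider the minimal sketch below. The `DocId` alias and both helper functions are hypothetical names for this example; only the fact that doc_id may be an int (notes) or a str (file paths) comes from the commit message.

```python
from typing import Union

# Hypothetical alias: notes use integer IDs, files use path strings.
DocId = Union[int, str]


def legacy_doc_id(raw: DocId) -> int:
    """Mimics the removed behavior: force every ID through int()."""
    return int(raw)  # raises ValueError for a file path


def doc_id_passthrough(raw: DocId) -> DocId:
    """Mimics the fixed behavior: leave the ID in its original type."""
    return raw
```

With the legacy helper, a path such as "Documents/report.pdf" raises ValueError, while the pass-through version handles both ID kinds.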

## Testing
- All 18 tests passing (12 unit + 6 integration)
- No type errors in modified files
- Container logs show successful processing
- Vector viz searches working correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 02:37:07 +01:00
Chris Coutinho f4759e424d feat: add webhook management UI and BeforeNodeDeletedEvent support
Added comprehensive webhook management capabilities including:

Webhook Client & API:
- Added WebhooksClient for Nextcloud webhooks API integration
- Create, list, update, and delete webhooks programmatically
- Support for event filters in webhook registration

Webhook Presets:
- Added preset system for common webhook configurations
- notes_sync: BeforeNodeDeletedEvent for Notes file operations
- calendar_sync: Calendar events (create, update, delete)
- deck_sync: Deck card operations
- files_sync: File system changes
- forms_sync: Form submissions (conditional)
- Filter presets by installed apps

Admin UI:
- Added multi-pane app view with tabs (User Info, Vector Sync, Webhooks)
- Webhooks tab for admin users only
- Enable/disable preset webhooks via UI
- View currently registered webhooks
- Uses htmx for dynamic loading and Alpine.js for tab state
- Admin permission checking via OCS API

CLI Improvements:
- Refactored CLI to separate module (cli.py)
- Updated entry point in pyproject.toml

BeforeNodeDeletedEvent Fix:
- Updated ADR-010 to document NodeDeletedEvent issue
- BeforeNodeDeletedEvent includes node.id before deletion
- NodeDeletedEvent lacks node.id (file already deleted)
- Implemented per Nextcloud maintainer recommendation

Testing:
- Added comprehensive webhook client tests
- Added webhook preset filtering tests
- Added admin permission tests

Configuration:
- Updated docker-compose.yml Qdrant settings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 20:35:08 +01:00
Chris Coutinho a6e5f3d8ff refactor: simplify OpenTelemetry tracing configuration
Simplifies the OpenTelemetry tracing setup by removing the redundant
OTEL_ENABLED flag and using the presence of OTEL_EXPORTER_OTLP_ENDPOINT
to determine if tracing should be enabled. This follows the standard
OpenTelemetry environment variable conventions more closely.

Changes:
- Remove OTEL_ENABLED/tracing_enabled flag in favor of checking if
  OTEL_EXPORTER_OTLP_ENDPOINT is set
- Add OTEL_EXPORTER_VERIFY_SSL configuration option for OTLP endpoints
  with self-signed certificates (defaults to false for development)
- Move HTTPXClientInstrumentor initialization to module level to ensure
  httpx calls are traced across all Nextcloud API requests
- Add tracing spans to vector sync operations (scan_user_documents)
- Fix authorization header logging to only warn about missing headers
  in OAuth mode (BasicAuth mode doesn't use Authorization headers)
- Update observability documentation to reflect simplified configuration
- Refactor Dockerfile to use --no-editable flag for uv sync

Breaking changes:
- OTEL_ENABLED environment variable is removed
- Tracing is now automatically enabled when OTEL_EXPORTER_OTLP_ENDPOINT
  is set

Migration guide:
- Remove OTEL_ENABLED=true from environment configuration
- Tracing will be enabled automatically if OTEL_EXPORTER_OTLP_ENDPOINT
  is configured
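
The endpoint-presence check described above could look roughly like the sketch below. The function name `tracing_enabled` and the decision to treat a blank value as unset are assumptions; only the OTEL_EXPORTER_OTLP_ENDPOINT variable name comes from the commit message.

```python
import os
from typing import Mapping, Optional


def tracing_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when an OTLP endpoint is configured.

    Hypothetical helper: a blank or whitespace-only value counts as unset,
    which is an assumption of this sketch.
    """
    if env is None:
        env = os.environ
    return bool(env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip())
```

Under this scheme a leftover OTEL_ENABLED=true has no effect on its own, which matches the migration note.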

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 22:48:37 +01:00
Chris Coutinho 92e18825bc feat(caldav): Add support for tasks 2025-10-19 18:02:43 +02:00
Chris Coutinho 31ffeba69b chore: Move timeout to recipe import 2025-10-18 23:12:31 +02:00
Chris Coutinho 37164dbdbc chore: sort imports 2025-10-18 22:02:25 +02:00
Chris Coutinho 83917b3786 perf(notes): Improve notes search performance using async iterators 2025-10-18 22:02:19 +02:00
Chris Coutinho 8e7191e0ea fix: Increase HTTP client timeout to 30s
The default 5s timeout was too short for the Nextcloud Cookbook app to fetch and process recipes from external URLs, causing intermittent test failures with ReadTimeout errors.

Fixes intermittent CI failures in cookbook import tests.
2025-10-17 04:41:28 +02:00
Chris Coutinho 9de59db718 feat(cookbook): Add full Cookbook app support with 13 tools and 2 resources
- Import recipes from URLs using schema.org metadata
- Full CRUD operations for recipes
- Search, categorize, and organize recipes
- Manage keywords/tags and categories
- Configure app settings and trigger reindexing
2025-10-17 03:08:16 +02:00
Chris Coutinho 85f8522085 feat: Add Groups API client 2025-10-15 03:43:25 +02:00
Chris Coutinho a38c795124 feat: add sharing API client and server tools 2025-10-15 02:59:26 +02:00
Chris Coutinho 898c2e72ae Merge remote-tracking branch 'origin/master' into feature/user-api 2025-10-14 23:43:03 +02:00
Chris Coutinho 13e4915e38 test: Remove unused pytest fixtures 2025-10-14 01:23:39 +02:00
Chris Coutinho 2b11718c43 test: continue working on oauth client 2025-10-14 01:23:30 +02:00
Chris Coutinho 33b962a7fc test: Setup interactive browser test 2025-10-14 01:23:30 +02:00
Chris Coutinho 4d7e4b9a4b feat(server): Experimental support for OAuth2/OIDC authentication 2025-10-14 01:22:15 +02:00
Chris Coutinho 961f23b5ea feat(users): Initialize user API client 2025-09-11 09:42:42 +02:00
Chris Coutinho 167053578d feat(deck): Initialize Deck app client/server 2025-09-11 00:10:25 +02:00
Chris Coutinho 3836534205 fix(client): Strip cookies from responses to avoid falsely raising CSRF errors 2025-08-08 21:03:16 +02:00
Chris Coutinho 37b1057d2a feat(contacts): Initialize Contacts App 2025-08-03 14:15:37 +02:00
Chris Coutinho 8956945e9d chore: sort imports 2025-08-01 12:21:32 +02:00
Neovasky 7291c930c4 feat(calendar): add comprehensive Calendar app support via CalDAV protocol
- Add complete CalDAV client implementation following Nextcloud patterns
- Implement 11 comprehensive calendar MCP tools:
  * nc_calendar_list_calendars - list available calendars
  * nc_calendar_create_event - full event creation with recurrence, reminders, attendees
  * nc_calendar_list_events - enhanced with advanced filtering capabilities
  * nc_calendar_get_event - detailed event information retrieval
  * nc_calendar_update_event - comprehensive event modification
  * nc_calendar_delete_event - event removal
  * nc_calendar_create_meeting - quick meeting creation with smart defaults
  * nc_calendar_get_upcoming_events - upcoming events in next N days
  * nc_calendar_find_availability - intelligent scheduling with conflict detection
  * nc_calendar_bulk_operations - batch update/delete/move operations
  * nc_calendar_manage_calendar - calendar creation and management

- Add CalDAV and iCalendar dependencies to support calendar operations
- Implement comprehensive integration tests (11 test cases covering all scenarios)
- Update documentation with complete calendar tools reference and usage examples

Resolves #74
2025-07-27 00:25:31 -04:00
Chris Coutinho e50be7db07 chore: Move clients into separate submodule 2025-07-07 00:06:24 +02:00