2147fc1696
Refactors PR #190's hardcoded Unstructured.io integration into a flexible, extensible plugin system supporting multiple text extraction engines. - **`DocumentProcessor` ABC**: Abstract interface for all processors - **`ProcessorRegistry`**: Central registry for discovery and routing - **`ProcessingResult`**: Standardized output format across processors - **`UnstructuredProcessor`**: Refactored from `UnstructuredClient` - **`TesseractProcessor`**: Local OCR for images (lightweight alternative) - **`CustomHTTPProcessor`**: Generic wrapper for custom HTTP APIs - New `get_document_processor_config()` returns structured config - Supports enabling/disabling individual processors - Per-processor configuration via environment variables - **Breaking Change**: `ENABLE_UNSTRUCTURED_PARSING` replaced with: - `ENABLE_DOCUMENT_PROCESSING=true/false` (master switch) - `ENABLE_UNSTRUCTURED=true/false` (per-processor) - `ENABLE_TESSERACT=true/false` - `ENABLE_CUSTOM_PROCESSOR=true/false` - `parse_document()` now uses `ProcessorRegistry` - Auto-selects appropriate processor based on MIME type - Processor priority system (Unstructured=10, Tesseract=5, Custom=1) - `initialize_document_processors()` registers processors at startup - Integrated into both BasicAuth and OAuth lifespans - Graceful degradation if processors fail to initialize ```env ENABLE_DOCUMENT_PROCESSING=false ENABLE_UNSTRUCTURED=false UNSTRUCTURED_API_URL=http://unstructured:8000 UNSTRUCTURED_STRATEGY=auto # auto|fast|hi_res UNSTRUCTURED_LANGUAGES=eng,deu ENABLE_TESSERACT=false TESSERACT_LANG=eng ENABLE_CUSTOM_PROCESSOR=false CUSTOM_PROCESSOR_URL=http://localhost:9000/process CUSTOM_PROCESSOR_TYPES=application/pdf,image/jpeg ``` - **Removed**: `tests/test_unstructured_config.py` (legacy tests) - **Added**: `tests/unit/test_document_processor_config.py` - 7 unit tests for new config system - Tests individual and multi-processor configurations - **Added**: - `nextcloud_mcp_server/document_processors/__init__.py` - `nextcloud_mcp_server/document_processors/base.py` - `nextcloud_mcp_server/document_processors/registry.py` - `nextcloud_mcp_server/document_processors/unstructured.py` - `nextcloud_mcp_server/document_processors/tesseract.py` - `nextcloud_mcp_server/document_processors/custom_http.py` - `tests/unit/test_document_processor_config.py` - **Modified**: - `nextcloud_mcp_server/config.py` - New plugin config system - `nextcloud_mcp_server/app.py` - Processor initialization - `nextcloud_mcp_server/utils/document_parser.py` - Uses registry - `nextcloud_mcp_server/server/webdav.py` - Import updates - `env.sample` - New configuration format - `docker-compose.yml` - (profile changes from previous work) - **Removed**: - `nextcloud_mcp_server/client/unstructured_client.py` - Replaced by UnstructuredProcessor - `tests/test_unstructured_config.py` - Replaced with new tests ✅ **Extensible**: Add processors without modifying core code ✅ **Testable**: Mock processors for unit tests ✅ **Configurable**: Enable only needed processors ✅ **Flexible**: Choose fast (Tesseract) vs accurate (Unstructured) ✅ **Opt-in**: Disabled by default, no mandatory dependencies Users upgrading from PR #190 need to update environment variables: ```bash ENABLE_UNSTRUCTURED_PARSING=true ENABLE_DOCUMENT_PROCESSING=true ENABLE_UNSTRUCTURED=true ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
156 lines
5.0 KiB
Markdown
156 lines
5.0 KiB
Markdown
# Test Suite Reorganization Summary
|
|
|
|
## Completed: 2025-10-24
|
|
|
|
### Changes Implemented
|
|
|
|
#### 1. Added Test Layer Markers
|
|
**File**: `pyproject.toml`
|
|
|
|
Added four test markers to enable selective test execution:
|
|
- `@pytest.mark.unit` - Fast unit tests with mocked dependencies
|
|
- `@pytest.mark.integration` - Integration tests requiring Docker containers
|
|
- `@pytest.mark.oauth` - OAuth tests requiring Playwright (slowest)
|
|
- `@pytest.mark.smoke` - Critical path smoke tests
|
|
|
|
#### 2. Created Unit Test Suite
|
|
**Directory**: `tests/unit/`
|
|
|
|
Added fast unit tests (~5 seconds total):
|
|
- `test_scope_decorator.py` (5 tests) - Scope decorator metadata logic
|
|
- `test_response_models.py` (6 tests) - Pydantic model serialization
|
|
|
|
**Total**: 11 unit tests
|
|
|
|
#### 3. Reorganized OAuth Tests
|
|
**Directory**: `tests/server/oauth/`
|
|
|
|
Moved all OAuth-related tests to dedicated subdirectory:
|
|
- Created `test_oauth_core.py` - consolidated basic OAuth connectivity tests
|
|
- Moved 7 OAuth test files to `oauth/` subdirectory
|
|
- Fixed relative imports (`..conftest` → `...conftest`)
|
|
|
|
**Files**:
|
|
- `test_oauth_core.py` - Basic OAuth connectivity & JWT operations (8 tests)
|
|
- `test_scope_authorization.py` - Scope filtering & enforcement (16 tests)
|
|
- `test_introspection_authorization.py` - Token introspection auth (5 tests)
|
|
- `test_dcr_token_type.py` - Dynamic client registration (3 tests)
|
|
- `test_oauth_notes_permissions.py` - Notes app permissions (4 tests)
|
|
- `test_oauth_deck_permissions.py` - Deck app permissions (4 tests)
|
|
- `test_oauth_file_permissions.py` - Files app permissions (4 tests)
|
|
|
|
**Total**: ~48 OAuth tests
|
|
|
|
#### 4. Created Smoke Test Suite
|
|
**Directory**: `tests/smoke/`
|
|
|
|
Added critical path validation tests (~30-60 seconds):
|
|
- `test_smoke.py` (5 tests) - Essential functionality validation
|
|
- MCP connectivity
|
|
- Notes CRUD
|
|
- Calendar basic operations
|
|
- WebDAV basic operations
|
|
- OAuth connectivity
|
|
|
|
#### 5. Updated Documentation
|
|
**File**: `CLAUDE.md`
|
|
|
|
Added comprehensive test execution guide:
|
|
```bash
|
|
# Fast feedback (unit tests) - ~5 seconds
|
|
uv run pytest tests/unit/ -v
|
|
|
|
# Smoke tests - ~30-60 seconds
|
|
uv run pytest -m smoke -v
|
|
|
|
# Integration without OAuth - ~2-3 minutes
|
|
uv run pytest -m "integration and not oauth" -v
|
|
|
|
# Full suite - ~4-5 minutes
|
|
uv run pytest
|
|
|
|
# OAuth only - ~3 minutes
|
|
uv run pytest -m oauth -v
|
|
```
|
|
|
|
Added test structure diagram and marker documentation.
|
|
|
|
### Test Suite Metrics
|
|
|
|
**Before Reorganization**:
|
|
- ~235 tests, all integration
|
|
- No fast feedback loop
|
|
- All tests take ~5-7 minutes
|
|
- OAuth tests scattered across 9 files
|
|
|
|
**After Reorganization**:
|
|
- 234 tests total (11 unit + 5 smoke + ~218 integration)
|
|
- **Fast feedback**: unit tests in ~5 seconds
|
|
- **Quick validation**: smoke tests in ~30-60 seconds
|
|
- **Focused testing**: integration without OAuth in ~2-3 minutes
|
|
- **Full suite**: ~4-5 minutes
|
|
- OAuth tests consolidated in dedicated directory
|
|
|
|
### Feedback Time Improvements
|
|
|
|
| Test Type | Count | Time | Use Case |
|
|
|-----------|-------|------|----------|
|
|
| Unit only | 11 | ~5s | Logic changes, model updates |
|
|
| Smoke only | 5 | ~30-60s | Critical path validation |
|
|
| Integration (no OAuth) | ~172 | ~2-3min | API/MCP changes |
|
|
| OAuth only | 48 | ~3min | OAuth feature work |
|
|
| **Full suite** | **234** | **~4-5min** | **Pre-commit validation** |
|
|
|
|
### Key Benefits
|
|
|
|
1. **Fast Development Feedback**
|
|
- Unit tests run in 5 seconds vs. 5+ minutes
|
|
- Immediate validation for logic changes
|
|
|
|
2. **Efficient CI/CD**
|
|
- Can run unit tests on every commit
|
|
- Run smoke tests for pull requests
|
|
- Full suite for merge to main
|
|
|
|
3. **Better Organization**
|
|
- OAuth tests grouped together
|
|
- Clear test purpose from directory structure
|
|
- Easier to navigate and maintain
|
|
|
|
4. **Selective Execution**
|
|
- Skip slow OAuth tests during development
|
|
- Run only relevant test layer
|
|
- Faster iteration cycles
|
|
|
|
### Migration Notes
|
|
|
|
- **No breaking changes** to existing tests
|
|
- All tests continue to work as before
|
|
- Legacy commands still supported (`-m integration`, etc.)
|
|
- OAuth tests moved to subdirectory, imports updated
|
|
- Removed duplicate tests consolidated into `test_oauth_core.py`
|
|
|
|
### Next Steps (Optional Future Work)
|
|
|
|
1. **Further Consolidation**: Merge remaining OAuth permission tests
|
|
2. **More Unit Tests**: Add unit tests for client initialization, search logic
|
|
3. **Client/Server Deduplication**: Reduce overlap between client and server tests
|
|
4. **CI Pipeline**: Configure GitHub Actions to run test layers separately
|
|
5. **Performance**: Optimize fixtures to reduce setup time
|
|
|
|
### Commands Reference
|
|
|
|
```bash
|
|
# Development workflow
|
|
uv run pytest tests/unit/ -v # Check logic changes
|
|
uv run pytest -m smoke -v # Quick validation
|
|
uv run pytest -m "integration and not oauth" -v # Full validation without slow tests
|
|
|
|
# Before committing
|
|
uv run pytest # Run everything
|
|
|
|
# Working on OAuth features
|
|
uv run pytest tests/server/oauth/ -v # OAuth tests only
|
|
uv run pytest -m oauth --browser firefox --headed -v # Debug OAuth with visible browser
|
|
```
|