nextcloud-mcp-server/TEST_REORGANIZATION_SUMMARY.md
Chris Coutinho 2147fc1696 refactor: Transform document parsing into pluggable processor architecture
Refactors PR #190's hardcoded Unstructured.io integration into a flexible,
extensible plugin system supporting multiple text extraction engines.

- **`DocumentProcessor` ABC**: Abstract interface for all processors
- **`ProcessorRegistry`**: Central registry for discovery and routing
- **`ProcessingResult`**: Standardized output format across processors

- **`UnstructuredProcessor`**: Refactored from `UnstructuredClient`
- **`TesseractProcessor`**: Local OCR for images (lightweight alternative)
- **`CustomHTTPProcessor`**: Generic wrapper for custom HTTP APIs

- New `get_document_processor_config()` returns structured config
- Supports enabling/disabling individual processors
- Per-processor configuration via environment variables
- **Breaking Change**: `ENABLE_UNSTRUCTURED_PARSING` replaced with:
  - `ENABLE_DOCUMENT_PROCESSING=true/false` (master switch)
  - `ENABLE_UNSTRUCTURED=true/false` (per-processor)
  - `ENABLE_TESSERACT=true/false`
  - `ENABLE_CUSTOM_PROCESSOR=true/false`
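
The switch hierarchy above can be sketched roughly as follows. This is an illustration only, not the project's `get_document_processor_config()`: the helper name `load_processor_flags`, the rule that only the string `true` enables a flag, and the returned dictionary shape are all assumptions. Only the environment variable names come from this commit.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional

# Per-processor switches named in this commit message.
PROCESSOR_FLAGS = {
    "unstructured": "ENABLE_UNSTRUCTURED",
    "tesseract": "ENABLE_TESSERACT",
    "custom": "ENABLE_CUSTOM_PROCESSOR",
}


def _is_true(value: Optional[str]) -> bool:
    # Assumption: only "true" (case-insensitive) enables a flag.
    return (value or "").strip().lower() == "true"


def load_processor_flags(env: Mapping[str, str] = os.environ) -> dict[str, bool]:
    """Hypothetical helper: resolve which processors are enabled."""
    master = _is_true(env.get("ENABLE_DOCUMENT_PROCESSING"))
    # Assumption: a processor is active only when the master switch
    # and its own switch are both on.
    return {
        name: master and _is_true(env.get(var))
        for name, var in PROCESSOR_FLAGS.items()
    }


print(load_processor_flags({
    "ENABLE_DOCUMENT_PROCESSING": "true",
    "ENABLE_UNSTRUCTURED": "true",
}))
```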

- `parse_document()` now uses `ProcessorRegistry`
- Auto-selects appropriate processor based on MIME type
- Processor priority system (Unstructured=10, Tesseract=5, Custom=1)
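
A minimal sketch of priority-based routing, under stated assumptions: the class names `DocumentProcessor`, `ProcessorRegistry`, and `ProcessingResult` come from this commit, but every method name, field, and the `EchoProcessor` stand-in are hypothetical. It also assumes that the highest priority value wins, which the commit message implies but does not state.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class ProcessingResult:
    """Assumed shape of the standardized output."""
    text: str
    processor: str
    metadata: dict = field(default_factory=dict)


class DocumentProcessor(ABC):
    name: str = "base"
    priority: int = 0
    supported_types: frozenset[str] = frozenset()

    def supports(self, mime_type: str) -> bool:
        return mime_type in self.supported_types

    @abstractmethod
    def process(self, content: bytes, mime_type: str) -> ProcessingResult: ...


class ProcessorRegistry:
    def __init__(self) -> None:
        self._processors: list[DocumentProcessor] = []

    def register(self, processor: DocumentProcessor) -> None:
        self._processors.append(processor)

    def find(self, mime_type: str) -> DocumentProcessor | None:
        candidates = [p for p in self._processors if p.supports(mime_type)]
        # Assumption: the highest priority value wins.
        return max(candidates, key=lambda p: p.priority, default=None)


class EchoProcessor(DocumentProcessor):
    """Stand-in processor for this example; decodes bytes as UTF-8."""

    def __init__(self, name, priority, supported_types):
        self.name = name
        self.priority = priority
        self.supported_types = frozenset(supported_types)

    def process(self, content, mime_type):
        return ProcessingResult(text=content.decode("utf-8"), processor=self.name)


registry = ProcessorRegistry()
registry.register(EchoProcessor("tesseract", 5, {"image/png"}))
registry.register(EchoProcessor("unstructured", 10, {"image/png", "application/pdf"}))

# Both processors accept PNG; the higher-priority one is selected.
print(registry.find("image/png").name)
```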

- `initialize_document_processors()` registers processors at startup
- Integrated into both BasicAuth and OAuth lifespans
- Graceful degradation if processors fail to initialize

```env
ENABLE_DOCUMENT_PROCESSING=false

ENABLE_UNSTRUCTURED=false
UNSTRUCTURED_API_URL=http://unstructured:8000
UNSTRUCTURED_STRATEGY=auto  # auto|fast|hi_res
UNSTRUCTURED_LANGUAGES=eng,deu

ENABLE_TESSERACT=false
TESSERACT_LANG=eng

ENABLE_CUSTOM_PROCESSOR=false
CUSTOM_PROCESSOR_URL=http://localhost:9000/process
CUSTOM_PROCESSOR_TYPES=application/pdf,image/jpeg
```

- **Removed**: `tests/test_unstructured_config.py` (legacy tests)
- **Added**: `tests/unit/test_document_processor_config.py`
  - 7 unit tests for new config system
  - Tests individual and multi-processor configurations

- **Added**:
  - `nextcloud_mcp_server/document_processors/__init__.py`
  - `nextcloud_mcp_server/document_processors/base.py`
  - `nextcloud_mcp_server/document_processors/registry.py`
  - `nextcloud_mcp_server/document_processors/unstructured.py`
  - `nextcloud_mcp_server/document_processors/tesseract.py`
  - `nextcloud_mcp_server/document_processors/custom_http.py`
  - `tests/unit/test_document_processor_config.py`

- **Modified**:
  - `nextcloud_mcp_server/config.py` - New plugin config system
  - `nextcloud_mcp_server/app.py` - Processor initialization
  - `nextcloud_mcp_server/utils/document_parser.py` - Uses registry
  - `nextcloud_mcp_server/server/webdav.py` - Import updates
  - `env.sample` - New configuration format
  - `docker-compose.yml` - (profile changes from previous work)

- **Removed**:
  - `nextcloud_mcp_server/client/unstructured_client.py` - Replaced by UnstructuredProcessor
  - `tests/test_unstructured_config.py` - Replaced with new tests

- **Extensible**: Add processors without modifying core code
- **Testable**: Mock processors for unit tests
- **Configurable**: Enable only needed processors
- **Flexible**: Choose fast (Tesseract) vs accurate (Unstructured)
- **Opt-in**: Disabled by default, no mandatory dependencies

Users upgrading from PR #190 need to update environment variables:
```bash
# Before (PR #190)
ENABLE_UNSTRUCTURED_PARSING=true

# After (this commit)
ENABLE_DOCUMENT_PROCESSING=true
ENABLE_UNSTRUCTURED=true
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 19:28:35 +02:00


# Test Suite Reorganization Summary
## Completed: 2025-10-24
### Changes Implemented
#### 1. Added Test Layer Markers
**File**: `pyproject.toml`
Added four test markers to enable selective test execution:
- `@pytest.mark.unit` - Fast unit tests with mocked dependencies
- `@pytest.mark.integration` - Integration tests requiring Docker containers
- `@pytest.mark.oauth` - OAuth tests requiring Playwright (slowest)
- `@pytest.mark.smoke` - Critical path smoke tests
#### 2. Created Unit Test Suite
**Directory**: `tests/unit/`
Added fast unit tests (~5 seconds total):
- `test_scope_decorator.py` (5 tests) - Scope decorator metadata logic
- `test_response_models.py` (6 tests) - Pydantic model serialization
**Total**: 11 unit tests
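
For a sense of what a test in this layer can look like, here is a minimal sketch of a scope-decorator metadata check. The decorator `require_scopes`, the attribute `_required_scopes`, and the scope strings are hypothetical stand-ins, not the project's actual names. The point is that the test runs purely in memory, with no network or container.

```python
import functools


def require_scopes(*scopes):
    """Hypothetical decorator that records required OAuth scopes on a function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        # Attach metadata that a server could later inspect.
        wrapper._required_scopes = frozenset(scopes)
        return wrapper
    return decorator


@require_scopes("notes:read", "notes:write")
def update_note(note_id, body):
    return {"id": note_id, "body": body}


# In the real suite these would also carry @pytest.mark.unit.
def test_decorator_records_scopes():
    assert update_note._required_scopes == {"notes:read", "notes:write"}


def test_decorator_preserves_name_and_behavior():
    assert update_note.__name__ == "update_note"
    assert update_note(1, "hi") == {"id": 1, "body": "hi"}
```
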
#### 3. Reorganized OAuth Tests
**Directory**: `tests/server/oauth/`
Moved all OAuth-related tests to dedicated subdirectory:
- Created `test_oauth_core.py` - consolidated basic OAuth connectivity tests
- Moved 7 OAuth test files to `oauth/` subdirectory
- Fixed relative imports (`..conftest` → `...conftest`)
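
The extra dot in the import fix follows from the move one package level deeper. A small check with the standard library's `importlib.util.resolve_name` illustrates this, assuming the test directories are importable packages named `tests.server` and `tests.server.oauth` (inferred from the paths above):

```python
from importlib.util import resolve_name

# Before the move: a test module living in the package tests.server.
before = resolve_name("..conftest", "tests.server")

# After the move, the unchanged import would resolve one level too low.
stale = resolve_name("..conftest", "tests.server.oauth")

# Adding one dot climbs the extra level and reaches the same module again.
fixed = resolve_name("...conftest", "tests.server.oauth")

print(before)  # tests.conftest
print(stale)   # tests.server.conftest
print(fixed)   # tests.conftest
```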
**Files**:
- `test_oauth_core.py` - Basic OAuth connectivity & JWT operations (8 tests)
- `test_scope_authorization.py` - Scope filtering & enforcement (16 tests)
- `test_introspection_authorization.py` - Token introspection auth (5 tests)
- `test_dcr_token_type.py` - Dynamic client registration (3 tests)
- `test_oauth_notes_permissions.py` - Notes app permissions (4 tests)
- `test_oauth_deck_permissions.py` - Deck app permissions (4 tests)
- `test_oauth_file_permissions.py` - Files app permissions (4 tests)
**Total**: ~48 OAuth tests
#### 4. Created Smoke Test Suite
**Directory**: `tests/smoke/`
Added critical path validation tests (~30-60 seconds):
- `test_smoke.py` (5 tests) - Essential functionality validation
  - MCP connectivity
  - Notes CRUD
  - Calendar basic operations
  - WebDAV basic operations
  - OAuth connectivity
#### 5. Updated Documentation
**File**: `CLAUDE.md`
Added comprehensive test execution guide:
```bash
# Fast feedback (unit tests) - ~5 seconds
uv run pytest tests/unit/ -v
# Smoke tests - ~30-60 seconds
uv run pytest -m smoke -v
# Integration without OAuth - ~2-3 minutes
uv run pytest -m "integration and not oauth" -v
# Full suite - ~4-5 minutes
uv run pytest
# OAuth only - ~3 minutes
uv run pytest -m oauth -v
```
Added test structure diagram and marker documentation.
### Test Suite Metrics
**Before Reorganization**:
- ~235 tests, all integration
- No fast feedback loop
- All tests take ~5-7 minutes
- OAuth tests scattered across 9 files
**After Reorganization**:
- 234 tests total (11 unit + 5 smoke + ~218 integration)
- **Fast feedback**: unit tests in ~5 seconds
- **Quick validation**: smoke tests in ~30-60 seconds
- **Focused testing**: integration without OAuth in ~2-3 minutes
- **Full suite**: ~4-5 minutes
- OAuth tests consolidated in dedicated directory
### Feedback Time Improvements
| Test Type | Count | Time | Use Case |
|-----------|-------|------|----------|
| Unit only | 11 | ~5s | Logic changes, model updates |
| Smoke only | 5 | ~30-60s | Critical path validation |
| Integration (no OAuth) | ~172 | ~2-3min | API/MCP changes |
| OAuth only | 48 | ~3min | OAuth feature work |
| **Full suite** | **234** | **~4-5min** | **Pre-commit validation** |
### Key Benefits
1. **Fast Development Feedback**
   - Unit tests run in 5 seconds vs. 5+ minutes
   - Immediate validation for logic changes
2. **Efficient CI/CD**
   - Can run unit tests on every commit
   - Run smoke tests for pull requests
   - Full suite for merge to main
3. **Better Organization**
   - OAuth tests grouped together
   - Clear test purpose from directory structure
   - Easier to navigate and maintain
4. **Selective Execution**
   - Skip slow OAuth tests during development
   - Run only relevant test layer
   - Faster iteration cycles
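
The layered CI/CD idea can be sketched as a small mapping from pipeline stage to pytest invocation. The stage names and the helper `pytest_command` are hypothetical. The pytest arguments themselves are the ones used elsewhere in this document.

```python
# Hypothetical stage names mapped to the pytest arguments from this summary.
STAGE_ARGS = {
    "commit": ["tests/unit/", "-v"],       # fast unit layer
    "pull_request": ["-m", "smoke", "-v"],  # critical-path smoke layer
    "main": [],                             # full suite
}


def pytest_command(stage: str) -> list[str]:
    """Build the command line for a given pipeline stage."""
    return ["uv", "run", "pytest", *STAGE_ARGS[stage]]


for stage in STAGE_ARGS:
    print(stage, "->", " ".join(pytest_command(stage)))
```

A CI job could pass the resulting list to `subprocess.run(..., check=True)` so that a failing layer fails the job.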
### Migration Notes
- **No breaking changes** to existing tests
- All tests continue to work as before
- Legacy commands still supported (`-m integration`, etc.)
- OAuth tests moved to subdirectory, imports updated
- Duplicate tests removed, with the surviving cases consolidated into `test_oauth_core.py`
### Next Steps (Optional Future Work)
1. **Further Consolidation**: Merge remaining OAuth permission tests
2. **More Unit Tests**: Add unit tests for client initialization, search logic
3. **Client/Server Deduplication**: Reduce overlap between client and server tests
4. **CI Pipeline**: Configure GitHub Actions to run test layers separately
5. **Performance**: Optimize fixtures to reduce setup time
### Commands Reference
```bash
# Development workflow
uv run pytest tests/unit/ -v # Check logic changes
uv run pytest -m smoke -v # Quick validation
uv run pytest -m "integration and not oauth" -v # Full validation without slow tests
# Before committing
uv run pytest # Run everything
# Working on OAuth features
uv run pytest tests/server/oauth/ -v # OAuth tests only
uv run pytest -m oauth --browser firefox --headed -v # Debug OAuth with visible browser
```