Refactors PR #190's hardcoded Unstructured.io integration into a flexible, extensible plugin system supporting multiple text extraction engines.

- **`DocumentProcessor` ABC**: Abstract interface for all processors
- **`ProcessorRegistry`**: Central registry for discovery and routing
- **`ProcessingResult`**: Standardized output format across processors
- **`UnstructuredProcessor`**: Refactored from `UnstructuredClient`
- **`TesseractProcessor`**: Local OCR for images (lightweight alternative)
- **`CustomHTTPProcessor`**: Generic wrapper for custom HTTP APIs

- New `get_document_processor_config()` returns structured config
- Supports enabling/disabling individual processors
- Per-processor configuration via environment variables
- **Breaking Change**: `ENABLE_UNSTRUCTURED_PARSING` replaced with:
  - `ENABLE_DOCUMENT_PROCESSING=true/false` (master switch)
  - `ENABLE_UNSTRUCTURED=true/false` (per-processor)
  - `ENABLE_TESSERACT=true/false`
  - `ENABLE_CUSTOM_PROCESSOR=true/false`

- `parse_document()` now uses `ProcessorRegistry`
- Auto-selects appropriate processor based on MIME type
- Processor priority system (Unstructured=10, Tesseract=5, Custom=1)

- `initialize_document_processors()` registers processors at startup
- Integrated into both BasicAuth and OAuth lifespans
- Graceful degradation if processors fail to initialize

```env
ENABLE_DOCUMENT_PROCESSING=false

ENABLE_UNSTRUCTURED=false
UNSTRUCTURED_API_URL=http://unstructured:8000
UNSTRUCTURED_STRATEGY=auto # auto|fast|hi_res
UNSTRUCTURED_LANGUAGES=eng,deu

ENABLE_TESSERACT=false
TESSERACT_LANG=eng

ENABLE_CUSTOM_PROCESSOR=false
CUSTOM_PROCESSOR_URL=http://localhost:9000/process
CUSTOM_PROCESSOR_TYPES=application/pdf,image/jpeg
```

- **Removed**: `tests/test_unstructured_config.py` (legacy tests)
- **Added**: `tests/unit/test_document_processor_config.py`
  - 7 unit tests for new config system
  - Tests individual and multi-processor configurations

- **Added**:
  - `nextcloud_mcp_server/document_processors/__init__.py`
  - `nextcloud_mcp_server/document_processors/base.py`
  - `nextcloud_mcp_server/document_processors/registry.py`
  - `nextcloud_mcp_server/document_processors/unstructured.py`
  - `nextcloud_mcp_server/document_processors/tesseract.py`
  - `nextcloud_mcp_server/document_processors/custom_http.py`
  - `tests/unit/test_document_processor_config.py`
- **Modified**:
  - `nextcloud_mcp_server/config.py` - New plugin config system
  - `nextcloud_mcp_server/app.py` - Processor initialization
  - `nextcloud_mcp_server/utils/document_parser.py` - Uses registry
  - `nextcloud_mcp_server/server/webdav.py` - Import updates
  - `env.sample` - New configuration format
  - `docker-compose.yml` - (profile changes from previous work)
- **Removed**:
  - `nextcloud_mcp_server/client/unstructured_client.py` - Replaced by UnstructuredProcessor
  - `tests/test_unstructured_config.py` - Replaced with new tests

✅ **Extensible**: Add processors without modifying core code
✅ **Testable**: Mock processors for unit tests
✅ **Configurable**: Enable only needed processors
✅ **Flexible**: Choose fast (Tesseract) vs accurate (Unstructured)
✅ **Opt-in**: Disabled by default, no mandatory dependencies

Users upgrading from PR #190 need to update environment variables:

```bash
# Before (PR #190)
ENABLE_UNSTRUCTURED_PARSING=true

# After
ENABLE_DOCUMENT_PROCESSING=true
ENABLE_UNSTRUCTURED=true
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
DCR Client Deletion Investigation
Summary
✅ RESOLVED - As of 2025-10-24, Dynamic Client Registration (DCR) via RFC 7591 and RFC 7592 client deletion now work correctly in Nextcloud's OIDC server!
Historical Note: This document was originally created to investigate DCR deletion failures. The issue has been resolved by merging two feature branches (feature/user-consent-complete and feature/dcr-jwt-scopes) that implement RFC 7592 support.
Resolution Summary (2025-10-24)
What Now Works ✅
- Client Registration (RFC 7591): Successfully creates OAuth clients with custom scopes and token types
- Registration Access Token: ✅ Now included in registration response per RFC 7592
- Registration Client URI: ✅ Now included in registration response per RFC 7592
- Client Deletion (RFC 7592): ✅ Now works with Bearer token authentication
- Token Acquisition: Registered clients can obtain access tokens via authorization code flow
- API Access: Tokens work correctly for accessing Nextcloud APIs
Test Evidence
The test test_new_dcr_registration_includes_access_token in tests/server/oauth/test_dcr_new_implementation.py confirms:
Registration Response:
{
  "client_id": "wynkPur15ibby0Ma2FUOMyv4JdmtxqlRepvGmERrE36RYmquuExma1srAgDG1rKZ",
  "client_secret": "agaZU3WdffOy4o6TS4vZ...",
  "registration_access_token": "uKycqheAzw2UMZUL58Ir...",
  "registration_client_uri": "http://localhost:8080/apps/oidc/register/wynkPur15ibby0Ma2FUOMyv4JdmtxqlRepvGmERrE36RYmquuExma1srAgDG1rKZ",
  ...
}
Deletion Test:
- Endpoint: DELETE /apps/oidc/register/{client_id}
- Authentication: Authorization: Bearer {registration_access_token}
- Response: 204 No Content ✅
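The deletion request described above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's delete_client helper: the function name build_delete_request and the example URI and token values are hypothetical, and the request is only constructed, never sent.

```python
import urllib.request


def build_delete_request(
    registration_client_uri: str, registration_access_token: str
) -> urllib.request.Request:
    """Build (but do not send) an RFC 7592 client-deletion request.

    Hypothetical helper for illustration; both arguments are taken from
    the registration response.
    """
    return urllib.request.Request(
        registration_client_uri,
        method="DELETE",
        headers={"Authorization": f"Bearer {registration_access_token}"},
    )


# Placeholder values; a real caller would use the fields returned at registration.
request = build_delete_request(
    "http://localhost:8080/apps/oidc/register/example-client-id",
    "example-registration-token",
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Using registration_client_uri from the response, rather than rebuilding the path from client_id, keeps the client independent of the server's URL layout.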
Implementation Details
The resolution required:
- Merging feature/user-consent-complete and feature/dcr-jwt-scopes branches
- Adding missing classes to composer autoload files:
  - OCA\OIDCIdentityProvider\Db\RegistrationToken
  - OCA\OIDCIdentityProvider\Db\RegistrationTokenMapper
  - OCA\OIDCIdentityProvider\Service\RegistrationTokenService
- Fixing method calls in DynamicRegistrationController.php:
  - Changed findByClientId() to getByClientId() for RedirectUriMapper
  - Removed logout redirect URI deletion (not client-specific in schema)
- Database migration applied automatically (oc_oidc_reg_tokens table created)
Files Modified
- third_party/oidc/composer/composer/autoload_classmap.php - Added 3 new class mappings
- third_party/oidc/composer/composer/autoload_static.php - Added 3 new class mappings
- third_party/oidc/lib/Controller/DynamicRegistrationController.php - Fixed deletion logic
- third_party/oidc/lib/Db/LogoutRedirectUriMapper.php - Added deleteByClientId() method
Technical Details
Registration Response Analysis
When registering a client via POST to /apps/oidc/register, the response includes:
{
  "client_name": "DCR Lifecycle Test Client",
  "client_id": "eVdV1obTHUhtQiBOLnDcOucZE3sQA6J7JgzsDFsnpgzLkWSNEPXHJbpSfjLUU5ot",
  "client_secret": "iqNeH5inrdTPh6hYGOmvlML7SWqHPHpMZp9CQlNHNnKGf6VZ8pSeaSC1EBrDRmyd",
  "redirect_uris": ["http://localhost:8081"],
  "token_endpoint_auth_method": "client_secret_post",
  "response_types": ["code"],
  "grant_types": ["authorization_code"],
  "id_token_signed_response_alg": "RS256",
  "application_type": "web",
  "client_id_issued_at": 1761286688,
  "client_secret_expires_at": 1761290288,
  "scope": "openid profile email notes:read",
  "token_type": "Bearer"
}
Missing (at the time of the original investigation, before the fix described above): registration_access_token and registration_client_uri
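A small check makes the gap in this response explicit. The sketch below is illustrative only: the helper name missing_management_fields is hypothetical, and the sample dictionary is an abbreviated copy of the response shown here.

```python
# Fields that RFC 7592 expects in the client information response.
REQUIRED_MANAGEMENT_FIELDS = ("registration_access_token", "registration_client_uri")


def missing_management_fields(registration_response: dict) -> list[str]:
    """Return the RFC 7592 management fields absent (or empty) in a response."""
    return [
        field
        for field in REQUIRED_MANAGEMENT_FIELDS
        if not registration_response.get(field)
    ]


# Abbreviated copy of the registration response from the original investigation.
response = {
    "client_name": "DCR Lifecycle Test Client",
    "redirect_uris": ["http://localhost:8081"],
    "scope": "openid profile email notes:read",
    "token_type": "Bearer",
}
print(missing_management_fields(response))
# ['registration_access_token', 'registration_client_uri']
```

A check like this could run as an assertion in a registration test, so that a regression in the server's response is caught at registration time instead of at cleanup.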
Deletion Attempt Analysis
Attempting DELETE to /apps/oidc/register/{client_id} with various authentication methods:
Method 1: HTTP Basic Auth
- Authentication: HTTP Basic Auth with client_id as username, client_secret as password
- Response: 401 Unauthorized
- Response Body:
{"message":""}
Method 2: Credentials in JSON Body
- Authentication: JSON body with client_id and client_secret
- Response: N/A (httpx.AsyncClient.delete() doesn't support json parameter)
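httpx's delete() convenience method takes no body arguments, but its generic request("DELETE", ..., json=...) method does. The sketch below shows the same idea using only the standard library, so it runs without extra dependencies. The helper name build_delete_with_json_body and the placeholder values are hypothetical, the request is constructed but not sent, and nothing here implies that the OIDC server accepts credentials in a request body.

```python
import json
import urllib.request


def build_delete_with_json_body(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a DELETE request carrying a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="DELETE",
        headers={"Content-Type": "application/json"},
    )


# Placeholder credentials for illustration only.
request = build_delete_with_json_body(
    "http://localhost:8080/apps/oidc/register/example-client-id",
    {"client_id": "example-client-id", "client_secret": "example-secret"},
)
# urllib stores header names capitalized as "Content-type".
print(request.get_method(), request.get_header("Content-type"))
print(json.loads(request.data))
```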
Method 3: Credentials in Query Parameters
- Authentication: Query params ?client_id=...&client_secret=...
- Response: 500 Internal Server Error (server-side exception when parsing query params)
Method 4: No Authentication (Baseline)
- Authentication: None
- Response: 401 Unauthorized
- Response Body:
{"error":"invalid_client","error_description":"Client authentication failed."}
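Conclusion: HTTP Basic Auth with the client credentials is rejected with 401. Basic Auth is a common way to authenticate OAuth clients, but it is not the RFC 7592 method: management requests are expected to carry the registration access token as a Bearer token. Query parameters cause a 500 error (not supported). No authentication returns 401 as expected.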
RFC 7592 Requirements (Not Met)
According to RFC 7592 Section 3, the registration endpoint MUST return:
- registration_access_token: A token for subsequent management operations (read, update, delete)
- registration_client_uri: The URI for managing this client
The client delete request should then use:
DELETE /apps/oidc/register/{client_id}
Authorization: Bearer {registration_access_token}
Root Cause Analysis
Possible Causes
- Nextcloud OIDC Server Implementation Gap
  - The OIDC server (likely based on third-party library) may not fully implement RFC 7592
  - Registration (RFC 7591) is implemented, but management operations (RFC 7592) are not
- Middleware Blocking
  - Nextcloud middleware may be blocking unauthenticated DELETE requests to /apps/oidc/*
  - The 401 error suggests authentication is being checked but failing
- Missing Feature
  - Client deletion may simply not be implemented in the current OIDC app version
  - The endpoint exists but returns 401 regardless of credentials
Impact on Test Fixtures
Current Fixture Behavior
The shared_oauth_client_credentials and shared_jwt_oauth_client_credentials fixtures in tests/conftest.py (lines 947-1112) attempt to clean up registered clients using:
success = await delete_client(
    nextcloud_url=nextcloud_host,
    client_id=client_id,
    client_secret=client_secret,
)
This cleanup always fails (returns False) due to the 401 error, but the failure is handled gracefully with a warning:
except Exception as e:
    logger.warning(
        f"Error cleaning up shared OAuth client {client_id[:16]}...: {e}"
    )
Consequences
- OAuth Clients Accumulate: Every test session registers 2 OAuth clients that are never deleted
- No Functional Impact: Tests continue to work because:
  - Clients have 1-hour expiration (client_secret_expires_at)
  - New clients are registered for each session
  - Old clients expire automatically
- Database Bloat: Over time, the oc_oauth2_clients table may accumulate expired clients
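The 1-hour lifetime can be read directly off the registration response shown earlier, as the difference between client_secret_expires_at and client_id_issued_at. The following sketch works through that arithmetic. The helper names are hypothetical, and the expiry check reflects only the field semantics in RFC 7591 (where 0 means the secret never expires), not how the Nextcloud OIDC app enforces expiry internally.

```python
def client_lifetime_seconds(registration_response: dict) -> int:
    """Seconds between client issuance and secret expiry."""
    return (
        registration_response["client_secret_expires_at"]
        - registration_response["client_id_issued_at"]
    )


def is_expired(registration_response: dict, now: int) -> bool:
    """True once `now` (Unix seconds) has reached the secret's expiry time.

    RFC 7591 defines client_secret_expires_at == 0 as "does not expire".
    """
    expires_at = registration_response["client_secret_expires_at"]
    return expires_at != 0 and now >= expires_at


# Timestamps taken from the registration response earlier in this document.
response = {"client_id_issued_at": 1761286688, "client_secret_expires_at": 1761290288}
print(client_lifetime_seconds(response))  # 3600 seconds, i.e. 1 hour
```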
Recommendations
Short Term (Current Approach)
- Keep Current Warning-Based Approach: The fixtures already handle deletion failure gracefully
- Document Expected Behavior: Add comments explaining that deletion is expected to fail
- Accept Client Accumulation: Rely on automatic expiration (1 hour)
Long Term (If DCR Deletion Needed)
- Check Nextcloud OIDC App Version: Verify if newer versions support RFC 7592 deletion
- File Bug Report: Report missing registration_access_token to Nextcloud OIDC project
- Alternative Cleanup: Use Nextcloud admin API to delete OAuth clients directly
  - Requires admin credentials
  - Bypass OIDC app's DCR endpoint
  - Example: occ oauth:clients:delete {client_id}
Recommended Fixture Update
@pytest.fixture(scope="session")
async def shared_oauth_client_credentials(anyio_backend, oauth_callback_server):
    """
    ... existing docstring ...

    Note:
        Client deletion via RFC 7592 is not supported by Nextcloud OIDC server
        (missing registration_access_token). Clients will expire after 1 hour
        automatically. Manual cleanup via admin API may be needed in production.
    """
    # ... registration code ...
    yield (...)

    # Cleanup: Attempt deletion (expected to fail due to RFC 7592 limitation)
    try:
        logger.info(f"Attempting cleanup of shared OAuth client: {client_id[:16]}...")
        success = await delete_client(
            nextcloud_url=nextcloud_host,
            client_id=client_id,
            client_secret=client_secret,
        )
        if success:
            logger.info(f"✅ Successfully deleted client: {client_id[:16]}...")
        else:
            logger.warning(
                f"⚠️ Client deletion not supported by Nextcloud OIDC server. "
                f"Client {client_id[:16]}... will expire automatically in 1 hour."
            )
    except Exception as e:
        logger.warning(
            f"⚠️ Error during client cleanup (expected): {e}. "
            f"Client will expire automatically."
        )
Test File Status
Created tests/server/oauth/test_dcr_lifecycle.py with 4 comprehensive tests:
- ✅ test_dcr_register_and_delete_lifecycle - Documents full lifecycle (fails at deletion step as expected)
- ✅ test_dcr_delete_with_wrong_credentials - Verifies authentication behavior
- ✅ test_dcr_delete_nonexistent_client - Tests error handling
- ✅ test_dcr_deletion_is_idempotent - Tests repeated deletion attempts
All tests currently fail at the deletion step, which is expected given the RFC 7592 limitation.
Next Steps
- Update fixture comments to document expected deletion failure
- Mark deletion tests as expected failures using @pytest.mark.xfail
- Consider removing deletion tests if they don't provide value (since deletion doesn't work)
- Investigate Nextcloud admin API as alternative cleanup method for CI/CD environments
- Monitor Nextcloud OIDC app updates for RFC 7592 support
References
- RFC 7591 - OAuth 2.0 Dynamic Client Registration Protocol
- RFC 7592 - OAuth 2.0 Dynamic Client Registration Management Protocol
- Nextcloud OIDC App: Check docker-compose.yml for app location
- Test Evidence: tests/server/oauth/test_dcr_lifecycle.py line 254-256 (401 response details)