Files
nextcloud-mcp-server/DCR_DELETION_INVESTIGATION.md
T
Chris Coutinho 2147fc1696 refactor: Transform document parsing into pluggable processor architecture
Refactors PR #190's hardcoded Unstructured.io integration into a flexible,
extensible plugin system supporting multiple text extraction engines.

- **`DocumentProcessor` ABC**: Abstract interface for all processors
- **`ProcessorRegistry`**: Central registry for discovery and routing
- **`ProcessingResult`**: Standardized output format across processors

- **`UnstructuredProcessor`**: Refactored from `UnstructuredClient`
- **`TesseractProcessor`**: Local OCR for images (lightweight alternative)
- **`CustomHTTPProcessor`**: Generic wrapper for custom HTTP APIs

- New `get_document_processor_config()` returns structured config
- Supports enabling/disabling individual processors
- Per-processor configuration via environment variables
- **Breaking Change**: `ENABLE_UNSTRUCTURED_PARSING` replaced with:
  - `ENABLE_DOCUMENT_PROCESSING=true/false` (master switch)
  - `ENABLE_UNSTRUCTURED=true/false` (per-processor)
  - `ENABLE_TESSERACT=true/false`
  - `ENABLE_CUSTOM_PROCESSOR=true/false`

- `parse_document()` now uses `ProcessorRegistry`
- Auto-selects appropriate processor based on MIME type
- Processor priority system (Unstructured=10, Tesseract=5, Custom=1)

- `initialize_document_processors()` registers processors at startup
- Integrated into both BasicAuth and OAuth lifespans
- Graceful degradation if processors fail to initialize

```env
ENABLE_DOCUMENT_PROCESSING=false

ENABLE_UNSTRUCTURED=false
UNSTRUCTURED_API_URL=http://unstructured:8000
UNSTRUCTURED_STRATEGY=auto  # auto|fast|hi_res
UNSTRUCTURED_LANGUAGES=eng,deu

ENABLE_TESSERACT=false
TESSERACT_LANG=eng

ENABLE_CUSTOM_PROCESSOR=false
CUSTOM_PROCESSOR_URL=http://localhost:9000/process
CUSTOM_PROCESSOR_TYPES=application/pdf,image/jpeg
```

- **Removed**: `tests/test_unstructured_config.py` (legacy tests)
- **Added**: `tests/unit/test_document_processor_config.py`
  - 7 unit tests for new config system
  - Tests individual and multi-processor configurations

- **Added**:
  - `nextcloud_mcp_server/document_processors/__init__.py`
  - `nextcloud_mcp_server/document_processors/base.py`
  - `nextcloud_mcp_server/document_processors/registry.py`
  - `nextcloud_mcp_server/document_processors/unstructured.py`
  - `nextcloud_mcp_server/document_processors/tesseract.py`
  - `nextcloud_mcp_server/document_processors/custom_http.py`
  - `tests/unit/test_document_processor_config.py`

- **Modified**:
  - `nextcloud_mcp_server/config.py` - New plugin config system
  - `nextcloud_mcp_server/app.py` - Processor initialization
  - `nextcloud_mcp_server/utils/document_parser.py` - Uses registry
  - `nextcloud_mcp_server/server/webdav.py` - Import updates
  - `env.sample` - New configuration format
  - `docker-compose.yml` - (profile changes from previous work)

- **Removed**:
  - `nextcloud_mcp_server/client/unstructured_client.py` - Replaced by UnstructuredProcessor
  - `tests/test_unstructured_config.py` - Replaced with new tests

 **Extensible**: Add processors without modifying core code
 **Testable**: Mock processors for unit tests
 **Configurable**: Enable only needed processors
 **Flexible**: Choose fast (Tesseract) vs accurate (Unstructured)
 **Opt-in**: Disabled by default, no mandatory dependencies

Users upgrading from PR #190 need to update environment variables:
```bash
ENABLE_UNSTRUCTURED_PARSING=true

ENABLE_DOCUMENT_PROCESSING=true
ENABLE_UNSTRUCTURED=true
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 19:28:35 +02:00

251 lines
10 KiB
Markdown

# DCR Client Deletion Investigation
## Summary
**RESOLVED** - As of 2025-10-24, Dynamic Client Registration (DCR) via RFC 7591 **and** RFC 7592 client deletion now work correctly in Nextcloud's OIDC server!
**Historical Note**: This document was originally created to investigate DCR deletion failures. The issue has been resolved by merging two feature branches (`feature/user-consent-complete` and `feature/dcr-jwt-scopes`) that implement RFC 7592 support.
## Resolution Summary (2025-10-24)
### What Now Works ✅
- **Client Registration** (RFC 7591): Successfully creates OAuth clients with custom scopes and token types
- **Registration Access Token**: ✅ Now included in registration response per RFC 7592
- **Registration Client URI**: ✅ Now included in registration response per RFC 7592
- **Client Deletion** (RFC 7592): ✅ Now works with Bearer token authentication
- **Token Acquisition**: Registered clients can obtain access tokens via authorization code flow
- **API Access**: Tokens work correctly for accessing Nextcloud APIs
### Test Evidence
The test `test_new_dcr_registration_includes_access_token` in `tests/server/oauth/test_dcr_new_implementation.py` confirms:
**Registration Response:**
```json
{
"client_id": "wynkPur15ibby0Ma2FUOMyv4JdmtxqlRepvGmERrE36RYmquuExma1srAgDG1rKZ",
"client_secret": "agaZU3WdffOy4o6TS4vZ...",
"registration_access_token": "uKycqheAzw2UMZUL58Ir...",
"registration_client_uri": "http://localhost:8080/apps/oidc/register/wynkPur15ibby0Ma2FUOMyv4JdmtxqlRepvGmERrE36RYmquuExma1srAgDG1rKZ",
...
}
```
**Deletion Test:**
- Endpoint: `DELETE /apps/oidc/register/{client_id}`
- Authentication: `Authorization: Bearer {registration_access_token}`
- Response: **204 No Content**
### Implementation Details
The resolution required:
1. Merging `feature/user-consent-complete` and `feature/dcr-jwt-scopes` branches
2. Adding missing classes to composer autoload files:
- `OCA\OIDCIdentityProvider\Db\RegistrationToken`
- `OCA\OIDCIdentityProvider\Db\RegistrationTokenMapper`
- `OCA\OIDCIdentityProvider\Service\RegistrationTokenService`
3. Fixing method calls in `DynamicRegistrationController.php`:
- Changed `findByClientId()` to `getByClientId()` for RedirectUriMapper
- Removed logout redirect URI deletion (not client-specific in schema)
4. Database migration applied automatically (`oc_oidc_reg_tokens` table created)
### Files Modified
- `third_party/oidc/composer/composer/autoload_classmap.php` - Added 3 new class mappings
- `third_party/oidc/composer/composer/autoload_static.php` - Added 3 new class mappings
- `third_party/oidc/lib/Controller/DynamicRegistrationController.php` - Fixed deletion logic
- `third_party/oidc/lib/Db/LogoutRedirectUriMapper.php` - Added `deleteByClientId()` method
## Technical Details
### Registration Response Analysis
When registering a client via POST to `/apps/oidc/register`, the response includes:
```json
{
"client_name": "DCR Lifecycle Test Client",
"client_id": "eVdV1obTHUhtQiBOLnDcOucZE3sQA6J7JgzsDFsnpgzLkWSNEPXHJbpSfjLUU5ot",
"client_secret": "iqNeH5inrdTPh6hYGOmvlML7SWqHPHpMZp9CQlNHNnKGf6VZ8pSeaSC1EBrDRmyd",
"redirect_uris": ["http://localhost:8081"],
"token_endpoint_auth_method": "client_secret_post",
"response_types": ["code"],
"grant_types": ["authorization_code"],
"id_token_signed_response_alg": "RS256",
"application_type": "web",
"client_id_issued_at": 1761286688,
"client_secret_expires_at": 1761290288,
"scope": "openid profile email notes:read",
"token_type": "Bearer"
}
```
**Missing:** `registration_access_token` and `registration_client_uri`
### Deletion Attempt Analysis
Attempting DELETE to `/apps/oidc/register/{client_id}` with various authentication methods:
#### Method 1: HTTP Basic Auth
- **Authentication**: HTTP Basic Auth with `client_id` as username, `client_secret` as password
- **Response**: 401 Unauthorized
- **Response Body**: `{"message":""}`
#### Method 2: Credentials in JSON Body
- **Authentication**: JSON body with `client_id` and `client_secret`
- **Response**: N/A (httpx.AsyncClient.delete() doesn't support `json` parameter)
#### Method 3: Credentials in Query Parameters
- **Authentication**: Query params `?client_id=...&client_secret=...`
- **Response**: 500 Internal Server Error (server-side exception when parsing query params)
#### Method 4: No Authentication (Baseline)
- **Authentication**: None
- **Response**: 401 Unauthorized
- **Response Body**: `{"error":"invalid_client","error_description":"Client authentication failed."}`
**Conclusion**: The 401 error occurs with HTTP Basic Auth (the standard RFC 7592 method). Query parameters cause a 500 error (not supported). No authentication returns 401 as expected.
### RFC 7592 Requirements (Not Met)
According to [RFC 7592 Section 3](https://www.rfc-editor.org/rfc/rfc7592.html#section-3), the registration endpoint MUST return:
1. **`registration_access_token`**: A token for subsequent management operations (read, update, delete)
2. **`registration_client_uri`**: The URI for managing this client
The client delete request should then use:
```http
DELETE /apps/oidc/register/{client_id}
Authorization: Bearer {registration_access_token}
```
## Root Cause Analysis
### Possible Causes
1. **Nextcloud OIDC Server Implementation Gap**
- The OIDC server (likely based on third-party library) may not fully implement RFC 7592
- Registration (RFC 7591) is implemented, but management operations (RFC 7592) are not
2. **Middleware Blocking**
- Nextcloud middleware may be blocking unauthenticated DELETE requests to `/apps/oidc/*`
- The 401 error suggests authentication is being checked but failing
3. **Missing Feature**
- Client deletion may simply not be implemented in the current OIDC app version
- The endpoint exists but returns 401 regardless of credentials
## Impact on Test Fixtures
### Current Fixture Behavior
The `shared_oauth_client_credentials` and `shared_jwt_oauth_client_credentials` fixtures in `tests/conftest.py` (lines 947-1112) attempt to clean up registered clients using:
```python
success = await delete_client(
nextcloud_url=nextcloud_host,
client_id=client_id,
client_secret=client_secret,
)
```
This cleanup **always fails** (returns `False`) due to the 401 error, but the failure is handled gracefully with a warning:
```python
except Exception as e:
logger.warning(
f"Error cleaning up shared OAuth client {client_id[:16]}...: {e}"
)
```
### Consequences
1. **OAuth Clients Accumulate**: Every test session registers 2 OAuth clients that are never deleted
2. **No Functional Impact**: Tests continue to work because:
- Clients have 1-hour expiration (`client_secret_expires_at`)
- New clients are registered for each session
- Old clients expire automatically
3. **Database Bloat**: Over time, the `oc_oauth2_clients` table may accumulate expired clients
## Recommendations
### Short Term (Current Approach)
1. **Keep Current Warning-Based Approach**: The fixtures already handle deletion failure gracefully
2. **Document Expected Behavior**: Add comments explaining that deletion is expected to fail
3. **Accept Client Accumulation**: Rely on automatic expiration (1 hour)
### Long Term (If DCR Deletion Needed)
1. **Check Nextcloud OIDC App Version**: Verify if newer versions support RFC 7592 deletion
2. **File Bug Report**: Report missing `registration_access_token` to Nextcloud OIDC project
3. **Alternative Cleanup**: Use Nextcloud admin API to delete OAuth clients directly
- Requires admin credentials
- Bypass OIDC app's DCR endpoint
- Example: `occ oauth:clients:delete {client_id}`
### Recommended Fixture Update
```python
@pytest.fixture(scope="session")
async def shared_oauth_client_credentials(anyio_backend, oauth_callback_server):
"""
... existing docstring ...
Note:
Client deletion via RFC 7592 is not supported by Nextcloud OIDC server
(missing registration_access_token). Clients will expire after 1 hour
automatically. Manual cleanup via admin API may be needed in production.
"""
# ... registration code ...
yield (...)
# Cleanup: Attempt deletion (expected to fail due to RFC 7592 limitation)
try:
logger.info(f"Attempting cleanup of shared OAuth client: {client_id[:16]}...")
success = await delete_client(
nextcloud_url=nextcloud_host,
client_id=client_id,
client_secret=client_secret,
)
if success:
logger.info(f"✅ Successfully deleted client: {client_id[:16]}...")
else:
logger.warning(
f"⚠️ Client deletion not supported by Nextcloud OIDC server. "
f"Client {client_id[:16]}... will expire automatically in 1 hour."
)
except Exception as e:
logger.warning(
f"⚠️ Error during client cleanup (expected): {e}. "
f"Client will expire automatically."
)
```
## Test File Status
Created `tests/server/oauth/test_dcr_lifecycle.py` with 4 comprehensive tests:
1.`test_dcr_register_and_delete_lifecycle` - Documents full lifecycle (fails at deletion step as expected)
2.`test_dcr_delete_with_wrong_credentials` - Verifies authentication behavior
3.`test_dcr_delete_nonexistent_client` - Tests error handling
4.`test_dcr_deletion_is_idempotent` - Tests repeated deletion attempts
**All tests currently fail at the deletion step**, which is expected given the RFC 7592 limitation.
## Next Steps
1. **Update fixture comments** to document expected deletion failure
2. **Mark deletion tests as expected failures** using `@pytest.mark.xfail`
3. **Consider removing deletion tests** if they don't provide value (since deletion doesn't work)
4. **Investigate Nextcloud admin API** as alternative cleanup method for CI/CD environments
5. **Monitor Nextcloud OIDC app updates** for RFC 7592 support
## References
- [RFC 7591 - OAuth 2.0 Dynamic Client Registration Protocol](https://www.rfc-editor.org/rfc/rfc7591.html)
- [RFC 7592 - OAuth 2.0 Dynamic Client Registration Management Protocol](https://www.rfc-editor.org/rfc/rfc7592.html)
- Nextcloud OIDC App: Check `docker-compose.yml` for app location
- Test Evidence: `tests/server/oauth/test_dcr_lifecycle.py` line 254-256 (401 response details)