OAuth Multi-User Load Testing Framework

A multi-user benchmarking framework for testing the OAuth-authenticated Nextcloud MCP server under realistic collaborative workflows.

Quick Start

# 1. Ensure docker-compose is running
docker-compose up -d

# 2. Run a benchmark with 2 users for 30 seconds
uv run python -m tests.load.oauth_benchmark --users 2 --duration 30

# 3. Clean up test users (IMPORTANT - always run after benchmark)
uv run python -m tests.load.cleanup_loadtest_users

# Optional: Verify cleanup
uv run python -m tests.load.cleanup_loadtest_users --dry-run

Overview

This framework extends the basic load testing infrastructure to support:

  • Multiple OAuth-authenticated users running concurrently
  • Coordinated workflows spanning multiple users (sharing, collaboration, permissions)
  • Per-user metrics tracking individual user performance
  • Workflow-specific metrics measuring cross-user operation latencies
  • Realistic scenarios mimicking actual user collaboration patterns
  • Concurrent user creation - all users created and authenticated in parallel for fast setup

Architecture

Components

tests/load/
├── oauth_pool.py              # OAuth user pool management
├── oauth_workloads.py         # Multi-user workflow definitions
├── oauth_metrics.py           # Enhanced metrics collection
├── oauth_benchmark.py         # Main CLI entry point
├── cleanup_loadtest_users.py  # Standalone cleanup utility for test users
└── README_OAUTH.md            # This file

Key Classes

OAuthUserPool (oauth_pool.py)

  • Manages N OAuth-authenticated users
  • Handles token acquisition and storage
  • Creates and manages MCP sessions per user
  • Tracks per-user operation statistics

UserSessionWrapper (oauth_pool.py)

  • Wraps MCP ClientSession for a specific user
  • Automatic operation tracking
  • Convenient tool/resource access methods
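
The automatic operation tracking described above can be pictured as a thin wrapper around a session's call_tool method. The following is a minimal sketch, not the actual UserSessionWrapper: the class name TrackedSession, its attributes, and the ToolSession protocol are hypothetical, and it only assumes the wrapped object exposes an async call_tool(name, arguments).

```python
import asyncio
import time
from collections import Counter
from typing import Any, List, Protocol


class ToolSession(Protocol):
    """Anything with an async call_tool method (assumed interface)."""

    async def call_tool(self, name: str, arguments: dict) -> Any: ...


class TrackedSession:
    """Hypothetical wrapper that records counts, errors, and latencies per user."""

    def __init__(self, username: str, session: ToolSession) -> None:
        self.username = username
        self._session = session
        self.operations: Counter = Counter()  # calls attempted, keyed by tool name
        self.errors: Counter = Counter()      # failures, keyed by exception type
        self.latencies: List[float] = []      # seconds per call, success or failure

    async def call_tool(self, name: str, arguments: dict) -> Any:
        self.operations[name] += 1
        start = time.perf_counter()
        try:
            return await self._session.call_tool(name, arguments)
        except Exception as exc:
            self.errors[type(exc).__name__] += 1
            raise  # tracking must not hide the failure from the caller
        finally:
            self.latencies.append(time.perf_counter() - start)
```

Recording the latency in the finally branch means failed calls also contribute to the latency distribution, which is a design choice; a real implementation might track them separately.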

Workflow (oauth_workloads.py)

  • Base class for multi-user coordinated workflows
  • Step-by-step execution with timing
  • Comprehensive error handling and reporting
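
Step-by-step execution with timing could look roughly like the sketch below. It is an illustration under stated assumptions, not the Workflow base class itself: TimedWorkflow, StepResult, and run_step are hypothetical names, and the real class may record different fields.

```python
import asyncio
import time
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, List, Optional


@dataclass
class StepResult:
    name: str
    success: bool
    duration: float  # seconds
    error: Optional[str] = None


@dataclass
class TimedWorkflow:
    """Hypothetical stand-in for a workflow that times each step."""

    name: str
    steps: List[StepResult] = field(default_factory=list)

    async def run_step(self, name: str, op: Callable[[], Awaitable[Any]]) -> Any:
        """Await op(), record its latency, and capture any exception as a failed step."""
        start = time.perf_counter()
        try:
            result = await op()
        except Exception as exc:
            elapsed = time.perf_counter() - start
            self.steps.append(StepResult(name, False, elapsed, str(exc)))
            return None
        self.steps.append(StepResult(name, True, time.perf_counter() - start))
        return result

    @property
    def succeeded(self) -> bool:
        """True only if at least one step ran and none failed."""
        return bool(self.steps) and all(step.success for step in self.steps)
```

Capturing the exception per step, rather than letting it abort the workflow, is what makes it possible to report which step failed and how long it took.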

OAuthBenchmarkMetrics (oauth_metrics.py)

  • Per-user operation counts and latencies
  • Workflow completion rates and timings
  • Baseline operation statistics
  • Detailed reporting and JSON export

Available Workflows

1. NoteShareWorkflow

Scenario: Alice creates a note and shares it with Bob, who then reads it.

Steps:

  1. User A creates a note
  2. User A shares note with User B (read-only permissions)
  3. User B lists their shared notes (measures propagation delay)
  4. User B reads the shared note

Metrics: Creation latency, share propagation time, read latency
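
Share propagation time (step 3) could be measured by polling until the share becomes visible to the recipient. A minimal sketch, assuming a hypothetical is_visible coroutine (for example, one that lists User B's shares and checks for the note); wait_until_visible and its parameters are illustrative and not part of the framework:

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


async def wait_until_visible(
    is_visible: Callable[[], Awaitable[bool]],
    timeout: float = 5.0,
    interval: float = 0.1,
) -> Optional[float]:
    """Poll is_visible() until it returns True.

    Returns the elapsed seconds, or None if the timeout passes first.
    """
    start = time.perf_counter()
    while True:
        if await is_visible():
            return time.perf_counter() - start
        if time.perf_counter() - start >= timeout:
            return None
        await asyncio.sleep(interval)
```

The measured value is only accurate to within one polling interval, so a shorter interval gives finer resolution at the cost of more list requests against the server.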

2. CollaborativeEditWorkflow

Scenario: Multiple users concurrently edit the same note.

Steps:

  1. Owner creates a note
  2. All users read the note simultaneously
  3. All users append content concurrently
  4. Owner verifies final state

Metrics: Concurrent read latency, concurrent write conflicts, final state consistency
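
To illustrate how concurrent appends can produce write conflicts, here is a self-contained sketch that uses an in-memory note with an ETag-style version counter. InMemoryNote, ConflictError, and append_with_retry are hypothetical stand-ins; the sketch does not call the Nextcloud Notes API and only models the optimistic-concurrency pattern.

```python
import asyncio
from dataclasses import dataclass
from typing import Tuple


class ConflictError(Exception):
    """Raised when the caller's ETag no longer matches the stored note."""


@dataclass
class InMemoryNote:
    content: str = ""
    etag: int = 0

    async def read(self) -> Tuple[str, int]:
        await asyncio.sleep(0)  # yield so concurrent tasks interleave
        return self.content, self.etag

    async def write(self, content: str, etag: int) -> None:
        await asyncio.sleep(0)
        if etag != self.etag:
            raise ConflictError(f"stale etag {etag}, current {self.etag}")
        self.content, self.etag = content, self.etag + 1


async def append_with_retry(note: InMemoryNote, text: str, max_attempts: int) -> int:
    """Append text, re-reading on conflict. Returns the number of conflicts hit."""
    conflicts = 0
    for _ in range(max_attempts):
        content, etag = await note.read()
        try:
            await note.write(content + text, etag)
            return conflicts
        except ConflictError:
            conflicts += 1
    raise ConflictError(f"gave up after {max_attempts} attempts")


async def run_concurrent_appends(user_count: int) -> Tuple[InMemoryNote, int]:
    """Have every user append once, concurrently; return the note and total conflicts."""
    note = InMemoryNote()
    tasks = [
        append_with_retry(note, f"[user{i}]", max_attempts=user_count * 2)
        for i in range(user_count)
    ]
    counts = await asyncio.gather(*tasks)
    return note, sum(counts)
```

Calling asyncio.run(run_concurrent_appends(4)) leaves one entry per user in the note, and the returned conflict total corresponds to the "concurrent write conflicts" metric above.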

3. FileShareAndDownloadWorkflow

Scenario: Alice uploads a file, shares it with Bob, who then downloads it.

Steps:

  1. User A creates a file via WebDAV
  2. User A shares file with User B (read-only)
  3. User B lists their shares
  4. User B downloads the file

Metrics: Upload latency, share creation, download latency

4. MixedOAuthWorkload

Distribution:

  • 50% Baseline operations (individual user CRUD)
  • 30% Note sharing workflows
  • 15% Collaborative editing workflows
  • 5% File sharing workflows
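
One way to realize this distribution is weighted random selection. The sketch below uses the percentages listed above; WORKLOAD_WEIGHTS and pick_operation are hypothetical names, and the actual MixedOAuthWorkload may select operations differently.

```python
import random
from collections import Counter

# Percentages taken from the distribution above.
WORKLOAD_WEIGHTS = {
    "baseline": 50,
    "note_share": 30,
    "collaborative_edit": 15,
    "file_share": 5,
}


def pick_operation(rng: random.Random) -> str:
    """Pick the next operation type according to the configured weights."""
    names = list(WORKLOAD_WEIGHTS)
    weights = list(WORKLOAD_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=1)[0]


def sample_distribution(count: int, seed: int = 0) -> Counter:
    """Draw count operations with a seeded RNG and tally them."""
    rng = random.Random(seed)
    return Counter(pick_operation(rng) for _ in range(count))
```

Passing an explicit random.Random instance keeps runs reproducible when a seed is fixed, which helps when comparing two benchmark runs.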

Usage

Basic Usage

# 4 users, 60-second test with mixed workload
uv run python -m tests.load.oauth_benchmark --users 4 --duration 60

# 10 users, 5-minute test
uv run python -m tests.load.oauth_benchmark -u 10 -d 300

# Export results to JSON
uv run python -m tests.load.oauth_benchmark -u 5 -d 120 --output results.json

Advanced Options

# Sharing-focused workload
uv run python -m tests.load.oauth_benchmark --workload sharing -u 8 -d 180

# Collaborative editing workload
uv run python -m tests.load.oauth_benchmark --workload collaboration -u 6 -d 120

# Baseline operations only (no workflows)
uv run python -m tests.load.oauth_benchmark --workload baseline -u 10 -d 60

# Verbose logging for debugging
uv run python -m tests.load.oauth_benchmark -u 2 -d 30 --verbose

CLI Options

Option      Short   Default                    Description
--users     -u      2                          Number of concurrent users (max 4 with default config)
--duration  -d      30.0                       Test duration in seconds
--warmup    -w      5.0                        Warmup period before metrics collection (seconds)
--url       (none)  http://127.0.0.1:8001/mcp  MCP OAuth server URL
--output    -o      None                       JSON output file path
--workload  (none)  mixed                      Workload type: mixed, sharing, collaboration, baseline
--verbose   -v      False                      Enable verbose logging

Default Test Users

The framework includes 4 pre-configured test users:

Username  Display Name    Groups   Role
alice     Alice Anderson  owners   Owner - full permissions
bob       Bob Brown       viewers  Viewer - read-only
charlie   Charlie Chen    editors  Editor - read/write
diana     Diana Davis     (none)   No special permissions

Metrics Output

Console Report

================================================================================
OAUTH MULTI-USER BENCHMARK RESULTS
================================================================================

Duration: 120.45s
Total Users: 4
Total Workflows Executed: 247
Total Baseline Operations: 531

--------------------------------------------------------------------------------
WORKFLOW STATISTICS
--------------------------------------------------------------------------------
Workflow                         Total  Success     Rate        P50        P95
--------------------------------------------------------------------------------
note_share                          89       87    97.8%   0.2341s   0.4782s
collaborative_edit                  52       48    92.3%   0.5123s   0.9234s
file_share                          23       23   100.0%   0.3456s   0.6123s

--------------------------------------------------------------------------------
PER-USER STATISTICS
--------------------------------------------------------------------------------
User                  Total Ops    Success   Errors     Rate        P50
--------------------------------------------------------------------------------
alice                        234        229        5    97.9%   0.2456s
bob                          198        195        3    98.5%   0.2123s
charlie                      187        183        4    97.9%   0.2345s
diana                        159        157        2    98.7%   0.2234s

--------------------------------------------------------------------------------
BASELINE OPERATIONS
--------------------------------------------------------------------------------
Total Operations: 531
Success Rate: 98.1%
Latency: min=0.0234s, p50=0.1234s, p95=0.3456s, max=0.8123s
================================================================================

JSON Export

{
  "summary": {
    "duration": 120.45,
    "total_workflows": 247,
    "total_baseline_ops": 531,
    "total_users": 4
  },
  "workflows": {
    "note_share": {
      "total_executions": 89,
      "successful_executions": 87,
      "failed_executions": 2,
      "success_rate": 97.8,
      "latency": {
        "min": 0.1234,
        "max": 0.8765,
        "mean": 0.2891,
        "median": 0.2341,
        "p90": 0.4123,
        "p95": 0.4782,
        "p99": 0.7234
      },
      "step_latencies": {
        "create_note": {...},
        "share_note": {...},
        "list_shared_with_me": {...},
        "read_shared_note": {...}
      }
    }
  },
  "users": {
    "alice": {
      "total_operations": 234,
      "successful_operations": 229,
      "failed_operations": 5,
      "success_rate": 97.9,
      "latency": {...},
      "operations_breakdown": {...},
      "errors_breakdown": {...}
    }
  },
  "baseline": {...}
}
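
The latency fields in this export (min, max, mean, median, p90, p95, p99) could be derived from raw samples as in the sketch below. It uses the nearest-rank percentile definition, which is an assumption; OAuthBenchmarkMetrics may use a different interpolation method, and the function names here are hypothetical.

```python
import math
import statistics
from typing import Dict, List, Sequence


def percentile(sorted_values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty sequence."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize_latencies(samples: List[float]) -> Dict[str, float]:
    """Build a latency summary with the same keys as the JSON export."""
    ordered = sorted(samples)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }
```

With few samples the high percentiles collapse toward the maximum, so p99 is only meaningful for workflows that ran at least a few hundred times.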

Implementation Status

Completed Components

Framework:

  • OAuth user pool management with dynamic user creation
  • User session wrappers with automatic tracking
  • Workflow base classes and framework
  • 3 example workflows (note share, collaborative edit, file share)
  • Enhanced metrics with per-user and workflow tracking
  • CLI interface with multiple workload options
  • Comprehensive reporting (console + JSON)

OAuth Integration:

  • Playwright browser automation for OAuth login
  • OAuth callback server for auth code capture
  • Token exchange with OIDC provider
  • OAuth token injection into MCP sessions via Authorization headers
  • Cancel scope error handling for reliable cleanup
  • Dynamic user creation and deletion via Nextcloud Users API

Implementation Details: The benchmark now successfully:

  1. Creates Nextcloud users dynamically with unique passwords
  2. Acquires OAuth tokens via automated Playwright browser flows
  3. Creates MCP client sessions with proper Authorization: Bearer {token} headers
  4. Executes coordinated multi-user workflows
  5. Tracks per-user and per-workflow metrics
  6. Provides standalone cleanup utility for test users

Key Fix (oauth_pool.py:163-164):

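# Import needed by this excerpt (MCP Python SDK)
from mcp.client.streamable_http import streamablehttp_client

# Pass OAuth token as Authorization header
headers = {"Authorization": f"Bearer {profile.token}"}
streamable_context = streamablehttp_client(mcp_url, headers=headers)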

Creating Custom Workflows

Example: Permission Escalation Workflow

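import time

from tests.load.oauth_pool import UserSessionWrapper
# WorkflowResult is assumed to live next to Workflow; adjust if it is defined elsewhere
from tests.load.oauth_workloads import Workflow, WorkflowResult


class PermissionEscalationWorkflow(Workflow):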
    """Test sharing permission changes."""

    def __init__(self):
        super().__init__("permission_escalation")

    async def execute(self, users: list[UserSessionWrapper]) -> WorkflowResult:
        self.start_time = time.time()

        if len(users) < 2:
            return self._finish(False, error="Requires 2+ users")

        owner, collaborator = users[0], users[1]

        # Step 1: Owner creates note
        create_result = await self._execute_step(
            "create_note",
            owner,
            lambda: owner.call_tool("nc_notes_create_note", {...})
        )

        # Step 2: Share read-only
        await self._execute_step(
            "share_readonly",
            owner,
            lambda: owner.call_tool("nc_share_create", {
                "permissions": 1  # Read-only
            })
        )

        # Step 3: Upgrade to edit permissions
        await self._execute_step(
            "upgrade_permissions",
            owner,
            lambda: owner.call_tool("nc_share_update", {
                "permissions": 15  # Read+update+create+delete
            })
        )

        # Step 4: Collaborator edits
        await self._execute_step(
            "collaborator_edit",
            collaborator,
            lambda: collaborator.call_tool("nc_notes_update_note", {...})
        )

        return self._finish(success=True)

Registering Custom Workflows

# In oauth_workloads.py
class MixedOAuthWorkload:
    def __init__(self, users: list[UserSessionWrapper]):
        self.users = users
        self.workflows = {
            "note_share": NoteShareWorkflow(),
            "collaborative_edit": CollaborativeEditWorkflow(),
            "file_share": FileShareAndDownloadWorkflow(),
            "permission_escalation": PermissionEscalationWorkflow(),  # Add your workflow
        }

Performance Expectations

Baseline Performance (basic auth, from existing benchmarks)

  • Throughput: 50-200 RPS for mixed workload
  • Latency: p50 <100ms, p95 <500ms, p99 <1000ms

OAuth Multi-User Expectations

  • Lower throughput: expect roughly 30-60% of baseline throughput, due to:
    • OAuth token validation overhead
    • Cross-user synchronization delays
    • Workflow coordination overhead
  • Higher p99 latency: Due to workflow step dependencies
  • Focus: End-to-end workflow completion time more important than raw RPS

Common Bottlenecks

  1. OAuth token validation: Per-request overhead
  2. Share propagation: Time for shares to become visible to recipients
  3. Concurrent edit conflicts: ETags and conflict resolution
  4. Permission checks: Cross-user access validation

Best Practices

  1. Start Small: Begin with 2-3 users to validate workflows
  2. Monitor Errors: Watch for permission errors and conflicts
  3. Adjust Delays: Tune sleep delays between operations based on server response
  4. Profile Workflows: Use step latencies to identify bottlenecks
  5. Export Results: Always export to JSON for historical comparison
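
For historical comparison, two exported JSON reports can be diffed on a single statistic. The sketch below flags workflows whose p95 latency grew by more than a tolerance; it relies only on the workflows → name → latency → p95 structure shown in the JSON export, and find_p95_regressions, load_report, and the 10% default tolerance are hypothetical choices.

```python
import json
from typing import Dict, Tuple


def load_report(path: str) -> dict:
    """Read a benchmark report previously written with --output."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_p95_regressions(
    baseline: dict, current: dict, tolerance: float = 0.10
) -> Dict[str, Tuple[float, float]]:
    """Return {workflow: (old_p95, new_p95)} where p95 grew by more than tolerance."""
    regressions: Dict[str, Tuple[float, float]] = {}
    old_workflows = baseline.get("workflows", {})
    for name, stats in current.get("workflows", {}).items():
        old = old_workflows.get(name)
        if old is None:
            continue  # workflow is new, nothing to compare against
        old_p95 = old["latency"]["p95"]
        new_p95 = stats["latency"]["p95"]
        if new_p95 > old_p95 * (1 + tolerance):
            regressions[name] = (old_p95, new_p95)
    return regressions
```

A typical call would be find_p95_regressions(load_report("before.json"), load_report("after.json")), where both file names are placeholders for reports from two runs with the same user count and duration.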

Performance Optimizations

Concurrent User Creation

The benchmark creates and authenticates users concurrently for maximum performance:

Step 5: User Creation & OAuth Authentication

  • All N users are created in parallel using asyncio.gather()
  • Each user runs through the full OAuth flow simultaneously
  • Multiple Playwright browser contexts operate independently

Step 6: MCP Session Creation

  • All user sessions are created concurrently
  • OAuth tokens passed as Authorization headers to each session

Performance Impact:

  • Sequential (old): ~10-12s per user → 40-48s for 4 users
  • Concurrent (new): ~12-15s total for 4 users (roughly 3-4x faster)

Example output showing concurrent execution:

Step 5/6: Creating 4 users and acquiring OAuth tokens...
(Running concurrently for faster setup)

  [1/4] Creating user 'loadtest_user_1'...
  [2/4] Creating user 'loadtest_user_2'...
  [3/4] Creating user 'loadtest_user_3'...
  [4/4] Creating user 'loadtest_user_4'...
  ✓ User 'loadtest_user_4' authenticated
  ✓ User 'loadtest_user_2' authenticated
  ✓ User 'loadtest_user_1' authenticated
  ✓ User 'loadtest_user_3' authenticated

✓ Successfully created and authenticated 4 users

Implementation (oauth_benchmark.py:402-437):

# Create tasks for all users
tasks = [
    create_user_task(i, browser, callback_server.auth_states)
    for i in range(num_users)
]
# Run all concurrently
results = await asyncio.gather(*tasks, return_exceptions=True)
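
Because return_exceptions=True makes asyncio.gather return exceptions as ordinary list entries instead of raising them, the results need to be split into successes and failures afterwards. A minimal sketch of that step, using a hypothetical create_user coroutine in place of create_user_task (user index 2 fails on purpose to show the handling):

```python
import asyncio
from typing import Dict, List, Tuple


async def create_user(index: int) -> str:
    """Hypothetical stand-in for create_user_task; index 2 simulates a failed login."""
    await asyncio.sleep(0.01)
    if index == 2:
        raise RuntimeError("OAuth login failed")
    return f"loadtest_user_{index + 1}"


async def create_all(num_users: int) -> Tuple[List[str], Dict[int, BaseException]]:
    """Create all users concurrently and separate successes from failures."""
    results = await asyncio.gather(
        *(create_user(i) for i in range(num_users)),
        return_exceptions=True,
    )
    created = [r for r in results if not isinstance(r, BaseException)]
    failed = {i: r for i, r in enumerate(results) if isinstance(r, BaseException)}
    return created, failed
```

asyncio.gather preserves input order in its result list, so the index of a failed entry identifies which user could not be created even though the tasks finish in arbitrary order.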

Cleanup

Important: Due to asyncio scoping issues with the MCP client library, automatic cleanup in the benchmark's finally block may not execute reliably. Always use the cleanup utility after running benchmarks.

Use the cleanup utility to remove test users:

# Dry run - see what would be deleted
uv run python -m tests.load.cleanup_loadtest_users --dry-run

# Delete all loadtest users
uv run python -m tests.load.cleanup_loadtest_users

# Delete users with custom prefix
uv run python -m tests.load.cleanup_loadtest_users --prefix mytest
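
The selection logic behind --prefix and --dry-run can be pictured as follows. This is a sketch under assumptions, not the utility's code: the default prefix "loadtest" is inferred from the user names in the sample output, and select_test_users, cleanup_users, and the injected delete_user callable are hypothetical.

```python
from typing import Callable, Iterable, List


def select_test_users(usernames: Iterable[str], prefix: str = "loadtest") -> List[str]:
    """Return, in sorted order, only the usernames that start with the prefix."""
    return sorted(name for name in usernames if name.startswith(prefix))


def cleanup_users(
    usernames: Iterable[str],
    delete_user: Callable[[str], None],
    prefix: str = "loadtest",
    dry_run: bool = False,
) -> List[str]:
    """Delete matching users, or only report them when dry_run is True."""
    targets = select_test_users(usernames, prefix)
    for name in targets:
        if dry_run:
            print(f"[dry-run] would delete {name}")
        else:
            delete_user(name)  # in practice this would call the Nextcloud Users API
    return targets
```

Filtering strictly by prefix keeps regular accounts such as admin untouched, which is why a distinctive prefix matters when the benchmark runs against a shared instance.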

Disable Automatic Cleanup

To keep test users after the benchmark for inspection:

uv run python -m tests.load.oauth_benchmark --users 2 --no-cleanup

Troubleshooting

Leftover Test Users

Symptom: Test users remain in Nextcloud after benchmark crashes

Solution: Run the cleanup utility:

uv run python -m tests.load.cleanup_loadtest_users

"User X not in pool" Error

  • Ensure user count doesn't exceed configured limits
  • Check that user creation succeeded in previous steps

High Error Rates

  • Increase delay between operations (await asyncio.sleep() in worker)
  • Check OAuth token validity
  • Verify MCP OAuth server is running and accessible (port 8001)
  • Rebuild mcp-oauth container after code changes: docker-compose up --build -d mcp-oauth

Workflows Failing

  • Check step-by-step latencies to identify failing steps
  • Verify users have correct permissions
  • Review server logs for errors

MCP Session Creation Fails (401 Unauthorized)

Status: Fixed. OAuth tokens are now passed as Authorization headers when MCP sessions are created.

If you still see 401 errors:

  • Rebuild the mcp-oauth container: docker-compose up --build -d mcp-oauth
  • Verify OAuth tokens are being acquired successfully in verbose mode
  • Check that the token hasn't expired (use shorter test durations during troubleshooting)
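
If the access token happens to be a JWT, its remaining lifetime can be read from the exp claim without contacting the server. This is a sketch under that assumption only: the OIDC provider may issue opaque tokens instead, in which case the function returns None. jwt_seconds_remaining is a hypothetical helper and does not verify the signature, so it is suitable for debugging only.

```python
import base64
import json
import time
from typing import Optional


def jwt_seconds_remaining(token: str, now: Optional[float] = None) -> Optional[float]:
    """Return seconds until the token's exp claim, or None if it cannot be read.

    The signature is NOT verified; use this for troubleshooting only.
    """
    parts = token.split(".")
    if len(parts) != 3:
        return None  # not a JWT, possibly an opaque token
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore stripped padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        return None  # bad base64, bad UTF-8, or bad JSON
    if not isinstance(claims, dict) or not isinstance(claims.get("exp"), (int, float)):
        return None
    current = time.time() if now is None else now
    return claims["exp"] - current
```

A negative return value means the token has already expired; a value smaller than the planned --duration suggests the run will start seeing 401 responses partway through.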

Future Enhancements

  • Dynamic user creation (beyond 4 default users) - COMPLETED
  • OAuth token injection for MCP sessions - COMPLETED
  • Cancel scope error handling - COMPLETED
  • Concurrent user creation and authentication - COMPLETED (3-4x speedup!)
  • Workflow templates for common patterns
  • Real-time dashboard for live monitoring
  • Historical comparison and regression detection
  • Load ramping (gradual user increase)
  • Geographic distribution simulation (latency injection)
  • Improve cleanup reliability in finally block