nextcloud-mcp-server/docs/testing-client-sessions-architecture.md

# Testing Client Sessions Architecture

## Overview

This document compares different approaches to managing MCP client sessions in integration tests, addressing the fundamental incompatibility between pytest-asyncio's fixture management and anyio's structured concurrency requirements.

## The Problem

When using pytest-asyncio with anyio-based libraries (like the MCP Python SDK), session-scoped async generator fixtures encounter a fundamental issue:

1. **pytest-asyncio** runs fixture teardown in a **new asyncio task** using `runner.run()`
2. **anyio** requires that cancel scopes be entered and exited in the **same task**
3. This causes `RuntimeError: Attempted to exit cancel scope in a different task than it was entered in`

This is a **known limitation** documented in the anyio project and is not a bug in either pytest-asyncio or anyio, but rather an inherent incompatibility between their design philosophies.

## Solution Comparison

### Solution 1: Native Async Context Managers with Surgical Exception Handling ✅ **IMPLEMENTED**

**Approach**: Use native `async with` statements for clean code structure, but add targeted exception handling at the pytest fixture level to handle the expected teardown errors.

**Implementation**:

```python
async def create_mcp_client_session(
    url: str,
    token: str | None = None,
    client_name: str = "MCP",
) -> AsyncGenerator[ClientSession, Any]:
    """Uses native async context managers for clean LIFO cleanup."""
    headers = {"Authorization": f"Bearer {token}"} if token else None

    async with streamablehttp_client(url, headers=headers) as (read_stream, write_stream, _):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            yield session

@pytest.fixture(scope="session")
async def nc_mcp_client() -> AsyncGenerator[ClientSession, Any]:
    """Fixture with surgical exception handling for pytest-asyncio incompatibility."""
    try:
        async for session in create_mcp_client_session(
            url="http://localhost:8000/mcp", client_name="Basic MCP"
        ):
            yield session
    except RuntimeError as e:
        # Only catch the specific expected error during pytest teardown
        if "cancel scope" in str(e) and "different task" in str(e):
            logger.debug(f"Ignoring expected pytest-asyncio teardown issue: {e}")
        else:
            # Unexpected RuntimeError - re-raise to fail the test
            raise
```

**Pros**:
- ✅ Clean, idiomatic code using native Python context managers
- ✅ Exception handling is surgical - only catches the specific expected error
- ✅ Unexpected errors still propagate and fail tests
- ✅ Can use session-scoped fixtures for performance
- ✅ Easy to understand and maintain
- ✅ Minimal code changes from original implementation
- ✅ No external dependencies required

**Cons**:
- ⚠️ Still requires exception suppression (though targeted)
- ⚠️ String-based exception matching is somewhat fragile
- ⚠️ Must apply the pattern to each session-scoped fixture
- ⚠️ Doesn't solve the root cause

**Verdict**: **Recommended** - Best balance of code clarity, maintainability, and pragmatism.

---

### Solution 2: Task-Isolated Fixtures

**Approach**: Run each fixture's client session in an isolated anyio task group, allowing independent cleanup without cross-fixture interference.

**Implementation**:

```python
@pytest.fixture(scope="session")
async def nc_mcp_client() -> AsyncGenerator[ClientSession, Any]:
    """Fixture with task isolation for clean teardown."""
    import anyio

    session_holder = {"session": None}

    async def create_and_hold_session():
        """Runs in isolated task - creates session and keeps it alive."""
        async with streamablehttp_client("http://localhost:8000/mcp") as (read_stream, write_stream, _):
            async with ClientSession(read_stream, write_stream) as session:
                await session.initialize()
                session_holder["session"] = session

                # Keep session alive until cancelled
                try:
                    await anyio.sleep_forever()
                except anyio.get_cancelled_exc_class():
                    pass  # Expected cancellation

    async with anyio.create_task_group() as tg:
        tg.start_soon(create_and_hold_session)

        # Wait for session to be ready
        while session_holder["session"] is None:
            await anyio.sleep(0.1)

        yield session_holder["session"]

        # Task group cancellation ensures clean LIFO cleanup
        tg.cancel_scope.cancel()
```

**Pros**:
- ✅ No exception suppression needed
- ✅ Each fixture has its own isolated task scope
- ✅ More theoretically correct approach
- ✅ Can use session-scoped fixtures

**Cons**:
- ❌ Significantly more complex code
- ❌ Harder to understand for developers unfamiliar with anyio
- ❌ Requires understanding of task groups and cancel scopes
- ❌ More boilerplate per fixture
- ❌ Still doesn't solve the fundamental pytest-asyncio incompatibility
- ❌ Polling for session readiness is inelegant
- ❌ Higher cognitive overhead for maintenance

**Verdict**: **Not Recommended** - Complexity outweighs benefits. Consider only if exception handling is completely unacceptable.

---

### Solution 3: Function-Scoped Fixtures with Nested Context Managers

**Approach**: Change fixtures to function scope and rely on Python's context manager nesting for guaranteed LIFO cleanup.

**Implementation**:

```python
@pytest.fixture(scope="function")  # Changed from session
async def nc_mcp_client() -> AsyncGenerator[ClientSession, Any]:
    """Function-scoped fixture with natural LIFO cleanup."""
    async with streamablehttp_client("http://localhost:8000/mcp") as (read_stream, write_stream, _):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            yield session

# For tests needing multiple clients:
@pytest.fixture(scope="function")
async def multi_mcp_clients() -> AsyncGenerator[tuple[ClientSession, ClientSession], Any]:
    """Multiple clients with guaranteed LIFO cleanup through nesting."""
    async with streamablehttp_client("http://localhost:8000/mcp") as (read1, write1, _):
        async with ClientSession(read1, write1) as session1:
            await session1.initialize()

            async with streamablehttp_client("http://localhost:8001/mcp") as (read2, write2, _):
                async with ClientSession(read2, write2) as session2:
                    await session2.initialize()
                    yield session1, session2
    # Cleanup: session2 -> stream2 -> session1 -> stream1 (LIFO guaranteed)
```

**Pros**:
- ✅ No exception handling needed
- ✅ Simplest to understand
- ✅ Natural LIFO cleanup through Python's context managers
- ✅ Each test gets fresh clients (better isolation)
- ✅ No workarounds or hacks required

**Cons**:
- ❌ Significantly slower tests (new clients per test)
- ❌ Cannot share client state across tests
- ❌ More resource intensive
- ❌ Higher overhead for test suite execution
- ❌ May not be practical for expensive fixtures (e.g., OAuth tokens)
- ❌ Nested context managers become unwieldy with many clients

**Verdict**: **Good Alternative** - Consider for specific fixtures where session scope isn't critical, or for new test files where performance isn't a concern.

---

### Solution 4: Use pytest-trio Instead of pytest-asyncio (Future)

**Approach**: Replace pytest-asyncio with pytest-trio, which was designed with structured concurrency in mind.

**Implementation**:

```python
# pyproject.toml
[tool.pytest.ini_options]
# Remove: asyncio_mode = "auto"
# Add: trio_mode = "auto"

# Fixtures work naturally with trio
@pytest.fixture(scope="session")
async def nc_mcp_client() -> AsyncGenerator[ClientSession, Any]:
    async with streamablehttp_client("http://localhost:8000/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            yield session
```

**Pros**:
- ✅ No workarounds needed
- ✅ Designed for structured concurrency
- ✅ Theoretically cleanest solution
- ✅ Can use session-scoped fixtures naturally

**Cons**:
- ❌ Requires switching from asyncio to trio backend
- ❌ Major refactoring required
- ❌ May break existing code that assumes asyncio
- ❌ Dependency changes throughout project
- ❌ Team needs to learn trio ecosystem
- ❌ Less ecosystem support than asyncio

**Verdict**: **Not Practical** - Too disruptive for existing projects. Consider only for greenfield projects or major rewrites.

---

## Decision Matrix

| Solution | Code Clarity | Maintenance | Performance | Safety | Effort |
|----------|--------------|-------------|-------------|--------|--------|
| **Solution 1** (Implemented) | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Solution 2 (Task-Isolated) | ⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ |
| Solution 3 (Function-Scoped) | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Solution 4 (pytest-trio) | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐ |

## Implementation Details

### What Changed in Solution 1

1. **`create_mcp_client_session` function** (conftest.py:61-110):
   - Replaced manual `__aenter__`/`__aexit__` calls with native `async with` statements
   - Removed blanket exception suppression from cleanup logic
   - Added clear documentation about LIFO cleanup order
   - Simplified from ~60 lines to ~40 lines

2. **Session-scoped MCP client fixtures** (conftest.py:148-1269):
   - Added targeted exception handling wrapper
   - Only catches specific "cancel scope" + "different task" RuntimeError
   - All other exceptions propagate normally
   - Applied to: `nc_mcp_client`, `nc_mcp_oauth_client`, `alice_mcp_client`, `bob_mcp_client`, `charlie_mcp_client`, `diana_mcp_client`

3. **Documentation**:
   - Added comprehensive docstrings explaining the workaround
   - Referenced MCP SDK issue #577 for context
   - Documented why this is necessary and not a bug

### Benefits of This Implementation

1. **Clean Core Logic**: The `create_mcp_client_session` function is now clean, idiomatic Python with no workarounds
2. **Isolated Workaround**: Exception handling is confined to pytest fixture level where the issue actually occurs
3. **Surgical Exception Handling**: Only catches the specific expected error, not all RuntimeErrors
4. **Performance**: Maintains session-scoped fixtures for fast test execution
5. **Maintainability**: Easy to understand and modify
6. **Safety**: Real errors still cause test failures

## Testing Results

All tests pass cleanly with the implementation:

```bash
$ uv run pytest tests/server/test_mcp.py -v
============================================= test session starts ==============================================
tests/server/test_mcp.py::test_mcp_connectivity PASSED                                            [ 16%]
tests/server/test_mcp.py::test_mcp_notes_crud_workflow PASSED                                     [ 33%]
tests/server/test_mcp.py::test_mcp_notes_etag_conflict PASSED                                     [ 50%]
tests/server/test_mcp.py::test_mcp_webdav_workflow PASSED                                         [ 66%]
tests/server/test_mcp.py::test_mcp_resources_access PASSED                                        [ 83%]
tests/server/test_mcp.py::test_mcp_calendar_workflow PASSED                                       [100%]
============================================== 6 passed in 39.52s ==============================================
```

## Recommendations

### For This Project: Solution 1 ✅

The implemented solution (Solution 1) is the best fit because:
- Minimal disruption to existing tests
- Clean, maintainable code
- Good performance with session-scoped fixtures
- Targeted exception handling that doesn't hide real errors

### For New Test Files: Consider Solution 3

For new test files where performance isn't critical, consider using function-scoped fixtures (Solution 3):
- No workarounds needed
- Perfect code clarity
- Better test isolation

### For Greenfield Projects: Consider Solution 4

For new projects starting from scratch, consider pytest-trio instead of pytest-asyncio:
- Native structured concurrency support
- No workarounds needed
- Better alignment with modern async Python patterns

## Related Resources

- [MCP Python SDK Issue #577](https://github.com/modelcontextprotocol/python-sdk/issues/577) - Original issue report
- [Anyio Issue #345](https://github.com/agronholm/anyio/issues/345) - Discussion of fixture limitations
- [Nextcloud MCP Note 378555](nextcloud://notes/378555) - Detailed investigation notes
- pytest-asyncio documentation: https://pytest-asyncio.readthedocs.io/
- anyio structured concurrency guide: https://anyio.readthedocs.io/en/stable/basics.html

## Appendix: Why Can't This Be Fixed Upstream?

The incompatibility cannot be "fixed" in either pytest-asyncio or anyio without breaking their core design:

1. **pytest-asyncio** needs to manage fixture lifecycle across different scopes, requiring separate task creation for cleanup
2. **anyio** enforces structured concurrency guarantees by requiring same-task cancel scope entry/exit
3. These requirements are fundamentally incompatible

The maintainers of both projects are aware of this issue, and it's considered an acceptable trade-off given their respective design goals. The recommended approach is to handle it at the application level, as we've done here.