5b484c9226
Refactored LLM provider infrastructure to support sustainable additions of new providers with both embedding and text generation capabilities.
## Major Changes
### Unified Provider Architecture (ADR-015)
- Created `nextcloud_mcp_server/providers/` with unified Provider ABC
- Providers now support optional capabilities (embeddings and/or generation)
- Auto-detection registry with priority: Bedrock → Ollama → Simple
- Backward compatible - existing code continues to work
### New Providers
- **BedrockProvider**: Full Amazon Bedrock integration
- Embeddings: Titan Embed, Cohere Embed models
- Generation: Claude, Llama, Titan Text, Mistral models
- Model-specific request/response handling
- AWS credential chain integration
- **OllamaProvider**: Migrated with both capabilities support
- **AnthropicProvider**: Moved from test code to production providers
- **SimpleProvider**: Migrated in-memory fallback provider
### Breaking Changes
None - full backward compatibility maintained:
- `embedding.get_embedding_service()` still works
- RAG evaluation tests updated to use unified providers
- All existing tests pass (127 unit tests)
### Testing
- Added 9 comprehensive Bedrock unit tests with mocked boto3
- All existing unit tests pass
- Type checking (ty) and linting (ruff) pass
- Verified backward compatibility
### Documentation
- `docs/ADR-015-unified-provider-architecture.md`: Comprehensive ADR
- `docs/bedrock-setup.md`: AWS setup guide with IAM permissions
- `CLAUDE.md`: Updated with provider architecture section
### Dependencies
- Added `boto3>=1.35.0` to dev dependencies (optional)
## Environment Variables
### Bedrock
- `AWS_REGION`: AWS region (e.g., "us-east-1")
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings
- `BEDROCK_GENERATION_MODEL`: Model ID for generation
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`: Optional credentials
### Ollama
- `OLLAMA_BASE_URL`: API URL
- `OLLAMA_EMBEDDING_MODEL`: Embedding model (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL`: Generation model
## AWS Bedrock Permissions Required
Minimal IAM policy:
```json
{
"Effect": "Allow",
"Action": ["bedrock:InvokeModel"],
"Resource": ["arn:aws:bedrock:*::foundation-model/*"]
}
```
See `docs/bedrock-setup.md` for detailed setup instructions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
90 lines
3.3 KiB
Python
90 lines
3.3 KiB
Python
"""LLM provider abstraction for RAG evaluation.
|
|
|
|
DEPRECATED: This module is maintained for backward compatibility with RAG evaluation tests.
|
|
New code should use nextcloud_mcp_server.providers directly.
|
|
|
|
Supports Ollama (local), Anthropic (cloud), and Bedrock (AWS) providers for both ground truth
|
|
generation and evaluation.
|
|
"""
|
|
|
|
import os
|
|
|
|
from nextcloud_mcp_server.providers import (
|
|
AnthropicProvider,
|
|
BedrockProvider,
|
|
OllamaProvider,
|
|
Provider,
|
|
)
|
|
|
|
|
|
def create_llm_provider(
|
|
provider: str | None = None,
|
|
ollama_base_url: str | None = None,
|
|
ollama_model: str | None = None,
|
|
anthropic_api_key: str | None = None,
|
|
anthropic_model: str | None = None,
|
|
bedrock_region: str | None = None,
|
|
bedrock_model: str | None = None,
|
|
) -> Provider:
|
|
"""Create an LLM provider from environment variables or arguments.
|
|
|
|
Args:
|
|
provider: Provider type ('ollama', 'anthropic', or 'bedrock').
|
|
Defaults to RAG_EVAL_PROVIDER env var or 'ollama'
|
|
ollama_base_url: Ollama base URL. Defaults to RAG_EVAL_OLLAMA_BASE_URL or 'http://localhost:11434'
|
|
ollama_model: Ollama model. Defaults to RAG_EVAL_OLLAMA_MODEL or 'llama3.2:1b'
|
|
anthropic_api_key: Anthropic API key. Defaults to RAG_EVAL_ANTHROPIC_API_KEY env var
|
|
anthropic_model: Anthropic model. Defaults to RAG_EVAL_ANTHROPIC_MODEL or 'claude-3-5-sonnet-20241022'
|
|
bedrock_region: AWS region. Defaults to RAG_EVAL_BEDROCK_REGION or AWS_REGION env var
|
|
bedrock_model: Bedrock model ID. Defaults to RAG_EVAL_BEDROCK_MODEL or
|
|
'anthropic.claude-3-sonnet-20240229-v1:0'
|
|
|
|
Returns:
|
|
Provider instance
|
|
|
|
Raises:
|
|
ValueError: If provider is invalid or required credentials are missing
|
|
"""
|
|
# Get provider from args or env
|
|
provider = provider or os.environ.get("RAG_EVAL_PROVIDER", "ollama")
|
|
|
|
if provider == "ollama":
|
|
# Try RAG_EVAL_OLLAMA_BASE_URL, then OLLAMA_HOST, then default
|
|
base_url = (
|
|
ollama_base_url
|
|
or os.environ.get("RAG_EVAL_OLLAMA_BASE_URL")
|
|
or os.environ.get("OLLAMA_HOST")
|
|
or "http://localhost:11434"
|
|
)
|
|
model = ollama_model or os.environ.get("RAG_EVAL_OLLAMA_MODEL", "llama3.2:1b")
|
|
return OllamaProvider(
|
|
base_url=base_url, embedding_model=None, generation_model=model
|
|
)
|
|
|
|
elif provider == "anthropic":
|
|
api_key = anthropic_api_key or os.environ.get("RAG_EVAL_ANTHROPIC_API_KEY")
|
|
if not api_key:
|
|
raise ValueError(
|
|
"Anthropic API key required. Set RAG_EVAL_ANTHROPIC_API_KEY environment variable."
|
|
)
|
|
model = anthropic_model or os.environ.get(
|
|
"RAG_EVAL_ANTHROPIC_MODEL", "claude-3-5-sonnet-20241022"
|
|
)
|
|
return AnthropicProvider(api_key=api_key, model=model)
|
|
|
|
elif provider == "bedrock":
|
|
region = bedrock_region or os.environ.get(
|
|
"RAG_EVAL_BEDROCK_REGION", os.environ.get("AWS_REGION", "us-east-1")
|
|
)
|
|
model = bedrock_model or os.environ.get(
|
|
"RAG_EVAL_BEDROCK_MODEL", "anthropic.claude-3-sonnet-20240229-v1:0"
|
|
)
|
|
return BedrockProvider(
|
|
region_name=region, embedding_model=None, generation_model=model
|
|
)
|
|
|
|
else:
|
|
raise ValueError(
|
|
f"Invalid provider: {provider}. Must be 'ollama', 'anthropic', or 'bedrock'."
|
|
)
|