Refactored the LLM provider infrastructure so that new providers can be added sustainably, each with embedding capabilities, text generation capabilities, or both.
## Major Changes
### Unified Provider Architecture (ADR-015)
- Created `nextcloud_mcp_server/providers/` with unified Provider ABC
- Providers now support optional capabilities (embeddings and/or generation)
- Auto-detection registry with priority: Bedrock → Ollama → Simple
- Backward compatible - existing code continues to work
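
As a rough illustration of the architecture described above, the sketch below shows a base class with optional capabilities and a priority-ordered detection function. The names `Provider` and `detect_provider_name`, and the exact conditions used to decide that a provider is configured, are assumptions made for this example; the actual code in `nextcloud_mcp_server/providers/` may differ.

```python
import os
from abc import ABC, abstractmethod
from typing import Mapping, Optional


class Provider(ABC):
    """Hypothetical base class: a provider may offer either or both capabilities."""

    @property
    @abstractmethod
    def supports_embeddings(self) -> bool: ...

    @property
    @abstractmethod
    def supports_generation(self) -> bool: ...

    async def embed(self, text: str) -> list[float]:
        # Default for providers that do not implement embeddings.
        raise NotImplementedError(f"{type(self).__name__} has no embedding support")

    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        # Default for providers that do not implement generation.
        raise NotImplementedError(f"{type(self).__name__} has no generation support")


def detect_provider_name(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the first configured provider in priority order: Bedrock, Ollama, Simple.

    The configuration checks are assumptions based on the documented
    environment variables, not the registry's confirmed logic.
    """
    env = os.environ if env is None else env
    has_bedrock_model = env.get("BEDROCK_EMBEDDING_MODEL") or env.get(
        "BEDROCK_GENERATION_MODEL"
    )
    if env.get("AWS_REGION") and has_bedrock_model:
        return "bedrock"
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"
    return "simple"  # in-memory fallback
```

Passing the environment as a mapping keeps the detection logic testable without modifying `os.environ`.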
### New Providers
- **BedrockProvider**: Full Amazon Bedrock integration
  - Embeddings: Titan Embed, Cohere Embed models
  - Generation: Claude, Llama, Titan Text, Mistral models
  - Model-specific request/response handling
  - AWS credential chain integration
- **OllamaProvider**: Migrated; supports both embeddings and generation
- **AnthropicProvider**: Moved from test code to production providers
- **SimpleProvider**: Migrated in-memory fallback provider
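
To make "model-specific request/response handling" concrete, here is a minimal sketch of how the request body and response parsing can differ between the Titan and Cohere embedding families. The helper names are hypothetical, and the JSON field names reflect my understanding of the Bedrock model parameter documentation; they should be checked against the current AWS docs before being relied on.

```python
import json
from typing import Any


def build_embedding_request(model_id: str, text: str) -> str:
    """Serialize the InvokeModel body for the given embedding model family."""
    if model_id.startswith("amazon.titan-embed"):
        # Titan embedding models take a single input string.
        return json.dumps({"inputText": text})
    if model_id.startswith("cohere.embed"):
        # Cohere embedding models take a list of texts plus an input type.
        return json.dumps({"texts": [text], "input_type": "search_document"})
    raise ValueError(f"Unsupported embedding model: {model_id}")


def parse_embedding_response(model_id: str, payload: dict[str, Any]) -> list[float]:
    """Extract a single embedding vector from the decoded response body."""
    if model_id.startswith("amazon.titan-embed"):
        return payload["embedding"]
    if model_id.startswith("cohere.embed"):
        # One vector per input text; this sketch sends exactly one text.
        return payload["embeddings"][0]
    raise ValueError(f"Unsupported embedding model: {model_id}")
```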
### Breaking Changes
None - full backward compatibility maintained:
- `embedding.get_embedding_service()` still works
- RAG evaluation tests updated to use unified providers
- All existing tests pass (127 unit tests)
### Testing
- Added 9 comprehensive Bedrock unit tests with mocked boto3
- All existing unit tests pass
- Type checking (ty) and linting (ruff) pass
- Verified backward compatibility
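
For readers unfamiliar with the approach, the following sketch shows what a Bedrock unit test with a mocked client can look like. It is not one of the 9 tests added in this change; `embed_with_titan` and `make_fake_client` are hypothetical helpers, and a `MagicMock` stands in for `boto3.client("bedrock-runtime")` so no AWS call is made.

```python
import io
import json
from unittest.mock import MagicMock


def embed_with_titan(client, model_id: str, text: str) -> list[float]:
    """Hypothetical helper: call InvokeModel and read a Titan embedding."""
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    # The real client returns a streaming body; read and decode it.
    return json.loads(response["body"].read())["embedding"]


def make_fake_client(embedding: list[float]) -> MagicMock:
    """Build a stand-in for the bedrock-runtime client that never touches AWS."""
    client = MagicMock()
    client.invoke_model.return_value = {
        "body": io.BytesIO(json.dumps({"embedding": embedding}).encode("utf-8"))
    }
    return client


def test_embed_with_titan_uses_mocked_client() -> None:
    client = make_fake_client([0.1, 0.2, 0.3])
    result = embed_with_titan(client, "amazon.titan-embed-text-v2:0", "hello")
    assert result == [0.1, 0.2, 0.3]
    client.invoke_model.assert_called_once()
    assert client.invoke_model.call_args.kwargs["modelId"] == "amazon.titan-embed-text-v2:0"
```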
### Documentation
- `docs/ADR-015-unified-provider-architecture.md`: Comprehensive ADR
- `docs/bedrock-setup.md`: AWS setup guide with IAM permissions
- `CLAUDE.md`: Updated with provider architecture section
### Dependencies
- Added `boto3>=1.35.0` to dev dependencies (optional)
## Environment Variables
### Bedrock
- `AWS_REGION`: AWS region (e.g., "us-east-1")
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings
- `BEDROCK_GENERATION_MODEL`: Model ID for generation
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`: Optional credentials
### Ollama
- `OLLAMA_BASE_URL`: API URL
- `OLLAMA_EMBEDDING_MODEL`: Embedding model (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL`: Generation model
## AWS Bedrock Permissions Required
Minimal IAM policy:
```json
{
  "Effect": "Allow",
  "Action": ["bedrock:InvokeModel"],
  "Resource": ["arn:aws:bedrock:*::foundation-model/*"]
}
```
See `docs/bedrock-setup.md` for detailed setup instructions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Amazon Bedrock Setup Guide
This guide covers how to configure the Nextcloud MCP Server to use Amazon Bedrock for embeddings and text generation.
Prerequisites
- AWS Account with access to Amazon Bedrock
- boto3 library installed: `pip install boto3` or `uv sync --group dev`
- Model Access: Request access to models in the AWS Bedrock console
Required AWS Permissions
IAM Policy for Bedrock Access
The AWS IAM user or role needs the following permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockInvokeModels",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*"
      ]
    }
  ]
}
Minimal Permissions (Production)
For production deployments, restrict to specific models:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockEmbeddings",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
      ]
    },
    {
      "Sid": "BedrockGeneration",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
      ]
    }
  ]
}
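
If the set of models changes often, a restricted policy like the one above can be generated rather than edited by hand. The sketch below is one possible way to do that; `build_bedrock_policy` and the `Sid` value are hypothetical names chosen for this example, and the ARN format is copied from the policies shown in this guide.

```python
import json


def build_bedrock_policy(region: str, model_ids: list[str]) -> dict:
    """Build an IAM policy that allows InvokeModel on the listed models only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "BedrockInvokeSelectedModels",  # hypothetical Sid
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel"],
                "Resource": [
                    f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
                    for model_id in model_ids
                ],
            }
        ],
    }


if __name__ == "__main__":
    policy = build_bedrock_policy(
        "us-east-1",
        [
            "amazon.titan-embed-text-v2:0",
            "anthropic.claude-3-sonnet-20240229-v1:0",
        ],
    )
    print(json.dumps(policy, indent=2))
```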
Additional Permissions (Optional)
For advanced use cases:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockListModels",
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:GetFoundationModel"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockAsyncInvoke",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModelAsync",
        "bedrock:GetAsyncInvoke",
        "bedrock:ListAsyncInvokes"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*"
      ]
    }
  ]
}
Model Access
Before using Bedrock models, you must request access in the AWS Console:
1. Navigate to Amazon Bedrock → Model access
2. Click Manage model access
3. Select models you want to use:
   - Embeddings: Amazon Titan Embed Text, Cohere Embed
   - Text Generation: Anthropic Claude, Meta Llama, Amazon Titan Text
4. Click Request model access
5. Wait for approval (usually instant for most models)
Supported Models
Embedding Models
| Provider | Model ID | Dimensions | Best For |
|---|---|---|---|
| Amazon Titan | amazon.titan-embed-text-v1 | 1,536 | General purpose |
| Amazon Titan | amazon.titan-embed-text-v2:0 | 1,024 | Latest, improved quality |
| Cohere | cohere.embed-english-v3 | 1,024 | English text |
| Cohere | cohere.embed-multilingual-v3 | 1,024 | Multilingual |
Text Generation Models
| Provider | Model ID | Context | Best For |
|---|---|---|---|
| Anthropic | anthropic.claude-3-sonnet-20240229-v1:0 | 200K | Balanced performance |
| Anthropic | anthropic.claude-3-haiku-20240307-v1:0 | 200K | Fast, cost-effective |
| Anthropic | anthropic.claude-3-opus-20240229-v1:0 | 200K | Highest quality |
| Meta | meta.llama3-8b-instruct-v1:0 | 8K | Fast, open-source |
| Meta | meta.llama3-70b-instruct-v1:0 | 8K | High quality |
| Amazon | amazon.titan-text-express-v1 | 8K | Fast, low cost |
| Mistral | mistral.mistral-7b-instruct-v0:2 | 32K | Efficient |
Configuration
Environment Variables
Required:
AWS_REGION=us-east-1
Optional (at least one model required):
# For embeddings
BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
# For text generation (RAG evaluation)
BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
AWS Credentials (choose one method):
Method 1: Environment Variables
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Method 2: AWS Credentials File (~/.aws/credentials)
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Method 3: IAM Role (when running on AWS EC2/ECS/Lambda)
- No credentials needed, uses instance/task role automatically
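
When debugging credential problems, it can help to see which of the three methods above is likely to be picked up. The sketch below is a deliberately simplified approximation: boto3's real credential chain has more steps (profiles, SSO, web identity, container credentials, and so on), and `describe_credential_source` is a hypothetical helper, not part of this project or of boto3.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def describe_credential_source(
    env: Optional[Mapping[str, str]] = None,
    credentials_file: Optional[Path] = None,
) -> str:
    """Guess which credential method applies, checking in the order listed above.

    This is an approximation for troubleshooting only; it does not
    validate the credentials or replicate boto3's full lookup chain.
    """
    env = os.environ if env is None else env
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"
    if credentials_file is None:
        credentials_file = Path.home() / ".aws" / "credentials"
    if credentials_file.is_file():
        return "shared credentials file"
    return "IAM role (instance or task metadata)"


if __name__ == "__main__":
    print(describe_credential_source())
```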
Docker Configuration
Add to your docker-compose.yml:
services:
  mcp:
    environment:
      - AWS_REGION=us-east-1
      - BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
      - BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
Or use AWS credentials file volume mount:
services:
  mcp:
    volumes:
      - ~/.aws:/root/.aws:ro
    environment:
      - AWS_REGION=us-east-1
      - BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
Usage Examples
Embeddings Only
export AWS_REGION=us-east-1
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret
uv run nextcloud-mcp-server
Both Embeddings and Generation
export AWS_REGION=us-east-1
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
export BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
# For RAG evaluation with Bedrock
export RAG_EVAL_PROVIDER=bedrock
export RAG_EVAL_BEDROCK_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
uv run python -m tests.rag_evaluation.evaluate
Programmatic Usage
import asyncio

from nextcloud_mcp_server.providers import BedrockProvider


async def main() -> None:
    # Embeddings only
    provider = BedrockProvider(
        region_name="us-east-1",
        embedding_model="amazon.titan-embed-text-v2:0",
    )
    embeddings = await provider.embed_batch(["text1", "text2"])

    # Both capabilities
    provider = BedrockProvider(
        region_name="us-east-1",
        embedding_model="amazon.titan-embed-text-v2:0",
        generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
    )
    # Generate a single embedding
    embedding = await provider.embed("query text")
    # Generate text
    response = await provider.generate("Write a summary", max_tokens=500)


# The provider methods are coroutines, so run them inside an event loop.
asyncio.run(main())
Cost Considerations
Embedding Costs (as of Jan 2025)
| Model | Price per 1K tokens |
|---|---|
| Titan Embed Text v2 | $0.0001 |
| Cohere Embed English v3 | $0.0001 |
Generation Costs (as of Jan 2025)
| Model | Input (per 1K tokens) | Output (per 1K tokens) |
|---|---|---|
| Claude 3 Haiku | $0.00025 | $0.00125 |
| Claude 3 Sonnet | $0.003 | $0.015 |
| Claude 3 Opus | $0.015 | $0.075 |
| Llama 3 8B | $0.0003 | $0.0006 |
| Titan Text Express | $0.0002 | $0.0006 |
Note: Prices vary by region. Check AWS Bedrock Pricing for current rates.
Troubleshooting
Error: "Executable doesn't exist" or boto3 not found
Solution:
uv sync --group dev # Installs boto3
Error: "AccessDeniedException"
Causes:
- IAM permissions missing
- Model access not requested
- Wrong AWS region
Solution:
- Verify IAM policy includes `bedrock:InvokeModel`
- Request model access in Bedrock console
- Check model is available in your region
Error: "ResourceNotFoundException"
Cause: Invalid model ID or model not available in region
Solution:
- Verify model ID matches exactly (case-sensitive)
- Check model availability in your AWS region
- Use `aws bedrock list-foundation-models` to see available models
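
The same check can be done from Python. The sketch below assumes a boto3 control-plane client created with `boto3.client("bedrock", region_name=...)` (not `bedrock-runtime`) and the `bedrock:ListFoundationModels` permission from the optional policy; the helper names are hypothetical. The client is passed in as a parameter so the functions can be exercised with a stub.

```python
def available_model_ids(bedrock_client, output_modality: str = "EMBEDDING") -> set[str]:
    """Return the model IDs offered in the client's region for one output modality."""
    response = bedrock_client.list_foundation_models(byOutputModality=output_modality)
    return {summary["modelId"] for summary in response["modelSummaries"]}


def is_model_available(
    bedrock_client, model_id: str, output_modality: str = "EMBEDDING"
) -> bool:
    """Check a configured model ID against the region's model list (exact match)."""
    return model_id in available_model_ids(bedrock_client, output_modality)


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("bedrock", region_name="us-east-1")
#   print(is_model_available(client, "amazon.titan-embed-text-v2:0"))
```

For generation models, pass `output_modality="TEXT"`. Note that a model appearing in this list does not by itself mean access has been granted to the account.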
Error: "ThrottlingException"
Cause: Rate limit exceeded
Solution:
- Reduce request rate
- Request quota increase via AWS Support
- Use batch operations where possible
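
One common way to reduce the request rate is to retry throttled calls with exponential backoff and jitter. The sketch below is a generic helper, not something this project is known to ship; `call_with_backoff` and its parameters are hypothetical. It detects throttling by reading the error code that botocore's `ClientError` exposes in `exc.response["Error"]["Code"]`.

```python
import asyncio
import random


def is_throttling_error(exc: Exception) -> bool:
    """Return True if the exception carries a ThrottlingException error code."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ThrottlingException"


async def call_with_backoff(operation, max_attempts: int = 5, base_delay: float = 0.5):
    """Await operation(), retrying throttled calls with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return await operation()
        except Exception as exc:
            # Re-raise anything that is not throttling, or the final failure.
            if not is_throttling_error(exc) or attempt == max_attempts - 1:
                raise
            # Double the delay on each attempt and add jitter to spread retries.
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            await asyncio.sleep(delay)
```

With the provider from the programmatic usage example, a call could be wrapped as `await call_with_backoff(lambda: provider.generate("Write a summary", max_tokens=500))`.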
Security Best Practices
- Use IAM Roles when running on AWS infrastructure
- Rotate Access Keys regularly if using IAM users
- Restrict Permissions to only required models
- Enable CloudTrail for audit logging
- Use AWS Secrets Manager for credential management
- Monitor Costs with AWS Cost Explorer and Budgets
Regional Availability
Amazon Bedrock is available in:
- US East (N. Virginia): `us-east-1` ✅ Most models
- US West (Oregon): `us-west-2` ✅ Most models
- Asia Pacific (Singapore): `ap-southeast-1`
- Asia Pacific (Tokyo): `ap-northeast-1`
- Europe (Frankfurt): `eu-central-1`
Note: Model availability varies by region. Check the AWS Bedrock documentation for current availability.