feat: add unified provider architecture with Amazon Bedrock support
Refactored the LLM provider infrastructure so that new providers, offering embeddings, text generation, or both, can be added sustainably.
## Major Changes
### Unified Provider Architecture (ADR-015)
- Created `nextcloud_mcp_server/providers/` with unified Provider ABC
- Providers now support optional capabilities (embeddings and/or generation)
- Auto-detection registry with priority: Bedrock → Ollama → Simple
- Backward compatible - existing code continues to work
### New Providers
- **BedrockProvider**: Full Amazon Bedrock integration
  - Embeddings: Titan Embed, Cohere Embed models
  - Generation: Claude, Llama, Titan Text, Mistral models
  - Model-specific request/response handling
  - AWS credential chain integration
- **OllamaProvider**: Migrated with support for both capabilities
- **AnthropicProvider**: Moved from test code to production providers
- **SimpleProvider**: Migrated in-memory fallback provider
### Breaking Changes
None - full backward compatibility maintained:
- `embedding.get_embedding_service()` still works
- RAG evaluation tests updated to use unified providers
- All existing tests pass (127 unit tests)
### Testing
- Added 9 comprehensive Bedrock unit tests with mocked boto3
- All existing unit tests pass
- Type checking (ty) and linting (ruff) pass
- Verified backward compatibility
### Documentation
- `docs/ADR-015-unified-provider-architecture.md`: Comprehensive ADR
- `docs/bedrock-setup.md`: AWS setup guide with IAM permissions
- `CLAUDE.md`: Updated with provider architecture section
### Dependencies
- Added `boto3>=1.35.0` to dev dependencies (optional)
## Environment Variables
### Bedrock
- `AWS_REGION`: AWS region (e.g., "us-east-1")
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings
- `BEDROCK_GENERATION_MODEL`: Model ID for generation
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`: Optional credentials
### Ollama
- `OLLAMA_BASE_URL`: API URL
- `OLLAMA_EMBEDDING_MODEL`: Embedding model (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL`: Generation model
## AWS Bedrock Permissions Required
Minimal IAM policy:
```json
{
  "Effect": "Allow",
  "Action": ["bedrock:InvokeModel"],
  "Resource": ["arn:aws:bedrock:*::foundation-model/*"]
}
```
See `docs/bedrock-setup.md` for detailed setup instructions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
@@ -61,8 +61,60 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- `nextcloud_mcp_server/server/` - MCP tool/resource definitions
- `nextcloud_mcp_server/auth/` - OAuth/OIDC authentication
- `nextcloud_mcp_server/models/` - Pydantic response models
- `nextcloud_mcp_server/providers/` - Unified LLM provider infrastructure (embeddings + generation)
- `tests/` - Layered test suite (unit, smoke, integration, load)

### Provider Architecture (ADR-015)

**Unified Provider System** for embeddings and text generation:

**Location:** `nextcloud_mcp_server/providers/`
- `base.py` - `Provider` ABC with optional capabilities
- `registry.py` - Auto-detection and factory pattern
- `ollama.py` - Ollama provider (embeddings + generation)
- `anthropic.py` - Anthropic provider (generation only)
- `bedrock.py` - Amazon Bedrock provider (embeddings + generation)
- `simple.py` - Simple in-memory provider (embeddings only, fallback)

**Usage:**
```python
from nextcloud_mcp_server.providers import get_provider

provider = get_provider()  # Auto-detects from environment

# Check capabilities
if provider.supports_embeddings:
    embeddings = await provider.embed_batch(texts)

if provider.supports_generation:
    text = await provider.generate("prompt", max_tokens=500)
```

**Environment Variables:**

Bedrock:
- `AWS_REGION` - AWS region (e.g., "us-east-1")
- `BEDROCK_EMBEDDING_MODEL` - Embedding model ID (e.g., "amazon.titan-embed-text-v2:0")
- `BEDROCK_GENERATION_MODEL` - Generation model ID (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` - Optional, uses AWS credential chain

Ollama:
- `OLLAMA_BASE_URL` - API URL (e.g., "http://localhost:11434")
- `OLLAMA_EMBEDDING_MODEL` - Embedding model (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL` - Generation model (e.g., "llama3.2:1b")
- `OLLAMA_VERIFY_SSL` - SSL verification (default: "true")

Simple (fallback, no config needed):
- `SIMPLE_EMBEDDING_DIMENSION` - Dimension (default: 384)

**Auto-Detection Priority:** Bedrock → Ollama → Simple

**Backward Compatibility:**
- Old code using `nextcloud_mcp_server.embedding.get_embedding_service()` still works
- `EmbeddingService` now wraps `get_provider()` internally

**For Details:** See `docs/ADR-015-unified-provider-architecture.md`

## Development Commands (Quick Reference)

### Testing
@@ -0,0 +1,380 @@
# ADR-015: Unified Provider Architecture for Embeddings and Text Generation

**Status:** Accepted
**Date:** 2025-01-16
**Deciders:** Development Team
**Related:** ADR-003 (Vector Database), ADR-008 (MCP Sampling), ADR-013 (RAG Evaluation)

## Context

Prior to this refactoring, the codebase had two separate provider systems:

1. **Embedding Providers** (`nextcloud_mcp_server/embedding/`)
   - Used `EmbeddingProvider` ABC with methods: `embed()`, `embed_batch()`, `get_dimension()`
   - Had auto-detection via `EmbeddingService._detect_provider()`
   - Used for semantic search and vector indexing (production)

2. **LLM Providers** (`tests/rag_evaluation/llm_providers.py`)
   - Used `LLMProvider` Protocol with method: `generate()`
   - Had separate factory function `create_llm_provider()`
   - Used only for RAG evaluation tests (not production)

This fragmentation created several problems:

### Problems with Dual Provider Systems

1. **Code Duplication**
   - Ollama configuration appeared in both `embedding/service.py` and `tests/rag_evaluation/llm_providers.py`
   - Similar provider detection logic in multiple places
   - Separate singleton patterns for each system

2. **Limited Extensibility**
   - Hard-coded provider detection in `EmbeddingService._detect_provider()`
   - No support for providers that offer both capabilities (like Bedrock)
   - Adding new providers required modifying multiple files

3. **Inconsistent Patterns**
   - BM25 provider didn't follow `EmbeddingProvider` ABC
   - Different method names across providers (`embed` vs `encode`)
   - ABC vs Protocol for type checking

4. **Difficult Scaling**
   - Adding Amazon Bedrock (our third provider) would exacerbate all issues
   - No clear path for future providers (OpenAI, Cohere, etc.)

### Amazon Bedrock Requirements

Bedrock naturally supports **both** embeddings and text generation:
- **Embeddings**: `amazon.titan-embed-text-v1/v2`, `cohere.embed-*`
- **Text Generation**: `anthropic.claude-*`, `meta.llama3-*`, `amazon.titan-text-*`
- **Unified API**: Single `invoke_model()` method via bedrock-runtime

This made it the perfect opportunity to establish a unified provider architecture.

## Decision

We refactored the provider infrastructure to use a **unified Provider ABC** with optional capabilities:

### 1. Unified Provider Interface

**New Structure:**
```
nextcloud_mcp_server/providers/
├── __init__.py
├── base.py       # Provider ABC with optional capabilities
├── registry.py   # Auto-detection and factory
├── ollama.py     # Supports both embedding + generation
├── anthropic.py  # Generation only
├── bedrock.py    # Supports both embedding + generation
└── simple.py     # Embedding only (testing fallback)
```

**Base Class (`providers/base.py`):**
```python
from abc import ABC, abstractmethod


class Provider(ABC):
    @property
    @abstractmethod
    def supports_embeddings(self) -> bool:
        """Whether this provider supports embedding generation."""
        pass

    @property
    @abstractmethod
    def supports_generation(self) -> bool:
        """Whether this provider supports text generation."""
        pass

    @abstractmethod
    async def embed(self, text: str) -> list[float]:
        """Generate embedding (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        """Generate batch embeddings (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    def get_dimension(self) -> int:
        """Get embedding dimension (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        """Generate text (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    async def close(self) -> None:
        """Close provider and release resources."""
        pass
```
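
As an illustration of how a single-capability provider might satisfy this interface, the following sketch defines a hypothetical `HashEmbeddingProvider` (not part of the codebase). It restates the ABC in compact form so the example runs on its own, reports `supports_generation` as `False`, and raises `NotImplementedError` from `generate()`. The hashing scheme is only a stand-in for a real embedding model.

```python
import asyncio
import hashlib
from abc import ABC, abstractmethod


class Provider(ABC):
    """Compact restatement of the ABC above so this sketch is self-contained."""

    @property
    @abstractmethod
    def supports_embeddings(self) -> bool: ...

    @property
    @abstractmethod
    def supports_generation(self) -> bool: ...

    @abstractmethod
    async def embed(self, text: str) -> list[float]: ...

    @abstractmethod
    async def embed_batch(self, texts: list[str]) -> list[list[float]]: ...

    @abstractmethod
    def get_dimension(self) -> int: ...

    @abstractmethod
    async def generate(self, prompt: str, max_tokens: int = 500) -> str: ...

    @abstractmethod
    async def close(self) -> None: ...


class HashEmbeddingProvider(Provider):
    """Hypothetical embeddings-only provider (illustration, not project code)."""

    def __init__(self, dimension: int = 8) -> None:
        self._dimension = dimension

    @property
    def supports_embeddings(self) -> bool:
        return True

    @property
    def supports_generation(self) -> bool:
        return False

    async def embed(self, text: str) -> list[float]:
        # Derive deterministic pseudo-embedding values from a SHA-256 digest.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [digest[i % len(digest)] / 255.0 for i in range(self._dimension)]

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        return [await self.embed(text) for text in texts]

    def get_dimension(self) -> int:
        return self._dimension

    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        # Unsupported capability: fail loudly and point to an alternative.
        raise NotImplementedError(
            "HashEmbeddingProvider does not support text generation; "
            "configure a provider with generation support instead."
        )

    async def close(self) -> None:
        return None  # Nothing to release in this sketch.


async def demo() -> tuple[int, bool]:
    """Embed two texts, then confirm that generation is rejected."""
    provider = HashEmbeddingProvider(dimension=8)
    vectors = await provider.embed_batch(["alpha", "beta"])
    generation_rejected = False
    try:
        await provider.generate("prompt")
    except NotImplementedError:
        generation_rejected = True
    await provider.close()
    return len(vectors[0]), generation_rejected
```

Callers are expected to check `supports_generation` before calling `generate()`; the exception is a safety net for code that skips the check.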

### 2. Provider Registry

**Auto-Detection Priority** (`providers/registry.py`):
```python
class ProviderRegistry:
    @staticmethod
    def create_provider() -> Provider:
        # 1. Bedrock (AWS_REGION or BEDROCK_*_MODEL)
        # 2. Ollama (OLLAMA_BASE_URL)
        # 3. Simple (fallback)
        ...  # Detection logic elided
```
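
A minimal sketch of the priority order above, assuming detection only inspects environment variables. The function name `detect_provider_name` and its string return values are hypothetical; the real `create_provider()` returns a `Provider` instance and may apply additional checks.

```python
import os
from typing import Mapping, Optional

# Variables that, per the priority list above, select Bedrock when any is set.
BEDROCK_KEYS = ("AWS_REGION", "BEDROCK_EMBEDDING_MODEL", "BEDROCK_GENERATION_MODEL")


def detect_provider_name(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "bedrock", "ollama", or "simple" based on the environment."""
    env = os.environ if env is None else env
    if any(env.get(key) for key in BEDROCK_KEYS):
        return "bedrock"
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"
    return "simple"  # Fallback when nothing is configured
```

Passing the environment as a mapping keeps the function easy to test without mutating `os.environ`.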

**Environment Variables:**

**Bedrock:**
- `AWS_REGION`: AWS region (e.g., "us-east-1")
- `AWS_ACCESS_KEY_ID`: AWS access key (optional, uses credential chain)
- `AWS_SECRET_ACCESS_KEY`: AWS secret key (optional)
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0")
- `BEDROCK_GENERATION_MODEL`: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")

**Ollama:**
- `OLLAMA_BASE_URL`: Ollama API base URL (e.g., "http://localhost:11434")
- `OLLAMA_EMBEDDING_MODEL`: Model for embeddings (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL`: Model for text generation (e.g., "llama3.2:1b")
- `OLLAMA_VERIFY_SSL`: Verify SSL certificates (default: "true")

**Simple (no configuration, fallback):**
- `SIMPLE_EMBEDDING_DIMENSION`: Embedding dimension (default: 384)

### 3. Backward Compatibility

**Old Code Continues to Work:**
```python
# Old way (still works)
from nextcloud_mcp_server.embedding import get_embedding_service

service = get_embedding_service()  # Returns an EmbeddingService that wraps the singleton Provider
embeddings = await service.embed_batch(texts)
```

**New Way (recommended):**
```python
# New way (cleaner)
from nextcloud_mcp_server.providers import get_provider

provider = get_provider()  # Returns singleton Provider
embeddings = await provider.embed_batch(texts)

# Can also use generation if provider supports it
if provider.supports_generation:
    text = await provider.generate("prompt")
```

**Migration Path:**
- `embedding/service.py` now wraps `providers.get_provider()` for compatibility
- `tests/rag_evaluation/llm_providers.py` now uses unified providers
- Old imports still work, marked as deprecated in docstrings

### 4. Amazon Bedrock Implementation

**Features:**
- Supports both embeddings and text generation
- Model-specific request/response handling for:
  - Titan Embed (amazon.titan-embed-text-*)
  - Cohere Embed (cohere.embed-*)
  - Claude (anthropic.claude-*)
  - Llama (meta.llama3-*)
  - Titan Text (amazon.titan-text-*)
  - Mistral (mistral.*)
- Uses boto3 bedrock-runtime client
- Graceful degradation if boto3 not installed
- Async implementation matching existing patterns
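
One way to combine the last three points is an optional import plus a thread offload, since boto3 clients are blocking. This is a sketch under that assumption; the helper names `require_boto3` and `call_blocking` are hypothetical, and the actual provider may handle both concerns differently.

```python
import asyncio

try:
    import boto3  # Optional dependency; only needed for Bedrock
except ImportError:
    boto3 = None


def bedrock_available() -> bool:
    """Report whether the optional boto3 dependency could be imported."""
    return boto3 is not None


def require_boto3():
    """Return the boto3 module, or raise a clear error if it is missing."""
    if boto3 is None:
        raise ImportError(
            "boto3 is required for the Bedrock provider; "
            "install it with `pip install boto3`."
        )
    return boto3


async def call_blocking(func, *args, **kwargs):
    """Run a blocking call in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(func, *args, **kwargs)
```

With this shape, a provider could wrap a blocking client call as `await call_blocking(client.invoke_model, modelId=..., body=...)`, where `client` is assumed to be a bedrock-runtime client.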

**Model-Specific Handling:**
```python
# Bedrock embedding request (Titan)
{"inputText": text}

# Bedrock generation request (Claude)
{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": max_tokens,
    "temperature": 0.7,
    "messages": [{"role": "user", "content": prompt}]
}
```
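
The two request shapes above could be selected by model ID prefix. The sketch below is an assumption about how such dispatch might look, not the project's `bedrock.py`; the helper names `build_request_body` and `parse_response_body` are hypothetical. The response fields it reads (`embedding` for Titan Embed, a `content` list of text blocks for Claude) follow the documented model response formats.

```python
import json
from typing import Union


def build_request_body(model_id: str, text: str, max_tokens: int = 500) -> str:
    """Serialize the request payload for the given Bedrock model family."""
    if model_id.startswith("amazon.titan-embed-text"):
        payload = {"inputText": text}
    elif model_id.startswith("anthropic.claude"):
        payload = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "temperature": 0.7,
            "messages": [{"role": "user", "content": text}],
        }
    else:
        # Other families (Cohere, Llama, Titan Text, Mistral) are not sketched here.
        raise ValueError(f"No request format sketched for model: {model_id}")
    return json.dumps(payload)


def parse_response_body(model_id: str, raw_body: str) -> Union[list, str]:
    """Extract the embedding vector or generated text from a response body."""
    data = json.loads(raw_body)
    if model_id.startswith("amazon.titan-embed-text"):
        return data["embedding"]
    if model_id.startswith("anthropic.claude"):
        # Claude returns a list of content blocks; join the text blocks.
        return "".join(
            block["text"] for block in data["content"] if block.get("type") == "text"
        )
    raise ValueError(f"No response format sketched for model: {model_id}")
```

Keeping the format logic in pure functions like these makes it testable without mocking boto3 at all.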

## Consequences

### Positive

1. **Sustainable Provider Additions**
   - New providers only need to implement `Provider` ABC
   - Auto-detection via environment variables
   - No modifications to existing code required

2. **Code Consolidation**
   - Single provider interface instead of two
   - Unified configuration pattern
   - Eliminated duplication

3. **Better Extensibility**
   - Providers can support one or both capabilities
   - Clear capability detection via properties
   - Registry pattern simplifies auto-detection

4. **Improved Testing**
   - RAG evaluation can use any provider (Ollama, Anthropic, Bedrock)
   - Comprehensive unit tests for all providers
   - Mocked boto3 tests for Bedrock

5. **Production-Ready Bedrock Support**
   - Full embedding and generation support
   - Multiple model families supported
   - AWS credential chain integration

### Neutral

1. **Optional Boto3 Dependency**
   - boto3 is dev dependency only (not required for core functionality)
   - Bedrock provider gracefully fails if boto3 not installed
   - Users who want Bedrock must `pip install boto3`

2. **Capability Properties**
   - All providers must implement capability properties
   - Methods raise `NotImplementedError` if capability not supported
   - Clear error messages guide users to alternatives

### Negative

1. **Migration Effort**
   - Existing code should eventually migrate to the new imports (optional for now; the old imports remain backward compatible)
   - Documentation needs updating
   - Users must learn new environment variables

2. **Increased Complexity**
   - Provider base class has more methods (embedding + generation)
   - More environment variables to configure
   - Capability detection adds runtime checks

## Implementation

### Files Created

**New Provider Infrastructure:**
- `nextcloud_mcp_server/providers/__init__.py`
- `nextcloud_mcp_server/providers/base.py`
- `nextcloud_mcp_server/providers/registry.py`
- `nextcloud_mcp_server/providers/ollama.py`
- `nextcloud_mcp_server/providers/anthropic.py`
- `nextcloud_mcp_server/providers/bedrock.py`
- `nextcloud_mcp_server/providers/simple.py`

**Tests:**
- `tests/unit/providers/__init__.py`
- `tests/unit/providers/test_bedrock.py` (9 unit tests)

**Documentation:**
- `docs/ADR-015-unified-provider-architecture.md` (this file)

### Files Modified

**Backward Compatibility:**
- `nextcloud_mcp_server/embedding/service.py` - Now wraps `get_provider()`
- `tests/rag_evaluation/llm_providers.py` - Uses unified providers

**Dependencies:**
- `pyproject.toml` - Added `boto3>=1.35.0` to dev dependencies

### Testing Results

**Unit Tests:** 127 passed (including 9 new Bedrock tests)
**Type Checking:** All checks passed (ty)
**Linting:** All checks passed (ruff)
**Backward Compatibility:** Verified - existing embedding tests work

## Alternatives Considered

### Alternative 1: Keep Separate Provider Systems

**Pros:**
- No refactoring needed
- Simpler short-term

**Cons:**
- Bedrock would need to be implemented twice
- Continued code duplication
- No long-term scalability

**Decision:** Rejected - technical debt would continue to grow

### Alternative 2: Separate Embedding and Generation Providers

Use composition instead of unified interface:
```python
class CombinedProvider:
    def __init__(self, embedding: EmbeddingProvider, generation: LLMProvider):
        self.embedding = embedding
        self.generation = generation
```

**Pros:**
- Clearer separation of concerns
- Simpler individual providers

**Cons:**
- Bedrock and Ollama naturally do both - artificial separation
- More complex configuration (two providers to configure)
- More boilerplate code

**Decision:** Rejected - unified interface better matches provider capabilities

### Alternative 3: Plugin System

Dynamic provider registration via entry points:
```python
# setup.py
entry_points={
    'nextcloud_mcp.providers': [
        'ollama = nextcloud_mcp_server.providers.ollama:OllamaProvider',
        'bedrock = nextcloud_mcp_server.providers.bedrock:BedrockProvider',
    ]
}
```

**Pros:**
- Most extensible
- Third-party providers possible

**Cons:**
- Over-engineered for current needs
- Added complexity
- No immediate benefit

**Decision:** Deferred - can add later if needed

## Future Work

1. **Additional Providers**
   - OpenAI (embeddings + generation)
   - Cohere (embeddings + generation)
   - Google Vertex AI
   - Azure OpenAI

2. **Provider Features**
   - Streaming generation support
   - Batch API optimization (when available)
   - Model-specific optimizations
   - Cost tracking and metrics

3. **Configuration Improvements**
   - Provider profiles (development, production)
   - Model aliasing (e.g., "small", "large")
   - Fallback provider chains

4. **Testing**
   - Integration tests with real Bedrock endpoints
   - Performance benchmarking across providers
   - Cost comparison analysis

## References

- [boto3 Bedrock Runtime Documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html)
- [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
- ADR-003: Vector Database and Semantic Search
- ADR-008: MCP Sampling for Semantic Search
- ADR-013: RAG Evaluation Framework
@@ -0,0 +1,338 @@
# Amazon Bedrock Setup Guide

This guide covers how to configure the Nextcloud MCP Server to use Amazon Bedrock for embeddings and text generation.

## Prerequisites

1. **AWS Account** with access to Amazon Bedrock
2. **boto3 library** installed: `pip install boto3` or `uv sync --group dev`
3. **Model Access** - Request access to models in AWS Bedrock console

## Required AWS Permissions

### IAM Policy for Bedrock Access

The AWS IAM user or role needs the following permissions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockInvokeModels",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*"
      ]
    }
  ]
}
```

### Minimal Permissions (Production)

For production deployments, restrict to specific models:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockEmbeddings",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
      ]
    },
    {
      "Sid": "BedrockGeneration",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
      ]
    }
  ]
}
```

### Additional Permissions (Optional)

For advanced use cases:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockListModels",
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:GetFoundationModel"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockAsyncInvoke",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModelAsync",
        "bedrock:GetAsyncInvoke",
        "bedrock:ListAsyncInvokes"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*"
      ]
    }
  ]
}
```

## Model Access

Before using Bedrock models, you must request access in the AWS Console:

1. Navigate to **Amazon Bedrock** → **Model access**
2. Click **Manage model access**
3. Select models you want to use:
   - **Embeddings:** Amazon Titan Embed Text, Cohere Embed
   - **Text Generation:** Anthropic Claude, Meta Llama, Amazon Titan Text
4. Click **Request model access**
5. Wait for approval (usually instant for most models)

## Supported Models

### Embedding Models

| Provider | Model ID | Dimensions | Best For |
|----------|----------|------------|----------|
| Amazon Titan | `amazon.titan-embed-text-v1` | 1,536 | General purpose |
| Amazon Titan | `amazon.titan-embed-text-v2:0` | 1,024 | Latest, improved quality |
| Cohere | `cohere.embed-english-v3` | 1,024 | English text |
| Cohere | `cohere.embed-multilingual-v3` | 1,024 | Multilingual |

### Text Generation Models

| Provider | Model ID | Context | Best For |
|----------|----------|---------|----------|
| Anthropic | `anthropic.claude-3-sonnet-20240229-v1:0` | 200K | Balanced performance |
| Anthropic | `anthropic.claude-3-haiku-20240307-v1:0` | 200K | Fast, cost-effective |
| Anthropic | `anthropic.claude-3-opus-20240229-v1:0` | 200K | Highest quality |
| Meta | `meta.llama3-8b-instruct-v1:0` | 8K | Fast, open-source |
| Meta | `meta.llama3-70b-instruct-v1:0` | 8K | High quality |
| Amazon | `amazon.titan-text-express-v1` | 8K | Fast, low cost |
| Mistral | `mistral.mistral-7b-instruct-v0:2` | 32K | Efficient |

## Configuration

### Environment Variables

**Required:**
```bash
AWS_REGION=us-east-1
```

**Models (each is optional, but at least one is required):**
```bash
# For embeddings
BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0

# For text generation (RAG evaluation)
BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
```

**AWS Credentials (choose one method):**

**Method 1: Environment Variables**
```bash
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```

**Method 2: AWS Credentials File** (`~/.aws/credentials`)
```ini
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```

**Method 3: IAM Role** (when running on AWS EC2/ECS/Lambda)
- No credentials needed, uses instance/task role automatically

### Docker Configuration

Add to your `docker-compose.yml`:

```yaml
services:
  mcp:
    environment:
      - AWS_REGION=us-east-1
      - BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
      - BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
```

Or mount the AWS credentials file as a read-only volume:

```yaml
services:
  mcp:
    volumes:
      - ~/.aws:/root/.aws:ro
    environment:
      - AWS_REGION=us-east-1
      - BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
```

## Usage Examples

### Embeddings Only

```bash
export AWS_REGION=us-east-1
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret

uv run nextcloud-mcp-server
```

### Both Embeddings and Generation

```bash
export AWS_REGION=us-east-1
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
export BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0

# For RAG evaluation with Bedrock
export RAG_EVAL_PROVIDER=bedrock
export RAG_EVAL_BEDROCK_MODEL=anthropic.claude-3-sonnet-20240229-v1:0

uv run python -m tests.rag_evaluation.evaluate
```

### Programmatic Usage

```python
from nextcloud_mcp_server.providers import BedrockProvider

# Embeddings only
provider = BedrockProvider(
    region_name="us-east-1",
    embedding_model="amazon.titan-embed-text-v2:0",
)

embeddings = await provider.embed_batch(["text1", "text2"])

# Both capabilities
provider = BedrockProvider(
    region_name="us-east-1",
    embedding_model="amazon.titan-embed-text-v2:0",
    generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
)

# Generate embeddings
embedding = await provider.embed("query text")

# Generate text
response = await provider.generate("Write a summary", max_tokens=500)
```

## Cost Considerations

### Embedding Costs (as of Jan 2025)

| Model | Price per 1K tokens |
|-------|---------------------|
| Titan Embed Text v2 | $0.0001 |
| Cohere Embed English v3 | $0.0001 |

### Generation Costs (as of Jan 2025)

| Model | Input (per 1K tokens) | Output (per 1K tokens) |
|-------|----------------------|------------------------|
| Claude 3 Haiku | $0.00025 | $0.00125 |
| Claude 3 Sonnet | $0.003 | $0.015 |
| Claude 3 Opus | $0.015 | $0.075 |
| Llama 3 8B | $0.0003 | $0.0006 |
| Titan Text Express | $0.0002 | $0.0006 |

**Note:** Prices vary by region. Check [AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/) for current rates.

## Troubleshooting

### Error: boto3 not found (e.g., `No module named 'boto3'`)

**Solution:**
```bash
uv sync --group dev  # Installs boto3
```

### Error: "AccessDeniedException"

**Causes:**
1. IAM permissions missing
2. Model access not requested
3. Wrong AWS region

**Solution:**
1. Verify IAM policy includes `bedrock:InvokeModel`
2. Request model access in Bedrock console
3. Check model is available in your region

### Error: "ResourceNotFoundException"

**Cause:** Invalid model ID or model not available in region

**Solution:**
- Verify model ID matches exactly (case-sensitive)
- Check model availability in your AWS region
- Use `aws bedrock list-foundation-models` to see available models

### Error: "ThrottlingException"

**Cause:** Rate limit exceeded

**Solution:**
- Reduce request rate
- Request quota increase via AWS Support
- Use batch operations where possible
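
A common way to reduce the request rate is to retry throttled calls with exponential backoff and jitter. The sketch below is a generic illustration, not part of the provider; `with_backoff` and `is_throttling_error` are hypothetical helpers. It relies on botocore's `ClientError` exposing the service error code under `response["Error"]["Code"]`.

```python
import asyncio
import random


def is_throttling_error(exc: Exception) -> bool:
    """Check whether an exception carries Bedrock's throttling error code."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ThrottlingException"


async def with_backoff(call, max_attempts: int = 5, base_delay: float = 0.5):
    """Await `call()` and retry throttled attempts with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except Exception as exc:
            # Re-raise anything that is not throttling, or the final failure.
            if not is_throttling_error(exc) or attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)
```

Usage might look like `await with_backoff(lambda: provider.embed("query text"))`. botocore also ships built-in retry modes that can be set through its client `Config`, which may be sufficient on their own.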

## Security Best Practices

1. **Use IAM Roles** when running on AWS infrastructure
2. **Rotate Access Keys** regularly if using IAM users
3. **Restrict Permissions** to only required models
4. **Enable CloudTrail** for audit logging
5. **Use AWS Secrets Manager** for credential management
6. **Monitor Costs** with AWS Cost Explorer and Budgets

## Regional Availability

Amazon Bedrock is available in:
- **US East (N. Virginia)**: `us-east-1` ✅ Most models
- **US West (Oregon)**: `us-west-2` ✅ Most models
- **Asia Pacific (Singapore)**: `ap-southeast-1`
- **Asia Pacific (Tokyo)**: `ap-northeast-1`
- **Europe (Frankfurt)**: `eu-central-1`

**Note:** Model availability varies by region. Check the [AWS Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html) for current availability.

## References

- [AWS Bedrock Documentation](https://docs.aws.amazon.com/bedrock/)
- [AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)
- [boto3 Bedrock Runtime API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html)
- [Provider Architecture ADR](./ADR-015-unified-provider-architecture.md)
@@ -1,57 +1,30 @@
|
||||
"""Embedding service with provider detection."""
|
||||
"""Embedding service with provider detection.
|
||||
|
||||
DEPRECATED: This module is maintained for backward compatibility.
|
||||
New code should use nextcloud_mcp_server.providers.get_provider() directly.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
|
||||
from .base import EmbeddingProvider
|
||||
from nextcloud_mcp_server.providers import get_provider
|
||||
|
||||
from .bm25_provider import BM25SparseEmbeddingProvider
|
||||
from .ollama_provider import OllamaEmbeddingProvider
|
||||
from .simple_provider import SimpleEmbeddingProvider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class EmbeddingService:
|
||||
"""Unified embedding service with automatic provider detection."""
|
||||
"""
|
||||
Unified embedding service with automatic provider detection.
|
||||
|
||||
DEPRECATED: This class wraps the new unified provider infrastructure
|
||||
for backward compatibility. New code should use
|
||||
nextcloud_mcp_server.providers.get_provider() directly.
|
||||
"""
|
||||
|
||||
def __init__(self):
|
||||
"""Initialize embedding service with auto-detected provider."""
|
||||
self.provider = self._detect_provider()
|
||||
|
||||
def _detect_provider(self) -> EmbeddingProvider:
|
||||
"""
|
||||
Auto-detect available embedding provider.
|
||||
|
||||
Checks environment variables in order:
|
||||
1. OLLAMA_BASE_URL - Use Ollama provider (production)
|
||||
2. OPENAI_API_KEY - Use OpenAI provider (future)
|
||||
3. Fallback to SimpleEmbeddingProvider (testing/development)
|
||||
|
||||
Returns:
|
||||
Configured embedding provider
|
||||
"""
|
||||
# Ollama provider (production)
|
||||
ollama_url = os.getenv("OLLAMA_BASE_URL")
|
||||
if ollama_url:
|
||||
logger.info(f"Using Ollama embedding provider: {ollama_url}")
|
||||
return OllamaEmbeddingProvider(
|
||||
base_url=ollama_url,
|
||||
model=os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text"),
|
||||
verify_ssl=os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true",
|
||||
)
|
||||
|
||||
# OpenAI provider (future implementation)
|
||||
# openai_key = os.getenv("OPENAI_API_KEY")
|
||||
# if openai_key:
|
||||
# return OpenAIEmbeddingProvider(api_key=openai_key)
|
||||
|
||||
# Fallback to simple provider for development/testing
|
||||
logger.warning(
|
||||
"No embedding provider configured (OLLAMA_BASE_URL or OPENAI_API_KEY not set). "
|
||||
"Using SimpleEmbeddingProvider for testing/development. "
|
||||
"For production, configure an external embedding service."
|
||||
)
|
||||
return SimpleEmbeddingProvider(dimension=384)
|
||||
self.provider = get_provider()
|
||||
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
|
||||
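The deprecated `EmbeddingService` above now only forwards to whatever `get_provider()` returns. A minimal sketch of that delegation pattern follows. `FakeProvider` and `LegacyEmbeddingService` are hypothetical stand-ins introduced for illustration, and the provider is injected so the example runs without the package installed.

```python
import asyncio


class FakeProvider:
    """Hypothetical stand-in for the object returned by get_provider()."""

    async def embed(self, text: str) -> list[float]:
        # Deterministic toy vector: the text length followed by a constant.
        return [float(len(text)), 1.0]


class LegacyEmbeddingService:
    """Keeps the old call surface and forwards every call to a provider."""

    def __init__(self, provider=None):
        # The real module would call get_provider() here (assumption for the sketch).
        self.provider = provider if provider is not None else FakeProvider()

    async def embed(self, text: str) -> list[float]:
        return await self.provider.embed(text)


service = LegacyEmbeddingService()
vector = asyncio.run(service.embed("hello"))
```

Because the wrapper holds no detection logic of its own, provider selection lives in one place (the registry) and old call sites keep working unchanged.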
@@ -0,0 +1,18 @@
"""Unified provider infrastructure for embeddings and text generation."""

from .anthropic import AnthropicProvider
from .base import Provider
from .bedrock import BedrockProvider
from .ollama import OllamaProvider
from .registry import get_provider, reset_provider
from .simple import SimpleProvider

__all__ = [
    "Provider",
    "OllamaProvider",
    "AnthropicProvider",
    "SimpleProvider",
    "BedrockProvider",
    "get_provider",
    "reset_provider",
]
@@ -0,0 +1,97 @@
|
||||
"""Unified Anthropic provider for text generation."""
|
||||
|
||||
import logging
|
||||
|
||||
from anthropic import AsyncAnthropic
|
||||
|
||||
from .base import Provider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class AnthropicProvider(Provider):
|
||||
"""
|
||||
Anthropic provider for text generation.
|
||||
|
||||
Supports Claude models via the Anthropic API.
|
||||
Note: Anthropic doesn't provide embedding models, only text generation.
|
||||
"""
|
||||
|
||||
def __init__(self, api_key: str, model: str = "claude-3-5-sonnet-20241022"):
|
||||
"""
|
||||
Initialize Anthropic provider.
|
||||
|
||||
Args:
|
||||
api_key: Anthropic API key
|
||||
model: Model name (e.g., "claude-3-5-sonnet-20241022")
|
||||
"""
|
||||
self.client = AsyncAnthropic(api_key=api_key)
|
||||
self.model = model
|
||||
|
||||
logger.info(f"Initialized Anthropic provider (model={model})")
|
||||
|
||||
@property
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
return False
|
||||
|
||||
@property
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
return True
|
||||
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
Generate embedding vector for text.
|
||||
|
||||
Raises:
|
||||
NotImplementedError: Anthropic doesn't provide embedding models
|
||||
"""
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
|
||||
)
|
||||
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""
|
||||
Generate embeddings for multiple texts.
|
||||
|
||||
Raises:
|
||||
NotImplementedError: Anthropic doesn't provide embedding models
|
||||
"""
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
|
||||
)
|
||||
|
||||
def get_dimension(self) -> int:
|
||||
"""
|
||||
Get embedding dimension.
|
||||
|
||||
Raises:
|
||||
NotImplementedError: Anthropic doesn't provide embedding models
|
||||
"""
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
|
||||
)
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""
|
||||
Generate text using Anthropic API.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
"""
|
||||
message = await self.client.messages.create(
|
||||
model=self.model,
|
||||
max_tokens=max_tokens,
|
||||
temperature=0.7,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
)
|
||||
return message.content[0].text
|
||||
|
||||
async def close(self) -> None:
|
||||
"""Close the client (no-op for Anthropic SDK)."""
|
||||
pass
|
||||
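`AnthropicProvider.generate` returns `message.content[0].text`, which assumes the first content block exists and is a text block. A more defensive extraction might look like the sketch below. It assumes content blocks expose `type` and `text` attributes, as the usage in the diff implies; `collect_text` and the `SimpleNamespace` objects are hypothetical and only stand in for SDK response objects.

```python
from types import SimpleNamespace


def collect_text(content_blocks) -> str:
    """Join the text of every block whose type is "text"; skip other block types."""
    return "".join(
        block.text
        for block in content_blocks
        if getattr(block, "type", None) == "text"
    )


# Hypothetical response content: two text blocks around a non-text block.
blocks = [
    SimpleNamespace(type="text", text="Hello, "),
    SimpleNamespace(type="tool_use", id="call_1"),
    SimpleNamespace(type="text", text="world"),
]
joined = collect_text(blocks)
empty = collect_text([])
```

An empty content list yields an empty string here instead of an `IndexError`; whether that is the right behavior for callers is a design choice the PR does not settle.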
@@ -0,0 +1,91 @@
|
||||
"""Unified provider interface for embeddings and text generation."""
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
|
||||
|
||||
class Provider(ABC):
|
||||
"""
|
||||
Unified base class for LLM providers.
|
||||
|
||||
Providers can support embeddings, text generation, or both.
|
||||
Use capability properties to determine what features are available.
|
||||
"""
|
||||
|
||||
@property
|
||||
@abstractmethod
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
pass
|
||||
|
||||
@property
|
||||
@abstractmethod
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
Generate embedding vector for text.
|
||||
|
||||
Args:
|
||||
text: Input text to embed
|
||||
|
||||
Returns:
|
||||
Vector embedding as list of floats
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If provider doesn't support embeddings
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""
|
||||
Generate embeddings for multiple texts (optimized).
|
||||
|
||||
Args:
|
||||
texts: List of texts to embed
|
||||
|
||||
Returns:
|
||||
List of vector embeddings
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If provider doesn't support embeddings
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_dimension(self) -> int:
|
||||
"""
|
||||
Get embedding dimension for this provider.
|
||||
|
||||
Returns:
|
||||
Vector dimension (e.g., 768 for nomic-embed-text)
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If provider doesn't support embeddings
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""
|
||||
Generate text from a prompt.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If provider doesn't support generation
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def close(self) -> None:
|
||||
"""Close the provider and release resources."""
|
||||
pass
|
||||
@@ -0,0 +1,397 @@
|
||||
"""Amazon Bedrock provider for embeddings and text generation."""
|
||||
|
||||
import json
|
||||
import logging
|
||||
from typing import Any
|
||||
|
||||
try:
|
||||
import boto3
|
||||
from botocore.exceptions import BotoCoreError, ClientError
|
||||
|
||||
BOTO3_AVAILABLE = True
|
||||
except ImportError:
|
||||
BOTO3_AVAILABLE = False
|
||||
|
||||
from .base import Provider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class BedrockProvider(Provider):
|
||||
"""
|
||||
Amazon Bedrock provider supporting both embeddings and text generation.
|
||||
|
||||
Uses AWS Bedrock Runtime API with boto3. Supports various model families:
|
||||
- Embeddings: amazon.titan-embed-text-v1, amazon.titan-embed-text-v2, cohere.embed-*
|
||||
- Text Generation: anthropic.claude-*, meta.llama3-*, amazon.titan-text-*, mistral.*, etc.
|
||||
|
||||
Requires AWS credentials configured via:
|
||||
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION)
|
||||
- AWS credentials file (~/.aws/credentials)
|
||||
- IAM role (when running on AWS)
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
region_name: str | None = None,
|
||||
embedding_model: str | None = None,
|
||||
generation_model: str | None = None,
|
||||
aws_access_key_id: str | None = None,
|
||||
aws_secret_access_key: str | None = None,
|
||||
):
|
||||
"""
|
||||
Initialize Bedrock provider.
|
||||
|
||||
Args:
|
||||
region_name: AWS region (e.g., "us-east-1"). Defaults to AWS_REGION env var.
|
||||
embedding_model: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0").
|
||||
None disables embeddings.
|
||||
generation_model: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0").
|
||||
None disables generation.
|
||||
aws_access_key_id: AWS access key (optional, uses default credential chain if not provided)
|
||||
aws_secret_access_key: AWS secret key (optional, uses default credential chain if not provided)
|
||||
|
||||
Raises:
|
||||
ImportError: If boto3 is not installed
|
||||
"""
|
||||
if not BOTO3_AVAILABLE:
|
||||
raise ImportError(
|
||||
"boto3 is required for Bedrock provider. Install with: pip install boto3"
|
||||
)
|
||||
|
||||
self.embedding_model = embedding_model
|
||||
self.generation_model = generation_model
|
||||
self._dimension: int | None = None # Detected dynamically
|
||||
|
||||
# Initialize bedrock-runtime client
|
||||
client_kwargs: dict[str, Any] = {}
|
||||
if region_name:
|
||||
client_kwargs["region_name"] = region_name
|
||||
if aws_access_key_id:
|
||||
client_kwargs["aws_access_key_id"] = aws_access_key_id
|
||||
if aws_secret_access_key:
|
||||
client_kwargs["aws_secret_access_key"] = aws_secret_access_key
|
||||
|
||||
self.client = boto3.client("bedrock-runtime", **client_kwargs)
|
||||
|
||||
logger.info(
|
||||
f"Initialized Bedrock provider in region {region_name or 'default'} "
|
||||
f"(embedding_model={embedding_model}, generation_model={generation_model})"
|
||||
)
|
||||
|
||||
@property
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
return self.embedding_model is not None
|
||||
|
||||
@property
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
return self.generation_model is not None
|
||||
|
||||
def _create_embedding_request(self, text: str) -> dict[str, Any]:
|
||||
"""
|
||||
Create model-specific embedding request payload.
|
||||
|
||||
Args:
|
||||
text: Input text to embed
|
||||
|
||||
Returns:
|
||||
Request payload dict for the embedding model
|
||||
"""
|
||||
if not self.embedding_model:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
# Titan Embed models
|
||||
if self.embedding_model.startswith("amazon.titan-embed"):
|
||||
return {"inputText": text}
|
||||
|
||||
# Cohere Embed models
|
||||
elif self.embedding_model.startswith("cohere.embed"):
|
||||
return {"texts": [text], "input_type": "search_document"}
|
||||
|
||||
# Unknown model - try Titan format as default
|
||||
else:
|
||||
logger.warning(
|
||||
f"Unknown embedding model format for {self.embedding_model}, "
|
||||
"using Titan format as default"
|
||||
)
|
||||
return {"inputText": text}
|
||||
|
||||
def _parse_embedding_response(self, response: dict[str, Any]) -> list[float]:
|
||||
"""
|
||||
Parse model-specific embedding response.
|
||||
|
||||
Args:
|
||||
response: Raw response from Bedrock
|
||||
|
||||
Returns:
|
||||
Embedding vector as list of floats
|
||||
"""
|
||||
# Titan Embed models
|
||||
if self.embedding_model and self.embedding_model.startswith(
|
||||
"amazon.titan-embed"
|
||||
):
|
||||
return response["embedding"]
|
||||
|
||||
# Cohere Embed models
|
||||
elif self.embedding_model and self.embedding_model.startswith("cohere.embed"):
|
||||
return response["embeddings"][0]
|
||||
|
||||
# Unknown model - try Titan format as default
|
||||
else:
|
||||
logger.warning(
|
||||
f"Unknown embedding response format for {self.embedding_model}, "
|
||||
"trying Titan format"
|
||||
)
|
||||
return response.get("embedding", response.get("embeddings", [None])[0])
|
||||
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
Generate embedding vector for text.
|
||||
|
||||
Args:
|
||||
text: Input text to embed
|
||||
|
||||
Returns:
|
||||
Vector embedding as list of floats
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
ClientError: If Bedrock API call fails
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
try:
|
||||
request_body = self._create_embedding_request(text)
|
||||
|
||||
response = self.client.invoke_model(
|
||||
modelId=self.embedding_model,
|
||||
body=json.dumps(request_body),
|
||||
accept="application/json",
|
||||
contentType="application/json",
|
||||
)
|
||||
|
||||
response_body = json.loads(response["body"].read())
|
||||
embedding = self._parse_embedding_response(response_body)
|
||||
|
||||
return embedding
|
||||
|
||||
except (BotoCoreError, ClientError) as e:
|
||||
logger.error(f"Bedrock embedding error: {e}")
|
||||
raise
|
||||
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""
|
||||
Generate embeddings for multiple texts.
|
||||
|
||||
Note: Current implementation sends requests sequentially.
|
||||
Future optimization could use asyncio for concurrent requests.
|
||||
|
||||
Args:
|
||||
texts: List of texts to embed
|
||||
|
||||
Returns:
|
||||
List of vector embeddings
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
ClientError: If Bedrock API call fails
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
embeddings = []
|
||||
for text in texts:
|
||||
embedding = await self.embed(text)
|
||||
embeddings.append(embedding)
|
||||
return embeddings
|
||||
|
||||
async def _detect_dimension(self):
|
||||
"""
|
||||
Detect embedding dimension by generating a test embedding.
|
||||
"""
|
||||
if self._dimension is None and self.supports_embeddings:
|
||||
logger.debug(
|
||||
f"Detecting embedding dimension for model {self.embedding_model}..."
|
||||
)
|
||||
test_embedding = await self.embed("test")
|
||||
self._dimension = len(test_embedding)
|
||||
logger.info(
|
||||
f"Detected embedding dimension: {self._dimension} "
|
||||
f"for model {self.embedding_model}"
|
||||
)
|
||||
|
||||
def get_dimension(self) -> int:
|
||||
"""
|
||||
Get embedding dimension.
|
||||
|
||||
Returns:
|
||||
Vector dimension for the configured embedding model
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
RuntimeError: If dimension not detected yet (call _detect_dimension first)
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
if self._dimension is None:
|
||||
raise RuntimeError(
    f"Embedding dimension not detected yet for model {self.embedding_model}. "
    "Await _detect_dimension() first."
|
||||
)
|
||||
return self._dimension
|
||||
|
||||
def _create_generation_request(
|
||||
self, prompt: str, max_tokens: int
|
||||
) -> dict[str, Any]:
|
||||
"""
|
||||
Create model-specific text generation request payload.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Request payload dict for the generation model
|
||||
"""
|
||||
if not self.generation_model:
|
||||
raise NotImplementedError(
|
||||
"Text generation not supported - no generation_model configured"
|
||||
)
|
||||
|
||||
# Anthropic Claude models
|
||||
if self.generation_model.startswith("anthropic.claude"):
|
||||
return {
|
||||
"anthropic_version": "bedrock-2023-05-31",
|
||||
"max_tokens": max_tokens,
|
||||
"temperature": 0.7,
|
||||
"messages": [{"role": "user", "content": prompt}],
|
||||
}
|
||||
|
||||
# Meta Llama models
|
||||
elif self.generation_model.startswith("meta.llama"):
|
||||
return {"prompt": prompt, "max_gen_len": max_tokens, "temperature": 0.7}
|
||||
|
||||
# Amazon Titan Text models
|
||||
elif self.generation_model.startswith("amazon.titan-text"):
|
||||
return {
|
||||
"inputText": prompt,
|
||||
"textGenerationConfig": {
|
||||
"maxTokenCount": max_tokens,
|
||||
"temperature": 0.7,
|
||||
},
|
||||
}
|
||||
|
||||
# Mistral models
|
||||
elif self.generation_model.startswith("mistral"):
|
||||
return {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.7}
|
||||
|
||||
# Unknown model - try Claude format as default
|
||||
else:
|
||||
logger.warning(
|
||||
f"Unknown generation model format for {self.generation_model}, "
|
||||
"using Claude format as default"
|
||||
)
|
||||
return {
|
||||
"anthropic_version": "bedrock-2023-05-31",
|
||||
"max_tokens": max_tokens,
|
||||
"temperature": 0.7,
|
||||
"messages": [{"role": "user", "content": prompt}],
|
||||
}
|
||||
|
||||
def _parse_generation_response(self, response: dict[str, Any]) -> str:
|
||||
"""
|
||||
Parse model-specific text generation response.
|
||||
|
||||
Args:
|
||||
response: Raw response from Bedrock
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
"""
|
||||
# Anthropic Claude models
|
||||
if self.generation_model and self.generation_model.startswith(
|
||||
"anthropic.claude"
|
||||
):
|
||||
return response["content"][0]["text"]
|
||||
|
||||
# Meta Llama models
|
||||
elif self.generation_model and self.generation_model.startswith("meta.llama"):
|
||||
return response["generation"]
|
||||
|
||||
# Amazon Titan Text models
|
||||
elif self.generation_model and self.generation_model.startswith(
|
||||
"amazon.titan-text"
|
||||
):
|
||||
return response["results"][0]["outputText"]
|
||||
|
||||
# Mistral models
|
||||
elif self.generation_model and self.generation_model.startswith("mistral"):
|
||||
return response["outputs"][0]["text"]
|
||||
|
||||
# Unknown model - try common response fields
|
||||
else:
|
||||
logger.warning(
|
||||
f"Unknown generation response format for {self.generation_model}, "
|
||||
"trying common fields"
|
||||
)
|
||||
# Try common response field names
|
||||
for field in ["text", "generation", "outputText", "completion"]:
|
||||
if field in response:
|
||||
return response[field]
|
||||
# Last resort: return JSON string
|
||||
return json.dumps(response)
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""
|
||||
Generate text from a prompt.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If generation not enabled (no generation_model)
|
||||
ClientError: If Bedrock API call fails
|
||||
"""
|
||||
if not self.supports_generation:
|
||||
raise NotImplementedError(
|
||||
"Text generation not supported - no generation_model configured"
|
||||
)
|
||||
|
||||
try:
|
||||
request_body = self._create_generation_request(prompt, max_tokens)
|
||||
|
||||
response = self.client.invoke_model(
|
||||
modelId=self.generation_model,
|
||||
body=json.dumps(request_body),
|
||||
accept="application/json",
|
||||
contentType="application/json",
|
||||
)
|
||||
|
||||
response_body = json.loads(response["body"].read())
|
||||
text = self._parse_generation_response(response_body)
|
||||
|
||||
return text
|
||||
|
||||
except (BotoCoreError, ClientError) as e:
|
||||
logger.error(f"Bedrock generation error: {e}")
|
||||
raise
|
||||
|
||||
async def close(self) -> None:
|
||||
"""Close the client (no-op for boto3 clients)."""
|
||||
pass
|
||||
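`BedrockProvider.embed` and `generate` are declared `async`, but `self.client.invoke_model(...)` is a synchronous boto3 call, so the event loop is blocked for the duration of each request. One possible mitigation, not part of this PR, is to run the call in a worker thread with `asyncio.to_thread`. The sketch below assumes the Titan request and response shapes shown in the diff; `FakeBedrockClient` and `embed_off_loop` are hypothetical names used so the example runs without AWS access.

```python
import asyncio
import io
import json


class FakeBedrockClient:
    """Hypothetical stand-in for boto3's bedrock-runtime client."""

    def invoke_model(self, modelId, body, accept, contentType):
        payload = json.loads(body)
        # Mimic the Titan embedding response shape used in the diff.
        vector = [float(len(payload["inputText"]))]
        return {"body": io.BytesIO(json.dumps({"embedding": vector}).encode())}


async def embed_off_loop(client, model_id: str, text: str) -> list[float]:
    # Run the blocking SDK call in a worker thread so the event loop stays free.
    response = await asyncio.to_thread(
        client.invoke_model,
        modelId=model_id,
        body=json.dumps({"inputText": text}),
        accept="application/json",
        contentType="application/json",
    )
    return json.loads(response["body"].read())["embedding"]


vector = asyncio.run(
    embed_off_loop(FakeBedrockClient(), "amazon.titan-embed-text-v2:0", "test")
)
```

With the call moved off the loop, the sequential loop in `embed_batch` could later be replaced by concurrent requests, which is the optimization its docstring mentions.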
@@ -0,0 +1,221 @@
|
||||
"""Unified Ollama provider for embeddings and text generation."""
|
||||
|
||||
import logging
|
||||
|
||||
import httpx
|
||||
|
||||
from .base import Provider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class OllamaProvider(Provider):
|
||||
"""
|
||||
Ollama provider supporting both embeddings and text generation.
|
||||
|
||||
Supports TLS, SSL verification, and automatic model loading.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
base_url: str,
|
||||
embedding_model: str | None = None,
|
||||
generation_model: str | None = None,
|
||||
verify_ssl: bool = True,
|
||||
timeout: httpx.Timeout | None = None,
|
||||
):
|
||||
"""
|
||||
Initialize Ollama provider.
|
||||
|
||||
Args:
|
||||
base_url: Ollama API base URL (e.g., https://ollama.internal.example.com:443)
|
||||
embedding_model: Model for embeddings (e.g., "nomic-embed-text"). None disables embeddings.
|
||||
generation_model: Model for text generation (e.g., "llama3.2:1b"). None disables generation.
|
||||
verify_ssl: Verify SSL certificates (default: True)
|
||||
timeout: HTTP timeout configuration
|
||||
"""
|
||||
self.base_url = base_url.rstrip("/")
|
||||
self.embedding_model = embedding_model
|
||||
self.generation_model = generation_model
|
||||
self.verify_ssl = verify_ssl
|
||||
|
||||
if timeout is None:
|
||||
timeout = httpx.Timeout(timeout=120, connect=5)
|
||||
|
||||
self.client = httpx.AsyncClient(verify=verify_ssl, timeout=timeout)
|
||||
self._dimension: int | None = None # Detected dynamically for embeddings
|
||||
|
||||
logger.info(
|
||||
f"Initialized Ollama provider: {base_url} "
|
||||
f"(embedding_model={embedding_model}, generation_model={generation_model}, "
|
||||
f"verify_ssl={verify_ssl})"
|
||||
)
|
||||
|
||||
# Pre-check and auto-load models
|
||||
if embedding_model:
|
||||
self._check_model_is_loaded(embedding_model, autoload=True)
|
||||
if generation_model:
|
||||
self._check_model_is_loaded(generation_model, autoload=True)
|
||||
|
||||
@property
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
return self.embedding_model is not None
|
||||
|
||||
@property
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
return self.generation_model is not None
|
||||
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
Generate embedding vector for text.
|
||||
|
||||
Args:
|
||||
text: Input text to embed
|
||||
|
||||
Returns:
|
||||
Vector embedding as list of floats
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
response = await self.client.post(
|
||||
f"{self.base_url}/api/embeddings",
|
||||
json={"model": self.embedding_model, "prompt": text},
|
||||
)
|
||||
response.raise_for_status()
|
||||
return response.json()["embedding"]
|
||||
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""
|
||||
Generate embeddings for multiple texts (one request per text).
|
||||
|
||||
Note: Ollama doesn't have native batch API, so we send requests sequentially.
|
||||
|
||||
Args:
|
||||
texts: List of texts to embed
|
||||
|
||||
Returns:
|
||||
List of vector embeddings
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
embeddings = []
|
||||
for text in texts:
|
||||
embedding = await self.embed(text)
|
||||
embeddings.append(embedding)
|
||||
return embeddings
|
||||
|
||||
async def _detect_dimension(self):
|
||||
"""
|
||||
Detect embedding dimension by generating a test embedding.
|
||||
|
||||
This method queries the model to determine the actual dimension
|
||||
instead of relying on hardcoded values.
|
||||
"""
|
||||
if self._dimension is None and self.supports_embeddings:
|
||||
logger.debug(
|
||||
f"Detecting embedding dimension for model {self.embedding_model}..."
|
||||
)
|
||||
test_embedding = await self.embed("test")
|
||||
self._dimension = len(test_embedding)
|
||||
logger.info(
|
||||
f"Detected embedding dimension: {self._dimension} "
|
||||
f"for model {self.embedding_model}"
|
||||
)
|
||||
|
||||
def get_dimension(self) -> int:
|
||||
"""
|
||||
Get embedding dimension.
|
||||
|
||||
Returns:
|
||||
Vector dimension for the configured embedding model
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
RuntimeError: If dimension not detected yet (call _detect_dimension first)
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
if self._dimension is None:
|
||||
raise RuntimeError(
    f"Embedding dimension not detected yet for model {self.embedding_model}. "
    "Await _detect_dimension() first."
|
||||
)
|
||||
return self._dimension
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""
|
||||
Generate text from a prompt.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If generation not enabled (no generation_model)
|
||||
"""
|
||||
if not self.supports_generation:
|
||||
raise NotImplementedError(
|
||||
"Text generation not supported - no generation_model configured"
|
||||
)
|
||||
|
||||
response = await self.client.post(
|
||||
f"{self.base_url}/api/generate",
|
||||
json={
|
||||
"model": self.generation_model,
|
||||
"prompt": prompt,
|
||||
"stream": False,
|
||||
"options": {
|
||||
"num_predict": max_tokens,
|
||||
"temperature": 0.7,
|
||||
},
|
||||
},
|
||||
)
|
||||
response.raise_for_status()
|
||||
data = response.json()
|
||||
return data["response"]
|
||||
|
||||
def _check_model_is_loaded(self, model: str, autoload: bool = True):
|
||||
"""
|
||||
Check if model is loaded in Ollama, optionally auto-loading it.
|
||||
|
||||
Args:
|
||||
model: Model name to check
|
||||
autoload: Whether to automatically pull the model if not loaded
|
||||
"""
|
||||
response = httpx.get(f"{self.base_url}/api/tags", verify=self.verify_ssl)
|
||||
response.raise_for_status()
|
||||
|
||||
models = [m["name"] for m in response.json().get("models", [])]
|
||||
logger.info("Ollama has the following models available: %s", models)
|
||||
|
||||
if (model not in models) and autoload:
|
||||
logger.warning(
|
||||
"Model '%s' not yet available in ollama, attempting to pull now...",
|
||||
model,
|
||||
)
|
||||
response = httpx.post(f"{self.base_url}/api/pull", json={"model": model}, verify=self.verify_ssl)
|
||||
response.raise_for_status()
|
||||
|
||||
async def close(self) -> None:
|
||||
"""Close HTTP client."""
|
||||
await self.client.aclose()
|
||||
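`OllamaProvider.embed_batch` awaits one `embed` call at a time. Because the provider already uses `httpx.AsyncClient`, the requests could overlap instead. The following is a sketch of bounded concurrency with `asyncio.gather` and a semaphore, not the PR's implementation; `embed_batch_concurrent`, its `limit` parameter, and `fake_embed` are hypothetical names.

```python
import asyncio


async def embed_batch_concurrent(embed, texts: list[str], limit: int = 4) -> list[list[float]]:
    """Embed texts with at most `limit` requests in flight; input order is preserved."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(text: str) -> list[float]:
        async with semaphore:
            return await embed(text)

    # gather returns results in the order of its arguments, not completion order.
    return await asyncio.gather(*(bounded(text) for text in texts))


async def fake_embed(text: str) -> list[float]:
    await asyncio.sleep(0)  # yield control as a real network call would
    return [float(len(text))]


vectors = asyncio.run(embed_batch_concurrent(fake_embed, ["a", "bb", "ccc"]))
```

The semaphore keeps a large batch from flooding a single Ollama instance; a suitable `limit` depends on the server and is left as an assumption here.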
@@ -0,0 +1,126 @@
|
||||
"""Provider registry and factory for auto-detection and instantiation."""
|
||||
|
||||
import logging
|
||||
import os
|
||||
|
||||
from .base import Provider
|
||||
from .bedrock import BedrockProvider
|
||||
from .ollama import OllamaProvider
|
||||
from .simple import SimpleProvider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class ProviderRegistry:
|
||||
"""
|
||||
Registry for provider auto-detection and instantiation.
|
||||
|
||||
Checks environment variables in priority order and creates appropriate provider:
|
||||
1. Bedrock (AWS_REGION or BEDROCK_*_MODEL)
|
||||
2. Ollama (OLLAMA_BASE_URL)
|
||||
3. Simple (fallback for testing/development)
|
||||
"""
|
||||
|
||||
@staticmethod
|
||||
def create_provider() -> Provider:
|
||||
"""
|
||||
Auto-detect and create provider based on environment variables.
|
||||
|
||||
Priority order:
|
||||
1. Bedrock - if AWS_REGION, BEDROCK_EMBEDDING_MODEL, or BEDROCK_GENERATION_MODEL is set
|
||||
2. Ollama - if OLLAMA_BASE_URL is set
|
||||
3. Simple - fallback for testing/development
|
||||
|
||||
Returns:
|
||||
Provider instance
|
||||
|
||||
Environment Variables:
|
||||
Bedrock:
|
||||
- AWS_REGION: AWS region (e.g., "us-east-1")
|
||||
- AWS_ACCESS_KEY_ID: AWS access key (optional, uses credential chain)
|
||||
- AWS_SECRET_ACCESS_KEY: AWS secret key (optional)
|
||||
- BEDROCK_EMBEDDING_MODEL: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0")
|
||||
- BEDROCK_GENERATION_MODEL: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
|
||||
|
||||
Ollama:
|
||||
- OLLAMA_BASE_URL: Ollama API base URL (e.g., "http://localhost:11434")
|
||||
- OLLAMA_EMBEDDING_MODEL: Model for embeddings (default: "nomic-embed-text")
|
||||
- OLLAMA_GENERATION_MODEL: Model for text generation (e.g., "llama3.2:1b")
|
||||
- OLLAMA_VERIFY_SSL: Verify SSL certificates (default: "true")
|
||||
|
||||
Simple (no configuration needed, fallback):
|
||||
- SIMPLE_EMBEDDING_DIMENSION: Embedding dimension (default: 384)
|
||||
"""
|
||||
# 1. Check for Bedrock
|
||||
aws_region = os.getenv("AWS_REGION")
|
||||
bedrock_embedding_model = os.getenv("BEDROCK_EMBEDDING_MODEL")
|
||||
bedrock_generation_model = os.getenv("BEDROCK_GENERATION_MODEL")
|
||||
|
||||
if aws_region or bedrock_embedding_model or bedrock_generation_model:
|
||||
logger.info(
|
||||
f"Using Bedrock provider: region={aws_region}, "
|
||||
f"embedding_model={bedrock_embedding_model}, "
|
||||
f"generation_model={bedrock_generation_model}"
|
||||
)
|
||||
return BedrockProvider(
|
||||
region_name=aws_region,
|
||||
embedding_model=bedrock_embedding_model,
|
||||
generation_model=bedrock_generation_model,
|
||||
aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
|
||||
aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
|
||||
)
|
||||
|
||||
# 2. Check for Ollama
|
||||
ollama_url = os.getenv("OLLAMA_BASE_URL")
|
||||
if ollama_url:
|
||||
embedding_model = os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text")
|
||||
generation_model = os.getenv("OLLAMA_GENERATION_MODEL")
|
||||
verify_ssl = os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true"
|
||||
|
||||
logger.info(
|
||||
f"Using Ollama provider: {ollama_url}, "
|
||||
f"embedding_model={embedding_model}, "
|
||||
f"generation_model={generation_model}"
|
||||
)
|
||||
return OllamaProvider(
|
||||
base_url=ollama_url,
|
||||
embedding_model=embedding_model,
|
||||
generation_model=generation_model,
|
||||
verify_ssl=verify_ssl,
|
||||
)
|
||||
|
||||
# 3. Fallback to Simple provider for development/testing
|
||||
dimension = int(os.getenv("SIMPLE_EMBEDDING_DIMENSION", "384"))
|
||||
logger.warning(
    "No provider configured (AWS_REGION, BEDROCK_*_MODEL, and OLLAMA_BASE_URL are unset). "
|
||||
"Using SimpleProvider for testing/development. "
|
||||
"For production, configure Bedrock or Ollama."
|
||||
)
|
||||
return SimpleProvider(dimension=dimension)
|
||||
|
||||
|
# Singleton instance
_provider: Provider | None = None


def get_provider() -> Provider:
    """
    Get singleton provider instance.

    Returns:
        Global Provider instance (auto-detected on first call)
    """
    global _provider
    if _provider is None:
        _provider = ProviderRegistry.create_provider()
    return _provider


def reset_provider():
    """
    Reset singleton provider instance.

    Useful for testing or reconfiguration.
    """
    global _provider
    _provider = None
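The priority order in `ProviderRegistry.create_provider` can be expressed as a pure function of the environment, which makes the selection rule easy to test in isolation. This is a sketch that mirrors the conditions in the diff; `detect_provider_name` is a hypothetical helper, and it returns a label and does not construct a provider.

```python
from collections.abc import Mapping


def detect_provider_name(env: Mapping[str, str]) -> str:
    """Return which provider the registry's priority order would select for `env`."""
    bedrock_keys = ("AWS_REGION", "BEDROCK_EMBEDDING_MODEL", "BEDROCK_GENERATION_MODEL")
    # 1. Any one Bedrock-related variable is enough to select Bedrock.
    if any(env.get(key) for key in bedrock_keys):
        return "bedrock"
    # 2. Ollama is considered only when no Bedrock variable is set.
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"
    # 3. Fallback for testing and development.
    return "simple"


fallback = detect_provider_name({})
ollama_only = detect_provider_name({"OLLAMA_BASE_URL": "http://localhost:11434"})
both = detect_provider_name(
    {"AWS_REGION": "us-east-1", "OLLAMA_BASE_URL": "http://localhost:11434"}
)
```

One consequence of this rule is that `AWS_REGION` on its own selects Bedrock even when `OLLAMA_BASE_URL` is also set, so a host that exports `AWS_REGION` for unrelated reasons will not reach the Ollama branch.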
@@ -0,0 +1,149 @@
+"""Simple in-process embedding provider for testing.
+
+This provider uses a basic TF-IDF-like approach with feature hashing to generate
+deterministic embeddings without requiring external services. Suitable for testing
+but not for production use.
+"""
+
+import hashlib
+import math
+import re
+from collections import Counter
+
+from .base import Provider
+
+
+class SimpleProvider(Provider):
+    """Simple deterministic embedding provider using feature hashing.
+
+    This implementation:
+    - Tokenizes text into words
+    - Uses feature hashing to map words to fixed-size vectors
+    - Applies TF-IDF-like weighting
+    - Normalizes vectors to unit length
+
+    Not suitable for production but good for testing semantic search infrastructure.
+    Only supports embeddings, not text generation.
+    """
+
+    def __init__(self, dimension: int = 384):
+        """Initialize simple embedding provider.
+
+        Args:
+            dimension: Embedding dimension (default: 384)
+        """
+        self.dimension = dimension
+
+    @property
+    def supports_embeddings(self) -> bool:
+        """Whether this provider supports embedding generation."""
+        return True
+
+    @property
+    def supports_generation(self) -> bool:
+        """Whether this provider supports text generation."""
+        return False
+
+    def _tokenize(self, text: str) -> list[str]:
+        """Tokenize text into lowercase words.
+
+        Args:
+            text: Input text
+
+        Returns:
+            List of lowercase word tokens
+        """
+        # Simple word tokenization
+        text = text.lower()
+        words = re.findall(r"\b\w+\b", text)
+        return words
+
+    def _hash_word(self, word: str) -> int:
+        """Hash word to dimension index.
+
+        Args:
+            word: Word to hash
+
+        Returns:
+            Index in range [0, dimension)
+        """
+        hash_bytes = hashlib.md5(word.encode()).digest()
+        hash_int = int.from_bytes(hash_bytes[:4], byteorder="big")
+        return hash_int % self.dimension
+
+    def _embed_single(self, text: str) -> list[float]:
+        """Generate embedding for single text.
+
+        Args:
+            text: Input text
+
+        Returns:
+            Normalized embedding vector
+        """
+        tokens = self._tokenize(text)
+        if not tokens:
+            return [0.0] * self.dimension
+
+        # Count term frequencies
+        term_freq = Counter(tokens)
+
+        # Initialize vector
+        vector = [0.0] * self.dimension
+
+        # Apply TF weighting with feature hashing
+        for word, count in term_freq.items():
+            idx = self._hash_word(word)
+            # Simple TF weighting: log(1 + count)
+            vector[idx] += math.log1p(count)
+
+        # Normalize to unit length
+        norm = math.sqrt(sum(x * x for x in vector))
+        if norm > 0:
+            vector = [x / norm for x in vector]
+
+        return vector
+
+    async def embed(self, text: str) -> list[float]:
+        """Generate embedding vector for text.
+
+        Args:
+            text: Input text to embed
+
+        Returns:
+            Vector embedding as list of floats
+        """
+        return self._embed_single(text)
+
+    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
+        """Generate embeddings for multiple texts.
+
+        Args:
+            texts: List of texts to embed
+
+        Returns:
+            List of vector embeddings
+        """
+        return [self._embed_single(text) for text in texts]
+
+    def get_dimension(self) -> int:
+        """Get embedding dimension.
+
+        Returns:
+            Vector dimension
+        """
+        return self.dimension
+
+    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
+        """
+        Generate text from a prompt.
+
+        Raises:
+            NotImplementedError: Simple provider doesn't support text generation
+        """
+        raise NotImplementedError(
+            "Text generation not supported by Simple provider - use Ollama, Anthropic, or Bedrock"
+        )
+
+    async def close(self) -> None:
+        """Close the provider (no-op for simple provider)."""
+        pass
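To see what the hashed embeddings look like in practice, the steps of `_embed_single` (tokenize, hash each word to an index, add `log(1 + count)`, normalize) can be restated as a standalone function. This is only a sketch for experimentation: `hashed_embedding` and `cosine` are illustrative names that do not appear in the commit. The weighting here is term frequency only (no inverse document frequency), matching the method body rather than the "TF-IDF-like" wording of the docstring.

```python
import hashlib
import math
import re
from collections import Counter


def hashed_embedding(text: str, dimension: int = 384) -> list[float]:
    """Feature-hashed, TF-weighted, unit-length embedding of `text`."""
    tokens = re.findall(r"\b\w+\b", text.lower())
    vector = [0.0] * dimension
    for word, count in Counter(tokens).items():
        # First 4 bytes of the MD5 digest pick the bucket for this word.
        digest = hashlib.md5(word.encode()).digest()
        idx = int.from_bytes(digest[:4], byteorder="big") % dimension
        # Distinct words may collide in one bucket; their weights add up.
        vector[idx] += math.log1p(count)
    norm = math.sqrt(sum(x * x for x in vector))
    # Empty input yields the all-zero vector, which cannot be normalized.
    return [x / norm for x in vector] if norm > 0 else vector


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    a = hashed_embedding("Nextcloud notes search")
    b = hashed_embedding("search my Nextcloud notes")
    c = hashed_embedding("")
    print(round(cosine(a, b), 3), cosine(a, c))
```

Because every component is non-negative, two texts that share at least one word always have a positive similarity, while unrelated texts score zero unless their words happen to collide in a bucket.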
@@ -104,6 +104,7 @@ module-root = ""
 [dependency-groups]
 dev = [
     "anthropic>=0.42.0", # For RAG evaluation with Anthropic LLMs
+    "boto3>=1.35.0", # For Amazon Bedrock provider (optional)
     "commitizen>=4.8.2",
     "datasets>=3.3.0", # For BeIR nfcorpus dataset loading
     "ipython>=9.2.0",
@@ -1,99 +1,20 @@
 """LLM provider abstraction for RAG evaluation.
 
-Supports Ollama (local) and Anthropic (cloud) providers for both ground truth
+DEPRECATED: This module is maintained for backward compatibility with RAG evaluation tests.
+New code should use nextcloud_mcp_server.providers directly.
+
+Supports Ollama (local), Anthropic (cloud), and Bedrock (AWS) providers for both ground truth
 generation and evaluation.
 """
 
 import os
-from typing import Protocol
 
-import httpx
-from anthropic import AsyncAnthropic
-
-
-class LLMProvider(Protocol):
-    """Protocol for LLM providers."""
-
-    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
-        """Generate text from a prompt.
-
-        Args:
-            prompt: The prompt to generate from
-            max_tokens: Maximum tokens to generate
-
-        Returns:
-            Generated text
-        """
-        ...
-
-    async def close(self) -> None:
-        """Close the provider and release resources."""
-        ...
-
-
-class OllamaProvider:
-    """Ollama provider for local LLM inference."""
-
-    def __init__(self, base_url: str, model: str):
-        """Initialize Ollama provider.
-
-        Args:
-            base_url: Ollama API base URL (e.g., http://localhost:11434)
-            model: Model name (e.g., llama3.1:8b)
-        """
-        self.base_url = base_url.rstrip("/")
-        self.model = model
-        self.client = httpx.AsyncClient(timeout=600.0)  # 10 min timeout for generation
-
-    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
-        """Generate text using Ollama API."""
-        response = await self.client.post(
-            f"{self.base_url}/api/generate",
-            json={
-                "model": self.model,
-                "prompt": prompt,
-                "stream": False,
-                "options": {
-                    "num_predict": max_tokens,
-                    "temperature": 0.7,
-                },
-            },
-        )
-        response.raise_for_status()
-        data = response.json()
-        return data["response"]
-
-    async def close(self):
-        """Close the HTTP client."""
-        await self.client.aclose()
-
-
-class AnthropicProvider:
-    """Anthropic provider for cloud LLM inference."""
-
-    def __init__(self, api_key: str, model: str):
-        """Initialize Anthropic provider.
-
-        Args:
-            api_key: Anthropic API key
-            model: Model name (e.g., claude-3-5-sonnet-20241022)
-        """
-        self.client = AsyncAnthropic(api_key=api_key)
-        self.model = model
-
-    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
-        """Generate text using Anthropic API."""
-        message = await self.client.messages.create(
-            model=self.model,
-            max_tokens=max_tokens,
-            temperature=0.7,
-            messages=[{"role": "user", "content": prompt}],
-        )
-        return message.content[0].text
-
-    async def close(self):
-        """Close the client (no-op for Anthropic)."""
-        pass
+from nextcloud_mcp_server.providers import (
+    AnthropicProvider,
+    BedrockProvider,
+    OllamaProvider,
+    Provider,
+)
 
 
 def create_llm_provider(
@@ -102,18 +23,24 @@ def create_llm_provider(
     ollama_model: str | None = None,
     anthropic_api_key: str | None = None,
     anthropic_model: str | None = None,
-) -> LLMProvider:
+    bedrock_region: str | None = None,
+    bedrock_model: str | None = None,
+) -> Provider:
     """Create an LLM provider from environment variables or arguments.
 
     Args:
-        provider: Provider type ('ollama' or 'anthropic'). Defaults to RAG_EVAL_PROVIDER env var or 'ollama'
+        provider: Provider type ('ollama', 'anthropic', or 'bedrock').
+            Defaults to RAG_EVAL_PROVIDER env var or 'ollama'
         ollama_base_url: Ollama base URL. Defaults to RAG_EVAL_OLLAMA_BASE_URL or 'http://localhost:11434'
-        ollama_model: Ollama model. Defaults to RAG_EVAL_OLLAMA_MODEL or 'llama3.1:8b'
+        ollama_model: Ollama model. Defaults to RAG_EVAL_OLLAMA_MODEL or 'llama3.2:1b'
         anthropic_api_key: Anthropic API key. Defaults to RAG_EVAL_ANTHROPIC_API_KEY env var
         anthropic_model: Anthropic model. Defaults to RAG_EVAL_ANTHROPIC_MODEL or 'claude-3-5-sonnet-20241022'
+        bedrock_region: AWS region. Defaults to RAG_EVAL_BEDROCK_REGION or AWS_REGION env var
+        bedrock_model: Bedrock model ID. Defaults to RAG_EVAL_BEDROCK_MODEL or
+            'anthropic.claude-3-sonnet-20240229-v1:0'
 
     Returns:
-        LLMProvider instance
+        Provider instance
 
     Raises:
         ValueError: If provider is invalid or required credentials are missing
@@ -130,7 +57,9 @@ def create_llm_provider(
             or "http://localhost:11434"
         )
         model = ollama_model or os.environ.get("RAG_EVAL_OLLAMA_MODEL", "llama3.2:1b")
-        return OllamaProvider(base_url=base_url, model=model)
+        return OllamaProvider(
+            base_url=base_url, embedding_model=None, generation_model=model
+        )
 
     elif provider == "anthropic":
         api_key = anthropic_api_key or os.environ.get("RAG_EVAL_ANTHROPIC_API_KEY")
@@ -143,7 +72,18 @@ def create_llm_provider(
         )
         return AnthropicProvider(api_key=api_key, model=model)
 
+    elif provider == "bedrock":
+        region = bedrock_region or os.environ.get(
+            "RAG_EVAL_BEDROCK_REGION", os.environ.get("AWS_REGION", "us-east-1")
+        )
+        model = bedrock_model or os.environ.get(
+            "RAG_EVAL_BEDROCK_MODEL", "anthropic.claude-3-sonnet-20240229-v1:0"
+        )
+        return BedrockProvider(
+            region_name=region, embedding_model=None, generation_model=model
+        )
+
     else:
         raise ValueError(
-            f"Invalid provider: {provider}. Must be 'ollama' or 'anthropic'."
+            f"Invalid provider: {provider}. Must be 'ollama', 'anthropic', or 'bedrock'."
         )
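The `bedrock` branch resolves its settings in a fixed order: explicit argument first, then the `RAG_EVAL_BEDROCK_*` variable, then (for the region only) `AWS_REGION`, and finally a hard-coded default. The sketch below isolates that lookup so the precedence can be checked against a plain dictionary instead of the real environment. `resolve_bedrock_settings` and its `env` parameter are hypothetical and not part of the commit; the variable names and defaults are copied from the hunk above. Note that the final `"us-east-1"` fallback exists in the code although the docstring does not mention it.

```python
import os
from collections.abc import Mapping
from typing import Optional


def resolve_bedrock_settings(
    bedrock_region: Optional[str] = None,
    bedrock_model: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> tuple[str, str]:
    """Return (region, model) using the same precedence as the diff."""
    # Argument > RAG_EVAL_BEDROCK_REGION > AWS_REGION > "us-east-1".
    region = bedrock_region or env.get(
        "RAG_EVAL_BEDROCK_REGION", env.get("AWS_REGION", "us-east-1")
    )
    # Argument > RAG_EVAL_BEDROCK_MODEL > default Claude 3 Sonnet model ID.
    model = bedrock_model or env.get(
        "RAG_EVAL_BEDROCK_MODEL", "anthropic.claude-3-sonnet-20240229-v1:0"
    )
    return region, model


if __name__ == "__main__":
    print(resolve_bedrock_settings(env={}))
    print(resolve_bedrock_settings(env={"AWS_REGION": "eu-west-1"}))
    print(resolve_bedrock_settings("ap-south-1", env={"AWS_REGION": "eu-west-1"}))
```

One consequence of using `dict.get` defaults is that a variable set to an empty string still counts as set, so an empty `RAG_EVAL_BEDROCK_REGION` would be passed through rather than falling back to `AWS_REGION`.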
@@ -0,0 +1 @@
+"""Unit tests for provider infrastructure."""
@@ -0,0 +1,280 @@
+"""Unit tests for Bedrock provider."""
+
+import json
+from unittest.mock import MagicMock
+
+import pytest
+
+from nextcloud_mcp_server.providers.bedrock import BOTO3_AVAILABLE, BedrockProvider
+
+
+@pytest.fixture
+def mock_bedrock_client(mocker):
+    """Mock boto3 bedrock-runtime client."""
+    if not BOTO3_AVAILABLE:
+        pytest.skip("boto3 not installed")
+
+    mock_client = MagicMock()
+    mocker.patch("boto3.client", return_value=mock_client)
+    return mock_client
+
+
+@pytest.mark.unit
+async def test_bedrock_embedding_titan(mock_bedrock_client):
+    """Test Bedrock embedding with Titan model."""
+    # Mock response
+    mock_response = {
+        "body": MagicMock(
+            read=MagicMock(
+                return_value=json.dumps({"embedding": [0.1, 0.2, 0.3]}).encode()
+            )
+        )
+    }
+    mock_bedrock_client.invoke_model.return_value = mock_response
+
+    # Create provider
+    provider = BedrockProvider(
+        region_name="us-east-1",
+        embedding_model="amazon.titan-embed-text-v2:0",
+        generation_model=None,
+    )
+
+    # Test embedding
+    embedding = await provider.embed("test text")
+
+    assert embedding == [0.1, 0.2, 0.3]
+    mock_bedrock_client.invoke_model.assert_called_once()
+    call_args = mock_bedrock_client.invoke_model.call_args
+
+    assert call_args.kwargs["modelId"] == "amazon.titan-embed-text-v2:0"
+    body = json.loads(call_args.kwargs["body"])
+    assert body == {"inputText": "test text"}
+
+
+@pytest.mark.unit
+async def test_bedrock_embedding_batch(mock_bedrock_client):
+    """Test Bedrock batch embedding."""
+    # Mock response
+    mock_response = {
+        "body": MagicMock(
+            read=MagicMock(
+                return_value=json.dumps({"embedding": [0.1, 0.2, 0.3]}).encode()
+            )
+        )
+    }
+    mock_bedrock_client.invoke_model.return_value = mock_response
+
+    # Create provider
+    provider = BedrockProvider(
+        region_name="us-east-1",
+        embedding_model="amazon.titan-embed-text-v2:0",
+        generation_model=None,
+    )
+
+    # Test batch embedding
+    embeddings = await provider.embed_batch(["text1", "text2"])
+
+    assert len(embeddings) == 2
+    assert embeddings[0] == [0.1, 0.2, 0.3]
+    assert embeddings[1] == [0.1, 0.2, 0.3]
+    assert mock_bedrock_client.invoke_model.call_count == 2
+
+
+@pytest.mark.unit
+async def test_bedrock_generation_claude(mock_bedrock_client):
+    """Test Bedrock text generation with Claude model."""
+    # Mock response
+    mock_response = {
+        "body": MagicMock(
+            read=MagicMock(
+                return_value=json.dumps(
+                    {"content": [{"text": "Generated response"}]}
+                ).encode()
+            )
+        )
+    }
+    mock_bedrock_client.invoke_model.return_value = mock_response
+
+    # Create provider
+    provider = BedrockProvider(
+        region_name="us-east-1",
+        embedding_model=None,
+        generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
+    )
+
+    # Test generation
+    text = await provider.generate("test prompt", max_tokens=100)
+
+    assert text == "Generated response"
+    mock_bedrock_client.invoke_model.assert_called_once()
+    call_args = mock_bedrock_client.invoke_model.call_args
+
+    assert call_args.kwargs["modelId"] == "anthropic.claude-3-sonnet-20240229-v1:0"
+    body = json.loads(call_args.kwargs["body"])
+    assert body["messages"][0]["content"] == "test prompt"
+    assert body["max_tokens"] == 100
+
+
+@pytest.mark.unit
+async def test_bedrock_generation_llama(mock_bedrock_client):
+    """Test Bedrock text generation with Llama model."""
+    # Mock response
+    mock_response = {
+        "body": MagicMock(
+            read=MagicMock(
+                return_value=json.dumps({"generation": "Llama response"}).encode()
+            )
+        )
+    }
+    mock_bedrock_client.invoke_model.return_value = mock_response
+
+    # Create provider
+    provider = BedrockProvider(
+        region_name="us-east-1",
+        embedding_model=None,
+        generation_model="meta.llama3-8b-instruct-v1:0",
+    )
+
+    # Test generation
+    text = await provider.generate("test prompt")
+
+    assert text == "Llama response"
+    body = json.loads(mock_bedrock_client.invoke_model.call_args.kwargs["body"])
+    assert body["prompt"] == "test prompt"
+    assert "max_gen_len" in body
+
+
+@pytest.mark.unit
+async def test_bedrock_both_capabilities(mock_bedrock_client):
+    """Test Bedrock with both embedding and generation models."""
+    # Mock responses
+    embed_response = {
+        "body": MagicMock(
+            read=MagicMock(return_value=json.dumps({"embedding": [0.1, 0.2]}).encode())
+        )
+    }
+    gen_response = {
+        "body": MagicMock(
+            read=MagicMock(
+                return_value=json.dumps({"content": [{"text": "Response"}]}).encode()
+            )
+        )
+    }
+
+    # Mock to return different responses based on modelId
+    def mock_invoke(modelId, body, **kwargs):
+        if "embed" in modelId:
+            return embed_response
+        else:
+            return gen_response
+
+    mock_bedrock_client.invoke_model.side_effect = mock_invoke
+
+    # Create provider with both models
+    provider = BedrockProvider(
+        region_name="us-east-1",
+        embedding_model="amazon.titan-embed-text-v2:0",
+        generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
+    )
+
+    assert provider.supports_embeddings is True
+    assert provider.supports_generation is True
+
+    # Test both capabilities
+    embedding = await provider.embed("test")
+    assert embedding == [0.1, 0.2]
+
+    text = await provider.generate("test")
+    assert text == "Response"
+
+
+@pytest.mark.unit
+async def test_bedrock_no_embeddings():
+    """Test Bedrock provider with no embedding model raises error."""
+    provider = BedrockProvider(
+        region_name="us-east-1",
+        embedding_model=None,
+        generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
+    )
+
+    assert provider.supports_embeddings is False
+
+    with pytest.raises(NotImplementedError, match="no embedding_model configured"):
+        await provider.embed("test")
+
+    with pytest.raises(NotImplementedError, match="no embedding_model configured"):
+        await provider.embed_batch(["test"])
+
+    with pytest.raises(NotImplementedError, match="no embedding_model configured"):
+        provider.get_dimension()
+
+
+@pytest.mark.unit
+async def test_bedrock_no_generation():
+    """Test Bedrock provider with no generation model raises error."""
+    provider = BedrockProvider(
+        region_name="us-east-1",
+        embedding_model="amazon.titan-embed-text-v2:0",
+        generation_model=None,
+    )
+
+    assert provider.supports_generation is False
+
+    with pytest.raises(NotImplementedError, match="no generation_model configured"):
+        await provider.generate("test")
+
+
+@pytest.mark.unit
+async def test_bedrock_dimension_detection(mock_bedrock_client):
+    """Test dimension detection for Bedrock embeddings."""
+    # Mock response with specific dimension
+    mock_response = {
+        "body": MagicMock(
+            read=MagicMock(
+                return_value=json.dumps(
+                    {"embedding": [0.1] * 1536}  # 1536-dim embedding
+                ).encode()
+            )
+        )
+    }
+    mock_bedrock_client.invoke_model.return_value = mock_response
+
+    provider = BedrockProvider(
+        region_name="us-east-1",
+        embedding_model="amazon.titan-embed-text-v2:0",
+    )
+
+    # Dimension not detected yet
+    with pytest.raises(RuntimeError, match="not detected yet"):
+        provider.get_dimension()
+
+    # Detect dimension
+    await provider._detect_dimension()
+
+    # Now dimension should be available
+    assert provider.get_dimension() == 1536
+
+
+@pytest.mark.unit
+async def test_bedrock_cohere_embedding(mock_bedrock_client):
+    """Test Bedrock with Cohere embedding model."""
+    # Mock response
+    mock_response = {
+        "body": MagicMock(
+            read=MagicMock(
+                return_value=json.dumps({"embeddings": [[0.1, 0.2, 0.3]]}).encode()
+            )
+        )
+    }
+    mock_bedrock_client.invoke_model.return_value = mock_response
+
+    provider = BedrockProvider(
+        region_name="us-east-1",
+        embedding_model="cohere.embed-english-v3",
+    )
+
+    embedding = await provider.embed("test text")
+
+    assert embedding == [0.1, 0.2, 0.3]
+    body = json.loads(mock_bedrock_client.invoke_model.call_args.kwargs["body"])
+    assert body == {"texts": ["test text"], "input_type": "search_document"}
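Every test above builds the same `{"body": MagicMock(read=...)}` structure by hand. A small factory could remove that repetition. The sketch below assumes, as the mocks in the tests imply, that the provider reads the result through `response["body"].read()` and decodes it as JSON. `make_invoke_response` is a hypothetical helper and is not part of the commit.

```python
import json
from unittest.mock import MagicMock


def make_invoke_response(payload: dict) -> dict:
    """Build a fake invoke_model() result whose body.read() yields JSON bytes."""
    body = MagicMock()
    body.read.return_value = json.dumps(payload).encode()
    return {"body": body}


# Stand-in for the patched boto3 client used by the fixture.
client = MagicMock()
client.invoke_model.return_value = make_invoke_response(
    {"embedding": [0.1, 0.2, 0.3]}
)

# What a provider would do with the response (assumed call shape).
response = client.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": "test text"}),
)
decoded = json.loads(response["body"].read())
print(decoded["embedding"])
```

With such a helper each test would only need to state the payload that differs between models (`embedding`, `embeddings`, `content`, or `generation`), which is the part the assertions actually depend on.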
@@ -233,6 +233,34 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/f8/aa/5082412d1ee302e9e7d80b6949bc4d2a8fa1149aaab610c5fc24709605d6/authlib-1.6.5-py2.py3-none-any.whl", hash = "sha256:3e0e0507807f842b02175507bdee8957a1d5707fd4afb17c32fb43fee90b6e3a", size = 243608, upload-time = "2025-10-02T13:36:07.637Z" },
 ]
 
+[[package]]
+name = "boto3"
+version = "1.40.74"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "botocore" },
+    { name = "jmespath" },
+    { name = "s3transfer" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/a2/37/0db5fc46548b347255310893f1a47971a1d8eb0dbc46dfb5ace8a1e7d45e/boto3-1.40.74.tar.gz", hash = "sha256:484e46bf394b03a7c31b34f90945ebe1390cb1e2ac61980d128a9079beac87d4", size = 111592, upload-time = "2025-11-14T20:29:10.991Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/d2/08/c52751748762901c0ca3c3019e3aa950010217f0fdf9940ebe68e6bb2f5a/boto3-1.40.74-py3-none-any.whl", hash = "sha256:41fc8844b37ae27b24bcabf8369769df246cc12c09453988d0696ad06d6aa9ef", size = 139360, upload-time = "2025-11-14T20:29:09.477Z" },
+]
+
+[[package]]
+name = "botocore"
+version = "1.40.74"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "jmespath" },
+    { name = "python-dateutil" },
+    { name = "urllib3" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/81/dc/0412505f05286f282a75bb0c650e525ddcfaf3f6f1a05cd8e99d32a2db06/botocore-1.40.74.tar.gz", hash = "sha256:57de0b9ffeada06015b3c7e5186c77d0692b210d9e5efa294f3214df97e2f8ee", size = 14452479, upload-time = "2025-11-14T20:29:00.949Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7d/a2/306dec16e3c84f3ca7aaead0084358c1c7fbe6501f6160844cbc93bc871e/botocore-1.40.74-py3-none-any.whl", hash = "sha256:f39f5763e35e75f0bd91212b7b36120b1536203e8003cd952ef527db79702b15", size = 14117911, upload-time = "2025-11-14T20:28:58.153Z" },
+]
+
 [[package]]
 name = "caldav"
 version = "2.0.2.dev47+g3e44cf827"
@@ -1296,6 +1324,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/2f/9c/6753e6522b8d0ef07d3a3d239426669e984fb0eba15a315cdbc1253904e4/jiter-0.12.0-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c24e864cb30ab82311c6425655b0cdab0a98c5d973b065c66a3f020740c2324c", size = 346110, upload-time = "2025-11-09T20:49:21.817Z" },
 ]
 
+[[package]]
+name = "jmespath"
+version = "1.0.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/00/2a/e867e8531cf3e36b41201936b7fa7ba7b5702dbef42922193f05c8976cd6/jmespath-1.0.1.tar.gz", hash = "sha256:90261b206d6defd58fdd5e85f478bf633a2901798906be2ad389150c5c60edbe", size = 25843, upload-time = "2022-06-17T18:00:12.224Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/31/b4/b9b800c45527aadd64d5b442f9b932b00648617eb5d63d2c7a6587b7cafc/jmespath-1.0.1-py3-none-any.whl", hash = "sha256:02e2e4cc71b5bcab88332eebf907519190dd9e6e82107fa7f83b1003a6252980", size = 20256, upload-time = "2022-06-17T18:00:10.251Z" },
+]
+
 [[package]]
 name = "jsonschema"
 version = "4.25.1"
@@ -1849,6 +1886,7 @@ dependencies = [
 [package.dev-dependencies]
 dev = [
     { name = "anthropic" },
+    { name = "boto3" },
     { name = "commitizen" },
     { name = "datasets" },
     { name = "ipython" },
@@ -1891,6 +1929,7 @@ requires-dist = [
 [package.metadata.requires-dev]
 dev = [
     { name = "anthropic", specifier = ">=0.42.0" },
+    { name = "boto3", specifier = ">=1.35.0" },
     { name = "commitizen", specifier = ">=4.8.2" },
     { name = "datasets", specifier = ">=3.3.0" },
     { name = "ipython", specifier = ">=9.2.0" },
@@ -3270,6 +3309,18 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/e5/80/69756670caedcf3b9be597a6e12276a6cf6197076eb62aad0c608f8efce0/ruff-0.14.5-py3-none-win_arm64.whl", hash = "sha256:4b700459d4649e2594b31f20a9de33bc7c19976d4746d8d0798ad959621d64a4", size = 13433331, upload-time = "2025-11-13T19:58:48.434Z" },
 ]
 
+[[package]]
+name = "s3transfer"
+version = "0.14.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "botocore" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/62/74/8d69dcb7a9efe8baa2046891735e5dfe433ad558ae23d9e3c14c633d1d58/s3transfer-0.14.0.tar.gz", hash = "sha256:eff12264e7c8b4985074ccce27a3b38a485bb7f7422cc8046fee9be4983e4125", size = 151547, upload-time = "2025-09-09T19:23:31.089Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/48/f0/ae7ca09223a81a1d890b2557186ea015f6e0502e9b8cb8e1813f1d8cfa4e/s3transfer-0.14.0-py3-none-any.whl", hash = "sha256:ea3b790c7077558ed1f02a3072fb3cb992bbbd253392f4b6e9e8976941c7d456", size = 85712, upload-time = "2025-09-09T19:23:30.041Z" },
+]
+
 [[package]]
 name = "shellingham"
 version = "1.5.4"