feat: add unified provider architecture with Amazon Bedrock support
Refactored LLM provider infrastructure to support sustainable additions of new providers with both embedding and text generation capabilities.
## Major Changes
### Unified Provider Architecture (ADR-015)
- Created `nextcloud_mcp_server/providers/` with unified Provider ABC
- Providers now support optional capabilities (embeddings and/or generation)
- Auto-detection registry with priority: Bedrock → Ollama → Simple
- Backward compatible - existing code continues to work
### New Providers
- **BedrockProvider**: Full Amazon Bedrock integration
  - Embeddings: Titan Embed, Cohere Embed models
  - Generation: Claude, Llama, Titan Text, Mistral models
  - Model-specific request/response handling
  - AWS credential chain integration
- **OllamaProvider**: Migrated, now supporting both capabilities
- **AnthropicProvider**: Moved from test code to production providers
- **SimpleProvider**: Migrated in-memory fallback provider
### Breaking Changes
None - full backward compatibility maintained:
- `embedding.get_embedding_service()` still works
- RAG evaluation tests updated to use unified providers
- All existing tests pass (127 unit tests)
### Testing
- Added 9 comprehensive Bedrock unit tests with mocked boto3
- All existing unit tests pass
- Type checking (ty) and linting (ruff) pass
- Verified backward compatibility
### Documentation
- `docs/ADR-015-unified-provider-architecture.md`: Comprehensive ADR
- `docs/bedrock-setup.md`: AWS setup guide with IAM permissions
- `CLAUDE.md`: Updated with provider architecture section
### Dependencies
- Added `boto3>=1.35.0` to dev dependencies (optional)
## Environment Variables
### Bedrock
- `AWS_REGION`: AWS region (e.g., "us-east-1")
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings
- `BEDROCK_GENERATION_MODEL`: Model ID for generation
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`: Optional credentials
### Ollama
- `OLLAMA_BASE_URL`: API URL
- `OLLAMA_EMBEDDING_MODEL`: Embedding model (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL`: Generation model
## AWS Bedrock Permissions Required
Minimal IAM policy statement:
```json
{
  "Effect": "Allow",
  "Action": ["bedrock:InvokeModel"],
  "Resource": ["arn:aws:bedrock:*::foundation-model/*"]
}
```
See `docs/bedrock-setup.md` for detailed setup instructions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
@@ -61,8 +61,60 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- `nextcloud_mcp_server/server/` - MCP tool/resource definitions
- `nextcloud_mcp_server/auth/` - OAuth/OIDC authentication
- `nextcloud_mcp_server/models/` - Pydantic response models
- `nextcloud_mcp_server/providers/` - Unified LLM provider infrastructure (embeddings + generation)
- `tests/` - Layered test suite (unit, smoke, integration, load)

### Provider Architecture (ADR-015)

**Unified Provider System** for embeddings and text generation:

**Location:** `nextcloud_mcp_server/providers/`
- `base.py` - `Provider` ABC with optional capabilities
- `registry.py` - Auto-detection and factory pattern
- `ollama.py` - Ollama provider (embeddings + generation)
- `anthropic.py` - Anthropic provider (generation only)
- `bedrock.py` - Amazon Bedrock provider (embeddings + generation)
- `simple.py` - Simple in-memory provider (embeddings only, fallback)

**Usage:**
```python
from nextcloud_mcp_server.providers import get_provider

provider = get_provider()  # Auto-detects from environment

# Check capabilities
if provider.supports_embeddings:
    embeddings = await provider.embed_batch(texts)

if provider.supports_generation:
    text = await provider.generate("prompt", max_tokens=500)
```

**Environment Variables:**

Bedrock:
- `AWS_REGION` - AWS region (e.g., "us-east-1")
- `BEDROCK_EMBEDDING_MODEL` - Embedding model ID (e.g., "amazon.titan-embed-text-v2:0")
- `BEDROCK_GENERATION_MODEL` - Generation model ID (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` - Optional, uses AWS credential chain

Ollama:
- `OLLAMA_BASE_URL` - API URL (e.g., "http://localhost:11434")
- `OLLAMA_EMBEDDING_MODEL` - Embedding model (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL` - Generation model (e.g., "llama3.2:1b")
- `OLLAMA_VERIFY_SSL` - SSL verification (default: "true")

Simple (fallback, no config needed):
- `SIMPLE_EMBEDDING_DIMENSION` - Dimension (default: 384)

**Auto-Detection Priority:** Bedrock → Ollama → Simple

**Backward Compatibility:**
- Old code using `nextcloud_mcp_server.embedding.get_embedding_service()` still works
- `EmbeddingService` now wraps `get_provider()` internally

**For Details:** See `docs/ADR-015-unified-provider-architecture.md`

## Development Commands (Quick Reference)

### Testing

@@ -0,0 +1,380 @@
# ADR-015: Unified Provider Architecture for Embeddings and Text Generation

**Status:** Accepted
**Date:** 2025-01-16
**Deciders:** Development Team
**Related:** ADR-003 (Vector Database), ADR-008 (MCP Sampling), ADR-013 (RAG Evaluation)

## Context

Prior to this refactoring, the codebase had two separate provider systems:

1. **Embedding Providers** (`nextcloud_mcp_server/embedding/`)
   - Used `EmbeddingProvider` ABC with methods: `embed()`, `embed_batch()`, `get_dimension()`
   - Had auto-detection via `EmbeddingService._detect_provider()`
   - Used for semantic search and vector indexing (production)

2. **LLM Providers** (`tests/rag_evaluation/llm_providers.py`)
   - Used `LLMProvider` Protocol with method: `generate()`
   - Had separate factory function `create_llm_provider()`
   - Used only for RAG evaluation tests (not production)

This fragmentation created several problems:

### Problems with Dual Provider Systems

1. **Code Duplication**
   - Ollama configuration appeared in both `embedding/service.py` and `tests/rag_evaluation/llm_providers.py`
   - Similar provider detection logic in multiple places
   - Separate singleton patterns for each system

2. **Limited Extensibility**
   - Hard-coded provider detection in `EmbeddingService._detect_provider()`
   - No support for providers that offer both capabilities (like Bedrock)
   - Adding new providers required modifying multiple files

3. **Inconsistent Patterns**
   - BM25 provider didn't follow `EmbeddingProvider` ABC
   - Different method names across providers (`embed` vs `encode`)
   - ABC vs Protocol for type checking

4. **Difficult Scaling**
   - Adding Amazon Bedrock (our third provider) would exacerbate all issues
   - No clear path for future providers (OpenAI, Cohere, etc.)

### Amazon Bedrock Requirements

Bedrock naturally supports **both** embeddings and text generation:
- **Embeddings**: `amazon.titan-embed-text-v1/v2`, `cohere.embed-*`
- **Text Generation**: `anthropic.claude-*`, `meta.llama3-*`, `amazon.titan-text-*`
- **Unified API**: Single `invoke_model()` method via bedrock-runtime

This made it the perfect opportunity to establish a unified provider architecture.

## Decision

We refactored the provider infrastructure to use a **unified Provider ABC** with optional capabilities:

### 1. Unified Provider Interface

**New Structure:**
```
nextcloud_mcp_server/providers/
├── __init__.py
├── base.py              # Provider ABC with optional capabilities
├── registry.py          # Auto-detection and factory
├── ollama.py            # Supports both embedding + generation
├── anthropic.py         # Generation only
├── bedrock.py           # Supports both embedding + generation
└── simple.py            # Embedding only (testing fallback)
```

**Base Class (`providers/base.py`):**
```python
class Provider(ABC):
    @property
    @abstractmethod
    def supports_embeddings(self) -> bool:
        """Whether this provider supports embedding generation."""
        pass

    @property
    @abstractmethod
    def supports_generation(self) -> bool:
        """Whether this provider supports text generation."""
        pass

    @abstractmethod
    async def embed(self, text: str) -> list[float]:
        """Generate embedding (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        """Generate batch embeddings (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    def get_dimension(self) -> int:
        """Get embedding dimension (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        """Generate text (raises NotImplementedError if not supported)."""
        pass

    @abstractmethod
    async def close(self) -> None:
        """Close provider and release resources."""
        pass
```
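
To illustrate the optional-capability contract, here is a minimal sketch of a generation-only provider. `EchoProvider` is hypothetical and not part of the codebase, and the `Provider` ABC is restated in condensed form only so that the block runs on its own.

```python
import asyncio
from abc import ABC, abstractmethod


class Provider(ABC):
    """Condensed restatement of the ABC above, so this sketch runs standalone."""

    @property
    @abstractmethod
    def supports_embeddings(self) -> bool: ...

    @property
    @abstractmethod
    def supports_generation(self) -> bool: ...

    @abstractmethod
    async def embed(self, text: str) -> list[float]: ...

    @abstractmethod
    async def embed_batch(self, texts: list[str]) -> list[list[float]]: ...

    @abstractmethod
    def get_dimension(self) -> int: ...

    @abstractmethod
    async def generate(self, prompt: str, max_tokens: int = 500) -> str: ...

    @abstractmethod
    async def close(self) -> None: ...


class EchoProvider(Provider):
    """Hypothetical generation-only provider, used purely for illustration."""

    @property
    def supports_embeddings(self) -> bool:
        return False

    @property
    def supports_generation(self) -> bool:
        return True

    async def embed(self, text: str) -> list[float]:
        raise NotImplementedError("EchoProvider does not support embeddings")

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        raise NotImplementedError("EchoProvider does not support embeddings")

    def get_dimension(self) -> int:
        raise NotImplementedError("EchoProvider does not support embeddings")

    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        # Stand-in for a real model call: echo the prompt, truncated by characters.
        return prompt[:max_tokens]

    async def close(self) -> None:
        return None


async def main() -> str:
    provider = EchoProvider()
    # Callers check the capability flag before calling the matching method.
    if provider.supports_generation:
        return await provider.generate("hello", max_tokens=3)
    return ""


result = asyncio.run(main())
```

Because every method is abstract, a provider that lacks a capability still has to define the corresponding methods; raising `NotImplementedError` there keeps the failure explicit for callers that skip the capability check.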

### 2. Provider Registry

**Auto-Detection Priority** (`providers/registry.py`):
```python
class ProviderRegistry:
    @staticmethod
    def create_provider() -> Provider:
        # 1. Bedrock (AWS_REGION or BEDROCK_*_MODEL)
        # 2. Ollama (OLLAMA_BASE_URL)
        # 3. Simple (fallback)
        ...
```
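
The priority order in those comments can be sketched as a small pure function. This is one reading of the comments, not the actual registry code: `detect_provider_name` is a hypothetical helper, and the real `create_provider()` returns a `Provider` instance and may apply further checks.

```python
import os
from collections.abc import Mapping


def detect_provider_name(env: Mapping[str, str] = os.environ) -> str:
    """Return "bedrock", "ollama", or "simple" following the documented priority."""
    # 1. Bedrock: AWS_REGION or either BEDROCK_*_MODEL variable is set.
    if (
        env.get("AWS_REGION")
        or env.get("BEDROCK_EMBEDDING_MODEL")
        or env.get("BEDROCK_GENERATION_MODEL")
    ):
        return "bedrock"
    # 2. Ollama: OLLAMA_BASE_URL is set.
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"
    # 3. Simple: in-memory fallback, no configuration required.
    return "simple"
```

Under this reading, an environment that sets `AWS_REGION` for unrelated reasons would select Bedrock even when `OLLAMA_BASE_URL` is also configured, which is worth keeping in mind when both are present.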

**Environment Variables:**

**Bedrock:**
- `AWS_REGION`: AWS region (e.g., "us-east-1")
- `AWS_ACCESS_KEY_ID`: AWS access key (optional, uses credential chain)
- `AWS_SECRET_ACCESS_KEY`: AWS secret key (optional)
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0")
- `BEDROCK_GENERATION_MODEL`: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")

**Ollama:**
- `OLLAMA_BASE_URL`: Ollama API base URL (e.g., "http://localhost:11434")
- `OLLAMA_EMBEDDING_MODEL`: Model for embeddings (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL`: Model for text generation (e.g., "llama3.2:1b")
- `OLLAMA_VERIFY_SSL`: Verify SSL certificates (default: "true")

**Simple (no configuration, fallback):**
- `SIMPLE_EMBEDDING_DIMENSION`: Embedding dimension (default: 384)

### 3. Backward Compatibility

**Old Code Continues to Work:**
```python
# Old way (still works)
from nextcloud_mcp_server.embedding import get_embedding_service

service = get_embedding_service()  # Returns singleton Provider
embeddings = await service.embed_batch(texts)
```

**New Way (recommended):**
```python
# New way (cleaner)
from nextcloud_mcp_server.providers import get_provider

provider = get_provider()  # Returns singleton Provider
embeddings = await provider.embed_batch(texts)

# Can also use generation if provider supports it
if provider.supports_generation:
    text = await provider.generate("prompt")
```

**Migration Path:**
- `embedding/service.py` now wraps `providers.get_provider()` for compatibility
- `tests/rag_evaluation/llm_providers.py` now uses unified providers
- Old imports still work, marked as deprecated in docstrings

### 4. Amazon Bedrock Implementation

**Features:**
- Supports both embeddings and text generation
- Model-specific request/response handling for:
  - Titan Embed (amazon.titan-embed-text-*)
  - Cohere Embed (cohere.embed-*)
  - Claude (anthropic.claude-*)
  - Llama (meta.llama3-*)
  - Titan Text (amazon.titan-text-*)
  - Mistral (mistral.*)
- Uses boto3 bedrock-runtime client
- Graceful degradation if boto3 not installed
- Async implementation matching existing patterns
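
The last three features can be sketched together. This is an assumption about how such a provider might be built, not the code in `bedrock.py`: the helper names `make_bedrock_client` and `invoke_json` are hypothetical, and running the blocking boto3 call in a worker thread via `asyncio.to_thread` is one possible way to fit it into async code.

```python
import asyncio
import json
from typing import Any

try:
    import boto3  # Optional dependency: only needed for Bedrock.
except ImportError:
    boto3 = None


def make_bedrock_client(region_name: str) -> Any:
    """Create a bedrock-runtime client, failing with a clear message if boto3 is absent."""
    if boto3 is None:
        raise RuntimeError("Bedrock support requires boto3: pip install boto3")
    return boto3.client("bedrock-runtime", region_name=region_name)


async def invoke_json(client: Any, model_id: str, payload: dict) -> dict:
    """Call the blocking invoke_model() in a worker thread and decode the JSON reply."""
    response = await asyncio.to_thread(
        client.invoke_model,
        modelId=model_id,
        body=json.dumps(payload),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["body"].read())
```

Passing the client in as a parameter keeps the call easy to exercise with a stub in unit tests, in the same spirit as the mocked boto3 tests mentioned in this ADR.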

**Model-Specific Handling:**
```python
# Bedrock embedding request (Titan)
{"inputText": text}

# Bedrock generation request (Claude)
{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": max_tokens,
    "temperature": 0.7,
    "messages": [{"role": "user", "content": prompt}]
}
```
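
One way to organise this per-model handling is to dispatch on the model ID prefix. The sketch below covers only the two request shapes shown above; `build_request_body` is a hypothetical helper, and the other model families are deliberately left out rather than guessed at.

```python
import json


def build_request_body(model_id: str, text: str, max_tokens: int = 500) -> str:
    """Serialize the JSON body for the two model families shown above."""
    if model_id.startswith("amazon.titan-embed-text"):
        body = {"inputText": text}
    elif model_id.startswith("anthropic.claude"):
        body = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "temperature": 0.7,
            "messages": [{"role": "user", "content": text}],
        }
    else:
        # Cohere, Llama, Titan Text and Mistral each use their own schema,
        # which this sketch does not cover.
        raise ValueError(f"No request format defined in this sketch for {model_id!r}")
    return json.dumps(body)
```

Failing loudly on an unknown prefix avoids sending a body in the wrong schema, which Bedrock would otherwise reject at request time.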

## Consequences

### Positive

1. **Sustainable Provider Additions**
   - New providers only need to implement `Provider` ABC
   - Auto-detection via environment variables
   - No modifications to existing code required

2. **Code Consolidation**
   - Single provider interface instead of two
   - Unified configuration pattern
   - Eliminated duplication

3. **Better Extensibility**
   - Providers can support one or both capabilities
   - Clear capability detection via properties
   - Registry pattern simplifies auto-detection

4. **Improved Testing**
   - RAG evaluation can use any provider (Ollama, Anthropic, Bedrock)
   - Comprehensive unit tests for all providers
   - Mocked boto3 tests for Bedrock

5. **Production-Ready Bedrock Support**
   - Full embedding and generation support
   - Multiple model families supported
   - AWS credential chain integration

### Neutral

1. **Optional Boto3 Dependency**
   - boto3 is a dev dependency only (not required for core functionality)
   - Bedrock provider fails gracefully if boto3 is not installed
   - Users who want Bedrock must `pip install boto3`

2. **Capability Properties**
   - All providers must implement capability properties
   - Methods raise `NotImplementedError` if capability not supported
   - Clear error messages guide users to alternatives

### Negative

1. **Migration Effort**
   - Existing code can be migrated to the new imports (optional; old imports remain backward compatible)
   - Documentation needs updating
   - Users must learn new environment variables

2. **Increased Complexity**
   - Provider base class has more methods (embedding + generation)
   - More environment variables to configure
   - Capability detection adds runtime checks

## Implementation

### Files Created

**New Provider Infrastructure:**
- `nextcloud_mcp_server/providers/__init__.py`
- `nextcloud_mcp_server/providers/base.py`
- `nextcloud_mcp_server/providers/registry.py`
- `nextcloud_mcp_server/providers/ollama.py`
- `nextcloud_mcp_server/providers/anthropic.py`
- `nextcloud_mcp_server/providers/bedrock.py`
- `nextcloud_mcp_server/providers/simple.py`

**Tests:**
- `tests/unit/providers/__init__.py`
- `tests/unit/providers/test_bedrock.py` (9 unit tests)

**Documentation:**
- `docs/ADR-015-unified-provider-architecture.md` (this file)

### Files Modified

**Backward Compatibility:**
- `nextcloud_mcp_server/embedding/service.py` - Now wraps `get_provider()`
- `tests/rag_evaluation/llm_providers.py` - Uses unified providers

**Dependencies:**
- `pyproject.toml` - Added `boto3>=1.35.0` to dev dependencies

### Testing Results

**Unit Tests:** 127 passed (including 9 new Bedrock tests)
**Type Checking:** All checks passed (ty)
**Linting:** All checks passed (ruff)
**Backward Compatibility:** Verified - existing embedding tests work

## Alternatives Considered

### Alternative 1: Keep Separate Provider Systems

**Pros:**
- No refactoring needed
- Simpler short-term

**Cons:**
- Bedrock would need to be implemented twice
- Continued code duplication
- No long-term scalability

**Decision:** Rejected - technical debt would continue to grow

### Alternative 2: Separate Embedding and Generation Providers

Use composition instead of unified interface:
```python
class CombinedProvider:
    def __init__(self, embedding: EmbeddingProvider, generation: LLMProvider):
        self.embedding = embedding
        self.generation = generation
```

**Pros:**
- Clearer separation of concerns
- Simpler individual providers

**Cons:**
- Bedrock and Ollama naturally do both - artificial separation
- More complex configuration (two providers to configure)
- More boilerplate code

**Decision:** Rejected - unified interface better matches provider capabilities

### Alternative 3: Plugin System

Dynamic provider registration via entry points:
```python
# setup.py
entry_points={
    'nextcloud_mcp.providers': [
        'ollama = nextcloud_mcp_server.providers.ollama:OllamaProvider',
        'bedrock = nextcloud_mcp_server.providers.bedrock:BedrockProvider',
    ]
}
```

**Pros:**
- Most extensible
- Third-party providers possible

**Cons:**
- Over-engineered for current needs
- Added complexity
- No immediate benefit

**Decision:** Deferred - can add later if needed

## Future Work

1. **Additional Providers**
   - OpenAI (embeddings + generation)
   - Cohere (embeddings + generation)
   - Google Vertex AI
   - Azure OpenAI

2. **Provider Features**
   - Streaming generation support
   - Batch API optimization (when available)
   - Model-specific optimizations
   - Cost tracking and metrics

3. **Configuration Improvements**
   - Provider profiles (development, production)
   - Model aliasing (e.g., "small", "large")
   - Fallback provider chains

4. **Testing**
   - Integration tests with real Bedrock endpoints
   - Performance benchmarking across providers
   - Cost comparison analysis

## References

- [boto3 Bedrock Runtime Documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html)
- [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
- ADR-003: Vector Database and Semantic Search
- ADR-008: MCP Sampling for Semantic Search
- ADR-013: RAG Evaluation Framework

@@ -0,0 +1,338 @@
# Amazon Bedrock Setup Guide

This guide covers how to configure the Nextcloud MCP Server to use Amazon Bedrock for embeddings and text generation.

## Prerequisites

1. **AWS Account** with access to Amazon Bedrock
2. **boto3 library** installed: `pip install boto3` or `uv sync --group dev`
3. **Model Access** - Request access to models in AWS Bedrock console

## Required AWS Permissions

### IAM Policy for Bedrock Access

The AWS IAM user or role needs the following permissions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockInvokeModels",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*"
      ]
    }
  ]
}
```

### Minimal Permissions (Production)

For production deployments, restrict to specific models:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockEmbeddings",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
      ]
    },
    {
      "Sid": "BedrockGeneration",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
      ]
    }
  ]
}
```

### Additional Permissions (Optional)

For advanced use cases:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockListModels",
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:GetFoundationModel"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BedrockAsyncInvoke",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModelAsync",
        "bedrock:GetAsyncInvoke",
        "bedrock:ListAsyncInvokes"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*"
      ]
    }
  ]
}
```

## Model Access

Before using Bedrock models, you must request access in the AWS Console:

1. Navigate to **Amazon Bedrock** → **Model access**
2. Click **Manage model access**
3. Select models you want to use:
   - **Embeddings:** Amazon Titan Embed Text, Cohere Embed
   - **Text Generation:** Anthropic Claude, Meta Llama, Amazon Titan Text
4. Click **Request model access**
5. Wait for approval (usually instant for most models)

## Supported Models

### Embedding Models

| Provider | Model ID | Dimensions | Best For |
|----------|----------|------------|----------|
| Amazon Titan | `amazon.titan-embed-text-v1` | 1,536 | General purpose |
| Amazon Titan | `amazon.titan-embed-text-v2:0` | 1,024 | Latest, improved quality |
| Cohere | `cohere.embed-english-v3` | 1,024 | English text |
| Cohere | `cohere.embed-multilingual-v3` | 1,024 | Multilingual |

### Text Generation Models

| Provider | Model ID | Context | Best For |
|----------|----------|---------|----------|
| Anthropic | `anthropic.claude-3-sonnet-20240229-v1:0` | 200K | Balanced performance |
| Anthropic | `anthropic.claude-3-haiku-20240307-v1:0` | 200K | Fast, cost-effective |
| Anthropic | `anthropic.claude-3-opus-20240229-v1:0` | 200K | Highest quality |
| Meta | `meta.llama3-8b-instruct-v1:0` | 8K | Fast, open-source |
| Meta | `meta.llama3-70b-instruct-v1:0` | 8K | High quality |
| Amazon | `amazon.titan-text-express-v1` | 8K | Fast, low cost |
| Mistral | `mistral.mistral-7b-instruct-v0:2` | 32K | Efficient |

## Configuration

### Environment Variables

**Required:**
```bash
AWS_REGION=us-east-1
```

**Optional (at least one model required):**
```bash
# For embeddings
BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0

# For text generation (RAG evaluation)
BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
```

**AWS Credentials (choose one method):**

**Method 1: Environment Variables**
```bash
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```

**Method 2: AWS Credentials File** (`~/.aws/credentials`)
```ini
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```

**Method 3: IAM Role** (when running on AWS EC2/ECS/Lambda)
- No credentials needed, uses instance/task role automatically

### Docker Configuration

Add to your `docker-compose.yml`:

```yaml
services:
  mcp:
    environment:
      - AWS_REGION=us-east-1
      - BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
      - BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
```

Or use AWS credentials file volume mount:

```yaml
services:
  mcp:
    volumes:
      - ~/.aws:/root/.aws:ro
    environment:
      - AWS_REGION=us-east-1
      - BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
```

## Usage Examples

### Embeddings Only

```bash
export AWS_REGION=us-east-1
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret

uv run nextcloud-mcp-server
```

### Both Embeddings and Generation

```bash
export AWS_REGION=us-east-1
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
export BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0

# For RAG evaluation with Bedrock
export RAG_EVAL_PROVIDER=bedrock
export RAG_EVAL_BEDROCK_MODEL=anthropic.claude-3-sonnet-20240229-v1:0

uv run python -m tests.rag_evaluation.evaluate
```

### Programmatic Usage

```python
from nextcloud_mcp_server.providers import BedrockProvider

# Embeddings only
provider = BedrockProvider(
    region_name="us-east-1",
    embedding_model="amazon.titan-embed-text-v2:0",
)

embeddings = await provider.embed_batch(["text1", "text2"])

# Both capabilities
provider = BedrockProvider(
    region_name="us-east-1",
    embedding_model="amazon.titan-embed-text-v2:0",
    generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
)

# Generate embeddings
embedding = await provider.embed("query text")

# Generate text
response = await provider.generate("Write a summary", max_tokens=500)
```

## Cost Considerations

### Embedding Costs (as of Jan 2025)

| Model | Price per 1K tokens |
|-------|---------------------|
| Titan Embed Text v2 | $0.0001 |
| Cohere Embed English v3 | $0.0001 |

### Generation Costs (as of Jan 2025)

| Model | Input (per 1K tokens) | Output (per 1K tokens) |
|-------|----------------------|------------------------|
| Claude 3 Haiku | $0.00025 | $0.00125 |
| Claude 3 Sonnet | $0.003 | $0.015 |
| Claude 3 Opus | $0.015 | $0.075 |
| Llama 3 8B | $0.0003 | $0.0006 |
| Titan Text Express | $0.0002 | $0.0006 |

**Note:** Prices vary by region. Check [AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/) for current rates.
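
As a rough worked example, the per-request cost follows directly from the generation table above. The sketch below copies those figures as they appear in this guide; the dictionary keys and the `estimate_cost` helper are hypothetical names, and the numbers should be rechecked against the pricing page before relying on them.

```python
# Prices per 1K tokens (input, output), copied from the generation table above.
PRICES_PER_1K = {
    "claude-3-haiku": (0.00025, 0.00125),
    "claude-3-sonnet": (0.003, 0.015),
    "claude-3-opus": (0.015, 0.075),
    "llama-3-8b": (0.0003, 0.0006),
    "titan-text-express": (0.0002, 0.0006),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request: tokens / 1000 * per-1K price, per direction."""
    input_price, output_price = PRICES_PER_1K[model]
    return input_tokens / 1000 * input_price + output_tokens / 1000 * output_price
```

For instance, a request with 2,000 input tokens and 500 output tokens on Claude 3 Sonnet comes to 2 × $0.003 + 0.5 × $0.015 = $0.0135 at these rates.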

## Troubleshooting

### Error: "Executable doesn't exist" or boto3 not found

**Solution:**
```bash
uv sync --group dev  # Installs boto3
```

### Error: "AccessDeniedException"

**Causes:**
1. IAM permissions missing
2. Model access not requested
3. Wrong AWS region

**Solution:**
1. Verify IAM policy includes `bedrock:InvokeModel`
2. Request model access in Bedrock console
3. Check model is available in your region

### Error: "ResourceNotFoundException"

**Cause:** Invalid model ID or model not available in region

**Solution:**
- Verify model ID matches exactly (case-sensitive)
- Check model availability in your AWS region
- Use `aws bedrock list-foundation-models` to see available models

### Error: "ThrottlingException"

**Cause:** Rate limit exceeded

**Solution:**
- Reduce request rate
- Request quota increase via AWS Support
- Use batch operations where possible
|
||||||
|
|
||||||
|
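One way to reduce the request rate after a `ThrottlingException` is to retry with exponential backoff. The following is a minimal sketch, not part of this PR: `call_with_backoff` and `is_throttling_error` are hypothetical helpers, and the only boto3 behavior assumed is that `botocore` `ClientError` instances carry a `response` dict with an `Error.Code` entry.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def is_throttling_error(exc: Exception) -> bool:
    """Detect throttling via the error code found on botocore ClientError."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying throttled calls with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            # Give up on the final attempt or on any non-throttling error.
            if last_attempt or not is_throttling_error(exc):
                raise
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * (2**attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

A caller would wrap the Bedrock call in a zero-argument callable, for example `call_with_backoff(lambda: client.invoke_model(...))`. boto3 clients can also be configured with built-in retries through `botocore.config.Config(retries=...)`, which may be preferable to a hand-rolled loop.
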
## Security Best Practices

1. **Use IAM Roles** when running on AWS infrastructure
2. **Rotate Access Keys** regularly if using IAM users
3. **Restrict Permissions** to only required models
4. **Enable CloudTrail** for audit logging
5. **Use AWS Secrets Manager** for credential management
6. **Monitor Costs** with AWS Cost Explorer and Budgets

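Restricting permissions to the required models usually means listing specific foundation-model ARNs in the policy instead of `*`. The sketch below builds such a policy document as a plain dict; `build_invoke_policy` is a hypothetical helper for illustration, and the ARN pattern `arn:aws:bedrock:<region>::foundation-model/<model-id>` should be checked against the current AWS documentation for your account setup.

```python
import json


def build_invoke_policy(region: str, model_ids: list[str]) -> dict:
    """Build an IAM policy allowing bedrock:InvokeModel on the given models only."""
    # Foundation-model ARNs have an empty account field (note the double colon).
    resources = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel"],
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    policy = build_invoke_policy(
        "us-east-1",
        ["amazon.titan-embed-text-v2:0", "anthropic.claude-3-sonnet-20240229-v1:0"],
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into an IAM policy editor; adding or removing a model then becomes a one-line change to the list.
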
## Regional Availability

Amazon Bedrock is available in:
- **US East (N. Virginia)**: `us-east-1` ✅ Most models
- **US West (Oregon)**: `us-west-2` ✅ Most models
- **Asia Pacific (Singapore)**: `ap-southeast-1`
- **Asia Pacific (Tokyo)**: `ap-northeast-1`
- **Europe (Frankfurt)**: `eu-central-1`

**Note:** Model availability varies by region. Check the [AWS Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html) for current availability.

## References

- [AWS Bedrock Documentation](https://docs.aws.amazon.com/bedrock/)
- [AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)
- [boto3 Bedrock Runtime API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html)
- [Provider Architecture ADR](./ADR-015-unified-provider-architecture.md)

@@ -1,57 +1,30 @@
-"""Embedding service with provider detection."""
+"""Embedding service with provider detection.
+
+DEPRECATED: This module is maintained for backward compatibility.
+New code should use nextcloud_mcp_server.providers.get_provider() directly.
+"""
 
 import logging
-import os
 
-from .base import EmbeddingProvider
+from nextcloud_mcp_server.providers import get_provider
+
 from .bm25_provider import BM25SparseEmbeddingProvider
-from .ollama_provider import OllamaEmbeddingProvider
-from .simple_provider import SimpleEmbeddingProvider
 
 logger = logging.getLogger(__name__)
 
 
 class EmbeddingService:
-    """Unified embedding service with automatic provider detection."""
+    """
+    Unified embedding service with automatic provider detection.
+
+    DEPRECATED: This class wraps the new unified provider infrastructure
+    for backward compatibility. New code should use
+    nextcloud_mcp_server.providers.get_provider() directly.
+    """
 
     def __init__(self):
         """Initialize embedding service with auto-detected provider."""
-        self.provider = self._detect_provider()
+        self.provider = get_provider()
 
-    def _detect_provider(self) -> EmbeddingProvider:
-        """
-        Auto-detect available embedding provider.
-
-        Checks environment variables in order:
-        1. OLLAMA_BASE_URL - Use Ollama provider (production)
-        2. OPENAI_API_KEY - Use OpenAI provider (future)
-        3. Fallback to SimpleEmbeddingProvider (testing/development)
-
-        Returns:
-            Configured embedding provider
-        """
-        # Ollama provider (production)
-        ollama_url = os.getenv("OLLAMA_BASE_URL")
-        if ollama_url:
-            logger.info(f"Using Ollama embedding provider: {ollama_url}")
-            return OllamaEmbeddingProvider(
-                base_url=ollama_url,
-                model=os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text"),
-                verify_ssl=os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true",
-            )
-
-        # OpenAI provider (future implementation)
-        # openai_key = os.getenv("OPENAI_API_KEY")
-        # if openai_key:
-        #     return OpenAIEmbeddingProvider(api_key=openai_key)
-
-        # Fallback to simple provider for development/testing
-        logger.warning(
-            "No embedding provider configured (OLLAMA_BASE_URL or OPENAI_API_KEY not set). "
-            "Using SimpleEmbeddingProvider for testing/development. "
-            "For production, configure an external embedding service."
-        )
-        return SimpleEmbeddingProvider(dimension=384)
-
     async def embed(self, text: str) -> list[float]:
         """
@@ -0,0 +1,18 @@
"""Unified provider infrastructure for embeddings and text generation."""

from .anthropic import AnthropicProvider
from .base import Provider
from .bedrock import BedrockProvider
from .ollama import OllamaProvider
from .registry import get_provider, reset_provider
from .simple import SimpleProvider

__all__ = [
    "Provider",
    "OllamaProvider",
    "AnthropicProvider",
    "SimpleProvider",
    "BedrockProvider",
    "get_provider",
    "reset_provider",
]
@@ -0,0 +1,97 @@
"""Unified Anthropic provider for text generation."""

import logging

from anthropic import AsyncAnthropic

from .base import Provider

logger = logging.getLogger(__name__)


class AnthropicProvider(Provider):
    """
    Anthropic provider for text generation.

    Supports Claude models via the Anthropic API.
    Note: Anthropic doesn't provide embedding models, only text generation.
    """

    def __init__(self, api_key: str, model: str = "claude-3-5-sonnet-20241022"):
        """
        Initialize Anthropic provider.

        Args:
            api_key: Anthropic API key
            model: Model name (e.g., "claude-3-5-sonnet-20241022")
        """
        self.client = AsyncAnthropic(api_key=api_key)
        self.model = model

        logger.info(f"Initialized Anthropic provider (model={model})")

    @property
    def supports_embeddings(self) -> bool:
        """Whether this provider supports embedding generation."""
        return False

    @property
    def supports_generation(self) -> bool:
        """Whether this provider supports text generation."""
        return True

    async def embed(self, text: str) -> list[float]:
        """
        Generate embedding vector for text.

        Raises:
            NotImplementedError: Anthropic doesn't provide embedding models
        """
        raise NotImplementedError(
            "Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
        )

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        """
        Generate embeddings for multiple texts.

        Raises:
            NotImplementedError: Anthropic doesn't provide embedding models
        """
        raise NotImplementedError(
            "Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
        )

    def get_dimension(self) -> int:
        """
        Get embedding dimension.

        Raises:
            NotImplementedError: Anthropic doesn't provide embedding models
        """
        raise NotImplementedError(
            "Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
        )

    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        """
        Generate text using Anthropic API.

        Args:
            prompt: The prompt to generate from
            max_tokens: Maximum tokens to generate

        Returns:
            Generated text
        """
        message = await self.client.messages.create(
            model=self.model,
            max_tokens=max_tokens,
            temperature=0.7,
            messages=[{"role": "user", "content": prompt}],
        )
        return message.content[0].text

    async def close(self) -> None:
        """Close the client (no-op for Anthropic SDK)."""
        pass
@@ -0,0 +1,91 @@
"""Unified provider interface for embeddings and text generation."""

from abc import ABC, abstractmethod


class Provider(ABC):
    """
    Unified base class for LLM providers.

    Providers can support embeddings, text generation, or both.
    Use capability properties to determine what features are available.
    """

    @property
    @abstractmethod
    def supports_embeddings(self) -> bool:
        """Whether this provider supports embedding generation."""
        pass

    @property
    @abstractmethod
    def supports_generation(self) -> bool:
        """Whether this provider supports text generation."""
        pass

    @abstractmethod
    async def embed(self, text: str) -> list[float]:
        """
        Generate embedding vector for text.

        Args:
            text: Input text to embed

        Returns:
            Vector embedding as list of floats

        Raises:
            NotImplementedError: If provider doesn't support embeddings
        """
        pass

    @abstractmethod
    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        """
        Generate embeddings for multiple texts (optimized).

        Args:
            texts: List of texts to embed

        Returns:
            List of vector embeddings

        Raises:
            NotImplementedError: If provider doesn't support embeddings
        """
        pass

    @abstractmethod
    def get_dimension(self) -> int:
        """
        Get embedding dimension for this provider.

        Returns:
            Vector dimension (e.g., 768 for nomic-embed-text)

        Raises:
            NotImplementedError: If provider doesn't support embeddings
        """
        pass

    @abstractmethod
    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        """
        Generate text from a prompt.

        Args:
            prompt: The prompt to generate from
            max_tokens: Maximum tokens to generate

        Returns:
            Generated text

        Raises:
            NotImplementedError: If provider doesn't support generation
        """
        pass

    @abstractmethod
    async def close(self) -> None:
        """Close the provider and release resources."""
        pass
@@ -0,0 +1,397 @@
"""Amazon Bedrock provider for embeddings and text generation."""

import json
import logging
from typing import Any

try:
    import boto3
    from botocore.exceptions import BotoCoreError, ClientError

    BOTO3_AVAILABLE = True
except ImportError:
    BOTO3_AVAILABLE = False

from .base import Provider

logger = logging.getLogger(__name__)


class BedrockProvider(Provider):
    """
    Amazon Bedrock provider supporting both embeddings and text generation.

    Uses AWS Bedrock Runtime API with boto3. Supports various model families:
    - Embeddings: amazon.titan-embed-text-v1, amazon.titan-embed-text-v2, cohere.embed-*
    - Text Generation: anthropic.claude-*, meta.llama3-*, amazon.titan-text-*, mistral.*, etc.

    Requires AWS credentials configured via:
    - Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION)
    - AWS credentials file (~/.aws/credentials)
    - IAM role (when running on AWS)
    """

    def __init__(
        self,
        region_name: str | None = None,
        embedding_model: str | None = None,
        generation_model: str | None = None,
        aws_access_key_id: str | None = None,
        aws_secret_access_key: str | None = None,
    ):
        """
        Initialize Bedrock provider.

        Args:
            region_name: AWS region (e.g., "us-east-1"). Defaults to AWS_REGION env var.
            embedding_model: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0").
                None disables embeddings.
            generation_model: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0").
                None disables generation.
            aws_access_key_id: AWS access key (optional, uses default credential chain if not provided)
            aws_secret_access_key: AWS secret key (optional, uses default credential chain if not provided)

        Raises:
            ImportError: If boto3 is not installed
        """
        if not BOTO3_AVAILABLE:
            raise ImportError(
                "boto3 is required for Bedrock provider. Install with: pip install boto3"
            )

        self.embedding_model = embedding_model
        self.generation_model = generation_model
        self._dimension: int | None = None  # Detected dynamically

        # Initialize bedrock-runtime client
        client_kwargs: dict[str, Any] = {}
        if region_name:
            client_kwargs["region_name"] = region_name
        if aws_access_key_id:
            client_kwargs["aws_access_key_id"] = aws_access_key_id
        if aws_secret_access_key:
            client_kwargs["aws_secret_access_key"] = aws_secret_access_key

        self.client = boto3.client("bedrock-runtime", **client_kwargs)

        logger.info(
            f"Initialized Bedrock provider in region {region_name or 'default'} "
            f"(embedding_model={embedding_model}, generation_model={generation_model})"
        )

    @property
    def supports_embeddings(self) -> bool:
        """Whether this provider supports embedding generation."""
        return self.embedding_model is not None

    @property
    def supports_generation(self) -> bool:
        """Whether this provider supports text generation."""
        return self.generation_model is not None

    def _create_embedding_request(self, text: str) -> dict[str, Any]:
        """
        Create model-specific embedding request payload.

        Args:
            text: Input text to embed

        Returns:
            Request payload dict for the embedding model
        """
        if not self.embedding_model:
            raise NotImplementedError(
                "Embedding not supported - no embedding_model configured"
            )

        # Titan Embed models
        if self.embedding_model.startswith("amazon.titan-embed"):
            return {"inputText": text}

        # Cohere Embed models
        elif self.embedding_model.startswith("cohere.embed"):
            return {"texts": [text], "input_type": "search_document"}

        # Unknown model - try Titan format as default
        else:
            logger.warning(
                f"Unknown embedding model format for {self.embedding_model}, "
                "using Titan format as default"
            )
            return {"inputText": text}

    def _parse_embedding_response(self, response: dict[str, Any]) -> list[float]:
        """
        Parse model-specific embedding response.

        Args:
            response: Raw response from Bedrock

        Returns:
            Embedding vector as list of floats
        """
        # Titan Embed models
        if self.embedding_model and self.embedding_model.startswith(
            "amazon.titan-embed"
        ):
            return response["embedding"]

        # Cohere Embed models
        elif self.embedding_model and self.embedding_model.startswith("cohere.embed"):
            return response["embeddings"][0]

        # Unknown model - try Titan format as default
        else:
            logger.warning(
                f"Unknown embedding response format for {self.embedding_model}, "
                "trying Titan format"
            )
            return response.get("embedding", response.get("embeddings", [None])[0])

    async def embed(self, text: str) -> list[float]:
        """
        Generate embedding vector for text.

        Args:
            text: Input text to embed

        Returns:
            Vector embedding as list of floats

        Raises:
            NotImplementedError: If embeddings not enabled (no embedding_model)
            ClientError: If Bedrock API call fails
        """
        if not self.supports_embeddings:
            raise NotImplementedError(
                "Embedding not supported - no embedding_model configured"
            )

        try:
            request_body = self._create_embedding_request(text)

            response = self.client.invoke_model(
                modelId=self.embedding_model,
                body=json.dumps(request_body),
                accept="application/json",
                contentType="application/json",
            )

            response_body = json.loads(response["body"].read())
            embedding = self._parse_embedding_response(response_body)

            return embedding

        except (BotoCoreError, ClientError) as e:
            logger.error(f"Bedrock embedding error: {e}")
            raise

    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        """
        Generate embeddings for multiple texts.

        Note: Current implementation sends requests sequentially.
        Future optimization could use asyncio for concurrent requests.

        Args:
            texts: List of texts to embed

        Returns:
            List of vector embeddings

        Raises:
            NotImplementedError: If embeddings not enabled (no embedding_model)
            ClientError: If Bedrock API call fails
        """
        if not self.supports_embeddings:
            raise NotImplementedError(
                "Embedding not supported - no embedding_model configured"
            )

        embeddings = []
        for text in texts:
            embedding = await self.embed(text)
            embeddings.append(embedding)
        return embeddings

    async def _detect_dimension(self):
        """
        Detect embedding dimension by generating a test embedding.
        """
        if self._dimension is None and self.supports_embeddings:
            logger.debug(
                f"Detecting embedding dimension for model {self.embedding_model}..."
            )
            test_embedding = await self.embed("test")
            self._dimension = len(test_embedding)
            logger.info(
                f"Detected embedding dimension: {self._dimension} "
                f"for model {self.embedding_model}"
            )

    def get_dimension(self) -> int:
        """
        Get embedding dimension.

        Returns:
            Vector dimension for the configured embedding model

        Raises:
            NotImplementedError: If embeddings not enabled (no embedding_model)
            RuntimeError: If dimension not detected yet (call _detect_dimension first)
        """
        if not self.supports_embeddings:
            raise NotImplementedError(
                "Embedding not supported - no embedding_model configured"
            )

        if self._dimension is None:
            raise RuntimeError(
                f"Embedding dimension not detected yet for model {self.embedding_model}. "
                "Call _detect_dimension() first."
            )
        return self._dimension

    def _create_generation_request(
        self, prompt: str, max_tokens: int
    ) -> dict[str, Any]:
        """
        Create model-specific text generation request payload.

        Args:
            prompt: The prompt to generate from
            max_tokens: Maximum tokens to generate

        Returns:
            Request payload dict for the generation model
        """
        if not self.generation_model:
            raise NotImplementedError(
                "Text generation not supported - no generation_model configured"
            )

        # Anthropic Claude models
        if self.generation_model.startswith("anthropic.claude"):
            return {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": max_tokens,
                "temperature": 0.7,
                "messages": [{"role": "user", "content": prompt}],
            }

        # Meta Llama models
        elif self.generation_model.startswith("meta.llama"):
            return {"prompt": prompt, "max_gen_len": max_tokens, "temperature": 0.7}

        # Amazon Titan Text models
        elif self.generation_model.startswith("amazon.titan-text"):
            return {
                "inputText": prompt,
                "textGenerationConfig": {
                    "maxTokenCount": max_tokens,
                    "temperature": 0.7,
                },
            }

        # Mistral models
        elif self.generation_model.startswith("mistral"):
            return {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.7}

        # Unknown model - try Claude format as default
        else:
            logger.warning(
                f"Unknown generation model format for {self.generation_model}, "
                "using Claude format as default"
            )
            return {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": max_tokens,
                "temperature": 0.7,
                "messages": [{"role": "user", "content": prompt}],
            }

    def _parse_generation_response(self, response: dict[str, Any]) -> str:
        """
        Parse model-specific text generation response.

        Args:
            response: Raw response from Bedrock

        Returns:
            Generated text
        """
        # Anthropic Claude models
        if self.generation_model and self.generation_model.startswith(
            "anthropic.claude"
        ):
            return response["content"][0]["text"]

        # Meta Llama models
        elif self.generation_model and self.generation_model.startswith("meta.llama"):
            return response["generation"]

        # Amazon Titan Text models
        elif self.generation_model and self.generation_model.startswith(
            "amazon.titan-text"
        ):
            return response["results"][0]["outputText"]

        # Mistral models
        elif self.generation_model and self.generation_model.startswith("mistral"):
            return response["outputs"][0]["text"]

        # Unknown model - try common response fields
        else:
            logger.warning(
                f"Unknown generation response format for {self.generation_model}, "
                "trying common fields"
            )
            # Try common response field names
            for field in ["text", "generation", "outputText", "completion"]:
                if field in response:
                    return response[field]
            # Last resort: return JSON string
            return json.dumps(response)

    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        """
        Generate text from a prompt.

        Args:
            prompt: The prompt to generate from
            max_tokens: Maximum tokens to generate

        Returns:
            Generated text

        Raises:
            NotImplementedError: If generation not enabled (no generation_model)
            ClientError: If Bedrock API call fails
        """
        if not self.supports_generation:
            raise NotImplementedError(
                "Text generation not supported - no generation_model configured"
            )

        try:
            request_body = self._create_generation_request(prompt, max_tokens)

            response = self.client.invoke_model(
                modelId=self.generation_model,
                body=json.dumps(request_body),
                accept="application/json",
                contentType="application/json",
            )

            response_body = json.loads(response["body"].read())
            text = self._parse_generation_response(response_body)

            return text

        except (BotoCoreError, ClientError) as e:
            logger.error(f"Bedrock generation error: {e}")
            raise

    async def close(self) -> None:
        """Close the client (no-op for boto3 clients)."""
        pass
@@ -0,0 +1,221 @@
|
|||||||
|
"""Unified Ollama provider for embeddings and text generation."""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
|
||||||
|
from .base import Provider
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class OllamaProvider(Provider):
|
||||||
|
"""
|
||||||
|
Ollama provider supporting both embeddings and text generation.
|
||||||
|
|
||||||
|
Supports TLS, SSL verification, and automatic model loading.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
base_url: str,
|
||||||
|
embedding_model: str | None = None,
|
||||||
|
generation_model: str | None = None,
|
||||||
|
verify_ssl: bool = True,
|
||||||
|
timeout: httpx.Timeout | None = None,
|
||||||
|
):
|
||||||
|
"""
|
||||||
|
Initialize Ollama provider.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
base_url: Ollama API base URL (e.g., https://ollama.internal.example.com:443)
|
||||||
|
embedding_model: Model for embeddings (e.g., "nomic-embed-text"). None disables embeddings.
|
||||||
|
generation_model: Model for text generation (e.g., "llama3.2:1b"). None disables generation.
|
||||||
|
verify_ssl: Verify SSL certificates (default: True)
|
||||||
|
timeout: HTTP timeout configuration
|
||||||
|
"""
|
||||||
|
self.base_url = base_url.rstrip("/")
|
||||||
|
self.embedding_model = embedding_model
|
||||||
|
self.generation_model = generation_model
|
||||||
|
self.verify_ssl = verify_ssl
|
||||||
|
|
||||||
|
if timeout is None:
|
||||||
|
timeout = httpx.Timeout(timeout=120, connect=5)
|
||||||
|
|
||||||
|
self.client = httpx.AsyncClient(verify=verify_ssl, timeout=timeout)
|
||||||
|
self._dimension: int | None = None # Detected dynamically for embeddings
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
f"Initialized Ollama provider: {base_url} "
|
||||||
|
f"(embedding_model={embedding_model}, generation_model={generation_model}, "
|
||||||
|
f"verify_ssl={verify_ssl})"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Pre-check and auto-load models
|
||||||
|
if embedding_model:
|
||||||
|
self._check_model_is_loaded(embedding_model, autoload=True)
|
||||||
|
if generation_model:
|
||||||
|
self._check_model_is_loaded(generation_model, autoload=True)
|
||||||
|
|
||||||
|
@property
|
||||||
|
def supports_embeddings(self) -> bool:
|
||||||
|
"""Whether this provider supports embedding generation."""
|
||||||
|
return self.embedding_model is not None
|
||||||
|
|
||||||
|
@property
|
||||||
|
def supports_generation(self) -> bool:
|
||||||
|
"""Whether this provider supports text generation."""
|
||||||
|
return self.generation_model is not None
|
||||||
|
|
||||||
|
async def embed(self, text: str) -> list[float]:
|
||||||
|
"""
|
||||||
|
Generate embedding vector for text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Input text to embed
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Vector embedding as list of floats
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||||
|
"""
|
||||||
|
if not self.supports_embeddings:
|
||||||
|
raise NotImplementedError(
|
||||||
|
"Embedding not supported - no embedding_model configured"
|
||||||
|
)
|
||||||
|
|
||||||
|
response = await self.client.post(
|
||||||
|
f"{self.base_url}/api/embeddings",
|
||||||
|
json={"model": self.embedding_model, "prompt": text},
|
||||||
|
)
|
||||||
|
response.raise_for_status()
|
||||||
|
return response.json()["embedding"]
|
||||||
|
|
||||||
|
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||||
|
"""
|
||||||
|
Generate embeddings for multiple texts (batched requests).
|
||||||
|
|
||||||
|
Note: Ollama doesn't have native batch API, so we send requests sequentially.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
texts: List of texts to embed
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of vector embeddings
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||||
|
"""
|
||||||
|
if not self.supports_embeddings:
|
||||||
|
raise NotImplementedError(
|
||||||
|
"Embedding not supported - no embedding_model configured"
|
||||||
|
)
|
||||||
|
|
||||||
|
embeddings = []
|
||||||
|
for text in texts:
|
||||||
|
embedding = await self.embed(text)
|
||||||
|
embeddings.append(embedding)
|
||||||
|
return embeddings
|
||||||
|
|
||||||
|
async def _detect_dimension(self):
|
||||||
|
"""
|
||||||
|
Detect embedding dimension by generating a test embedding.
|
||||||
|
|
||||||
|
This method queries the model to determine the actual dimension
|
||||||
|
instead of relying on hardcoded values.
|
||||||
|
"""
|
||||||
|
if self._dimension is None and self.supports_embeddings:
|
||||||
|
logger.debug(
|
||||||
|
f"Detecting embedding dimension for model {self.embedding_model}..."
|
||||||
|
)
|
||||||
|
test_embedding = await self.embed("test")
|
||||||
|
self._dimension = len(test_embedding)
|
||||||
|
logger.info(
|
||||||
|
f"Detected embedding dimension: {self._dimension} "
|
||||||
|
f"for model {self.embedding_model}"
|
||||||
|
)
|
||||||
|
|
||||||
|
def get_dimension(self) -> int:
|
||||||
|
"""
|
||||||
|
Get embedding dimension.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Vector dimension for the configured embedding model
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||||
|
RuntimeError: If dimension not detected yet (call _detect_dimension first)
|
||||||
|
"""
|
||||||
|
if not self.supports_embeddings:
|
||||||
|
raise NotImplementedError(
|
||||||
|
"Embedding not supported - no embedding_model configured"
|
||||||
|
)
|
||||||
|
|
||||||
|
if self._dimension is None:
|
||||||
|
raise RuntimeError(
|
||||||
|
f"Embedding dimension not detected yet for model {self.embedding_model}. "
|
||||||
|
"Call _detect_dimension() first or generate an embedding."
|
||||||
|
)
|
||||||
|
return self._dimension
|
||||||
|
|
||||||
|
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||||
|
"""
|
||||||
|
Generate text from a prompt.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
prompt: The prompt to generate from
|
||||||
|
max_tokens: Maximum tokens to generate
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Generated text
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
NotImplementedError: If generation not enabled (no generation_model)
|
||||||
|
"""
|
||||||
|
if not self.supports_generation:
|
||||||
|
raise NotImplementedError(
|
||||||
|
"Text generation not supported - no generation_model configured"
|
||||||
|
)
|
||||||
|
|
||||||
|
response = await self.client.post(
|
||||||
|
f"{self.base_url}/api/generate",
|
||||||
|
json={
|
||||||
|
"model": self.generation_model,
|
||||||
|
"prompt": prompt,
|
||||||
|
"stream": False,
|
||||||
|
"options": {
|
||||||
|
"num_predict": max_tokens,
|
||||||
|
"temperature": 0.7,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
)
|
||||||
|
response.raise_for_status()
|
||||||
|
data = response.json()
|
||||||
|
return data["response"]
|
||||||
|
|
||||||
|
def _check_model_is_loaded(self, model: str, autoload: bool = True):
|
||||||
|
"""
|
||||||
|
Check if model is available locally in Ollama, optionally pulling it.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
model: Model name to check
|
||||||
|
autoload: Whether to automatically pull the model if it is not available locally
|
||||||
|
"""
|
||||||
|
response = httpx.get(f"{self.base_url}/api/tags")
|
||||||
|
response.raise_for_status()
|
||||||
|
|
||||||
|
models = [m["name"] for m in response.json().get("models", [])]
|
||||||
|
logger.info("Ollama has following models pre-loaded: %s", models)
|
||||||
|
|
||||||
|
if (model not in models) and autoload:
|
||||||
|
logger.warning(
|
||||||
|
"Model '%s' not yet available in ollama, attempting to pull now...",
|
||||||
|
model,
|
||||||
|
)
|
||||||
|
response = httpx.post(f"{self.base_url}/api/pull", json={"model": model})
|
||||||
|
response.raise_for_status()
|
||||||
|
|
||||||
|
async def close(self) -> None:
|
||||||
|
"""Close HTTP client."""
|
||||||
|
await self.client.aclose()
|
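The `embed_batch` loop in this file awaits one `embed` call at a time. If sequential latency becomes a bottleneck, one possible alternative is to issue the per-text requests concurrently behind a bounded semaphore. The sketch below is not part of this commit: `embed_batch_concurrent`, `max_concurrency`, and the stand-in `fake_embed` coroutine are hypothetical names used only for illustration.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def embed_batch_concurrent(
    embed: Callable[[str], Awaitable[list[float]]],
    texts: list[str],
    max_concurrency: int = 4,
) -> list[list[float]]:
    """Embed texts concurrently, with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed_one(text: str) -> list[float]:
        async with semaphore:
            return await embed(text)

    # asyncio.gather returns results in the order the awaitables were passed,
    # so the output lines up with `texts` regardless of completion order.
    return await asyncio.gather(*(embed_one(text) for text in texts))


async def fake_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a provider's embed(); returns a 1-dim vector."""
    await asyncio.sleep(0)
    return [float(len(text))]
```

Whether this helps in practice depends on how many requests the Ollama server processes in parallel, so the concurrency limit is something to measure, not assume.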
||||||
@@ -0,0 +1,126 @@
|
|||||||
|
"""Provider registry and factory for auto-detection and instantiation."""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
|
||||||
|
from .base import Provider
|
||||||
|
from .bedrock import BedrockProvider
|
||||||
|
from .ollama import OllamaProvider
|
||||||
|
from .simple import SimpleProvider
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class ProviderRegistry:
|
||||||
|
"""
|
||||||
|
Registry for provider auto-detection and instantiation.
|
||||||
|
|
||||||
|
Checks environment variables in priority order and creates appropriate provider:
|
||||||
|
1. Bedrock (AWS_REGION or any BEDROCK_*_MODEL)
|
||||||
|
2. Ollama (OLLAMA_BASE_URL)
|
||||||
|
3. Simple (fallback for testing/development)
|
||||||
|
"""
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def create_provider() -> Provider:
|
||||||
|
"""
|
||||||
|
Auto-detect and create provider based on environment variables.
|
||||||
|
|
||||||
|
Priority order:
|
||||||
|
1. Bedrock - if AWS_REGION, BEDROCK_EMBEDDING_MODEL, or BEDROCK_GENERATION_MODEL is set
|
||||||
|
2. Ollama - if OLLAMA_BASE_URL is set
|
||||||
|
3. Simple - fallback for testing/development
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Provider instance
|
||||||
|
|
||||||
|
Environment Variables:
|
||||||
|
Bedrock:
|
||||||
|
- AWS_REGION: AWS region (e.g., "us-east-1")
|
||||||
|
- AWS_ACCESS_KEY_ID: AWS access key (optional, uses credential chain)
|
||||||
|
- AWS_SECRET_ACCESS_KEY: AWS secret key (optional)
|
||||||
|
- BEDROCK_EMBEDDING_MODEL: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0")
|
||||||
|
- BEDROCK_GENERATION_MODEL: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
|
||||||
|
|
||||||
|
Ollama:
|
||||||
|
- OLLAMA_BASE_URL: Ollama API base URL (e.g., "http://localhost:11434")
|
||||||
|
- OLLAMA_EMBEDDING_MODEL: Model for embeddings (default: "nomic-embed-text")
|
||||||
|
- OLLAMA_GENERATION_MODEL: Model for text generation (e.g., "llama3.2:1b")
|
||||||
|
- OLLAMA_VERIFY_SSL: Verify SSL certificates (default: "true"; any value other than "true", case-insensitive, disables verification)
|
||||||
|
|
||||||
|
Simple (no configuration needed, fallback):
|
||||||
|
- SIMPLE_EMBEDDING_DIMENSION: Embedding dimension (default: 384)
|
||||||
|
"""
|
||||||
|
# 1. Check for Bedrock
|
||||||
|
aws_region = os.getenv("AWS_REGION")
|
||||||
|
bedrock_embedding_model = os.getenv("BEDROCK_EMBEDDING_MODEL")
|
||||||
|
bedrock_generation_model = os.getenv("BEDROCK_GENERATION_MODEL")
|
||||||
|
|
||||||
|
if aws_region or bedrock_embedding_model or bedrock_generation_model:
|
||||||
|
logger.info(
|
||||||
|
f"Using Bedrock provider: region={aws_region}, "
|
||||||
|
f"embedding_model={bedrock_embedding_model}, "
|
||||||
|
f"generation_model={bedrock_generation_model}"
|
||||||
|
)
|
||||||
|
return BedrockProvider(
|
||||||
|
region_name=aws_region,
|
||||||
|
embedding_model=bedrock_embedding_model,
|
||||||
|
generation_model=bedrock_generation_model,
|
||||||
|
aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
|
||||||
|
aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
|
||||||
|
)
|
||||||
|
|
||||||
|
# 2. Check for Ollama
|
||||||
|
ollama_url = os.getenv("OLLAMA_BASE_URL")
|
||||||
|
if ollama_url:
|
||||||
|
embedding_model = os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text")
|
||||||
|
generation_model = os.getenv("OLLAMA_GENERATION_MODEL")
|
||||||
|
verify_ssl = os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true"
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
f"Using Ollama provider: {ollama_url}, "
|
||||||
|
f"embedding_model={embedding_model}, "
|
||||||
|
f"generation_model={generation_model}"
|
||||||
|
)
|
||||||
|
return OllamaProvider(
|
||||||
|
base_url=ollama_url,
|
||||||
|
embedding_model=embedding_model,
|
||||||
|
generation_model=generation_model,
|
||||||
|
verify_ssl=verify_ssl,
|
||||||
|
)
|
||||||
|
|
||||||
|
# 3. Fallback to Simple provider for development/testing
|
||||||
|
dimension = int(os.getenv("SIMPLE_EMBEDDING_DIMENSION", "384"))
|
||||||
|
logger.warning(
|
||||||
|
"No provider configured (AWS_REGION, OLLAMA_BASE_URL not set). "
|
||||||
|
"Using SimpleProvider for testing/development. "
|
||||||
|
"For production, configure Bedrock or Ollama."
|
||||||
|
)
|
||||||
|
return SimpleProvider(dimension=dimension)
|
||||||
|
|
||||||
|
|
||||||
|
# Singleton instance
|
||||||
|
_provider: Provider | None = None
|
||||||
|
|
||||||
|
|
||||||
|
def get_provider() -> Provider:
|
||||||
|
"""
|
||||||
|
Get singleton provider instance.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Global Provider instance (auto-detected on first call)
|
||||||
|
"""
|
||||||
|
global _provider
|
||||||
|
if _provider is None:
|
||||||
|
_provider = ProviderRegistry.create_provider()
|
||||||
|
return _provider
|
||||||
|
|
||||||
|
|
||||||
|
def reset_provider():
|
||||||
|
"""
|
||||||
|
Reset singleton provider instance.
|
||||||
|
|
||||||
|
Useful for testing or reconfiguration.
|
||||||
|
"""
|
||||||
|
global _provider
|
||||||
|
_provider = None
|
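The priority order in `ProviderRegistry.create_provider()` can be read as a pure function of the environment. The sketch below restates that decision without constructing any provider, so it can be checked in isolation. `detect_provider_name` and `parse_verify_ssl` are hypothetical helper names, not part of this commit, and the environment is passed in as a plain dict instead of being read from `os.environ`.

```python
def detect_provider_name(env: dict[str, str]) -> str:
    """Return which provider the priority order would select for `env`."""
    # Bedrock wins if any one of its three variables is set and non-empty.
    if (
        env.get("AWS_REGION")
        or env.get("BEDROCK_EMBEDDING_MODEL")
        or env.get("BEDROCK_GENERATION_MODEL")
    ):
        return "bedrock"
    # Ollama is only considered when no Bedrock variable is present.
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"
    # Nothing configured: fall back to the in-process Simple provider.
    return "simple"


def parse_verify_ssl(env: dict[str, str]) -> bool:
    """Mirror the OLLAMA_VERIFY_SSL parsing: only "true" (any case) enables it."""
    return env.get("OLLAMA_VERIFY_SSL", "true").lower() == "true"
```

One consequence worth noting: `AWS_REGION` on its own selects Bedrock, even when `OLLAMA_BASE_URL` is also set and no Bedrock model ID is configured.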
||||||
@@ -0,0 +1,149 @@
|
|||||||
|
"""Simple in-process embedding provider for testing.
|
||||||
|
|
||||||
|
This provider uses log-scaled term-frequency weighting with feature hashing to generate
|
||||||
|
deterministic embeddings without requiring external services. Suitable for testing
|
||||||
|
but not for production use.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import hashlib
|
||||||
|
import math
|
||||||
|
import re
|
||||||
|
from collections import Counter
|
||||||
|
|
||||||
|
from .base import Provider
|
||||||
|
|
||||||
|
|
||||||
|
class SimpleProvider(Provider):
|
||||||
|
"""Simple deterministic embedding provider using feature hashing.
|
||||||
|
|
||||||
|
This implementation:
|
||||||
|
- Tokenizes text into words
|
||||||
|
- Uses feature hashing to map words to fixed-size vectors
|
||||||
|
- Applies log-scaled term-frequency weighting, log(1 + count); no IDF term is used
|
||||||
|
- Normalizes vectors to unit length
|
||||||
|
|
||||||
|
Not suitable for production but good for testing semantic search infrastructure.
|
||||||
|
Only supports embeddings, not text generation.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, dimension: int = 384):
|
||||||
|
"""Initialize simple embedding provider.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
dimension: Embedding dimension (default: 384)
|
||||||
|
"""
|
||||||
|
self.dimension = dimension
|
||||||
|
|
||||||
|
@property
|
||||||
|
def supports_embeddings(self) -> bool:
|
||||||
|
"""Whether this provider supports embedding generation."""
|
||||||
|
return True
|
||||||
|
|
||||||
|
@property
|
||||||
|
def supports_generation(self) -> bool:
|
||||||
|
"""Whether this provider supports text generation."""
|
||||||
|
return False
|
||||||
|
|
||||||
|
def _tokenize(self, text: str) -> list[str]:
|
||||||
|
"""Tokenize text into lowercase words.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Input text
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of lowercase word tokens
|
||||||
|
"""
|
||||||
|
# Simple word tokenization
|
||||||
|
text = text.lower()
|
||||||
|
words = re.findall(r"\b\w+\b", text)
|
||||||
|
return words
|
||||||
|
|
||||||
|
def _hash_word(self, word: str) -> int:
|
||||||
|
"""Hash word to dimension index.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
word: Word to hash
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Index in range [0, dimension)
|
||||||
|
"""
|
||||||
|
hash_bytes = hashlib.md5(word.encode()).digest()
|
||||||
|
hash_int = int.from_bytes(hash_bytes[:4], byteorder="big")
|
||||||
|
return hash_int % self.dimension
|
||||||
|
|
||||||
|
def _embed_single(self, text: str) -> list[float]:
|
||||||
|
"""Generate embedding for single text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Input text
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Normalized embedding vector
|
||||||
|
"""
|
||||||
|
tokens = self._tokenize(text)
|
||||||
|
if not tokens:
|
||||||
|
return [0.0] * self.dimension
|
||||||
|
|
||||||
|
# Count term frequencies
|
||||||
|
term_freq = Counter(tokens)
|
||||||
|
|
||||||
|
# Initialize vector
|
||||||
|
vector = [0.0] * self.dimension
|
||||||
|
|
||||||
|
# Apply TF weighting with feature hashing
|
||||||
|
for word, count in term_freq.items():
|
||||||
|
idx = self._hash_word(word)
|
||||||
|
# Simple TF weighting: log(1 + count)
|
||||||
|
vector[idx] += math.log1p(count)
|
||||||
|
|
||||||
|
# Normalize to unit length
|
||||||
|
norm = math.sqrt(sum(x * x for x in vector))
|
||||||
|
if norm > 0:
|
||||||
|
vector = [x / norm for x in vector]
|
||||||
|
|
||||||
|
return vector
|
||||||
|
|
||||||
|
async def embed(self, text: str) -> list[float]:
|
||||||
|
"""Generate embedding vector for text.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
text: Input text to embed
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Vector embedding as list of floats
|
||||||
|
"""
|
||||||
|
return self._embed_single(text)
|
||||||
|
|
||||||
|
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||||
|
"""Generate embeddings for multiple texts.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
texts: List of texts to embed
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of vector embeddings
|
||||||
|
"""
|
||||||
|
return [self._embed_single(text) for text in texts]
|
||||||
|
|
||||||
|
def get_dimension(self) -> int:
|
||||||
|
"""Get embedding dimension.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Vector dimension
|
||||||
|
"""
|
||||||
|
return self.dimension
|
||||||
|
|
||||||
|
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||||
|
"""
|
||||||
|
Generate text from a prompt.
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
NotImplementedError: Simple provider doesn't support text generation
|
||||||
|
"""
|
||||||
|
raise NotImplementedError(
|
||||||
|
"Text generation not supported by Simple provider - use Ollama, Anthropic, or Bedrock"
|
||||||
|
)
|
||||||
|
|
||||||
|
async def close(self) -> None:
|
||||||
|
"""Close the provider (no-op for simple provider)."""
|
||||||
|
pass
|
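The embedding steps in `SimpleProvider` (tokenize, hash each word to an index, add `log(1 + count)`, normalize) can be condensed into one standalone function to make the resulting properties easy to check. This is an illustrative restatement, not code from this commit: `hash_embed` and `dot` are hypothetical names.

```python
import hashlib
import math
import re
from collections import Counter


def hash_embed(text: str, dimension: int = 384) -> list[float]:
    """Feature-hashed, log-TF-weighted, unit-normalized bag-of-words vector."""
    tokens = re.findall(r"\b\w+\b", text.lower())
    vector = [0.0] * dimension
    for word, count in Counter(tokens).items():
        # The first 4 bytes of the MD5 digest pick the bucket for this word.
        digest = hashlib.md5(word.encode()).digest()
        index = int.from_bytes(digest[:4], byteorder="big") % dimension
        vector[index] += math.log1p(count)
    norm = math.sqrt(sum(x * x for x in vector))
    # Texts with no word tokens keep the all-zero vector.
    return [x / norm for x in vector] if norm > 0 else vector


def dot(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))
```

Because the vector is a bag of hashed words, word order and letter case do not affect the result, and unrelated words can land in the same bucket. That is acceptable for exercising search infrastructure in tests but is one reason the provider is not intended for production.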
||||||
@@ -104,6 +104,7 @@ module-root = ""
|
|||||||
[dependency-groups]
|
[dependency-groups]
|
||||||
dev = [
|
dev = [
|
||||||
"anthropic>=0.42.0", # For RAG evaluation with Anthropic LLMs
|
"anthropic>=0.42.0", # For RAG evaluation with Anthropic LLMs
|
||||||
|
"boto3>=1.35.0", # For Amazon Bedrock provider (optional)
|
||||||
"commitizen>=4.8.2",
|
"commitizen>=4.8.2",
|
||||||
"datasets>=3.3.0", # For BeIR nfcorpus dataset loading
|
"datasets>=3.3.0", # For BeIR nfcorpus dataset loading
|
||||||
"ipython>=9.2.0",
|
"ipython>=9.2.0",
|
||||||
|
|||||||
@@ -1,99 +1,20 @@
|
|||||||
"""LLM provider abstraction for RAG evaluation.
|
"""LLM provider abstraction for RAG evaluation.
|
||||||
|
|
||||||
Supports Ollama (local) and Anthropic (cloud) providers for both ground truth
|
DEPRECATED: This module is maintained for backward compatibility with RAG evaluation tests.
|
||||||
|
New code should use nextcloud_mcp_server.providers directly.
|
||||||
|
|
||||||
|
Supports Ollama (local), Anthropic (cloud), and Bedrock (AWS) providers for both ground truth
|
||||||
generation and evaluation.
|
generation and evaluation.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
import os
|
import os
|
||||||
from typing import Protocol
|
|
||||||
|
|
||||||
import httpx
|
from nextcloud_mcp_server.providers import (
|
||||||
from anthropic import AsyncAnthropic
|
AnthropicProvider,
|
||||||
|
BedrockProvider,
|
||||||
|
OllamaProvider,
|
||||||
class LLMProvider(Protocol):
|
Provider,
|
||||||
"""Protocol for LLM providers."""
|
)
|
||||||
|
|
||||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
|
||||||
"""Generate text from a prompt.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
prompt: The prompt to generate from
|
|
||||||
max_tokens: Maximum tokens to generate
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Generated text
|
|
||||||
"""
|
|
||||||
...
|
|
||||||
|
|
||||||
async def close(self) -> None:
|
|
||||||
"""Close the provider and release resources."""
|
|
||||||
...
|
|
||||||
|
|
||||||
|
|
||||||
class OllamaProvider:
|
|
||||||
"""Ollama provider for local LLM inference."""
|
|
||||||
|
|
||||||
def __init__(self, base_url: str, model: str):
|
|
||||||
"""Initialize Ollama provider.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
base_url: Ollama API base URL (e.g., http://localhost:11434)
|
|
||||||
model: Model name (e.g., llama3.1:8b)
|
|
||||||
"""
|
|
||||||
self.base_url = base_url.rstrip("/")
|
|
||||||
self.model = model
|
|
||||||
self.client = httpx.AsyncClient(timeout=600.0) # 10 min timeout for generation
|
|
||||||
|
|
||||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
|
||||||
"""Generate text using Ollama API."""
|
|
||||||
response = await self.client.post(
|
|
||||||
f"{self.base_url}/api/generate",
|
|
||||||
json={
|
|
||||||
"model": self.model,
|
|
||||||
"prompt": prompt,
|
|
||||||
"stream": False,
|
|
||||||
"options": {
|
|
||||||
"num_predict": max_tokens,
|
|
||||||
"temperature": 0.7,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
)
|
|
||||||
response.raise_for_status()
|
|
||||||
data = response.json()
|
|
||||||
return data["response"]
|
|
||||||
|
|
||||||
async def close(self):
|
|
||||||
"""Close the HTTP client."""
|
|
||||||
await self.client.aclose()
|
|
||||||
|
|
||||||
|
|
||||||
class AnthropicProvider:
|
|
||||||
"""Anthropic provider for cloud LLM inference."""
|
|
||||||
|
|
||||||
def __init__(self, api_key: str, model: str):
|
|
||||||
"""Initialize Anthropic provider.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
api_key: Anthropic API key
|
|
||||||
model: Model name (e.g., claude-3-5-sonnet-20241022)
|
|
||||||
"""
|
|
||||||
self.client = AsyncAnthropic(api_key=api_key)
|
|
||||||
self.model = model
|
|
||||||
|
|
||||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
|
||||||
"""Generate text using Anthropic API."""
|
|
||||||
message = await self.client.messages.create(
|
|
||||||
model=self.model,
|
|
||||||
max_tokens=max_tokens,
|
|
||||||
temperature=0.7,
|
|
||||||
messages=[{"role": "user", "content": prompt}],
|
|
||||||
)
|
|
||||||
return message.content[0].text
|
|
||||||
|
|
||||||
async def close(self):
|
|
||||||
"""Close the client (no-op for Anthropic)."""
|
|
||||||
pass
|
|
||||||
|
|
||||||
|
|
||||||
def create_llm_provider(
|
def create_llm_provider(
|
||||||
@@ -102,18 +23,24 @@ def create_llm_provider(
|
|||||||
ollama_model: str | None = None,
|
ollama_model: str | None = None,
|
||||||
anthropic_api_key: str | None = None,
|
anthropic_api_key: str | None = None,
|
||||||
anthropic_model: str | None = None,
|
anthropic_model: str | None = None,
|
||||||
) -> LLMProvider:
|
bedrock_region: str | None = None,
|
||||||
|
bedrock_model: str | None = None,
|
||||||
|
) -> Provider:
|
||||||
"""Create an LLM provider from environment variables or arguments.
|
"""Create an LLM provider from environment variables or arguments.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
provider: Provider type ('ollama' or 'anthropic'). Defaults to RAG_EVAL_PROVIDER env var or 'ollama'
|
provider: Provider type ('ollama', 'anthropic', or 'bedrock').
|
||||||
|
Defaults to RAG_EVAL_PROVIDER env var or 'ollama'
|
||||||
ollama_base_url: Ollama base URL. Defaults to RAG_EVAL_OLLAMA_BASE_URL or 'http://localhost:11434'
|
ollama_base_url: Ollama base URL. Defaults to RAG_EVAL_OLLAMA_BASE_URL or 'http://localhost:11434'
|
||||||
ollama_model: Ollama model. Defaults to RAG_EVAL_OLLAMA_MODEL or 'llama3.1:8b'
|
ollama_model: Ollama model. Defaults to RAG_EVAL_OLLAMA_MODEL or 'llama3.2:1b'
|
||||||
anthropic_api_key: Anthropic API key. Defaults to RAG_EVAL_ANTHROPIC_API_KEY env var
|
anthropic_api_key: Anthropic API key. Defaults to RAG_EVAL_ANTHROPIC_API_KEY env var
|
||||||
anthropic_model: Anthropic model. Defaults to RAG_EVAL_ANTHROPIC_MODEL or 'claude-3-5-sonnet-20241022'
|
anthropic_model: Anthropic model. Defaults to RAG_EVAL_ANTHROPIC_MODEL or 'claude-3-5-sonnet-20241022'
|
||||||
|
bedrock_region: AWS region. Defaults to RAG_EVAL_BEDROCK_REGION, then AWS_REGION, then 'us-east-1'
|
||||||
|
bedrock_model: Bedrock model ID. Defaults to RAG_EVAL_BEDROCK_MODEL or
|
||||||
|
'anthropic.claude-3-sonnet-20240229-v1:0'
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
LLMProvider instance
|
Provider instance
|
||||||
|
|
||||||
Raises:
|
Raises:
|
||||||
ValueError: If provider is invalid or required credentials are missing
|
ValueError: If provider is invalid or required credentials are missing
|
||||||
@@ -130,7 +57,9 @@ def create_llm_provider(
|
|||||||
or "http://localhost:11434"
|
or "http://localhost:11434"
|
||||||
)
|
)
|
||||||
model = ollama_model or os.environ.get("RAG_EVAL_OLLAMA_MODEL", "llama3.2:1b")
|
model = ollama_model or os.environ.get("RAG_EVAL_OLLAMA_MODEL", "llama3.2:1b")
|
||||||
return OllamaProvider(base_url=base_url, model=model)
|
return OllamaProvider(
|
||||||
|
base_url=base_url, embedding_model=None, generation_model=model
|
||||||
|
)
|
||||||
|
|
||||||
elif provider == "anthropic":
|
elif provider == "anthropic":
|
||||||
api_key = anthropic_api_key or os.environ.get("RAG_EVAL_ANTHROPIC_API_KEY")
|
api_key = anthropic_api_key or os.environ.get("RAG_EVAL_ANTHROPIC_API_KEY")
|
||||||
@@ -143,7 +72,18 @@ def create_llm_provider(
|
|||||||
)
|
)
|
||||||
return AnthropicProvider(api_key=api_key, model=model)
|
return AnthropicProvider(api_key=api_key, model=model)
|
||||||
|
|
||||||
|
elif provider == "bedrock":
|
||||||
|
region = bedrock_region or os.environ.get(
|
||||||
|
"RAG_EVAL_BEDROCK_REGION", os.environ.get("AWS_REGION", "us-east-1")
|
||||||
|
)
|
||||||
|
model = bedrock_model or os.environ.get(
|
||||||
|
"RAG_EVAL_BEDROCK_MODEL", "anthropic.claude-3-sonnet-20240229-v1:0"
|
||||||
|
)
|
||||||
|
return BedrockProvider(
|
||||||
|
region_name=region, embedding_model=None, generation_model=model
|
||||||
|
)
|
||||||
|
|
||||||
else:
|
else:
|
||||||
raise ValueError(
|
raise ValueError(
|
||||||
f"Invalid provider: {provider}. Must be 'ollama' or 'anthropic'."
|
f"Invalid provider: {provider}. Must be 'ollama', 'anthropic', or 'bedrock'."
|
||||||
)
|
)
|
||||||
|
|||||||
@@ -0,0 +1 @@
|
|||||||
|
"""Unit tests for provider infrastructure."""
|
||||||
@@ -0,0 +1,280 @@
|
|||||||
|
"""Unit tests for Bedrock provider."""
|
||||||
|
|
||||||
|
import json
|
||||||
|
from unittest.mock import MagicMock
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
|
||||||
|
from nextcloud_mcp_server.providers.bedrock import BOTO3_AVAILABLE, BedrockProvider
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.fixture
|
||||||
|
def mock_bedrock_client(mocker):
|
||||||
|
"""Mock boto3 bedrock-runtime client."""
|
||||||
|
if not BOTO3_AVAILABLE:
|
||||||
|
pytest.skip("boto3 not installed")
|
||||||
|
|
||||||
|
mock_client = MagicMock()
|
||||||
|
mocker.patch("boto3.client", return_value=mock_client)
|
||||||
|
return mock_client
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.unit
|
||||||
|
async def test_bedrock_embedding_titan(mock_bedrock_client):
|
||||||
|
"""Test Bedrock embedding with Titan model."""
|
||||||
|
# Mock response
|
||||||
|
mock_response = {
|
||||||
|
"body": MagicMock(
|
||||||
|
read=MagicMock(
|
||||||
|
return_value=json.dumps({"embedding": [0.1, 0.2, 0.3]}).encode()
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||||
|
|
||||||
|
# Create provider
|
||||||
|
provider = BedrockProvider(
|
||||||
|
region_name="us-east-1",
|
||||||
|
embedding_model="amazon.titan-embed-text-v2:0",
|
||||||
|
generation_model=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Test embedding
|
||||||
|
embedding = await provider.embed("test text")
|
||||||
|
|
||||||
|
assert embedding == [0.1, 0.2, 0.3]
|
||||||
|
mock_bedrock_client.invoke_model.assert_called_once()
|
||||||
|
call_args = mock_bedrock_client.invoke_model.call_args
|
||||||
|
|
||||||
|
assert call_args.kwargs["modelId"] == "amazon.titan-embed-text-v2:0"
|
||||||
|
body = json.loads(call_args.kwargs["body"])
|
||||||
|
assert body == {"inputText": "test text"}
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.unit
|
||||||
|
async def test_bedrock_embedding_batch(mock_bedrock_client):
|
||||||
|
"""Test Bedrock batch embedding."""
|
||||||
|
# Mock response
|
||||||
|
mock_response = {
|
||||||
|
"body": MagicMock(
|
||||||
|
read=MagicMock(
|
||||||
|
return_value=json.dumps({"embedding": [0.1, 0.2, 0.3]}).encode()
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||||
|
|
||||||
|
# Create provider
|
||||||
|
provider = BedrockProvider(
|
||||||
|
region_name="us-east-1",
|
||||||
|
embedding_model="amazon.titan-embed-text-v2:0",
|
||||||
|
generation_model=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Test batch embedding
|
||||||
|
embeddings = await provider.embed_batch(["text1", "text2"])
|
||||||
|
|
||||||
|
assert len(embeddings) == 2
|
||||||
|
assert embeddings[0] == [0.1, 0.2, 0.3]
|
||||||
|
assert embeddings[1] == [0.1, 0.2, 0.3]
|
||||||
|
assert mock_bedrock_client.invoke_model.call_count == 2
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.unit
|
||||||
|
async def test_bedrock_generation_claude(mock_bedrock_client):
|
||||||
|
"""Test Bedrock text generation with Claude model."""
|
||||||
|
# Mock response
|
||||||
|
mock_response = {
|
||||||
|
"body": MagicMock(
|
||||||
|
read=MagicMock(
|
||||||
|
return_value=json.dumps(
|
||||||
|
{"content": [{"text": "Generated response"}]}
|
||||||
|
).encode()
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||||
|
|
||||||
|
# Create provider
|
||||||
|
provider = BedrockProvider(
|
||||||
|
region_name="us-east-1",
|
||||||
|
embedding_model=None,
|
||||||
|
generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Test generation
|
||||||
|
text = await provider.generate("test prompt", max_tokens=100)
|
||||||
|
|
||||||
|
assert text == "Generated response"
|
||||||
|
mock_bedrock_client.invoke_model.assert_called_once()
|
||||||
|
call_args = mock_bedrock_client.invoke_model.call_args
|
||||||
|
|
||||||
|
assert call_args.kwargs["modelId"] == "anthropic.claude-3-sonnet-20240229-v1:0"
|
||||||
|
body = json.loads(call_args.kwargs["body"])
|
||||||
|
assert body["messages"][0]["content"] == "test prompt"
|
||||||
|
assert body["max_tokens"] == 100
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.unit
|
||||||
|
async def test_bedrock_generation_llama(mock_bedrock_client):
|
||||||
|
"""Test Bedrock text generation with Llama model."""
|
||||||
|
# Mock response
|
||||||
|
mock_response = {
|
||||||
|
"body": MagicMock(
|
||||||
|
read=MagicMock(
|
||||||
|
return_value=json.dumps({"generation": "Llama response"}).encode()
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||||
|
|
||||||
|
# Create provider
|
||||||
|
provider = BedrockProvider(
|
||||||
|
region_name="us-east-1",
|
||||||
|
embedding_model=None,
|
||||||
|
generation_model="meta.llama3-8b-instruct-v1:0",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Test generation
|
||||||
|
text = await provider.generate("test prompt")
|
||||||
|
|
||||||
|
assert text == "Llama response"
|
||||||
|
body = json.loads(mock_bedrock_client.invoke_model.call_args.kwargs["body"])
|
||||||
|
assert body["prompt"] == "test prompt"
|
||||||
|
assert "max_gen_len" in body
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.unit
|
||||||
|
async def test_bedrock_both_capabilities(mock_bedrock_client):
|
||||||
|
"""Test Bedrock with both embedding and generation models."""
|
||||||
|
# Mock responses
|
||||||
|
embed_response = {
|
||||||
|
"body": MagicMock(
|
||||||
|
read=MagicMock(return_value=json.dumps({"embedding": [0.1, 0.2]}).encode())
|
||||||
|
)
|
||||||
|
}
|
||||||
|
gen_response = {
|
||||||
|
"body": MagicMock(
|
||||||
|
read=MagicMock(
|
||||||
|
return_value=json.dumps({"content": [{"text": "Response"}]}).encode()
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
|
||||||
|
# Mock to return different responses based on modelId
|
||||||
|
def mock_invoke(modelId, body, **kwargs):
|
||||||
|
if "embed" in modelId:
|
||||||
|
return embed_response
|
||||||
|
else:
|
||||||
|
return gen_response
|
||||||
|
|
||||||
|
mock_bedrock_client.invoke_model.side_effect = mock_invoke
|
||||||
|
|
||||||
|
# Create provider with both models
|
||||||
|
provider = BedrockProvider(
|
||||||
|
region_name="us-east-1",
|
||||||
|
embedding_model="amazon.titan-embed-text-v2:0",
|
||||||
|
generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
|
||||||
|
)
|
||||||
|
|
||||||
|
assert provider.supports_embeddings is True
|
||||||
|
assert provider.supports_generation is True
|
||||||
|
|
||||||
|
# Test both capabilities
|
||||||
|
embedding = await provider.embed("test")
|
||||||
|
assert embedding == [0.1, 0.2]
|
||||||
|
|
||||||
|
text = await provider.generate("test")
|
||||||
|
assert text == "Response"
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.unit
|
||||||
|
async def test_bedrock_no_embeddings(mock_bedrock_client):
|
||||||
|
"""Test Bedrock provider with no embedding model raises error."""
|
||||||
|
provider = BedrockProvider(
|
||||||
|
region_name="us-east-1",
|
||||||
|
embedding_model=None,
|
||||||
|
generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
|
||||||
|
)
|
||||||
|
|
||||||
|
assert provider.supports_embeddings is False
|
||||||
|
|
||||||
|
with pytest.raises(NotImplementedError, match="no embedding_model configured"):
|
||||||
|
await provider.embed("test")
|
||||||
|
|
||||||
|
with pytest.raises(NotImplementedError, match="no embedding_model configured"):
|
||||||
|
await provider.embed_batch(["test"])
|
||||||
|
|
||||||
|
with pytest.raises(NotImplementedError, match="no embedding_model configured"):
|
||||||
|
provider.get_dimension()
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.unit
|
||||||
|
async def test_bedrock_no_generation(mock_bedrock_client):
|
||||||
|
"""Test Bedrock provider with no generation model raises error."""
|
||||||
|
provider = BedrockProvider(
|
||||||
|
region_name="us-east-1",
|
||||||
|
embedding_model="amazon.titan-embed-text-v2:0",
|
||||||
|
generation_model=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
assert provider.supports_generation is False
|
||||||
|
|
||||||
|
with pytest.raises(NotImplementedError, match="no generation_model configured"):
|
||||||
|
await provider.generate("test")
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.unit
|
||||||
|
async def test_bedrock_dimension_detection(mock_bedrock_client):
|
||||||
|
"""Test dimension detection for Bedrock embeddings."""
|
||||||
|
# Mock response with specific dimension
|
||||||
|
mock_response = {
|
||||||
|
"body": MagicMock(
|
||||||
|
read=MagicMock(
|
||||||
|
return_value=json.dumps(
|
||||||
|
{"embedding": [0.1] * 1536} # 1536-dim embedding
|
||||||
|
).encode()
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||||
|
|
||||||
|
provider = BedrockProvider(
|
||||||
|
region_name="us-east-1",
|
||||||
|
embedding_model="amazon.titan-embed-text-v2:0",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Dimension not detected yet
|
||||||
|
with pytest.raises(RuntimeError, match="not detected yet"):
|
||||||
|
provider.get_dimension()
|
||||||
|
|
||||||
|
# Detect dimension
|
||||||
|
await provider._detect_dimension()
|
||||||
|
|
||||||
|
# Now dimension should be available
|
||||||
|
assert provider.get_dimension() == 1536
|
||||||
|
|
||||||
|
|
||||||
|
@pytest.mark.unit
|
||||||
|
async def test_bedrock_cohere_embedding(mock_bedrock_client):
|
||||||
|
"""Test Bedrock with Cohere embedding model."""
|
||||||
|
# Mock response
|
||||||
|
mock_response = {
|
||||||
|
"body": MagicMock(
|
||||||
|
read=MagicMock(
|
||||||
|
return_value=json.dumps({"embeddings": [[0.1, 0.2, 0.3]]}).encode()
|
||||||
|
)
|
||||||
|
)
|
||||||
|
}
|
||||||
|
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||||
|
|
||||||
|
provider = BedrockProvider(
|
||||||
|
region_name="us-east-1",
|
||||||
|
embedding_model="cohere.embed-english-v3",
|
||||||
|
)
|
||||||
|
|
||||||
|
embedding = await provider.embed("test text")
|
||||||
|
|
||||||
|
assert embedding == [0.1, 0.2, 0.3]
|
||||||
|
body = json.loads(mock_bedrock_client.invoke_model.call_args.kwargs["body"])
|
||||||
|
assert body == {"texts": ["test text"], "input_type": "search_document"}
|
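Every test in this file builds the same mocked response shape by hand: a dict whose `"body"` has a `read()` method returning JSON-encoded bytes. A small factory could remove that repetition. The sketch below is a suggestion, not part of this commit, and `make_bedrock_response` is a hypothetical name.

```python
import json
from unittest.mock import MagicMock


def make_bedrock_response(payload: dict) -> dict:
    """Build a mock invoke_model response whose body.read() yields JSON bytes."""
    body = MagicMock()
    body.read.return_value = json.dumps(payload).encode()
    return {"body": body}
```

A test would then set `mock_bedrock_client.invoke_model.return_value = make_bedrock_response({"embedding": [0.1, 0.2, 0.3]})` in place of the nested `MagicMock` literals.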
||||||
@@ -233,6 +233,34 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/f8/aa/5082412d1ee302e9e7d80b6949bc4d2a8fa1149aaab610c5fc24709605d6/authlib-1.6.5-py2.py3-none-any.whl", hash = "sha256:3e0e0507807f842b02175507bdee8957a1d5707fd4afb17c32fb43fee90b6e3a", size = 243608, upload-time = "2025-10-02T13:36:07.637Z" },
 ]
 
+[[package]]
+name = "boto3"
+version = "1.40.74"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "botocore" },
+    { name = "jmespath" },
+    { name = "s3transfer" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/a2/37/0db5fc46548b347255310893f1a47971a1d8eb0dbc46dfb5ace8a1e7d45e/boto3-1.40.74.tar.gz", hash = "sha256:484e46bf394b03a7c31b34f90945ebe1390cb1e2ac61980d128a9079beac87d4", size = 111592, upload-time = "2025-11-14T20:29:10.991Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/d2/08/c52751748762901c0ca3c3019e3aa950010217f0fdf9940ebe68e6bb2f5a/boto3-1.40.74-py3-none-any.whl", hash = "sha256:41fc8844b37ae27b24bcabf8369769df246cc12c09453988d0696ad06d6aa9ef", size = 139360, upload-time = "2025-11-14T20:29:09.477Z" },
+]
+
+[[package]]
+name = "botocore"
+version = "1.40.74"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "jmespath" },
+    { name = "python-dateutil" },
+    { name = "urllib3" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/81/dc/0412505f05286f282a75bb0c650e525ddcfaf3f6f1a05cd8e99d32a2db06/botocore-1.40.74.tar.gz", hash = "sha256:57de0b9ffeada06015b3c7e5186c77d0692b210d9e5efa294f3214df97e2f8ee", size = 14452479, upload-time = "2025-11-14T20:29:00.949Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7d/a2/306dec16e3c84f3ca7aaead0084358c1c7fbe6501f6160844cbc93bc871e/botocore-1.40.74-py3-none-any.whl", hash = "sha256:f39f5763e35e75f0bd91212b7b36120b1536203e8003cd952ef527db79702b15", size = 14117911, upload-time = "2025-11-14T20:28:58.153Z" },
+]
+
 [[package]]
 name = "caldav"
 version = "2.0.2.dev47+g3e44cf827"
@@ -1296,6 +1324,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/2f/9c/6753e6522b8d0ef07d3a3d239426669e984fb0eba15a315cdbc1253904e4/jiter-0.12.0-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c24e864cb30ab82311c6425655b0cdab0a98c5d973b065c66a3f020740c2324c", size = 346110, upload-time = "2025-11-09T20:49:21.817Z" },
 ]
 
+[[package]]
+name = "jmespath"
+version = "1.0.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/00/2a/e867e8531cf3e36b41201936b7fa7ba7b5702dbef42922193f05c8976cd6/jmespath-1.0.1.tar.gz", hash = "sha256:90261b206d6defd58fdd5e85f478bf633a2901798906be2ad389150c5c60edbe", size = 25843, upload-time = "2022-06-17T18:00:12.224Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/31/b4/b9b800c45527aadd64d5b442f9b932b00648617eb5d63d2c7a6587b7cafc/jmespath-1.0.1-py3-none-any.whl", hash = "sha256:02e2e4cc71b5bcab88332eebf907519190dd9e6e82107fa7f83b1003a6252980", size = 20256, upload-time = "2022-06-17T18:00:10.251Z" },
+]
+
 [[package]]
 name = "jsonschema"
 version = "4.25.1"
@@ -1849,6 +1886,7 @@ dependencies = [
 [package.dev-dependencies]
 dev = [
     { name = "anthropic" },
+    { name = "boto3" },
     { name = "commitizen" },
     { name = "datasets" },
     { name = "ipython" },
@@ -1891,6 +1929,7 @@ requires-dist = [
 [package.metadata.requires-dev]
 dev = [
     { name = "anthropic", specifier = ">=0.42.0" },
+    { name = "boto3", specifier = ">=1.35.0" },
     { name = "commitizen", specifier = ">=4.8.2" },
     { name = "datasets", specifier = ">=3.3.0" },
     { name = "ipython", specifier = ">=9.2.0" },
@@ -3270,6 +3309,18 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/e5/80/69756670caedcf3b9be597a6e12276a6cf6197076eb62aad0c608f8efce0/ruff-0.14.5-py3-none-win_arm64.whl", hash = "sha256:4b700459d4649e2594b31f20a9de33bc7c19976d4746d8d0798ad959621d64a4", size = 13433331, upload-time = "2025-11-13T19:58:48.434Z" },
 ]
 
+[[package]]
+name = "s3transfer"
+version = "0.14.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "botocore" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/62/74/8d69dcb7a9efe8baa2046891735e5dfe433ad558ae23d9e3c14c633d1d58/s3transfer-0.14.0.tar.gz", hash = "sha256:eff12264e7c8b4985074ccce27a3b38a485bb7f7422cc8046fee9be4983e4125", size = 151547, upload-time = "2025-09-09T19:23:31.089Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/48/f0/ae7ca09223a81a1d890b2557186ea015f6e0502e9b8cb8e1813f1d8cfa4e/s3transfer-0.14.0-py3-none-any.whl", hash = "sha256:ea3b790c7077558ed1f02a3072fb3cb992bbbd253392f4b6e9e8976941c7d456", size = 85712, upload-time = "2025-09-09T19:23:30.041Z" },
+]
+
 [[package]]
 name = "shellingham"
 version = "1.5.4"