Compare commits
27 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| b1f7b1d30b | |||
| b8bdbb499f | |||
| 2522b13d35 | |||
| 6cfd7e2729 | |||
| 3aa7128f45 | |||
| c3282534eb | |||
| 862308418e | |||
| 3464b21845 | |||
| ea01ce7673 | |||
| 216cb94383 | |||
| 5f3e0b84a3 | |||
| 39131cefcc | |||
| 9498c0fa36 | |||
| ed33b39062 | |||
| 1504df6fb5 | |||
| 392e1536b9 | |||
| 00ed3f07e5 | |||
| 050e9a56b9 | |||
| 7fccd47722 | |||
| ad4b45889f | |||
| 5b484c9226 | |||
| f559ca049e | |||
| 8e7b3c3ded | |||
| c74695af16 | |||
| 1faf572546 | |||
| 2aa82d849c | |||
| d1fb7eb633 |
@@ -5,3 +5,4 @@
|
||||
!uv.lock
|
||||
|
||||
!nextcloud_mcp_server/**/*.py
|
||||
!nextcloud_mcp_server/**/*.html
|
||||
|
||||
@@ -85,4 +85,4 @@ jobs:
|
||||
NEXTCLOUD_USERNAME: "admin"
|
||||
NEXTCLOUD_PASSWORD: "admin"
|
||||
run: |
|
||||
uv run pytest -v --log-cli-level=WARN -m smoke
|
||||
uv run pytest -v --log-cli-level=WARN -m unit -m smoke
|
||||
|
||||
@@ -1,3 +1,38 @@
|
||||
## v0.41.0 (2025-11-17)
|
||||
|
||||
### Feat
|
||||
|
||||
- add configurable fusion algorithms for BM25 hybrid search
|
||||
- add chunk position tracking to vector indexing and search
|
||||
- add vector viz template and chunk context endpoint
|
||||
|
||||
### Fix
|
||||
|
||||
- prevent infinite loop in DocumentChunker with position tracking
|
||||
- Relax SearchResult validation to support DBSF fusion scores > 1.0
|
||||
|
||||
## v0.40.0 (2025-11-16)
|
||||
|
||||
### Feat
|
||||
|
||||
- add unified provider architecture with Amazon Bedrock support
|
||||
|
||||
### Fix
|
||||
|
||||
- suppress Starlette middleware type warnings in ty checker
|
||||
|
||||
## v0.39.0 (2025-11-16)
|
||||
|
||||
### Feat
|
||||
|
||||
- Implement BM25 hybrid search with native Qdrant RRF fusion
|
||||
|
||||
### Fix
|
||||
|
||||
- Handle named vectors in visualization and semantic search
|
||||
- Update vizApp to use bm25_hybrid algorithm and remove deprecated weights
|
||||
- Update viz routes to use BM25 hybrid search after refactor
|
||||
|
||||
## v0.38.0 (2025-11-16)
|
||||
|
||||
### Feat
|
||||
|
||||
@@ -61,8 +61,60 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
||||
- `nextcloud_mcp_server/server/` - MCP tool/resource definitions
|
||||
- `nextcloud_mcp_server/auth/` - OAuth/OIDC authentication
|
||||
- `nextcloud_mcp_server/models/` - Pydantic response models
|
||||
- `nextcloud_mcp_server/providers/` - Unified LLM provider infrastructure (embeddings + generation)
|
||||
- `tests/` - Layered test suite (unit, smoke, integration, load)
|
||||
|
||||
### Provider Architecture (ADR-015)
|
||||
|
||||
**Unified Provider System** for embeddings and text generation:
|
||||
|
||||
**Location:** `nextcloud_mcp_server/providers/`
|
||||
- `base.py` - `Provider` ABC with optional capabilities
|
||||
- `registry.py` - Auto-detection and factory pattern
|
||||
- `ollama.py` - Ollama provider (embeddings + generation)
|
||||
- `anthropic.py` - Anthropic provider (generation only)
|
||||
- `bedrock.py` - Amazon Bedrock provider (embeddings + generation)
|
||||
- `simple.py` - Simple in-memory provider (embeddings only, fallback)
|
||||
|
||||
**Usage:**
|
||||
```python
|
||||
from nextcloud_mcp_server.providers import get_provider
|
||||
|
||||
provider = get_provider() # Auto-detects from environment
|
||||
|
||||
# Check capabilities
|
||||
if provider.supports_embeddings:
|
||||
embeddings = await provider.embed_batch(texts)
|
||||
|
||||
if provider.supports_generation:
|
||||
text = await provider.generate("prompt", max_tokens=500)
|
||||
```
|
||||
|
||||
**Environment Variables:**
|
||||
|
||||
Bedrock:
|
||||
- `AWS_REGION` - AWS region (e.g., "us-east-1")
|
||||
- `BEDROCK_EMBEDDING_MODEL` - Embedding model ID (e.g., "amazon.titan-embed-text-v2:0")
|
||||
- `BEDROCK_GENERATION_MODEL` - Generation model ID (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
|
||||
- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` - Optional, uses AWS credential chain
|
||||
|
||||
Ollama:
|
||||
- `OLLAMA_BASE_URL` - API URL (e.g., "http://localhost:11434")
|
||||
- `OLLAMA_EMBEDDING_MODEL` - Embedding model (default: "nomic-embed-text")
|
||||
- `OLLAMA_GENERATION_MODEL` - Generation model (e.g., "llama3.2:1b")
|
||||
- `OLLAMA_VERIFY_SSL` - SSL verification (default: "true")
|
||||
|
||||
Simple (fallback, no config needed):
|
||||
- `SIMPLE_EMBEDDING_DIMENSION` - Dimension (default: 384)
|
||||
|
||||
**Auto-Detection Priority:** Bedrock → Ollama → Simple
|
||||
|
||||
**Backward Compatibility:**
|
||||
- Old code using `nextcloud_mcp_server.embedding.get_embedding_service()` still works
|
||||
- `EmbeddingService` now wraps `get_provider()` internally
|
||||
|
||||
**For Details:** See `docs/ADR-015-unified-provider-architecture.md`
|
||||
|
||||
## Development Commands (Quick Reference)
|
||||
|
||||
### Testing
|
||||
|
||||
+2
-2
@@ -1,6 +1,6 @@
|
||||
FROM python:3.12-slim-trixie
|
||||
FROM docker.io/library/python:3.12-slim-trixie@sha256:d86b4c74b936c438cd4cc3a9f7256b9a7c27ad68c7caf8c205e18d9845af0164
|
||||
|
||||
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
|
||||
COPY --from=ghcr.io/astral-sh/uv:latest@sha256:f6e3549ed287fee0ddde2460a2a74a2d74366f84b04aaa34c1f19fec40da8652 /uv /uvx /bin/
|
||||
|
||||
# Install dependencies
|
||||
# 1. git (required for caldav dependency from git)
|
||||
|
||||
@@ -2,8 +2,8 @@ apiVersion: v2
|
||||
name: nextcloud-mcp-server
|
||||
description: A Helm chart for Nextcloud MCP Server - enables AI assistants to interact with Nextcloud
|
||||
type: application
|
||||
version: 0.38.0
|
||||
appVersion: "0.38.0"
|
||||
version: 0.41.0
|
||||
appVersion: "0.41.0"
|
||||
keywords:
|
||||
- nextcloud
|
||||
- mcp
|
||||
|
||||
@@ -1,8 +1,4 @@
|
||||
Here is a complete Architectural Decision Record (ADR) template based on your requirements. You can copy, paste, and adapt this directly.
|
||||
|
||||
---
|
||||
|
||||
## ADR-007: Replace Custom Keyword Search with BM25 Hybrid Search via Qdrant
|
||||
# ADR-014: Replace Custom Keyword Search with BM25 Hybrid Search via Qdrant
|
||||
|
||||
**Date:** 2025-11-16
|
||||
|
||||
@@ -151,7 +147,95 @@ This decision consolidates our retrieval logic, eliminates the data consistency
|
||||
|
||||
**Benefits Realized:**
|
||||
- ✅ Consolidated architecture (single Qdrant database for both dense + sparse)
|
||||
- ✅ Native RRF fusion (database-level, more efficient)
|
||||
- ✅ Native fusion algorithms (database-level, more efficient)
|
||||
- ✅ Industry-standard BM25 (replaces custom keyword search)
|
||||
- ✅ Simplified codebase (removed 736 lines of legacy code)
|
||||
- ✅ Better relevance (handles both semantic and keyword queries)
|
||||
- ✅ Configurable fusion methods (RRF and DBSF)
|
||||
|
||||
---
|
||||
|
||||
### 7. Fusion Algorithm Options
|
||||
|
||||
**Update: 2025-11-16**
|
||||
|
||||
The BM25 hybrid search now supports two fusion algorithms for combining dense (semantic) and sparse (BM25) search results:
|
||||
|
||||
#### Reciprocal Rank Fusion (RRF)
|
||||
|
||||
**Default fusion method.** RRF is a widely-used, well-established algorithm that combines rankings from multiple retrieval systems using the reciprocal rank formula:
|
||||
|
||||
```
|
||||
RRF(doc) = Σ 1/(k + rank_i(doc))
|
||||
```
|
||||
|
||||
where `k` is a constant (typically 60) and `rank_i(doc)` is the rank of the document in retrieval system `i`.
|
||||
|
||||
**Characteristics:**
|
||||
- ✅ **General-purpose**: Works well across diverse query types and document collections
|
||||
- ✅ **Rank-based**: Focuses on relative rankings rather than absolute scores
|
||||
- ✅ **Established**: Well-tested, documented, and understood in IR literature
|
||||
- ✅ **Robust**: Less sensitive to score distribution differences between systems
|
||||
|
||||
**When to use RRF:**
|
||||
- Default choice for most use cases
|
||||
- When you have mixed query types (semantic + keyword)
|
||||
- When retrieval systems have very different score ranges
|
||||
- When you want predictable, well-understood behavior
|
||||
|
||||
#### Distribution-Based Score Fusion (DBSF)
|
||||
|
||||
**Alternative fusion method.** DBSF normalizes scores from each retrieval system using distribution statistics before combining them:
|
||||
|
||||
1. **Normalization**: For each query, calculates mean (μ) and standard deviation (σ) of scores
|
||||
2. **Outlier handling**: Uses μ ± 3σ as normalization bounds
|
||||
3. **Fusion**: Sums normalized scores across systems
|
||||
|
||||
**Characteristics:**
|
||||
- ✅ **Score-aware**: Uses actual relevance scores, not just rankings
|
||||
- ✅ **Statistical**: Normalizes based on score distribution properties
|
||||
- ⚠️ **Experimental**: Newer algorithm, less battle-tested than RRF
|
||||
- ⚠️ **Sensitive**: May behave differently depending on score distributions
|
||||
|
||||
**When to use DBSF:**
|
||||
- When retrieval systems have vastly different score ranges that RRF doesn't balance well
|
||||
- When you want to experiment with score-based (vs rank-based) fusion
|
||||
- When statistical normalization better matches your use case
|
||||
- For A/B testing against RRF to measure retrieval quality improvements
|
||||
|
||||
#### Configuration
|
||||
|
||||
Both fusion algorithms are exposed via the `fusion` parameter in MCP tools:
|
||||
|
||||
```python
|
||||
# Use RRF (default)
|
||||
response = await nc_semantic_search(
|
||||
query="async programming",
|
||||
fusion="rrf" # Can be omitted, RRF is default
|
||||
)
|
||||
|
||||
# Use DBSF
|
||||
response = await nc_semantic_search(
|
||||
query="async programming",
|
||||
fusion="dbsf"
|
||||
)
|
||||
```
|
||||
|
||||
The `nc_semantic_search_answer` tool also supports the `fusion` parameter and passes it through to the underlying search.
|
||||
|
||||
#### Future: Configurable Weights
|
||||
|
||||
**Current limitation**: Neither RRF nor DBSF currently support per-system weights (e.g., 0.8 for semantic, 0.2 for BM25). This is a Qdrant platform limitation tracked in [qdrant/qdrant#6067](https://github.com/qdrant/qdrant/issues/6067).
|
||||
|
||||
When Qdrant adds weight support, the `fusion` parameter can be extended to accept weight configurations:
|
||||
|
||||
```python
|
||||
# Hypothetical future API
|
||||
response = await nc_semantic_search(
|
||||
query="async programming",
|
||||
fusion="rrf",
|
||||
fusion_weights={"dense": 0.7, "sparse": 0.3} # Not yet implemented
|
||||
)
|
||||
```
|
||||
|
||||
**Recommendation**: Start with RRF (default). If you encounter cases where keyword matches are under- or over-weighted, experiment with DBSF. Monitor [qdrant/qdrant#6067](https://github.com/qdrant/qdrant/issues/6067) for configurable weight support.
|
||||
|
||||
@@ -0,0 +1,380 @@
|
||||
# ADR-015: Unified Provider Architecture for Embeddings and Text Generation
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2025-01-16
|
||||
**Deciders:** Development Team
|
||||
**Related:** ADR-003 (Vector Database), ADR-008 (MCP Sampling), ADR-013 (RAG Evaluation)
|
||||
|
||||
## Context
|
||||
|
||||
Prior to this refactoring, the codebase had two separate provider systems:
|
||||
|
||||
1. **Embedding Providers** (`nextcloud_mcp_server/embedding/`)
|
||||
- Used `EmbeddingProvider` ABC with methods: `embed()`, `embed_batch()`, `get_dimension()`
|
||||
- Had auto-detection via `EmbeddingService._detect_provider()`
|
||||
- Used for semantic search and vector indexing (production)
|
||||
|
||||
2. **LLM Providers** (`tests/rag_evaluation/llm_providers.py`)
|
||||
- Used `LLMProvider` Protocol with method: `generate()`
|
||||
- Had separate factory function `create_llm_provider()`
|
||||
- Used only for RAG evaluation tests (not production)
|
||||
|
||||
This fragmentation created several problems:
|
||||
|
||||
### Problems with Dual Provider Systems
|
||||
|
||||
1. **Code Duplication**
|
||||
- Ollama configuration appeared in both `embedding/service.py` and `tests/rag_evaluation/llm_providers.py`
|
||||
- Similar provider detection logic in multiple places
|
||||
- Separate singleton patterns for each system
|
||||
|
||||
2. **Limited Extensibility**
|
||||
- Hard-coded provider detection in `EmbeddingService._detect_provider()`
|
||||
- No support for providers that offer both capabilities (like Bedrock)
|
||||
- Adding new providers required modifying multiple files
|
||||
|
||||
3. **Inconsistent Patterns**
|
||||
- BM25 provider didn't follow `EmbeddingProvider` ABC
|
||||
- Different method names across providers (`embed` vs `encode`)
|
||||
- ABC vs Protocol for type checking
|
||||
|
||||
4. **Difficult Scaling**
|
||||
- Adding Amazon Bedrock (our third provider) would exacerbate all issues
|
||||
- No clear path for future providers (OpenAI, Cohere, etc.)
|
||||
|
||||
### Amazon Bedrock Requirements
|
||||
|
||||
Bedrock naturally supports **both** embeddings and text generation:
|
||||
- **Embeddings**: `amazon.titan-embed-text-v1/v2`, `cohere.embed-*`
|
||||
- **Text Generation**: `anthropic.claude-*`, `meta.llama3-*`, `amazon.titan-text-*`
|
||||
- **Unified API**: Single `invoke_model()` method via bedrock-runtime
|
||||
|
||||
This made it the perfect opportunity to establish a unified provider architecture.
|
||||
|
||||
## Decision
|
||||
|
||||
We refactored the provider infrastructure to use a **unified Provider ABC** with optional capabilities:
|
||||
|
||||
### 1. Unified Provider Interface
|
||||
|
||||
**New Structure:**
|
||||
```
|
||||
nextcloud_mcp_server/providers/
|
||||
├── __init__.py
|
||||
├── base.py # Provider ABC with optional capabilities
|
||||
├── registry.py # Auto-detection and factory
|
||||
├── ollama.py # Supports both embedding + generation
|
||||
├── anthropic.py # Generation only
|
||||
├── bedrock.py # Supports both embedding + generation
|
||||
└── simple.py # Embedding only (testing fallback)
|
||||
```
|
||||
|
||||
**Base Class (`providers/base.py`):**
|
||||
```python
|
||||
class Provider(ABC):
|
||||
@property
|
||||
@abstractmethod
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
pass
|
||||
|
||||
@property
|
||||
@abstractmethod
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""Generate embedding (raises NotImplementedError if not supported)."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""Generate batch embeddings (raises NotImplementedError if not supported)."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_dimension(self) -> int:
|
||||
"""Get embedding dimension (raises NotImplementedError if not supported)."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""Generate text (raises NotImplementedError if not supported)."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def close(self) -> None:
|
||||
"""Close provider and release resources."""
|
||||
pass
|
||||
```
|
||||
|
||||
### 2. Provider Registry
|
||||
|
||||
**Auto-Detection Priority** (`providers/registry.py`):
|
||||
```python
|
||||
class ProviderRegistry:
|
||||
@staticmethod
|
||||
def create_provider() -> Provider:
|
||||
# 1. Bedrock (AWS_REGION or BEDROCK_*_MODEL)
|
||||
# 2. Ollama (OLLAMA_BASE_URL)
|
||||
# 3. Simple (fallback)
|
||||
```
|
||||
|
||||
**Environment Variables:**
|
||||
|
||||
**Bedrock:**
|
||||
- `AWS_REGION`: AWS region (e.g., "us-east-1")
|
||||
- `AWS_ACCESS_KEY_ID`: AWS access key (optional, uses credential chain)
|
||||
- `AWS_SECRET_ACCESS_KEY`: AWS secret key (optional)
|
||||
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0")
|
||||
- `BEDROCK_GENERATION_MODEL`: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
|
||||
|
||||
**Ollama:**
|
||||
- `OLLAMA_BASE_URL`: Ollama API base URL (e.g., "http://localhost:11434")
|
||||
- `OLLAMA_EMBEDDING_MODEL`: Model for embeddings (default: "nomic-embed-text")
|
||||
- `OLLAMA_GENERATION_MODEL`: Model for text generation (e.g., "llama3.2:1b")
|
||||
- `OLLAMA_VERIFY_SSL`: Verify SSL certificates (default: "true")
|
||||
|
||||
**Simple (no configuration, fallback):**
|
||||
- `SIMPLE_EMBEDDING_DIMENSION`: Embedding dimension (default: 384)
|
||||
|
||||
### 3. Backward Compatibility
|
||||
|
||||
**Old Code Continues to Work:**
|
||||
```python
|
||||
# Old way (still works)
|
||||
from nextcloud_mcp_server.embedding import get_embedding_service
|
||||
|
||||
service = get_embedding_service() # Returns singleton Provider
|
||||
embeddings = await service.embed_batch(texts)
|
||||
```
|
||||
|
||||
**New Way (recommended):**
|
||||
```python
|
||||
# New way (cleaner)
|
||||
from nextcloud_mcp_server.providers import get_provider
|
||||
|
||||
provider = get_provider() # Returns singleton Provider
|
||||
embeddings = await provider.embed_batch(texts)
|
||||
|
||||
# Can also use generation if provider supports it
|
||||
if provider.supports_generation:
|
||||
text = await provider.generate("prompt")
|
||||
```
|
||||
|
||||
**Migration Path:**
|
||||
- `embedding/service.py` now wraps `providers.get_provider()` for compatibility
|
||||
- `tests/rag_evaluation/llm_providers.py` now uses unified providers
|
||||
- Old imports still work, marked as deprecated in docstrings
|
||||
|
||||
### 4. Amazon Bedrock Implementation
|
||||
|
||||
**Features:**
|
||||
- Supports both embeddings and text generation
|
||||
- Model-specific request/response handling for:
|
||||
- Titan Embed (amazon.titan-embed-text-*)
|
||||
- Cohere Embed (cohere.embed-*)
|
||||
- Claude (anthropic.claude-*)
|
||||
- Llama (meta.llama3-*)
|
||||
- Titan Text (amazon.titan-text-*)
|
||||
- Mistral (mistral.*)
|
||||
- Uses boto3 bedrock-runtime client
|
||||
- Graceful degradation if boto3 not installed
|
||||
- Async implementation matching existing patterns
|
||||
|
||||
**Model-Specific Handling:**
|
||||
```python
|
||||
# Bedrock embedding request (Titan)
|
||||
{"inputText": text}
|
||||
|
||||
# Bedrock generation request (Claude)
|
||||
{
|
||||
"anthropic_version": "bedrock-2023-05-31",
|
||||
"max_tokens": max_tokens,
|
||||
"temperature": 0.7,
|
||||
"messages": [{"role": "user", "content": prompt}]
|
||||
}
|
||||
```
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
1. **Sustainable Provider Additions**
|
||||
- New providers only need to implement `Provider` ABC
|
||||
- Auto-detection via environment variables
|
||||
- No modifications to existing code required
|
||||
|
||||
2. **Code Consolidation**
|
||||
- Single provider interface instead of two
|
||||
- Unified configuration pattern
|
||||
- Eliminated duplication
|
||||
|
||||
3. **Better Extensibility**
|
||||
- Providers can support one or both capabilities
|
||||
- Clear capability detection via properties
|
||||
- Registry pattern simplifies auto-detection
|
||||
|
||||
4. **Improved Testing**
|
||||
- RAG evaluation can use any provider (Ollama, Anthropic, Bedrock)
|
||||
- Comprehensive unit tests for all providers
|
||||
- Mocked boto3 tests for Bedrock
|
||||
|
||||
5. **Production-Ready Bedrock Support**
|
||||
- Full embedding and generation support
|
||||
- Multiple model families supported
|
||||
- AWS credential chain integration
|
||||
|
||||
### Neutral
|
||||
|
||||
1. **Optional Boto3 Dependency**
|
||||
- boto3 is dev dependency only (not required for core functionality)
|
||||
- Bedrock provider gracefully fails if boto3 not installed
|
||||
- Users who want Bedrock must `pip install boto3`
|
||||
|
||||
2. **Capability Properties**
|
||||
- All providers must implement capability properties
|
||||
- Methods raise `NotImplementedError` if capability not supported
|
||||
- Clear error messages guide users to alternatives
|
||||
|
||||
### Negative
|
||||
|
||||
1. **Migration Effort**
|
||||
- Existing code must be migrated to new imports (optional, backward compatible)
|
||||
- Documentation needs updating
|
||||
- Users must learn new environment variables
|
||||
|
||||
2. **Increased Complexity**
|
||||
- Provider base class has more methods (embedding + generation)
|
||||
- More environment variables to configure
|
||||
- Capability detection adds runtime checks
|
||||
|
||||
## Implementation
|
||||
|
||||
### Files Created
|
||||
|
||||
**New Provider Infrastructure:**
|
||||
- `nextcloud_mcp_server/providers/__init__.py`
|
||||
- `nextcloud_mcp_server/providers/base.py`
|
||||
- `nextcloud_mcp_server/providers/registry.py`
|
||||
- `nextcloud_mcp_server/providers/ollama.py`
|
||||
- `nextcloud_mcp_server/providers/anthropic.py`
|
||||
- `nextcloud_mcp_server/providers/bedrock.py`
|
||||
- `nextcloud_mcp_server/providers/simple.py`
|
||||
|
||||
**Tests:**
|
||||
- `tests/unit/providers/__init__.py`
|
||||
- `tests/unit/providers/test_bedrock.py` (9 unit tests)
|
||||
|
||||
**Documentation:**
|
||||
- `docs/ADR-015-unified-provider-architecture.md` (this file)
|
||||
|
||||
### Files Modified
|
||||
|
||||
**Backward Compatibility:**
|
||||
- `nextcloud_mcp_server/embedding/service.py` - Now wraps `get_provider()`
|
||||
- `tests/rag_evaluation/llm_providers.py` - Uses unified providers
|
||||
|
||||
**Dependencies:**
|
||||
- `pyproject.toml` - Added `boto3>=1.35.0` to dev dependencies
|
||||
|
||||
### Testing Results
|
||||
|
||||
**Unit Tests:** 127 passed (including 9 new Bedrock tests)
|
||||
**Type Checking:** All checks passed (ty)
|
||||
**Linting:** All checks passed (ruff)
|
||||
**Backward Compatibility:** Verified - existing embedding tests work
|
||||
|
||||
## Alternatives Considered
|
||||
|
||||
### Alternative 1: Keep Separate Provider Systems
|
||||
|
||||
**Pros:**
|
||||
- No refactoring needed
|
||||
- Simpler short-term
|
||||
|
||||
**Cons:**
|
||||
- Bedrock would need to be implemented twice
|
||||
- Continued code duplication
|
||||
- No long-term scalability
|
||||
|
||||
**Decision:** Rejected - technical debt would continue to grow
|
||||
|
||||
### Alternative 2: Separate Embedding and Generation Providers
|
||||
|
||||
Use composition instead of unified interface:
|
||||
```python
|
||||
class CombinedProvider:
|
||||
def __init__(self, embedding: EmbeddingProvider, generation: LLMProvider):
|
||||
self.embedding = embedding
|
||||
self.generation = generation
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Clearer separation of concerns
|
||||
- Simpler individual providers
|
||||
|
||||
**Cons:**
|
||||
- Bedrock and Ollama naturally do both - artificial separation
|
||||
- More complex configuration (two providers to configure)
|
||||
- More boilerplate code
|
||||
|
||||
**Decision:** Rejected - unified interface better matches provider capabilities
|
||||
|
||||
### Alternative 3: Plugin System
|
||||
|
||||
Dynamic provider registration via entry points:
|
||||
```python
|
||||
# setup.py
|
||||
entry_points={
|
||||
'nextcloud_mcp.providers': [
|
||||
'ollama = nextcloud_mcp_server.providers.ollama:OllamaProvider',
|
||||
'bedrock = nextcloud_mcp_server.providers.bedrock:BedrockProvider',
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Most extensible
|
||||
- Third-party providers possible
|
||||
|
||||
**Cons:**
|
||||
- Over-engineered for current needs
|
||||
- Added complexity
|
||||
- No immediate benefit
|
||||
|
||||
**Decision:** Deferred - can add later if needed
|
||||
|
||||
## Future Work
|
||||
|
||||
1. **Additional Providers**
|
||||
- OpenAI (embeddings + generation)
|
||||
- Cohere (embeddings + generation)
|
||||
- Google Vertex AI
|
||||
- Azure OpenAI
|
||||
|
||||
2. **Provider Features**
|
||||
- Streaming generation support
|
||||
- Batch API optimization (when available)
|
||||
- Model-specific optimizations
|
||||
- Cost tracking and metrics
|
||||
|
||||
3. **Configuration Improvements**
|
||||
- Provider profiles (development, production)
|
||||
- Model aliasing (e.g., "small", "large")
|
||||
- Fallback provider chains
|
||||
|
||||
4. **Testing**
|
||||
- Integration tests with real Bedrock endpoints
|
||||
- Performance benchmarking across providers
|
||||
- Cost comparison analysis
|
||||
|
||||
## References
|
||||
|
||||
- [boto3 Bedrock Runtime Documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html)
|
||||
- [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
|
||||
- ADR-003: Vector Database and Semantic Search
|
||||
- ADR-008: MCP Sampling for Semantic Search
|
||||
- ADR-013: RAG Evaluation Framework
|
||||
@@ -0,0 +1,338 @@
|
||||
# Amazon Bedrock Setup Guide
|
||||
|
||||
This guide covers how to configure the Nextcloud MCP Server to use Amazon Bedrock for embeddings and text generation.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
1. **AWS Account** with access to Amazon Bedrock
|
||||
2. **boto3 library** installed: `pip install boto3` or `uv sync --group dev`
|
||||
3. **Model Access** - Request access to models in AWS Bedrock console
|
||||
|
||||
## Required AWS Permissions
|
||||
|
||||
### IAM Policy for Bedrock Access
|
||||
|
||||
The AWS IAM user or role needs the following permissions:
|
||||
|
||||
```json
|
||||
{
|
||||
"Version": "2012-10-17",
|
||||
"Statement": [
|
||||
{
|
||||
"Sid": "BedrockInvokeModels",
|
||||
"Effect": "Allow",
|
||||
"Action": [
|
||||
"bedrock:InvokeModel",
|
||||
"bedrock:InvokeModelWithResponseStream"
|
||||
],
|
||||
"Resource": [
|
||||
"arn:aws:bedrock:*::foundation-model/*"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### Minimal Permissions (Production)
|
||||
|
||||
For production deployments, restrict to specific models:
|
||||
|
||||
```json
|
||||
{
|
||||
"Version": "2012-10-17",
|
||||
"Statement": [
|
||||
{
|
||||
"Sid": "BedrockEmbeddings",
|
||||
"Effect": "Allow",
|
||||
"Action": [
|
||||
"bedrock:InvokeModel"
|
||||
],
|
||||
"Resource": [
|
||||
"arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
|
||||
]
|
||||
},
|
||||
{
|
||||
"Sid": "BedrockGeneration",
|
||||
"Effect": "Allow",
|
||||
"Action": [
|
||||
"bedrock:InvokeModel"
|
||||
],
|
||||
"Resource": [
|
||||
"arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### Additional Permissions (Optional)
|
||||
|
||||
For advanced use cases:
|
||||
|
||||
```json
|
||||
{
|
||||
"Version": "2012-10-17",
|
||||
"Statement": [
|
||||
{
|
||||
"Sid": "BedrockListModels",
|
||||
"Effect": "Allow",
|
||||
"Action": [
|
||||
"bedrock:ListFoundationModels",
|
||||
"bedrock:GetFoundationModel"
|
||||
],
|
||||
"Resource": "*"
|
||||
},
|
||||
{
|
||||
"Sid": "BedrockAsyncInvoke",
|
||||
"Effect": "Allow",
|
||||
"Action": [
|
||||
"bedrock:InvokeModelAsync",
|
||||
"bedrock:GetAsyncInvoke",
|
||||
"bedrock:ListAsyncInvokes"
|
||||
],
|
||||
"Resource": [
|
||||
"arn:aws:bedrock:*::foundation-model/*"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Model Access
|
||||
|
||||
Before using Bedrock models, you must request access in the AWS Console:
|
||||
|
||||
1. Navigate to **Amazon Bedrock** → **Model access**
|
||||
2. Click **Manage model access**
|
||||
3. Select models you want to use:
|
||||
- **Embeddings:** Amazon Titan Embed Text, Cohere Embed
|
||||
- **Text Generation:** Anthropic Claude, Meta Llama, Amazon Titan Text
|
||||
4. Click **Request model access**
|
||||
5. Wait for approval (usually instant for most models)
|
||||
|
||||
## Supported Models
|
||||
|
||||
### Embedding Models
|
||||
|
||||
| Provider | Model ID | Dimensions | Best For |
|
||||
|----------|----------|------------|----------|
|
||||
| Amazon Titan | `amazon.titan-embed-text-v1` | 1,536 | General purpose |
|
||||
| Amazon Titan | `amazon.titan-embed-text-v2:0` | 1,024 | Latest, improved quality |
|
||||
| Cohere | `cohere.embed-english-v3` | 1,024 | English text |
|
||||
| Cohere | `cohere.embed-multilingual-v3` | 1,024 | Multilingual |
|
||||
|
||||
### Text Generation Models
|
||||
|
||||
| Provider | Model ID | Context | Best For |
|
||||
|----------|----------|---------|----------|
|
||||
| Anthropic | `anthropic.claude-3-sonnet-20240229-v1:0` | 200K | Balanced performance |
|
||||
| Anthropic | `anthropic.claude-3-haiku-20240307-v1:0` | 200K | Fast, cost-effective |
|
||||
| Anthropic | `anthropic.claude-3-opus-20240229-v1:0` | 200K | Highest quality |
|
||||
| Meta | `meta.llama3-8b-instruct-v1:0` | 8K | Fast, open-source |
|
||||
| Meta | `meta.llama3-70b-instruct-v1:0` | 8K | High quality |
|
||||
| Amazon | `amazon.titan-text-express-v1` | 8K | Fast, low cost |
|
||||
| Mistral | `mistral.mistral-7b-instruct-v0:2` | 32K | Efficient |
|
||||
|
||||
## Configuration
|
||||
|
||||
### Environment Variables
|
||||
|
||||
**Required:**
|
||||
```bash
|
||||
AWS_REGION=us-east-1
|
||||
```
|
||||
|
||||
**Optional (at least one model required):**
|
||||
```bash
|
||||
# For embeddings
|
||||
BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
|
||||
|
||||
# For text generation (RAG evaluation)
|
||||
BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
|
||||
```
|
||||
|
||||
**AWS Credentials (choose one method):**
|
||||
|
||||
**Method 1: Environment Variables**
|
||||
```bash
|
||||
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
|
||||
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
|
||||
```
|
||||
|
||||
**Method 2: AWS Credentials File** (`~/.aws/credentials`)
|
||||
```ini
|
||||
[default]
|
||||
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
|
||||
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
|
||||
```
|
||||
|
||||
**Method 3: IAM Role** (when running on AWS EC2/ECS/Lambda)
|
||||
- No credentials needed, uses instance/task role automatically
|
||||
|
||||
### Docker Configuration
|
||||
|
||||
Add to your `docker-compose.yml`:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
mcp:
|
||||
environment:
|
||||
- AWS_REGION=us-east-1
|
||||
- BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
|
||||
- BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
|
||||
- AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
|
||||
- AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
|
||||
```
|
||||
|
||||
Or use AWS credentials file volume mount:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
mcp:
|
||||
volumes:
|
||||
- ~/.aws:/root/.aws:ro
|
||||
environment:
|
||||
- AWS_REGION=us-east-1
|
||||
- BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
|
||||
```
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Embeddings Only
|
||||
|
||||
```bash
|
||||
export AWS_REGION=us-east-1
|
||||
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
|
||||
export AWS_ACCESS_KEY_ID=your-key
|
||||
export AWS_SECRET_ACCESS_KEY=your-secret
|
||||
|
||||
uv run nextcloud-mcp-server
|
||||
```
|
||||
|
||||
### Both Embeddings and Generation
|
||||
|
||||
```bash
|
||||
export AWS_REGION=us-east-1
|
||||
export BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
|
||||
export BEDROCK_GENERATION_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
|
||||
|
||||
# For RAG evaluation with Bedrock
|
||||
export RAG_EVAL_PROVIDER=bedrock
|
||||
export RAG_EVAL_BEDROCK_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
|
||||
|
||||
uv run python -m tests.rag_evaluation.evaluate
|
||||
```
|
||||
|
||||
### Programmatic Usage
|
||||
|
||||
```python
|
||||
from nextcloud_mcp_server.providers import BedrockProvider
|
||||
|
||||
# Embeddings only
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model="amazon.titan-embed-text-v2:0",
|
||||
)
|
||||
|
||||
embeddings = await provider.embed_batch(["text1", "text2"])
|
||||
|
||||
# Both capabilities
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model="amazon.titan-embed-text-v2:0",
|
||||
generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
|
||||
)
|
||||
|
||||
# Generate embeddings
|
||||
embedding = await provider.embed("query text")
|
||||
|
||||
# Generate text
|
||||
response = await provider.generate("Write a summary", max_tokens=500)
|
||||
```
|
||||
|
||||
## Cost Considerations
|
||||
|
||||
### Embedding Costs (as of Jan 2025)
|
||||
|
||||
| Model | Price per 1K tokens |
|
||||
|-------|---------------------|
|
||||
| Titan Embed Text v2 | $0.0001 |
|
||||
| Cohere Embed English v3 | $0.0001 |
|
||||
|
||||
### Generation Costs (as of Jan 2025)
|
||||
|
||||
| Model | Input (per 1K tokens) | Output (per 1K tokens) |
|
||||
|-------|----------------------|------------------------|
|
||||
| Claude 3 Haiku | $0.00025 | $0.00125 |
|
||||
| Claude 3 Sonnet | $0.003 | $0.015 |
|
||||
| Claude 3 Opus | $0.015 | $0.075 |
|
||||
| Llama 3 8B | $0.0003 | $0.0006 |
|
||||
| Titan Text Express | $0.0002 | $0.0006 |
|
||||
|
||||
**Note:** Prices vary by region. Check [AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/) for current rates.
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Error: "Executable doesn't exist" or boto3 not found
|
||||
|
||||
**Solution:**
|
||||
```bash
|
||||
uv sync --group dev # Installs boto3
|
||||
```
|
||||
|
||||
### Error: "AccessDeniedException"
|
||||
|
||||
**Causes:**
|
||||
1. IAM permissions missing
|
||||
2. Model access not requested
|
||||
3. Wrong AWS region
|
||||
|
||||
**Solution:**
|
||||
1. Verify IAM policy includes `bedrock:InvokeModel`
|
||||
2. Request model access in Bedrock console
|
||||
3. Check model is available in your region
|
||||
|
||||
### Error: "ResourceNotFoundException"
|
||||
|
||||
**Cause:** Invalid model ID or model not available in region
|
||||
|
||||
**Solution:**
|
||||
- Verify model ID matches exactly (case-sensitive)
|
||||
- Check model availability in your AWS region
|
||||
- Use `aws bedrock list-foundation-models` to see available models
|
||||
|
||||
### Error: "ThrottlingException"
|
||||
|
||||
**Cause:** Rate limit exceeded
|
||||
|
||||
**Solution:**
|
||||
- Reduce request rate
|
||||
- Request quota increase via AWS Support
|
||||
- Use batch operations where possible
|
||||
|
||||
## Security Best Practices
|
||||
|
||||
1. **Use IAM Roles** when running on AWS infrastructure
|
||||
2. **Rotate Access Keys** regularly if using IAM users
|
||||
3. **Restrict Permissions** to only required models
|
||||
4. **Enable CloudTrail** for audit logging
|
||||
5. **Use AWS Secrets Manager** for credential management
|
||||
6. **Monitor Costs** with AWS Cost Explorer and Budgets
|
||||
|
||||
## Regional Availability
|
||||
|
||||
Amazon Bedrock is available in:
|
||||
- **US East (N. Virginia)**: `us-east-1` ✅ Most models
|
||||
- **US West (Oregon)**: `us-west-2` ✅ Most models
|
||||
- **Asia Pacific (Singapore)**: `ap-southeast-1`
|
||||
- **Asia Pacific (Tokyo)**: `ap-northeast-1`
|
||||
- **Europe (Frankfurt)**: `eu-central-1`
|
||||
|
||||
**Note:** Model availability varies by region. Check the [AWS Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html) for current availability.
|
||||
|
||||
## References
|
||||
|
||||
- [AWS Bedrock Documentation](https://docs.aws.amazon.com/bedrock/)
|
||||
- [AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)
|
||||
- [boto3 Bedrock Runtime API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html)
|
||||
- [Provider Architecture ADR](./ADR-015-unified-provider-architecture.md)
|
||||
@@ -1478,6 +1478,7 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
|
||||
vector_sync_status_fragment,
|
||||
)
|
||||
from nextcloud_mcp_server.auth.viz_routes import (
|
||||
chunk_context_endpoint,
|
||||
vector_visualization_html,
|
||||
vector_visualization_search,
|
||||
)
|
||||
@@ -1509,6 +1510,11 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
|
||||
vector_visualization_search,
|
||||
methods=["GET"],
|
||||
), # /app/vector-viz/search
|
||||
Route(
|
||||
"/chunk-context",
|
||||
chunk_context_endpoint,
|
||||
methods=["GET"],
|
||||
), # /app/chunk-context
|
||||
# Webhook management routes (admin-only)
|
||||
Route("/webhooks", webhook_management_pane, methods=["GET"]), # /app/webhooks
|
||||
Route(
|
||||
@@ -1523,7 +1529,7 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
|
||||
|
||||
browser_app = Starlette(routes=browser_routes)
|
||||
browser_app.add_middleware(
|
||||
AuthenticationMiddleware,
|
||||
AuthenticationMiddleware, # type: ignore[invalid-argument-type]
|
||||
backend=SessionAuthBackend(oauth_enabled=oauth_enabled),
|
||||
)
|
||||
|
||||
@@ -1613,7 +1619,7 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
|
||||
|
||||
# Add CORS middleware to allow browser-based clients like MCP Inspector
|
||||
app.add_middleware(
|
||||
CORSMiddleware,
|
||||
CORSMiddleware, # type: ignore[invalid-argument-type]
|
||||
allow_origins=["*"], # Allow all origins for development
|
||||
allow_credentials=True,
|
||||
allow_methods=["*"],
|
||||
@@ -1623,7 +1629,7 @@ def get_app(transport: str = "sse", enabled_apps: list[str] | None = None):
|
||||
|
||||
# Add observability middleware (metrics + tracing)
|
||||
if settings.metrics_enabled or settings.otel_exporter_otlp_endpoint:
|
||||
app.add_middleware(ObservabilityMiddleware)
|
||||
app.add_middleware(ObservabilityMiddleware) # type: ignore[invalid-argument-type]
|
||||
logger.info("Observability middleware enabled (metrics and/or tracing)")
|
||||
|
||||
# Add exception handler for scope challenges (OAuth mode only)
|
||||
|
||||
@@ -0,0 +1,323 @@
|
||||
<style>
|
||||
.viz-card {
|
||||
background: white;
|
||||
border-radius: 8px;
|
||||
padding: 20px;
|
||||
margin-bottom: 20px;
|
||||
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
|
||||
}
|
||||
.viz-controls {
|
||||
margin-bottom: 20px;
|
||||
}
|
||||
.viz-control-row {
|
||||
display: grid;
|
||||
grid-template-columns: 2fr 1fr auto;
|
||||
gap: 12px;
|
||||
margin-bottom: 12px;
|
||||
align-items: end;
|
||||
}
|
||||
.viz-control-group {
|
||||
margin-bottom: 15px;
|
||||
}
|
||||
.viz-control-group label {
|
||||
display: block;
|
||||
margin-bottom: 5px;
|
||||
font-weight: 500;
|
||||
color: #333;
|
||||
}
|
||||
.viz-control-group input[type="text"],
|
||||
.viz-control-group input[type="number"],
|
||||
.viz-control-group select {
|
||||
width: 100%;
|
||||
padding: 8px 12px;
|
||||
border: 1px solid #ddd;
|
||||
border-radius: 4px;
|
||||
font-size: 14px;
|
||||
}
|
||||
.viz-control-group input[type="range"] {
|
||||
width: 100%;
|
||||
}
|
||||
.viz-control-group select[multiple] {
|
||||
min-height: 100px;
|
||||
}
|
||||
.viz-weight-display {
|
||||
display: inline-block;
|
||||
min-width: 40px;
|
||||
text-align: right;
|
||||
color: #666;
|
||||
}
|
||||
.viz-btn {
|
||||
background: #0066cc;
|
||||
color: white;
|
||||
border: none;
|
||||
padding: 10px 20px;
|
||||
border-radius: 4px;
|
||||
cursor: pointer;
|
||||
font-size: 14px;
|
||||
font-weight: 500;
|
||||
}
|
||||
.viz-btn:hover {
|
||||
background: #0052a3;
|
||||
}
|
||||
.viz-btn-secondary {
|
||||
background: #6c757d;
|
||||
color: white;
|
||||
border: none;
|
||||
padding: 6px 12px;
|
||||
border-radius: 4px;
|
||||
cursor: pointer;
|
||||
font-size: 13px;
|
||||
margin-bottom: 12px;
|
||||
}
|
||||
.viz-btn-secondary:hover {
|
||||
background: #5a6268;
|
||||
}
|
||||
#viz-plot-container {
|
||||
width: 100%;
|
||||
height: 600px;
|
||||
position: relative;
|
||||
}
|
||||
#viz-plot {
|
||||
width: 100%;
|
||||
height: 100%;
|
||||
}
|
||||
.viz-loading {
|
||||
text-align: center;
|
||||
padding: 40px;
|
||||
color: #666;
|
||||
}
|
||||
.viz-loading-overlay {
|
||||
position: absolute;
|
||||
inset: 0;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
background: white;
|
||||
color: #666;
|
||||
}
|
||||
.viz-no-results {
|
||||
text-align: center;
|
||||
padding: 40px;
|
||||
color: #666;
|
||||
font-style: italic;
|
||||
}
|
||||
.viz-advanced-section {
|
||||
margin-top: 16px;
|
||||
padding: 16px;
|
||||
background: #f8f9fa;
|
||||
border-radius: 4px;
|
||||
border: 1px solid #dee2e6;
|
||||
}
|
||||
.viz-advanced-grid {
|
||||
display: grid;
|
||||
grid-template-columns: 1fr 1fr;
|
||||
gap: 20px;
|
||||
}
|
||||
.viz-info-box {
|
||||
background: #e3f2fd;
|
||||
border-left: 4px solid #2196f3;
|
||||
padding: 12px;
|
||||
margin-bottom: 20px;
|
||||
font-size: 14px;
|
||||
}
|
||||
.chunk-toggle-btn {
|
||||
background: #6c757d;
|
||||
color: white;
|
||||
border: none;
|
||||
padding: 4px 10px;
|
||||
border-radius: 3px;
|
||||
cursor: pointer;
|
||||
font-size: 12px;
|
||||
margin-top: 6px;
|
||||
}
|
||||
.chunk-toggle-btn:hover {
|
||||
background: #5a6268;
|
||||
}
|
||||
.chunk-context {
|
||||
background: #f8f9fa;
|
||||
border: 1px solid #dee2e6;
|
||||
border-radius: 4px;
|
||||
padding: 12px;
|
||||
margin-top: 8px;
|
||||
font-family: monospace;
|
||||
font-size: 13px;
|
||||
line-height: 1.6;
|
||||
white-space: pre-wrap;
|
||||
word-wrap: break-word;
|
||||
}
|
||||
.chunk-text {
|
||||
color: #666;
|
||||
}
|
||||
.chunk-matched {
|
||||
background: #fff3cd;
|
||||
border: 1px solid #ffc107;
|
||||
padding: 2px 4px;
|
||||
border-radius: 2px;
|
||||
font-weight: 500;
|
||||
color: #333;
|
||||
}
|
||||
.chunk-ellipsis {
|
||||
color: #999;
|
||||
font-style: italic;
|
||||
}
|
||||
</style>
|
||||
|
||||
<div x-data="vizApp()">
|
||||
<div class="viz-card">
|
||||
<h2>Vector Visualization</h2>
|
||||
<div class="viz-info-box">
|
||||
Testing search algorithms on your indexed documents. User: <strong>{{ username }}</strong>
|
||||
</div>
|
||||
|
||||
<form @submit.prevent="executeSearch">
|
||||
<div class="viz-controls">
|
||||
<!-- Main Controls -->
|
||||
<div class="viz-control-group">
|
||||
<label>Search Query</label>
|
||||
<input type="text" x-model="query" placeholder="Enter search query..." required />
|
||||
</div>
|
||||
|
||||
<div class="viz-control-row">
|
||||
<div class="viz-control-group" style="margin-bottom: 0;">
|
||||
<label>Algorithm</label>
|
||||
<select x-model="algorithm">
|
||||
<option value="semantic">Semantic (Dense Vectors)</option>
|
||||
<option value="bm25_hybrid" selected>BM25 Hybrid (Dense + Sparse)</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div class="viz-control-group" style="margin-bottom: 0;" x-show="algorithm === 'bm25_hybrid'">
|
||||
<label>Fusion Method</label>
|
||||
<select x-model="fusion">
|
||||
<option value="rrf" selected>RRF (Reciprocal Rank Fusion)</option>
|
||||
<option value="dbsf">DBSF (Distribution-Based Score Fusion)</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div style="display: flex; align-items: flex-end;">
|
||||
<button type="submit" class="viz-btn" style="width: 100%;">Search & Visualize</button>
|
||||
</div>
|
||||
|
||||
<div style="display: flex; align-items: flex-end;">
|
||||
<button type="button" class="viz-btn-secondary" @click="showAdvanced = !showAdvanced" style="white-space: nowrap;">
|
||||
<span x-text="showAdvanced ? 'Hide Advanced' : 'Advanced'"></span>
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Advanced Options (Collapsible) -->
|
||||
<div class="viz-advanced-section" x-show="showAdvanced" x-transition.opacity.duration.200ms>
|
||||
<h3 style="margin-top: 0; margin-bottom: 16px; font-size: 16px;">Advanced Options</h3>
|
||||
|
||||
<div class="viz-advanced-grid">
|
||||
<div class="viz-control-group">
|
||||
<label>Document Types</label>
|
||||
<select x-model="docTypes" multiple>
|
||||
<option value="">All Types (cross-app search)</option>
|
||||
<option value="note">Notes</option>
|
||||
<option value="file">Files</option>
|
||||
<option value="calendar">Calendar Events</option>
|
||||
<option value="contact">Contacts</option>
|
||||
<option value="deck">Deck Cards</option>
|
||||
</select>
|
||||
<small style="color: #666; display: block; margin-top: 4px;">
|
||||
Hold Ctrl/Cmd to select multiple
|
||||
</small>
|
||||
</div>
|
||||
|
||||
<div>
|
||||
<div class="viz-control-group">
|
||||
<label>Score Threshold (Semantic/Hybrid)</label>
|
||||
<input type="number" x-model.number="scoreThreshold" min="0" max="1" step="0.1" />
|
||||
</div>
|
||||
|
||||
<div class="viz-control-group">
|
||||
<label>Result Limit</label>
|
||||
<input type="number" x-model.number="limit" min="1" max="100" />
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Info: BM25 Hybrid fusion methods -->
|
||||
<div x-show="algorithm === 'bm25_hybrid'" style="margin-top: 16px; padding: 12px; background: #e9ecef; border-radius: 4px;">
|
||||
<p style="margin: 0; font-size: 14px; color: #666;">
|
||||
<strong>BM25 Hybrid Search:</strong> Combines dense semantic vectors with sparse BM25 keyword vectors.
|
||||
</p>
|
||||
<p style="margin: 8px 0 0 0; font-size: 13px; color: #666;">
|
||||
<strong>RRF:</strong> Reciprocal Rank Fusion - Rank-based fusion producing scores in [0.0, 1.0]
|
||||
</p>
|
||||
<p style="margin: 4px 0 0 0; font-size: 13px; color: #666;">
|
||||
<strong>DBSF:</strong> Distribution-Based Score Fusion - Sums normalized scores (can exceed 1.0)
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</form>
|
||||
</div>
|
||||
|
||||
<div class="viz-card">
|
||||
<div id="viz-plot-container">
|
||||
<div x-show="loading" class="viz-loading-overlay" x-transition.opacity.duration.200ms>
|
||||
Executing search and computing PCA projection...
|
||||
</div>
|
||||
<div id="viz-plot" x-show="!loading" x-transition.opacity.duration.200ms></div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="viz-card">
|
||||
<h3>Search Results (<span x-text="loading ? '...' : results.length"></span>)</h3>
|
||||
|
||||
<div x-show="loading" class="viz-loading" x-transition.opacity.duration.200ms>
|
||||
Loading results...
|
||||
</div>
|
||||
|
||||
<div x-show="!loading && results.length === 0" class="viz-no-results" x-transition.opacity.duration.200ms>
|
||||
No results found. Try a different query or adjust your search parameters.
|
||||
</div>
|
||||
|
||||
<template x-if="!loading && results.length > 0">
|
||||
<div x-transition.opacity.duration.200ms>
|
||||
<template x-for="result in results" :key="result.id">
|
||||
<div style="padding: 12px; border-bottom: 1px solid #eee;">
|
||||
<a :href="getNextcloudUrl(result)" target="_blank" style="font-weight: 500; color: #0066cc; text-decoration: none;">
|
||||
<span x-text="result.title"></span>
|
||||
</a>
|
||||
<div style="font-size: 14px; color: #666; margin-top: 4px;" x-text="result.excerpt"></div>
|
||||
<div style="font-size: 12px; color: #999; margin-top: 4px;">
|
||||
Score: <span x-text="result.score.toFixed(3)"></span> |
|
||||
Type: <span x-text="result.doc_type"></span>
|
||||
</div>
|
||||
|
||||
<!-- Show Chunk button (only if chunk position is available) -->
|
||||
<template x-if="hasChunkPosition(result)">
|
||||
<button
|
||||
class="chunk-toggle-btn"
|
||||
@click="toggleChunk(result)"
|
||||
x-text="isChunkExpanded(`${result.doc_type}_${result.id}`) ? 'Hide Chunk' : 'Show Chunk'"
|
||||
></button>
|
||||
</template>
|
||||
|
||||
<!-- Chunk context (expanded inline) -->
|
||||
<template x-if="isChunkExpanded(`${result.doc_type}_${result.id}`)">
|
||||
<div class="chunk-context" x-transition.opacity.duration.200ms>
|
||||
<template x-if="chunkLoading[`${result.doc_type}_${result.id}`]">
|
||||
<div style="color: #666; font-style: italic;">Loading chunk...</div>
|
||||
</template>
|
||||
<template x-if="!chunkLoading[`${result.doc_type}_${result.id}`]">
|
||||
<div>
|
||||
<template x-if="expandedChunks[`${result.doc_type}_${result.id}`]?.has_more_before">
|
||||
<span class="chunk-ellipsis">...</span>
|
||||
</template>
|
||||
<span class="chunk-text" x-text="expandedChunks[`${result.doc_type}_${result.id}`]?.before_context"></span><span class="chunk-matched" x-text="expandedChunks[`${result.doc_type}_${result.id}`]?.chunk_text"></span><span class="chunk-text" x-text="expandedChunks[`${result.doc_type}_${result.id}`]?.after_context"></span><template x-if="expandedChunks[`${result.doc_type}_${result.id}`]?.has_more_after">
|
||||
<span class="chunk-ellipsis">...</span>
|
||||
</template>
|
||||
</div>
|
||||
</template>
|
||||
</div>
|
||||
</template>
|
||||
</div>
|
||||
</template>
|
||||
</div>
|
||||
</template>
|
||||
</div>
|
||||
</div>
|
||||
@@ -677,12 +677,15 @@ async def user_info_html(request: Request) -> HTMLResponse:
|
||||
return {{
|
||||
query: '',
|
||||
algorithm: 'bm25_hybrid',
|
||||
fusion: 'rrf', // Default fusion method for BM25 Hybrid
|
||||
showAdvanced: false,
|
||||
docTypes: [''], // Default to "All Types"
|
||||
limit: 50,
|
||||
scoreThreshold: 0.0,
|
||||
loading: false,
|
||||
results: [],
|
||||
expandedChunks: {{}}, // Track which chunks are expanded (result_id -> chunk data)
|
||||
chunkLoading: {{}}, // Track loading state per result
|
||||
|
||||
async executeSearch() {{
|
||||
this.loading = true;
|
||||
@@ -696,6 +699,11 @@ async def user_info_html(request: Request) -> HTMLResponse:
|
||||
score_threshold: this.scoreThreshold,
|
||||
}});
|
||||
|
||||
// Add fusion parameter for BM25 Hybrid
|
||||
if (this.algorithm === 'bm25_hybrid') {{
|
||||
params.append('fusion', this.fusion);
|
||||
}}
|
||||
|
||||
// Add doc_types parameter (filter out empty string for "All Types")
|
||||
const selectedTypes = this.docTypes.filter(t => t !== '');
|
||||
if (selectedTypes.length > 0) {{
|
||||
@@ -778,6 +786,51 @@ async def user_info_html(request: Request) -> HTMLResponse:
|
||||
default:
|
||||
return `${{baseUrl}}`;
|
||||
}}
|
||||
}},
|
||||
|
||||
hasChunkPosition(result) {{
|
||||
// Check if result has position metadata
|
||||
return result.chunk_start_offset != null && result.chunk_end_offset != null;
|
||||
}},
|
||||
|
||||
isChunkExpanded(resultKey) {{
|
||||
return this.expandedChunks[resultKey] !== undefined;
|
||||
}},
|
||||
|
||||
async toggleChunk(result) {{
|
||||
const resultKey = `${{result.doc_type}}_${{result.id}}`;
|
||||
|
||||
// If already expanded, collapse
|
||||
if (this.isChunkExpanded(resultKey)) {{
|
||||
delete this.expandedChunks[resultKey];
|
||||
return;
|
||||
}}
|
||||
|
||||
// Otherwise, fetch and expand
|
||||
this.chunkLoading[resultKey] = true;
|
||||
|
||||
try {{
|
||||
const params = new URLSearchParams({{
|
||||
doc_type: result.doc_type,
|
||||
doc_id: result.id,
|
||||
start: result.chunk_start_offset,
|
||||
end: result.chunk_end_offset,
|
||||
context: 500 // 500 chars before/after
|
||||
}});
|
||||
|
||||
const response = await fetch(`/app/chunk-context?${{params}}`);
|
||||
const data = await response.json();
|
||||
|
||||
if (data.success) {{
|
||||
this.expandedChunks[resultKey] = data;
|
||||
}} else {{
|
||||
alert('Failed to load chunk: ' + data.error);
|
||||
}}
|
||||
}} catch (error) {{
|
||||
alert('Error loading chunk: ' + error.message);
|
||||
}} finally {{
|
||||
delete this.chunkLoading[resultKey];
|
||||
}}
|
||||
}}
|
||||
}}
|
||||
}}
|
||||
|
||||
@@ -12,8 +12,10 @@ All processing happens server-side following ADR-012:
|
||||
|
||||
import logging
|
||||
import time
|
||||
from pathlib import Path
|
||||
|
||||
import numpy as np
|
||||
from jinja2 import Environment, FileSystemLoader
|
||||
from starlette.authentication import requires
|
||||
from starlette.requests import Request
|
||||
from starlette.responses import HTMLResponse, JSONResponse
|
||||
@@ -28,6 +30,10 @@ from nextcloud_mcp_server.vector.qdrant_client import get_qdrant_client
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Setup Jinja2 environment for templates
|
||||
_template_dir = Path(__file__).parent / "templates"
|
||||
_jinja_env = Environment(loader=FileSystemLoader(_template_dir))
|
||||
|
||||
|
||||
@requires("authenticated", redirect="oauth_login")
|
||||
async def vector_visualization_html(request: Request) -> HTMLResponse:
|
||||
@@ -63,252 +69,9 @@ async def vector_visualization_html(request: Request) -> HTMLResponse:
|
||||
else "unknown"
|
||||
)
|
||||
|
||||
html_content = f"""
|
||||
<style>
|
||||
.viz-card {{
|
||||
background: white;
|
||||
border-radius: 8px;
|
||||
padding: 20px;
|
||||
margin-bottom: 20px;
|
||||
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
|
||||
}}
|
||||
.viz-controls {{
|
||||
margin-bottom: 20px;
|
||||
}}
|
||||
.viz-control-row {{
|
||||
display: grid;
|
||||
grid-template-columns: 2fr 1fr auto;
|
||||
gap: 12px;
|
||||
margin-bottom: 12px;
|
||||
align-items: end;
|
||||
}}
|
||||
.viz-control-group {{
|
||||
margin-bottom: 15px;
|
||||
}}
|
||||
.viz-control-group label {{
|
||||
display: block;
|
||||
margin-bottom: 5px;
|
||||
font-weight: 500;
|
||||
color: #333;
|
||||
}}
|
||||
.viz-control-group input[type="text"],
|
||||
.viz-control-group input[type="number"],
|
||||
.viz-control-group select {{
|
||||
width: 100%;
|
||||
padding: 8px 12px;
|
||||
border: 1px solid #ddd;
|
||||
border-radius: 4px;
|
||||
font-size: 14px;
|
||||
}}
|
||||
.viz-control-group input[type="range"] {{
|
||||
width: 100%;
|
||||
}}
|
||||
.viz-control-group select[multiple] {{
|
||||
min-height: 100px;
|
||||
}}
|
||||
.viz-weight-display {{
|
||||
display: inline-block;
|
||||
min-width: 40px;
|
||||
text-align: right;
|
||||
color: #666;
|
||||
}}
|
||||
.viz-btn {{
|
||||
background: #0066cc;
|
||||
color: white;
|
||||
border: none;
|
||||
padding: 10px 20px;
|
||||
border-radius: 4px;
|
||||
cursor: pointer;
|
||||
font-size: 14px;
|
||||
font-weight: 500;
|
||||
}}
|
||||
.viz-btn:hover {{
|
||||
background: #0052a3;
|
||||
}}
|
||||
.viz-btn-secondary {{
|
||||
background: #6c757d;
|
||||
color: white;
|
||||
border: none;
|
||||
padding: 6px 12px;
|
||||
border-radius: 4px;
|
||||
cursor: pointer;
|
||||
font-size: 13px;
|
||||
margin-bottom: 12px;
|
||||
}}
|
||||
.viz-btn-secondary:hover {{
|
||||
background: #5a6268;
|
||||
}}
|
||||
#viz-plot-container {{
|
||||
width: 100%;
|
||||
height: 600px;
|
||||
position: relative;
|
||||
}}
|
||||
#viz-plot {{
|
||||
width: 100%;
|
||||
height: 100%;
|
||||
}}
|
||||
.viz-loading {{
|
||||
text-align: center;
|
||||
padding: 40px;
|
||||
color: #666;
|
||||
}}
|
||||
.viz-loading-overlay {{
|
||||
position: absolute;
|
||||
inset: 0;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
background: white;
|
||||
color: #666;
|
||||
}}
|
||||
.viz-no-results {{
|
||||
text-align: center;
|
||||
padding: 40px;
|
||||
color: #666;
|
||||
font-style: italic;
|
||||
}}
|
||||
.viz-advanced-section {{
|
||||
margin-top: 16px;
|
||||
padding: 16px;
|
||||
background: #f8f9fa;
|
||||
border-radius: 4px;
|
||||
border: 1px solid #dee2e6;
|
||||
}}
|
||||
.viz-advanced-grid {{
|
||||
display: grid;
|
||||
grid-template-columns: 1fr 1fr;
|
||||
gap: 20px;
|
||||
}}
|
||||
.viz-info-box {{
|
||||
background: #e3f2fd;
|
||||
border-left: 4px solid #2196f3;
|
||||
padding: 12px;
|
||||
margin-bottom: 20px;
|
||||
font-size: 14px;
|
||||
}}
|
||||
</style>
|
||||
|
||||
<div x-data="vizApp()">
|
||||
<div class="viz-card">
|
||||
<h2>Vector Visualization</h2>
|
||||
<div class="viz-info-box">
|
||||
Testing search algorithms on your indexed documents. User: <strong>{username}</strong>
|
||||
</div>
|
||||
|
||||
<form @submit.prevent="executeSearch">
|
||||
<div class="viz-controls">
|
||||
<!-- Main Controls -->
|
||||
<div class="viz-control-group">
|
||||
<label>Search Query</label>
|
||||
<input type="text" x-model="query" placeholder="Enter search query..." required />
|
||||
</div>
|
||||
|
||||
<div class="viz-control-row">
|
||||
<div class="viz-control-group" style="margin-bottom: 0;">
|
||||
<label>Algorithm</label>
|
||||
<select x-model="algorithm">
|
||||
<option value="semantic">Semantic (Dense Vectors)</option>
|
||||
<option value="bm25_hybrid" selected>BM25 Hybrid (Dense + Sparse RRF)</option>
|
||||
</select>
|
||||
</div>
|
||||
|
||||
<div style="display: flex; align-items: flex-end;">
|
||||
<button type="submit" class="viz-btn" style="width: 100%;">Search & Visualize</button>
|
||||
</div>
|
||||
|
||||
<div style="display: flex; align-items: flex-end;">
|
||||
<button type="button" class="viz-btn-secondary" @click="showAdvanced = !showAdvanced" style="white-space: nowrap;">
|
||||
<span x-text="showAdvanced ? 'Hide Advanced' : 'Advanced'"></span>
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Advanced Options (Collapsible) -->
|
||||
<div class="viz-advanced-section" x-show="showAdvanced" x-transition.opacity.duration.200ms>
|
||||
<h3 style="margin-top: 0; margin-bottom: 16px; font-size: 16px;">Advanced Options</h3>
|
||||
|
||||
<div class="viz-advanced-grid">
|
||||
<div class="viz-control-group">
|
||||
<label>Document Types</label>
|
||||
<select x-model="docTypes" multiple>
|
||||
<option value="">All Types (cross-app search)</option>
|
||||
<option value="note">Notes</option>
|
||||
<option value="file">Files</option>
|
||||
<option value="calendar">Calendar Events</option>
|
||||
<option value="contact">Contacts</option>
|
||||
<option value="deck">Deck Cards</option>
|
||||
</select>
|
||||
<small style="color: #666; display: block; margin-top: 4px;">
|
||||
Hold Ctrl/Cmd to select multiple
|
||||
</small>
|
||||
</div>
|
||||
|
||||
<div>
|
||||
<div class="viz-control-group">
|
||||
<label>Score Threshold (Semantic/Hybrid)</label>
|
||||
<input type="number" x-model.number="scoreThreshold" min="0" max="1" step="0.1" />
|
||||
</div>
|
||||
|
||||
<div class="viz-control-group">
|
||||
<label>Result Limit</label>
|
||||
<input type="number" x-model.number="limit" min="1" max="100" />
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Info: BM25 Hybrid uses native RRF fusion (no manual weights) -->
|
||||
<div x-show="algorithm === 'bm25_hybrid'" style="margin-top: 16px; padding: 12px; background: #e9ecef; border-radius: 4px;">
|
||||
<p style="margin: 0; font-size: 14px; color: #666;">
|
||||
<strong>BM25 Hybrid Search:</strong> Uses Qdrant's native Reciprocal Rank Fusion (RRF)
|
||||
to automatically combine dense semantic vectors with sparse BM25 keyword vectors.
|
||||
No manual weight tuning required.
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</form>
|
||||
</div>
|
||||
|
||||
<div class="viz-card">
|
||||
<div id="viz-plot-container">
|
||||
<div x-show="loading" class="viz-loading-overlay" x-transition.opacity.duration.200ms>
|
||||
Executing search and computing PCA projection...
|
||||
</div>
|
||||
<div id="viz-plot" x-show="!loading" x-transition.opacity.duration.200ms></div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="viz-card">
|
||||
<h3>Search Results (<span x-text="loading ? '...' : results.length"></span>)</h3>
|
||||
|
||||
<div x-show="loading" class="viz-loading" x-transition.opacity.duration.200ms>
|
||||
Loading results...
|
||||
</div>
|
||||
|
||||
<div x-show="!loading && results.length === 0" class="viz-no-results" x-transition.opacity.duration.200ms>
|
||||
No results found. Try a different query or adjust your search parameters.
|
||||
</div>
|
||||
|
||||
<template x-if="!loading && results.length > 0">
|
||||
<div x-transition.opacity.duration.200ms>
|
||||
<template x-for="result in results" :key="result.id">
|
||||
<div style="padding: 12px; border-bottom: 1px solid #eee;">
|
||||
<a :href="getNextcloudUrl(result)" target="_blank" style="font-weight: 500; color: #0066cc; text-decoration: none;">
|
||||
<span x-text="result.title"></span>
|
||||
</a>
|
||||
<div style="font-size: 14px; color: #666; margin-top: 4px;" x-text="result.excerpt"></div>
|
||||
<div style="font-size: 12px; color: #999; margin-top: 4px;">
|
||||
Score: <span x-text="result.score.toFixed(3)"></span> |
|
||||
Type: <span x-text="result.doc_type"></span>
|
||||
</div>
|
||||
</div>
|
||||
</template>
|
||||
</div>
|
||||
</template>
|
||||
</div>
|
||||
</div>
|
||||
"""
|
||||
|
||||
# Load and render template
|
||||
template = _jinja_env.get_template("vector_viz.html")
|
||||
html_content = template.render(username=username)
|
||||
return HTMLResponse(content=html_content)
|
||||
|
||||
|
||||
@@ -352,6 +115,7 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
|
||||
algorithm = request.query_params.get("algorithm", "bm25_hybrid")
|
||||
limit = int(request.query_params.get("limit", "50"))
|
||||
score_threshold = float(request.query_params.get("score_threshold", "0.0"))
|
||||
fusion = request.query_params.get("fusion", "rrf") # Default to RRF
|
||||
|
||||
# Parse doc_types (comma-separated list, None = all types)
|
||||
doc_types_param = request.query_params.get("doc_types", "")
|
||||
@@ -359,7 +123,7 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
|
||||
|
||||
logger.info(
|
||||
f"Viz search: user={username}, query='{query}', "
|
||||
f"algorithm={algorithm}, limit={limit}, doc_types={doc_types}"
|
||||
f"algorithm={algorithm}, fusion={fusion}, limit={limit}, doc_types={doc_types}"
|
||||
)
|
||||
|
||||
try:
|
||||
@@ -377,7 +141,9 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
|
||||
if algorithm == "semantic":
|
||||
search_algo = SemanticSearchAlgorithm(score_threshold=score_threshold)
|
||||
elif algorithm == "bm25_hybrid":
|
||||
search_algo = BM25HybridSearchAlgorithm(score_threshold=score_threshold)
|
||||
search_algo = BM25HybridSearchAlgorithm(
|
||||
score_threshold=score_threshold, fusion=fusion
|
||||
)
|
||||
else:
|
||||
return JSONResponse(
|
||||
{"success": False, "error": f"Unknown algorithm: {algorithm}"},
|
||||
@@ -552,6 +318,8 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
|
||||
"title": r.title,
|
||||
"excerpt": r.excerpt,
|
||||
"score": r.score,
|
||||
"chunk_start_offset": r.chunk_start_offset,
|
||||
"chunk_end_offset": r.chunk_end_offset,
|
||||
}
|
||||
for r in search_results
|
||||
]
|
||||
@@ -594,3 +362,125 @@ async def vector_visualization_search(request: Request) -> JSONResponse:
|
||||
{"success": False, "error": str(e)},
|
||||
status_code=500,
|
||||
)
|
||||
|
||||
|
||||
@requires("authenticated", redirect="oauth_login")
|
||||
async def chunk_context_endpoint(request: Request) -> JSONResponse:
|
||||
"""Fetch chunk text with surrounding context for visualization.
|
||||
|
||||
This endpoint retrieves the matched chunk along with surrounding text
|
||||
to provide context for the search result. Used by the viz pane to
|
||||
display chunks inline.
|
||||
|
||||
Query parameters:
|
||||
doc_type: Document type (e.g., "note")
|
||||
doc_id: Document ID
|
||||
start: Chunk start offset (character position)
|
||||
end: Chunk end offset (character position)
|
||||
context: Characters of context before/after (default: 500)
|
||||
|
||||
Returns:
|
||||
JSON with chunk_text, before_context, after_context, and flags
|
||||
"""
|
||||
try:
|
||||
# Get query parameters
|
||||
doc_type = request.query_params.get("doc_type")
|
||||
doc_id = request.query_params.get("doc_id")
|
||||
start_str = request.query_params.get("start")
|
||||
end_str = request.query_params.get("end")
|
||||
context_chars = int(request.query_params.get("context", "500"))
|
||||
|
||||
# Validate required parameters
|
||||
if not all([doc_type, doc_id, start_str, end_str]):
|
||||
return JSONResponse(
|
||||
{
|
||||
"success": False,
|
||||
"error": "Missing required parameters: doc_type, doc_id, start, end",
|
||||
},
|
||||
status_code=400,
|
||||
)
|
||||
|
||||
start = int(start_str)
|
||||
end = int(end_str)
|
||||
|
||||
# Currently only support notes
|
||||
if doc_type != "note":
|
||||
return JSONResponse(
|
||||
{"success": False, "error": f"Unsupported doc_type: {doc_type}"},
|
||||
status_code=400,
|
||||
)
|
||||
|
||||
# Get authenticated HTTP client and fetch note
|
||||
from nextcloud_mcp_server.auth.userinfo_routes import (
|
||||
_get_authenticated_client_for_userinfo,
|
||||
)
|
||||
from nextcloud_mcp_server.client.notes import NotesClient
|
||||
|
||||
# Get username from request auth
|
||||
username = (
|
||||
request.user.display_name
|
||||
if hasattr(request.user, "display_name")
|
||||
else "unknown"
|
||||
)
|
||||
|
||||
# Create notes client with authenticated HTTP client
|
||||
http_client = await _get_authenticated_client_for_userinfo(request)
|
||||
notes_client = NotesClient(http_client, username)
|
||||
|
||||
# Fetch full note content
|
||||
note = await notes_client.get_note(int(doc_id))
|
||||
full_content = f"{note['title']}\n\n{note['content']}"
|
||||
|
||||
# Validate offsets
|
||||
if start < 0 or end > len(full_content) or start >= end:
|
||||
return JSONResponse(
|
||||
{
|
||||
"success": False,
|
||||
"error": f"Invalid offsets: start={start}, end={end}, content_length={len(full_content)}",
|
||||
},
|
||||
status_code=400,
|
||||
)
|
||||
|
||||
# Extract chunk
|
||||
chunk_text = full_content[start:end]
|
||||
|
||||
# Extract context before and after
|
||||
before_start = max(0, start - context_chars)
|
||||
before_context = full_content[before_start:start]
|
||||
|
||||
after_end = min(len(full_content), end + context_chars)
|
||||
after_context = full_content[end:after_end]
|
||||
|
||||
# Determine if there's more content
|
||||
has_more_before = before_start > 0
|
||||
has_more_after = after_end < len(full_content)
|
||||
|
||||
logger.info(
|
||||
f"Fetched chunk context for {doc_type}_{doc_id}: "
|
||||
f"chunk_len={len(chunk_text)}, before_len={len(before_context)}, "
|
||||
f"after_len={len(after_context)}"
|
||||
)
|
||||
|
||||
return JSONResponse(
|
||||
{
|
||||
"success": True,
|
||||
"chunk_text": chunk_text,
|
||||
"before_context": before_context,
|
||||
"after_context": after_context,
|
||||
"has_more_before": has_more_before,
|
||||
"has_more_after": has_more_after,
|
||||
}
|
||||
)
|
||||
|
||||
except ValueError as e:
|
||||
logger.error(f"Invalid parameter format: {e}")
|
||||
return JSONResponse(
|
||||
{"success": False, "error": f"Invalid parameter format: {e}"},
|
||||
status_code=400,
|
||||
)
|
||||
except Exception as e:
|
||||
logger.error(f"Chunk context error: {e}", exc_info=True)
|
||||
return JSONResponse(
|
||||
{"success": False, "error": str(e)},
|
||||
status_code=500,
|
||||
)
|
||||
|
||||
@@ -1,57 +1,30 @@
|
||||
"""Embedding service with provider detection."""
|
||||
"""Embedding service with provider detection.
|
||||
|
||||
DEPRECATED: This module is maintained for backward compatibility.
|
||||
New code should use nextcloud_mcp_server.providers.get_provider() directly.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
|
||||
from .base import EmbeddingProvider
|
||||
from nextcloud_mcp_server.providers import get_provider
|
||||
|
||||
from .bm25_provider import BM25SparseEmbeddingProvider
|
||||
from .ollama_provider import OllamaEmbeddingProvider
|
||||
from .simple_provider import SimpleEmbeddingProvider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class EmbeddingService:
|
||||
"""Unified embedding service with automatic provider detection."""
|
||||
"""
|
||||
Unified embedding service with automatic provider detection.
|
||||
|
||||
DEPRECATED: This class wraps the new unified provider infrastructure
|
||||
for backward compatibility. New code should use
|
||||
nextcloud_mcp_server.providers.get_provider() directly.
|
||||
"""
|
||||
|
||||
def __init__(self):
|
||||
"""Initialize embedding service with auto-detected provider."""
|
||||
self.provider = self._detect_provider()
|
||||
|
||||
def _detect_provider(self) -> EmbeddingProvider:
|
||||
"""
|
||||
Auto-detect available embedding provider.
|
||||
|
||||
Checks environment variables in order:
|
||||
1. OLLAMA_BASE_URL - Use Ollama provider (production)
|
||||
2. OPENAI_API_KEY - Use OpenAI provider (future)
|
||||
3. Fallback to SimpleEmbeddingProvider (testing/development)
|
||||
|
||||
Returns:
|
||||
Configured embedding provider
|
||||
"""
|
||||
# Ollama provider (production)
|
||||
ollama_url = os.getenv("OLLAMA_BASE_URL")
|
||||
if ollama_url:
|
||||
logger.info(f"Using Ollama embedding provider: {ollama_url}")
|
||||
return OllamaEmbeddingProvider(
|
||||
base_url=ollama_url,
|
||||
model=os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text"),
|
||||
verify_ssl=os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true",
|
||||
)
|
||||
|
||||
# OpenAI provider (future implementation)
|
||||
# openai_key = os.getenv("OPENAI_API_KEY")
|
||||
# if openai_key:
|
||||
# return OpenAIEmbeddingProvider(api_key=openai_key)
|
||||
|
||||
# Fallback to simple provider for development/testing
|
||||
logger.warning(
|
||||
"No embedding provider configured (OLLAMA_BASE_URL or OPENAI_API_KEY not set). "
|
||||
"Using SimpleEmbeddingProvider for testing/development. "
|
||||
"For production, configure an external embedding service."
|
||||
)
|
||||
return SimpleEmbeddingProvider(dimension=384)
|
||||
self.provider = get_provider()
|
||||
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
|
||||
@@ -19,9 +19,22 @@ class SemanticSearchResult(BaseModel):
|
||||
default="", description="Document category (notes) or location (calendar)"
|
||||
)
|
||||
excerpt: str = Field(description="Excerpt from matching chunk")
|
||||
score: float = Field(description="Semantic similarity score (0-1)")
|
||||
score: float = Field(
|
||||
description=(
|
||||
"Relevance score (≥ 0.0, higher is better). "
|
||||
"Score range depends on fusion method: "
|
||||
"RRF produces scores in [0.0, 1.0], "
|
||||
"DBSF can exceed 1.0 (sum of normalized scores from multiple systems)"
|
||||
)
|
||||
)
|
||||
chunk_index: int = Field(description="Index of matching chunk in document")
|
||||
total_chunks: int = Field(description="Total number of chunks in document")
|
||||
chunk_start_offset: Optional[int] = Field(
|
||||
default=None, description="Character position where chunk starts in document"
|
||||
)
|
||||
chunk_end_offset: Optional[int] = Field(
|
||||
default=None, description="Character position where chunk ends in document"
|
||||
)
|
||||
|
||||
|
||||
class SemanticSearchResponse(BaseResponse):
|
||||
|
||||
@@ -0,0 +1,18 @@
|
||||
"""Unified provider infrastructure for embeddings and text generation."""
|
||||
|
||||
from .anthropic import AnthropicProvider
|
||||
from .base import Provider
|
||||
from .bedrock import BedrockProvider
|
||||
from .ollama import OllamaProvider
|
||||
from .registry import get_provider, reset_provider
|
||||
from .simple import SimpleProvider
|
||||
|
||||
__all__ = [
|
||||
"Provider",
|
||||
"OllamaProvider",
|
||||
"AnthropicProvider",
|
||||
"SimpleProvider",
|
||||
"BedrockProvider",
|
||||
"get_provider",
|
||||
"reset_provider",
|
||||
]
|
||||
@@ -0,0 +1,97 @@
|
||||
"""Unified Anthropic provider for text generation."""
|
||||
|
||||
import logging
|
||||
|
||||
from anthropic import AsyncAnthropic
|
||||
|
||||
from .base import Provider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class AnthropicProvider(Provider):
|
||||
"""
|
||||
Anthropic provider for text generation.
|
||||
|
||||
Supports Claude models via the Anthropic API.
|
||||
Note: Anthropic doesn't provide embedding models, only text generation.
|
||||
"""
|
||||
|
||||
def __init__(self, api_key: str, model: str = "claude-3-5-sonnet-20241022"):
|
||||
"""
|
||||
Initialize Anthropic provider.
|
||||
|
||||
Args:
|
||||
api_key: Anthropic API key
|
||||
model: Model name (e.g., "claude-3-5-sonnet-20241022")
|
||||
"""
|
||||
self.client = AsyncAnthropic(api_key=api_key)
|
||||
self.model = model
|
||||
|
||||
logger.info(f"Initialized Anthropic provider (model={model})")
|
||||
|
||||
@property
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
return False
|
||||
|
||||
@property
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
return True
|
||||
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
Generate embedding vector for text.
|
||||
|
||||
Raises:
|
||||
NotImplementedError: Anthropic doesn't provide embedding models
|
||||
"""
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
|
||||
)
|
||||
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""
|
||||
Generate embeddings for multiple texts.
|
||||
|
||||
Raises:
|
||||
NotImplementedError: Anthropic doesn't provide embedding models
|
||||
"""
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
|
||||
)
|
||||
|
||||
def get_dimension(self) -> int:
|
||||
"""
|
||||
Get embedding dimension.
|
||||
|
||||
Raises:
|
||||
NotImplementedError: Anthropic doesn't provide embedding models
|
||||
"""
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported by Anthropic - use Ollama or Bedrock for embeddings"
|
||||
)
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""
|
||||
Generate text using Anthropic API.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
"""
|
||||
message = await self.client.messages.create(
|
||||
model=self.model,
|
||||
max_tokens=max_tokens,
|
||||
temperature=0.7,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
)
|
||||
return message.content[0].text
|
||||
|
||||
async def close(self) -> None:
|
||||
"""Close the client (no-op for Anthropic SDK)."""
|
||||
pass
|
||||
@@ -0,0 +1,91 @@
|
||||
"""Unified provider interface for embeddings and text generation."""
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
|
||||
|
||||
class Provider(ABC):
|
||||
"""
|
||||
Unified base class for LLM providers.
|
||||
|
||||
Providers can support embeddings, text generation, or both.
|
||||
Use capability properties to determine what features are available.
|
||||
"""
|
||||
|
||||
@property
|
||||
@abstractmethod
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
pass
|
||||
|
||||
@property
|
||||
@abstractmethod
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
Generate embedding vector for text.
|
||||
|
||||
Args:
|
||||
text: Input text to embed
|
||||
|
||||
Returns:
|
||||
Vector embedding as list of floats
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If provider doesn't support embeddings
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""
|
||||
Generate embeddings for multiple texts (optimized).
|
||||
|
||||
Args:
|
||||
texts: List of texts to embed
|
||||
|
||||
Returns:
|
||||
List of vector embeddings
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If provider doesn't support embeddings
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def get_dimension(self) -> int:
|
||||
"""
|
||||
Get embedding dimension for this provider.
|
||||
|
||||
Returns:
|
||||
Vector dimension (e.g., 768 for nomic-embed-text)
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If provider doesn't support embeddings
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""
|
||||
Generate text from a prompt.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If provider doesn't support generation
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def close(self) -> None:
|
||||
"""Close the provider and release resources."""
|
||||
pass
|
||||
@@ -0,0 +1,397 @@
|
||||
"""Amazon Bedrock provider for embeddings and text generation."""
|
||||
|
||||
import json
|
||||
import logging
|
||||
from typing import Any
|
||||
|
||||
try:
|
||||
import boto3
|
||||
from botocore.exceptions import BotoCoreError, ClientError
|
||||
|
||||
BOTO3_AVAILABLE = True
|
||||
except ImportError:
|
||||
BOTO3_AVAILABLE = False
|
||||
|
||||
from .base import Provider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class BedrockProvider(Provider):
|
||||
"""
|
||||
Amazon Bedrock provider supporting both embeddings and text generation.
|
||||
|
||||
Uses AWS Bedrock Runtime API with boto3. Supports various model families:
|
||||
- Embeddings: amazon.titan-embed-text-v1, amazon.titan-embed-text-v2, cohere.embed-*
|
||||
- Text Generation: anthropic.claude-*, meta.llama3-*, amazon.titan-text-*, mistral.*, etc.
|
||||
|
||||
Requires AWS credentials configured via:
|
||||
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION)
|
||||
- AWS credentials file (~/.aws/credentials)
|
||||
- IAM role (when running on AWS)
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
region_name: str | None = None,
|
||||
embedding_model: str | None = None,
|
||||
generation_model: str | None = None,
|
||||
aws_access_key_id: str | None = None,
|
||||
aws_secret_access_key: str | None = None,
|
||||
):
|
||||
"""
|
||||
Initialize Bedrock provider.
|
||||
|
||||
Args:
|
||||
region_name: AWS region (e.g., "us-east-1"). Defaults to AWS_REGION env var.
|
||||
embedding_model: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0").
|
||||
None disables embeddings.
|
||||
generation_model: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0").
|
||||
None disables generation.
|
||||
aws_access_key_id: AWS access key (optional, uses default credential chain if not provided)
|
||||
aws_secret_access_key: AWS secret key (optional, uses default credential chain if not provided)
|
||||
|
||||
Raises:
|
||||
ImportError: If boto3 is not installed
|
||||
"""
|
||||
if not BOTO3_AVAILABLE:
|
||||
raise ImportError(
|
||||
"boto3 is required for Bedrock provider. Install with: pip install boto3"
|
||||
)
|
||||
|
||||
self.embedding_model = embedding_model
|
||||
self.generation_model = generation_model
|
||||
self._dimension: int | None = None # Detected dynamically
|
||||
|
||||
# Initialize bedrock-runtime client
|
||||
client_kwargs: dict[str, Any] = {}
|
||||
if region_name:
|
||||
client_kwargs["region_name"] = region_name
|
||||
if aws_access_key_id:
|
||||
client_kwargs["aws_access_key_id"] = aws_access_key_id
|
||||
if aws_secret_access_key:
|
||||
client_kwargs["aws_secret_access_key"] = aws_secret_access_key
|
||||
|
||||
self.client = boto3.client("bedrock-runtime", **client_kwargs)
|
||||
|
||||
logger.info(
|
||||
f"Initialized Bedrock provider in region {region_name or 'default'} "
|
||||
f"(embedding_model={embedding_model}, generation_model={generation_model})"
|
||||
)
|
||||
|
||||
@property
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
return self.embedding_model is not None
|
||||
|
||||
@property
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
return self.generation_model is not None
|
||||
|
||||
def _create_embedding_request(self, text: str) -> dict[str, Any]:
|
||||
"""
|
||||
Create model-specific embedding request payload.
|
||||
|
||||
Args:
|
||||
text: Input text to embed
|
||||
|
||||
Returns:
|
||||
Request payload dict for the embedding model
|
||||
"""
|
||||
if not self.embedding_model:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
# Titan Embed models
|
||||
if self.embedding_model.startswith("amazon.titan-embed"):
|
||||
return {"inputText": text}
|
||||
|
||||
# Cohere Embed models
|
||||
elif self.embedding_model.startswith("cohere.embed"):
|
||||
return {"texts": [text], "input_type": "search_document"}
|
||||
|
||||
# Unknown model - try Titan format as default
|
||||
else:
|
||||
logger.warning(
|
||||
f"Unknown embedding model format for {self.embedding_model}, "
|
||||
"using Titan format as default"
|
||||
)
|
||||
return {"inputText": text}
|
||||
|
||||
def _parse_embedding_response(self, response: dict[str, Any]) -> list[float]:
|
||||
"""
|
||||
Parse model-specific embedding response.
|
||||
|
||||
Args:
|
||||
response: Raw response from Bedrock
|
||||
|
||||
Returns:
|
||||
Embedding vector as list of floats
|
||||
"""
|
||||
# Titan Embed models
|
||||
if self.embedding_model and self.embedding_model.startswith(
|
||||
"amazon.titan-embed"
|
||||
):
|
||||
return response["embedding"]
|
||||
|
||||
# Cohere Embed models
|
||||
elif self.embedding_model and self.embedding_model.startswith("cohere.embed"):
|
||||
return response["embeddings"][0]
|
||||
|
||||
# Unknown model - try Titan format as default
|
||||
else:
|
||||
logger.warning(
|
||||
f"Unknown embedding response format for {self.embedding_model}, "
|
||||
"trying Titan format"
|
||||
)
|
||||
return response.get("embedding", response.get("embeddings", [None])[0])
|
||||
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
Generate embedding vector for text.
|
||||
|
||||
Args:
|
||||
text: Input text to embed
|
||||
|
||||
Returns:
|
||||
Vector embedding as list of floats
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
ClientError: If Bedrock API call fails
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
try:
|
||||
request_body = self._create_embedding_request(text)
|
||||
|
||||
response = self.client.invoke_model(
|
||||
modelId=self.embedding_model,
|
||||
body=json.dumps(request_body),
|
||||
accept="application/json",
|
||||
contentType="application/json",
|
||||
)
|
||||
|
||||
response_body = json.loads(response["body"].read())
|
||||
embedding = self._parse_embedding_response(response_body)
|
||||
|
||||
return embedding
|
||||
|
||||
except (BotoCoreError, ClientError) as e:
|
||||
logger.error(f"Bedrock embedding error: {e}")
|
||||
raise
|
||||
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""
|
||||
Generate embeddings for multiple texts.
|
||||
|
||||
Note: Current implementation sends requests sequentially.
|
||||
Future optimization could use asyncio for concurrent requests.
|
||||
|
||||
Args:
|
||||
texts: List of texts to embed
|
||||
|
||||
Returns:
|
||||
List of vector embeddings
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
ClientError: If Bedrock API call fails
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
embeddings = []
|
||||
for text in texts:
|
||||
embedding = await self.embed(text)
|
||||
embeddings.append(embedding)
|
||||
return embeddings
|
||||
|
||||
async def _detect_dimension(self):
|
||||
"""
|
||||
Detect embedding dimension by generating a test embedding.
|
||||
"""
|
||||
if self._dimension is None and self.supports_embeddings:
|
||||
logger.debug(
|
||||
f"Detecting embedding dimension for model {self.embedding_model}..."
|
||||
)
|
||||
test_embedding = await self.embed("test")
|
||||
self._dimension = len(test_embedding)
|
||||
logger.info(
|
||||
f"Detected embedding dimension: {self._dimension} "
|
||||
f"for model {self.embedding_model}"
|
||||
)
|
||||
|
||||
def get_dimension(self) -> int:
|
||||
"""
|
||||
Get embedding dimension.
|
||||
|
||||
Returns:
|
||||
Vector dimension for the configured embedding model
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
RuntimeError: If dimension not detected yet (call _detect_dimension first)
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
if self._dimension is None:
|
||||
raise RuntimeError(
|
||||
f"Embedding dimension not detected yet for model {self.embedding_model}. "
|
||||
"Call _detect_dimension() first or generate an embedding."
|
||||
)
|
||||
return self._dimension
|
||||
|
||||
def _create_generation_request(
|
||||
self, prompt: str, max_tokens: int
|
||||
) -> dict[str, Any]:
|
||||
"""
|
||||
Create model-specific text generation request payload.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Request payload dict for the generation model
|
||||
"""
|
||||
if not self.generation_model:
|
||||
raise NotImplementedError(
|
||||
"Text generation not supported - no generation_model configured"
|
||||
)
|
||||
|
||||
# Anthropic Claude models
|
||||
if self.generation_model.startswith("anthropic.claude"):
|
||||
return {
|
||||
"anthropic_version": "bedrock-2023-05-31",
|
||||
"max_tokens": max_tokens,
|
||||
"temperature": 0.7,
|
||||
"messages": [{"role": "user", "content": prompt}],
|
||||
}
|
||||
|
||||
# Meta Llama models
|
||||
elif self.generation_model.startswith("meta.llama"):
|
||||
return {"prompt": prompt, "max_gen_len": max_tokens, "temperature": 0.7}
|
||||
|
||||
# Amazon Titan Text models
|
||||
elif self.generation_model.startswith("amazon.titan-text"):
|
||||
return {
|
||||
"inputText": prompt,
|
||||
"textGenerationConfig": {
|
||||
"maxTokenCount": max_tokens,
|
||||
"temperature": 0.7,
|
||||
},
|
||||
}
|
||||
|
||||
# Mistral models
|
||||
elif self.generation_model.startswith("mistral"):
|
||||
return {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.7}
|
||||
|
||||
# Unknown model - try Claude format as default
|
||||
else:
|
||||
logger.warning(
|
||||
f"Unknown generation model format for {self.generation_model}, "
|
||||
"using Claude format as default"
|
||||
)
|
||||
return {
|
||||
"anthropic_version": "bedrock-2023-05-31",
|
||||
"max_tokens": max_tokens,
|
||||
"temperature": 0.7,
|
||||
"messages": [{"role": "user", "content": prompt}],
|
||||
}
|
||||
|
||||
def _parse_generation_response(self, response: dict[str, Any]) -> str:
|
||||
"""
|
||||
Parse model-specific text generation response.
|
||||
|
||||
Args:
|
||||
response: Raw response from Bedrock
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
"""
|
||||
# Anthropic Claude models
|
||||
if self.generation_model and self.generation_model.startswith(
|
||||
"anthropic.claude"
|
||||
):
|
||||
return response["content"][0]["text"]
|
||||
|
||||
# Meta Llama models
|
||||
elif self.generation_model and self.generation_model.startswith("meta.llama"):
|
||||
return response["generation"]
|
||||
|
||||
# Amazon Titan Text models
|
||||
elif self.generation_model and self.generation_model.startswith(
|
||||
"amazon.titan-text"
|
||||
):
|
||||
return response["results"][0]["outputText"]
|
||||
|
||||
# Mistral models
|
||||
elif self.generation_model and self.generation_model.startswith("mistral"):
|
||||
return response["outputs"][0]["text"]
|
||||
|
||||
# Unknown model - try common response fields
|
||||
else:
|
||||
logger.warning(
|
||||
f"Unknown generation response format for {self.generation_model}, "
|
||||
"trying common fields"
|
||||
)
|
||||
# Try common response field names
|
||||
for field in ["text", "generation", "outputText", "completion"]:
|
||||
if field in response:
|
||||
return response[field]
|
||||
# Last resort: return JSON string
|
||||
return json.dumps(response)
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""
|
||||
Generate text from a prompt.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If generation not enabled (no generation_model)
|
||||
ClientError: If Bedrock API call fails
|
||||
"""
|
||||
if not self.supports_generation:
|
||||
raise NotImplementedError(
|
||||
"Text generation not supported - no generation_model configured"
|
||||
)
|
||||
|
||||
try:
|
||||
request_body = self._create_generation_request(prompt, max_tokens)
|
||||
|
||||
response = self.client.invoke_model(
|
||||
modelId=self.generation_model,
|
||||
body=json.dumps(request_body),
|
||||
accept="application/json",
|
||||
contentType="application/json",
|
||||
)
|
||||
|
||||
response_body = json.loads(response["body"].read())
|
||||
text = self._parse_generation_response(response_body)
|
||||
|
||||
return text
|
||||
|
||||
except (BotoCoreError, ClientError) as e:
|
||||
logger.error(f"Bedrock generation error: {e}")
|
||||
raise
|
||||
|
||||
async def close(self) -> None:
|
||||
"""Close the client (no-op for boto3 clients)."""
|
||||
pass
|
||||
@@ -0,0 +1,221 @@
|
||||
"""Unified Ollama provider for embeddings and text generation."""
|
||||
|
||||
import logging
|
||||
|
||||
import httpx
|
||||
|
||||
from .base import Provider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class OllamaProvider(Provider):
|
||||
"""
|
||||
Ollama provider supporting both embeddings and text generation.
|
||||
|
||||
Supports TLS, SSL verification, and automatic model loading.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
base_url: str,
|
||||
embedding_model: str | None = None,
|
||||
generation_model: str | None = None,
|
||||
verify_ssl: bool = True,
|
||||
timeout: httpx.Timeout | None = None,
|
||||
):
|
||||
"""
|
||||
Initialize Ollama provider.
|
||||
|
||||
Args:
|
||||
base_url: Ollama API base URL (e.g., https://ollama.internal.example.com:443)
|
||||
embedding_model: Model for embeddings (e.g., "nomic-embed-text"). None disables embeddings.
|
||||
generation_model: Model for text generation (e.g., "llama3.2:1b"). None disables generation.
|
||||
verify_ssl: Verify SSL certificates (default: True)
|
||||
timeout: HTTP timeout configuration
|
||||
"""
|
||||
self.base_url = base_url.rstrip("/")
|
||||
self.embedding_model = embedding_model
|
||||
self.generation_model = generation_model
|
||||
self.verify_ssl = verify_ssl
|
||||
|
||||
if timeout is None:
|
||||
timeout = httpx.Timeout(timeout=120, connect=5)
|
||||
|
||||
self.client = httpx.AsyncClient(verify=verify_ssl, timeout=timeout)
|
||||
self._dimension: int | None = None # Detected dynamically for embeddings
|
||||
|
||||
logger.info(
|
||||
f"Initialized Ollama provider: {base_url} "
|
||||
f"(embedding_model={embedding_model}, generation_model={generation_model}, "
|
||||
f"verify_ssl={verify_ssl})"
|
||||
)
|
||||
|
||||
# Pre-check and auto-load models
|
||||
if embedding_model:
|
||||
self._check_model_is_loaded(embedding_model, autoload=True)
|
||||
if generation_model:
|
||||
self._check_model_is_loaded(generation_model, autoload=True)
|
||||
|
||||
@property
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
return self.embedding_model is not None
|
||||
|
||||
@property
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
return self.generation_model is not None
|
||||
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""
|
||||
Generate embedding vector for text.
|
||||
|
||||
Args:
|
||||
text: Input text to embed
|
||||
|
||||
Returns:
|
||||
Vector embedding as list of floats
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
response = await self.client.post(
|
||||
f"{self.base_url}/api/embeddings",
|
||||
json={"model": self.embedding_model, "prompt": text},
|
||||
)
|
||||
response.raise_for_status()
|
||||
return response.json()["embedding"]
|
||||
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""
|
||||
Generate embeddings for multiple texts (batched requests).
|
||||
|
||||
Note: Ollama doesn't have native batch API, so we send requests sequentially.
|
||||
|
||||
Args:
|
||||
texts: List of texts to embed
|
||||
|
||||
Returns:
|
||||
List of vector embeddings
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
embeddings = []
|
||||
for text in texts:
|
||||
embedding = await self.embed(text)
|
||||
embeddings.append(embedding)
|
||||
return embeddings
|
||||
|
||||
async def _detect_dimension(self):
|
||||
"""
|
||||
Detect embedding dimension by generating a test embedding.
|
||||
|
||||
This method queries the model to determine the actual dimension
|
||||
instead of relying on hardcoded values.
|
||||
"""
|
||||
if self._dimension is None and self.supports_embeddings:
|
||||
logger.debug(
|
||||
f"Detecting embedding dimension for model {self.embedding_model}..."
|
||||
)
|
||||
test_embedding = await self.embed("test")
|
||||
self._dimension = len(test_embedding)
|
||||
logger.info(
|
||||
f"Detected embedding dimension: {self._dimension} "
|
||||
f"for model {self.embedding_model}"
|
||||
)
|
||||
|
||||
def get_dimension(self) -> int:
|
||||
"""
|
||||
Get embedding dimension.
|
||||
|
||||
Returns:
|
||||
Vector dimension for the configured embedding model
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If embeddings not enabled (no embedding_model)
|
||||
RuntimeError: If dimension not detected yet (call _detect_dimension first)
|
||||
"""
|
||||
if not self.supports_embeddings:
|
||||
raise NotImplementedError(
|
||||
"Embedding not supported - no embedding_model configured"
|
||||
)
|
||||
|
||||
if self._dimension is None:
|
||||
raise RuntimeError(
|
||||
f"Embedding dimension not detected yet for model {self.embedding_model}. "
|
||||
"Call _detect_dimension() first or generate an embedding."
|
||||
)
|
||||
return self._dimension
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""
|
||||
Generate text from a prompt.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
|
||||
Raises:
|
||||
NotImplementedError: If generation not enabled (no generation_model)
|
||||
"""
|
||||
if not self.supports_generation:
|
||||
raise NotImplementedError(
|
||||
"Text generation not supported - no generation_model configured"
|
||||
)
|
||||
|
||||
response = await self.client.post(
|
||||
f"{self.base_url}/api/generate",
|
||||
json={
|
||||
"model": self.generation_model,
|
||||
"prompt": prompt,
|
||||
"stream": False,
|
||||
"options": {
|
||||
"num_predict": max_tokens,
|
||||
"temperature": 0.7,
|
||||
},
|
||||
},
|
||||
)
|
||||
response.raise_for_status()
|
||||
data = response.json()
|
||||
return data["response"]
|
||||
|
||||
def _check_model_is_loaded(self, model: str, autoload: bool = True):
|
||||
"""
|
||||
Check if model is loaded in Ollama, optionally auto-loading it.
|
||||
|
||||
Args:
|
||||
model: Model name to check
|
||||
autoload: Whether to automatically pull the model if not loaded
|
||||
"""
|
||||
response = httpx.get(f"{self.base_url}/api/tags")
|
||||
response.raise_for_status()
|
||||
|
||||
models = [m["name"] for m in response.json().get("models", [])]
|
||||
logger.info("Ollama has following models pre-loaded: %s", models)
|
||||
|
||||
if (model not in models) and autoload:
|
||||
logger.warning(
|
||||
"Model '%s' not yet available in ollama, attempting to pull now...",
|
||||
model,
|
||||
)
|
||||
response = httpx.post(f"{self.base_url}/api/pull", json={"model": model})
|
||||
response.raise_for_status()
|
||||
|
||||
async def close(self) -> None:
|
||||
"""Close HTTP client."""
|
||||
await self.client.aclose()
|
||||
@@ -0,0 +1,126 @@
|
||||
"""Provider registry and factory for auto-detection and instantiation."""
|
||||
|
||||
import logging
|
||||
import os
|
||||
|
||||
from .base import Provider
|
||||
from .bedrock import BedrockProvider
|
||||
from .ollama import OllamaProvider
|
||||
from .simple import SimpleProvider
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class ProviderRegistry:
|
||||
"""
|
||||
Registry for provider auto-detection and instantiation.
|
||||
|
||||
Checks environment variables in priority order and creates appropriate provider:
|
||||
1. Bedrock (AWS_REGION + BEDROCK_*_MODEL)
|
||||
2. Ollama (OLLAMA_BASE_URL)
|
||||
3. Simple (fallback for testing/development)
|
||||
"""
|
||||
|
||||
@staticmethod
|
||||
def create_provider() -> Provider:
|
||||
"""
|
||||
Auto-detect and create provider based on environment variables.
|
||||
|
||||
Priority order:
|
||||
1. Bedrock - if AWS_REGION or BEDROCK_EMBEDDING_MODEL is set
|
||||
2. Ollama - if OLLAMA_BASE_URL is set
|
||||
3. Simple - fallback for testing/development
|
||||
|
||||
Returns:
|
||||
Provider instance
|
||||
|
||||
Environment Variables:
|
||||
Bedrock:
|
||||
- AWS_REGION: AWS region (e.g., "us-east-1")
|
||||
- AWS_ACCESS_KEY_ID: AWS access key (optional, uses credential chain)
|
||||
- AWS_SECRET_ACCESS_KEY: AWS secret key (optional)
|
||||
- BEDROCK_EMBEDDING_MODEL: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0")
|
||||
- BEDROCK_GENERATION_MODEL: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")
|
||||
|
||||
Ollama:
|
||||
- OLLAMA_BASE_URL: Ollama API base URL (e.g., "http://localhost:11434")
|
||||
- OLLAMA_EMBEDDING_MODEL: Model for embeddings (default: "nomic-embed-text")
|
||||
- OLLAMA_GENERATION_MODEL: Model for text generation (e.g., "llama3.2:1b")
|
||||
- OLLAMA_VERIFY_SSL: Verify SSL certificates (default: "true")
|
||||
|
||||
Simple (no configuration needed, fallback):
|
||||
- SIMPLE_EMBEDDING_DIMENSION: Embedding dimension (default: 384)
|
||||
"""
|
||||
# 1. Check for Bedrock
|
||||
aws_region = os.getenv("AWS_REGION")
|
||||
bedrock_embedding_model = os.getenv("BEDROCK_EMBEDDING_MODEL")
|
||||
bedrock_generation_model = os.getenv("BEDROCK_GENERATION_MODEL")
|
||||
|
||||
if aws_region or bedrock_embedding_model or bedrock_generation_model:
|
||||
logger.info(
|
||||
f"Using Bedrock provider: region={aws_region}, "
|
||||
f"embedding_model={bedrock_embedding_model}, "
|
||||
f"generation_model={bedrock_generation_model}"
|
||||
)
|
||||
return BedrockProvider(
|
||||
region_name=aws_region,
|
||||
embedding_model=bedrock_embedding_model,
|
||||
generation_model=bedrock_generation_model,
|
||||
aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
|
||||
aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
|
||||
)
|
||||
|
||||
# 2. Check for Ollama
|
||||
ollama_url = os.getenv("OLLAMA_BASE_URL")
|
||||
if ollama_url:
|
||||
embedding_model = os.getenv("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text")
|
||||
generation_model = os.getenv("OLLAMA_GENERATION_MODEL")
|
||||
verify_ssl = os.getenv("OLLAMA_VERIFY_SSL", "true").lower() == "true"
|
||||
|
||||
logger.info(
|
||||
f"Using Ollama provider: {ollama_url}, "
|
||||
f"embedding_model={embedding_model}, "
|
||||
f"generation_model={generation_model}"
|
||||
)
|
||||
return OllamaProvider(
|
||||
base_url=ollama_url,
|
||||
embedding_model=embedding_model,
|
||||
generation_model=generation_model,
|
||||
verify_ssl=verify_ssl,
|
||||
)
|
||||
|
||||
# 3. Fallback to Simple provider for development/testing
|
||||
dimension = int(os.getenv("SIMPLE_EMBEDDING_DIMENSION", "384"))
|
||||
logger.warning(
|
||||
"No provider configured (AWS_REGION, OLLAMA_BASE_URL not set). "
|
||||
"Using SimpleProvider for testing/development. "
|
||||
"For production, configure Bedrock or Ollama."
|
||||
)
|
||||
return SimpleProvider(dimension=dimension)
|
||||
|
||||
|
||||
# Singleton instance
|
||||
_provider: Provider | None = None
|
||||
|
||||
|
||||
def get_provider() -> Provider:
|
||||
"""
|
||||
Get singleton provider instance.
|
||||
|
||||
Returns:
|
||||
Global Provider instance (auto-detected on first call)
|
||||
"""
|
||||
global _provider
|
||||
if _provider is None:
|
||||
_provider = ProviderRegistry.create_provider()
|
||||
return _provider
|
||||
|
||||
|
||||
def reset_provider():
|
||||
"""
|
||||
Reset singleton provider instance.
|
||||
|
||||
Useful for testing or reconfiguration.
|
||||
"""
|
||||
global _provider
|
||||
_provider = None
|
||||
@@ -0,0 +1,149 @@
|
||||
"""Simple in-process embedding provider for testing.
|
||||
|
||||
This provider uses a basic TF-IDF-like approach with feature hashing to generate
|
||||
deterministic embeddings without requiring external services. Suitable for testing
|
||||
but not for production use.
|
||||
"""
|
||||
|
||||
import hashlib
|
||||
import math
|
||||
import re
|
||||
from collections import Counter
|
||||
|
||||
from .base import Provider
|
||||
|
||||
|
||||
class SimpleProvider(Provider):
|
||||
"""Simple deterministic embedding provider using feature hashing.
|
||||
|
||||
This implementation:
|
||||
- Tokenizes text into words
|
||||
- Uses feature hashing to map words to fixed-size vectors
|
||||
- Applies TF-IDF-like weighting
|
||||
- Normalizes vectors to unit length
|
||||
|
||||
Not suitable for production but good for testing semantic search infrastructure.
|
||||
Only supports embeddings, not text generation.
|
||||
"""
|
||||
|
||||
def __init__(self, dimension: int = 384):
|
||||
"""Initialize simple embedding provider.
|
||||
|
||||
Args:
|
||||
dimension: Embedding dimension (default: 384)
|
||||
"""
|
||||
self.dimension = dimension
|
||||
|
||||
@property
|
||||
def supports_embeddings(self) -> bool:
|
||||
"""Whether this provider supports embedding generation."""
|
||||
return True
|
||||
|
||||
@property
|
||||
def supports_generation(self) -> bool:
|
||||
"""Whether this provider supports text generation."""
|
||||
return False
|
||||
|
||||
def _tokenize(self, text: str) -> list[str]:
|
||||
"""Tokenize text into lowercase words.
|
||||
|
||||
Args:
|
||||
text: Input text
|
||||
|
||||
Returns:
|
||||
List of lowercase word tokens
|
||||
"""
|
||||
# Simple word tokenization
|
||||
text = text.lower()
|
||||
words = re.findall(r"\b\w+\b", text)
|
||||
return words
|
||||
|
||||
def _hash_word(self, word: str) -> int:
|
||||
"""Hash word to dimension index.
|
||||
|
||||
Args:
|
||||
word: Word to hash
|
||||
|
||||
Returns:
|
||||
Index in range [0, dimension)
|
||||
"""
|
||||
hash_bytes = hashlib.md5(word.encode()).digest()
|
||||
hash_int = int.from_bytes(hash_bytes[:4], byteorder="big")
|
||||
return hash_int % self.dimension
|
||||
|
||||
def _embed_single(self, text: str) -> list[float]:
|
||||
"""Generate embedding for single text.
|
||||
|
||||
Args:
|
||||
text: Input text
|
||||
|
||||
Returns:
|
||||
Normalized embedding vector
|
||||
"""
|
||||
tokens = self._tokenize(text)
|
||||
if not tokens:
|
||||
return [0.0] * self.dimension
|
||||
|
||||
# Count term frequencies
|
||||
term_freq = Counter(tokens)
|
||||
|
||||
# Initialize vector
|
||||
vector = [0.0] * self.dimension
|
||||
|
||||
# Apply TF weighting with feature hashing
|
||||
for word, count in term_freq.items():
|
||||
idx = self._hash_word(word)
|
||||
# Simple TF weighting: log(1 + count)
|
||||
vector[idx] += math.log1p(count)
|
||||
|
||||
# Normalize to unit length
|
||||
norm = math.sqrt(sum(x * x for x in vector))
|
||||
if norm > 0:
|
||||
vector = [x / norm for x in vector]
|
||||
|
||||
return vector
|
||||
|
||||
async def embed(self, text: str) -> list[float]:
|
||||
"""Generate embedding vector for text.
|
||||
|
||||
Args:
|
||||
text: Input text to embed
|
||||
|
||||
Returns:
|
||||
Vector embedding as list of floats
|
||||
"""
|
||||
return self._embed_single(text)
|
||||
|
||||
async def embed_batch(self, texts: list[str]) -> list[list[float]]:
|
||||
"""Generate embeddings for multiple texts.
|
||||
|
||||
Args:
|
||||
texts: List of texts to embed
|
||||
|
||||
Returns:
|
||||
List of vector embeddings
|
||||
"""
|
||||
return [self._embed_single(text) for text in texts]
|
||||
|
||||
def get_dimension(self) -> int:
|
||||
"""Get embedding dimension.
|
||||
|
||||
Returns:
|
||||
Vector dimension
|
||||
"""
|
||||
return self.dimension
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""
|
||||
Generate text from a prompt.
|
||||
|
||||
Raises:
|
||||
NotImplementedError: Simple provider doesn't support text generation
|
||||
"""
|
||||
raise NotImplementedError(
|
||||
"Text generation not supported by Simple provider - use Ollama, Anthropic, or Bedrock"
|
||||
)
|
||||
|
||||
async def close(self) -> None:
|
||||
"""Close the provider (no-op for simple provider)."""
|
||||
pass
|
||||
@@ -127,8 +127,12 @@ class SearchResult:
|
||||
doc_type: Document type (note, file, calendar, contact, etc.)
|
||||
title: Document title
|
||||
excerpt: Content excerpt showing match context
|
||||
score: Relevance score (0.0-1.0, higher is better)
|
||||
score: Relevance score (≥ 0.0, higher is better)
|
||||
- RRF fusion: scores in [0.0, 1.0]
|
||||
- DBSF fusion: scores can exceed 1.0 (sum of normalized scores)
|
||||
metadata: Additional algorithm-specific metadata
|
||||
chunk_start_offset: Character position where chunk starts (None if not available)
|
||||
chunk_end_offset: Character position where chunk ends (None if not available)
|
||||
"""
|
||||
|
||||
id: int
|
||||
@@ -137,11 +141,20 @@ class SearchResult:
|
||||
excerpt: str
|
||||
score: float
|
||||
metadata: dict[str, Any] | None = None
|
||||
chunk_start_offset: int | None = None
|
||||
chunk_end_offset: int | None = None
|
||||
|
||||
def __post_init__(self):
|
||||
"""Validate score is in valid range."""
|
||||
if not 0.0 <= self.score <= 1.0:
|
||||
raise ValueError(f"Score must be between 0.0 and 1.0, got {self.score}")
|
||||
"""Validate score is non-negative.
|
||||
|
||||
Note: Different fusion methods produce different score ranges:
|
||||
- RRF (Reciprocal Rank Fusion): Bounded to [0.0, 1.0]
|
||||
- DBSF (Distribution-Based Score Fusion): Unbounded (can exceed 1.0)
|
||||
DBSF sums normalized scores from multiple systems, so scores can be
|
||||
1.5, 2.0, etc. when multiple systems agree a document is highly relevant.
|
||||
"""
|
||||
if self.score < 0.0:
|
||||
raise ValueError(f"Score must be non-negative, got {self.score}")
|
||||
|
||||
|
||||
class SearchAlgorithm(ABC):
|
||||
|
||||
@@ -28,15 +28,27 @@ class BM25HybridSearchAlgorithm(SearchAlgorithm):
|
||||
eliminating the need for application-layer result merging.
|
||||
"""
|
||||
|
||||
def __init__(self, score_threshold: float = 0.0):
|
||||
def __init__(self, score_threshold: float = 0.0, fusion: str = "rrf"):
|
||||
"""
|
||||
Initialize BM25 hybrid search algorithm.
|
||||
|
||||
Args:
|
||||
score_threshold: Minimum RRF score (0-1, default: 0.0 to allow RRF scoring)
|
||||
Note: RRF produces normalized scores, so threshold is typically lower
|
||||
score_threshold: Minimum fusion score (0-1, default: 0.0 to allow fusion scoring)
|
||||
Note: Both RRF and DBSF produce normalized scores
|
||||
fusion: Fusion algorithm to use: "rrf" (Reciprocal Rank Fusion, default)
|
||||
or "dbsf" (Distribution-Based Score Fusion)
|
||||
|
||||
Raises:
|
||||
ValueError: If fusion is not "rrf" or "dbsf"
|
||||
"""
|
||||
if fusion not in ("rrf", "dbsf"):
|
||||
raise ValueError(
|
||||
f"Invalid fusion algorithm '{fusion}'. Must be 'rrf' or 'dbsf'"
|
||||
)
|
||||
|
||||
self.score_threshold = score_threshold
|
||||
self.fusion = models.Fusion.RRF if fusion == "rrf" else models.Fusion.DBSF
|
||||
self.fusion_name = fusion
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
@@ -78,7 +90,8 @@ class BM25HybridSearchAlgorithm(SearchAlgorithm):
|
||||
|
||||
logger.info(
|
||||
f"BM25 hybrid search: query='{query}', user={user_id}, "
|
||||
f"limit={limit}, score_threshold={score_threshold}, doc_type={doc_type}"
|
||||
f"limit={limit}, score_threshold={score_threshold}, doc_type={doc_type}, "
|
||||
f"fusion={self.fusion_name}"
|
||||
)
|
||||
|
||||
# Generate dense embedding for semantic search
|
||||
@@ -139,8 +152,8 @@ class BM25HybridSearchAlgorithm(SearchAlgorithm):
|
||||
filter=query_filter,
|
||||
),
|
||||
],
|
||||
# RRF fusion query (no additional query needed, just fusion)
|
||||
query=models.FusionQuery(fusion=models.Fusion.RRF),
|
||||
# Fusion query (RRF or DBSF based on initialization)
|
||||
query=models.FusionQuery(fusion=self.fusion),
|
||||
limit=limit * 2, # Get extra for deduplication
|
||||
score_threshold=score_threshold,
|
||||
with_payload=True,
|
||||
@@ -152,14 +165,16 @@ class BM25HybridSearchAlgorithm(SearchAlgorithm):
|
||||
raise
|
||||
|
||||
logger.info(
|
||||
f"Qdrant RRF fusion returned {len(search_response.points)} results "
|
||||
f"Qdrant {self.fusion_name.upper()} fusion returned {len(search_response.points)} results "
|
||||
f"(before deduplication)"
|
||||
)
|
||||
|
||||
if search_response.points:
|
||||
# Log top 3 RRF scores to help with threshold tuning
|
||||
# Log top 3 fusion scores to help with threshold tuning
|
||||
top_scores = [p.score for p in search_response.points[:3]]
|
||||
logger.debug(f"Top 3 RRF fusion scores: {top_scores}")
|
||||
logger.debug(
|
||||
f"Top 3 {self.fusion_name.upper()} fusion scores: {top_scores}"
|
||||
)
|
||||
|
||||
# Deduplicate by (doc_id, doc_type) - multiple chunks per document
|
||||
seen_docs = set()
|
||||
@@ -183,12 +198,14 @@ class BM25HybridSearchAlgorithm(SearchAlgorithm):
|
||||
doc_type=doc_type,
|
||||
title=result.payload.get("title", "Untitled"),
|
||||
excerpt=result.payload.get("excerpt", ""),
|
||||
score=result.score, # RRF fusion score
|
||||
score=result.score, # Fusion score (RRF or DBSF)
|
||||
metadata={
|
||||
"chunk_index": result.payload.get("chunk_index"),
|
||||
"total_chunks": result.payload.get("total_chunks"),
|
||||
"search_method": "bm25_hybrid_rrf",
|
||||
"search_method": f"bm25_hybrid_{self.fusion_name}",
|
||||
},
|
||||
chunk_start_offset=result.payload.get("chunk_start_offset"),
|
||||
chunk_end_offset=result.payload.get("chunk_end_offset"),
|
||||
)
|
||||
)
|
||||
|
||||
|
||||
@@ -150,6 +150,8 @@ class SemanticSearchAlgorithm(SearchAlgorithm):
|
||||
"chunk_index": result.payload.get("chunk_index"),
|
||||
"total_chunks": result.payload.get("total_chunks"),
|
||||
},
|
||||
chunk_start_offset=result.payload.get("chunk_start_offset"),
|
||||
chunk_end_offset=result.payload.get("chunk_end_offset"),
|
||||
)
|
||||
)
|
||||
|
||||
|
||||
@@ -42,6 +42,7 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
limit: int = 10,
|
||||
doc_types: list[str] | None = None,
|
||||
score_threshold: float = 0.0,
|
||||
fusion: str = "rrf",
|
||||
) -> SemanticSearchResponse:
|
||||
"""
|
||||
Search Nextcloud content using BM25 hybrid search with cross-app support.
|
||||
@@ -50,7 +51,7 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
- Dense semantic vectors: For conceptual similarity and natural language queries
|
||||
- BM25 sparse vectors: For precise keyword matching, acronyms, and specific terms
|
||||
|
||||
Results are automatically fused using Reciprocal Rank Fusion (RRF) in the
|
||||
Results are automatically fused using the selected fusion algorithm in the
|
||||
database for optimal relevance. This provides the best of both semantic
|
||||
understanding and keyword precision.
|
||||
|
||||
@@ -61,10 +62,13 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
query: Natural language or keyword search query
|
||||
limit: Maximum number of results to return (default: 10)
|
||||
doc_types: Document types to search (e.g., ["note", "file"]). None = search all indexed types (default)
|
||||
score_threshold: Minimum RRF fusion score (0-1, default: 0.0 for RRF scoring)
|
||||
score_threshold: Minimum fusion score (0-1, default: 0.0)
|
||||
fusion: Fusion algorithm: "rrf" (Reciprocal Rank Fusion, default) or "dbsf" (Distribution-Based Score Fusion)
|
||||
RRF: Good general-purpose fusion using reciprocal ranks
|
||||
DBSF: Uses distribution-based normalization, may better balance different score ranges
|
||||
|
||||
Returns:
|
||||
SemanticSearchResponse with matching documents ranked by RRF fusion scores
|
||||
SemanticSearchResponse with matching documents ranked by fusion scores
|
||||
"""
|
||||
from nextcloud_mcp_server.config import get_settings
|
||||
|
||||
@@ -74,7 +78,7 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
|
||||
logger.info(
|
||||
f"BM25 hybrid search: query='{query}', user={username}, "
|
||||
f"limit={limit}, score_threshold={score_threshold}"
|
||||
f"limit={limit}, score_threshold={score_threshold}, fusion={fusion}"
|
||||
)
|
||||
|
||||
# Check that vector sync is enabled
|
||||
@@ -87,8 +91,10 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
)
|
||||
|
||||
try:
|
||||
# Create BM25 hybrid search algorithm
|
||||
search_algo = BM25HybridSearchAlgorithm(score_threshold=score_threshold)
|
||||
# Create BM25 hybrid search algorithm with specified fusion
|
||||
search_algo = BM25HybridSearchAlgorithm(
|
||||
score_threshold=score_threshold, fusion=fusion
|
||||
)
|
||||
|
||||
# Execute search across requested document types
|
||||
# If doc_types is None, search all indexed types (cross-app search)
|
||||
@@ -152,6 +158,8 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
total_chunks=r.metadata.get("total_chunks", 1)
|
||||
if r.metadata
|
||||
else 1,
|
||||
chunk_start_offset=r.chunk_start_offset,
|
||||
chunk_end_offset=r.chunk_end_offset,
|
||||
)
|
||||
)
|
||||
|
||||
@@ -161,7 +169,7 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
results=results,
|
||||
query=query,
|
||||
total_found=len(results),
|
||||
search_method="bm25_hybrid",
|
||||
search_method=f"bm25_hybrid_{fusion}",
|
||||
)
|
||||
|
||||
except ValueError as e:
|
||||
@@ -193,6 +201,7 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
limit: int = 5,
|
||||
score_threshold: float = 0.7,
|
||||
max_answer_tokens: int = 500,
|
||||
fusion: str = "rrf",
|
||||
) -> SamplingSearchResponse:
|
||||
"""
|
||||
Semantic search with LLM-generated answer using MCP sampling.
|
||||
@@ -217,6 +226,7 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
limit: Maximum number of documents to retrieve (default: 5)
|
||||
score_threshold: Minimum similarity score 0-1 (default: 0.7)
|
||||
max_answer_tokens: Maximum tokens for generated answer (default: 500)
|
||||
fusion: Fusion algorithm: "rrf" (Reciprocal Rank Fusion, default) or "dbsf" (Distribution-Based Score Fusion)
|
||||
|
||||
Returns:
|
||||
SamplingSearchResponse containing:
|
||||
@@ -256,6 +266,7 @@ def configure_semantic_tools(mcp: FastMCP):
|
||||
ctx=ctx,
|
||||
limit=limit,
|
||||
score_threshold=score_threshold,
|
||||
fusion=fusion,
|
||||
)
|
||||
|
||||
# 2. Handle no results case - don't waste a sampling call
|
||||
|
||||
@@ -1,10 +1,21 @@
|
||||
"""Document chunking for large texts."""
|
||||
|
||||
import logging
|
||||
import re
|
||||
from dataclasses import dataclass
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
class ChunkWithPosition:
|
||||
"""A text chunk with its character position in the original document."""
|
||||
|
||||
text: str
|
||||
start_offset: int # Character position where chunk starts
|
||||
end_offset: int # Character position where chunk ends (exclusive)
|
||||
|
||||
|
||||
class DocumentChunker:
|
||||
"""Chunk large documents for optimal embedding."""
|
||||
|
||||
@@ -19,33 +30,66 @@ class DocumentChunker:
|
||||
self.chunk_size = chunk_size
|
||||
self.overlap = overlap
|
||||
|
||||
def chunk_text(self, content: str) -> list[str]:
|
||||
def chunk_text(self, content: str) -> list[ChunkWithPosition]:
|
||||
"""
|
||||
Split text into overlapping chunks.
|
||||
Split text into overlapping chunks with position tracking.
|
||||
|
||||
Uses simple word-based chunking with configurable overlap to preserve
|
||||
context across chunk boundaries.
|
||||
context across chunk boundaries. Tracks character positions for each chunk.
|
||||
|
||||
Args:
|
||||
content: Text content to chunk
|
||||
|
||||
Returns:
|
||||
List of text chunks (may be single item if content is small)
|
||||
List of chunks with their character positions in the original content
|
||||
"""
|
||||
# Simple word-based chunking
|
||||
words = content.split()
|
||||
# Use regex to find all words and their positions
|
||||
# This preserves the original spacing and allows accurate position tracking
|
||||
word_pattern = re.compile(r"\S+")
|
||||
word_matches = list(word_pattern.finditer(content))
|
||||
|
||||
if len(words) <= self.chunk_size:
|
||||
return [content]
|
||||
if len(word_matches) <= self.chunk_size:
|
||||
# Single chunk - use entire content
|
||||
return [
|
||||
ChunkWithPosition(text=content, start_offset=0, end_offset=len(content))
|
||||
]
|
||||
|
||||
chunks = []
|
||||
start = 0
|
||||
start_idx = 0
|
||||
|
||||
while start < len(words):
|
||||
end = start + self.chunk_size
|
||||
chunk_words = words[start:end]
|
||||
chunks.append(" ".join(chunk_words))
|
||||
start = end - self.overlap
|
||||
while start_idx < len(word_matches):
|
||||
end_idx = min(start_idx + self.chunk_size, len(word_matches))
|
||||
|
||||
logger.debug(f"Chunked document into {len(chunks)} chunks ({len(words)} words)")
|
||||
# Get the first and last word positions
|
||||
first_word = word_matches[start_idx]
|
||||
last_word = word_matches[end_idx - 1]
|
||||
|
||||
# Extract chunk using character positions
|
||||
start_offset = first_word.start()
|
||||
end_offset = last_word.end()
|
||||
chunk_text = content[start_offset:end_offset]
|
||||
|
||||
chunks.append(
|
||||
ChunkWithPosition(
|
||||
text=chunk_text, start_offset=start_offset, end_offset=end_offset
|
||||
)
|
||||
)
|
||||
|
||||
# If we've reached the end, break
|
||||
if end_idx >= len(word_matches):
|
||||
break
|
||||
|
||||
# Move to next chunk with overlap
|
||||
next_start_idx = end_idx - self.overlap
|
||||
|
||||
# Safety check: ensure we're making forward progress
|
||||
# If we're not advancing (overlap >= chunk processed), break to prevent infinite loop
|
||||
if next_start_idx <= start_idx:
|
||||
break
|
||||
|
||||
start_idx = next_start_idx
|
||||
|
||||
logger.debug(
|
||||
f"Chunked document into {len(chunks)} chunks ({len(word_matches)} words)"
|
||||
)
|
||||
return chunks
|
||||
|
||||
@@ -233,13 +233,16 @@ async def _index_document(
|
||||
)
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
# Extract chunk texts for embedding
|
||||
chunk_texts = [chunk.text for chunk in chunks]
|
||||
|
||||
# Generate dense embeddings (I/O bound - external API call)
|
||||
embedding_service = get_embedding_service()
|
||||
dense_embeddings = await embedding_service.embed_batch(chunks)
|
||||
dense_embeddings = await embedding_service.embed_batch(chunk_texts)
|
||||
|
||||
# Generate sparse embeddings (BM25 for keyword matching)
|
||||
bm25_service = get_bm25_service()
|
||||
sparse_embeddings = bm25_service.encode_batch(chunks)
|
||||
sparse_embeddings = bm25_service.encode_batch(chunk_texts)
|
||||
|
||||
# Prepare Qdrant points
|
||||
indexed_at = int(time.time())
|
||||
@@ -265,12 +268,15 @@ async def _index_document(
|
||||
"doc_id": doc_task.doc_id,
|
||||
"doc_type": doc_task.doc_type,
|
||||
"title": title,
|
||||
"excerpt": chunk[:200],
|
||||
"excerpt": chunk.text[:200],
|
||||
"indexed_at": indexed_at,
|
||||
"modified_at": doc_task.modified_at,
|
||||
"etag": etag,
|
||||
"chunk_index": i,
|
||||
"total_chunks": len(chunks),
|
||||
"chunk_start_offset": chunk.start_offset,
|
||||
"chunk_end_offset": chunk.end_offset,
|
||||
"metadata_version": 2, # v2 includes position metadata
|
||||
},
|
||||
)
|
||||
)
|
||||
|
||||
+6
-4
@@ -1,6 +1,6 @@
|
||||
[project]
|
||||
name = "nextcloud-mcp-server"
|
||||
version = "0.38.0"
|
||||
version = "0.41.0"
|
||||
description = "Model Context Protocol (MCP) server for Nextcloud integration - enables AI assistants to interact with Nextcloud data"
|
||||
authors = [
|
||||
{name = "Chris Coutinho", email = "chris@coutinho.io"}
|
||||
@@ -12,7 +12,7 @@ keywords = ["nextcloud", "mcp", "model-context-protocol", "llm", "ai", "claude",
|
||||
dependencies = [
|
||||
"mcp[cli] (>=1.21,<1.22)",
|
||||
"httpx (>=0.28.1,<0.29.0)",
|
||||
"pillow (>=10.3.0,<12.0.0)", # Compatible with fastembed
|
||||
"pillow (>=10.3.0,<12.0.0)", # Compatible with fastembed
|
||||
"icalendar (>=6.0.0,<7.0.0)",
|
||||
"pythonvcard4>=0.2.0",
|
||||
"pydantic>=2.11.4",
|
||||
@@ -22,7 +22,9 @@ dependencies = [
|
||||
"aiosqlite>=0.20.0", # Async SQLite for refresh token storage
|
||||
"authlib>=1.6.5",
|
||||
"qdrant-client>=1.7.0",
|
||||
"fastembed>=0.4.2", # BM25 sparse vector embeddings for hybrid search
|
||||
"fastembed>=0.4.2", # BM25 sparse vector embeddings for hybrid search
|
||||
"anthropic>=0.42.0", # For RAG evaluation with Anthropic LLMs
|
||||
"boto3>=1.35.0", # For Amazon Bedrock provider (optional)
|
||||
# Observability dependencies
|
||||
"prometheus-client>=0.21.0", # Prometheus metrics
|
||||
"opentelemetry-api>=1.28.2", # OpenTelemetry API
|
||||
@@ -32,6 +34,7 @@ dependencies = [
|
||||
"opentelemetry-instrumentation-logging>=0.49b2", # Logging integration
|
||||
"opentelemetry-exporter-otlp-proto-grpc>=1.28.2", # OTLP gRPC exporter
|
||||
"python-json-logger>=3.2.0", # Structured JSON logging
|
||||
"jinja2>=3.1.6",
|
||||
]
|
||||
classifiers = [
|
||||
"Development Status :: 4 - Beta",
|
||||
@@ -103,7 +106,6 @@ module-root = ""
|
||||
|
||||
[dependency-groups]
|
||||
dev = [
|
||||
"anthropic>=0.42.0", # For RAG evaluation with Anthropic LLMs
|
||||
"commitizen>=4.8.2",
|
||||
"datasets>=3.3.0", # For BeIR nfcorpus dataset loading
|
||||
"ipython>=9.2.0",
|
||||
|
||||
@@ -1,99 +1,20 @@
|
||||
"""LLM provider abstraction for RAG evaluation.
|
||||
|
||||
Supports Ollama (local) and Anthropic (cloud) providers for both ground truth
|
||||
DEPRECATED: This module is maintained for backward compatibility with RAG evaluation tests.
|
||||
New code should use nextcloud_mcp_server.providers directly.
|
||||
|
||||
Supports Ollama (local), Anthropic (cloud), and Bedrock (AWS) providers for both ground truth
|
||||
generation and evaluation.
|
||||
"""
|
||||
|
||||
import os
|
||||
from typing import Protocol
|
||||
|
||||
import httpx
|
||||
from anthropic import AsyncAnthropic
|
||||
|
||||
|
||||
class LLMProvider(Protocol):
|
||||
"""Protocol for LLM providers."""
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""Generate text from a prompt.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate from
|
||||
max_tokens: Maximum tokens to generate
|
||||
|
||||
Returns:
|
||||
Generated text
|
||||
"""
|
||||
...
|
||||
|
||||
async def close(self) -> None:
|
||||
"""Close the provider and release resources."""
|
||||
...
|
||||
|
||||
|
||||
class OllamaProvider:
|
||||
"""Ollama provider for local LLM inference."""
|
||||
|
||||
def __init__(self, base_url: str, model: str):
|
||||
"""Initialize Ollama provider.
|
||||
|
||||
Args:
|
||||
base_url: Ollama API base URL (e.g., http://localhost:11434)
|
||||
model: Model name (e.g., llama3.1:8b)
|
||||
"""
|
||||
self.base_url = base_url.rstrip("/")
|
||||
self.model = model
|
||||
self.client = httpx.AsyncClient(timeout=600.0) # 10 min timeout for generation
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""Generate text using Ollama API."""
|
||||
response = await self.client.post(
|
||||
f"{self.base_url}/api/generate",
|
||||
json={
|
||||
"model": self.model,
|
||||
"prompt": prompt,
|
||||
"stream": False,
|
||||
"options": {
|
||||
"num_predict": max_tokens,
|
||||
"temperature": 0.7,
|
||||
},
|
||||
},
|
||||
)
|
||||
response.raise_for_status()
|
||||
data = response.json()
|
||||
return data["response"]
|
||||
|
||||
async def close(self):
|
||||
"""Close the HTTP client."""
|
||||
await self.client.aclose()
|
||||
|
||||
|
||||
class AnthropicProvider:
|
||||
"""Anthropic provider for cloud LLM inference."""
|
||||
|
||||
def __init__(self, api_key: str, model: str):
|
||||
"""Initialize Anthropic provider.
|
||||
|
||||
Args:
|
||||
api_key: Anthropic API key
|
||||
model: Model name (e.g., claude-3-5-sonnet-20241022)
|
||||
"""
|
||||
self.client = AsyncAnthropic(api_key=api_key)
|
||||
self.model = model
|
||||
|
||||
async def generate(self, prompt: str, max_tokens: int = 500) -> str:
|
||||
"""Generate text using Anthropic API."""
|
||||
message = await self.client.messages.create(
|
||||
model=self.model,
|
||||
max_tokens=max_tokens,
|
||||
temperature=0.7,
|
||||
messages=[{"role": "user", "content": prompt}],
|
||||
)
|
||||
return message.content[0].text
|
||||
|
||||
async def close(self):
|
||||
"""Close the client (no-op for Anthropic)."""
|
||||
pass
|
||||
from nextcloud_mcp_server.providers import (
|
||||
AnthropicProvider,
|
||||
BedrockProvider,
|
||||
OllamaProvider,
|
||||
Provider,
|
||||
)
|
||||
|
||||
|
||||
def create_llm_provider(
|
||||
@@ -102,18 +23,24 @@ def create_llm_provider(
|
||||
ollama_model: str | None = None,
|
||||
anthropic_api_key: str | None = None,
|
||||
anthropic_model: str | None = None,
|
||||
) -> LLMProvider:
|
||||
bedrock_region: str | None = None,
|
||||
bedrock_model: str | None = None,
|
||||
) -> Provider:
|
||||
"""Create an LLM provider from environment variables or arguments.
|
||||
|
||||
Args:
|
||||
provider: Provider type ('ollama' or 'anthropic'). Defaults to RAG_EVAL_PROVIDER env var or 'ollama'
|
||||
provider: Provider type ('ollama', 'anthropic', or 'bedrock').
|
||||
Defaults to RAG_EVAL_PROVIDER env var or 'ollama'
|
||||
ollama_base_url: Ollama base URL. Defaults to RAG_EVAL_OLLAMA_BASE_URL or 'http://localhost:11434'
|
||||
ollama_model: Ollama model. Defaults to RAG_EVAL_OLLAMA_MODEL or 'llama3.1:8b'
|
||||
ollama_model: Ollama model. Defaults to RAG_EVAL_OLLAMA_MODEL or 'llama3.2:1b'
|
||||
anthropic_api_key: Anthropic API key. Defaults to RAG_EVAL_ANTHROPIC_API_KEY env var
|
||||
anthropic_model: Anthropic model. Defaults to RAG_EVAL_ANTHROPIC_MODEL or 'claude-3-5-sonnet-20241022'
|
||||
bedrock_region: AWS region. Defaults to RAG_EVAL_BEDROCK_REGION or AWS_REGION env var
|
||||
bedrock_model: Bedrock model ID. Defaults to RAG_EVAL_BEDROCK_MODEL or
|
||||
'anthropic.claude-3-sonnet-20240229-v1:0'
|
||||
|
||||
Returns:
|
||||
LLMProvider instance
|
||||
Provider instance
|
||||
|
||||
Raises:
|
||||
ValueError: If provider is invalid or required credentials are missing
|
||||
@@ -130,7 +57,9 @@ def create_llm_provider(
|
||||
or "http://localhost:11434"
|
||||
)
|
||||
model = ollama_model or os.environ.get("RAG_EVAL_OLLAMA_MODEL", "llama3.2:1b")
|
||||
return OllamaProvider(base_url=base_url, model=model)
|
||||
return OllamaProvider(
|
||||
base_url=base_url, embedding_model=None, generation_model=model
|
||||
)
|
||||
|
||||
elif provider == "anthropic":
|
||||
api_key = anthropic_api_key or os.environ.get("RAG_EVAL_ANTHROPIC_API_KEY")
|
||||
@@ -143,7 +72,18 @@ def create_llm_provider(
|
||||
)
|
||||
return AnthropicProvider(api_key=api_key, model=model)
|
||||
|
||||
elif provider == "bedrock":
|
||||
region = bedrock_region or os.environ.get(
|
||||
"RAG_EVAL_BEDROCK_REGION", os.environ.get("AWS_REGION", "us-east-1")
|
||||
)
|
||||
model = bedrock_model or os.environ.get(
|
||||
"RAG_EVAL_BEDROCK_MODEL", "anthropic.claude-3-sonnet-20240229-v1:0"
|
||||
)
|
||||
return BedrockProvider(
|
||||
region_name=region, embedding_model=None, generation_model=model
|
||||
)
|
||||
|
||||
else:
|
||||
raise ValueError(
|
||||
f"Invalid provider: {provider}. Must be 'ollama' or 'anthropic'."
|
||||
f"Invalid provider: {provider}. Must be 'ollama', 'anthropic', or 'bedrock'."
|
||||
)
|
||||
|
||||
@@ -0,0 +1 @@
|
||||
"""Unit tests for provider infrastructure."""
|
||||
@@ -0,0 +1,280 @@
|
||||
"""Unit tests for Bedrock provider."""
|
||||
|
||||
import json
|
||||
from unittest.mock import MagicMock
|
||||
|
||||
import pytest
|
||||
|
||||
from nextcloud_mcp_server.providers.bedrock import BOTO3_AVAILABLE, BedrockProvider
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def mock_bedrock_client(mocker):
|
||||
"""Mock boto3 bedrock-runtime client."""
|
||||
if not BOTO3_AVAILABLE:
|
||||
pytest.skip("boto3 not installed")
|
||||
|
||||
mock_client = MagicMock()
|
||||
mocker.patch("boto3.client", return_value=mock_client)
|
||||
return mock_client
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
async def test_bedrock_embedding_titan(mock_bedrock_client):
|
||||
"""Test Bedrock embedding with Titan model."""
|
||||
# Mock response
|
||||
mock_response = {
|
||||
"body": MagicMock(
|
||||
read=MagicMock(
|
||||
return_value=json.dumps({"embedding": [0.1, 0.2, 0.3]}).encode()
|
||||
)
|
||||
)
|
||||
}
|
||||
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||
|
||||
# Create provider
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model="amazon.titan-embed-text-v2:0",
|
||||
generation_model=None,
|
||||
)
|
||||
|
||||
# Test embedding
|
||||
embedding = await provider.embed("test text")
|
||||
|
||||
assert embedding == [0.1, 0.2, 0.3]
|
||||
mock_bedrock_client.invoke_model.assert_called_once()
|
||||
call_args = mock_bedrock_client.invoke_model.call_args
|
||||
|
||||
assert call_args.kwargs["modelId"] == "amazon.titan-embed-text-v2:0"
|
||||
body = json.loads(call_args.kwargs["body"])
|
||||
assert body == {"inputText": "test text"}
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
async def test_bedrock_embedding_batch(mock_bedrock_client):
|
||||
"""Test Bedrock batch embedding."""
|
||||
# Mock response
|
||||
mock_response = {
|
||||
"body": MagicMock(
|
||||
read=MagicMock(
|
||||
return_value=json.dumps({"embedding": [0.1, 0.2, 0.3]}).encode()
|
||||
)
|
||||
)
|
||||
}
|
||||
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||
|
||||
# Create provider
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model="amazon.titan-embed-text-v2:0",
|
||||
generation_model=None,
|
||||
)
|
||||
|
||||
# Test batch embedding
|
||||
embeddings = await provider.embed_batch(["text1", "text2"])
|
||||
|
||||
assert len(embeddings) == 2
|
||||
assert embeddings[0] == [0.1, 0.2, 0.3]
|
||||
assert embeddings[1] == [0.1, 0.2, 0.3]
|
||||
assert mock_bedrock_client.invoke_model.call_count == 2
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
async def test_bedrock_generation_claude(mock_bedrock_client):
|
||||
"""Test Bedrock text generation with Claude model."""
|
||||
# Mock response
|
||||
mock_response = {
|
||||
"body": MagicMock(
|
||||
read=MagicMock(
|
||||
return_value=json.dumps(
|
||||
{"content": [{"text": "Generated response"}]}
|
||||
).encode()
|
||||
)
|
||||
)
|
||||
}
|
||||
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||
|
||||
# Create provider
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model=None,
|
||||
generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
|
||||
)
|
||||
|
||||
# Test generation
|
||||
text = await provider.generate("test prompt", max_tokens=100)
|
||||
|
||||
assert text == "Generated response"
|
||||
mock_bedrock_client.invoke_model.assert_called_once()
|
||||
call_args = mock_bedrock_client.invoke_model.call_args
|
||||
|
||||
assert call_args.kwargs["modelId"] == "anthropic.claude-3-sonnet-20240229-v1:0"
|
||||
body = json.loads(call_args.kwargs["body"])
|
||||
assert body["messages"][0]["content"] == "test prompt"
|
||||
assert body["max_tokens"] == 100
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
async def test_bedrock_generation_llama(mock_bedrock_client):
|
||||
"""Test Bedrock text generation with Llama model."""
|
||||
# Mock response
|
||||
mock_response = {
|
||||
"body": MagicMock(
|
||||
read=MagicMock(
|
||||
return_value=json.dumps({"generation": "Llama response"}).encode()
|
||||
)
|
||||
)
|
||||
}
|
||||
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||
|
||||
# Create provider
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model=None,
|
||||
generation_model="meta.llama3-8b-instruct-v1:0",
|
||||
)
|
||||
|
||||
# Test generation
|
||||
text = await provider.generate("test prompt")
|
||||
|
||||
assert text == "Llama response"
|
||||
body = json.loads(mock_bedrock_client.invoke_model.call_args.kwargs["body"])
|
||||
assert body["prompt"] == "test prompt"
|
||||
assert "max_gen_len" in body
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
async def test_bedrock_both_capabilities(mock_bedrock_client):
|
||||
"""Test Bedrock with both embedding and generation models."""
|
||||
# Mock responses
|
||||
embed_response = {
|
||||
"body": MagicMock(
|
||||
read=MagicMock(return_value=json.dumps({"embedding": [0.1, 0.2]}).encode())
|
||||
)
|
||||
}
|
||||
gen_response = {
|
||||
"body": MagicMock(
|
||||
read=MagicMock(
|
||||
return_value=json.dumps({"content": [{"text": "Response"}]}).encode()
|
||||
)
|
||||
)
|
||||
}
|
||||
|
||||
# Mock to return different responses based on modelId
|
||||
def mock_invoke(modelId, body, **kwargs):
|
||||
if "embed" in modelId:
|
||||
return embed_response
|
||||
else:
|
||||
return gen_response
|
||||
|
||||
mock_bedrock_client.invoke_model.side_effect = mock_invoke
|
||||
|
||||
# Create provider with both models
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model="amazon.titan-embed-text-v2:0",
|
||||
generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
|
||||
)
|
||||
|
||||
assert provider.supports_embeddings is True
|
||||
assert provider.supports_generation is True
|
||||
|
||||
# Test both capabilities
|
||||
embedding = await provider.embed("test")
|
||||
assert embedding == [0.1, 0.2]
|
||||
|
||||
text = await provider.generate("test")
|
||||
assert text == "Response"
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
async def test_bedrock_no_embeddings():
|
||||
"""Test Bedrock provider with no embedding model raises error."""
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model=None,
|
||||
generation_model="anthropic.claude-3-sonnet-20240229-v1:0",
|
||||
)
|
||||
|
||||
assert provider.supports_embeddings is False
|
||||
|
||||
with pytest.raises(NotImplementedError, match="no embedding_model configured"):
|
||||
await provider.embed("test")
|
||||
|
||||
with pytest.raises(NotImplementedError, match="no embedding_model configured"):
|
||||
await provider.embed_batch(["test"])
|
||||
|
||||
with pytest.raises(NotImplementedError, match="no embedding_model configured"):
|
||||
provider.get_dimension()
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
async def test_bedrock_no_generation():
|
||||
"""Test Bedrock provider with no generation model raises error."""
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model="amazon.titan-embed-text-v2:0",
|
||||
generation_model=None,
|
||||
)
|
||||
|
||||
assert provider.supports_generation is False
|
||||
|
||||
with pytest.raises(NotImplementedError, match="no generation_model configured"):
|
||||
await provider.generate("test")
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
async def test_bedrock_dimension_detection(mock_bedrock_client):
|
||||
"""Test dimension detection for Bedrock embeddings."""
|
||||
# Mock response with specific dimension
|
||||
mock_response = {
|
||||
"body": MagicMock(
|
||||
read=MagicMock(
|
||||
return_value=json.dumps(
|
||||
{"embedding": [0.1] * 1536} # 1536-dim embedding
|
||||
).encode()
|
||||
)
|
||||
)
|
||||
}
|
||||
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model="amazon.titan-embed-text-v2:0",
|
||||
)
|
||||
|
||||
# Dimension not detected yet
|
||||
with pytest.raises(RuntimeError, match="not detected yet"):
|
||||
provider.get_dimension()
|
||||
|
||||
# Detect dimension
|
||||
await provider._detect_dimension()
|
||||
|
||||
# Now dimension should be available
|
||||
assert provider.get_dimension() == 1536
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
async def test_bedrock_cohere_embedding(mock_bedrock_client):
|
||||
"""Test Bedrock with Cohere embedding model."""
|
||||
# Mock response
|
||||
mock_response = {
|
||||
"body": MagicMock(
|
||||
read=MagicMock(
|
||||
return_value=json.dumps({"embeddings": [[0.1, 0.2, 0.3]]}).encode()
|
||||
)
|
||||
)
|
||||
}
|
||||
mock_bedrock_client.invoke_model.return_value = mock_response
|
||||
|
||||
provider = BedrockProvider(
|
||||
region_name="us-east-1",
|
||||
embedding_model="cohere.embed-english-v3",
|
||||
)
|
||||
|
||||
embedding = await provider.embed("test text")
|
||||
|
||||
assert embedding == [0.1, 0.2, 0.3]
|
||||
body = json.loads(mock_bedrock_client.invoke_model.call_args.kwargs["body"])
|
||||
assert body == {"texts": ["test text"], "input_type": "search_document"}
|
||||
@@ -0,0 +1 @@
|
||||
"""Unit tests for search algorithms."""
|
||||
@@ -0,0 +1,54 @@
|
||||
"""Unit tests for BM25 hybrid search algorithm."""
|
||||
|
||||
import pytest
|
||||
from qdrant_client import models
|
||||
|
||||
from nextcloud_mcp_server.search.bm25_hybrid import BM25HybridSearchAlgorithm
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_bm25_hybrid_initialization_default():
|
||||
"""Test BM25HybridSearchAlgorithm initializes with default RRF fusion."""
|
||||
algo = BM25HybridSearchAlgorithm()
|
||||
|
||||
assert algo.score_threshold == 0.0
|
||||
assert algo.fusion == models.Fusion.RRF
|
||||
assert algo.fusion_name == "rrf"
|
||||
assert algo.name == "bm25_hybrid"
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_bm25_hybrid_initialization_with_rrf():
|
||||
"""Test BM25HybridSearchAlgorithm initializes with explicit RRF fusion."""
|
||||
algo = BM25HybridSearchAlgorithm(score_threshold=0.5, fusion="rrf")
|
||||
|
||||
assert algo.score_threshold == 0.5
|
||||
assert algo.fusion == models.Fusion.RRF
|
||||
assert algo.fusion_name == "rrf"
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_bm25_hybrid_initialization_with_dbsf():
|
||||
"""Test BM25HybridSearchAlgorithm initializes with DBSF fusion."""
|
||||
algo = BM25HybridSearchAlgorithm(score_threshold=0.7, fusion="dbsf")
|
||||
|
||||
assert algo.score_threshold == 0.7
|
||||
assert algo.fusion == models.Fusion.DBSF
|
||||
assert algo.fusion_name == "dbsf"
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_bm25_hybrid_invalid_fusion_raises_error():
|
||||
"""Test BM25HybridSearchAlgorithm raises ValueError for invalid fusion."""
|
||||
with pytest.raises(ValueError) as exc_info:
|
||||
BM25HybridSearchAlgorithm(fusion="invalid")
|
||||
|
||||
assert "Invalid fusion algorithm 'invalid'" in str(exc_info.value)
|
||||
assert "Must be 'rrf' or 'dbsf'" in str(exc_info.value)
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_bm25_hybrid_requires_vector_db():
|
||||
"""Test BM25HybridSearchAlgorithm reports it requires vector database."""
|
||||
algo = BM25HybridSearchAlgorithm()
|
||||
assert algo.requires_vector_db is True
|
||||
@@ -0,0 +1,135 @@
|
||||
"""Unit tests for SearchResult validation."""
|
||||
|
||||
import pytest
|
||||
|
||||
from nextcloud_mcp_server.search.algorithms import SearchResult
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_search_result_rrf_score_in_range():
|
||||
"""Test SearchResult accepts RRF scores in [0.0, 1.0] range."""
|
||||
result = SearchResult(
|
||||
id=1,
|
||||
doc_type="note",
|
||||
title="Test Note",
|
||||
excerpt="Test excerpt",
|
||||
score=0.85,
|
||||
)
|
||||
|
||||
assert result.score == 0.85
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_search_result_rrf_score_at_lower_bound():
|
||||
"""Test SearchResult accepts RRF score at lower bound (0.0)."""
|
||||
result = SearchResult(
|
||||
id=1,
|
||||
doc_type="note",
|
||||
title="Test Note",
|
||||
excerpt="Test excerpt",
|
||||
score=0.0,
|
||||
)
|
||||
|
||||
assert result.score == 0.0
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_search_result_rrf_score_at_upper_bound():
|
||||
"""Test SearchResult accepts RRF score at upper bound (1.0)."""
|
||||
result = SearchResult(
|
||||
id=1,
|
||||
doc_type="note",
|
||||
title="Test Note",
|
||||
excerpt="Test excerpt",
|
||||
score=1.0,
|
||||
)
|
||||
|
||||
assert result.score == 1.0
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_search_result_dbsf_score_above_one():
|
||||
"""Test SearchResult accepts DBSF scores > 1.0.
|
||||
|
||||
DBSF (Distribution-Based Score Fusion) sums normalized scores from multiple
|
||||
systems (dense semantic + sparse BM25), so scores can exceed 1.0 when both
|
||||
systems strongly agree a document is relevant.
|
||||
"""
|
||||
# Typical DBSF score when both systems agree
|
||||
result = SearchResult(
|
||||
id=1,
|
||||
doc_type="note",
|
||||
title="Highly Relevant Note",
|
||||
excerpt="Contains keywords and is semantically similar",
|
||||
score=1.55,
|
||||
)
|
||||
|
||||
assert result.score == 1.55
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_search_result_dbsf_score_edge_case():
|
||||
"""Test SearchResult accepts DBSF maximum theoretical score (2.0).
|
||||
|
||||
Maximum DBSF score with 2 systems: 1.0 (dense) + 1.0 (sparse) = 2.0
|
||||
"""
|
||||
result = SearchResult(
|
||||
id=1,
|
||||
doc_type="note",
|
||||
title="Perfect Match",
|
||||
excerpt="Perfect semantic and keyword match",
|
||||
score=2.0,
|
||||
)
|
||||
|
||||
assert result.score == 2.0
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_search_result_negative_score_raises_error():
|
||||
"""Test SearchResult rejects negative scores."""
|
||||
with pytest.raises(ValueError) as exc_info:
|
||||
SearchResult(
|
||||
id=1,
|
||||
doc_type="note",
|
||||
title="Test Note",
|
||||
excerpt="Test excerpt",
|
||||
score=-0.1,
|
||||
)
|
||||
|
||||
assert "Score must be non-negative" in str(exc_info.value)
|
||||
assert "got -0.1" in str(exc_info.value)
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_search_result_with_metadata():
|
||||
"""Test SearchResult with optional metadata field."""
|
||||
result = SearchResult(
|
||||
id=1,
|
||||
doc_type="note",
|
||||
title="Test Note",
|
||||
excerpt="Test excerpt",
|
||||
score=1.25,
|
||||
metadata={"fusion_method": "dbsf", "dense_score": 0.8, "sparse_score": 0.45},
|
||||
)
|
||||
|
||||
assert result.score == 1.25
|
||||
assert result.metadata["fusion_method"] == "dbsf"
|
||||
assert result.metadata["dense_score"] == 0.8
|
||||
assert result.metadata["sparse_score"] == 0.45
|
||||
|
||||
|
||||
@pytest.mark.unit
|
||||
def test_search_result_with_chunk_offsets():
|
||||
"""Test SearchResult with chunk offset information."""
|
||||
result = SearchResult(
|
||||
id=1,
|
||||
doc_type="note",
|
||||
title="Test Note",
|
||||
excerpt="matching chunk text",
|
||||
score=0.9,
|
||||
chunk_start_offset=100,
|
||||
chunk_end_offset=500,
|
||||
)
|
||||
|
||||
assert result.chunk_start_offset == 100
|
||||
assert result.chunk_end_offset == 500
|
||||
@@ -0,0 +1,190 @@
|
||||
"""Unit tests for DocumentChunker with position tracking."""
|
||||
|
||||
from nextcloud_mcp_server.vector.document_chunker import (
|
||||
ChunkWithPosition,
|
||||
DocumentChunker,
|
||||
)
|
||||
|
||||
|
||||
class TestDocumentChunkerPositions:
|
||||
"""Test suite for DocumentChunker position tracking functionality."""
|
||||
|
||||
def test_single_chunk_simple_text(self):
|
||||
"""Test that single-chunk documents return correct positions."""
|
||||
chunker = DocumentChunker(chunk_size=512, overlap=50)
|
||||
content = "This is a short document."
|
||||
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
assert len(chunks) == 1
|
||||
assert isinstance(chunks[0], ChunkWithPosition)
|
||||
assert chunks[0].text == content
|
||||
assert chunks[0].start_offset == 0
|
||||
assert chunks[0].end_offset == len(content)
|
||||
|
||||
def test_multiple_chunks_positions(self):
|
||||
"""Test that multi-chunk documents have correct positions."""
|
||||
chunker = DocumentChunker(chunk_size=10, overlap=2) # Small chunks for testing
|
||||
# Create content with exactly 30 words
|
||||
words = [f"word{i:02d}" for i in range(30)]
|
||||
content = " ".join(words)
|
||||
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
# Verify we got multiple chunks (30 words, 10 per chunk, 2 overlap = 4 chunks)
|
||||
assert len(chunks) == 4
|
||||
|
||||
# Verify all chunks are ChunkWithPosition
|
||||
for chunk in chunks:
|
||||
assert isinstance(chunk, ChunkWithPosition)
|
||||
|
||||
# Verify first chunk starts at 0
|
||||
assert chunks[0].start_offset == 0
|
||||
|
||||
# Verify last chunk ends at content length
|
||||
assert chunks[-1].end_offset == len(content)
|
||||
|
||||
# Verify chunks are contiguous or overlap (no gaps)
|
||||
for i in range(len(chunks) - 1):
|
||||
# Next chunk should start at or before current chunk ends
|
||||
assert chunks[i + 1].start_offset <= chunks[i].end_offset
|
||||
|
||||
# Verify we can reconstruct the content using positions
|
||||
for chunk in chunks:
|
||||
extracted = content[chunk.start_offset : chunk.end_offset]
|
||||
assert extracted == chunk.text
|
||||
|
||||
def test_chunk_positions_with_whitespace(self):
|
||||
"""Test position tracking with various whitespace."""
|
||||
chunker = DocumentChunker(chunk_size=5, overlap=1)
|
||||
content = "word1 word2\n\nword3\tword4 word5 word6"
|
||||
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
# Verify positions correctly handle whitespace
|
||||
for chunk in chunks:
|
||||
extracted = content[chunk.start_offset : chunk.end_offset]
|
||||
assert extracted == chunk.text
|
||||
# Verify no leading/trailing whitespace unless in original
|
||||
if chunk != chunks[0] and chunk != chunks[-1]:
|
||||
# Middle chunks should be extracted correctly
|
||||
assert len(chunk.text.strip()) > 0
|
||||
|
||||
def test_empty_content(self):
|
||||
"""Test that empty content returns empty chunk."""
|
||||
chunker = DocumentChunker(chunk_size=512, overlap=50)
|
||||
content = ""
|
||||
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
assert len(chunks) == 1
|
||||
assert chunks[0].text == ""
|
||||
assert chunks[0].start_offset == 0
|
||||
assert chunks[0].end_offset == 0
|
||||
|
||||
def test_chunk_overlap_positions(self):
|
||||
"""Test that overlapping chunks have correct positions."""
|
||||
chunker = DocumentChunker(chunk_size=10, overlap=3)
|
||||
words = [f"word{i:02d}" for i in range(25)]
|
||||
content = " ".join(words)
|
||||
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
# Verify overlap exists
|
||||
for i in range(len(chunks) - 1):
|
||||
current_chunk = chunks[i]
|
||||
next_chunk = chunks[i + 1]
|
||||
|
||||
# Next chunk should start before current ends (overlap)
|
||||
# This happens because we move back by overlap words
|
||||
# The actual character overlap depends on word lengths
|
||||
assert next_chunk.start_offset >= 0
|
||||
assert current_chunk.end_offset <= len(content)
|
||||
|
||||
def test_unicode_content_positions(self):
|
||||
"""Test position tracking with Unicode characters."""
|
||||
chunker = DocumentChunker(chunk_size=10, overlap=2)
|
||||
content = "Hello 世界 こんにちは мир Привет שלום مرحبا 你好"
|
||||
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
# Verify all chunks extract correctly
|
||||
for chunk in chunks:
|
||||
extracted = content[chunk.start_offset : chunk.end_offset]
|
||||
assert extracted == chunk.text
|
||||
|
||||
# Verify full coverage
|
||||
if len(chunks) == 1:
|
||||
assert chunks[0].start_offset == 0
|
||||
assert chunks[0].end_offset == len(content)
|
||||
|
||||
def test_single_word_chunks(self):
|
||||
"""Test position tracking with single-word chunks."""
|
||||
chunker = DocumentChunker(chunk_size=1, overlap=0)
|
||||
content = "one two three"
|
||||
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
assert len(chunks) == 3
|
||||
assert chunks[0].text == "one"
|
||||
assert chunks[1].text == "two"
|
||||
assert chunks[2].text == "three"
|
||||
|
||||
# Verify positions
|
||||
assert content[chunks[0].start_offset : chunks[0].end_offset] == "one"
|
||||
assert content[chunks[1].start_offset : chunks[1].end_offset] == "two"
|
||||
assert content[chunks[2].start_offset : chunks[2].end_offset] == "three"
|
||||
|
||||
def test_realistic_note_content(self):
|
||||
"""Test with realistic note content similar to Nextcloud Notes."""
|
||||
chunker = DocumentChunker(chunk_size=50, overlap=10)
|
||||
content = """My Project Notes
|
||||
|
||||
This is a note about my project. It contains several paragraphs of text
|
||||
that should be chunked appropriately for embedding.
|
||||
|
||||
## Key Points
|
||||
|
||||
- First important point with some details
|
||||
- Second point that needs to be remembered
|
||||
- Third point for future reference
|
||||
|
||||
The document continues with more content here. We want to make sure that
|
||||
the chunking preserves context across boundaries while maintaining proper
|
||||
position tracking for each chunk.
|
||||
|
||||
This allows us to highlight the exact chunk that matched a search query,
|
||||
which builds trust in the RAG system."""
|
||||
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
# Should have multiple chunks
|
||||
assert len(chunks) > 1
|
||||
|
||||
# Verify all chunks
|
||||
for chunk in chunks:
|
||||
assert isinstance(chunk, ChunkWithPosition)
|
||||
# Verify extraction
|
||||
extracted = content[chunk.start_offset : chunk.end_offset]
|
||||
assert extracted == chunk.text
|
||||
# Verify positions are valid
|
||||
assert chunk.start_offset >= 0
|
||||
assert chunk.end_offset <= len(content)
|
||||
assert chunk.start_offset < chunk.end_offset
|
||||
|
||||
def test_chunk_boundaries(self):
|
||||
"""Test that chunk boundaries are word-aligned."""
|
||||
chunker = DocumentChunker(chunk_size=10, overlap=2)
|
||||
words = [f"word{i:02d}" for i in range(30)]
|
||||
content = " ".join(words)
|
||||
|
||||
chunks = chunker.chunk_text(content)
|
||||
|
||||
for chunk in chunks:
|
||||
# Verify chunk text starts and ends with word characters (no split words)
|
||||
# Unless it's the full content
|
||||
if len(chunks) > 1:
|
||||
# Each chunk should start with a word (not whitespace)
|
||||
assert chunk.text[0].strip() != ""
|
||||
# Each chunk should end with a word (not whitespace)
|
||||
assert chunk.text[-1].strip() != ""
|
||||
Vendored
+1
-1
Submodule third_party/oidc updated: 9616294911...5670bc7e30
@@ -205,11 +205,11 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "asttokens"
|
||||
version = "3.0.0"
|
||||
version = "3.0.1"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/4a/e7/82da0a03e7ba5141f05cce0d302e6eed121ae055e0456ca228bf693984bc/asttokens-3.0.0.tar.gz", hash = "sha256:0dcd8baa8d62b0c1d118b399b2ddba3c4aff271d0d7a9e0d4c1681c79035bbc7", size = 61978, upload-time = "2024-11-30T04:30:14.439Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/be/a5/8e3f9b6771b0b408517c82d97aed8f2036509bc247d46114925e32fe33f0/asttokens-3.0.1.tar.gz", hash = "sha256:71a4ee5de0bde6a31d64f6b13f2293ac190344478f081c3d1bccfcf5eacb0cb7", size = 62308, upload-time = "2025-11-15T16:43:48.578Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/25/8a/c46dcc25341b5bce5472c718902eb3d38600a903b14fa6aeecef3f21a46f/asttokens-3.0.0-py3-none-any.whl", hash = "sha256:e3078351a059199dd5138cb1c706e6430c05eff2ff136af5eb4790f9d28932e2", size = 26918, upload-time = "2024-11-30T04:30:10.946Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/d2/39/e7eaf1799466a4aef85b6a4fe7bd175ad2b1c6345066aa33f1f58d4b18d0/asttokens-3.0.1-py3-none-any.whl", hash = "sha256:15a3ebc0f43c2d0a50eeafea25e19046c68398e487b9f1f5b517f7c0f40f976a", size = 27047, upload-time = "2025-11-15T16:43:16.109Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -233,6 +233,34 @@ wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/f8/aa/5082412d1ee302e9e7d80b6949bc4d2a8fa1149aaab610c5fc24709605d6/authlib-1.6.5-py2.py3-none-any.whl", hash = "sha256:3e0e0507807f842b02175507bdee8957a1d5707fd4afb17c32fb43fee90b6e3a", size = 243608, upload-time = "2025-10-02T13:36:07.637Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "boto3"
|
||||
version = "1.40.74"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "botocore" },
|
||||
{ name = "jmespath" },
|
||||
{ name = "s3transfer" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/a2/37/0db5fc46548b347255310893f1a47971a1d8eb0dbc46dfb5ace8a1e7d45e/boto3-1.40.74.tar.gz", hash = "sha256:484e46bf394b03a7c31b34f90945ebe1390cb1e2ac61980d128a9079beac87d4", size = 111592, upload-time = "2025-11-14T20:29:10.991Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/d2/08/c52751748762901c0ca3c3019e3aa950010217f0fdf9940ebe68e6bb2f5a/boto3-1.40.74-py3-none-any.whl", hash = "sha256:41fc8844b37ae27b24bcabf8369769df246cc12c09453988d0696ad06d6aa9ef", size = 139360, upload-time = "2025-11-14T20:29:09.477Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "botocore"
|
||||
version = "1.40.74"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "jmespath" },
|
||||
{ name = "python-dateutil" },
|
||||
{ name = "urllib3" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/81/dc/0412505f05286f282a75bb0c650e525ddcfaf3f6f1a05cd8e99d32a2db06/botocore-1.40.74.tar.gz", hash = "sha256:57de0b9ffeada06015b3c7e5186c77d0692b210d9e5efa294f3214df97e2f8ee", size = 14452479, upload-time = "2025-11-14T20:29:00.949Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/7d/a2/306dec16e3c84f3ca7aaead0084358c1c7fbe6501f6160844cbc93bc871e/botocore-1.40.74-py3-none-any.whl", hash = "sha256:f39f5763e35e75f0bd91212b7b36120b1536203e8003cd952ef527db79702b15", size = 14117911, upload-time = "2025-11-14T20:28:58.153Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "caldav"
|
||||
version = "2.0.2.dev47+g3e44cf827"
|
||||
@@ -246,11 +274,11 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "certifi"
|
||||
version = "2025.10.5"
|
||||
version = "2025.11.12"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/4c/5b/b6ce21586237c77ce67d01dc5507039d444b630dd76611bbca2d8e5dcd91/certifi-2025.10.5.tar.gz", hash = "sha256:47c09d31ccf2acf0be3f701ea53595ee7e0b8fa08801c6624be771df09ae7b43", size = 164519, upload-time = "2025-10-05T04:12:15.808Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/a2/8c/58f469717fa48465e4a50c014a0400602d3c437d7c0c468e17ada824da3a/certifi-2025.11.12.tar.gz", hash = "sha256:d8ab5478f2ecd78af242878415affce761ca6bc54a22a27e026d7c25357c3316", size = 160538, upload-time = "2025-11-12T02:54:51.517Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/e4/37/af0d2ef3967ac0d6113837b44a4f0bfe1328c2b9763bd5b1744520e5cfed/certifi-2025.10.5-py3-none-any.whl", hash = "sha256:0f212c2744a9bb6de0c56639a6f68afe01ecd92d91f14ae897c4fe7bbeeef0de", size = 163286, upload-time = "2025-10-05T04:12:14.03Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/70/7d/9bc192684cea499815ff478dfcdc13835ddf401365057044fb721ec6bddb/certifi-2025.11.12-py3-none-any.whl", hash = "sha256:97de8790030bbd5c2d96b7ec782fc2f7820ef8dba6db909ccf95449f2d062d4b", size = 159438, upload-time = "2025-11-12T02:54:49.735Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -398,14 +426,14 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "click"
|
||||
version = "8.3.0"
|
||||
version = "8.3.1"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "colorama", marker = "sys_platform == 'win32'" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/46/61/de6cd827efad202d7057d93e0fed9294b96952e188f7384832791c7b2254/click-8.3.0.tar.gz", hash = "sha256:e7b8232224eba16f4ebe410c25ced9f7875cb5f3263ffc93cc3e8da705e229c4", size = 276943, upload-time = "2025-09-18T17:32:23.696Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/3d/fa/656b739db8587d7b5dfa22e22ed02566950fbfbcdc20311993483657a5c0/click-8.3.1.tar.gz", hash = "sha256:12ff4785d337a1bb490bb7e9c2b1ee5da3112e94a8622f26a6c77f5d2fc6842a", size = 295065, upload-time = "2025-11-15T20:45:42.706Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/db/d3/9dcc0f5797f070ec8edf30fbadfb200e71d9db6b84d211e3b2085a7589a0/click-8.3.0-py3-none-any.whl", hash = "sha256:9b9f285302c6e3064f4330c05f05b81945b2a39544279343e6e7c5f27a9baddc", size = 107295, upload-time = "2025-09-18T17:32:22.42Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/98/78/01c019cdb5d6498122777c1a43056ebb3ebfeef2076d9d026bfe15583b2b/click-8.3.1-py3-none-any.whl", hash = "sha256:981153a64e25f12d547d3426c367a4857371575ee7ad18df2a6183ab0545b2a6", size = 108274, upload-time = "2025-11-15T20:45:41.139Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -431,7 +459,7 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "commitizen"
|
||||
version = "4.9.1"
|
||||
version = "4.10.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "argcomplete" },
|
||||
@@ -447,9 +475,9 @@ dependencies = [
|
||||
{ name = "termcolor" },
|
||||
{ name = "tomlkit" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/77/19/927ac5b0eabb9451e2d5bb45b30813915c9a1260713b5b68eeb31358ea23/commitizen-4.9.1.tar.gz", hash = "sha256:b076b24657718f7a35b1068f2083bd39b4065d250164a1398d1dac235c51753b", size = 56610, upload-time = "2025-09-10T14:19:33.746Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/ab/b3/cc29794fc2ecd7aa7353105773ca18ecd761c3ba5b38879bd106b3fc8840/commitizen-4.10.0.tar.gz", hash = "sha256:cc58067403b9eff21d0423b3d9a29bda05254bd51ad5bdd1fd0594bff31277e1", size = 56820, upload-time = "2025-11-10T14:08:49.365Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/cf/49/577035b841442fe031b017027c3d99278b46104d227f0353c69dbbe55148/commitizen-4.9.1-py3-none-any.whl", hash = "sha256:4241b2ecae97b8109af8e587c36bc3b805a09b9a311084d159098e12d6ead497", size = 80624, upload-time = "2025-09-10T14:19:32.102Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b3/5d/2bd8881737d6a5652ae3ebc37736893b9a7425f0eb16e605d1ff2957267e/commitizen-4.10.0-py3-none-any.whl", hash = "sha256:3fe56c168b30b30b84b8329cba6b132e77b4eb304a5460bbe2186aad0f78c966", size = 81269, upload-time = "2025-11-10T14:08:48.001Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -1296,6 +1324,15 @@ wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/2f/9c/6753e6522b8d0ef07d3a3d239426669e984fb0eba15a315cdbc1253904e4/jiter-0.12.0-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c24e864cb30ab82311c6425655b0cdab0a98c5d973b065c66a3f020740c2324c", size = 346110, upload-time = "2025-11-09T20:49:21.817Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "jmespath"
|
||||
version = "1.0.1"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/00/2a/e867e8531cf3e36b41201936b7fa7ba7b5702dbef42922193f05c8976cd6/jmespath-1.0.1.tar.gz", hash = "sha256:90261b206d6defd58fdd5e85f478bf633a2901798906be2ad389150c5c60edbe", size = 25843, upload-time = "2022-06-17T18:00:12.224Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/31/b4/b9b800c45527aadd64d5b442f9b932b00648617eb5d63d2c7a6587b7cafc/jmespath-1.0.1-py3-none-any.whl", hash = "sha256:02e2e4cc71b5bcab88332eebf907519190dd9e6e82107fa7f83b1003a6252980", size = 20256, upload-time = "2022-06-17T18:00:10.251Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "jsonschema"
|
||||
version = "4.25.1"
|
||||
@@ -1538,7 +1575,7 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "mcp"
|
||||
version = "1.21.0"
|
||||
version = "1.21.1"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "anyio" },
|
||||
@@ -1552,11 +1589,13 @@ dependencies = [
|
||||
{ name = "pywin32", marker = "sys_platform == 'win32'" },
|
||||
{ name = "sse-starlette" },
|
||||
{ name = "starlette" },
|
||||
{ name = "typing-extensions" },
|
||||
{ name = "typing-inspection" },
|
||||
{ name = "uvicorn", marker = "sys_platform != 'emscripten'" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/33/54/dd2330ef4611c27ae59124820863c34e1d3edb1133c58e6375e2d938c9c5/mcp-1.21.0.tar.gz", hash = "sha256:bab0a38e8f8c48080d787233343f8d301b0e1e95846ae7dead251b2421d99855", size = 452697, upload-time = "2025-11-06T23:19:58.432Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/f7/25/4df633e7574254ada574822db2245bbee424725d1b01bccae10bf128794e/mcp-1.21.1.tar.gz", hash = "sha256:540e6ac4b12b085c43f14879fde04cbdb10148a09ea9492ff82d8c7ba651a302", size = 469071, upload-time = "2025-11-13T20:33:46.139Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/39/47/850b6edc96c03bd44b00de9a0ca3c1cc71e0ba1cd5822955bc9e4eb3fad3/mcp-1.21.0-py3-none-any.whl", hash = "sha256:598619e53eb0b7a6513db38c426b28a4bdf57496fed04332100d2c56acade98b", size = 173672, upload-time = "2025-11-06T23:19:56.508Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/49/af/01fb42df59ad15925ffc1e2e609adafddd3ac4572f606faae0dc8b55ba0c/mcp-1.21.1-py3-none-any.whl", hash = "sha256:dd35abe36d68530a8a1291daa25d50276d8731e545c0434d6e250a3700dd2a6d", size = 174852, upload-time = "2025-11-13T20:33:44.502Z" },
|
||||
]
|
||||
|
||||
[package.optional-dependencies]
|
||||
@@ -1818,16 +1857,19 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "nextcloud-mcp-server"
|
||||
version = "0.38.0"
|
||||
version = "0.41.0"
|
||||
source = { editable = "." }
|
||||
dependencies = [
|
||||
{ name = "aiosqlite" },
|
||||
{ name = "anthropic" },
|
||||
{ name = "authlib" },
|
||||
{ name = "boto3" },
|
||||
{ name = "caldav" },
|
||||
{ name = "click" },
|
||||
{ name = "fastembed" },
|
||||
{ name = "httpx" },
|
||||
{ name = "icalendar" },
|
||||
{ name = "jinja2" },
|
||||
{ name = "mcp", extra = ["cli"] },
|
||||
{ name = "opentelemetry-api" },
|
||||
{ name = "opentelemetry-exporter-otlp-proto-grpc" },
|
||||
@@ -1846,7 +1888,6 @@ dependencies = [
|
||||
|
||||
[package.dev-dependencies]
|
||||
dev = [
|
||||
{ name = "anthropic" },
|
||||
{ name = "commitizen" },
|
||||
{ name = "datasets" },
|
||||
{ name = "ipython" },
|
||||
@@ -1864,12 +1905,15 @@ dev = [
|
||||
[package.metadata]
|
||||
requires-dist = [
|
||||
{ name = "aiosqlite", specifier = ">=0.20.0" },
|
||||
{ name = "anthropic", specifier = ">=0.42.0" },
|
||||
{ name = "authlib", specifier = ">=1.6.5" },
|
||||
{ name = "boto3", specifier = ">=1.35.0" },
|
||||
{ name = "caldav", git = "https://github.com/cbcoutinho/caldav?branch=feature%2Fhttpx" },
|
||||
{ name = "click", specifier = ">=8.1.8" },
|
||||
{ name = "fastembed", specifier = ">=0.4.2" },
|
||||
{ name = "httpx", specifier = ">=0.28.1,<0.29.0" },
|
||||
{ name = "icalendar", specifier = ">=6.0.0,<7.0.0" },
|
||||
{ name = "jinja2", specifier = ">=3.1.6" },
|
||||
{ name = "mcp", extras = ["cli"], specifier = ">=1.21,<1.22" },
|
||||
{ name = "opentelemetry-api", specifier = ">=1.28.2" },
|
||||
{ name = "opentelemetry-exporter-otlp-proto-grpc", specifier = ">=1.28.2" },
|
||||
@@ -1888,7 +1932,6 @@ requires-dist = [
|
||||
|
||||
[package.metadata.requires-dev]
|
||||
dev = [
|
||||
{ name = "anthropic", specifier = ">=0.42.0" },
|
||||
{ name = "commitizen", specifier = ">=4.8.2" },
|
||||
{ name = "datasets", specifier = ">=3.3.0" },
|
||||
{ name = "ipython", specifier = ">=9.2.0" },
|
||||
@@ -2337,21 +2380,21 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "playwright"
|
||||
version = "1.55.0"
|
||||
version = "1.56.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "greenlet" },
|
||||
{ name = "pyee" },
|
||||
]
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/80/3a/c81ff76df266c62e24f19718df9c168f49af93cabdbc4608ae29656a9986/playwright-1.55.0-py3-none-macosx_10_13_x86_64.whl", hash = "sha256:d7da108a95001e412effca4f7610de79da1637ccdf670b1ae3fdc08b9694c034", size = 40428109, upload-time = "2025-08-28T15:46:20.357Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/cf/f5/bdb61553b20e907196a38d864602a9b4a461660c3a111c67a35179b636fa/playwright-1.55.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:8290cf27a5d542e2682ac274da423941f879d07b001f6575a5a3a257b1d4ba1c", size = 38687254, upload-time = "2025-08-28T15:46:23.925Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/4a/64/48b2837ef396487807e5ab53c76465747e34c7143fac4a084ef349c293a8/playwright-1.55.0-py3-none-macosx_11_0_universal2.whl", hash = "sha256:25b0d6b3fd991c315cca33c802cf617d52980108ab8431e3e1d37b5de755c10e", size = 40428108, upload-time = "2025-08-28T15:46:27.119Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/08/33/858312628aa16a6de97839adc2ca28031ebc5391f96b6fb8fdf1fcb15d6c/playwright-1.55.0-py3-none-manylinux1_x86_64.whl", hash = "sha256:c6d4d8f6f8c66c483b0835569c7f0caa03230820af8e500c181c93509c92d831", size = 45905643, upload-time = "2025-08-28T15:46:30.312Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/83/83/b8d06a5b5721931aa6d5916b83168e28bd891f38ff56fe92af7bdee9860f/playwright-1.55.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:29a0777c4ce1273acf90c87e4ae2fe0130182100d99bcd2ae5bf486093044838", size = 45296647, upload-time = "2025-08-28T15:46:33.221Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/06/2e/9db64518aebcb3d6ef6cd6d4d01da741aff912c3f0314dadb61226c6a96a/playwright-1.55.0-py3-none-win32.whl", hash = "sha256:29e6d1558ad9d5b5c19cbec0a72f6a2e35e6353cd9f262e22148685b86759f90", size = 35476046, upload-time = "2025-08-28T15:46:36.184Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/46/4f/9ba607fa94bb9cee3d4beb1c7b32c16efbfc9d69d5037fa85d10cafc618b/playwright-1.55.0-py3-none-win_amd64.whl", hash = "sha256:7eb5956473ca1951abb51537e6a0da55257bb2e25fc37c2b75af094a5c93736c", size = 35476048, upload-time = "2025-08-28T15:46:38.867Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/21/98/5ca173c8ec906abde26c28e1ecb34887343fd71cc4136261b90036841323/playwright-1.55.0-py3-none-win_arm64.whl", hash = "sha256:012dc89ccdcbd774cdde8aeee14c08e0dd52ddb9135bf10e9db040527386bd76", size = 31225543, upload-time = "2025-08-28T15:46:41.613Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/6b/31/a5362cee43f844509f1f10d8a27c9cc0e2f7bdce5353d304d93b2151c1b1/playwright-1.56.0-py3-none-macosx_10_13_x86_64.whl", hash = "sha256:b33eb89c516cbc6723f2e3523bada4a4eb0984a9c411325c02d7016a5d625e9c", size = 40611424, upload-time = "2025-11-11T18:39:10.175Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ef/95/347eef596d8778fb53590dc326c344d427fa19ba3d42b646fce2a4572eb3/playwright-1.56.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:b228b3395212b9472a4ee5f1afe40d376eef9568eb039fcb3e563de8f4f4657b", size = 39400228, upload-time = "2025-11-11T18:39:13.915Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b9/54/6ad97b08b2ca1dfcb4fbde4536c4f45c0d9d8b1857a2d20e7bbfdf43bf15/playwright-1.56.0-py3-none-macosx_11_0_universal2.whl", hash = "sha256:0ef7e6fd653267798a8a968ff7aa2dcac14398b7dd7440ef57524e01e0fbbd65", size = 40611424, upload-time = "2025-11-11T18:39:17.093Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e4/76/6d409e37e82cdd5dda3df1ab958130ae32b46e42458bd4fc93d7eb8749cb/playwright-1.56.0-py3-none-manylinux1_x86_64.whl", hash = "sha256:404be089b49d94bc4c1fe0dfb07664bda5ffe87789034a03bffb884489bdfb5c", size = 46263122, upload-time = "2025-11-11T18:39:20.619Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/4f/84/fb292cc5d45f3252e255ea39066cd1d2385c61c6c1596548dfbf59c88605/playwright-1.56.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:64cda7cf4e51c0d35dab55190841bfcdfb5871685ec22cb722cd0ad2df183e34", size = 46110645, upload-time = "2025-11-11T18:39:24.005Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/61/bd/8c02c3388ae14edc374ac9f22cbe4e14826c6a51b2d8eaf86e89fabee264/playwright-1.56.0-py3-none-win32.whl", hash = "sha256:d87b79bcb082092d916a332c27ec9732e0418c319755d235d93cc6be13bdd721", size = 35639837, upload-time = "2025-11-11T18:39:27.174Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/64/27/f13b538fbc6b7a00152f4379054a49f6abc0bf55ac86f677ae54bc49fb82/playwright-1.56.0-py3-none-win_amd64.whl", hash = "sha256:3c7fc49bb9e673489bf2622855f9486d41c5101bbed964638552b864c4591f94", size = 35639843, upload-time = "2025-11-11T18:39:30.851Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/f2/c7/3ee8b556107995846576b4fe42a08ed49b8677619421f2afacf6ee421138/playwright-1.56.0-py3-none-win_arm64.whl", hash = "sha256:2745490ae8dd58d27e5ea4d9aa28402e8e2991eb84fb4b2fd5fbde2106716f6f", size = 31248959, upload-time = "2025-11-11T18:39:33.998Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -2497,17 +2540,17 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "protobuf"
|
||||
version = "6.33.0"
|
||||
version = "6.33.1"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/19/ff/64a6c8f420818bb873713988ca5492cba3a7946be57e027ac63495157d97/protobuf-6.33.0.tar.gz", hash = "sha256:140303d5c8d2037730c548f8c7b93b20bb1dc301be280c378b82b8894589c954", size = 443463, upload-time = "2025-10-15T20:39:52.159Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/0a/03/a1440979a3f74f16cab3b75b0da1a1a7f922d56a8ddea96092391998edc0/protobuf-6.33.1.tar.gz", hash = "sha256:97f65757e8d09870de6fd973aeddb92f85435607235d20b2dfed93405d00c85b", size = 443432, upload-time = "2025-11-13T16:44:18.895Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/7e/ee/52b3fa8feb6db4a833dfea4943e175ce645144532e8a90f72571ad85df4e/protobuf-6.33.0-cp310-abi3-win32.whl", hash = "sha256:d6101ded078042a8f17959eccd9236fb7a9ca20d3b0098bbcb91533a5680d035", size = 425593, upload-time = "2025-10-15T20:39:40.29Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/7b/c6/7a465f1825872c55e0341ff4a80198743f73b69ce5d43ab18043699d1d81/protobuf-6.33.0-cp310-abi3-win_amd64.whl", hash = "sha256:9a031d10f703f03768f2743a1c403af050b6ae1f3480e9c140f39c45f81b13ee", size = 436882, upload-time = "2025-10-15T20:39:42.841Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e1/a9/b6eee662a6951b9c3640e8e452ab3e09f117d99fc10baa32d1581a0d4099/protobuf-6.33.0-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:905b07a65f1a4b72412314082c7dbfae91a9e8b68a0cc1577515f8df58ecf455", size = 427521, upload-time = "2025-10-15T20:39:43.803Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/10/35/16d31e0f92c6d2f0e77c2a3ba93185130ea13053dd16200a57434c882f2b/protobuf-6.33.0-cp39-abi3-manylinux2014_aarch64.whl", hash = "sha256:e0697ece353e6239b90ee43a9231318302ad8353c70e6e45499fa52396debf90", size = 324445, upload-time = "2025-10-15T20:39:44.932Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e6/eb/2a981a13e35cda8b75b5585aaffae2eb904f8f351bdd3870769692acbd8a/protobuf-6.33.0-cp39-abi3-manylinux2014_s390x.whl", hash = "sha256:e0a1715e4f27355afd9570f3ea369735afc853a6c3951a6afe1f80d8569ad298", size = 339159, upload-time = "2025-10-15T20:39:46.186Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/21/51/0b1cbad62074439b867b4e04cc09b93f6699d78fd191bed2bbb44562e077/protobuf-6.33.0-cp39-abi3-manylinux2014_x86_64.whl", hash = "sha256:35be49fd3f4fefa4e6e2aacc35e8b837d6703c37a2168a55ac21e9b1bc7559ef", size = 323172, upload-time = "2025-10-15T20:39:47.465Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/07/d1/0a28c21707807c6aacd5dc9c3704b2aa1effbf37adebd8caeaf68b17a636/protobuf-6.33.0-py3-none-any.whl", hash = "sha256:25c9e1963c6734448ea2d308cfa610e692b801304ba0908d7bfa564ac5132995", size = 170477, upload-time = "2025-10-15T20:39:51.311Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/06/f1/446a9bbd2c60772ca36556bac8bfde40eceb28d9cc7838755bc41e001d8f/protobuf-6.33.1-cp310-abi3-win32.whl", hash = "sha256:f8d3fdbc966aaab1d05046d0240dd94d40f2a8c62856d41eaa141ff64a79de6b", size = 425593, upload-time = "2025-11-13T16:44:06.275Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a6/79/8780a378c650e3df849b73de8b13cf5412f521ca2ff9b78a45c247029440/protobuf-6.33.1-cp310-abi3-win_amd64.whl", hash = "sha256:923aa6d27a92bf44394f6abf7ea0500f38769d4b07f4be41cb52bd8b1123b9ed", size = 436883, upload-time = "2025-11-13T16:44:09.222Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/cd/93/26213ff72b103ae55bb0d73e7fb91ea570ef407c3ab4fd2f1f27cac16044/protobuf-6.33.1-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:fe34575f2bdde76ac429ec7b570235bf0c788883e70aee90068e9981806f2490", size = 427522, upload-time = "2025-11-13T16:44:10.475Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/c2/32/df4a35247923393aa6b887c3b3244a8c941c32a25681775f96e2b418f90e/protobuf-6.33.1-cp39-abi3-manylinux2014_aarch64.whl", hash = "sha256:f8adba2e44cde2d7618996b3fc02341f03f5bc3f2748be72dc7b063319276178", size = 324445, upload-time = "2025-11-13T16:44:11.869Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/8e/d0/d796e419e2ec93d2f3fa44888861c3f88f722cde02b7c3488fcc6a166820/protobuf-6.33.1-cp39-abi3-manylinux2014_s390x.whl", hash = "sha256:0f4cf01222c0d959c2b399142deb526de420be8236f22c71356e2a544e153c53", size = 339161, upload-time = "2025-11-13T16:44:12.778Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/1d/2a/3c5f05a4af06649547027d288747f68525755de692a26a7720dced3652c0/protobuf-6.33.1-cp39-abi3-manylinux2014_x86_64.whl", hash = "sha256:8fd7d5e0eb08cd5b87fd3df49bc193f5cfd778701f47e11d127d0afc6c39f1d1", size = 323171, upload-time = "2025-11-13T16:44:14.035Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/08/b4/46310463b4f6ceef310f8348786f3cff181cea671578e3d9743ba61a459e/protobuf-6.33.1-py3-none-any.whl", hash = "sha256:d595a9fd694fdeb061a62fbe10eb039cc1e444df81ec9bb70c7fc59ebcb1eafa", size = 170477, upload-time = "2025-11-13T16:44:17.633Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -2739,16 +2782,16 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "pydantic-settings"
|
||||
version = "2.11.0"
|
||||
version = "2.12.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "pydantic" },
|
||||
{ name = "python-dotenv" },
|
||||
{ name = "typing-inspection" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/20/c5/dbbc27b814c71676593d1c3f718e6cd7d4f00652cefa24b75f7aa3efb25e/pydantic_settings-2.11.0.tar.gz", hash = "sha256:d0e87a1c7d33593beb7194adb8470fc426e95ba02af83a0f23474a04c9a08180", size = 188394, upload-time = "2025-09-24T14:19:11.764Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/43/4b/ac7e0aae12027748076d72a8764ff1c9d82ca75a7a52622e67ed3f765c54/pydantic_settings-2.12.0.tar.gz", hash = "sha256:005538ef951e3c2a68e1c08b292b5f2e71490def8589d4221b95dab00dafcfd0", size = 194184, upload-time = "2025-11-10T14:25:47.013Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/83/d6/887a1ff844e64aa823fb4905978d882a633cfe295c32eacad582b78a7d8b/pydantic_settings-2.11.0-py3-none-any.whl", hash = "sha256:fe2cea3413b9530d10f3a5875adffb17ada5c1e1bab0b2885546d7310415207c", size = 48608, upload-time = "2025-09-24T14:19:10.015Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/c1/60/5d4751ba3f4a40a6891f24eec885f51afd78d208498268c734e256fb13c4/pydantic_settings-2.12.0-py3-none-any.whl", hash = "sha256:fddb9fd99a5b18da837b29710391e945b1e30c135477f484084ee513adb93809", size = 51880, upload-time = "2025-11-10T14:25:45.546Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -2813,15 +2856,15 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "pytest-asyncio"
|
||||
version = "1.2.0"
|
||||
version = "1.3.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "pytest" },
|
||||
{ name = "typing-extensions", marker = "python_full_version < '3.13'" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/42/86/9e3c5f48f7b7b638b216e4b9e645f54d199d7abbbab7a64a13b4e12ba10f/pytest_asyncio-1.2.0.tar.gz", hash = "sha256:c609a64a2a8768462d0c99811ddb8bd2583c33fd33cf7f21af1c142e824ffb57", size = 50119, upload-time = "2025-09-12T07:33:53.816Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/90/2c/8af215c0f776415f3590cac4f9086ccefd6fd463befeae41cd4d3f193e5a/pytest_asyncio-1.3.0.tar.gz", hash = "sha256:d7f52f36d231b80ee124cd216ffb19369aa168fc10095013c6b014a34d3ee9e5", size = 50087, upload-time = "2025-11-10T16:07:47.256Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/04/93/2fa34714b7a4ae72f2f8dad66ba17dd9a2c793220719e736dda28b7aec27/pytest_asyncio-1.2.0-py3-none-any.whl", hash = "sha256:8e17ae5e46d8e7efe51ab6494dd2010f4ca8dae51652aa3c8d55acf50bfb2e99", size = 15095, upload-time = "2025-09-12T07:33:52.639Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e5/35/f8b19922b6a25bc0880171a2f1a003eaeb93657475193ab516fd87cac9da/pytest_asyncio-1.3.0-py3-none-any.whl", hash = "sha256:611e26147c7f77640e6d0a92a38ed17c3e9848063698d5c93d5aa7aa11cebff5", size = 15075, upload-time = "2025-11-10T16:07:45.537Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -3244,28 +3287,40 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "ruff"
|
||||
version = "0.14.4"
|
||||
version = "0.14.5"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/df/55/cccfca45157a2031dcbb5a462a67f7cf27f8b37d4b3b1cd7438f0f5c1df6/ruff-0.14.4.tar.gz", hash = "sha256:f459a49fe1085a749f15414ca76f61595f1a2cc8778ed7c279b6ca2e1fd19df3", size = 5587844, upload-time = "2025-11-06T22:07:45.033Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/82/fa/fbb67a5780ae0f704876cb8ac92d6d76da41da4dc72b7ed3565ab18f2f52/ruff-0.14.5.tar.gz", hash = "sha256:8d3b48d7d8aad423d3137af7ab6c8b1e38e4de104800f0d596990f6ada1a9fc1", size = 5615944, upload-time = "2025-11-13T19:58:51.155Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/17/b9/67240254166ae1eaa38dec32265e9153ac53645a6c6670ed36ad00722af8/ruff-0.14.4-py3-none-linux_armv6l.whl", hash = "sha256:e6604613ffbcf2297cd5dcba0e0ac9bd0c11dc026442dfbb614504e87c349518", size = 12606781, upload-time = "2025-11-06T22:07:01.841Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/46/c8/09b3ab245d8652eafe5256ab59718641429f68681ee713ff06c5c549f156/ruff-0.14.4-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:d99c0b52b6f0598acede45ee78288e5e9b4409d1ce7f661f0fa36d4cbeadf9a4", size = 12946765, upload-time = "2025-11-06T22:07:05.858Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/14/bb/1564b000219144bf5eed2359edc94c3590dd49d510751dad26202c18a17d/ruff-0.14.4-py3-none-macosx_11_0_arm64.whl", hash = "sha256:9358d490ec030f1b51d048a7fd6ead418ed0826daf6149e95e30aa67c168af33", size = 11928120, upload-time = "2025-11-06T22:07:08.023Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a3/92/d5f1770e9988cc0742fefaa351e840d9aef04ec24ae1be36f333f96d5704/ruff-0.14.4-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:81b40d27924f1f02dfa827b9c0712a13c0e4b108421665322218fc38caf615c2", size = 12370877, upload-time = "2025-11-06T22:07:10.015Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e2/29/e9282efa55f1973d109faf839a63235575519c8ad278cc87a182a366810e/ruff-0.14.4-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f5e649052a294fe00818650712083cddc6cc02744afaf37202c65df9ea52efa5", size = 12408538, upload-time = "2025-11-06T22:07:13.085Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/8e/01/930ed6ecfce130144b32d77d8d69f5c610e6d23e6857927150adf5d7379a/ruff-0.14.4-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:aa082a8f878deeba955531f975881828fd6afd90dfa757c2b0808aadb437136e", size = 13141942, upload-time = "2025-11-06T22:07:15.386Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/6a/46/a9c89b42b231a9f487233f17a89cbef9d5acd538d9488687a02ad288fa6b/ruff-0.14.4-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:1043c6811c2419e39011890f14d0a30470f19d47d197c4858b2787dfa698f6c8", size = 14544306, upload-time = "2025-11-06T22:07:17.631Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/78/96/9c6cf86491f2a6d52758b830b89b78c2ae61e8ca66b86bf5a20af73d20e6/ruff-0.14.4-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a9f3a936ac27fb7c2a93e4f4b943a662775879ac579a433291a6f69428722649", size = 14210427, upload-time = "2025-11-06T22:07:19.832Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/71/f4/0666fe7769a54f63e66404e8ff698de1dcde733e12e2fd1c9c6efb689cb5/ruff-0.14.4-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:95643ffd209ce78bc113266b88fba3d39e0461f0cbc8b55fb92505030fb4a850", size = 13658488, upload-time = "2025-11-06T22:07:22.32Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ee/79/6ad4dda2cfd55e41ac9ed6d73ef9ab9475b1eef69f3a85957210c74ba12c/ruff-0.14.4-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:456daa2fa1021bc86ca857f43fe29d5d8b3f0e55e9f90c58c317c1dcc2afc7b5", size = 13354908, upload-time = "2025-11-06T22:07:24.347Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b5/60/f0b6990f740bb15c1588601d19d21bcc1bd5de4330a07222041678a8e04f/ruff-0.14.4-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:f911bba769e4a9f51af6e70037bb72b70b45a16db5ce73e1f72aefe6f6d62132", size = 13587803, upload-time = "2025-11-06T22:07:26.327Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/c9/da/eaaada586f80068728338e0ef7f29ab3e4a08a692f92eb901a4f06bbff24/ruff-0.14.4-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:76158a7369b3979fa878612c623a7e5430c18b2fd1c73b214945c2d06337db67", size = 12279654, upload-time = "2025-11-06T22:07:28.46Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/66/d4/b1d0e82cf9bf8aed10a6d45be47b3f402730aa2c438164424783ac88c0ed/ruff-0.14.4-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:f3b8f3b442d2b14c246e7aeca2e75915159e06a3540e2f4bed9f50d062d24469", size = 12357520, upload-time = "2025-11-06T22:07:31.468Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/04/f4/53e2b42cc82804617e5c7950b7079d79996c27e99c4652131c6a1100657f/ruff-0.14.4-py3-none-musllinux_1_2_i686.whl", hash = "sha256:c62da9a06779deecf4d17ed04939ae8b31b517643b26370c3be1d26f3ef7dbde", size = 12719431, upload-time = "2025-11-06T22:07:33.831Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a2/94/80e3d74ed9a72d64e94a7b7706b1c1ebaa315ef2076fd33581f6a1cd2f95/ruff-0.14.4-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:5a443a83a1506c684e98acb8cb55abaf3ef725078be40237463dae4463366349", size = 13464394, upload-time = "2025-11-06T22:07:35.905Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/54/1a/a49f071f04c42345c793d22f6cf5e0920095e286119ee53a64a3a3004825/ruff-0.14.4-py3-none-win32.whl", hash = "sha256:643b69cb63cd996f1fc7229da726d07ac307eae442dd8974dbc7cf22c1e18fff", size = 12493429, upload-time = "2025-11-06T22:07:38.43Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/bc/22/e58c43e641145a2b670328fb98bc384e20679b5774258b1e540207580266/ruff-0.14.4-py3-none-win_amd64.whl", hash = "sha256:26673da283b96fe35fa0c939bf8411abec47111644aa9f7cfbd3c573fb125d2c", size = 13635380, upload-time = "2025-11-06T22:07:40.496Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/30/bd/4168a751ddbbf43e86544b4de8b5c3b7be8d7167a2a5cb977d274e04f0a1/ruff-0.14.4-py3-none-win_arm64.whl", hash = "sha256:dd09c292479596b0e6fec8cd95c65c3a6dc68e9ad17b8f2382130f87ff6a75bb", size = 12663065, upload-time = "2025-11-06T22:07:42.603Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/68/31/c07e9c535248d10836a94e4f4e8c5a31a1beed6f169b31405b227872d4f4/ruff-0.14.5-py3-none-linux_armv6l.whl", hash = "sha256:f3b8248123b586de44a8018bcc9fefe31d23dda57a34e6f0e1e53bd51fd63594", size = 13171630, upload-time = "2025-11-13T19:57:54.894Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/8e/5c/283c62516dca697cd604c2796d1487396b7a436b2f0ecc3fd412aca470e0/ruff-0.14.5-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:f7a75236570318c7a30edd7f5491945f0169de738d945ca8784500b517163a72", size = 13413925, upload-time = "2025-11-13T19:57:59.181Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b6/f3/aa319f4afc22cb6fcba2b9cdfc0f03bbf747e59ab7a8c5e90173857a1361/ruff-0.14.5-py3-none-macosx_11_0_arm64.whl", hash = "sha256:6d146132d1ee115f8802356a2dc9a634dbf58184c51bff21f313e8cd1c74899a", size = 12574040, upload-time = "2025-11-13T19:58:02.056Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/f9/7f/cb5845fcc7c7e88ed57f58670189fc2ff517fe2134c3821e77e29fd3b0c8/ruff-0.14.5-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e2380596653dcd20b057794d55681571a257a42327da8894b93bbd6111aa801f", size = 13009755, upload-time = "2025-11-13T19:58:05.172Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/21/d2/bcbedbb6bcb9253085981730687ddc0cc7b2e18e8dc13cf4453de905d7a0/ruff-0.14.5-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:2d1fa985a42b1f075a098fa1ab9d472b712bdb17ad87a8ec86e45e7fa6273e68", size = 12937641, upload-time = "2025-11-13T19:58:08.345Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a4/58/e25de28a572bdd60ffc6bb71fc7fd25a94ec6a076942e372437649cbb02a/ruff-0.14.5-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:88f0770d42b7fa02bbefddde15d235ca3aa24e2f0137388cc15b2dcbb1f7c7a7", size = 13610854, upload-time = "2025-11-13T19:58:11.419Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/7d/24/43bb3fd23ecee9861970978ea1a7a63e12a204d319248a7e8af539984280/ruff-0.14.5-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:3676cb02b9061fee7294661071c4709fa21419ea9176087cb77e64410926eb78", size = 15061088, upload-time = "2025-11-13T19:58:14.551Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/23/44/a022f288d61c2f8c8645b24c364b719aee293ffc7d633a2ca4d116b9c716/ruff-0.14.5-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b595bedf6bc9cab647c4a173a61acf4f1ac5f2b545203ba82f30fcb10b0318fb", size = 14734717, upload-time = "2025-11-13T19:58:17.518Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/58/81/5c6ba44de7e44c91f68073e0658109d8373b0590940efe5bd7753a2585a3/ruff-0.14.5-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f55382725ad0bdb2e8ee2babcbbfb16f124f5a59496a2f6a46f1d9d99d93e6e2", size = 14028812, upload-time = "2025-11-13T19:58:20.533Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ad/ef/41a8b60f8462cb320f68615b00299ebb12660097c952c600c762078420f8/ruff-0.14.5-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7497d19dce23976bdaca24345ae131a1d38dcfe1b0850ad8e9e6e4fa321a6e19", size = 13825656, upload-time = "2025-11-13T19:58:23.345Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/7c/00/207e5de737fdb59b39eb1fac806904fe05681981b46d6a6db9468501062e/ruff-0.14.5-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:410e781f1122d6be4f446981dd479470af86537fb0b8857f27a6e872f65a38e4", size = 13959922, upload-time = "2025-11-13T19:58:26.537Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/bc/7e/fa1f5c2776db4be405040293618846a2dece5c70b050874c2d1f10f24776/ruff-0.14.5-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:c01be527ef4c91a6d55e53b337bfe2c0f82af024cc1a33c44792d6844e2331e1", size = 12932501, upload-time = "2025-11-13T19:58:29.822Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/67/d8/d86bf784d693a764b59479a6bbdc9515ae42c340a5dc5ab1dabef847bfaa/ruff-0.14.5-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:f66e9bb762e68d66e48550b59c74314168ebb46199886c5c5aa0b0fbcc81b151", size = 12927319, upload-time = "2025-11-13T19:58:32.923Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ac/de/ee0b304d450ae007ce0cb3e455fe24fbcaaedae4ebaad6c23831c6663651/ruff-0.14.5-py3-none-musllinux_1_2_i686.whl", hash = "sha256:d93be8f1fa01022337f1f8f3bcaa7ffee2d0b03f00922c45c2207954f351f465", size = 13206209, upload-time = "2025-11-13T19:58:35.952Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/33/aa/193ca7e3a92d74f17d9d5771a765965d2cf42c86e6f0fd95b13969115723/ruff-0.14.5-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:c135d4b681f7401fe0e7312017e41aba9b3160861105726b76cfa14bc25aa367", size = 13953709, upload-time = "2025-11-13T19:58:39.002Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/cc/f1/7119e42aa1d3bf036ffc9478885c2e248812b7de9abea4eae89163d2929d/ruff-0.14.5-py3-none-win32.whl", hash = "sha256:c83642e6fccfb6dea8b785eb9f456800dcd6a63f362238af5fc0c83d027dd08b", size = 12925808, upload-time = "2025-11-13T19:58:42.779Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3b/9d/7c0a255d21e0912114784e4a96bf62af0618e2190cae468cd82b13625ad2/ruff-0.14.5-py3-none-win_amd64.whl", hash = "sha256:9d55d7af7166f143c94eae1db3312f9ea8f95a4defef1979ed516dbb38c27621", size = 14331546, upload-time = "2025-11-13T19:58:45.691Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e5/80/69756670caedcf3b9be597a6e12276a6cf6197076eb62aad0c608f8efce0/ruff-0.14.5-py3-none-win_arm64.whl", hash = "sha256:4b700459d4649e2594b31f20a9de33bc7c19976d4746d8d0798ad959621d64a4", size = 13433331, upload-time = "2025-11-13T19:58:48.434Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "s3transfer"
|
||||
version = "0.14.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "botocore" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/62/74/8d69dcb7a9efe8baa2046891735e5dfe433ad558ae23d9e3c14c633d1d58/s3transfer-0.14.0.tar.gz", hash = "sha256:eff12264e7c8b4985074ccce27a3b38a485bb7f7422cc8046fee9be4983e4125", size = 151547, upload-time = "2025-09-09T19:23:31.089Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/48/f0/ae7ca09223a81a1d890b2557186ea015f6e0502e9b8cb8e1813f1d8cfa4e/s3transfer-0.14.0-py3-none-any.whl", hash = "sha256:ea3b790c7077558ed1f02a3072fb3cb992bbbd253392f4b6e9e8976941c7d456", size = 85712, upload-time = "2025-09-09T19:23:30.041Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -3470,27 +3525,27 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "ty"
|
||||
version = "0.0.1a25"
|
||||
version = "0.0.1a26"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/f6/6b/e73bc3c1039ea72936158a08313155a49e5aa5e7db5205a149fe516a4660/ty-0.0.1a25.tar.gz", hash = "sha256:5550b24b9dd0e0f8b4b2c1f0fcc608a55d0421dd67b6c364bc7bf25762334511", size = 4403670, upload-time = "2025-10-29T19:40:23.647Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/39/39/b4b4ecb6ca6d7e937fa56f0b92a8f48d7719af8fe55bdbf667638e9f93e2/ty-0.0.1a26.tar.gz", hash = "sha256:65143f8efeb2da1644821b710bf6b702a31ddcf60a639d5a576db08bded91db4", size = 4432154, upload-time = "2025-11-10T18:02:30.142Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/8f/3b/4457231238a2eeb04cba4ba7cc33d735be68ee46ca40a98ae30e187de864/ty-0.0.1a25-py3-none-linux_armv6l.whl", hash = "sha256:d35b2c1f94a014a22875d2745aa0432761d2a9a8eb7212630d5caf547daeef6d", size = 8878803, upload-time = "2025-10-29T19:39:42.243Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/8a/fa/a328713dd310018fc7a381693d8588185baa2fdae913e01a6839187215df/ty-0.0.1a25-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:192edac94675a468bac7f6e04687a77a64698e4e1fe01f6a048bf9b6dde5b703", size = 8695667, upload-time = "2025-10-29T19:39:45.179Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/22/e8/5707939118992ced2bf5385adc3ede7723c1b717b07ad14c495eea1e47b4/ty-0.0.1a25-py3-none-macosx_11_0_arm64.whl", hash = "sha256:949523621f336e01bc7d687b7bd08fe838edadbdb6563c2c057ed1d264e820cf", size = 8159012, upload-time = "2025-10-29T19:39:47.011Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/eb/fb/ff313aa71602225cd78f1bce3017713d6d1b1c1e0fa8101ead4594a60d95/ty-0.0.1a25-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:94f78f621458c05e59e890061021198197f29a7b51a33eda82bbb036e7ed73d7", size = 8433675, upload-time = "2025-10-29T19:39:48.443Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/c0/8d/cc7e7fb57215a15b575a43ed042bdd92971871e0decec1b26d2e7d969465/ty-0.0.1a25-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:d9656fca8062a2c6709c30d76d662c96d2e7dbfee8f70e55ec6b6afd67b5d447", size = 8668456, upload-time = "2025-10-29T19:39:50.412Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b8/6d/d7bf5909ed2dcdcbc1e2ca7eea80929893e2d188d9c36b3fcb2b36532ff6/ty-0.0.1a25-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a9f3bbf523b49935bbd76e230408d858dce0d614f44f5807bbbd0954f64e0f01", size = 9023543, upload-time = "2025-10-29T19:39:52.292Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b4/b8/72bcefb4be32e5a84f0b21de2552f16cdb4cae3eb271ac891c8199c26b1a/ty-0.0.1a25-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:f13ea9815f4a54a0a303ca7bf411b0650e3c2a24fc6c7889ffba2c94f5e97a6a", size = 9700013, upload-time = "2025-10-29T19:39:57.283Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/90/0d/cf7e794b840cf6b0bbecb022e593c543f85abad27a582241cf2095048cb1/ty-0.0.1a25-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:eab6e33ebe202a71a50c3d5a5580e3bc1a85cda3ffcdc48cec3f1c693b7a873b", size = 9372574, upload-time = "2025-10-29T19:40:04.532Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/1e/71/2d35e7d51b48eabd330e2f7b7e0bce541cbd95950c4d2f780e85f3366af1/ty-0.0.1a25-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f6b9a31da43424cdab483703a54a561b93aabba84630788505329fc5294a9c62", size = 9535726, upload-time = "2025-10-29T19:40:06.548Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/57/d3/01ecc23bbd8f3e0dfbcf9172d06d84e88155c5f416f1491137e8066fd859/ty-0.0.1a25-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0a90d897a7c1a5ae9b41a4c7b0a42262a06361476ad88d783dbedd7913edadbc", size = 9003380, upload-time = "2025-10-29T19:40:08.683Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/de/f9/cde9380d8a1a6ca61baeb9aecb12cbec90d489aa929be55cd78ad5c2ccd9/ty-0.0.1a25-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:93c7e7ab2859af0f866d34d27f4ae70dd4fb95b847387f082de1197f9f34e068", size = 8401833, upload-time = "2025-10-29T19:40:10.627Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/0b/39/0acf3625b0c495011795a391016b572f97a812aca1d67f7a76621fdb9ebf/ty-0.0.1a25-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:4a247061bd32bae3865a236d7f8b6c9916c80995db30ae1600999010f90623a9", size = 8706761, upload-time = "2025-10-29T19:40:12.575Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/25/73/7de1648f3563dd9d416d36ab5f1649bfd7b47a179135027f31d44b89a246/ty-0.0.1a25-py3-none-musllinux_1_2_i686.whl", hash = "sha256:1711dd587eccf04fd50c494dc39babe38f4cb345bc3901bf1d8149cac570e979", size = 8792426, upload-time = "2025-10-29T19:40:14.553Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/7d/8a/b6e761a65eac7acd10b2e452f49b2d8ae0ea163ca36bb6b18b2dadae251b/ty-0.0.1a25-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:5f4c9b0cf7995e2e3de9bab4d066063dea92019f2f62673b7574e3612643dd35", size = 9103991, upload-time = "2025-10-29T19:40:16.332Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e4/25/9324ae947fcc4322470326cf8276a3fc2f08dc82adec1de79d963fdf7af5/ty-0.0.1a25-py3-none-win32.whl", hash = "sha256:168fc8aee396d617451acc44cd28baffa47359777342836060c27aa6f37e2445", size = 8387095, upload-time = "2025-10-29T19:40:18.368Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3b/2b/cb12cbc7db1ba310aa7b1de9b4e018576f653105993736c086ee67d2ec02/ty-0.0.1a25-py3-none-win_amd64.whl", hash = "sha256:a2fad3d8e92bb4d57a8872a6f56b1aef54539d36f23ebb01abe88ac4338efafb", size = 9059225, upload-time = "2025-10-29T19:40:20.278Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/2f/c1/f6be8cdd0bf387c1d8ee9d14bb299b7b5d2c0532f550a6693216a32ec0c5/ty-0.0.1a25-py3-none-win_arm64.whl", hash = "sha256:dde2962d448ed87c48736e9a4bb13715a4cced705525e732b1c0dac1d4c66e3d", size = 8536832, upload-time = "2025-10-29T19:40:22.014Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/cc/6a/661833ecacc4d994f7e30a7f1307bfd3a4a91392a6b03fb6a018723e75b8/ty-0.0.1a26-py3-none-linux_armv6l.whl", hash = "sha256:09208dca99bb548e9200136d4d42618476bfe1f4d2066511f2c8e2e4dfeced5e", size = 9173869, upload-time = "2025-11-10T18:01:46.012Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/66/a8/32ea50f064342de391a7267f84349287e2f1c2eb0ad4811d6110916179d6/ty-0.0.1a26-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:91d12b66c91a1b82e698a2aa73fe043a1a9da83ff0dfd60b970500bee0963b91", size = 8973420, upload-time = "2025-11-10T18:01:49.32Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/d1/f6/6659d55940cd5158a6740ae46a65be84a7ee9167738033a9b1259c36eef5/ty-0.0.1a26-py3-none-macosx_11_0_arm64.whl", hash = "sha256:c5bc6dfcea5477c81ad01d6a29ebc9bfcbdb21c34664f79c9e1b84be7aa8f289", size = 8528888, upload-time = "2025-11-10T18:01:51.511Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/79/c9/4cbe7295013cc412b4f100b509aaa21982c08c59764a2efa537ead049345/ty-0.0.1a26-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:40e5d15635e9918924138e8d3fb1cbf80822dfb8dc36ea8f3e72df598c0c4bea", size = 8801867, upload-time = "2025-11-10T18:01:53.888Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ed/b3/25099b219a6444c4b29f175784a275510c1cd85a23a926d687ab56915027/ty-0.0.1a26-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:86dc147ed0790c7c8fd3f0d6c16c3c5135b01e99c440e89c6ca1e0e592bb6682", size = 8975519, upload-time = "2025-11-10T18:01:56.231Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/73/3e/3ad570f4f592cb1d11982dd2c426c90d2aa9f3d38bf77a7e2ce8aa614302/ty-0.0.1a26-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:fbe0e07c9d5e624edfc79a468f2ef191f9435581546a5bb6b92713ddc86ad4a6", size = 9331932, upload-time = "2025-11-10T18:01:58.476Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/04/fa/62c72eead0302787f9cc0d613fc671107afeecdaf76ebb04db8f91bb9f7e/ty-0.0.1a26-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:0dcebbfe9f24b43d98a078f4a41321ae7b08bea40f5c27d81394b3f54e9f7fb5", size = 9921353, upload-time = "2025-11-10T18:02:00.749Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/6c/1f/3b329c4b60d878704e09eb9d05467f911f188e699961c044b75932893e0a/ty-0.0.1a26-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:0901b75afc7738224ffc98bbc8ea03a20f167a2a83a4b23a6550115e8b3ddbc6", size = 9700800, upload-time = "2025-11-10T18:02:03.544Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/92/24/13fcba20dd86a7c3f83c814279aa3eb6a29c5f1b38a3b3a4a0fd22159189/ty-0.0.1a26-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4788f34d384c132977958d76fef7f274f8d181b22e33933c4d16cff2bb5ca3b9", size = 9728289, upload-time = "2025-11-10T18:02:06.386Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/40/7a/798894ff0b948425570b969be35e672693beeb6b852815b7340bc8de1575/ty-0.0.1a26-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b98851c11c560ce63cd972ed9728aa079d9cf40483f2cdcf3626a55849bfe107", size = 9279735, upload-time = "2025-11-10T18:02:09.425Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/1a/54/71261cc1b8dc7d3c4ad92a83b4d1681f5cb7ea5965ebcbc53311ae8c6424/ty-0.0.1a26-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:c20b4625a20059adecd86fe2c4df87cd6115fea28caee45d3bdcf8fb83d29510", size = 8767428, upload-time = "2025-11-10T18:02:11.956Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/8e/07/b248b73a640badba2b301e6845699b7dd241f40a321b9b1bce684d440f70/ty-0.0.1a26-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:d9909e96276f8d16382d285db92ae902174cae842aa953003ec0c06642db2f8a", size = 9009170, upload-time = "2025-11-10T18:02:14.878Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/f8/35/ec8353f2bb7fd2f41bca6070b29ecb58e2de9af043e649678b8c132d5439/ty-0.0.1a26-py3-none-musllinux_1_2_i686.whl", hash = "sha256:a76d649ceefe9baa9bbae97d217bee076fd8eeb2a961f66f1dff73cc70af4ac8", size = 9119215, upload-time = "2025-11-10T18:02:18.329Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/70/48/db49fe1b7e66edf90dc285869043f99c12aacf7a99c36ee760e297bac6d5/ty-0.0.1a26-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:a0ee0f6366bcf70fae114e714d45335cacc8daa936037441e02998a9110b7a29", size = 9398655, upload-time = "2025-11-10T18:02:21.031Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/10/f8/d869492bdbb21ae8cf4c99b02f20812bbbf49aa187cfeb387dfaa03036a8/ty-0.0.1a26-py3-none-win32.whl", hash = "sha256:86689b90024810cac7750bf0c6e1652e4b4175a9de7b82b8b1583202aeb47287", size = 8645669, upload-time = "2025-11-10T18:02:23.23Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b4/18/8a907575d2b335afee7556cb92233ebb5efcefe17752fc9dcab21cffb23b/ty-0.0.1a26-py3-none-win_amd64.whl", hash = "sha256:829e6e6dbd7d9d370f97b2398b4804552554bdcc2d298114fed5e2ea06cbc05c", size = 9442975, upload-time = "2025-11-10T18:02:25.68Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e9/22/af92dcfdd84b78dd97ac6b7154d6a763781f04a400140444885c297cc213/ty-0.0.1a26-py3-none-win_arm64.whl", hash = "sha256:b8f431c784d4cf5b4195a3521b2eca9c15902f239b91154cb920da33f943c62b", size = 8958958, upload-time = "2025-11-10T18:02:28.071Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
|
||||
Reference in New Issue
Block a user