Commit Graph

2 Commits

Author SHA1 Message Date
Chris Coutinho 5c73b85f65 fix: Increase MCP sampling timeout to 5 minutes for slower LLMs
- Increase sampling timeout from 30s to 300s in semantic.py to accommodate
  slower local LLMs like Ollama
- Refactor RAG integration tests to support multiple providers (ollama,
  openai, anthropic, bedrock)
- Remove unnecessary embedding_provider fixture since MCP server handles
  embeddings internally
- Add --provider flag via tests/integration/conftest.py
- Add provider_fixtures.py with factory functions for generation providers

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 05:43:48 +01:00
Chris Coutinho 208365cd3d feat: Add OpenAI provider support for embeddings and generation
Adds OpenAI provider to the unified provider architecture (ADR-015),
supporting:
- OpenAI API (api.openai.com)
- GitHub Models API (models.github.ai/inference)
- OpenAI-compatible endpoints (Fireworks, Together, etc.)

Features:
- Embedding support with text-embedding-3-small/large models
- Text generation via chat completions API
- Automatic retry with exponential backoff for rate limits
- Provider auto-detection in registry (priority after Bedrock)

Environment variables:
- OPENAI_API_KEY: API key (required)
- OPENAI_BASE_URL: Base URL override (optional)
- OPENAI_EMBEDDING_MODEL: Embedding model (default: text-embedding-3-small)
- OPENAI_GENERATION_MODEL: Generation model (default: gpt-4o-mini)

Also adds:
- Integration tests for RAG pipeline with MCP sampling
- MCP client sampling support for integration tests
- Ground truth Q&A pairs for Nextcloud User Manual

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-23 00:33:32 +01:00