
Conversation

@PratikDavidson

Description

This PR introduces Ollama as an alternative embedding provider to Hugging Face, enabling local CPU/GPU embedding inference, improving data privacy, and reducing cloud dependency and cost. Users can now choose between cloud-based (Hugging Face) and self-hosted (Ollama) embedding generation.

Changes

  • Added ollama client dependency to pyproject.toml.
  • Implemented Ollama configuration and centralized error handling in config.py (a minimal sketch of the wiring follows this list).
  • Added model definitions for Ollama embedding models and extended EmbeddingService to handle Ollama-based embedding requests in embeddings.py.
  • Introduced unit tests for the Ollama integration via TestEmbeddingServiceOllama in test_embedding_service.py (a test sketch follows as well).
  • Updated README.md with documentation on using Ollama as an embedding provider.
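
For context, here is a minimal sketch of how the provider switch could be wired up, assuming the official ollama Python client; Settings, EmbeddingService, and the defaults below are illustrative assumptions, not the PR's exact code:

```python
import os

import ollama


class Settings:
    """Reads the embedding-provider configuration from the environment (.env)."""

    def __init__(self) -> None:
        self.embedding_provider = os.getenv("EMBEDDING_PROVIDER", "huggingface")
        self.ollama_host = os.getenv("OLLAMA_HOST", "localhost")
        self.ollama_port = int(os.getenv("OLLAMA_PORT", "11434"))
        self.ollama_model = os.getenv("OLLAMA_MODEL", "nomic-embed-text")


class EmbeddingService:
    """Dispatches embedding requests to the configured provider."""

    def __init__(self, settings: Settings) -> None:
        self.settings = settings
        self._client = None
        if settings.embedding_provider == "ollama":
            self._client = ollama.Client(
                host=f"http://{settings.ollama_host}:{settings.ollama_port}"
            )

    def embed(self, text: str) -> list[float]:
        if self.settings.embedding_provider == "ollama":
            # ollama.Client.embeddings() returns a response whose
            # "embedding" field holds the vector for the given prompt.
            response = self._client.embeddings(
                model=self.settings.ollama_model, prompt=text
            )
            return list(response["embedding"])
        raise NotImplementedError("Hugging Face path omitted from this sketch")
```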
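
And a correspondingly minimal test in the spirit of TestEmbeddingServiceOllama, stubbing the client with unittest.mock so no running Ollama server is needed (again, the names are assumptions carried over from the sketch above):

```python
from unittest.mock import MagicMock


def test_ollama_embed_returns_vector():
    service = EmbeddingService(Settings())
    service.settings.embedding_provider = "ollama"  # force the Ollama path
    service._client = MagicMock()
    service._client.embeddings.return_value = {"embedding": [0.1, 0.2, 0.3]}

    assert service.embed("hello world") == [0.1, 0.2, 0.3]
    service._client.embeddings.assert_called_once_with(
        model=service.settings.ollama_model, prompt="hello world"
    )
```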

Implementation Details

  • Run the Ollama server (a quick connectivity check is sketched after this list)
  • Configure .env with:
    • EMBEDDING_PROVIDER=ollama
    • OLLAMA_HOST=localhost
    • OLLAMA_PORT=11434
    • OLLAMA_MODEL=nomic-embed-text
  • Start the app with uv run server.py
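
Before starting the server, it can help to confirm that the Ollama endpoint answers embedding requests at all. A quick check, assuming the model was pulled beforehand with ollama pull nomic-embed-text:

```python
import ollama

client = ollama.Client(host="http://localhost:11434")
response = client.embeddings(model="nomic-embed-text", prompt="ping")
# nomic-embed-text produces 768-dimensional embeddings
print(f"Embedding dimension: {len(response['embedding'])}")
```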

Checklist

✅ Code compiles successfully
✅ Unit tests added and passing
✅ Documentation updated
✅ No breaking changes introduced

@PratikDavidson changed the title to Added Ollama as an Embedding Provider on Oct 18, 2025
@nicholasericksen

I was also looking for this capability. Thanks!

