[FEATURE] Add api_base support for LiteLLM semantic embedding providers #1005

@lastguru-net

Problem

Basic Memory's experimental LiteLLM embedding provider currently supports provider/model selection, dimensions, role-specific input_type, batching, and request concurrency, but it does not expose an api_base / custom endpoint setting for embedding calls.

That makes it awkward or impossible to use Basic Memory with OpenAI-compatible embedding servers that are not hosted at the default provider endpoint.

A concrete example is running llama-server --embedding directly and exposing its OpenAI-compatible /v1/embeddings API. LiteLLM supports this shape by calling embeddings with an openai/... model string and an api_base, but Basic Memory does not currently provide a config field that reaches litellm.aembedding(...).
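As a rough illustration of the call shape described above, here is a minimal sketch. The model name `openai/local-embedding-model`, the placeholder API key, and the helper names `build_embedding_kwargs` and `embed` are assumptions made for this example and are not part of Basic Memory. LiteLLM is imported lazily so the argument-building helper can be inspected without it installed.

```python
from typing import Any


def build_embedding_kwargs(
    texts: list[str],
    api_base: str,
    model: str = "openai/local-embedding-model",  # hypothetical model name
) -> dict[str, Any]:
    """Assemble keyword arguments for litellm.aembedding(...)."""
    return {
        # The "openai/" prefix asks LiteLLM to use its OpenAI-compatible client.
        "model": model,
        "input": texts,
        # Custom endpoint, e.g. a local llama-server started with --embedding.
        "api_base": api_base,
        # Placeholder value; assumes the local server does not check the key.
        "api_key": "not-needed",
    }


async def embed(texts: list[str], api_base: str) -> list[list[float]]:
    """Send one batched embedding request to the custom endpoint."""
    import litellm  # lazy import: only needed when a request is actually made

    response = await litellm.aembedding(**build_embedding_kwargs(texts, api_base))
    return [item["embedding"] for item in response.data]
```

Assuming llama-server is listening locally on port 8080 (an assumption, adjust to your setup), this could be called as `asyncio.run(embed(["hello"], "http://localhost:8080/v1"))`.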

Why this matters

This is useful for local and self-hosted embedding deployments.

For example, Ollama can accept batched embedding input, but in some local setups it still runs embedding work through a single llama.cpp slot, which makes large Basic Memory reindexes extremely slow. Running llama-server --embedding directly with multiple slots can be materially faster while still exposing an OpenAI-compatible embeddings endpoint.

Current code shape

From a quick read of current main:

  • src/basic_memory/repository/litellm_provider.py

    • LiteLLMEmbeddingProvider accepts api_key, timeout, dimensions, batch size, concurrency, and role settings.
    • _embed() builds params for litellm.aembedding(...).
    • The params include model, input, drop_params, timeout, optionally dimensions, api_key, and input_type.
    • There is no api_base.
  • src/basic_memory/config.py

    • Semantic embedding config includes provider, model, dimensions, dimension forwarding, batch size, request concurrency, document/query input types, sync batch size, and FastEmbed runtime knobs.
    • There is no endpoint/base URL field for semantic embedding providers.
  • src/basic_memory/repository/embedding_provider_factory.py

    • The factory passes LiteLLM model, batch size, request concurrency, input types, and dimension forwarding into LiteLLMEmbeddingProvider.
    • There is no endpoint/base URL field in the provider cache key or provider constructor call.

Suggested solution

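Add an api_base option to the semantic embedding config and pass it through the provider factory into LiteLLMEmbeddingProvider, so that _embed() includes it in the params for litellm.aembedding(...). Because two different endpoints can serve the same model name, api_base should probably also be part of the provider cache key.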
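One possible shape for the change, as a sketch only and not the actual Basic Memory code: `EmbeddingSettings`, `build_params`, and `provider_cache_key` are hypothetical stand-ins for the real config, `_embed()`, and factory cache logic, and the default values (including `drop_params=True`) are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class EmbeddingSettings:
    """Hypothetical stand-in for the relevant semantic embedding config fields."""

    model: str
    timeout: float = 30.0  # assumed default
    dimensions: Optional[int] = None
    api_key: Optional[str] = None
    api_base: Optional[str] = None  # the proposed new field


def build_params(
    settings: EmbeddingSettings,
    texts: list[str],
    input_type: Optional[str] = None,
) -> dict[str, Any]:
    """Build kwargs for litellm.aembedding(...), adding api_base only when set."""
    params: dict[str, Any] = {
        "model": settings.model,
        "input": texts,
        "drop_params": True,  # assumed value
        "timeout": settings.timeout,
    }
    if settings.dimensions is not None:
        params["dimensions"] = settings.dimensions
    if settings.api_key:
        params["api_key"] = settings.api_key
    if input_type:
        params["input_type"] = input_type
    if settings.api_base:
        params["api_base"] = settings.api_base
    return params


def provider_cache_key(settings: EmbeddingSettings) -> tuple:
    """Include api_base so providers for different endpoints are not shared."""
    return (settings.model, settings.dimensions, settings.api_base)
```

Leaving api_base out of the params when it is unset keeps the default provider endpoint behavior unchanged for existing configurations.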

Labels: enhancement (New feature or request)
