unch supports two embedding providers:

- `llama.cpp` for local GGUF models
- `openrouter` for remote embedding APIs

The provider and model used for index and search must match.
## Providers

### llama.cpp

- default provider
- works with built-in model ids such as `embeddinggemma` and `qwen3`
- also accepts a direct `.gguf` path
- may download a default embedding model and the local `yzma` runtime on first run
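
A minimal sketch of the local flow. `unch index` and the `--model` flag appear elsewhere in this document; passing a `.gguf` path through that same flag, and the example file path itself, are assumptions.

```sh
# llama.cpp is the default provider, so no provider selection is needed here.
# Build an index with a built-in model id.
unch index --model embeddinggemma

# Assumption: a direct path to a local GGUF file can be passed the same way.
unch index --model ./models/embedding.gguf
```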
### openrouter

- remote provider
- uses an OpenRouter embedding model id such as `openai/text-embedding-3-small`
- token can be stored with `unch auth openrouter --token ...`
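
A hedged sketch of the remote flow. The `unch auth openrouter --token ...` command is taken from this section; the `--provider` flag used to select OpenRouter at index time is an assumption about the CLI surface.

```sh
# Store the OpenRouter token locally (token value is a placeholder).
unch auth openrouter --token "sk-or-..."

# Assumption: --provider selects the remote provider; --model names the embedding model.
unch index --provider openrouter --model openai/text-embedding-3-small
```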
## Known model ids

### embeddinggemma

- default model id
- downloaded automatically when `--model` is omitted
- mean-pooling profile
### qwen3

- built-in known model id
- auto-download profile
- last-token pooling profile
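
The pooling profile is tied to the model id, so selecting a model also selects its pooling behavior. The commands below are illustrative; `--model` is referenced above, but its exact placement on `unch index` is an assumption.

```sh
# Omitting --model falls back to embeddinggemma (mean pooling), downloading it if needed.
unch index

# Assumption: passing the built-in id switches to the qwen3 profile (last-token pooling).
unch index --model qwen3
```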
## Supported ways to select a model

- a built-in model id (`embeddinggemma` or `qwen3`) with the `llama.cpp` provider
- a direct path to a local `.gguf` file with the `llama.cpp` provider
- an OpenRouter embedding model id such as `openai/text-embedding-3-small` with the `openrouter` provider
## Provider-scoped snapshots

Each provider and model pair keeps its own active snapshot in the local index state. That means:

- rebuilding `qwen3` does not replace the active `embeddinggemma` snapshot
- rebuilding `openrouter/openai/text-embedding-3-small` does not replace the active `llama.cpp/embeddinggemma` snapshot
- switching providers or models requires a matching reindex before search
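
A sketch of how the per-pair snapshots might play out in practice, assuming `--model` selects the model on both `unch index` and `unch search`.

```sh
# Build a snapshot for the default pair (llama.cpp / embeddinggemma).
unch index

# Building a qwen3 snapshot does not replace the embeddinggemma one.
unch index --model qwen3

# Searching a pair only works once that pair has a matching snapshot.
unch search --model qwen3 "token storage for openrouter"
```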
## Default cache location

If `SEMSEARCH_HOME` is set, models live under that directory. Otherwise, unch uses the system user cache directory for the current platform.
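
A minimal sketch, assuming `SEMSEARCH_HOME` is read from the environment; the directory path shown is illustrative.

```sh
# Redirect model downloads away from the system user cache.
export SEMSEARCH_HOME="$HOME/.semsearch"
unch index
```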
## Token storage for OpenRouter

`unch auth openrouter` writes the token to a local tokens file:

- `~/.config/unch/tokens.json`
- `.semsearch/tokens.json`

The token can also be supplied through the `OPENROUTER_API_KEY` environment variable.
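
A hedged alternative to the tokens file, assuming the `OPENROUTER_API_KEY` environment variable is honored by OpenRouter-backed commands; the `--provider` and `--model` flags are the same assumptions as above.

```sh
# Supply the token through the environment instead of running `unch auth`.
export OPENROUTER_API_KEY="sk-or-..."
unch index --provider openrouter --model openai/text-embedding-3-small
```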
## Practical rule

Always use the same provider and model family for both `unch index` and `unch search`. If they differ, the query vectors and index vectors come from different embedding spaces, and the ranking will be unreliable.
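
As a sketch of the rule, assuming `--model` selects the model on both commands:

```sh
# Matched pair: index and search use the same provider and model.
unch index  --model qwen3
unch search --model qwen3 "where are tokens stored?"

# Mismatched pair: the query vector and index vectors come from different
# embedding spaces, so the ranking is not meaningful.
# unch search --model embeddinggemma "where are tokens stored?"
```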