By default, local state lives under:
You can override that with --state-dir.
Files in .semsearch
index.db
The local search index. This is a rebuildable cache artifact, not durable user data.
It stores immutable snapshots for each provider/model pair. Embedding vectors are kept in dimension-specific vector tables, so a 768-dimension local model and a 1536-dimension OpenRouter model can coexist without replacing each other.
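The coexistence of differently sized embeddings can be pictured with a small SQLite sketch. This is an illustration only, not the actual index.db schema: the `vectors_<dim>` table naming, the column layout, and the `store_vector` helper are assumptions made for the example.

```python
import sqlite3
import struct


def vector_table(dim: int) -> str:
    """Hypothetical naming scheme: one table per embedding dimension."""
    return f"vectors_{dim}"


def store_vector(conn: sqlite3.Connection, snapshot: str, chunk_id: int,
                 vector: list[float]) -> None:
    """Store a vector in the table that matches its dimension."""
    table = vector_table(len(vector))
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "snapshot TEXT NOT NULL, chunk_id INTEGER NOT NULL, "
        "embedding BLOB NOT NULL, PRIMARY KEY (snapshot, chunk_id))"
    )
    # Pack as little-endian float32, 4 bytes per component.
    blob = struct.pack(f"<{len(vector)}f", *vector)
    conn.execute(
        f"INSERT OR REPLACE INTO {table} VALUES (?, ?, ?)",
        (snapshot, chunk_id, blob),
    )


conn = sqlite3.connect(":memory:")
# A 768-dimension vector and a 1536-dimension vector land in separate tables.
store_vector(conn, "local-model", 1, [0.0] * 768)
store_vector(conn, "openrouter-model", 1, [0.0] * 1536)
tables = sorted(
    row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
print(tables)
```

Because each dimension has its own table, writing vectors for one model never touches rows that belong to a model with a different dimension.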
manifest.json
Tracks local state metadata, schema versioning, and optional remote binding.
filehashes.db
Stores file-hash state used to skip unchanged files on warm reindex runs.
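The idea behind skipping unchanged files can be sketched as follows. This is a minimal illustration, assuming SHA-256 content hashes and an in-memory dict standing in for filehashes.db; the function names are hypothetical and the real hashing scheme may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Content hash of one file (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_reindex(root: Path, known: dict[str, str]) -> list[str]:
    """Return relative paths whose hash differs from the stored one.

    `known` plays the role of the persisted hash state and is updated in place.
    """
    changed = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        digest = file_digest(path)
        if known.get(rel) != digest:
            changed.append(rel)
            known[rel] = digest
    return changed


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("alpha")
    (root / "b.txt").write_text("beta")
    known: dict[str, str] = {}
    cold = files_to_reindex(root, known)        # cold run: every file is new
    warm = files_to_reindex(root, known)        # warm run: nothing changed
    (root / "b.txt").write_text("beta v2")
    after_edit = files_to_reindex(root, known)  # only the edited file

print(cold, warm, after_edit)
```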
file_weights.json
Local path weighting configuration used during ranking.
yzma/
Local runtime libraries used for embedding inference.
State-dir vs db-path
The CLI now treats the state directory as the primary unit:
- --state-dir is the preferred flag
- --db is a deprecated compatibility alias
That matches the real storage model better, because index.db is only one piece of the full local state.
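One way a CLI could honor both flags is sketched below. This is not the tool's actual argument handling: it assumes that --db pointed at the index.db file itself, so the state directory is taken to be that file's parent, and the `resolve_state_dir` helper is hypothetical.

```python
import argparse
import warnings
from pathlib import Path
from typing import Optional


def resolve_state_dir(argv: list[str]) -> Optional[Path]:
    """Pick the state directory, preferring --state-dir over the legacy --db."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("--state-dir", type=Path)
    parser.add_argument("--db", type=Path, help="deprecated alias")
    args = parser.parse_args(argv)

    if args.state_dir is not None:
        return args.state_dir
    if args.db is not None:
        warnings.warn("--db is deprecated; use --state-dir",
                      DeprecationWarning, stacklevel=2)
        # Assumption: --db named the index.db file, so its parent holds the state.
        return args.db.parent
    return None  # caller falls back to the default location


print(resolve_state_dir(["--state-dir", "/tmp/state"]))
print(resolve_state_dir(["--db", "/tmp/state/index.db"]))
```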
Provider and model snapshots
Each provider/model pair has its own active snapshot.
Examples:
- rebuilding llama.cpp/embeddinggemma does not replace llama.cpp/qwen3
- rebuilding openrouter/openai/text-embedding-3-small does not replace llama.cpp/embeddinggemma
- after switching to a different provider or model, you must index once with that provider/model before searching
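The rules in the list above can be modeled with a small registry keyed by provider and model. This is a conceptual sketch only; the `Snapshot` and `SnapshotRegistry` names and the generation counter are assumptions, not the tool's internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a snapshot is immutable once created
class Snapshot:
    provider: str
    model: str
    generation: int


class SnapshotRegistry:
    """Tracks one active snapshot per (provider, model) pair."""

    def __init__(self) -> None:
        self._active: dict[tuple[str, str], Snapshot] = {}

    def rebuild(self, provider: str, model: str) -> Snapshot:
        """Replace the active snapshot for this pair only."""
        key = (provider, model)
        previous = self._active.get(key)
        generation = previous.generation + 1 if previous else 1
        snapshot = Snapshot(provider, model, generation)
        self._active[key] = snapshot
        return snapshot

    def active(self, provider: str, model: str) -> Snapshot:
        """Look up the snapshot to search; fail if the pair was never indexed."""
        try:
            return self._active[(provider, model)]
        except KeyError:
            raise LookupError(
                f"no snapshot for {provider}/{model}; index with it first"
            ) from None


registry = SnapshotRegistry()
registry.rebuild("llama.cpp", "embeddinggemma")
registry.rebuild("llama.cpp", "qwen3")
registry.rebuild("llama.cpp", "embeddinggemma")  # does not touch qwen3

qwen_generation = registry.active("llama.cpp", "qwen3").generation
gemma_generation = registry.active("llama.cpp", "embeddinggemma").generation
try:
    registry.active("openrouter", "openai/text-embedding-3-small")
    never_indexed_fails = False
except LookupError:
    never_indexed_fails = True
```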
Schema changes to index.db may reset old cache data automatically. If local search fails after an upgrade, run unch index --root . to rebuild the index.
Remote-published state
Remote CI publishes:
- index.db
- manifest.json
- filehashes.db
That allows published state to restore both the active search snapshot and the file-hash cache used by warm reindex scenarios.
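Restoring from published state amounts to placing those three files into the local state directory. The sketch below assumes the published files have already been downloaded to a local directory; how they are fetched is not covered here, and `restore_published_state` is a hypothetical helper.

```python
import shutil
import tempfile
from pathlib import Path

# The three files listed above as remotely published.
PUBLISHED_FILES = ("index.db", "manifest.json", "filehashes.db")


def restore_published_state(published_dir: Path, state_dir: Path) -> list[str]:
    """Copy the published files into state_dir, refusing partial sets."""
    missing = [n for n in PUBLISHED_FILES if not (published_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"published state is incomplete: {missing}")
    state_dir.mkdir(parents=True, exist_ok=True)
    for name in PUBLISHED_FILES:
        shutil.copy2(published_dir / name, state_dir / name)
    return list(PUBLISHED_FILES)


with tempfile.TemporaryDirectory() as tmp:
    published = Path(tmp) / "published"
    published.mkdir()
    for name in PUBLISHED_FILES:
        (published / name).write_bytes(b"placeholder")
    state = Path(tmp) / "state"
    restored = restore_published_state(published, state)
    restored_names = sorted(p.name for p in state.iterdir())
```

Checking for all three files up front avoids a state directory where the search snapshot and the file-hash cache come from different publishes.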