I had the same experience with vector DBs and ditched them. I like the no-database approach. Do you add something to agents.md or claude.md so that Claude Code knows how to use this tool?
## Session Memory & Recall
- When asked to remember, recall, or look up something from past sessions, use the `/search-sessions` skill
- Start with index search; if no results, suggest `--deep` for full message content search
- Use `--project` filter when the context is clearly tied to a specific project
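
For anyone wondering what those three rules amount to, here's a rough sketch of the lookup order in Python. This is not the actual `/search-sessions` skill; `Session`, `search_sessions`, and `recall` are hypothetical names, and the in-memory substring match is just an assumption to keep it self-contained.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    project: str
    summary: str  # short index entry for the session (hypothetical)
    messages: list[str] = field(default_factory=list)  # full message content


def search_sessions(sessions, query, project=None, deep=False):
    """Search summaries (the index); with deep=True also search full messages."""
    q = query.lower()
    hits = []
    for s in sessions:
        # Optional project filter, mirroring the --project rule.
        if project is not None and s.project != project:
            continue
        texts = [s.summary] + (s.messages if deep else [])
        if any(q in t.lower() for t in texts):
            hits.append(s)
    return hits


def recall(sessions, query, project=None):
    """Index search first; if nothing matches, suggest a deep search."""
    hits = search_sessions(sessions, query, project)
    suggestion: Optional[str] = None
    if not hits:
        suggestion = "No index matches; retry with deep=True?"
    return hits, suggestion


sessions = [
    Session("webapp", "auth refactor", ["we switched to JWT tokens"]),
    Session("cli", "argument parsing cleanup", ["moved to subcommands"]),
]
hits, suggestion = recall(sessions, "jwt")
deep_hits = search_sessions(sessions, "jwt", deep=True)
```

The point is just the ordering: cheap index lookup first, and the expensive full-content pass only when the user opts in.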
It's 5x the price of Llama 3 / Qwen2 70B, and the performance on the benchmarks is similar. With a 70B model you can break a task into steps and run 5+ steps for the same money, so it doesn't seem worth the price in the general case. Is the 340B better for synthetic data generation (which is my primary use case)? Are there tests for that? It seems like synthetic data would benefit from multi-step reasoning and fewer hallucinations, and on those tests the difference is small.
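
To put the price argument in numbers, here's a back-of-the-envelope sketch. Only the 5x ratio comes from the comment above; the unit price is arbitrary, and it assumes each call uses roughly the same number of tokens.

```python
# Hypothetical unit price; only the 5x ratio is taken from the comment.
PRICE_70B = 1.0
PRICE_340B = 5.0 * PRICE_70B


def steps_for_same_budget(big_price: float, small_price: float) -> int:
    """How many small-model calls cost the same as one big-model call."""
    return int(big_price // small_price)


steps = steps_for_same_budget(PRICE_340B, PRICE_70B)
print(steps)
```

So one 340B call buys about five 70B steps, which is the budget for a plan / draft / critique / revise style pipeline.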
Is it hosted somewhere so I can use it over an API call? Or any suggestions for people who don't have access to a GPU and just want to make a few calls a month?