I'm the author of Theine (both Go and Python). I actually started with the Python version, using Ristretto as a reference implementation, including its hit ratio benchmarks. Naturally, I had to run Ristretto's benchmark first to make sure it was working correctly, which is how I discovered the issue in the first place. After completing the Python version, I moved on to the Go version of Theine, which focuses on a better hit ratio than Ristretto.
Recently, I refactored both the Go and Python versions to adopt Caffeine’s adaptive algorithm for a better hit ratio. But now that Otter v2 has switched to an adaptive W-TinyLFU approach and is more closely aligned with Caffeine’s implementation, I’m considering focusing more on the Python version.
This feels like a good time to do so: the Python community is actively working toward free-threading, and once the GIL is no longer a bottleneck, larger machines and multithreading will become more viable. At that point, high-performance, free-threading-compatible caching libraries in Python will be important.
I think Jules does a good job at "generating code I'm willing to maintain." I never use Jules to write code from scratch. Instead, I usually write about 90% of the code myself, then use the agent to refactor, add tests (based on some I've already written), or make small improvements.
Most of the time, the output isn't perfect, but it's good enough to keep moving forward. And since I’ve already written most of the code, Jules tends to follow my style. The final result isn’t just 100%, it’s more like 120%, because of those little refactors and improvements I’d probably be too lazy to do if I were writing everything myself.
I think Hugging Face will soon add an MCP category to their homepage, similar to what ModelScope (the Chinese equivalent of Hugging Face) has done: https://www.modelscope.cn/mcp
It would be interesting if there were an AI tool to analyze the growth pattern of an OSS project. The tool should work based on star info from the GitHub API and perform some web searches based on that info.
For example: the project got 1,000 stars on 2024-07-23 because it was posted on Hacker News and received 100 comments (<link>). Below are the stargazer statistics for this period: ...
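A rough sketch of the statistics half of that idea, assuming the star timestamps have already been fetched (as far as I know, the GitHub stargazers endpoint returns a `starred_at` field when requested with the `application/vnd.github.star+json` media type). The function name `find_star_spikes` and the `factor` and `min_stars` thresholds are made up for illustration, not part of any existing tool. It only flags the unusual days; the web search for the cause would be a separate step.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def find_star_spikes(starred_at, factor=5.0, min_stars=10):
    """Return sorted (date, count) pairs for days with unusually many stars.

    starred_at: iterable of ISO 8601 timestamps such as "2024-07-23T10:00:00Z".
    factor and min_stars are illustrative thresholds, not tuned values.
    """
    # Count stars per calendar day (UTC, as given in the timestamps).
    per_day = Counter(
        datetime.fromisoformat(ts.replace("Z", "+00:00")).date()
        for ts in starred_at
    )
    if not per_day:
        return []

    # Baseline is the median over days that got at least one star;
    # days with zero stars are not represented in the counter.
    baseline = median(per_day.values())

    return sorted(
        (day, count)
        for day, count in per_day.items()
        if count >= min_stars and count > factor * baseline
    )


if __name__ == "__main__":
    # Synthetic data: a quiet few days, then a burst on 2024-07-23.
    timestamps = (
        ["2024-07-20T08:00:00Z", "2024-07-21T08:00:00Z", "2024-07-22T08:00:00Z"]
        + ["2024-07-23T12:00:00Z"] * 50
        + ["2024-07-24T09:00:00Z"] * 2
    )
    for day, count in find_star_spikes(timestamps):
        print(day, count)
```

Each flagged date could then be used as the query window for a search on Hacker News, Reddit, and so on to find what drove the spike.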
Yeah I thought about this and maybe down the line, but wanted to start with the pure statistics part as the base so it's as little of a black box as possible.
I think the results came out quite well. Note that I don't generate a text prompt from the row data for image generation. Instead, the raw row data (ingredients, instructions...) and table metadata (column names and descriptions) are sent directly to gemini-2.0-flash-exp-image-generation.
https://rowzero.io/ can handle 1 billion+ rows and offers native Python support. It's also compatible with Excel and Google Sheets. However, it’s a cloud-based solution, and the private hosting option is only available to Enterprise users.
"Gemini Flash fails even for simple tasks." The Gemini Flash page (https://deepmind.google/technologies/gemini/flash/) claims it is 'best for fast performance on complex tasks.' I always use Gemini Flash in my project for demos and testing, and it performs very well. If a project requires a large, expensive model to handle simple tasks, that could be an issue for users.