
https://wethinkt.com

The second bubble there is a tool for 3D visualization and analytics of Claude Code sessions. The sample conversation is the one that made the tool itself!

That was a fun toy I learned a lot from. I’m not expanding that but am working intensely on the first bubble:

thinkt, a CLI/TUI/webapp for exploring your LLM conversations. It makes it easy to see all your local projects, view them, and export them. It has an embedded OpenAPI server and MCP server.

So you can open Kimi and say “use thinkt mcp to look at my last Claude session in this project, look at the thinking at the end and report on the issues we were facing”.

I added Claude Teams support by launching a Team and having that team look at its own traces and the changing ~/.claude folder. Similar for Gemini CLI and Copilot (which still need work).

Doing it in the open. Only 2 weeks old - usable, but early. I'm only posting as it's what I'm working on. Still working on polish and deeper review (it is vibe-crafted). There are ergonomic issues with ports and DuckDB. Coming up next are a VSCode extension and an exporter/collector for remote agents.


The Claude Code analytics space is really interesting to me right now as well, this is cool.

I'm coming at it from more of the data infrastructure side (e.g. send all of your logs and metrics to a cheap Iceberg catalog in the cloud so you have a central place to query[1]) but also check out https://github.com/tobilg/ai-observer -- DuckDB popping up everywhere to make this interesting and easy.

[1] https://github.com/smithclay/otlp2pipeline


That is great, thanks for sharing your work and that other link. Codex also supports OTel.

I love that you made the OTel DuckDB extension last year and then were able to flex it months later for these pursuits.


That's precisely how I refactored dank-extract from dank-mcp and finally got dank-data to archive CT canna-data every Sunday at 4:20pm Pacific.

[1] https://github.com/AgentDank/dank-extract

[2] https://github.com/AgentDank/dank-data


Over the weekend I took pictures of the four walls of my office and asked Claude Desktop to examine them and give me a plan for tackling it. It absolutely “understood” my room, identifying the different (messy) workspaces and various piles of stuff on the ground. It generated a checklist with targeted advice and said that I should be motivated to clean up because the “welcome back daddy” sign up on the wall indicates that my kids love me and want a nice space to share with me.

I vibe-code TUIs and GUIs by making statements like “make the panel on the right side two pixels thinner”.

Related to this thread, I explored agentic looping for 3D models (with a Swift library; it could be done with this Rust one by following the workflow): https://github.com/ConAcademy/WeaselToonCadova


My running joke after showing off some amazing LLM-driven work is...

if you think this is impressive, I once opened a modal dialog on an Apple IIGS in 65C816 assembly

I don't think you need to learn BASIC if you know concepts like conditionals, looping, and indexing. It is interesting to compare the higher-level language of the time with its companion assembly. And you might find yourself writing BASIC programs to complement your assembly, if you stick to that platform.

<lore> A friend dropped me a BASIC program that ran and wrote text to the Apple IIGS border. He asked me to figure it out, because it wasn't obvious what was going on. OG hacker puzzle... it was a BASIC program that jumped to hidden assembly after the apparent end of the text file (hidden chars maybe, I forget) and the assembly was changing the border at the appropriate rate to "draw" on it. Those were the days... trying to find some reference to this and am failing. </lore>

I certainly credit my stack-frame debugging capability to dealing with that stuff so long ago. Oddly enough, I didn't really find it helpful for computer architecture class. Just because you know registers exist and how to manipulate them doesn't exactly map to architecting a modern hardware system. But being fluent in logic operations and bit-twiddling and indexing does help a lot.


An elderly friend had some identity theft and we went to handle it at the three credit bureaus…

She has an old MacBook and old iPhone (circa 2017?). Apple no longer updates these OSes, not even the bundled Safari.

One of the bureaus, Experian I think, has a TLS cert that is not compatible with the old OSes, so all the “don’t trust this site” warnings come up.

How many people have incomplete credit freezes because of this? Apple is of a size that this hurts society.


I really appreciate all of his message -- responsibility and actual engineering are critical and can't be (deceptively) lost even though Pull Request and CI/CD workflows exist. I hate the term vibe-coding because it seems flippant, and I've leaned into LLM-assistance to frame it better.


I consider vibe coding and LLM-assistance to be distinctly separate things.

I am vibe coding if I need X: I lay out that task with whatever degree of specificity, and ask for the whole result. Maybe it's good; I gave the LLM a lot of rope to hang me with.

I am using an LLM for assistance if I need something like this file renamed, all its functions renamed to match, all the project metadata changed, and every comment that mentions the old name updated. There is an objectively correct result.

It’s a matter of scope.


I've been evangelizing vibe coding, because we are wielding something much more powerful now than even ~3 months prior (Nov was the turning point).

Now that Prometheus (the myth, not the o11y tool) has dropped these LLMs on us, I've been using this thought experiment to consider the multi-layered implications:

In a world where everyone can cook, why would anybody buy prepared food?


>In a world where everyone can cook, why would anybody buy prepared food?

I would guess for convenience and saving time. While vibe-coding might be faster, you still have to "do it". As in, think about what you want your software to do, write out or dictate your prompts, test that it works etc. That takes time (might be less time than writing it out by hand but it's still a non-zero amount of time).


I think my comment was misconstrued as siding either way; I didn't communicate it well. The point is that the thought experiment frames questions -- like the ones you were exploring.

Because of course we all buy prepared foods of all sorts from street vendors to fast food to local restaurants to chains to Michelin Stars. While there are many reasons one will cook for themselves, there are many reasons one will buy from someone else too.


That is hilarious.... and to prove the point of this whole comment thread, I created reddit-kv for us. It seems to work against a mock; I did not test it against Reddit itself, as I think that would violate the ToS. My prompts are in the repo.

https://github.com/ConAcademy/reddit-kv/blob/main/README.md


Typo-Driven Development!


Thanks for sharing this — I appreciate your motivation in the README.

One suggestion, which I have been trying to do myself, is to include a PROMPTS.md file. Since your purpose is sharing and educating, it helps others see what approaches an experienced developer is using, even if you are just figuring it out.

One can use a Claude hook to maintain this deterministically. I instruct in AGENTS.md that they can read but not write it. It’s also been helpful for jumping between LLMs, to give them some background on what you’ve been doing.
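
As a sketch of what I mean (the settings.json wiring and the exact stdin payload are from memory, so treat the field names as assumptions and check the hooks docs):

    # prompts_hook.py -- append each submitted prompt to PROMPTS.md.
    # Wire it into .claude/settings.json as a UserPromptSubmit hook, e.g.:
    #   {"hooks": {"UserPromptSubmit": [{"hooks": [
    #       {"type": "command", "command": "python3 prompts_hook.py"}]}]}}
    import datetime
    import json
    import sys

    payload = json.load(sys.stdin)        # hook input arrives as JSON on stdin
    prompt = payload.get("prompt", "")    # field name assumed; verify in the docs
    if prompt.strip():
        with open("PROMPTS.md", "a") as f:
            f.write(f"## {datetime.datetime.now().isoformat()}\n\n{prompt}\n\n")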


In this case, instead of a prompt I wrote a specification, but later I had to steer the models for hours. So basically the prompt is the sum of all such interactions: incredibly hard to reconstruct into something meaningful.


This steering is the main "source code" of the program that you wrote, isn't it? Why throw it away? It's like deleting the .c once you have obtained the .exe.


It's more noise than signal because it's disorganized and hard to glean value from (speaking from experience).


I wasn't exactly suggesting this. The source code (including SVG or DOCX or HTML+JS for document work) is the primary ground truth which the LLM modifies. Humans might modify it too. This ground truth is then rendered (compiled, visualized) into the end product.

The PROMPTS.md is communication metadata. Indeed, if you fed the same series of prompts freshly, the resultant ground truths might not make sense because of the stochastic nature of LLMs.

Maybe “ground truth” isn’t exactly the right word, but it is the consistent, determined basis which formed from past work and will evolve with future work.


> because of the stochastic nature of LLMs.

But is this "stochastic nature" inherent to the LLM? Can't you make the outputs deterministic by specifying a version of the weights and a seed for the random number generator?

Your vibe coding log (i.e. your source code) may start like this:

    fix weights as of 18-1-2026
    set rng seed to 42

    write a program that prints hello world
Notice that the first two lines may be added automatically by the system and you don't need to write or even see them.
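
Something like this minimal sketch with a local model, where the model name is just a placeholder and pinning the revision plays the role of "fix weights as of 18-1-2026":

    # minimal sketch: pin the weights and the RNG seed for a local model
    from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

    MODEL = "Qwen/Qwen2.5-Coder-7B-Instruct"   # placeholder; any local model
    tok = AutoTokenizer.from_pretrained(MODEL, revision="main")   # use a commit SHA to truly pin
    model = AutoModelForCausalLM.from_pretrained(MODEL, revision="main")

    set_seed(42)                               # "set rng seed to 42"
    inputs = tok("write a program that prints hello world", return_tensors="pt")
    out = model.generate(**inputs, do_sample=True, max_new_tokens=128)
    print(tok.decode(out[0], skip_special_tokens=True))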


I see what you are saying, and perhaps we are zeroing in on the importance of ground truths (even if it is not code but rather PLANs or other docs).

For what you're saying to work, the LLM must adhere consistently to that initial prompt. Different LLMs, and the same LLM on different runs, might have different adherence, and how does it evolve from there? Meaning, at playback of prompt #33, will the ground truth be the same and the next result the same as in the first attempt?

If this is a local LLM and we control all the context, then we can control that LLM's seeds and thus get consistent output. So I think your idea would work well there.

I've not started keeping thinking traces, as I'm mostly interested in how humans are using this tech. But, that could get involved in this as well, helping other LLMs understand what happened with a project up to a state.


> But is this "stochastic nature" inherent to the LLM?

At any kind of reasonable scale, yes. CUDA accelerators, like most distributed systems, are nondeterministic, even at zero temperature (which you don't want) with fixed seed.
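
A quick way to see the underlying issue without any GPU: float addition isn't associative, so any kernel that doesn't fix its reduction order can produce slightly different sums.

    # toy illustration: summing the same floats in a different order
    # gives a (slightly) different result, which is what unordered
    # parallel reductions on an accelerator run into.
    import random

    random.seed(0)
    xs = [random.uniform(-1.0, 1.0) for _ in range(1_000_000)]
    a = sum(xs)
    random.shuffle(xs)
    b = sum(xs)
    print(a == b, abs(a - b))   # typically False, with a tiny nonzero difference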


I've only just started using it but the ralph wiggum / ralph loop plugin seems like it could be useful here.

If the spec and/or tests are sufficiently detailed maybe you can step back and let it churn until it satisfies the spec.
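
Roughly this, as a bare-bones sketch (assuming a pytest suite as the oracle and Claude Code's non-interactive -p print mode; the SPEC.md name is made up):

    # keep re-running the agent until the test suite passes
    import subprocess

    PROMPT = "Read SPEC.md, run the tests, and fix whatever is still failing."

    while subprocess.run(["pytest", "-q"]).returncode != 0:
        subprocess.run(["claude", "-p", PROMPT], check=False)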


Isn't the "steering" in the form of prompts? You note "Even if the code was generated using AI, my help in steering towards the right design, implementation choices, and correctness has been vital during the development." You are a master of this; let others see how you cook, not just taste the sauce!

I only say this as it seems one of your motivations is education. I'm also noting it for others to consider. Much appreciation either way, thanks for sharing what you did.


Doesn't Claude Code allow you to just dump entire conversations, with everything that happened in them?


All sessions are located in the `~/.claude/projects/foldername` subdirectory.
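
For example, a quick way to see what's there (the per-project files are JSONL transcripts, as mentioned elsewhere in the thread):

    # list Claude Code session transcripts, largest first
    import glob
    import os

    paths = glob.glob(os.path.expanduser("~/.claude/projects/*/*.jsonl"))
    for path in sorted(paths, key=os.path.getsize, reverse=True):
        print(f"{os.path.getsize(path) // 1024:6d} KB  {path}")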


Doesn't it lose prompts prior to the latest compaction?


I’ve sent Claude back to look at the transcript file from before compaction. It was pretty bad at it but did eventually recover the prompt and solution from the jsonl file.


It loses them in the current context (say 200k tokens), not in its SQLite history db (limited by your local storage).


I did not know it was SQLite, thx for noting. That gives me the idea to make an MCP server or Skill or classical script which can slurp those and make a PROMPTS.md or answer other questions via SQL. Will try that this week.
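
Something like this rough sketch for the "classical script" flavor, assuming the JSONL transcripts under ~/.claude/projects mentioned upthread (the column names are guesses -- DESCRIBE the files first):

    # dump user prompts from Claude Code transcripts into PROMPTS.md via DuckDB
    import os
    import duckdb

    sessions = os.path.expanduser("~/.claude/projects/*/*.jsonl")
    rows = duckdb.sql(f"""
        SELECT "timestamp", message
        FROM read_json_auto('{sessions}')
        WHERE type = 'user'          -- column names are assumptions about the schema
        ORDER BY "timestamp"
    """).fetchall()

    with open("PROMPTS.md", "w") as f:
        for ts, message in rows:
            f.write(f"## {ts}\n\n{message}\n\n")   # message may be a struct; format to taste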


It doesn't lose the prompt but slowly drains out of context. Use the PreCompact hook to write a summary.


aider keeps a log of this, which is incredibly useful.


Here's the link to the Brookings report from the NPR article, to read it in full: https://www.brookings.edu/articles/a-new-direction-for-stude...

I've only skimmed it, but I note that all this research is from before Nov 2025 and is quite broad. It does get somewhat into coding, mentioning GitHub Copilot, and also refers to a paper about vibe-coding, where the conclusion is that not understanding the artifacts is a problem.

So all this reporting is from before Gemini 3 and Opus 4.5 came out. Everything is really different with the advent of those.

While substitute teaching just before Xmas 2025, I installed Antigravity on the student account of the class computer and vibe-coded two apps on the smart board while the kids worked on Google Classroom. This was impromptu, to liven things up, but I knew it would work because I had such amazing experiences with the tool the week before.

* [1] Quadratic Formula Explorer for Algebra 2

* [2] Proving Parallelograms for Honors Geometry

Before the class ended, I gave a quick talk; the gist was: "I just made these tools to understand the coursework by conversing with an LLM. Are you going to use this to cheat on your homework or to enhance your understanding?"

I showed it to a teacher, and she pointed me to existing tools like them on educational websites. But that misses the point that we can just manifest the very hyper-specific tools we need... for example, how should the Quadratic Formula Explorer work for someone with dyslexia?

I'm not sure what the next steps with all this are, but certainly education needs to adapt. The paper notes "AI can enrich learning when well-designed and anchored in sound pedagogy" and what I did there is neither, so imagine how sweet it is gonna be when skilled curriculum designers weave this into educational systems.

[1] https://conacademy.github.io/quadratic_explorer/

[2] https://conacademy.github.io/proving_parallelograms/

