Hacker News | efromvt's comments

Just like OpenAI's original moat, I don't think that's particularly durable. I've already seen plenty of people swing back to preferring codex, and it'll probably swap again with the next model drop. OpenClaw is potentially better integrated with ChatGPT at this point because of the explicit subscription support.

It's pretty easy to get determinism with a simple harness for a well-defined set of tasks with the recent models that are post-trained for tool use. CC probably gets some bloat because it tries to do a LOT more, and some because it's grown organically.

>It's pretty easy to get determinism with a simple harness for a well-defined set of tasks with the recent models that are post-trained for tool use.

Do you have a source? Claude Code is the only genetic system that seems to really work well enough to be useful, and it’s equipped with an absolutely absurd amount of testing and redundancy to make it useful.


Should I read that as 'generic system'? Most hard data comes from company-internal evals, but for the well-defined tasks externally it's been pretty easy to spin up a basic tool loop and validate. Did you have something in mind? [I don't necessarily count 'coding' as well-defined in the generic sense, so I suspect we're coming at this from different scopes re: the definition of 'LLMs somewhat deterministic and useful as tools']
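To make "basic tool loop" concrete, here's a minimal sketch of the kind of harness I mean: a model (stubbed out here, since the point is the harness shape, not any particular API) proposes tool calls for a well-defined task, the loop executes them against an allowlisted toolset, and a task-specific validator decides pass/fail. All of the names (`fake_model`, `run_harness`, the tools) are hypothetical illustrations, not any real framework.

```python
# Minimal tool-loop harness sketch. The "model" is a deterministic stub
# standing in for an LLM post-trained for tool use; the harness only
# trusts the validator, not the model's output.

TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def fake_model(task):
    """Stand-in for the LLM: returns a list of (tool_name, args) calls."""
    if task == "sum 2 and 3":
        return [("add", (2, 3))]
    return []

def run_harness(task, validate, max_steps=5):
    """Execute proposed tool calls (bounded), then validate the results."""
    results = []
    for name, args in fake_model(task)[:max_steps]:
        results.append(TOOLS[name](*args))
    return validate(results), results

ok, results = run_harness("sum 2 and 3", validate=lambda r: r == [5])
```

The determinism comes from the narrow task definition plus the external validator; a real model slots in where `fake_model` is, and the loop retries or rejects on validation failure.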

I haven't heard this benefit for mentors articulated so clearly before (probably just missed it), but I've definitely felt it - I guess it's a deeper version of how writing/other communication forces clarity and organization of thought, because mentorship conversations are so focused on extracting the why as well as the what.

I think this is exactly the point though (maybe more the point of the link than of this comment) - a sufficiently good product by all external quality metrics is fine even if the code is written on one line in a giant file or some other monstrosity. As long as one black box behaves the same way as another in all dimensions, they are competitive. You can argue that internal details often point to an external deficiency, but if they don't, then there is no competitive pressure.

I've chosen to embrace the silver lining: there's now business backing to prioritize all the devx/documentation work, because the "value" is easier to quantify - LLM sessions provide a much larger sample size than inconsistent new-hire onboarding (which was also a one-time process, instead of per session).

I do think people are going way overboard with markdown though, and that'll be the new documentation debt. Needs to be relatively high level and pointers, not duplicate details; agents can parse code at scale much faster than humans.


The gemini models are fantastic for price but the naming scheme is ridiculous, I have to triple check it every time.

Inference costs at least seem like the thing that is easiest to bring down, and there's plenty of demand to drive innovation. There's a lot less uncertainty here than with architectural/capability scaling. To your point, tomorrow's commodity hardware will solve this for the demands of today at some point in the future (though we'll probably have even more inference demand then).

As long as you don't deviate too much from ANSI, I think the 'light SQL DSL' approach has a lot of pros when you control the UX. (So UIs, in particular, are fantastic for this approach - which is what they seem to be targeting with queries and dashboards.) It's more of a product experience; tables are a terrible product surface to manage.

Agreed with the ecosystem cons getting much heavier as you move outside the product surface area.


Personally I think that's worse. SQL - which is almost ubiquitous - already suffers from a fragmentation problem because of the complex and dated standardization setup. When I learn a new DBMS, the two questions I ask at the very start are: 1. what common but non-standard features are supported? 2. what new anchor features (often cool, but also often intended to lock me to the vendor) am I going to pick up?

First, I need to learn a new (even if easy and familiar) language; second, I need to be aware of what's proprietary and locks me to the vendor platform. I'd suspect they see the second as a benefit they get IF they can convince people to accept the first.


I actually 100% agree with you for a new DBMS and share your frustration with vendor-specific features and lock-in. At that level, it's often actively counterproductive for insurgent DBs - ecosystem tooling needs more work to interface with your shiny new DB, etc - and that's why we always see anyone who starts with a non-standard SQL converge on offering ANSI SQL eventually.

I think an application that exposes a curated dataset through a SQL-like interface - the dashboard/analytic query case described here - is where this approach has value. You actually don't want to expose raw tables, INFORMATION_SCHEMA, etc - you're offering a dedicated query language on top of a higher-level data product, and you might as well take the best of SQL and leave the bits you don't need. (You're not offering a database as a service; you're offering data as a service.)
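A toy sketch of that "data as a service" layer, with SQLite standing in for the backing store: queries run only against an allowlisted curated view, and anything touching engine internals (INFORMATION_SCHEMA-style catalogs, etc.) is rejected. The table name, blocklist, and deliberately naive regex parsing are all hypothetical illustration, not a production guard.

```python
import re
import sqlite3

ALLOWED_TABLES = {"daily_sales"}  # the curated dataset we choose to expose

# Reject queries that reach for engine internals instead of the product surface.
BLOCKED = re.compile(r"\b(information_schema|pragma|attach|sqlite_master)\b", re.I)

def run_curated_query(conn, sql):
    """Run a SQL-ish query, but only against the curated product surface."""
    if BLOCKED.search(sql):
        raise ValueError("query touches internals that aren't part of the product")
    tables = set(re.findall(r"\bfrom\s+(\w+)", sql, re.I))
    if not tables <= ALLOWED_TABLES:
        raise ValueError(f"only {sorted(ALLOWED_TABLES)} are exposed")
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day TEXT, total REAL)")
conn.execute("INSERT INTO daily_sales VALUES ('2024-01-01', 10.0)")
rows = run_curated_query(conn, "SELECT total FROM daily_sales")
```

In practice you'd parse properly rather than regex-match, but the design point stands: the interface is a product boundary, so you keep the SQL ergonomics and drop the database-as-a-service surface.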


This is where I think we need better tooling around tiered validation - if we had the right separation, there's probably quite a bit you could run locally; splitting the cheap validation from the expensive has compounding benefits for LLMs.
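A minimal sketch of what I mean by tiered validation: order checks by cost, run the cheap ones first, and stop at the first failure so the expensive tier is never paid for on obviously broken changes. The tier names and costs are placeholder assumptions; in real use they'd be lint, typecheck, unit tests, full integration suite, etc.

```python
# Cheap-first, fail-fast validation tiers. An LLM (or a human) gets fast
# local feedback, and the expensive tier only runs once the cheap ones pass.

def run_tiers(tiers):
    """tiers: list of (name, relative_cost, check_fn). Stops at first failure."""
    report = []
    for name, cost, check in sorted(tiers, key=lambda t: t[1]):  # cheapest first
        ok = check()
        report.append((name, ok))
        if not ok:
            break  # don't pay for expensive tiers when a cheap one fails
    return report

tiers = [
    ("full-test-suite", 100, lambda: True),
    ("lint",              1, lambda: True),
    ("typecheck",         5, lambda: False),
]
report = run_tiers(tiers)
# lint passes, typecheck fails, the full suite is never run
```

The compounding benefit for LLM loops is that each retry costs one cheap tier instead of one full suite, so the harness can afford many more iterations per unit of compute.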


If you found two disjoint sections that seemed positive on their own, did you try looping both separately in the same model? Wondering how localized the structures are.

