varispeed's comments

What is the difference between Pro and normal mode, apart from the fact that Pro takes ages to finish? I don't see much difference in output quality.

I wonder when they'll tackle literal porn showing up in Instagram shorts. If you want to browse Instagram in public, forget it.

One day Claude started saying odd things, claiming they came from memory and that I had said them. It was telling me personal details of someone I don't know: where the person lives, their children's names, the job they do, their experience, relationship issues, etc. Eventually Claude said it was sorry and that it was a hallucination. Then it started doing it again. For instance, when I asked what router it would recommend, it went on to say: "Since you bought X and you find no use for it, consider turning it into a router". I said I had never told it I bought X and asked for more details, and it again started coming up with what this guy did. Strange. Then it apologised again, saying that it might be unsettling, but rest assured it is not a leak of personal information, just hallucinations.

Did you confirm whether the person was real or not? If the person was real, this is an absolutely massive breach of privacy and worth telling Anthropic about.

Ages ago, when I was trying to create a simple USB device, I found that there was pretty much zero information on how to do it, e.g. how to correctly write descriptors and so on. The typical advice was: find a device similar to what you want to make, copy its descriptors, and adapt them to your own device using trial and error.

Sounds like USB is a wonderful standard. Am I wrong?


Descriptors also were kind of a mystery for me until I realized that they're just a binary structure with a fixed format that the host reads and interprets.

The device descriptor is easy enough to get right as it doesn't have too many fields, and every USB class specification defines which Class and SubClass to use in the interface descriptor as well as which endpoints that interface needs to have. And that's, for the most part, all you need for the host to recognize your device.
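To make the "fixed-format binary structure" point concrete, here is a minimal sketch in Python that packs the standard 18-byte device descriptor and a 9-byte interface descriptor. The field order and sizes follow the USB 2.0 standard descriptor layout; the vendor/product IDs and the helper names (`build_device_descriptor`, `build_hid_keyboard_interface`, `parse_device_descriptor`) are hypothetical and only for illustration. A real device would also need configuration and endpoint descriptors, plus a HID report descriptor for the keyboard case.

```python
import struct

# Standard USB device descriptor: 18 bytes, little-endian.
# B = 1-byte field, H = 2-byte field.
DEVICE_DESCRIPTOR = struct.Struct("<BBHBBBBHHHBBBB")
# Standard USB interface descriptor: nine 1-byte fields.
INTERFACE_DESCRIPTOR = struct.Struct("<9B")

DEVICE_FIELDS = (
    "bLength", "bDescriptorType", "bcdUSB", "bDeviceClass",
    "bDeviceSubClass", "bDeviceProtocol", "bMaxPacketSize0",
    "idVendor", "idProduct", "bcdDevice", "iManufacturer",
    "iProduct", "iSerialNumber", "bNumConfigurations",
)


def build_device_descriptor(vendor_id, product_id):
    """Pack a device descriptor that defers the class to the interface."""
    return DEVICE_DESCRIPTOR.pack(
        DEVICE_DESCRIPTOR.size,  # bLength (18)
        0x01,                    # bDescriptorType: DEVICE
        0x0200,                  # bcdUSB: USB 2.0
        0x00, 0x00, 0x00,        # class/subclass/protocol: see interface
        64,                      # bMaxPacketSize0 for endpoint 0
        vendor_id,               # idVendor (hypothetical value in the demo)
        product_id,              # idProduct (hypothetical value in the demo)
        0x0100,                  # bcdDevice: device release 1.00
        0, 0, 0,                 # string indices: 0 means "no string"
        1,                       # bNumConfigurations
    )


def build_hid_keyboard_interface():
    """Pack an interface descriptor for a HID boot keyboard."""
    return INTERFACE_DESCRIPTOR.pack(
        INTERFACE_DESCRIPTOR.size,  # bLength (9)
        0x04,                       # bDescriptorType: INTERFACE
        0,                          # bInterfaceNumber
        0,                          # bAlternateSetting
        1,                          # bNumEndpoints (one interrupt IN)
        0x03,                       # bInterfaceClass: HID
        0x01,                       # bInterfaceSubClass: boot interface
        0x01,                       # bInterfaceProtocol: keyboard
        0,                          # iInterface: no string
    )


def parse_device_descriptor(raw):
    """Unpack 18 raw bytes back into a field-name -> value mapping."""
    return dict(zip(DEVICE_FIELDS, DEVICE_DESCRIPTOR.unpack(raw)))


if __name__ == "__main__":
    raw = build_device_descriptor(0x1234, 0x5678)  # made-up IDs
    print(raw.hex(" "))
    print(parse_device_descriptor(raw))
    print(build_hid_keyboard_interface().hex(" "))
```

The host does roughly what `parse_device_descriptor` does: it reads the bytes and interprets them by position, which is why a single wrong length or type byte is enough to break enumeration.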


USB is nice, but electrically some parts of USB 1/2 are kind of complicated (not true differential signaling).

Eh, there's very little tutorial content, but as far as big corporate standards go it's fairly reasonable. There is a downside to "too much choice", in that you have to read a lot to find the most relevant pre-defined type of device to what you're doing.

There was never going to be a ceasefire. It was just Taco Tuesday and yet another market manipulation day. Republicans and Democrats ruled by whoever has original Epstein files are just filling their boots.

Imagine if he was developing it on a laptop found at a refuse site that was still charged, just hiding in the hedge so that guards wouldn't see him.

It is more likely that the government doesn't want to allow people to have privacy. Microsoft just obediently listens to orders and executes them.

Sounds like a good opportunity to pause spending on nerfed 4.6 and wait for the new model to be released and then max out over 2 weeks before it gets nerfed again.


The performance degradation I've seen isn't in quality/completion but in duration: I get good results, but much less quickly than I did before 4.6. Still, it's just anecdata, though a lot of folks seem to feel the same.

Been reading posts like these for 3 years now. There’s multiple sites with #s. I’m willing to buy “I’m paying rent on someone’s agent harness and god knows what’s in the system prompt rn”, but in the face of numbers, gotta discount the anecdotal.

You're probably right. It's probably more likely that for some period of time I forgot that I switched to the large context Opus vs Sonnet and it was not needed for the level of complexity of my work.

Yeah, why trust your actual experience over numbers? Nothing surer than synthetic benchmarks

Strawman, and, synthetic benchmark? :)

I don't believe that trackers like this are trustworthy. There's an enormous financial motive to cheat and these companies have a track record of unethical conduct.

If I was VP of Unethical Business Strategy at OpenAI or Anthropic, the first thing I'd do is put in place an automated system which flags accounts, prompts, IPs, and usage patterns associated with these benchmarks and direct their usage to a dedicated compute pool which wouldn't be affected by these changes.


This just looks like random noise to me? Is it also random on short timespans, like running it 10x in a row?

Explained in the methodology at the bottom of this page: https://marginlab.ai/trackers/claude-code/

Isn't it the same with Opus nowadays?

The problem with tokens is that they create the wrong incentive. The quicker the model arrives at the solution, the fewer tokens you have to buy.

So I noticed the model purposefully coming up with dumb ideas or running around in circles, and only when you tell it that it is trying to defraud you does it suddenly come back with the right solution.

