
This one, to me, could be mistaken not only for an actual photo of a dog, but for a photo of my own dog (aside from the cyborg implant, which is a bit of a giveaway :) ).

https://www.karmatics.com/stuff/cyborgdog.jpg

I didn't even bother saying "photorealistic", but I did give lighting hints:

"tricolor english shepherd future cyborg parts on head and front leg, very futuristic tech, replaces side of head and eye, black anodized aluminum, orange leds, big raised camera eye where normal eye would be, with colorful rich deep teal artificial iris, in worn future urban park with pond and stone bridge but pretty near dark with streetlight above beautiful dog and technology"


I don't tend to get plastic-y images. I think DALL-E's images are less plastic looking than either typical CG movies (even Pixar) or Midjourney.

Do you consider these plastic looking?

https://www.karmatics.com/stuff/dalle.html

One important thing is to give long prompts with lots of adjectives.


It seems to me that I can get almost any look and feel I want. Most of the ones other people post seem to have a very different look and feel from my own.

That said, you can sort of get a default look and feel if you just give a short prompt, and then it will tend toward the ones that are favored by RLHF. I prefer very long prompts.... as long as it will allow.

So if you do "a cool treehouse" you'll get sort of the default look. It will be very different if you say "treehouse, naturally occurring, in an old beautiful tree with branches that are low and spread widely and have lots of character and hanging moss and thick bark and curvy roots and mushrooms on a rocky outcropping from a mountainside. photograph, golden hour, sun through trees, damp from rain. Treehouse is part of tree, with fractal forms and live shaped wood and stone and stained glass and glowiness. art nouveau, gorgeous colors and fantasy design"

It's funny that the same people who complain that AI is "cheating" and uncreative are often the ones who put so little effort into getting good results. It's not like it takes any arcane knowledge to get good images, but if you can use some imagination and string a lot of descriptive words together you can get much better results.


One thing I found neat about ChatGPT's DALL-E 3 is that images are usually generated via prompts like yours. "a cool treehouse" actually used "Oil painting of an old-fashioned treehouse in a serene autumn setting. The tree's golden leaves provide a canopy for the cozy wooden hut with a thatched roof. A ladder leans against the tree, and children play below, collecting fallen leaves." as the prompt.

Furthermore, I found it's very easy to tweak the general design. Get a cool image, copy the prompt, tell it to make it more X with Y and Z, and you quickly end up with a really neat prompt.

As someone with no mind's-eye imagery who enjoys creating computer graphics (3D, mostly animation), it's proven to be a really neat test bed so far. Hallucinations are almost a feature in this to me; granted, these aren't strictly that, I'm just saying its RNG flavor on top of my prompt is really nice for exploring.

Side note, I entered your text - looks great!


"i entered your text - looks great!"

Yours does too. Very different feel than mine, but beautiful.


You're right, but they can give it an incentive to communicate that; that should be pretty easy.

Right now it would be pretty easy to simply take ChatGPT output, feed it back in in a different thread (or even to a different model, such as Claude), and ask it which items in the response should be fact-checked, and also just to point out any that seem obviously wrong.

The former should be really easy to do: it doesn't have to know whether something is right or wrong, it just has to know that it is a checkable fact. Take the well-known case of a lawyer citing a non-existent case from ChatGPT; it could say "this case should be fact-checked to see that it is real and says what I said it said". Based on my experience with ChatGPT (GPT-4 especially), this should be well within its current capabilities. (I'm going to try an experiment now.)

They could probably start having it do this behind the scenes, checking its own facts and learning from the results, so that it learns when it is likely to hallucinate and gets better at catching it. Even if for safety reasons it's not going out and hitting the web every time you ask a question, it could give you a list at the end of the response of all the things within the response that you might want to check for yourself, maybe suggesting Google searches you should do.
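
Something like this sketch is all I'm picturing for the second pass (assuming the openai Python client, v1+; the model name and prompt wording are just placeholders I made up, not anything OpenAI actually does):

    # Second-pass "what should be fact-checked?" step over a prior answer.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def list_checkable_claims(answer_text: str) -> str:
        """Ask a fresh thread to flag checkable or suspicious claims in an answer."""
        review = client.chat.completions.create(
            model="gpt-4",  # placeholder model name
            messages=[
                {"role": "system",
                 "content": "You review another assistant's answer. List every claim "
                            "that is a checkable fact (citations, case names, numbers, "
                            "dates), and flag any that look likely to be wrong."},
                {"role": "user", "content": answer_text},
            ],
        )
        return review.choices[0].message.content

    # Hypothetical output from a first pass, for illustration only.
    draft = "In Smith v. Jones (1987), the court held that ..."
    print(list_checkable_claims(draft))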


AlphaZero demonstrates that more human-generated data isn't the only thing that makes an AI smarter. It used zero human data to learn to play Go; it just iterated against itself. As long as it has a way of scoring itself objectively (which it obviously does with a game like Go), it can keep improving, with no obvious ceiling on how much it can improve.

Pretty soon ChatGPT will be able to do a lot of training by iterating on its own output, such as by writing code and analyzing the output (including using vision systems).

Here's an interesting thing I noticed last night. I have been making a lot of images that have piano keyboards in them. DALL-E 3 makes some excellent images otherwise (faces and hands mostly look great), but it always messes up the keyboards, as it doesn't seem to get that the black keys come in alternating groups of two and three.

But I tried getting ChatGPT to analyze an image using its new "vision" capabilities, and the first thing it noticed was that the piano keys were not properly clustered. I said nothing about that; I just asked it "what is wrong with this image" and it immediately found the problem. What if it could feed this sort of thing back in, using similar logic to AlphaZero?
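
A crude sketch of the critique-and-regenerate loop I'm imagining (assuming the openai Python client; the model names, prompts, and image URL are placeholders I made up, not how DALL-E 3 is actually wired up):

    from openai import OpenAI

    client = OpenAI()

    def critique_image(image_url: str) -> str:
        """Ask the vision model what is wrong with a generated image."""
        resp = client.chat.completions.create(
            model="gpt-4-vision-preview",  # placeholder model name
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": "What is wrong with this image?"},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }],
        )
        return resp.choices[0].message.content

    def regenerate(prompt: str, critique: str) -> str:
        """Fold the critique back into the next image request."""
        image = client.images.generate(
            model="dall-e-3",
            prompt=f"{prompt}. Avoid these problems: {critique}",
        )
        return image.data[0].url

    prompt = "a pianist at a grand piano, photorealistic"
    first_try = "https://example.com/first_attempt.png"  # placeholder URL
    print(regenerate(prompt, critique_image(first_try)))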

That's just a tiny hint of what is to come. Sure, it typically needs human-generated data for most things. It's already got thousands of times more data than any human has ever looked at. It will also be able to learn from human feedback; for instance a human could tell it what it got wrong in a response (whether regular text, code, or image), and explain in natural language where it deviated from what was expected. It can learn which humans are reliable, so it can minimize the number of paid employees doing RLHF, using them mostly to rate (unpaid) humans who choose to provide feedback. Even if most users opt out of giving this sort of feedback, there will be plenty who don't, giving it new, good information.


With AlphaZero there are clear evaluation metrics -- you win, lose, or draw the game under specific rules. With chess, there is even a way of detecting end-game threats via check. The zero-human-data approach works here because of that, allowing the computer to find optimal strategies.

With natural language you don't have that unaided evaluation metric, especially with idioms, domain-specific terms, etc.

This is slow, hard work because you need to process some text, evaluate and correct that data, retrain, and repeat with the next text. You also need to check and correct the existing data, because inconsistencies will compound any errors.


Unfortunately, I think the current strategies for RLHF are a huge contributor to hallucination / confabulation.

In short, they're paying contract workers for quantity, not quality; they don't have time to do independent research or follow up on citations. Unsurprisingly, the LLM optimizes for superficially convincing bullshit.


"In short, they're paying contract workers for quantity, not quality;"

How do you know this?

Just taking a wild guess, but I'd think a company with billions of funding, and a ton of people trying to find flaws in what they are producing, would have some processes in place to incentivize quality as well as quantity.

What you are suggesting is that a company that produces a product based on balancing trillions of floating point numbers makes core business decisions in the most simplistic black and white terms. "Hey, let's just go with a one and a zero on this." Bizarre assumption.

Maybe I'm just good at prompting, and I'm not trying to trick it, but I don't see this "superficially convincing bullshit." Can you show me a chat where you have sincerely prompted it and gotten something that matches that description?

I often see responses that are better than I could have given even if given hours to research and compose them, and I'm a pretty good writer and researcher.

Here, since I'm asking you to share one where it fails as you say it does by creating "superficially convincing bullshit", I'll share several where it succeeds.

https://chat.openai.com/share/523d0fec-34d3-40c4-b5a1-81c77f...

https://chat.openai.com/share/e09c4491-fd66-4519-92d6-d34645...

https://chat.openai.com/share/233a5ae2-c726-4ddc-8452-20248e...

https://chat.openai.com/share/53e6bda1-fe97-41ce-8f5c-89d639...

https://chat.openai.com/share/19f80ea9-e6be-4ac3-9dd4-7ea15c...


Never forget that RLHF is driven largely by sweatshop labor:

https://www.washingtonpost.com/world/2023/08/28/scale-ai-rem...

These jobs are overwhelmingly paid by task, which puts a lot of pressure to go fast.

I assert the entire "hallucination" phenomenon is a side effect of these practices. When ChatGPT makes up a fake fact with fake sources to back it up, it's largely because such lies are rated very highly by the underpaid humans who aren't incentivized to follow up on sources.


"I assert the entire "hallucination" phenomenon is a side effect of these practices. When ChatGPT makes up a fake fact with fake sources to back it up, it's largely because such lies are rated very highly by the underpaid humans who aren't incentivized to follow up on sources."

It seems like with billions of investment, they could figure that out. It's commonly discussed as an extremely difficult problem to solve, and the most important problem to solve in the most talked-about industry on the planet. I'm having a problem believing that it's something so easy to solve.

Are you suggesting that even with that much money, they have to do things the way things are "overwhelmingly" done, as opposed to being able to say "hey, we need it done this way instead, because it's important and we can pay for it."

It just seems pretty bizarre to think that the highest of high tech, that is massively funded, doesn't have the clout to just fix that in a heartbeat, if that's really where the problem is.


> Among those who labeled demonstration data for InstructGPT, ~90% have at least a college degree and more than one-third have a master’s degree.

Source: https://huyenchip.com/2023/05/02/rlhf.html#demonstration_dat...


Controlling for academic experience probably raises the average accuracy of labelling, but by how much? Clearly having a degree will not make you omniscient in your major, let alone other subjects.


Are they getting paid on quantity or quality?


Most jobs I've known factor in both. I would assume they have processes in place that incentivize quality. Sometimes it is as simple as having a manager who will fire you if you produce crap.

With billions in funding, and bad results causing bad press etc, you think that OpenAI would not have given this a bit of consideration?


> I would assume they have processes in place that incentivize quality.

> you think that OpenAI would not have given this a bit of consideration?

Those are just assumptions though. The issue is not "this was labelled as a shoe, but it's a car"; the issue is about depth vs. superficiality, which is harder to verify. See also https://www.theverge.com/features/23764584/ai-artificial-int... for a well-sourced article on the subject.


With AlphaGo, you have a clear objective -- to win the game. How does that work for creative outputs?


Read the last paragraph. You still have humans, but their input is more akin to a movie reviewer than a movie director/writer/actor/etc. It still takes skill, but it takes a lot less time.

RLHF typically employs humans, and that can be time consuming in itself, but it's less time consuming than creating content. And their efforts can be amplified if they are actually rating unpaid humans, that is, users who are willing to provide feedback and are also prompting the system. Plenty of people are happy to do this for free, and some of it happens just as a byproduct of them doing what they're already doing: creating content and choosing which outputs come out good and which don't. Every time I am working through a coding problem with ChatGPT, and it makes mistakes and I tell it about those mistakes, it can be learning from that.

People can also come up with coding problems that it can run and test itself on. As a simple example, imagine it's trying to write a sorting algorithm. It can also write a testing function that simply checks that the output is correctly sorted. It can time its results and count how many steps it took; in that sense it can work just like AlphaZero, where there is an objective goal (do it in the fewest clock cycles) and a way to test whether and how well it is achieving that goal. While that may only work for a limited set of programming problems, by practicing on that type of problem it will presumably get better at other types of problems, just as humans do.
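
Roughly what I mean, as a minimal sketch (the candidate function here is just a stand-in for model-generated code, not anything a real training loop would use):

    import random
    import time

    def is_sorted(seq):
        """Objective check: every element is <= the next one."""
        return all(a <= b for a, b in zip(seq, seq[1:]))

    def score_candidate(sort_fn, trials=100, size=1000):
        """Return (all_correct, total_seconds) for a candidate sorting function."""
        elapsed = 0.0
        for _ in range(trials):
            data = [random.randint(0, 10_000) for _ in range(size)]
            start = time.perf_counter()
            result = sort_fn(list(data))
            elapsed += time.perf_counter() - start
            if not is_sorted(result) or sorted(data) != result:
                return False, elapsed
        return True, elapsed

    # Stand-in for model-generated code; a real loop would compare many candidates.
    def candidate_sort(seq):
        return sorted(seq)

    ok, seconds = score_candidate(candidate_sort)
    print(f"correct={ok} time={seconds:.3f}s")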

This is essentially what large language models already do: they find a way to objectively test their writing ability, namely by having them predict words in text they've never seen before. In a sense that's different from actually writing new creative content, but it practices skills that you need to tap into when you are creating new content. Interestingly, a lot of people dismiss them as simply being word predictors, but that's not really what they're doing. They're predicting words when they're training, but when they're actually generating new content, they're not "predicting" words (you can't predict your own decisions, that doesn't make sense), they are choosing words.


I got it writing pretty advanced programs that generate fake data sets and self-score those data sets. Fun little project to see what would happen.


The same way we do it. Verifying that an output is good is far easier than producing a good output. We can write a first draft, see what's wrong with it, make changes, and iterate on that until it's a final draft. And along the way we get better at writing first drafts.


> With AlphaGo, you have a clear objective -- to win the game. How does that work for creative outputs?

There are still tons of potentially valuable applications with a clear objective: beating the stock market, creating new materials or designs that maximize some metric, etc.


>The discriminator in a GAN is simply a classifier. It tries to distinguish real data from the data created by the generator. It could use any network architecture appropriate to the type of data it's classifying.

https://developers.google.com/machine-learning/gan/discrimin...


I think similar to humans, creativity will be an emergent behavior as a result of the intelligence needed to pass other tests. Evolution doesn't care about our art, but the capabilities we use to produce it also help us with survival.


This isn’t directly about creativity, but I suspect a lot of training will happen in simulated environments. A sandboxed Python interpreter is a good example. There are plenty of programming questions to train on.
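
For example, something as simple as running generated code in a separate interpreter process and comparing its output to a reference answer (the question and answer here are made up for illustration; a real sandbox would lock things down much more):

    import subprocess
    import sys

    def run_submission(code, stdin_data, timeout_s=2.0):
        """Run generated code in a separate interpreter process with a time limit.
        (A real sandbox would also restrict filesystem, network, and memory.)"""
        proc = subprocess.run(
            [sys.executable, "-c", code],
            input=stdin_data,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return proc.stdout.strip()

    # Made-up training item: read two integers, print their sum.
    generated_code = "a, b = map(int, input().split()); print(a + b)"
    print("pass" if run_submission(generated_code, "3 4") == "7" else "fail")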


That works fine for purely text-based or digital knowledge domains. So, sure, many types of programming, probably most game play, certainly all video game play, many types of purely creative fictional writing.

I don't want to downplay those applications, but the killer breakthrough that the breathless world imagines and has wanted since Turing first talked about this is accurately modeling physical reality. "Invent a better engine" and what not. Without being physically embodied and being able to conduct experiments in the real world, you can't bootstrap that, short of simulating physics from first principles, which is not computationally feasible. You're inherently relying on some quorum of training material produced by embodied sources capable of actually doing science to be factually accurate.


Not dissimilar to how large organizations operate today! Humans operate at the edge collecting sensory data (making measurements, inputting forms, etc.) and the "brain" is a giant management and software apparatus in the middle.


We need an AlphaGo for math problems. Anyone know of a project like this?


I've had very different results.

ChatGPT today is like a backhoe compared to a team of humans with shovels. You still need a person who knows how to operate it, and their skills are different from those who dig with shovels. A bad backhoe operator is worse than any number of humans with shovels.

Pretty soon it will be able to learn by running its own code and testing it by looking at its output, including with its "vision."


> I've had very different results.

That is very interesting. I can't think of a single time the Google built-in LLM has worked for me, let alone surprised and delighted me with a technical answer. I'm sure it's great at a lot of things, but it's not a replacement for SO yet.


Oh, sorry, you said Google. Yes, I am speaking of ChatGPT, and I pay for GPT-4. It surprises and delights me on a regular basis. I have no doubt Google will catch up, but right now I think OpenAI is far out front.


I paid for ChatGPT for a while, but it was hit or miss with some Django stuff. I tried Copilot for the first time today, and I was absolutely blown away. I swear it's like it was reading my mind. I guess I wasn't feeding ChatGPT enough context.


Same. GPT-4 is amazing for a majority of coding tasks I throw at it.


With ChatGPT-4 I have stopped Googling and using SO for 95% of all programming related queries.

ChatGPT not only gets my specific problem but can produce workable code in many cases.


Nothing to be ashamed of; I'd be disappointed if he thought that Zoom was perfect as is. There are probably a lot of things that can be done to improve the product.


I live in SF and saw at least twenty driverless Cruise vehicles today while walking around the Mission. It's quite sudden to see so many; I've really only noticed it in the last week or so. My daughter asked me yesterday "are we in the future?!" Previously they all had someone behind the wheel.

Personally I don't understand the hate. There are a lot of frustrations in this city, but they really aren't one of them. They all seem to drive perfectly and predictably, by my observation. I understand that there will be issues like this, but none of them are realistically safety issues.

I look forward to a day where driverless taxis become cheap enough (due to lack of having to pay someone, and supply and demand) that more people can choose to be carless. I still need a car because I need to get my kid to school outside the city. But now that she doesn't need a booster seat, and now that Cruise vehicles seem ok with having a medium sized dog in them (kind of a problem previously if I want to take my dog across town to a park), I'm almost at the point I could get rid of my car and use public transportation as well as robotaxis when needed. So much space in the city is taken up by parked cars, and this should free up a lot.

At the very least I think having robotaxis will allow a lot of two car families to only have one car.

I understand the concerns about lost jobs but what do you want to do, bring back switchboard operators and elevator operators? If a machine can do the job, I just don't think that's a good use of a human.


> I look forward to a day where driverless taxis become cheap enough (due to lack of having to pay someone, and supply and demand)

Going by previous history, this is unlikely to happen. The companies seek profit and will simply pocket the margins instead of passing them on.


Then customers will choose their cheaper competitors.


You assume there will be competitors.


I don't have to assume there will be competitors because I know that there already are competitors.


Why wouldn't there be?


Because there won't be, and it's naive to think otherwise. The tech isn't open source, the hardware is expensive, and then there's the massive regulatory overhead to get these onto the road.

You'd be better off hoping for your personal car to be autonomous so it can come pick you up, and that's assuming personal cars remain affordable enough.


> the tech isn't open source

Neither was automobile tech a hundred years ago, yet somehow GM and Chrysler managed to put up a fight against Ford.

> the hardware is expensive...and then there's the massive regulatory over head

Sure. It's expensive and not easy, and yet already a few companies are trying. Evidently they view the potential upside as worth it. Turns out, they don't have a monopoly on that view.


That's a really bad false equivalence.

It doesn't take a lot of effort to reverse-engineer mechanical parts, and by that time the early automobile patents had long expired. It doesn't take a mechanical engineer much effort to recreate parts from patent documents. Also, Ford didn't invent the automobile; Karl Benz did.

Now try to reverse-engineer the black-box algorithms necessary to make a self-driving car. Look at the effort involved in reverse-engineering other proprietary, closed-source products.

And good luck getting the NHTSA to approve driving those on the road.


> It doesn't take a lot of effort to reverse engineer mechanical parts

Internal combustion engines are ridiculously complicated. Modern engines directly trace lineage to their manufacturers’ founding. Major engine differences continue to define the automotive field.


> It doesn't take a lot of effort to reverse engineer mechanical parts

Have you personally ever reverse-engineered an internal combustion engine or recreated parts from a patent document?


Waymo is already $5/ride, no tip, in Phoenix. That’s cheaper than an Uber and competitive with public transport.


"Already" betrays a misunderstanding of how this process works. Fares are likely to increase, not decrease.


Nobody can predict the future, not even you.


So.... supply and demand doesn't work? Why is this field immune to normal market economics?


Some people like driving taxis/Lyfts.


"and I (native speaker) think it sounds over the top sincere and dramatic, especially when the same person tends to talk in a completely different manner."

Of course, you can actually tell ChatGPT to change its style and mannerisms. Here's an example from when I was experimenting with this. It went a bit over the top in the other direction, but only because I was so explicit about asking it to be casual.

https://chat.openai.com/share/07f6f9aa-de02-4a0c-8eeb-574eef...


It's even pretty good at doing this multilingually. I live in Indonesia, and there is a sort of dialect used by various people from Jakarta that basically combines Indonesian and English in a way that is quite recognizable when you hear it.

The other day I asked ChatGPT to answer a question I had in Indonesian, and the answer was very formal (Google Translate generally has the same problem – the translations it gives you are way too formal for most speech). So I asked it to rephrase in this Jakartan slang, and it did very well.


This won't necessarily help non-native speakers who are not able to tell if and which adjustments are needed.


No, but if the AI is good enough, you can say things like "write in a non-formal style" and it will do just that. Whether or not ChatGPT can do this well 6 months from its launch isn't the point.

Right now ChatGPT tends to write formally and cautiously, for rather obvious reasons. Better to err on that side than the other. With a combination of thoughtful prompting, and AIs getting better over time, most of the arguments against using them in such situations fade.

Remember, the original article is not talking about different languages, it is simply about a doctor using it to help out. Obviously the doctor can scan it prior to showing it to anyone.


"Side note; does anyone actually think that chat gpt succeeds at rewriting things 'compassionately' or 'more intelligently'?"

Often, sure.

A few days ago I asked ChatGPT about a situation that needed compassion and wisdom. I thought it did way better than the 5 or 6 people I had talked to about it.

https://chat.openai.com/share/2dd26347-3fbd-483f-b567-21c040...

(ignore the first question, which was silly)

FYI, we're going to give the dog another chance.... but I was glad ChatGPT didn't push for that. I think ChatGPT answered it pretty close to how someone well trained in both mental health therapy and dealing with dogs with severe behavior issues might handle it.


Good luck with the dog, and thank you for sharing that. I agree that was a reasonable response, but I may be biased because I recently had a conversation along these lines and it hit most of the same beats. It is certainly free of the defects I mentioned. Thank you so much for the counterexample; it's perfect.

I wonder if it is significant that you didn't ask it for a specific kind of response?


Likewise, thanks for scoring a point for humanity. :) It's rare to get someone on the internet to appreciate a counterpoint.

Yes, it probably helps that I didn't ask it for advice, but rather just kind of talked to it as a sounding board. I guess I'm weird that way, in that I feel it is important to talk to an AI like it is human and has feelings (saying everything from "please" to "wow, that was amazing, thanks"), even if only for myself.

But I've become convinced it gets better responses. Which isn't altogether surprising, if you think about how LLMs work.


I just read the GPT snippet in the response link you gave, and while it superficially seems well written, the text is hardly what I'd call moving or good. It tries just hard enough at covering all the formulaic points of empathy and sympathy to emerge without a human spark of either. Essentially, it reads more like a personalized distillation of a generic committee-made sympathy note. If this is today's idea of a decent compassionate note, it's a sadly bland example of what some people consider meaningful.


Well, I guess the difference is that it didn't write it for you, it wrote it for me. And it was exactly what I needed to hear. And exactly not what the humans I talked to about the situation were providing, who tended to be all about "here's how I fixed the (supposedly) similar situation with my own dog" (which is unhelpful, given that I've had other dogs and this one is very, very different). For whatever reason, I guess they didn't think it was important to say that it is a heartbreaking situation. They didn't say "It's a tough situation, but your compassion and consideration for your dog's well-being is commendable." They didn't say "As much as we'd like to, we can't change the innate behaviors and instincts of certain breeds." They just skipped that part, like it didn't need to be said.

But that's what I needed to hear. I didn't need whatever "human spark" that you imagine (but that I personally doubt you can actually demonstrate).

So on that note, why don't you show me what you'd write, if you were trying to be helpful to someone in that situation? I'm genuinely interested in seeing if you can write something that isn't formulaic, that has that human spark you speak of, and that is otherwise better than the response that a machine gave me.

I get it if that's too much time and effort to do, but really, it shouldn't take a lot of time at all. At least, not if you are suggesting that a doctor in an ER should take a break from saving lives to do such a thing. So please, just write a paragraph or two that shows what you'd say if a friend came to you looking for empathy in such a situation.

"it's a sadly bland example of what some people consider meaningful."

Again, let's hear what you think is better, the ball is in your court. As it stands, based solely on my interaction with you as well as my above interaction with ChatGPT, it looks like machines do empathy better than humans. But I am open minded to the idea that you can actually do better than the machine. Let's hear it.


You're just bringing in your own expectations, then, of what would qualify as a thoughtful response from a human.

Or prove it and share an example that puts it to shame by comparison.

Oh, or you could just genuinely be ignorant and struggle to understand how others sometimes feel when dealing with difficult decisions.



Just thought I'd post a quick reminder, I'd be very interested in seeing what you mean by the ChatGPT example being a "sadly bland example of what some people consider meaningful." I don't understand what that means, because you haven't shown me what kind of response you think would be better.

Would you care to write one? Feel free to do one quickly, such as something a busy ER doctor would be able to put together. Otherwise, though, it's very hard to accept the idea that ChatGPT did such a poor job without something to compare it to.

Thanks!

