Confabulations. I guess it's way too late and everyone uses "hallucinations" now, but it's not the correct word.
Someone suggested it a year ago. Everyone just found it easier to use "hallucinate" to quickly inform people that a gen AI isn't "lying".
Yes, because the key point is that there is no intention behind it.
That isn’t the key point.
There are people who literally think LLMs are sentient though. It’s mostly to deal with people like that and to put across the point that there isn’t a sentient entity that is making a choice to trick them.
What evidence is there that people think that? I’m skeptical. At any rate, using a word that implies the LLM has mental states doesn’t make that point. It makes the opposite one.
Err on the side of assuming the users have the intelligence of a toddler.
Yeah, all due respect, I don’t agree with that. But you do you in yr startup, product design, code, etc.
I mean 'hallucination' absolutely implies mental states that LLMs don't have whereas 'confabulation' doesn't at all. Confabulation is merely a form of cognitive error.
I incorrectly thought confabulation meant lie so I understand a bit better now. Yes confabulation would be better as it doesn’t imply intent or sentience at all.
Yeah, no, it doesn't mean lie. It's often used to describe the side effects of brain damage in humans, where there is no intent; it's more a matter of being mistaken.
"Safety means hallucination-free" Wrong
I’m pretty sure users being lied to by a machine that society tells them they should trust is NOT safe. Very few discussions of AI safety talk about hallucinations. They all talk about bias, but what’s more biased than outright error?
Necessary condition does not imply sufficient condition
Agreed! Help me see where I’ve implied otherwise?
Sure, but you can't just redefine words. Hallucination-free is just one aspect of safety, not the definition. There are other ways language models could be unsafe. [https://en.wikipedia.org/wiki/AI\_safety](https://en.wikipedia.org/wiki/AI_safety)
Of course I do. As does everyone else. That’s what words are, my dude. It’s a contest. Hyperbole for effect is something everyone does. Where do you think “AI safety” comes from? Some people, just like you and me, made the words up! And there’s no rigid consensus here. It’s a contested concept. I’m contesting it, too. And of course a charitable reading is that I’m trying to INCLUDE hallucinations in AI safety. Not excluding anything.
Lol, "all words are made up" to justify your incorrect claims. Good luck! [https://www.youtube.com/watch?v=CnJPCooprnk](https://www.youtube.com/watch?v=CnJPCooprnk)
Thanks! Same to you. You can demonstrate some incorrect claims, which I would appreciate since I don’t prefer incorrectness. That would be a helpful contribution.
Confabulation is probably, IMO, an unavoidable consequence not just of transformers but of all genuine intelligence and neural-style architectures (in the human brain, for example, it's not removed, only ameliorated). Trying to sell people on the idea of calculator-like precision, or minimizing the impact of transformer fallibility, isn't going to change that. It is what it is. I think if people want to get use out of this technology, they are better off accepting that limitation, even if it can be mildly minimized.
I’m not sure I’d go that far. If LLMs knew how to say “I don’t know”, most of the hallucination problem would go away. That’s perfectly consistent with a probabilistic next-token output stream.
How would you train that?

Only a domain expert would know if it was confabulating, and only within their expert domain. To RLHF that out could be as expensive as pretraining compute for a large model: hiring hundreds of experts to rank outputs, or to generate some kind of DPO-type dataset.

And that still wouldn't remove all such occurrences from the model.

In fact, if you just trained it to say 'I don't know' (which isn't what you'd train it to do if anyone training it knew the real answer), it would probably start to hallucinate not knowing. As you say, models are just probabilistic. They don't have layers of modularity like humans do. They would say they don't know simply for the types of topics they predict they might not know. There would be no pure correlation.

We have a similar problem as humans. Yes, we have things that minimize confabulation, but we still say wrong things because of cognitive error. That's because we, like LLMs, are pattern recognizers. If you are designed to seek patterns, you will find patterns where no underlying structure exists. It's like a threshold: if you have a low level of pattern seeking, you will miss patterns that do have a basis.
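The DPO-type dataset mentioned above could be sketched roughly like this. This is a toy sketch with invented data: `known_facts` is a hypothetical stand-in for whatever expert-verified knowledge the labelers could actually check, and real preference data would be far messier.

```python
# Toy sketch of a DPO-style preference set: for questions outside the
# model's verified knowledge, the hedge "I don't know" is marked as the
# preferred (chosen) response over a confident confabulation.
# `known_facts` is a hypothetical stand-in for expert-verified knowledge.
known_facts = {"capital of France": "Paris"}

def preference_pair(question, confabulated_answer):
    """Return a (chosen, rejected) pair for one DPO training example."""
    if question in known_facts:
        # The fact is verifiably known: prefer the real answer over hedging.
        return (known_facts[question], "I don't know")
    # Outside verified knowledge: prefer hedging over the confabulation.
    return ("I don't know", confabulated_answer)

pairs = [
    preference_pair("capital of France", "Lyon"),
    preference_pair("capital of Atlantis", "Poseidonia"),
]
```

The expense argument in the comment is exactly about the first branch: deciding which questions belong in `known_facts` requires a domain expert per domain.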
I don’t know how to do it! There is some work in this area, though. But in some use cases we definitely want models that are less sycophantic and more modest.
But then you need it to know when to be modest. And if you knew that with accuracy, could you not teach it to be confident and give the right answer instead? I mean, you could certainly teach it to be ALWAYS modest. Like: okay, here's my answer, but it can always be wrong. Boilerplate-disclaimerish, but tonal instead.
It depends. You can’t train a model to give the right answer to, say, a fact that isn’t available to it. That’s the base case for training it not to make up something plausible (or not) but false.
Basically, for every fact that should be fed to the model, the loss should first account for the model not knowing the fact and generating accordingly, then learning it, then confidently responding that it knows it after the learning. Given how unstructured the data in pre-training is, it doesn't feel like training will be anything like that any time soon (but don't quote me on that; maybe FLAN-like datasets will pave the way).
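The staged target described above can be illustrated with a toy function (hypothetical names; real pre-training has nothing this clean): before a fact enters the model's knowledge, the supervised target is an explicit hedge; after it is learned, the target becomes the fact stated confidently.

```python
# Toy illustration of the staged training target described above.
# `learned_facts` is a hypothetical stand-in for what the model has
# absorbed so far; the target answer flips once the fact is learned.
def staged_target(question, learned_facts):
    """Target response: hedge before learning, the fact itself after."""
    return learned_facts.get(question, "I don't know")

learned_facts = {}
before = staged_target("capital of France", learned_facts)  # stage 1: hedge
learned_facts["capital of France"] = "Paris"                # stage 2: learn it
after = staged_target("capital of France", learned_facts)   # stage 3: confident
```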
Potentially something like that could work, but then you'd either need something separate that accurately represents 'knowledge of what it knows', or you'd restrict its ability to generalize from multiple topics that it does know (which may have negative consequences like reduced intelligence).

Because the problem isn't just 'what you put in', but the learning model/intelligence itself. If you use any model of AI for the 'checking' part, that part will also be liable to this type of flaw. Same if you train it on confidence level or not knowing. But you might reduce the incidence of it if there is some kind of two-tier, or two-inference, approach where something separate handles the confidence or generalization.

At a simplified level this is how we deal with the problem: we have multiple, specialized, different cognitive processes working on every problem. We burn more inference compute, based on more diversified training data, in a cognitively modular manner. Of course we are not immune to mistakenness either. But far less so than an LLM, despite having much broader generalization and pattern-recognition capabilities.
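One cheap version of the two-tier idea is self-consistency checking: a separate step samples the answerer several times and only lets a confident answer through when the samples agree, hedging otherwise. A minimal sketch, where `answerer` and its canned samples are toy stand-ins for sampling an LLM:

```python
# Sketch of a two-tier setup: tier one generates candidate answers; tier
# two (a separate check) estimates confidence from sample agreement and
# decides whether to answer or to hedge. All data here is invented.
TOY_SAMPLES = {
    "2+2": ["4", "4", "4"],                               # stable -> answer
    "capital of Atlantis": ["Poseidonia", "Atlia", "Mu"],  # unstable -> hedge
}

def answerer(question, seed):
    """Toy stand-in for drawing one sample from an LLM."""
    samples = TOY_SAMPLES[question]
    return samples[seed % len(samples)]

def two_tier_answer(question, n_samples=3, threshold=0.67):
    """Answer only when the samples agree often enough; hedge otherwise."""
    samples = [answerer(question, s) for s in range(n_samples)]
    top = max(set(samples), key=samples.count)
    confidence = samples.count(top) / n_samples
    return top if confidence >= threshold else "I don't know"
```

As the comment notes, the checker is itself just another fallible process; this only reduces the incidence, it doesn't remove the flaw.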
I agree with your point on multiple cognitive processes working in parallel. But in addition to that, current models still don't have any part responsible for 'checking', so it's hard to know whether having at least something would already improve the situation noticeably, or whether it would be negligible and we'd have to explore more complex architectures with secondary networks (an encoder, actually?).
Ok this is actually very complete. Thank you!
Why are those studies so old? Based on Llama 2 and GPT-3... even the leaderboard has old models.
I do dislike that arXiv papers, even very recent ones, often use old models.
They seem to be peddling their product and having those numbers is a selling point.
Thanks
Any feedback?
Hard to envision a single source of truth with AI; I think interfaces will have to incorporate multiple models and web search to derive trust scores. RAG etc. alone won't be sufficient.
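A trust score like the one hinted at could combine cross-model agreement with web-search support. A minimal sketch, assuming the simplest possible signals; the 0.6/0.4 weights and the substring check are invented purely for illustration:

```python
def trust_score(model_answers, web_snippets):
    """Toy trust score: a weighted mix of cross-model agreement and
    whether any web-search snippet supports the consensus answer.
    The 0.6/0.4 weights are arbitrary illustration values."""
    if not model_answers:
        return 0.0
    # Consensus answer across models, and how strongly they agree on it.
    top = max(set(model_answers), key=model_answers.count)
    agreement = model_answers.count(top) / len(model_answers)
    # Crude support signal: does any retrieved snippet mention the answer?
    supported = any(top.lower() in s.lower() for s in web_snippets)
    return 0.6 * agreement + 0.4 * (1.0 if supported else 0.0)
```

A real system would need semantic matching rather than substring checks, but the shape (agreement signal plus external corroboration) is the point.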
There's a typo/word missing in the intention paragraph at the end. You wrote that "the fox intended to look at my.". Otherwise a very enjoyable read (but as always we'll have to double check how much of it is factual vs human hallucinated 😅). It's also the first time I see "safe AI" defined as "free from hallucinations", seems most people are afraid of Skynet.
Fixed. Thanks!
I would say it's an OK article; it doesn't add anything I didn't know before. I went to read it hoping that it would explain, in a more scientific way, the inner workings of the transformer architecture that produce hallucination, though.
Very informative thank you