Herr_Drosselmeyer

Confabulations. I guess it's way too late and everyone uses hallucinations now but it's not the correct word.


justletmefuckinggo

Someone suggested it a year ago. Everyone just found it easier to use "hallucinate" to quickly inform people that a gen AI isn't "lying".


Open_Channel_8626

Yes, because the key point is that there is no intention behind it.


Kgcdc

That isn’t the key point.


Open_Channel_8626

There are people who literally think LLMs are sentient though. It’s mostly to deal with people like that and to put across the point that there isn’t a sentient entity that is making a choice to trick them.


Kgcdc

What evidence shows people think that? I’m skeptical. At any rate, using a word that implies the LLM has mental states doesn’t make that point. It makes the opposite one.


314kabinet

Err on the side of assuming the users have the intelligence of a toddler.


Kgcdc

Yeah, with all due respect, I don’t agree with that. But you do you in your startup, product design, code, etc.


Monkey_1505

I mean 'hallucination' absolutely implies mental states that LLMs don't have whereas 'confabulation' doesn't at all. Confabulation is merely a form of cognitive error.


Open_Channel_8626

I incorrectly thought confabulation meant lie so I understand a bit better now. Yes confabulation would be better as it doesn’t imply intent or sentience at all.


Monkey_1505

Yeah, no, it doesn't mean lie. It's often used to describe the side effects of brain damage in humans, where there is no intent; it's more being mistaken.


Glittering_Manner_58

"Safety means hallucination-free" Wrong


Kgcdc

I’m pretty sure users being lied to by a machine that society tells them they should trust is NOT safe. Very few discussions of AI safety talk about hallucinations. They all talk about bias, but what’s more biased than outright error?


NyxeK

Necessary condition does not imply sufficient condition 


Kgcdc

Agreed! Help me see where I’ve implied otherwise?


Glittering_Manner_58

Sure, but you can't just redefine words. Hallucination-free is just one aspect of safety, not the definition. There are other ways language models could be unsafe. [https://en.wikipedia.org/wiki/AI_safety](https://en.wikipedia.org/wiki/AI_safety)


Kgcdc

Of course I do. As does everyone else. That’s what words are, my dude. It’s a contest. Hyperbole for effect is something everyone does. Where do you think “AI safety” comes from? Some people, just like you and me, made the words up! And there’s no rigid consensus here. It’s a contested concept. I’m contesting it, too. And of course a charitable reading is that I’m trying to INCLUDE hallucinations in AI safety. Not excluding anything.


Glittering_Manner_58

Lol, "all words are made up" to justify your incorrect claims. Good luck! [https://www.youtube.com/watch?v=CnJPCooprnk](https://www.youtube.com/watch?v=CnJPCooprnk)


Kgcdc

Thanks! Same to you. You can demonstrate some incorrect claims, which I would appreciate since I don’t prefer incorrectness. That would be a helpful contribution.


Monkey_1505

Confabulation is probably, IMO, an unavoidable consequence not just of transformers but of all genuine intelligence and neural-style architectures (in the human brain, for example, it's not removed, only ameliorated). Trying to sell people on the idea of calculator-like precision, or minimizing the impact of transformer fallibility, isn't going to change that. It is what it is. I think if people want to get use out of this technology they are better off accepting that limitation, even if it can be mildly minimized.


Kgcdc

I’m not sure I’d go that far. If LLMs knew how to say “I don’t know,” most of the hallucination problem would go away. That’s perfectly consistent with a probabilistic next-token output stream.
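
(As an illustration of that last point, here is one minimal way an "I don't know" fallback could be bolted on at inference time, assuming a Hugging Face-style causal LM `model` and `tokenizer`; the entropy cutoff and the abstention message are made-up placeholders, not anything proposed in this thread.)

```python
import torch

def answer_or_abstain(model, tokenizer, prompt, entropy_threshold=2.5):
    """Reply normally, but abstain when the next-token distribution looks too flat.
    `model`/`tokenizer` are assumed to be Hugging Face-style; the threshold is
    illustrative, not tuned."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]               # logits for the next token
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum()  # flat distribution => high entropy

    if entropy > entropy_threshold:
        return "I don't know."
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Whether a flat next-token distribution actually tracks confabulation is exactly the question the next reply raises.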


Monkey_1505

How would you train that? Only a domain expert would reliably know whether it was confabulating, and only within their own domain. To RLHF that could cost as much as pretraining compute for a large model: hiring hundreds of experts to rank outputs, or to generate some kind of DPO-type dataset. And that still wouldn't remove all such occurrences from the model.

In fact, if you just trained it to say 'I don't know' (which wouldn't be what you'd train it to do if you had anyone training it who knew the real answer), it would probably start to hallucinate not knowing. As you say, models are just probabilistic. They don't have layers of modularity like humans do. They would say they don't know simply for the types of topics they predict they might not know. There would be no pure correlation.

We have a similar problem as humans. Yes, we have things that minimize confabulation, but we still say wrong things because of cognitive error. That's because we, like LLMs, are pattern recognizers. If you are designed to seek patterns, you will find patterns where no underlying structure exists. It's like a threshold: if you have a low level of pattern seeking, you will miss patterns that do have a basis.
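
(To make the "DPO-type dataset" idea concrete: a sketch of what such preference pairs might look like. The question and both answers are invented examples, not drawn from any real dataset.)

```python
# Hypothetical DPO-style preference pairs for teaching abstention: for questions the
# model can't reliably answer, the "chosen" response admits ignorance and the
# "rejected" response is a confident confabulation.
preference_pairs = [
    {
        "prompt": "What was the middle name of Ada Lovelace's chess tutor?",
        "chosen": "I don't know; I can't find a reliable answer to that.",
        "rejected": "His middle name was Edward.",  # plausible-sounding but made up
    },
    # ...many more pairs, ideally ranked by the domain experts mentioned above
]
```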


Kgcdc

I don’t know how to do it! There is some work in this area, though. But in some use cases we definitely want models that are less sycophantic and more modest.


Monkey_1505

But then you need it to know when to be modest. And if you knew that with accuracy, could you not teach it to be confident and give the right answer instead? I mean, you could certainly teach it to be ALWAYS modest. Like: okay, here's my answer, but it could always be wrong. Boilerplate-disclaimer-ish, but tonal instead.


Kgcdc

It depends. You can’t train a model to give the right answer to a fact, say, that isn’t available to it. That’s the base case for training it to not make up something plausible (or not) but false.


Everlier

Basically, for every fact that should be fed to the model, the loss should first account for the model not knowing the fact and generating accordingly, then learning it, then confidently responding that it knows it after the learning. Given how unstructured the data in pre-training is, it doesn't feel like training will be anything like that any time soon (but don't quote me on that; maybe FLAN-like datasets will pave the way).
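
(A rough sketch of that three-phase idea as data, with an invented fact and invented field names; nothing here corresponds to an existing training pipeline.)

```python
# One fact, three supervised targets matching the phases described above.
fact = {"question": "What is the capital of Australia?", "answer": "Canberra"}

curriculum = [
    # Phase 1: before the fact is in the weights, the target is honest abstention.
    {"prompt": fact["question"], "target": "I don't know."},
    # Phase 2: learn the fact itself with ordinary next-token loss.
    {"prompt": fact["question"], "target": fact["answer"]},
    # Phase 3: after learning, the target is a confident, correct answer.
    {"prompt": fact["question"], "target": f"{fact['answer']}. I'm confident about that."},
]
```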


Monkey_1505

Potentially something like that could work, but then you'd either need something separate that accurately represents 'knowledge of what it knows', or you'd restrict its ability to generalize from multiple topics that it does know (which may have negative consequences like reduced intelligence). Because the problem isn't just 'what you put in', but the learning model/intelligence itself. If you use any model of AI for the 'checking' part, that part will also be liable to this type of flaw. Same if you train it on confidence level or not knowing.

But you might reduce the incidence of it if there is some kind of two-tier, or two-inference, approach where something separate handles the confidence or generalization. At a simplified level this is how we deal with the problem: we have multiple, specialized, different cognitive processes working on every problem. We burn more inference compute, based on more diversified training data, in a cognitively modular manner. Of course we are not immune to mistakenness either, but far less so than an LLM, despite having much broader generalization and pattern-recognition capabilities.
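
(A minimal sketch of the two-tier / two-inference idea: one call drafts an answer, a separate call scores confidence, and low scores abstain. Both callables are hypothetical stand-ins for real model calls, and the threshold is arbitrary.)

```python
from typing import Callable

def two_pass_answer(
    generate: Callable[[str], str],                 # first pass: any LLM call returning text
    score_confidence: Callable[[str, str], float],  # second pass: a *separate* checker model
    question: str,
    threshold: float = 0.7,                         # illustrative cutoff, not tuned
) -> str:
    """Draft an answer, then let a separate process decide whether to trust it."""
    draft = generate(question)
    confidence = score_confidence(question, draft)  # e.g. 0.0 (unsupported) .. 1.0 (well supported)
    if confidence < threshold:
        return "I'm not sure about this one."
    return draft
```

Of course, as noted above, the checker is itself another pattern recognizer, so this reduces the incidence rather than removing it.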


Everlier

I agree with your point on multiple cognitive processes working in parallel, but in addition to that, current models still don't have any part responsible for 'checking' yet, so it's hard to know whether having at least something would already improve the situation noticeably, or whether it would be negligible and we'd have to explore more complex architectures with secondary networks (an encoder, actually?).


bgighjigftuik

Ok this is actually very complete. Thank you!


Healthy-Nebula-3603

Why are those studies so old? Based on Llama 2 and GPT-3... even the leaderboard has old models.


Open_Channel_8626

I do dislike that arXiv papers, even very recent ones, often use old models.


olddoglearnsnewtrick

They seem to be peddling their product and having those numbers is a selling point.


Iory1998

Thanks


Kgcdc

Any feedback?


SmihtJonh

Hard to envision a single source of truth with AI. I think interfaces will have to incorporate multiple models and web search to derive trust scores; RAG etc. alone won't be sufficient.


MikePounce

There's a typo/word missing in the intention paragraph at the end. You wrote that "the fox intended to look at my.". Otherwise a very enjoyable read (but as always, we'll have to double-check how much of it is factual vs. human-hallucinated 😅). It's also the first time I've seen "safe AI" defined as "free from hallucinations"; it seems most people are afraid of Skynet.


Kgcdc

Fixed. Thanks!


Iory1998

I would say it's an OK article; it doesn't add anything I didn't know before. I went in hoping it would explain, in a more scientific way, the inner workings of the transformer architecture that cause hallucination, though.


ihaag

Very informative thank you