Generate and read: Oh no they didn't
2025-05-21, access: Public
- Download video (MP4, 302 MiB)
- Slides (PDF)
- Transcript (txt)
- Link to the paper: https://arxiv.org/abs/2209.10063
Tags: prompting, text, GPT, RAG, hallucination

You can't stop people from asking your language model factual questions, and you can't stop the language model from making up nonsense answers, and that's a problem.
One of the standard solutions is RAG (Retrieval-Augmented Generation): before running the model, you do a search on Wikipedia, and then add the results of the Wikipedia search to the prompt. Then the model only has to explain the Wikipedia articles (which are infallible) in answer to the user's question. Models are much better at summarizing and explaining than at answering questions from their own knowledge, so the hope is that with RAG you're more likely to get answers that are actually true.
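In code, the plumbing is roughly this. A minimal sketch only: `search_wikipedia` and `complete` are placeholders for whatever retriever and model API you actually use, not anything from the paper.

```python
def rag_answer(question, search_wikipedia, complete, k=5):
    """Answer a question by retrieving passages and prompting the model over them."""
    # Retrieve k passages (e.g. BM25 or a dense retriever -- your choice).
    passages = search_wikipedia(question, top_k=k)
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    # Stuff the retrieved text into the prompt and ask the model to answer from it.
    prompt = (
        "Using only the passages below, answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return complete(prompt)  # one call to the reader model
```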
The idea in this paper by Yu et al. is to take Wikipedia out of the picture. They run a RAG pipeline, but instead of using search results they just run another model (or, quite possibly, the same model with a specially crafted prompt) to generate pretend Wikipedia articles, and then feed those into the main model the same way one would feed in real Wikipedia articles. And hope that this concatenation of models with no source of facts will somehow produce answers that are true because, uh...
The damned thing about this is that it actually seems to work! So the talk explores how that could be.
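For concreteness, the swap looks something like this. A rough sketch of the generate-then-read idea, not the authors' exact prompts (the paper actually generates several diverse documents and feeds them all to the reader; this is the simplest single-document version), with `complete` again standing in for your model API.

```python
def generate_then_read(question, complete):
    """Answer a question using a model-generated document in place of retrieved context."""
    # Step 1: no retrieval -- ask the model to invent a background document.
    fake_doc = complete(
        "Generate a background document that would help answer the question.\n\n"
        f"Question: {question}\nDocument:"
    )
    # Step 2: feed the invented document back in as if it were a search result.
    prompt = (
        "Using only the passage below, answer the question.\n\n"
        f"{fake_doc}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return complete(prompt)
```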
Comments:
This was a fun watch and describes something truly bizarre (akin to subconscious understanding, as you point out) if it is actually a "real thing" that is happening as the paper authors want us to believe.
Now, when I am messing around with LLMs, I occasionally ask them to create 10 internal imaginary Wikipedia articles written from 10 different cultural contexts (human diversity, as opposed to diversity of article formats) orthogonal to the inquiry of the moment and synthesize information using a simulated RAG model to produce a more globally holistic response.
The result, especially noticeable in creative and occult work, is much more integration of references and systems of understanding outside my narrow Western modality. Not sure I'm actually accessing anything esoteric in the machine itself, but it is definitely a way to probe the depths (and voids) of my own cultural subconscious.
Not sure exactly how you are setting that up, but I would want to make the model actually show me the 10 imaginary articles, or use a "thinking" model where I can see them in its "thoughts"; else I'd expect the model to just say it had done so without necessarily really writing them down. I'm still looking for a paper that introduces "thinking" models well, to be the focus for a talk on those.
But even if it's to some extent being faked (model says "Okay, I did that" even when it didn't), if it produces better final results, it may be valuable.
That's a good consideration, thanks. Didn't consider the possibility that it would just say it had done so without actually doing so. Will henceforth ensure I ask it to explicitly present its work.