Truthiness-focused search
2026-02-09, access: Pro
Tags: LLaMA, evaluation, hallucination, sampling, text

It appears that the earlier, shallower layers of a transformer language model learn syntax, while the later, deeper layers learn factual information. So can we boost factual accuracy by amplifying the effect of the deeper layers? In my view, that is analogous to dosing the model with a mind-altering drug.
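To make "boosting the effect of deeper layers" concrete, here is a minimal sketch of one possible reading, not necessarily the method discussed in this post: score each candidate token by its final-layer log-probability plus a bonus for how much that log-probability rose relative to an earlier layer. The function names, the `alpha` knob, and the toy logits are all hypothetical, chosen only for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def boosted_scores(early_logits, final_logits, alpha=1.0):
    """Final-layer log-probs plus alpha times their gain over an early layer.

    alpha is a hypothetical strength knob: alpha=0 reduces to ordinary
    final-layer scoring; larger values reward tokens whose probability
    grew between the early layer and the final layer.
    """
    early = log_softmax(early_logits)
    final = log_softmax(final_logits)
    return [f + alpha * (f - e) for f, e in zip(final, early)]


def pick_token(early_logits, final_logits, alpha=1.0):
    """Greedy choice: index of the highest boosted score."""
    scores = boosted_scores(early_logits, final_logits, alpha)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy three-token vocabulary (made-up numbers, not model output).
# Token 0 is equally favored at both depths; token 1 gains a lot late.
EARLY = [2.0, 1.0, 0.0]
FINAL = [2.0, 1.9, 0.0]

plain_choice = pick_token(EARLY, FINAL, alpha=0.0)
boosted_choice = pick_token(EARLY, FINAL, alpha=1.0)
print(plain_choice, boosted_choice)
```

In this toy case the plain choice and the boosted choice differ: the token that the deeper layer promoted overtakes the one that was already likely early on. Whether that shift corresponds to better factual accuracy in a real model is exactly the open question.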
