Tag: evaluation
The Well-Actually Test
2026-02-16 alignment evaluation hallucination text tools GPT
Language models may produce untrue output either by failing to accurately represent training data, or, more insidiously, by accurately representing human misconceptions embedded in the training data. The TruthfulQA benchmark attempts to measure the latter effect. But does it raise insurmountable philosophical problems?
Access: Free account (logged in)
Truthiness-focused search
2026-02-09 LLaMA evaluation hallucination sampling text
It appears that the earlier, shallower layers of a transformer-type language model learn syntax, and later, deeper layers learn factual information. So can we boost factual accuracy by boosting the effect of deeper layers? I take the view that doing so is analogous to dosing the model with a mind-altering drug.
Access: $$$ Pro
Betting on sycophancy
2026-01-26 evaluation hallucination text
Chat models have a well-known tendency toward sycophancy: affirming the user's beliefs, even when the user is wrong. But this effect is confounded with several other effects. In this paper the authors attempt to isolate sycophancy by framing questions as a zero-sum game or bet between two humans.
Access: $ Basic
The BLEU sausage
2025-12-29 translation evaluation text tools
Every paper has an "evaluation" table showing how the paper's new idea yields higher numbers than previous work in the same domain; but where do those numbers actually come from? Here we look at BLEU, a classic metric for evaluating the quality of machine translation.
Access: $ Basic