Betting on sycophancy
2026-01-26
evaluation, hallucination, text

Chat models have a well-known tendency toward sycophancy: affirming the user's beliefs, even when the user is wrong. But this effect is confounded with several other effects. In this paper the authors attempt to isolate sycophancy by framing questions as a zero-sum game or bet between two humans.
