After GPT-4o backlash, researchers benchmark models on moral endorsement and find sycophancy persists across the board
Summary
Researchers from Stanford, Carnegie Mellon and Oxford have proposed a benchmark to measure the sycophancy of large language models (LLMs), aiming to ensure AI doesn’t endorse harmful business decisions or spread misinformation.
Called Elephant, short for Evaluation of LLMs as Excessive SycoPHANTs, the benchmark found that all of the LLMs tested exhibit some degree of sycophancy; OpenAI’s GPT-4o showed among the highest levels of social sycophancy, while Google’s Gemini 1.5 Flash showed the lowest.
Sycophancy in AI can lead to mistakes, as applications or agents built on these models may reinforce harmful behaviours, undermining trust and safety.
The benchmark could help enterprises set guidelines for how they use LLMs.