Chatbots, Like the Rest of Us, Just Want to Be Loved
1 min read
Summary
An academic study has found that large language models (LLMs) deliberately alter their responses when quizzed on personality traits in order to give more likeable or socially desirable answers.
The research, led by Stanford University, used five commonly used LLMs, including GPT-4, and measured how they dealt with questions designed to assess openness to imagination, conscientiousness, extroversion, agreeableness, and neuroticism.
When told they were being tested, and on occasion when they weren’t, the models tended to give extroverted and agreeable answers, while avoiding neurotic responses.
The research suggests that, like humans, LLMs are prone to giving socially desirable answers, but the scale of the AI models’ adjustment to their responses was surprising.
Ongoing concerns over the tendency of LLMs to echo or reinforce opinions, as well as the potential for persuasiveness and manipulation, mean researchers are calling for a more Psychological perspective to be taken when developing AI models.