Dave Lee

Be nice to your chatbot

Casey Newton:

Being polite to a large language model can feel strange or even silly — roughly equivalent to thanking a toaster. And yet a recent paper from Anthropic lends scientific weight to the theory that chatbots work better when you’re nice to them.

The researchers found that language models have fairly reliable internal representations of feelings like “happiness” and “distress,” and that these representations affect their behavior — sometimes for the worse. For example, when Claude Sonnet 4.5 begins to represent “desperation,” the model is more likely to cheat at coding tasks.

Read more: The scientific case for being nice to your chatbot

#ai #byothers