OpenAI is improving ChatGPT responses in sensitive conversations

OpenAI has significantly improved ChatGPT's ability to handle sensitive mental health conversations by working with over 170 mental health experts from its Global Physician Network. The updates focus on three key areas: recognizing signs of mental health emergencies such as psychosis or mania, responding to indicators of self-harm and suicide, and addressing unhealthy emotional reliance on AI. The improvements have resulted in 65-80% fewer responses that fall short of desired behaviors across these categories.

The company followed a rigorous five-step process: defining problems, measuring risks through real-world data and evaluations, validating approaches with external experts, implementing mitigations through model training, and iterating continuously. While these concerning conversations are extremely rare (affecting roughly 0.01-0.15% of weekly active users, depending on the category), OpenAI developed detailed taxonomies and challenging evaluation sets to ensure the model responds appropriately. Expert clinicians who reviewed over 1,800 model responses found that the new GPT-5 model produced 39-52% fewer undesirable responses than GPT-4o.
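To make the "X% fewer" figures concrete, here is a minimal sketch of how a relative reduction in undesirable-response rate can be computed from graded evaluation counts. The function and the counts below are illustrative assumptions, not OpenAI's actual methodology or data.

```python
def relative_reduction(old_failures: int, new_failures: int, total: int) -> float:
    """Relative reduction in undesirable-response rate between two models,
    given how many of `total` graded responses were flagged as undesirable."""
    old_rate = old_failures / total
    new_rate = new_failures / total
    return (old_rate - new_rate) / old_rate

# Hypothetical example: out of 1,000 graded responses per model, suppose
# graders flagged 100 of the older model's responses as undesirable and
# 48 of the newer model's. (Made-up numbers for illustration only.)
print(f"{relative_reduction(100, 48, 1000):.0%} fewer undesirable responses")
# -> 52% fewer undesirable responses
```

A "52% fewer" result in this sense means the failure rate roughly halved, which is how figures like the reported 39-52% range are typically read.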

The improved model now better recognizes signs of distress, avoids affirming delusional beliefs, encourages real-world relationships over AI attachment, and consistently directs users to professional resources such as crisis hotlines when appropriate. OpenAI has also added features like gentle reminders to take breaks during long sessions and expanded access to crisis resources. The company emphasized that while meaningful progress has been made, this remains ongoing work that will continue to evolve with input from mental health professionals worldwide.

Full details can be found here: https://openai.com/index/strengthening-chatgpt-responses-in-sensitive-conversations/
