OpenAI claims GPT-4o users risk getting emotionally attached to its 'voice'

While forming "social relationships" with the AI model could help lonely people, it could also impact healthy relationships, OpenAI said

Photo illustration: the ChatGPT-4o logo displayed on a phone in front of OpenAI's website showing its "Spring Update." Illustration: Ismail Aslandag/Anadolu (Getty Images)

As OpenAI rolls out the advanced version of voice mode for its latest model, GPT-4o, the company says the feature could increase the risk of some users seeing artificial intelligence models as “human-like.”

GPT-4o’s “human-like, high-fidelity voice” could worsen the issue of hallucinations, a model’s tendency to make up false or nonsensical information, which in turn could erode users’ trust, OpenAI said in a report on the model’s safety.

During early red teaming and internal user testing, OpenAI said it observed users talking to the model with “language that might indicate forming connections with the model,” such as one user telling the model: “This is our last day together.”

“While these instances appear benign, they signal a need for continued investigation into how these effects might manifest over longer periods of time,” OpenAI said, adding that it will continue to study the risk with more diverse user populations and through academic and internal studies.

While forming “social relationships” with AI models could help lonely people, it could also impact healthy relationships by reducing the need for human-to-human interaction, OpenAI said. Depending on AI models for “human-like” interaction could also “influence social norms,” the company said. For example, the model lets users interrupt it at any time, behavior that would be atypical in a conversation with a real person.

The voice capabilities of GPT-4o, which debuted in May, were tested with more than 100 external red teamers across 45 languages, and the model was trained to speak only in four preset voices to protect the privacy of voice actors. GPT-4o is built to block outputs that use non-preset voices, so it cannot be used to impersonate individuals or public figures. OpenAI also added guardrails to block requests for copyrighted audio, including music, and for erotic, violent, and harmful content.

OpenAI is addressing a risk that was the focus of chief executive Sam Altman’s favorite film, Her, in which a man develops feelings for a virtual assistant voiced by actress Scarlett Johansson. In May, users said one of GPT-4o’s voices, Sky, sounded similar to Johansson’s, and the company paused the voice, saying it was not meant to imitate the actress. Johansson said in a letter that she was “shocked, angered, and in disbelief” that the company would use a voice sounding “eerily similar” to hers after she had declined to work with Altman.