OpenAI’s ChatGPT Search is struggling to accurately cite news publishers, according to a study by Columbia University’s Tow Center for Digital Journalism.
The report found frequent misquotes and incorrect attributions, raising concerns among publishers about brand visibility and control over their content.
Additionally, the findings challenge OpenAI’s commitment to responsible AI development in journalism.
Background On ChatGPT Search
OpenAI launched ChatGPT Search last month, claiming it collaborated extensively with the news industry and incorporated publisher feedback.
This contrasts with the original 2022 rollout of ChatGPT, where publishers discovered their content had been used to train the AI models without notice or consent.
Now, OpenAI allows publishers to specify via the robots.txt file whether they want to be included in ChatGPT Search results.
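For illustration, a site-wide opt-out might look like this in robots.txt (a minimal sketch; OAI-SearchBot is the user agent OpenAI documents for ChatGPT Search, while the separate GPTBot agent governs crawling for model training):

    # Keep pages out of ChatGPT Search results
    User-agent: OAI-SearchBot
    Disallow: /

    # Separately, block crawling for model training
    User-agent: GPTBot
    Disallow: /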
However, the Tow Center’s findings suggest publishers face the risk of misattribution and misrepresentation regardless of their participation choice.
Accuracy Issues
The Tow Center evaluated ChatGPT Search’s ability to identify sources of quotes from 20 publications.
Key findings include:
Of 200 queries, 153 responses were incorrect.
The AI rarely acknowledged its mistakes, hedging with qualifiers like “possibly” in only seven of its responses.
ChatGPT often prioritized pleasing users over accuracy, which could mislead readers and harm publisher reputations.
Additionally, researchers found ChatGPT Search is inconsistent when asked the same question multiple times, likely due to the randomness baked into its language model.
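That inconsistency follows from how large language models generate text: at a nonzero sampling temperature, each token is drawn probabilistically, so the same prompt can yield different answers on different runs. A minimal sketch of this using OpenAI’s Python SDK (the model name and prompt are illustrative, not from the Tow Center study):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    question = "Which publication first ran this quote: '<quote text here>'?"

    # With temperature above zero, tokens are sampled rather than chosen
    # deterministically, so repeated calls can return different answers.
    for attempt in range(3):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[{"role": "user", "content": question}],
            temperature=1.0,
        )
        print(f"Attempt {attempt + 1}: {response.choices[0].message.content}")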
Citing Copied & Syndicated Content
Researchers found that ChatGPT Search sometimes cites copied or syndicated articles instead of the original source, likely because of publisher crawling restrictions or system limitations.
For example, when asked for a quote from a New York Times article (the Times is currently suing OpenAI and blocks its crawlers), ChatGPT linked to an unauthorized copy of the article on another site.