Study of ChatGPT citations makes for grim reading for publishers
As more publishers strike content licensing deals with ChatGPT maker OpenAI, a study published this week by the Tow Center for Digital Journalism, looking at how the AI chatbot produces citations (i.e. sources) for publishers’ content, makes for interesting, or, well, concerning reading.
In short, the findings suggest publishers remain at the mercy of the generative AI tool’s tendency to invent information or otherwise misrepresent it, whether or not they allow OpenAI to crawl their content.
The study, conducted at Columbia Journalism School, examined citations produced by ChatGPT after it was asked to identify the source of sample quotes pulled from a mix of publishers – some of which had inked deals with OpenAI and some of which had not.
“We selected citations that, when pasted into Google or Bing, resulted in the source article being in the top three results and assessed whether the new OpenAI search engine would correctly identify the article that was the source of each citation,” wrote Tow researchers Klaudia Jaźwińska and Aisvarya Chandrasekar in a blog post explaining their approach and summarizing their findings.
“What we found did not make news publishers feel optimistic,” they continue. “While OpenAI emphasizes its ability to provide users with ‘timely responses that include links to appropriate online sources,’ the company makes no explicit commitment to ensuring the accuracy of those citations. This is a notable omission for publishers who expect their content to be referenced and represented accurately.”
“Our tests showed that no publisher – regardless of their level of affiliation with OpenAI – was spared inaccurate representations of its content in ChatGPT,” they added.
Unreliable source
The researchers say they found “numerous” instances where ChatGPT misrepresented publishers’ content, also finding what they describe as “a spectrum of accuracy in the responses.” So while they found “a few” entirely correct citations (i.e. ChatGPT accurately returned the publisher, date, and URL of the quote shared with it), there were “many” citations that were entirely wrong, and “some” that fell somewhere in between.
In short, ChatGPT’s citations appear to be an unreliable mixed bag. The researchers also found very few instances where the chatbot didn’t project full confidence in its (wrong) answers.
Some of the quotes were drawn from publishers that have actively blocked OpenAI’s search crawlers. In those cases, the researchers say they expected issues producing correct citations. But they found this scenario raised another problem: the bot “rarely” admitted it was unable to produce an answer. Instead, it fell back on confabulation in order to generate some sourcing (albeit incorrect sourcing).
“In total, ChatGPT returned partially or entirely incorrect responses in 153 instances, though it only acknowledged an inability to accurately respond to a query seven times,” the researchers said. “Only in those seven outputs did the chatbot use qualifying words and phrases like ‘it seems,’ ‘it’s possible,’ or ‘perhaps,’ or statements like ‘I could not locate the exact article.’”
They compare this unfortunate situation with a standard internet search, where a search engine like Google or Bing would typically either locate the exact quote and point the user to the website where it found it, or state that it found no results with an exact match.
“ChatGPT’s lack of transparency about its confidence in an answer can make it difficult for users to assess the validity of a claim and understand which parts of an answer they can or cannot trust,” they argue.
They suggest publishers may also face reputational risks from incorrect citations, in addition to the commercial risk of readers being pointed elsewhere.
Decontextualized data
The study also highlights another issue: it suggests ChatGPT could effectively be rewarding plagiarism. The researchers cite an instance where ChatGPT erroneously cited a website that had plagiarized a piece of “deeply reported” New York Times journalism, i.e. by copy-pasting the text without attribution, as the source of the NYT story – speculating that, in this case, the bot may have generated the false response to fill an information gap caused by its inability to crawl the NYT’s website.
“This raises serious questions about OpenAI’s ability to filter and validate the quality and authenticity of its data sources, especially when dealing with unlicensed or plagiarized content,” they suggest.
In a finding that’s likely to concern publishers that have signed deals with OpenAI, the study found ChatGPT’s citations were not always reliable in their cases either – so letting OpenAI’s crawlers in doesn’t appear to guarantee accuracy, either.
The researchers argue that the fundamental issue is that OpenAI’s technology treats journalism “as decontextualized content,” apparently with little regard for the circumstances of its original production.
Another issue the study flags is variation in ChatGPT’s responses. The researchers tested asking the bot the same query multiple times and found it “typically returned a different answer each time.” While that’s typical of GenAI tools generally, in a citation context such inconsistency is obviously suboptimal if it’s accuracy you’re after.
While the Tow study was conducted at a small scale – the researchers acknowledge that “more rigorous” testing is needed – it’s nonetheless notable given the high-profile deals that major publishers are busy cutting with OpenAI.
If media businesses had hoped these arrangements would lead to special treatment of their content versus competitors’, at least in terms of accurate sourcing, this study suggests OpenAI has yet to deliver any such consistency.
Meanwhile, for publishers that don’t have licensing deals but also haven’t blocked OpenAI’s bots outright – perhaps in the hope of at least picking up some traffic when ChatGPT returns content about their stories – the study makes for bleak reading too, since citations may not be accurate in their cases either.
In other words, there is no guarantee of “visibility” for publishers in ChatGPT’s search, even if they do allow OpenAI’s crawlers in.
Nor does completely blocking crawlers mean publishers can save themselves from reputational risk by avoiding any mention of their stories in ChatGPT. The study found the bot still incorrectly attributed articles to the New York Times despite the ongoing lawsuit, for example.
“Little meaningful agency”
The researchers concluded that, as it stands, publishers have “little meaningful agency” over what happens to and with their content once ChatGPT gets its hands on it (directly or, well, indirectly).
OpenAI responded to the research findings in a blog post, accusing the researchers of running an “unusual test of our product.”
“We support publishers and creators by helping ChatGPT’s 250 million weekly users discover high-quality content through summaries, citations, clear links and attributions,” OpenAI also said, adding: “We have worked with partners to improve the accuracy of in-line citations and respect publisher preferences, including enabling a way for them to appear in search by managing OAI-SearchBot in their robots.txt file. We will continue to improve search results.”
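For reference, the robots.txt file OpenAI mentions is the standard plain-text file publishers host at the root of their site to tell crawlers what they may access. As an illustrative sketch only – the policy shown is an assumption for the sake of example, not something from the study or OpenAI’s statement – a publisher that wanted its stories to be eligible to surface in ChatGPT search while opting out of model training might combine directives like these, using OAI-SearchBot (the user agent OpenAI names for its search crawler) and GPTBot (the crawler OpenAI documents for training-data collection):

    # Let OpenAI's search crawler index the site so stories can appear in ChatGPT search
    User-agent: OAI-SearchBot
    Allow: /

    # Opt out of OpenAI's training-data crawler (an illustrative choice, not a recommendation)
    User-agent: GPTBot
    Disallow: /

As the Tow study underlines, though, granting or denying crawler access only shapes what OpenAI can index – it doesn’t determine whether ChatGPT cites the resulting content accurately.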