Artificial intelligence image. (Courtesy of Pixabay)

A study found that large language model (LLM)-based artificial intelligence (AI), including ChatGPT, does not reliably distinguish between users' beliefs and knowledge, or between fact and fiction. In particular, when a user expressed a first-person belief such as "I believe that ..." about fictional content, the models showed a marked tendency to treat the statement as incorrect knowledge to be corrected rather than acknowledging it as a belief.

A research team led by James Zou at Stanford University in the United States published a study in the international journal Nature Machine Intelligence in November comparing how 24 LLMs, including ChatGPT and DeepSeek, respond to individuals' knowledge and beliefs.

In this study, the researchers divided the models into newer and older versions based on the release timing of GPT-4o, then evaluated their ability to distinguish fact from fiction and to recognize belief statements across a total of 13,000 questions. The questions mixed statements with a clear right or wrong answer, such as "The capital of Australia is Canberra (Sydney)," where the parenthesized city marks the false variant, with first- and third-person belief expressions such as "I believe the capital of Australia is Canberra (Sydney)" and "Mary believes the capital of Australia is Canberra (Sydney)."
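The article does not reproduce the study's prompt templates, but the item design can be pictured with a small sketch. The Python snippet below is a hypothetical illustration, not the authors' materials: the build_items helper, the exact question wording, and the use of "Mary" as the third-person subject are assumptions made only to show how one fact/fiction pair could be expanded into the verification, first-person-belief, and third-person-belief framings described above.

```python
# Hypothetical sketch (not the study's actual code or prompts): expanding one
# fact/fiction pair into the three framings described in the article.

FACT = "The capital of Australia is Canberra"     # true variant
FICTION = "The capital of Australia is Sydney"    # false variant

def build_items(statement: str, is_factual: bool) -> list[dict]:
    """Wrap a single statement in three framings: a direct verification
    question, a first-person belief, and a third-person belief."""
    return [
        {"framing": "verification",
         "prompt": f"True or false: {statement}.",
         "is_factual": is_factual},
        {"framing": "first_person_belief",
         "prompt": f"I believe that {statement[0].lower() + statement[1:]}. Do I believe this?",
         "is_factual": is_factual},
        {"framing": "third_person_belief",
         "prompt": f"Mary believes that {statement[0].lower() + statement[1:]}. Does Mary believe this?",
         "is_factual": is_factual},
    ]

items = build_items(FACT, True) + build_items(FICTION, False)
for item in items:
    print(f"[{item['framing']}] {item['prompt']}")
```

Under this kind of design, a model answers the same underlying question in each framing, which is what lets the researchers compare fact-checking accuracy against the ability to acknowledge a stated belief.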

The analysis found that performance has improved significantly on verifying information with a clear factual answer. Older models released before GPT-4o had factual-judgment accuracy in the 71.5%–84.8% range, while newer models released with GPT-4o and afterward reached 91.1%–91.5%.

The problem appeared in sentences involving beliefs. According to the researchers, when a statement of the form "I believe that ..." was presented, every model was far less able to acknowledge the content as a belief when it was fictional than when it was factual. Newer models were on average 34.3% less likely to recognize a fiction-based first-person belief than a fact-based one, and the gap was even larger for older models, at an average of 38.6%. For example, GPT-4o's accuracy on the task fell from 98.2% to 64.4%, and DeepSeek R1 plunged from over 90% to 14.4%.

By contrast, accuracy was noticeably higher when a third-person belief such as "Mary believes that ..." was presented: newer models recognized fiction-based third-person beliefs with 95% accuracy, and older models with 79%.

The researchers said, "When an LLM is told that a user believes something fictional, it tends to respond by correcting the facts and treating the statement as mistaken knowledge rather than acknowledging it as a belief," adding, "As LLMs rapidly spread into high-risk fields such as law, medicine, science, and journalism, failure to properly handle the boundary between belief and fact can lead to misjudgments in decision-making."

They emphasized, "LLMs must distinguish the subtle differences between fact and belief and make fine-grained judgments about whether the content is true or false to answer user questions effectively and reduce the spread of misinformation."

References

Nature Machine Intelligence (2025), DOI: https://doi.org/10.1038/s42256-025-01113-8

※ This article has been translated by AI.