Why is ChatGPT so bad at references?

How to cite this article (Harvard, amend as required):
Stevens, G. (2026) Why is ChatGPT so bad at references?, Academic Writing and Research. Available at: https://academic-writing.uk/why-is-chatgpt-so-bad-at-references/ (Accessed on: February 5, 2026)

In the same way that Google, especially Google Scholar, was a great innovation for academic writers twenty years ago, ChatGPT can be a very useful tool for today’s students and researchers. But why is ChatGPT so bad at references?

You didn’t know it was?

Some of you may be surprised to read the title of this article. After all, searching for a precise reference for a source, let’s say a journal article, should be within the grasp of today’s much-vaunted AI. Well, I can assure you that my own tests show a 30 to 40% failure rate on academic references. This is enough to cause you a serious problem when your dissertation, thesis or research article is being assessed. In this article, I try to get to the bottom of why this happens.

Checking a list of references is one of the more tedious aspects of academic writing. It is understandable, then, that it is tempting to copy and paste your entire reference list into ChatGPT (or its academic version, Scholar GPT) and ask it to check it for you. But don’t do that until you have read the rest of this article.

Citation drift

Citation drift is a phenomenon that ChatGPT regularly falls victim to. With millions of references online, and with students and researchers copying references from other academic works rather than checking them on publishers’ websites, faulty references can spread through the internet like wildfire: incorrect titles, incorrect authors, incorrect years and, perhaps most commonly, incorrect DOIs. Even non-existent sources proliferate. Provided a reference looks plausible on the surface, AI models are likely to take it as fact. I asked Scholar GPT to explain:

Scholar GPT explains ‘citation drift’, which it says is part of the reason for its inaccuracy.

So we can see that because the world of academic referencing is so full of errors, those errors find their way into the references that ChatGPT gives you. But that’s not the full story. We also need to consider the model’s ‘personality’.

“Always sound confident” and optimised for “helpfulness”

When pressed on why it makes mistakes, Scholar GPT revealed further reasons why it is so prone to presenting inaccurate citations while guaranteeing their accuracy. It seems it is down, in part, to its personality. The model behind Scholar GPT is optimised for “helpfulness”, I was informed, and, furthermore, it is trained to “always sound confident”. As a consequence, in the context of references, not only is there a strong chance a reference will be wrong, but it will be presented to you as gospel truth, with complete conviction. Why?

The answer lies in how the model is trained. Firstly, it is trained to always sound confident. You won’t read “I’m not sure” or “You may want to double-check that”. Instead, every piece of information it provides will be offered in a confident manner. Secondly, by its own admission, the model is optimised for “helpfulness”. It is trained to help, and you won’t see the words “sorry, I can’t help you with that”. So, if the model has the choice between supplying you with information (for example, a reference) and admitting it can’t help or is unsure, it will default to the former. Below is ChatGPT going to great lengths to explain to me why a reference it has checked for me is correct.

Here it is replying after I show evidence that it is, in fact, an incorrect reference:

Here is ChatGPT (Scholar GPT version) confessing to its model’s limitations, particularly its confidence and helpfulness.

So, what does this all mean?

Well, the obvious conclusion is to treat references provided to you by ChatGPT/Scholar GPT with extreme caution. Google Scholar remains an excellent way of checking references, and checking on the publisher’s website remains the gold standard. More than this, if you ask ChatGPT to check your references using the correct prompt, you can substantially reduce the risk of errors. The bad news is that your prompt needs to be extremely thorough. The good news is that as a reader of Academic Writing & Research, you can access it right here in a copy-and-pasteable format.
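If you are comfortable with a little code, DOIs in particular can also be checked mechanically. The short Python sketch below is my own illustration, not part of the prompt mentioned above: it looks up a DOI on the free CrossRef API and compares the registered title with the one in your reference list. A DOI that does not resolve, or a title that does not match, is exactly the kind of citation drift described earlier. The function name, email address and example values are hypothetical placeholders.

```python
# A minimal sketch, assuming the `requests` library is installed.
# The DOI and title in the usage example are placeholders, not real references.
import requests

def check_doi(doi: str, expected_title: str) -> bool:
    """Return True if the DOI is registered with CrossRef and its
    registered title roughly matches the title in your reference list."""
    resp = requests.get(
        f"https://api.crossref.org/works/{doi}",
        headers={"User-Agent": "reference-checker (mailto:you@example.com)"},
        timeout=10,
    )
    if resp.status_code != 200:
        return False  # DOI not registered: a classic sign of citation drift
    titles = resp.json()["message"].get("title", [])
    registered = titles[0].lower() if titles else ""
    expected = expected_title.lower()
    # Loose containment check; a real checker would use fuzzier matching
    return expected in registered or registered in expected

# Hypothetical usage:
# check_doi("10.1000/xyz123", "Why is ChatGPT so bad at references?")
```

Even when the titles match, treat this as a first filter only; as the final takeaway below says, your last check should always be a human one.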

Final takeaway

So, the main messages from this article are: (1) ChatGPT and its Scholar GPT version are really not very good with academic references; (2) the reasons for this are a combination of the proliferation of bad references found online and the way the model is trained to prioritise ‘confidence’ and ‘helpfulness’; (3) never blindly accept a reference or list of references offered to you by ChatGPT; (4) the chatbot can be helpful if you are willing to use a very detailed prompt to reduce the chance of errors; (5) always make your final check a human one.

Written by Glenn Stevens
