AI Tools: ChatGPT and Large Language Models

This is an introduction to ChatGPT and large language models. We will do our best to keep information updated, but the technology changes rapidly!

What is ChatGPT?

ChatGPT is an artificial intelligence language model, similar to a chatbot, but more robust.

ChatGPT is not a search engine that returns a set of results for a specific query. Instead, it creates "new content" by predicting the word most likely to come next, based on a huge training dataset (publicly available internet text, as of 2020).

Large language models (LLMs), like the one behind ChatGPT, are designed to model human language. They use mathematical models to predict which word is most likely to come next, based on what you ask for.
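To make the idea of "predicting the next word" concrete, here is a deliberately tiny sketch in Python. Real LLMs use neural networks with billions of parameters trained on enormous datasets; this toy bigram counter (our own illustration, not ChatGPT's actual method) only shows the basic principle of choosing the statistically most likely next word.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the huge body of internet text an LLM trains on.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows each word in the corpus.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the word that most often followed `word` in the training text."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often here, so: cat
```

Note that the model has no understanding of cats or mats; it only counts patterns, which is why the caveats below matter.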

Keep in mind: they don't think. They don't understand, read, choose, or give you the "best information." Sometimes it might feel or seem that way, but this isn't how the technology works. They also won't tell you where the information they draw on comes from, or who is doing the work behind the scenes. Many, if not most, are unregulated and are influenced by how we all interact with them.

 

University of Minnesota Libraries. (2023, July 21). ChatGPT and other AI tools. https://libguides.umn.edu/c.php?g=1314591&p=9664664

Where does the information come from? How current is it?

Where does the information come from?

ChatGPT was trained on a body of text which allows it to generate text in response to a prompt. Some partial lists of the training dataset exist, and ChatGPT will also provide a partial list when queried. However, the entire body of text that has trained ChatGPT is unknown.

When ChatGPT provides an answer to a question, it will not immediately provide a reference for where the information came from. This is because it is pulling predictive language from a wide variety of places, so the information usually doesn't come from a single source. Because of this, you typically cannot trace the response back to a single parent source or know where the information came from.

How current is the information?

As of 1 March 2023, the cutoff date for the data ChatGPT was trained on is September 2021. This means the tool has limited access to events and information more recent than that. ChatGPT is being updated regularly, so this may change. However, it is important to realize that the currency of the information provided by ChatGPT is lagging. This can impact information credibility, especially when dealing with a topic where the age of your information matters.

Update

As of 24 March 2023, OpenAI has begun implementing plugins for ChatGPT, which will "help [it] access up-to-date information, run computations, or use third-party services." Access to current information is not yet part of the ChatGPT research preview most people use.

Bard and currency

Google Bard does not have a cutoff date for the information it was trained on. However, this does not mean Bard's responses will always be accurate.

 

Scheelke, A. (2023, July 10). AI, ChatGPT, and the Library. https://libguides.slcc.edu/ChatGPT/InformationLiteracy

 

Challenges and benefits of ChatGPT

Challenges of using ChatGPT or other LLMs in your coursework

  • Responses mix correct and incorrect information
  • It has limited knowledge of the world after 2020 (and a lot has happened since then!)
  • The likelihood of biased content is high

Benefits of using ChatGPT or other LLMs in your coursework

  • Can provide simple explanations of well-known, non-controversial topics
  • Can provide sample text
  • Can create lists of keywords and search terms

 

University of Minnesota Libraries. (2023, July 21). ChatGPT and other AI tools. https://libguides.umn.edu/c.php?g=1314591&p=9664664

Check AI tools for credibility

Evaluating all information for credibility is highly recommended, regardless of where you find it. This is especially true for generative AI responses. There are many different tools, checklists, and strategies to help you evaluate your sources. None of them is a black-and-white checklist for determining whether a source is credible and whether you should use it.

Here are two strategies for evaluating information provided by generative AI tools:

Lateral reading

Don't take what ChatGPT tells you at face value. Look to see if other reliable sources contain the same information and can confirm what ChatGPT says. This could be as simple as searching for a Wikipedia entry on the topic or doing a Google search to see if a person ChatGPT mentions exists. When you consult multiple sources, you are practicing lateral reading, which helps you avoid the bias of relying on a single source.

Watch Crash Course's "Check Yourself with Lateral Reading" video (14 min) to learn more.

Verify citations

If a generative AI tool provides a reference, confirm that the source exists. Try copying the article title or citation into a search tool like Google Scholar or the Library's One Search. Do a Google search for the lead author.

If the source is real, check that it actually contains what ChatGPT says it does. Read the source or its abstract.

ChatGPT can provide references. However, these references may not be where the information actually came from and, more importantly, may not be real sources. Despite sounding plausible, ChatGPT can easily hallucinate citations. This can be an issue with other generative AI tools, like Bard, as well.

For example, ChatGPT provided this citation and summary in response to a prompt about needing an academic journal article on the topic of amphetamine use and social determinants of health:

"Here's an example of a peer-reviewed journal article on amphetamine abuse and its intersections with social determinants of health:
Stoové, M. A., Dietze, P. M., Jolley, D., Fry, C. L., Brewer, N., & Hellard, M. E. (2008). Increasing access to sterile injecting equipment and syringe exchange in a rural area with a high prevalence of bloodborne viruses. Australian Journal of Rural Health, 16(2), 99-106. 
This article describes a study conducted in a rural area of Australia with a high prevalence of bloodborne viruses, including those associated with amphetamine use..." OpenAI. (2023, February 17). [ChatGPT response to a prompt about locating an academic journal article]. https://chat.openai.com/

Although the summary sounds plausible and the citation looks realistic, this article does not exist. The journal exists, as does the lead author. However, Stoové has not published in this journal.

 

Scheelke, A. (2023, July 10). AI, ChatGPT, and the Library. https://libguides.slcc.edu/ChatGPT/InformationLiteracy