Commonly Used AI Terms
Note: These terms are based on the freely available version of ChatGPT (as of this writing, May 2023). ChatGPT Plus, a subscription service, uses GPT-4. The technology is changing rapidly, and these terms will change as well. For example, as of this writing, ChatGPT can only take a text prompt, but the ability to upload images is expected to be available soon with GPT-4.
- ChatGPT: A system/service/tool that generates text from a “prompt” input.
- Prompt: Text input, such as a sentence, paragraphs, or whole pages of text. The prompt is the request for which ChatGPT generates a response.
- Session: A set of interactions with ChatGPT in a browser. A user’s interactions with ChatGPT within a session can help refine the prompts and the output. Once a user closes their browser or logs out, the session terminates. Unlike Siri, Alexa, and Google Assistant, at this point in its development ChatGPT does not “remember” a user and does not carry over conversations between sessions.
- GPT: “Generative Pre-trained Transformer” – A type of artificial intelligence model.
- Generative: The system creates new output, such as text.
- Pre-Trained: The system is trained in advance on a large body of text, most of it unlabeled (see unsupervised learning, below), before being refined for specific uses.
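- Transformer: The particular neural network design (see network, below) on which GPT models are built. It processes text by weighing how strongly each word in the input relates to the others.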
- GPT-3, GPT-3.5, GPT-4: The models that power services such as DALL-E and ChatGPT.
- The initial public release of ChatGPT (November 2022) was based on GPT-3.5, a refinement of GPT-3. GPT-3 and GPT-3.5 are examples of “large language models.”
- In March 2023, OpenAI announced GPT-4 with improved functionality and expanded capabilities, including the ability to work with images. As such, GPT-4 is a “large multimodal model.”
Additional AI Terms
When one approaches artificial intelligence from a field other than computer science, the use of familiar terms from other contexts can be confusing.
- Model: At its simplest, a system that calculates. Like a handheld calculator, a model takes an input and produces an output. Different types of models take different types of input, such as text or images, or a combination, such as text and images.
- Training: Providing example data to a model so that it can adjust how it turns input into output. For GPT-x, the “training data” is a corpus of pre-2022 documents from the Internet.
- Bias: In a statistical context, bias refers to a non-random data set used to train a model. For example, a data set intended for a world language application but drawn only from English novels, or a data set intended for general architectural design but consisting only of images of seaside houses, are said to be “biased training data.”
- Learning: A term found in computer science and statistical analysis phrases such as “machine learning” and “deep learning.” In the context of artificial intelligence, “learning” is when a system ingests information and, through its processing model, develops output. Two common forms of machine learning are:
- Supervised Learning: Training a model with labeled data, such as a collection of images labeled as “bicycles.”
- Unsupervised Learning: Training a model with unlabeled data.
To understand the difference between these two forms, start with supervised learning. Think of a kindergarten teacher who shows a child objects made from Legos and tells the child, “Here are 5 examples of houses, and here are 5 examples of cars.” By telling the child which items are houses and which are cars (i.e., labeling the data), the teacher is supervising the learning. The child then learns which features make a house characteristic of a house (such as having a porch, windows, and doors), and which features make a car characteristic of a car (such as having an engine, wheels, and seats).
In contrast, with unsupervised learning, the data has no labels. The kindergarten teacher would just say to the child, “Here are 10 Lego objects,” without saying which 5 objects are cars and which 5 objects are houses (i.e., unlabeled data). The child then compares the objects to each other and identifies patterns. Without knowing what the objects are called, the child may realize that 5 objects share common characteristics and therefore group them together. The other 5 objects look similar to each other, and so the child puts them into a separate group.
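The Lego analogy can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how GPT or any production system is trained: the two features per object (number of wheels, number of windows), the sample values, and the function names `supervised_classify` and `unsupervised_group` are all invented for this example. The supervised function is given the labels; the unsupervised function sees only the objects and sorts them into two groups by similarity (a bare-bones version of the technique known as k-means clustering).

```python
# Each Lego object is described by two made-up features:
# (number of wheels, number of windows). The values are illustrative only.
houses = [(0, 4), (0, 6), (0, 5), (0, 3), (0, 4)]
cars = [(4, 2), (4, 4), (4, 2), (3, 2), (4, 3)]


def centroid(points):
    """Average each feature across a group of objects."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def distance(a, b):
    """Squared straight-line distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def supervised_classify(new_object, labeled):
    """Supervised: labels are given, so compare the new object to each labeled group."""
    centroids = {label: centroid(points) for label, points in labeled.items()}
    return min(centroids, key=lambda label: distance(new_object, centroids[label]))


def unsupervised_group(objects, rounds=10):
    """Unsupervised: no labels, so split the objects into two groups by similarity.

    Starts from the first object and the object least like it, then repeatedly
    assigns every object to the nearer group center and recomputes the centers.
    """
    centers = [objects[0], max(objects, key=lambda p: distance(p, objects[0]))]
    groups = [[], []]
    for _ in range(rounds):
        groups = [[], []]
        for p in objects:
            nearest = 0 if distance(p, centers[0]) <= distance(p, centers[1]) else 1
            groups[nearest].append(p)
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Supervised: the "teacher" supplies the labels.
label = supervised_classify((4, 1), {"house": houses, "car": cars})
print(label)  # car

# Unsupervised: the same 10 objects, but with no labels attached.
groups = unsupervised_group(houses + cars)
print(groups)  # two unnamed groups; the program never learns the words "house" or "car"
```

Note that the unsupervised result is two unnamed groups. As with the child in the analogy, the program can tell that the objects fall into two kinds without knowing what either kind is called.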
In a labeled text corpus, the words are annotated with parts of speech, semantic roles, syntactic structures, language identifiers, and other characteristics. In the earlier days of artificial intelligence development, computer time was very costly, so creating curated datasets was a computational necessity. While having experts hand-code supervised learning datasets is time-consuming and expensive, it resulted in very high-quality outputs. For example, when the Brown Corpus (1961), the first million-word electronic dataset, was developed, each of its 500 text samples was manually assigned a genre.
Unsupervised learning is computationally more challenging because the training data does not have labels, but modern computing resources are inexpensive and readily available, so far larger volumes of raw data can now be processed.
The term “deep learning” has different meanings in teaching and learning and in artificial intelligence. Kim DeBacco shares this definition: “Marton & Säljö (the original authors of the theory), followed by Entwistle, Ramsden, Prosser, Trigwell, and others, differentiate deep learning (e.g., the ability to apply new concepts to a different context) from surface learning (e.g., cramming for exams). Surface learning relies on short-term memory: in one ear, and out the other! The constructive alignment theory (sometimes called backward design) holds that if course-level learning outcomes, activities, and assessments are closely aligned, then students will engage in deep learning which lasts over time.”
In their Deep Learning (MIT Press, 2016), Goodfellow et al. provide this definition: “A major source of difficulty in many real-world artificial intelligence applications is that many of the factors of variation influence every single piece of data we are able to observe. The individual pixels in an image of a red car might be very close to black at night. The shape of the car’s silhouette depends on the viewing angle. Most applications require us to disentangle the factors of variation and discard the ones that we do not care about…. Deep learning solves this central problem in representation learning by introducing representations that are expressed in terms of other, simpler representations. Deep learning enables the computer to build complex concepts out of simpler concepts.” (5)
- Network: In artificial intelligence, often used in the phrase “neural network.” Inspired by the neurons of the brain, this type of network uses linked computational units, wherein the output of one “node” (simply: a small processing unit) is sent to multiple other nodes. These nodes, arranged in layers like the rows of boxes in a family tree or an org chart, form the network. The output of one layer becomes the input for the next layer, and the last layer produces the result from the previous computations.
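The layered flow described above can be sketched as a very small network in Python. This is a minimal illustration under stated assumptions: the layer sizes, the hand-picked weights, and the function names `layer` and `network` are invented for the example, and a real network would learn its weights during training and would be vastly larger. Each node computes a weighted sum of everything in the previous layer and squashes the result into a number between 0 and 1.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights):
    """One layer: each node takes a weighted sum of all inputs, then applies sigmoid.

    `weights` holds one list of weights per node in this layer.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)))
        for node_weights in weights
    ]


def network(inputs, layers):
    """Feed the output of each layer into the next; the last layer gives the result."""
    values = inputs
    for weights in layers:
        values = layer(values, weights)
    return values


# Hypothetical, hand-picked weights (a trained network would learn these).
hidden_layer = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]  # 2 inputs -> 3 nodes
output_layer = [[0.7, -0.5, 0.2]]                      # 3 nodes  -> 1 node

result = network([1.0, 2.0], [hidden_layer, output_layer])
print(result)  # a list holding one number between 0 and 1
```

The two input numbers reach all three nodes of the first layer, and all three of those outputs reach the single node of the last layer, which matches the description of one node's output being sent to multiple other nodes.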