From LLMs to hallucinations, here’s a simple guide to common AI terms
Artificial intelligence is a deep and convoluted world. The scientists who work in this field often rely on jargon and lingo to explain what they’re working on. As a result, we frequently have to use those technical terms in our coverage of the artificial intelligence industry. That’s why we thought it would be helpful to put together a glossary with definitions of some of the most important words and phrases that we use in our articles.
We will regularly update this glossary to add new entries as researchers continually uncover novel methods to push the frontier of artificial intelligence while identifying emerging safety risks.
Artificial general intelligence (AGI)
Artificial general intelligence, or AGI, is a nebulous term. But it generally refers to AI that’s more capable than the average human at many, if not most, tasks. OpenAI CEO Sam Altman recently described AGI as the “equivalent of a median human that you could hire as a co-worker.” Meanwhile, OpenAI’s charter defines AGI as “highly autonomous systems that outperform humans at most economically valuable work.” Google DeepMind’s understanding differs slightly from these two definitions; the lab views AGI as “AI that’s at least as capable as humans at most cognitive tasks.” Confused? Not to worry — so are experts at the forefront of AI research.
AI agent
An AI agent refers to a tool that uses AI technologies to perform a series of tasks on your behalf — beyond what a more basic AI chatbot could do — such as filing expenses, booking tickets or a table at a restaurant, or even writing and maintaining code. However, as we’ve explained before, there are lots of moving pieces in this emergent space, so “AI agent” might mean different things to different people. Infrastructure is also still being built out to deliver on its envisaged capabilities. But the basic concept implies an autonomous system that may draw on multiple AI systems to carry out multistep tasks.
Chain of thought
Given a simple question, a human brain can answer without even thinking too much about it — things like “which animal is taller, a giraffe or a cat?” But in many cases, you often need a pen and paper to come up with the right answer because there are intermediary steps. For instance, if a farmer has chickens and cows, and together they have 40 heads and 120 legs, you might need to write down a simple equation to come up with the answer (20 chickens and 20 cows).
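To make those intermediary steps concrete, here is the farmer puzzle worked through in a few lines of Python (plain arithmetic, no AI involved):

```python
# Chickens have 2 legs, cows have 4; together there are 40 heads and 120 legs.
# Let c = chickens and w = cows:
#   c + w = 40        (heads)
#   2c + 4w = 120     (legs)
# Substituting c = 40 - w:  2(40 - w) + 4w = 120  ->  80 + 2w = 120
cows = (120 - 2 * 40) // 2
chickens = 40 - cows
print(chickens, cows)   # 20 20
```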
In an AI context, chain-of-thought reasoning for large language models means breaking down a problem into smaller, intermediate steps to improve the quality of the end result. It usually takes longer to get an answer, but the answer is more likely to be correct, especially in a logic or coding context. Reasoning models are developed from traditional large language models and optimized for chain-of-thought thinking thanks to reinforcement learning.
(See: Large language model)
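As a rough sketch of what this looks like in practice, the difference is often just in how the prompt is phrased. The ask_model function below is a hypothetical placeholder for whatever LLM client you use:

```python
def ask_model(prompt: str) -> str:
    # Hypothetical placeholder: swap in a real LLM provider's client call.
    return "<model response>"

question = "A farmer's chickens and cows have 40 heads and 120 legs. How many of each?"

# Direct prompt: the model answers in one shot.
direct_answer = ask_model(question)

# Chain-of-thought prompt: the model is nudged to write out intermediate steps,
# which tends to yield more reliable answers on logic and math questions.
cot_answer = ask_model(question + "\nThink step by step: write the equations first, then solve.")
```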
Generative adversarial network (GAN)
A generative adversarial network, or GAN, is a machine learning framework built around two neural networks working in tandem: a generator, which produces data (such as images), and a discriminator. The generator’s output is fed to the other model to evaluate. This second, discriminator model thus plays the role of a classifier on the generator’s output – enabling it to improve over time.
The GAN structure is set up as a competition (hence “adversarial”) – with the two models essentially programmed to try to outdo each other: the generator is trying to get its output past the discriminator, while the discriminator is working to spot artificially generated data. This structured contest can optimize AI outputs to be more realistic without the need for additional human intervention. GANs work best for narrower applications (such as producing realistic photos or videos), though, rather than general purpose AI.
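For the code-curious, that competition can be sketched in a few dozen lines. A minimal sketch, assuming PyTorch, with a toy generator learning to mimic simple 1-D data; real GANs for photos or video follow the same loop at much larger scale:

```python
# Minimal GAN sketch on toy 1-D data (assumes PyTorch is installed).
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, 1) * 2 + 5            # "real" data: samples centered on 5
    fake = generator(torch.randn(64, 8))         # generator's attempt to imitate it

    # The discriminator learns to label real data 1 and generated data 0.
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # The generator learns to fool the discriminator into outputting 1.
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(generator(torch.randn(5, 8)).detach().flatten())  # outputs should drift toward ~5
```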
Hallucination
Hallucination is the AI industry’s preferred term for AI models making stuff up – literally generating information that is incorrect. Obviously, it’s a huge problem for AI quality.
Hallucinations produce GenAI outputs that can be misleading and could even lead to real-life risks — with potentially dangerous consequences (think of a health query that returns harmful medical advice). This is why most GenAI tools’ small print now warns users to verify AI-generated answers, even though such disclaimers are usually far less prominent than the information the tools dispense at the touch of a button.
The problem of AIs fabricating information is thought to arise as a consequence of gaps in training data. For general purpose GenAI especially — also sometimes known as foundation models — this looks difficult to resolve. There is simply not enough data in existence to train AI models to comprehensively resolve all the questions we could possibly ask. TL;DR: we haven’t invented God (yet).
Hallucinations are contributing to a push towards increasingly specialized and/or vertical AI models — i.e. domain-specific AIs that require narrower expertise – as a way to reduce the likelihood of knowledge gaps and shrink disinformation risks.
Inference
Inference is the process of running an AI model. It’s setting a model loose to make predictions or draw conclusions from previously seen data. To be clear, inference can’t happen without training; a model must learn patterns in a set of data before it can effectively extrapolate from this training data.
Many types of hardware can perform inference, ranging from smartphone processors to beefy GPUs to custom-designed AI accelerators. But not all of them can run models equally well. Very large models would take ages to make predictions on, say, a laptop versus a cloud server with high-end AI chips.
[See: Training]
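A minimal sketch of the train-then-infer split, assuming scikit-learn is installed; the model must see data via fit() before predict() can extrapolate:

```python
# Assumes scikit-learn. Training has to happen before inference can.
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit([[1], [2], [3]], [2, 4, 6])   # training: learn the pattern y = 2x

print(model.predict([[10]]))            # inference: extrapolate to unseen input (~20)
```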
Large language model (LLM)
Large language models, or LLMs, are the AI models used by popular AI assistants, such as ChatGPT, Claude, Google’s Gemini, Meta’s AI Llama, Microsoft Copilot, or Mistral’s Le Chat. When you chat with an AI assistant, you interact with a large language model that processes your request directly or with the help of different available tools, such as web browsing or code interpreters.
AI assistants and LLMs can have different names. For instance, GPT is OpenAI’s large language model and ChatGPT is the AI assistant product.
LLMs are deep neural networks made of billions of numerical parameters (or weights, see below) that learn the relationships between words and phrases and create a representation of language, a sort of multidimensional map of words.
These models are created from encoding the patterns they find in billions of books, articles, and transcripts. When you prompt an LLM, the model generates the most likely pattern that fits the prompt. It then evaluates the most probable next word after the last one based on what was said before. Repeat, repeat, and repeat.
(See: Neural network)
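That repeat-repeat-repeat loop can be shown with a toy lookup table. A real LLM replaces the table with probabilities computed by billions of weights, but the generate-one-word-then-recurse structure is the same:

```python
# Toy next-word predictor. A real LLM computes a probability for every word
# in its vocabulary at each step; here a made-up table stands in for that.
most_likely_next = {"once": "upon", "upon": "a", "a": "time", "time": "."}

word = "once"
sentence = [word]
while word != ".":
    word = most_likely_next[word]   # pick the most probable next word
    sentence.append(word)

print(" ".join(sentence))           # once upon a time .
```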
Memory cache
Memory cache refers to an important process that boosts inference (which is the process by which AI works to generate a response to a user’s query). In essence, caching is an optimization technique designed to make inference more efficient. AI is obviously driven by high-octane mathematical calculations, and every time those calculations are made, they use up more power. Caching is designed to cut down on the number of calculations a model has to run by saving particular calculations for future user queries and operations. There are different kinds of memory caching, although one of the better known is KV (or key-value) caching. KV caching works in transformer-based models and increases efficiency, driving faster results by reducing the amount of time (and algorithmic labor) it takes to generate answers to user questions.
(See: Inference)
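Here is a heavily simplified sketch of the KV-caching idea, using NumPy. The projection matrices are random stand-ins, and real transformer attention is more involved; the point is only that past tokens’ keys and values are stored rather than recomputed at every step:

```python
import numpy as np

d = 16
W_k, W_v = np.random.randn(d, d), np.random.randn(d, d)  # stand-in projections

k_cache, v_cache = [], []   # grows by one entry per generated token

def decode_step(new_token_embedding):
    # Only the NEW token's key and value are computed;
    # everything earlier is reused straight from the cache.
    k_cache.append(new_token_embedding @ W_k)
    v_cache.append(new_token_embedding @ W_v)
    K, V = np.stack(k_cache), np.stack(v_cache)  # shape: (tokens_so_far, d)
    return K, V   # attention for the new token would now read K and V

for _ in range(5):   # five decoding steps, no recomputation of old tokens
    decode_step(np.random.randn(d))
```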
Neural network
A neural network refers to the multi-layered algorithmic structure that underpins deep learning — and, more broadly, the whole boom in generative AI tools following the emergence of large language models.
Although the idea of taking inspiration from the densely interconnected pathways of the human brain as a design structure for data processing algorithms dates all the way back to the 1940s, it was the much more recent rise of graphics processing units (GPUs) — via the video game industry — that really unlocked the power of this theory. These chips proved well suited to training algorithms with many more layers than was possible in earlier epochs — enabling neural network-based AI systems to achieve far better performance across many domains, including voice recognition, autonomous navigation, and drug discovery.
(See: Large language model [LLM])
RAMageddon
RAMageddon is the fun recent term for a not-so-fun trend that is sweeping the tech industry: an ever-increasing shortage of random access memory, or RAM chips, which power pretty much all the tech products we use in our daily lives. As the AI industry has blossomed, the biggest tech companies and AI labs — all vying to have the most powerful and efficient AI — are buying so much RAM to power their data centers that there’s not much left for the rest of us. And that supply bottleneck means that what’s left is getting more and more expensive.
That includes industries like gaming (where major companies have had to raise prices on consoles because it’s harder to find memory chips for their devices), consumer electronics (where the memory shortage could cause the biggest dip in smartphone shipments in more than a decade), and general enterprise computing (because those companies can’t get enough RAM for their own data centers). The surge in prices is only expected to stop after the dreaded shortage ends but, unfortunately, there’s not much sign of that happening anytime soon.
Training
Developing machine learning AIs involves a process known as training. In simple terms, this refers to data being fed into the model so that it can learn from patterns and generate useful outputs.
Things can get a bit philosophical at this point in the AI stack — since, pre-training, the mathematical structure that’s used as the starting point for developing a learning system is just a bunch of layers and random numbers. It’s only through training that the AI model really takes shape. Essentially, it’s the process of the system responding to characteristics in the data that enables it to adapt outputs towards a sought-for goal — whether that’s identifying images of cats or producing a haiku on demand.
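Training in miniature looks something like this sketch: a single weight starts at a random number and is nudged, example by example, toward whatever value makes the output match the target (here the sought-for goal is simply y = 3x):

```python
import random

w = random.random()                 # pre-training: just a random number
data = [(x, 3 * x) for x in range(1, 6)]

for epoch in range(200):
    for x, y in data:
        error = w * x - y           # how far the prediction is from the target
        w -= 0.01 * error * x       # nudge the weight to shrink the error

print(round(w, 3))                  # converges to ~3.0
```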
It’s important to note that not all AI requires training. Rules-based AIs that are programmed to follow manually predefined instructions — such as linear chatbots — don’t need to undergo training. However, such AI systems are likely to be more constrained than (well-trained) self-learning systems.
Still, training can be expensive because it requires lots of inputs — and, typically, the volumes of inputs required for such models have been trending upwards.
Hybrid approaches can sometimes be used to shortcut model development and help manage costs, such as doing data-driven fine-tuning of a rules-based AI. This means development requires less data, compute, energy, and algorithmic complexity than if the developer had started building from scratch.
[See: Inference]
Tokens
When it comes to human-machine communication, there are some obvious challenges. Humans communicate using human language, while AI programs execute tasks and respond to queries through complex algorithmic processes that are informed by data. In their simplest definition, tokens represent the basic building blocks of human-AI communication, in that they are discrete segments of data that have either been processed or produced by an LLM.
Tokens are created via a process known as “tokenization,” which breaks down raw data and refines it into distinct units that are digestible to an LLM. Similar to how a software compiler translates human language into binary code that a computer can digest, tokenization translates a user’s human-language queries into units an AI program can work with so that it can prepare a response.
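A toy word-level tokenizer shows the idea. Real LLM tokenizers (byte-pair encoding and similar schemes) split text into subword units instead, and the five-entry vocabulary below is invented for illustration:

```python
# Toy tokenizer: text in, token IDs out (and back again).
vocab = {"how": 0, "do": 1, "tokens": 2, "work": 3, "?": 4}
id_to_word = {i: w for w, i in vocab.items()}

def tokenize(text):
    return [vocab[word] for word in text.lower().replace("?", " ?").split()]

ids = tokenize("How do tokens work?")
print(ids)                                      # [0, 1, 2, 3, 4]
print(" ".join(id_to_word[i] for i in ids))     # how do tokens work ?
```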
There are several different kinds of tokens — including input tokens (the kind that a human user’s query is broken into and fed to the model), output tokens (the kind that are generated as the LLM responds to the human’s request), and reasoning tokens, which involve longer, more intensive tasks and processes that occur as part of a user request.
With enterprise AI, token usage also determines costs. Since tokens are equivalent to the amount of data being processed by a model, they have also become the means by which the AI industry monetizes its services. Most AI companies charge for LLM usage on a per-token basis. Thus, the more tokens a business burns as it uses an AI program (ChatGPT, for example), the more money it will have to pay its AI service provider (OpenAI).
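The billing arithmetic itself is simple. The per-token prices below are made-up figures for illustration, not any provider’s actual rates:

```python
# Made-up prices for illustration only; real providers publish their own rates.
PRICE_PER_M_INPUT = 3.00     # dollars per million input tokens (hypothetical)
PRICE_PER_M_OUTPUT = 15.00   # dollars per million output tokens (hypothetical)

input_tokens, output_tokens = 200_000, 100_000
cost = (input_tokens / 1_000_000) * PRICE_PER_M_INPUT \
     + (output_tokens / 1_000_000) * PRICE_PER_M_OUTPUT

print(f"${cost:.2f}")   # $2.10 at these hypothetical rates
```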
Transfer learning
A technique where a previously trained AI model is used as the starting point for developing a new model for a different but typically related task – allowing knowledge gained in previous training cycles to be reapplied.
Transfer learning can drive efficiency savings by shortcutting model development. It can also be useful when data for the task that the model is being developed for is somewhat limited. But it’s crucial to note that the approach has limitations. Models that rely on transfer learning to gain generalized capabilities will likely require training on additional data to perform well in their domain of focus.
(See: Fine tuning)
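In code, transfer learning often amounts to freezing a previously trained model’s layers and training a new “head” on the new task. A sketch assuming PyTorch, where `base` is a stand-in for a model that would really be loaded with pretrained weights:

```python
import torch.nn as nn

# `base` stands in for a previously trained model; in practice you would
# load real pretrained weights here rather than a fresh network.
base = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32), nn.ReLU())

for param in base.parameters():
    param.requires_grad = False      # freeze the knowledge gained in earlier training

# Only this new head gets trained on the (possibly limited) new task's data.
new_head = nn.Linear(32, 5)
model = nn.Sequential(base, new_head)
```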
Weights
Weights are core to AI training, as they determine how much importance (or weight) is given to different features (or input variables) in the data used for training the system — thereby shaping the AI model’s output.
Put another way, weights are numerical parameters that define what’s most salient in a dataset for the given training task. They achieve their function by applying multiplication to inputs. Model training typically begins with weights that are randomly assigned, but as the process unfolds, the weights adjust as the model seeks to arrive at an output that more closely matches the target.
For example, an AI model for predicting housing prices that’s trained on historical real estate data for a target location could include weights for features such as the number of bedrooms and bathrooms, whether a property is detached or semi-detached, whether it has parking, a garage, and so on.
Ultimately, the weights the model attaches to each of these inputs reflect how much they influence the value of a property, based on the given dataset.
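Stripped down to the arithmetic, that housing model is just a weighted sum. The weights below are invented for illustration; a trained model would learn them from the historical sales data:

```python
# Invented weights: how many dollars each feature adds to the predicted price.
weights = {"bedrooms": 25_000, "bathrooms": 10_000, "detached": 40_000, "parking": 8_000}

house = {"bedrooms": 3, "bathrooms": 2, "detached": 1, "parking": 1}

price = sum(weights[feature] * value for feature, value in house.items())
print(price)   # 143000: each feature's influence scales with its weight
```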
This article is updated regularly with recent information.