Exploration vs Exploitation
Exploration versus exploitation is the core dilemma of reinforcement learning: try new actions to learn, or repeat the best-known one to win.
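One common way to balance the two is an epsilon-greedy rule: with a small probability the agent tries a random action, and otherwise it repeats the action with the best estimated reward. The sketch below is only an illustration under stated assumptions: the function name `epsilon_greedy`, the Gaussian reward model, and the example arm means are all hypothetical and do not come from this page.

```python
import random


def epsilon_greedy(true_means, epsilon=0.1, steps=5000, seed=0):
    """Run an epsilon-greedy agent on a multi-armed bandit.

    Assumption: each arm pays a Gaussian reward with the given mean
    and unit standard deviation (an illustrative setup only).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how many times each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


counts, estimates, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
print(counts, [round(e, 2) for e in estimates])
```

Setting `epsilon` to 0 gives pure exploitation, which can lock onto a poor arm early; setting it to 1 gives pure exploration, which never uses what it has learned.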
Plain-language explanations of the ideas behind modern AI.
A family of forecasting methods that weight recent observations more heavily, extended by Holt and Winters to handle trend and seasonality.
The US legal doctrine at the heart of AI copyright fights, weighing four factors to decide whether training on copyrighted works without permission is lawful.
The result that several natural definitions of fairness cannot all hold at once when group base rates differ.
A feature store is a central system that computes, stores, and serves the input features for ML models, shared across teams for both training and serving.
Privacy techniques that avoid pooling raw data: federated learning keeps data on devices, while differential privacy adds calibrated noise to protect individuals.
Feedback is when a system's output is fed back to influence its input, the mechanism behind self-regulation in machines, organisms, and learning.
A model performing a new task from only a handful of examples, often given in the prompt with no retraining.
Adapting a pre-trained model to a specific task or behavior by training it further on a smaller, targeted dataset.
Finite state machines were the long-standing default for game AI, switching a character between named states like patrol and attack.
An Excel feature that synthesizes string-transformation programs from a few user examples, no code required.
A large model pretrained on vast scientific data, then adapted to many specific tasks, bringing the foundation-model recipe into the sciences.
The idea of pre-training one large model on vast amounts of time-series data so it can forecast new, unseen series with no per-dataset training.
Karl Friston's claim that brains act to minimize a single quantity, variational free energy, unifying perception, learning, and action.
The mechanism by which a language model outputs structured arguments to invoke developer-defined functions and tools.
A model that places a probability distribution over functions, giving predictions with calibrated uncertainty rather than single guesses.
A pair of neural networks that compete, one generating fake data and one judging it, producing strikingly realistic synthetic images and media.
AI systems that generate music audio from text prompts or melodies, exemplified by Google MusicLM and Meta MusicGen and popularized by tools like Suno and Udio.
An optimization method inspired by evolution, pioneered by John Holland in 1975, that breeds and mutates candidate solutions over generations to tackle hard problems.
Goal-Oriented Action Planning lets game characters plan their own action sequences with A*, made famous by F.E.A.R.
The label John Haugeland gave in 1985 to classical symbolic AI, the view that intelligence is the rule-governed manipulation of symbols.
Graphics chips became ideal neural network engines because rendering and neural networks both rely on parallel matrix math, which programmable GPUs opened up to AI workloads.
A technique that builds decision trees in sequence, each correcting the errors of the trees before it; XGBoost is a widely used implementation that dominates many tabular-data tasks.
Gradient checkpointing saves memory during training by discarding most activations and recomputing them in the backward pass instead of storing them.
An optimization method that repeatedly nudges a model's settings in the direction that most reduces its error.
Neural networks that operate directly on graph data - nodes and edges - by passing messages along connections, used for molecules, weather, and networks.
The filtering and safety layer wrapped around a deployed model that screens inputs and outputs against unsafe content, separate from the model's own training.
When an AI language model produces fluent, confident text that is factually wrong or unsupported by its inputs.
The principle that neurons which fire together strengthen their connection, the oldest biologically grounded rule for learning in networks.
Hidden Markov models powered speech recognition and early NLP by inferring hidden states, like words, from observable signals.
High Bandwidth Memory stacks DRAM chips vertically beside the processor, feeding data to AI accelerators fast enough to keep thousands of cores busy.
High-frequency trading uses automated systems to place and cancel huge volumes of orders in microseconds to capture tiny, fleeting edges.
Encryption that lets you compute directly on encrypted data, so a server can process information it can never read.
A hosting service where anyone can publish a runnable machine-learning demo in the browser, which spread a culture of try-it-yourself model demos.
Routing small pieces of human judgment into problems machines cannot yet solve - the idea behind CAPTCHAs, crowd labeling, and AI data work.
Systematically searching for the model settings, like learning rate and depth, that are chosen before training and strongly affect results.
Teaching an agent a skill by having it learn from demonstrations of the desired behavior rather than from a reward signal.
A model's ability to learn a new task from examples placed in the prompt, without any change to its trained weights.
An open-source framework from the UK AI Security Institute for building and running large language model evaluations.
Graphcore's massively parallel AI chip that keeps model state in large on-chip memory across thousands of independent cores.
Software that adapts instruction to each learner by modeling their knowledge, exemplified by Carnegie Learning's Cognitive Tutor for math.
Research aimed at understanding what is actually happening inside an AI model, so its behavior can be explained, trusted, and corrected.
Inverse reinforcement learning flips the usual problem: instead of learning behavior from a reward, it infers the hidden reward from behavior.
Iris recognition identifies people from the texture of the iris, encoded by John Daugman's algorithm into a compact binary IrisCode.
JAX is a Google research library that pairs NumPy-style array code with composable transformations like autodiff, JIT compilation, and auto-vectorization.
The argument that cheaper, more efficient AI will increase total compute demand rather than reduce it - named for an 1865 observation about coal.
The most widely used clustering algorithm: it partitions data into k groups by alternately assigning points to the nearest cluster center and moving each center to the mean of its assigned points.
A structured map of real-world entities and the relationships between them, used to organize enterprise data, power search, and ground AI systems in facts.