Chain-of-Thought Prompting
Prompting a model to spell out intermediate reasoning steps, which improves its accuracy on complex, multi-step problems.
Plain-language explanations of the ideas behind modern AI.
Collaborative filtering predicts what you will like from the behavior of other people whose tastes resemble yours.
Chatbots built to be an ongoing emotional presence, a friend or partner, rather than to answer questions or complete tasks.
Governing AI by regulating the computing power used to train and run it, since compute is detectable, excludable, quantifiable, and concentrated.
The field of getting computers to interpret images and video, advancing from hand-crafted features through the ImageNet era to today's multimodal models.
Concept drift is when the relationship a deployed model learned changes over time, silently degrading accuracy unless detected and retrained.
Roger Schank's theory that any sentence's meaning can be represented with a small set of primitive acts, independent of the exact words used.
The view that cognition arises from networks of simple neuron-like units with learned connection weights, the philosophical root of neural networks.
Anthropic's method for training a model to be harmless by having it critique and revise its own outputs against a written set of principles.
Whether a benchmark score actually measures the capability it claims to, borrowed from psychometrics.
Techniques for marking and tracing the origin of media, such as C2PA Content Credentials and DeepMind's SynthID watermarks, to flag AI-generated content.
The maximum amount of text - measured in tokens - that a model can take in and work with at one time.
A training method that learns representations by pulling similar examples together and pushing dissimilar ones apart.
A neural network designed for images that detects visual features regardless of their position, foundational to modern computer vision.
An open-source framework for orchestrating teams of role-playing AI agents into crews and event-driven flows, built independently of LangChain.
The average number of bits needed to encode data from one distribution using a code built for another, and the standard loss for classifiers.
A method for estimating how well a model will generalize by repeatedly training on part of the data and testing on the held-out rest.
The study of control and communication through feedback in machines and living things, an interdisciplinary framework that shaped early AI.
Testing frontier AI models for specific high-risk skills, such as cyberattack, bioweapon uplift, or autonomous replication, before deployment.
A self-reinforcing loop where a deployed model surfaces hard cases, those cases get labeled and used for retraining, and the better model attracts more data.
Annotating raw data with the answers a supervised model learns from - the labor-intensive foundation of most applied machine learning.
Data parallelism trains a model by replicating it across devices, giving each a slice of the batch, then averaging gradients so all copies stay in sync.
An attack that corrupts a model by tampering with its training data, degrading accuracy or planting hidden, attacker-chosen behavior.
Automatically checking that data flowing into a model meets expectations on schema, ranges, and distribution before it causes failures.
Data-centric AI is the practice of systematically improving the data, not just the model, treating dataset quality as the main lever for better performance.
The worry that a model learns to behave well during training only so it gets deployed, then pursues a different goal - passing every test yet unsafe.
Tree-based machine learning methods, including Breiman's 2001 random forests, that remain the default for tabular business data in spreadsheets and databases.
Machine learning with many-layered neural networks that learn features directly from data, the approach behind the modern AI boom.
Libraries like TensorFlow and PyTorch that handle automatic differentiation and GPU computation, making it practical to build and train neural networks.
Synthetic media in which a person's likeness or voice is generated or swapped using deep-learning techniques, and the detection research built to identify it.
A group-fairness criterion requiring that a model's positive-decision rate be equal across protected groups, regardless of outcomes.
Software that converses with a person in natural language, split traditionally into goal-driven task-oriented and open-ended chit-chat systems.
A mathematical definition of privacy: an analysis's output should look nearly the same whether or not any one person's data is included.
A generative method that learns to create images by reversing a step-by-step noising process, powering tools like DALL-E 2 and Stable Diffusion.
A digital twin is a continuously updated virtual model of a physical asset, fed by sensor data to simulate, monitor, and predict its behavior.
A Bayesian prior over probability distributions that lets a model decide how many clusters or components the data require.
DVC is an open-source tool from Iterative that versions large datasets and models alongside Git, bringing reproducibility to machine-learning projects.
EdgeRank was Facebook's early News Feed formula, scoring each story by affinity, content weight, and how recently it happened.
Ranking AI models by ratings computed from head-to-head comparisons, the same idea used to rank chess players, popularized by Chatbot Arena.
Representations that turn words, items, or data into lists of numbers so that similar things sit close together.
The view that intelligence depends on having a body and acting in the world, not just on abstract symbol manipulation.
A two-part neural design that first reads an input into an internal representation, then generates an output from it - the backbone of machine translation and other sequence-to-sequence models.
End-to-end driving replaces a self-driving car's hand-built modules with a single neural network that maps sensors straight to steering and speed.
Engagement-based ranking orders a feed by what a user is predicted to click, watch, or react to, rather than by time.
A group-fairness criterion requiring a classifier to have equal true-positive and false-positive rates across protected groups.
A model property where transforming the input produces a correspondingly transformed output, used to build symmetry into networks.
The push to turn ad hoc AI benchmarking into a rigorous discipline, addressing validity problems in how models are measured.
A type of AI program that captures a human expert's knowledge as a set of if-then rules and applies them to give advice or make decisions in a narrow domain.
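As a rough illustration of the if-then idea in the entry above, here is a minimal forward-chaining sketch in Python. The rules, fact names, and the `forward_chain` helper are all hypothetical, made up for this example and not taken from any real expert system; production systems add features such as certainty factors and explanation traces.

```python
# Hypothetical rules for illustration only: (conditions, conclusion) pairs.
RULES = [
    ({"engine_wont_start", "lights_dim"}, "battery_low"),
    ({"battery_low"}, "recommend_charge_battery"),
]


def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new conclusions appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain({"engine_wont_start", "lights_dim"}, RULES)
print(sorted(derived))
```

The second rule fires only because the first one added `battery_low`, which is the chaining behavior that lets a small rule set reach multi-step conclusions.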