The Jagged Frontier of AI Capability
The idea that AI's abilities are uneven: it helps on some tasks and hurts on others of seemingly equal difficulty.
Plain-language explanations of the ideas behind modern AI.
Left 4 Dead's AI Director procedurally paces each playthrough, spawning zombies to track players' emotional intensity.
Most of the world's languages have too little digital text or audio to train good AI, leaving billions of speakers underserved by language technology.
A 1946-1953 interdisciplinary conference series that defined cybernetics, fixing concepts like feedback and information across science.
A thought experiment: a being physically identical to a person but with no inner experience, used to argue consciousness is not purely physical.
The idea that general-purpose technologies first depress measured productivity, then lift it, as intangible investments pay off.
The difficulty that you can never list all the preconditions an action needs, identified by John McCarthy as a core obstacle for common-sense AI.
Many machine-learning results do not hold up, often because data leakage inflates accuracy; a 2023 review found the problem across 17 fields.
Harnad's 1990 question of how the symbols inside a computer could mean anything without being tied to the world through the senses.
Copyright carve-outs that let computers analyze large bodies of works, now the legal foundation for assembling AI training data outside the US.
Alan Turing's proposal to replace 'can machines think?' with a test of whether a machine's conversation is indistinguishable from a human's.
The challenge of getting an advanced AI to adopt and pursue the values we actually want, rather than a flawed proxy for them.
The Vauquois triangle maps machine translation into three strategies - direct, transfer, and interlingua - by how deeply each analyzes meaning.
A simple, old strategy for balancing exploration and exploitation that works well for online recommendation and ads.
Predicting future values of a quantity recorded over time, the basis of demand planning, capacity, and budgeting across business and science.
The process of breaking text into small units (tokens) that a model can read, often using subword pieces to handle any word.
Language model tokenizers split non-English text into more tokens, so speakers of many languages pay more and get worse results for the same content.
The mechanism that lets a language model call external functions, search, or APIs, turning a text generator into a system that can take real actions.
The collection of examples a machine learning model learns from, whose quality and coverage largely determine model performance.
The provenance chain that turns raw web crawls into a model's training corpus: collection, filtering, deduplication, and weighted mixing of diverse sources.
Dropout, batch normalization, and the Adam optimizer - inventions from 2014 and 2015 that made deep neural networks reliably trainable and remain standard.
When the data or code a model sees in production differs from training, causing predictions to be worse in the real world than in testing.
Reusing a model trained on one large task as the starting point for a new, related task, so a small amount of new data goes a long way.
The neural network architecture, built entirely on attention, that powers modern large language models.
Triton is an open-source Python-like language and compiler for writing fast custom GPU kernels with far less effort than raw CUDA.
A retrieval architecture that encodes queries and items separately so candidate lookup becomes fast vector search.
Finding structure, groupings, or patterns in data that has no labels or predefined correct answers.
A neural network that compresses data into a smooth space and generates new examples from it - one of three generative families alongside GANs and diffusion.
A way to approximate hard-to-compute probability distributions by turning Bayesian inference into an optimization problem.
A specialized store for embeddings that finds the most similar items by meaning rather than exact match, the retrieval engine behind most RAG systems.
The idea that scaling video generators turns them into general simulators of the physical world, not just clip makers.
A vision-language model processes images and text together, so one system can describe pictures, answer questions about them, and follow visual instructions.
A robot model that takes camera images and a language instruction and directly outputs actions, often by emitting actions as tokens.
Recording a person's voice so AI can later synthesize speech in it, letting people who lose their voice to ALS keep speaking in their own sound.
Technology that generates human-sounding speech from text, also called text-to-speech, now realistic enough for AI narration, voice cloning, and voice agents.
The on-device task of constantly listening for a trigger phrase like 'Alexa' or 'Okay Google' before a voice assistant starts processing speech.
Two closely related ways to penalize large model weights and curb overfitting, which differ once adaptive optimizers are used.
Principled rules for setting a network's starting weights so signals neither vanish nor explode, making very deep networks trainable from scratch.
Word embeddings represent words as dense vectors so that words with similar meanings sit near each other, a foundation of modern NLP.
AI systems that learn an internal simulation of an environment, letting an agent imagine and plan ahead; central to robotics and physical AI.
A model handling a task it was never trained on, given only an instruction and no examples.
A cash prize for compressing a gigabyte of Wikipedia, built on the claim that better compression means better intelligence.
The proposed route to machine intelligence by scanning a real brain and running it faithfully as software.
Steve Wozniak's proposed test of general intelligence: a robot that can enter a strange house and make a cup of coffee.
Bostrom's thesis that agents with almost any final goal will pursue the same sub-goals, like self-preservation and resources.
Bostrom's claim that an AI's level of intelligence and its final goals are independent and can vary freely.
Bostrom's scenario where an AI acts cooperatively while weak, then turns once it is strong enough to resist humans.
Google's 2018 service letting businesses train custom vision, text, and tabular models without ML expertise, using transfer learning and neural architecture search (NAS).