Kubeflow
Kubeflow is an open-source toolkit for running the machine-learning lifecycle on Kubernetes, originating from how Google ran TensorFlow internally.
Plain-language explanations of the ideas behind modern AI.
LangChain's graph-based runtime for building stateful, long-running agents as nodes and edges with persistence and human-in-the-loop control.
Groq's AI inference chip that uses on-chip SRAM and fully compiler-scheduled execution for fast, deterministic LLM serving.
An AI system trained on vast text to predict and generate language, able to perform many tasks from a single model.
Training a model to order a list of results by relevance, the core machine learning task behind search.
The international debate over weapons that select and attack targets without human intervention, and whether new law should restrict them.
The advantage liars gain when the existence of deepfakes lets them dismiss real, incriminating evidence as fake.
Lidar maps a car's surroundings in 3D with laser pulses; it is central to most robotaxi designs and the focus of the camera-versus-lidar debate.
A family of attention variants that scale linearly with sequence length instead of quadratically, easing long-context cost.
A recurrent neural network design that remembers information over long sequences, enabling early advances in speech and language processing.
The ethical view that positively shaping the very long-term future is a key moral priority, often invoked to justify work on AI risk.
A parameter-efficient fine-tuning method that customizes a large model by training a tiny set of added weights, cutting cost and storage by orders of magnitude.
A formula that scores how wrong a model's predictions are, giving training a single number to minimize.
A field where computer programs improve at a task by learning patterns from data and experience rather than being explicitly programmed.
Methods for making a trained model forget specific data, so that deleting a record also removes its influence on the model.
Neural networks trained to predict the energy and forces between atoms at near-quantum accuracy but orders of magnitude faster than quantum simulation.
A family of methods that draw samples from complex probability distributions by simulating a cleverly designed random walk.
The mathematical framework for sequential decision-making under uncertainty, defining the states, actions, and rewards reinforcement learning builds on.
The idea that a human must retain enough understanding and authority over a weapon to be morally and legally accountable for its use of force.
The effort to reverse-engineer the specific algorithms and circuits neural networks learn, reading their internals like code.
When a language model reproduces verbatim chunks of its training data, raising the copyright question at the center of suits like NYT v. OpenAI.
Meta-learning, or learning to learn, trains a system so it can pick up new tasks quickly from very little data.
FarmBeats is a Microsoft Research platform that fuses drone imagery with ground sensors over rural connectivity to build AI maps of farm conditions.
An architecture that routes each input to a few specialized sub-networks, growing total capacity without growing per-query cost.
Monitoring deployed models in production to detect drift, data quality issues, and performance decay before they harm the business.
Short standardized documents that ship with a trained model to report its intended uses, performance across groups, and known limitations.
The degradation that happens when AI models are trained on AI-generated data, making the rare tails of the data distribution disappear over generations.
An open standard from Anthropic for connecting AI applications to external data sources, tools, and workflows in a uniform way.
Training a smaller, cheaper model to mimic a larger one, transferring much of its capability at a fraction of the cost.
A central store that versions trained models, tracks their lineage, and manages their promotion from staging to production.
The study of systems where many autonomous agents interact, the academic field that grounds today's LLM agent orchestration.
The multi-armed bandit is the simplest reinforcement learning problem: balance exploring unknown options against exploiting the best known one.
Running several attention computations in parallel so a model can attend to different kinds of relationships at once, a key piece of the Transformer.
AI systems that work across more than one type of data, for example understanding images and text together rather than text alone.
A measure of how much knowing one variable reduces uncertainty about another; its maximum over input distributions defines the capacity of a communication channel.
The field of getting computers to work with human language, which evolved from hand-written rules to statistics to neural networks to large language models.
Automatically designing the structure of a neural network by searching over possible architectures instead of hand-crafting them.
A computing model built from layers of simple interconnected units that adjust their connections to learn patterns from data.
A class of neural networks that learn mappings between whole functions, letting one model solve a family of differential equations at any resolution.
The research field at the intersection of neuroscience and AI, using each to inform the other in a two-way exchange of ideas.
Hardware that copies the brain's style of computation, with event-driven spikes and memory next to processing, to run neural workloads at very low power.
Shannon's result that reliable communication is possible at any rate below a fixed limit, the channel capacity, even over a noisy channel.
Reasoning where adding new facts can cancel earlier conclusions, the formal basis for default assumptions and common-sense inference in symbolic AI.
A family of generative models built from invertible transformations, letting them compute exact data likelihoods.
A decoding method that samples the next token from the smallest set of most likely tokens whose probabilities add up to at least p, keeping text varied without choosing unlikely words.
TensorRT is NVIDIA's deep-learning inference optimizer that speeds up trained models on GPUs through quantization, layer fusion, and kernel tuning.
Triton is NVIDIA's open-source server for deploying trained models from multiple frameworks, with dynamic batching and concurrent execution for production inference.
NVIDIA's high-speed interconnect that lets GPUs share data far faster than PCIe, the glue of multi-GPU AI systems.