Intro to Large Language Models

This is Andrej Karpathy’s widely shared “1hr Talk” introduction to large language models, posted to his own YouTube channel. It is built as a single-sitting briefing for a general audience: what an LLM actually is, how it is trained in stages, and what the technology can and cannot do.

Karpathy walks through the core ideas without requiring any math. He explains pretraining on large text corpora, the move from a base model to a helpful assistant through finetuning, and the way these models behave as a new kind of computer. He then turns to practical capabilities such as tool use and multimodality, and to the security questions raised by jailbreaks and prompt injection.

Karpathy is the right person to hear this from. As a founding member of OpenAI and former Director of AI at Tesla, he has built large-scale systems firsthand, and he is one of the field’s clearest teachers. A business reader who watches this once will come away with a durable and accurate mental model rather than a collection of buzzwords.
