Article 3: The Backprop Revival - The Story of AI

Let me tell you about the worst feeling in science. It is not being wrong. Being wrong is normal; you dust yourself off and try again. The worst feeling is being right - and having nobody believe you. Being right at the wrong time.

When we left off, a Harvard graduate student named Paul Werbos had, in 1974, written down the solution to the great unsolved problem of neural networks - how to train a deep, many-layered network - and then watched it sink without a trace. The field had just decided that neural networks were a dead end. So the answer to the question everyone had given up on sat in a filing cabinet, gathering dust, for more than a decade.

This is the story of the people who refused to give up. The ones who, through the long winter of the late 1970s and early 1980s, kept tending a fire that everyone else had declared out. It is a story about stubbornness, and about a strange truth that runs all through the history of technology: that the most important ideas often arrive years before the world is ready to use them.

The first crack of warmth came, of all places, from physics. In 1982, a physicist named John Hopfield published a short paper showing that a network of simple units could store memories - could settle into a remembered pattern the way a ball rolls down into the bottom of a valley. It was elegant, it was rigorous, and crucially, it was respectable. Hopfield was a serious physicist, and when a serious physicist takes an unfashionable idea seriously, other serious people start to wonder if they were too quick to dismiss it. Mathematicians and physicists began drifting back to neural networks, almost sheepishly.
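
For readers who want to see the ball-and-valley picture in code, here is a minimal sketch in Python. It assumes the common textbook form of a Hopfield network, with units that are either +1 or -1 and a Hebbian storage rule, which is not necessarily the exact formulation of the 1982 paper. The function names `store` and `recall`, and the two eight-unit patterns, are invented for this illustration.

```python
def store(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no unit connects to itself
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, sweeps=5):
    """Update one unit at a time until nothing changes (at most `sweeps` passes)."""
    state = list(state)
    for _ in range(sweeps):
        changed = False
        for i, row in enumerate(weights):
            # Each unit sums the votes of the units it is connected to.
            total = sum(w * s for w, s in zip(row, state))
            new = 1 if total >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state


# Two hypothetical "memories", chosen only for illustration.
memory_a = [1, 1, 1, 1, -1, -1, -1, -1]
memory_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = store([memory_a, memory_b])

damaged = list(memory_a)
damaged[0] = -damaged[0]  # corrupt one unit of the first memory
restored = recall(weights, damaged)
```

In this small case `restored` comes back equal to `memory_a`: the damaged state rolls downhill into the nearest stored pattern, which is the behavior the valley image describes.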

But the real turning point came from a man who would become the central character of this entire saga: a British-born researcher named Geoffrey Hinton. Hinton had an almost religious conviction that the brain-inspired approach was right, and he had held onto it straight through the years when holding onto it was career suicide. And in 1986, Hinton, with David Rumelhart and Ronald Williams, published a paper in the journal Nature that did the thing everyone had given up on. It showed how to train those deep, multi-layered networks. They called the method backpropagation.

Here is the idea, and it is beautiful in its simplicity. The network makes a guess. You measure how wrong it was. And then you send that error signal backwards through the network, layer by layer, and at each connection you ask a simple question - how much did you contribute to this mistake? - and you nudge that connection's weight, just slightly, in the direction that shrinks the error. Do that over and over, with thousands of examples, and something remarkable happens. The hidden layers in the middle of the network - the ones nobody programmed - start to organize themselves. They invent their own internal features, their own way of seeing. This was the missing key. It was, in essence, what Werbos had written down twelve years earlier, finally arriving at a moment when the world could hear it.
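
The loop described above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the code from the 1986 paper: a tiny network with one hidden layer, sigmoid units, a squared-error measure of "how wrong", and the XOR puzzle as training data. The function names, layer size, learning rate and random seed are all choices made for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_hidden, w_out, x):
    """Return the hidden activations and the network's guess for input x."""
    inputs = x + [1.0]  # a constant 1.0 acts as the bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(row, inputs))) for row in w_hidden]
    guess = sigmoid(sum(w * h for w, h in zip(w_out, hidden + [1.0])))
    return hidden, guess


def backward(w_hidden, w_out, x, target):
    """Gradients of the error 0.5 * (guess - target)**2 for one example."""
    hidden, guess = forward(w_hidden, w_out, x)
    inputs = x + [1.0]
    # Error signal at the output unit.
    delta_out = (guess - target) * guess * (1.0 - guess)
    grad_out = [delta_out * h for h in hidden + [1.0]]
    # Send it backwards: each hidden unit's share of the blame.
    grad_hidden = []
    for j, h in enumerate(hidden):
        delta_h = delta_out * w_out[j] * h * (1.0 - h)
        grad_hidden.append([delta_h * v for v in inputs])
    return grad_hidden, grad_out


def total_error(w_hidden, w_out, data):
    return sum(0.5 * (forward(w_hidden, w_out, x)[1] - t) ** 2 for x, t in data)


def train(data, n_hidden=3, rate=0.5, epochs=5000, seed=0):
    """Start from random weights and nudge them after every example."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, t in data:
            grad_hidden, grad_out = backward(w_hidden, w_out, x, t)
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hidden[j][i] -= rate * grad_hidden[j][i]
            for j in range(n_hidden + 1):
                w_out[j] -= rate * grad_out[j]
    return w_hidden, w_out


# XOR: a small puzzle that a network with no hidden layer cannot solve.
XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
```

Calling `train(XOR)` and then `forward` on each input shows what the trained network guesses. A network this small can stall short of a perfect answer from an unlucky starting point, so treat any particular numbers as illustrative rather than guaranteed.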

And the networks started to do things that gave people chills. There was a system called NETtalk, built in 1987, that learned to read English aloud. And what made it unforgettable was that you could listen to it learn. Early on, it babbled like an infant - formless noise. And as it trained, hour by hour, the babble slowly resolved into syllables, then words, then recognizable speech. People who heard it never forgot it. It sounded like a machine learning to talk. Around the same time, at Bell Labs, a young Frenchman named Yann LeCun trained a network to read the handwritten zip codes on real mail. Not a toy. Real envelopes, real digits, at a scale that actually mattered to the post office. Backpropagation, plus a brain-inspired design, plus real-world data. In hindsight, that 1989 zip-code reader contains the entire recipe of the AI boom that was still twenty years away.

So the believers had been vindicated. The idea worked. And here is the cruel twist of this chapter: it didn’t matter yet. Because while the connectionists were quietly proving themselves right, their old rivals - the symbolic AI camp - were having the time of their lives, and then living through a spectacular crash that would tar the whole field all over again.

Remember the symbolic approach, the one built on logic and rules instead of neurons? In the 1980s, it finally found a way to make money. The product was called an expert system - a program that captured a human specialist’s knowledge as a giant pile of if-then rules. A company called Digital Equipment deployed one to configure its computer orders, and it actually saved real money, and that was all it took. A gold rush began. An entire industry sprang up selling expensive, specialized computers built just to run this stuff. Japan launched a massive national project to build the machines of the future and scared Western governments into pouring in money to keep up. The hype, once again, went vertical.

And once again, it came crashing down. The expert systems turned out to be brittle - change one thing and they shattered. They were monstrously expensive to maintain. The specialized computers were undercut by ordinary machines that did the job for less. By around 1987, the whole market collapsed. Companies folded. The money vanished. And artificial intelligence became, for the second time in twenty years, a phrase you did not say out loud if you wanted to be taken seriously. The second AI winter had arrived.

Not everyone drew the same lesson from the wreckage. A roboticist named Rodney Brooks looked at decades of failure and concluded that the whole project had been built on a mistake. All that effort to give a machine a careful internal model of the world - maybe that was the problem. His robots threw out the model entirely and just reacted to the world directly. The world, he liked to say, is its own best model. It was a deliberate slap at the establishment, and it pointed toward a different, more grounded way of building intelligent machines.

So picture the field at the end of the 1980s. The symbolic camp had soared and crashed, and dragged AI’s reputation down with it. The connectionists, meanwhile, had quietly won the argument - backpropagation worked, the networks could learn, the theory was sound. And it still didn’t matter, because their idea was starving. Backpropagation needed two things the 1980s simply could not provide in the quantities required: enormous amounts of data, and enormous amounts of cheap computing power. LeCun’s network could read zip codes, but it was slow, and the datasets were tiny. The algorithm that would one day devour the entire internet was being fed with an eyedropper.

So the pioneers were stuck in the strangest of places - holding the winning lottery ticket, in a world that had not yet built the place where you cash it in. What happened next was not another dramatic crash. It was something quieter, and in a way more important. For the next twenty years, while the headlines ignored AI completely, the world would slowly, almost accidentally, build the two things the believers were waiting for. The data. And the power.

That long, quiet wait is where the next chapter begins.
