Catastrophic Forgetting

Catastrophic forgetting is the tendency of a neural network trained on a new task to overwrite the knowledge it acquired for earlier ones. Because learning works by nudging the same shared weights to reduce error on whatever data is in front of the model, training hard on task B can move those weights away from the values that made task A work. A network that has just mastered a second language dataset may suddenly do poorly on the first, even though nothing about the first changed. The DeepMind-led 2017 paper “Overcoming catastrophic forgetting in neural networks” framed the problem bluntly, noting that “it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models,” before arguing that the limitation can be overcome.
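The mechanism can be reproduced at toy scale. The following is a minimal Python sketch, not taken from the paper: a two-weight linear model is trained by plain gradient descent on a made-up task A and then on a made-up task B. The tasks, function names, learning rate, and step count are all illustrative assumptions.

```python
def predict(w, x):
    """Linear model: weighted sum of the inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mse(w, data):
    """Mean squared error of weights w on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def train(w, data, lr=0.1, steps=500):
    """Plain gradient descent on the MSE of one task; returns new weights."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2 * err * xi / len(data)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


# Hypothetical tasks that share the same two weights.
task_a = [((1.0, 0.0), 1.0)]  # solved by any w with w[0] == 1
task_b = [((1.0, 1.0), 3.0)]  # solved by any w with w[0] + w[1] == 3

w_after_a = train([0.0, 0.0], task_a)
loss_a_before = mse(w_after_a, task_a)  # close to 0: task A is learned

w_after_b = train(w_after_a, task_b)    # now train only on task B
loss_a_after = mse(w_after_b, task_a)   # close to 1: task A is forgotten
loss_b_after = mse(w_after_b, task_b)   # close to 0: task B is learned

print(f"task A loss before B: {loss_a_before:.3f}")
print(f"task A loss after B:  {loss_a_after:.3f}")
```

Both tasks could be satisfied at once by the weights (1, 2), but gradient descent on task B alone has no reason to prefer that solution. It ends near (2, 1), which fits task B and breaks task A, even though task A's data never changed.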

This is the central obstacle to continual or lifelong learning, where the goal is to keep adding skills over time the way a person does, rather than retraining from scratch whenever the task list grows. The 2017 paper proposed elastic weight consolidation (EWC), which “remembers old tasks by selectively slowing down learning on the weights important for those tasks”: it estimates which parameters matter most for prior skills and penalizes changes to them. Other approaches replay old examples during new training or grow the network so new tasks get fresh capacity.
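The idea behind elastic weight consolidation can be sketched as a quadratic penalty added to the new task's loss: each weight is pulled back toward its value after the old task, with a strength proportional to an importance score for that weight. The Python sketch below is a toy illustration under stated assumptions, not the paper's implementation. The two tasks are made up, the weights after task A are written in by hand, and importance is approximated by the mean squared input, which matches the diagonal Fisher information only for a linear model with unit-variance Gaussian noise. The names `fisher_diagonal`, `train_ewc`, and `lam` are hypothetical.

```python
def predict(w, x):
    """Linear model: weighted sum of the inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mse(w, data):
    """Mean squared error of weights w on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def fisher_diagonal(data):
    """Per-weight importance: mean squared input (assumption, see lead-in)."""
    dim = len(data[0][0])
    return [sum(x[i] ** 2 for x, _ in data) / len(data) for i in range(dim)]


def train_ewc(w, data, anchor, fisher, lam, lr=0.05, steps=2000):
    """Gradient descent on: MSE(data) + (lam / 2) * sum_i F_i * (w_i - anchor_i)^2."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2 * err * xi / len(data)
        for i in range(len(w)):
            # Penalty gradient: important weights are held near the anchor.
            grad[i] += lam * fisher[i] * (w[i] - anchor[i])
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


# Hypothetical tasks that share the same two weights.
task_a = [((1.0, 0.0), 1.0)]  # solved by any w with w[0] == 1
task_b = [((1.0, 1.0), 3.0)]  # solved by any w with w[0] + w[1] == 3

w_star_a = [1.0, 0.0]              # assumed result of training on task A
fisher_a = fisher_diagonal(task_a) # [1.0, 0.0]: only w[0] matters for task A

w_plain = train_ewc(w_star_a, task_b, w_star_a, fisher_a, lam=0.0)
w_ewc = train_ewc(w_star_a, task_b, w_star_a, fisher_a, lam=10.0)

print("task A loss, no penalty:  ", round(mse(w_plain, task_a), 3))
print("task A loss, with penalty:", round(mse(w_ewc, task_a), 3))
print("task B loss, with penalty:", round(mse(w_ewc, task_b), 3))
```

With `lam=0.0` the run reduces to ordinary training on task B and task A's error rises to about 1. With the penalty switched on, the weight that task A depends on stays near 1 and the unimportant weight absorbs the change, so both tasks end with near-zero error. In a real network the importance scores and the right penalty strength are harder to get, and a solution that fits every task is not guaranteed to exist.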

In modern practice the issue shows up when fine-tuning a pretrained model: push too hard on a narrow dataset and the model can lose general abilities it had before. It is one reason teams favor lighter-touch, parameter-efficient adaptation methods such as low-rank adapters (LoRA), which train a small set of added parameters while leaving the pretrained weights frozen.

Why business readers should care: catastrophic forgetting is why you cannot simply keep bolting new knowledge onto a deployed model without care. It explains why updating an AI for a new task can quietly degrade its old behavior, and why retraining and fine-tuning strategies need testing for regressions.
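Regression testing for a model update can start very simply: score the model on a fixed set of older tasks before and after the change, and flag any task whose score dropped by more than an agreed margin. The Python sketch below illustrates that check. The function name `find_regressions`, the task names, the scores, and the tolerance are all made up for illustration.

```python
def find_regressions(before, after, tolerance=0.01):
    """Return {task: drop} for tasks whose score fell by more than tolerance.

    `before` and `after` map task names to scores where higher is better.
    Every task in `before` must also be present in `after`.
    """
    regressions = {}
    for task, old_score in before.items():
        drop = old_score - after[task]
        if drop > tolerance:
            regressions[task] = drop
    return regressions


# Hypothetical evaluation scores before and after fine-tuning on a new domain.
before = {"general_qa": 0.82, "summarization": 0.77, "new_domain": 0.40}
after = {"general_qa": 0.71, "summarization": 0.77, "new_domain": 0.88}

flagged = find_regressions(before, after)
for task, drop in flagged.items():
    print(f"{task}: score dropped by {drop:.2f}")
```

In this made-up example the update improves the new domain but flags `general_qa`, which is the pattern catastrophic forgetting tends to produce: the gain on the new task is visible, and the loss on older ones only shows up if someone measures it.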
