KAN: Kolmogorov-Arnold Networks

“KAN: Kolmogorov-Arnold Networks,” submitted to arXiv on April 30, 2024 by Ziming Liu and colleagues including Max Tegmark, proposed an alternative to the standard multilayer perceptron (MLP), the basic building block of most neural networks. The idea draws on the Kolmogorov-Arnold representation theorem, which shows that any continuous function of several variables on a bounded domain can be written using only addition and composition of continuous single-variable functions.
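
For readers who want the statement in symbols, the theorem is commonly written as below for a continuous function f of n variables on a bounded domain such as the unit cube. The symbol names (the outer functions \Phi_q and inner functions \phi_{q,p}) follow common convention and are not taken from the text above.

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

Every function on the right-hand side takes a single real input, so the only genuinely multivariate operation is addition.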

In a conventional MLP, each node applies a fixed activation function and the learning happens in the linear weights on the connections. KANs flip this: they place learnable activation functions, parameterized as splines, on the edges, and remove the fixed activations from the nodes, which simply sum their incoming signals. Each connection thus learns its own shape of nonlinearity rather than just a scalar weight.
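
The contrast can be made concrete with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: each edge function is a piecewise-linear spline on a fixed uniform grid (chosen for brevity), inputs are assumed to lie inside that grid, and the names `hat_basis`, `KANLayer`, and `mlp_layer` are hypothetical. Training is omitted; only the forward pass is shown.

```python
import numpy as np


def hat_basis(x, grid):
    """Piecewise-linear "hat" basis functions evaluated at x.

    x: array of shape (batch, n_in); grid: uniformly spaced knots, shape (G,).
    Returns an array of shape (batch, n_in, G).
    """
    h = grid[1] - grid[0]
    # Each hat equals 1 at its own knot and falls linearly to 0 at the neighbours.
    return np.clip(1.0 - np.abs(x[..., None] - grid) / h, 0.0, None)


class KANLayer:
    """Illustrative KAN-style layer: one learnable 1-D function per edge."""

    def __init__(self, n_in, n_out, grid_size=8, x_min=-1.0, x_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(x_min, x_max, grid_size)
        # coef[j, i, :] holds the spline coefficients of the edge from input i to output j.
        self.coef = rng.normal(scale=0.1, size=(n_out, n_in, grid_size))

    def edge_outputs(self, x):
        """Value of every edge function, shape (batch, n_out, n_in)."""
        basis = hat_basis(x, self.grid)
        return np.einsum("big,oig->boi", basis, self.coef)

    def __call__(self, x):
        # Nodes have no activation of their own: they only sum incoming edges.
        return self.edge_outputs(x).sum(axis=-1)


def mlp_layer(x, W, b):
    """Conventional layer for comparison: learnable scalar weights, fixed tanh at the node."""
    return np.tanh(x @ W.T + b)


# Usage: two inputs, three outputs, a batch of five points inside the grid range.
layer = KANLayer(n_in=2, n_out=3)
x = np.random.default_rng(1).uniform(-1.0, 1.0, size=(5, 2))
y = layer(x)  # shape (5, 3)
```

In `mlp_layer` the learnable part is the matrix `W` and the nonlinearity is fixed; in `KANLayer` the learnable part is the shape of each edge curve, stored in `coef`.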

The authors argued this design yields networks that are both more accurate and more interpretable than MLPs of comparable or even larger size, particularly on tasks like fitting functions and solving differential equations. Because each edge function can be visualized as a curve, a trained KAN can sometimes be read to recover an explicit mathematical formula, which the authors highlight as useful for scientific discovery. They also reported that KANs follow faster neural scaling laws than MLPs, meaning error falls more quickly as parameters are added.
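
One way to picture "reading" an edge is to sample its learned curve and check which simple formula it most resembles. The sketch below is a hypothetical illustration of that idea, not the authors' procedure: the candidate list `CANDIDATES` and the function `best_symbolic_match` are invented for this example, and each candidate g is compared to the curve after a least-squares fit of the form a*g(x) + b.

```python
import numpy as np

# Hypothetical library of candidate formulas for a single edge curve.
CANDIDATES = {
    "x": lambda x: x,
    "x^2": lambda x: x ** 2,
    "sin": np.sin,
    "exp": np.exp,
}


def best_symbolic_match(x, y):
    """Return (name, r2) of the candidate g that best fits y as a*g(x) + b.

    x, y: 1-D arrays sampling the edge curve; y is assumed not to be constant.
    """
    ss_tot = np.sum((y - y.mean()) ** 2)
    best_name, best_r2 = None, -np.inf
    for name, g in CANDIDATES.items():
        # Least-squares fit of the scale a and offset b for this candidate.
        A = np.column_stack([g(x), np.ones_like(x)])
        params, *_ = np.linalg.lstsq(A, y, rcond=None)
        ss_res = np.sum((y - A @ params) ** 2)
        r2 = 1.0 - ss_res / ss_tot
        if r2 > best_r2:
            best_name, best_r2 = name, r2
    return best_name, best_r2


# Usage: a curve sampled from 3*sin(x) + 0.5 stands in for a learned edge function.
xs = np.linspace(-2.0, 2.0, 200)
name, r2 = best_symbolic_match(xs, 3.0 * np.sin(xs) + 0.5)
```

A high coefficient of determination for one candidate suggests, but does not prove, that the edge has learned that functional form.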

KANs generated considerable interest as a fresh take on a decades-old default. For a general reader, the appeal is interpretability: in domains like physics and engineering, where understanding the model matters as much as its accuracy, an architecture whose learned functions can be inspected and turned into equations has clear practical value.

