The M-competitions are a series of open forecasting contests organized by Spyros Makridakis that has run since 1982. Their purpose is empirical rather than theoretical: instead of arguing about which forecasting method should work best, they run many methods on thousands of real time series and measure which ones actually do. The results have repeatedly reshaped what practitioners believe.
The M4 competition (2018) used 100,000 time series drawn from economics, finance, demographics, and industry. Entries were evaluated against statistical and machine learning benchmarks, with benchmark code available in the official Mcompetitions GitHub repository. A notable finding was that the best entries combined statistical and machine learning ideas, and that pure machine learning methods of the day often failed to beat simple statistical baselines such as exponential smoothing. The M5 competition (2020), described in the M5 Accuracy results paper, switched to a single large retail dataset of Walmart sales, organized into 42,840 hierarchical series, and reached a contrasting headline: gradient-boosted trees, especially LightGBM, dominated the leaderboard, helped by the rich explanatory variables retail data provides.
Together, M4 and M5 serve as the field’s reference point for honest, out-of-sample comparison of forecasting methods.
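To make "out-of-sample comparison" concrete, here is a minimal sketch in Python of the basic idea: hide the last few observations of each series, forecast them with each method, and average the error. It is an illustration only, not the competitions' actual evaluation code. The two toy series, the function names, the fixed smoothing weight `alpha=0.3`, and the choice of sMAPE as the only error measure are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Sequence

Forecaster = Callable[[Sequence[float], int], List[float]]


def naive_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon


def ses_forecast(history: Sequence[float], horizon: int, alpha: float = 0.3) -> List[float]:
    """Simple exponential smoothing with a fixed (assumed) smoothing weight."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    # Simple exponential smoothing produces a flat forecast at the final level.
    return [level] * horizon


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, in percent."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        terms.append(0.0 if denom == 0 else 2 * abs(a - f) / denom)
    return 100 * sum(terms) / len(terms)


def compare_out_of_sample(
    series_list: Sequence[Sequence[float]],
    methods: Dict[str, Forecaster],
    horizon: int,
) -> Dict[str, float]:
    """Hold out the last `horizon` points of each series and average each method's sMAPE."""
    scores: Dict[str, List[float]] = {name: [] for name in methods}
    for series in series_list:
        train, test = series[:-horizon], series[-horizon:]
        for name, method in methods.items():
            scores[name].append(smape(test, method(train, horizon)))
    return {name: sum(vals) / len(vals) for name, vals in scores.items()}


if __name__ == "__main__":
    # Made-up toy data, purely for illustration.
    toy_series = [
        [10, 12, 11, 13, 12, 14, 13, 15],
        [100, 98, 101, 99, 102, 100, 103, 101],
    ]
    methods = {"naive": naive_forecast, "ses": ses_forecast}
    print(compare_out_of_sample(toy_series, methods, horizon=2))
```

The key design point is that each method sees only the training portion of a series, so the score reflects how well it predicts data it has never seen. The real competitions apply the same principle at far larger scale and with additional accuracy measures.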
Why business readers should care: the M-competitions are the closest thing forecasting has to an independent referee, and they show that the right method depends heavily on the data, not on hype.