ML Observability

ML observability, a term often used interchangeably with model monitoring, is the practice of continuously watching a machine-learning model after it is deployed to make sure it is still behaving correctly. Unlike conventional software, a model can degrade silently: it keeps returning predictions without errors even as the world changes underneath it and those predictions become wrong. Observability tooling exists to catch this.

Teams typically monitor input data quality (missing values, out-of-range features, schema changes), data drift (the distribution of incoming data shifting away from the training data), concept drift (the relationship between inputs and the correct answer changing), training-serving skew (the model receiving features in production that are computed or processed differently from the ones it was trained on), and, where ground-truth labels eventually arrive, the model’s actual accuracy over time. Open-source frameworks such as Evidently provide metrics, tests, and dashboards for evaluating and monitoring both traditional models and LLM-powered systems, and a number of commercial platforms (for example Arize, WhyLabs, and Fiddler) offer hosted monitoring.
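To make the idea of data drift concrete, here is a minimal sketch of one common drift score, the Population Stability Index (PSI), computed for a single numeric feature. It is an illustration only, not the API of Evidently or of any platform named above. The function names, the choice of ten quantile bins, the small floor `eps`, and the synthetic Gaussian data are all assumptions made for this example.

```python
import math
import random
from bisect import bisect_right


def quantile_edges(reference, n_bins=10):
    """Interior bin edges placed at evenly spaced quantiles of the reference sample."""
    ordered = sorted(reference)
    return [ordered[(len(ordered) * i) // n_bins] for i in range(1, n_bins)]


def bin_fractions(values, edges, eps=1e-6):
    """Fraction of values falling in each bin, floored at eps so the log is defined."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [max(c / len(values), eps) for c in counts]


def population_stability_index(reference, current, n_bins=10):
    """PSI = sum over bins of (cur - ref) * ln(cur / ref); 0 means identical histograms."""
    edges = quantile_edges(reference, n_bins)
    ref = bin_fractions(reference, edges)
    cur = bin_fractions(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Synthetic data (an assumption for illustration): one feature at training time,
# a production sample from the same distribution, and one whose mean has shifted.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same_dist = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(training, same_dist)
psi_shifted = population_stability_index(training, shifted)

print(f"PSI, same distribution:    {psi_same:.4f}")
print(f"PSI, shifted distribution: {psi_shifted:.4f}")
```

A widely quoted rule of thumb reads a PSI below 0.1 as stable and above 0.25 as a significant shift, but these cutoffs are conventions rather than a standard, and a real deployment would tune them per feature. Note that a score like this only detects data drift. Concept drift leaves the input distribution unchanged, so it can only be caught by comparing predictions against ground-truth labels once they arrive.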

This discipline grew out of the recognition, captured in papers like Google’s “Hidden Technical Debt in Machine Learning Systems” (Sculley et al., 2015), that real-world ML systems commonly incur large ongoing maintenance costs after deployment.

For a business reader, ML observability is the safety system that keeps a working model from quietly turning into a liability as conditions change.
