For decades the Open Source Initiative (OSI) has been the steward of the Open Source Definition for software, the standard that decides what can legitimately be called “open source.” The rise of large AI models raised a question the old definition could not answer: companies were calling their models “open source” even when key components were withheld or use was restricted, a practice critics labeled “open-washing.” On October 28, 2024, at the All Things Open conference, the OSI released version 1.0 of the Open Source AI Definition (OSAID), the first attempt at an authoritative answer.
The definition adapts the four classic software freedoms to AI systems: a system is open-source AI only if it grants the freedoms to use it for any purpose without permission, to study how it works, to modify it (including to change its outputs), and to share it with or without modifications. To make those freedoms real, the definition requires access to the model parameters, such as weights; the full source code used to train and run the system; and, notably, “sufficiently detailed information about the data” so that a skilled person could build a substantially equivalent system. That data-information requirement, which stops short of mandating release of the raw training dataset itself, was the most debated compromise. The result drew endorsements from groups such as the Mozilla Foundation and EleutherAI, but also criticism that it was too permissive on data.
Why business readers should care: the label “open source” carries legal, reputational, and procurement weight. The OSAID gives organizations a yardstick to judge whether a model marketed as open actually grants the freedoms the term implies, and it exposes the gap between that standard and restricted “open-weight” licenses such as Meta’s Llama license.