Office Open XML (OOXML)

Office Open XML (OOXML) is the family of XML-based file formats that Microsoft Office uses for documents, spreadsheets, and presentations, recognizable by the file extensions .docx, .xlsx, and .pptx. The Ecma standard’s official scope states that it “defines Office Open XML’s vocabularies and document representation and packaging.” A modern Office document is not a single opaque binary file but a ZIP package, structured according to the Open Packaging Conventions, that contains multiple XML parts: one part holds the document body, and others hold styles, the relationships between parts, embedded media, and metadata. Opening a .docx with an unzip tool reveals this internal directory of XML files.
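The package layout described above can be illustrated with a minimal sketch in Python. It builds a tiny WordprocessingML package in memory and lists its parts using only the standard library. The helper names (build_minimal_docx, list_parts) and the sample text are illustrative assumptions, and the package contains only the bare parts needed to show the structure, not everything Office itself would write.

```python
import io
import zipfile

# Declares the content type of each part in the package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Package-level relationships: points at the main document part.
ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
    'relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)

# The document body: one paragraph with one run of text.
DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document '
    'xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>Hello, OOXML</w:t></w:r></w:p></w:body>'
    '</w:document>'
)


def build_minimal_docx() -> bytes:
    """Return the bytes of a small ZIP package laid out like a .docx."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", ROOT_RELS)
        zf.writestr("word/document.xml", DOCUMENT)
    return buf.getvalue()


def list_parts(package: bytes) -> list[str]:
    """List the entry names stored in the ZIP package."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.namelist()


parts = list_parts(build_minimal_docx())
print(parts)
```

Passing the bytes of a real .docx, .xlsx, or .pptx file to list_parts works the same way, since each is an ordinary ZIP archive; a file saved by Office will typically show many more parts, such as styles, themes, and document properties.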

The format was designed to replace the legacy binary .doc, .xls, and .ppt formats that had been Office’s native storage for decades. Moving to XML made document contents transparent, easier to generate and inspect programmatically, more robust against corruption, and amenable to processing by software other than Office itself. Each application has its own vocabulary (WordprocessingML for documents, SpreadsheetML for workbooks, and PresentationML for slides), layered on the same packaging conventions and on shared markup for drawing and document metadata.
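As an illustration of what processing by software other than Office can look like, the following sketch pulls paragraph text out of a WordprocessingML body with Python’s standard XML parser. The sample markup and the function name paragraph_texts are assumptions made for the example; real documents carry much more markup (properties, styles, revision data), and this sketch ignores everything except text runs.

```python
import xml.etree.ElementTree as ET

# Namespace of the Transitional WordprocessingML vocabulary.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical sample: two paragraphs, the first split across two runs.
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}">'
    "<w:body>"
    "<w:p><w:r><w:t>First </w:t></w:r><w:r><w:t>paragraph.</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph.</w:t></w:r></w:p>"
    "</w:body>"
    "</w:document>"
)


def paragraph_texts(document_xml: str) -> list[str]:
    """Return the plain text of each w:p element, joining its w:t runs."""
    root = ET.fromstring(document_xml)
    paragraphs = []
    for p in root.iterfind(".//w:p", NS):
        runs = [t.text or "" for t in p.iterfind(".//w:t", NS)]
        paragraphs.append("".join(runs))
    return paragraphs


texts = paragraph_texts(SAMPLE)
print(texts)
```

SpreadsheetML and PresentationML can be read with the same approach, using their own namespaces and element names in place of w:p and w:t.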

Office Open XML was developed through Ecma Technical Committee TC45 and approved as an Ecma International standard, Ecma-376, in December 2006. Microsoft then submitted the format to the international standards process. In 2008 it was approved as ISO/IEC 29500 through the fast-track procedure of the ISO/IEC Joint Technical Committee (JTC 1), producing a format that exists in both “Strict” and “Transitional” conformance classes, the latter accommodating compatibility with older binary documents.
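One practical consequence of the two conformance classes is that they use different XML namespaces, so a reader can usually tell them apart from the root element of the main part. The sketch below is a heuristic under that assumption, shown for WordprocessingML only; the function name conformance_class is hypothetical, and documents written to the original Ecma-376 edition use the same namespace as Transitional, so the check cannot separate those two.

```python
import xml.etree.ElementTree as ET

# Namespace used by Transitional (and first-edition Ecma-376) documents.
TRANSITIONAL_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Namespace used by ISO/IEC 29500 Strict documents.
STRICT_NS = "http://purl.oclc.org/ooxml/wordprocessingml/main"


def conformance_class(document_xml: str) -> str:
    """Guess the conformance class from the root element's namespace."""
    root = ET.fromstring(document_xml)
    # ElementTree writes qualified names as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace = root.tag[1:].partition("}")[0]
    else:
        namespace = ""
    if namespace == STRICT_NS:
        return "Strict"
    if namespace == TRANSITIONAL_NS:
        return "Transitional"
    return "unknown"


strict_doc = f'<w:document xmlns:w="{STRICT_NS}"><w:body/></w:document>'
transitional_doc = f'<w:document xmlns:w="{TRANSITIONAL_NS}"><w:body/></w:document>'
print(conformance_class(strict_doc), conformance_class(transitional_doc))
```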

The ISO standardization was one of the most contested in the history of document formats. Critics argued that the specification was enormous (thousands of pages), was rushed through the fast-track timeline, duplicated the already-standardized OpenDocument Format (ODF), and contained references to legacy binary behaviors that made full independent implementation difficult. The ballot drew an unusually large number of national-body comments and procedural objections, and several national standards organizations and observers publicly criticized the process. Supporters countered that a documented specification, offered free of royalty claims, for the world’s most widely used office formats was a clear improvement over undocumented binary files. The outcome left two competing ISO office-document standards, OOXML and ODF, in the field.

OOXML’s importance is largely a matter of scale: it is the default save format of the most widely used office software in the world, so a very large share of the word-processing documents, spreadsheets, and presentations created today are OOXML packages. Its ZIP-of-XML design shaped how other applications read and write Office files, enabled a large ecosystem of server-side document generation and conversion tools, and made Office content far more interoperable than the binary formats it replaced, even as the standardization fight became a defining episode in the politics of open formats.