DBRX (Databricks open MoE model)

DBRX is an open large language model released by Databricks on March 27, 2024. It uses a fine-grained mixture-of-experts (MoE) design with 132 billion total parameters, of which 36 billion are active on any given input. Where Mixtral used 8 experts and selected 2 per token, DBRX uses 16 smaller experts and selects 4. Databricks argued that this gave 65 times as many possible expert combinations and better quality per unit of compute. Both the base and instruction-tuned weights were released on Hugging Face under the Databricks Open Model License.
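The routing arithmetic behind that comparison can be checked in a few lines. The sketch below is illustrative only: it assumes the router picks an unordered set of experts for each token within a single MoE layer, and the helper name routing_combinations is hypothetical, not part of any DBRX or Databricks API. The parameter counts are the figures quoted above.

```python
from math import comb


def routing_combinations(num_experts: int, top_k: int) -> int:
    """Count the distinct unordered expert subsets a router can pick per token.

    Hypothetical helper for illustration; assumes one MoE layer and
    that the order of the selected experts does not matter.
    """
    return comb(num_experts, top_k)


dbrx_combos = routing_combinations(16, 4)    # DBRX: 16 experts, 4 selected
mixtral_combos = routing_combinations(8, 2)  # Mixtral: 8 experts, 2 selected
ratio = dbrx_combos / mixtral_combos

# Share of DBRX's parameters that are active for any given input,
# using the 36 billion and 132 billion figures from the text.
active_share = 36 / 132

print(dbrx_combos, mixtral_combos, ratio)
print(f"{active_share:.1%}")
```

Under these assumptions, DBRX has 1,820 possible expert subsets per layer against Mixtral's 28, a ratio of 65, and roughly 27 percent of its parameters do work on any one input.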

On release, Databricks said DBRX surpassed GPT-3.5 on general knowledge and programming benchmarks and was competitive with Gemini 1.0 Pro. It also reported that DBRX led open models on two measures: 74.5 percent on the Hugging Face Open LLM Leaderboard, a composite score, and 70.1 percent on the HumanEval coding benchmark, ahead of CodeLlama-70B. The release doubled as a showcase for the Mosaic training stack, which Databricks gained when it bought MosaicML for $1.3 billion in 2023.

Why business readers should care: DBRX showed an enterprise data platform shipping a frontier-class open MoE model, signaling that the companies selling data infrastructure intend to provide the models that run on it, not just the pipes.