Mistral releases Mistral 7B

On September 27, 2023, the French lab Mistral AI (founded earlier that year by Arthur Mensch, Guillaume Lample, and Timothée Lacroix) released Mistral 7B, its first model, calling it “the most powerful language model for its size to date.” The model has 7.3 billion parameters and was published under the permissive Apache 2.0 license, which allows commercial use, modification, and redistribution.

Mistral reported that the model “outperforms Llama 2 13B on all benchmarks” and “outperforms Llama 1 34B on many benchmarks,” and that on reasoning and comprehension tasks it performed equivalently to a Llama 2 model more than three times its size. The model uses Grouped-Query Attention (GQA) for faster inference and Sliding Window Attention (SWA) to handle longer sequences at lower cost.
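As a rough illustration of the two attention techniques named above, here is a minimal Python sketch. It is not Mistral's implementation: the function names, the toy sequence length, the window size, and the head counts are all assumptions chosen only to make the idea visible. The first function builds a causal sliding-window mask, and the second shows how several query heads can share one key/value head under grouped-query attention.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j], True when query position i may attend to key position j.

    A position sees itself and at most the (window - 1) positions before it,
    and never sees positions after it (causal).
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(query_head: int, n_query_heads: int, n_kv_heads: int) -> int:
    """Map a query head to the key/value head it shares under grouped-query attention."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be a multiple of n_kv_heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


# Toy values, chosen for readability rather than taken from any real model.
mask = sliding_window_mask(seq_len=6, window=3)
attended_per_row = [sum(row) for row in mask]

# Eight hypothetical query heads sharing two key/value heads.
kv_assignment = [kv_head_for_query_head(h, 8, 2) for h in range(8)]

for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
print(attended_per_row)
print(kv_assignment)
```

In this sketch each position attends to at most `window` keys, so the per-token attention cost stays bounded as the sequence grows. Sharing key/value heads across a group of query heads shrinks the key/value cache by the ratio of query heads to key/value heads, which is where the inference speedup comes from.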

Mistral 7B marked the arrival of a serious European open-weights competitor. A small, capable, freely licensed model that ran on modest hardware, it gave developers a strong alternative to both closed commercial systems and Meta’s Llama, and set up Mistral’s follow-on sparse mixture-of-experts release, Mixtral, later the same year.
