Snowflake Arctic (Dense-MoE open model)

Snowflake Arctic is an open enterprise language model that Snowflake released on April 24, 2024, under the Apache 2.0 license with ungated access to weights and code. Its standout claim was training efficiency: Snowflake said it built Arctic on a training compute budget of under $2 million (fewer than 3,000 GPU weeks) while matching Llama 3 70B on enterprise benchmarks, a model it estimated used about 17 times more compute. Arctic was deliberately optimized for what Snowflake called enterprise intelligence (SQL generation, coding, instruction following) rather than for general-knowledge trivia benchmarks.
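To put the budget figures on a common scale, the short Python sketch below converts the quoted GPU weeks into GPU hours and derives the hourly rate they imply. The function and constant names are illustrative only. Both inputs are upper bounds from Snowflake's announcement, so the implied rate is a rough order-of-magnitude figure, not a reported price. The scaled figure for Llama 3 70B assumes comparable hardware and is likewise only an estimate.

```python
# Back-of-envelope arithmetic on the training budget quoted in the text.
# All inputs are upper bounds or estimates; results are approximate.

HOURS_PER_WEEK = 7 * 24

BUDGET_USD = 2_000_000       # "under $2 million"
GPU_WEEKS = 3_000            # "less than 3,000 GPU weeks"
COMPUTE_MULTIPLE = 17        # Snowflake's estimate for Llama 3 70B


def gpu_hours(gpu_weeks: float) -> float:
    """Convert GPU weeks to GPU hours."""
    return gpu_weeks * HOURS_PER_WEEK


def implied_usd_per_gpu_hour(budget_usd: float, gpu_weeks: float) -> float:
    """Hourly rate implied by dividing the budget by the GPU hours."""
    return budget_usd / gpu_hours(gpu_weeks)


def scaled_gpu_weeks(gpu_weeks: float, multiple: float) -> float:
    """GPU weeks for a run using `multiple` times the compute.

    Assumes the same hardware and utilization, which is a simplification.
    """
    return gpu_weeks * multiple


if __name__ == "__main__":
    print(f"GPU hours:        {gpu_hours(GPU_WEEKS):,.0f}")
    print(f"Implied $/GPU-hr: {implied_usd_per_gpu_hour(BUDGET_USD, GPU_WEEKS):.2f}")
    print(f"17x equivalent:   {scaled_gpu_weeks(GPU_WEEKS, COMPUTE_MULTIPLE):,.0f} GPU weeks")
```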

The efficiency came from an unusual Dense-MoE hybrid architecture: a 10-billion-parameter dense transformer combined with a residual mixture-of-experts layer of 128 experts at 3.66 billion parameters each, using top-2 gating so that only two experts run for any given token. That yields roughly 480 billion total parameters but only about 17 billion active per token, which keeps per-token inference cost low while the large expert pool adds model capacity. Arctic was distributed through Hugging Face, AWS, Azure, the NVIDIA API catalog, and Snowflake’s own Cortex service.
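The parameter counts above follow from simple arithmetic, sketched here in Python. The constants are the figures quoted in the text and the names are illustrative. The calculation treats the 10 billion dense parameters as always active and adds the experts selected by top-2 gating. That is a simplification of the real model, but it reproduces the rounded totals.

```python
# Parameter arithmetic for a Dense-MoE hybrid, using the figures in the text.
# Units are billions of parameters; variable names are illustrative only.

DENSE_PARAMS_B = 10.0     # dense transformer, always active
NUM_EXPERTS = 128         # experts in the residual MoE layer
EXPERT_PARAMS_B = 3.66    # parameters per expert
TOP_K = 2                 # experts selected per token (top-2 gating)


def total_params_b() -> float:
    """All parameters stored in the model."""
    return DENSE_PARAMS_B + NUM_EXPERTS * EXPERT_PARAMS_B


def active_params_b() -> float:
    """Parameters used for a single token: dense part plus top-k experts."""
    return DENSE_PARAMS_B + TOP_K * EXPERT_PARAMS_B


if __name__ == "__main__":
    total = total_params_b()
    active = active_params_b()
    print(f"total  ~ {total:.1f}B")   # rounds to the quoted 480B
    print(f"active ~ {active:.1f}B")  # rounds to the quoted 17B
    print(f"active share of total ~ {active / total:.1%}")
```

By this arithmetic each token uses under 4 percent of the stored parameters, which is the basis for the claim that inference stays cheap relative to the model's total size.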

Why business readers should care: Arctic was a pointed argument that competitive open models can be trained on modest budgets when architecture and data are tuned for the target use, undercutting the assumption that only the largest compute budgets produce useful enterprise models.
