Code Llama is a family of code-specialized large language models released by Meta and described in “Code Llama: Open Foundation Models for Code” (arXiv, August 2023, by Baptiste Rozière and colleagues). The models are built on top of Llama 2 and come in 7B, 13B, 34B, and 70B parameter sizes (the 70B models were added in January 2024, after the initial release), with separate Python-specialized and instruction-following variants alongside the base versions.
The models were trained on sequences of 16k (16,384) tokens and show improvements on inputs of up to about 100,000 tokens. The 7B, 13B, and 70B base and instruction-following variants also support infilling, meaning they can fill in code given both the text before and the text after a gap rather than only continuing from the left. On standard benchmarks the family reached up to 67 percent on HumanEval and 65 percent on MBPP. The Python-specialized 7B model outperformed the much larger general-purpose Llama 2 70B on both benchmarks, which shows the value of code-focused training. The models were released under a permissive license allowing research and commercial use.
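To make infilling concrete, the following is a minimal sketch of how the text around a gap might be arranged into a single prompt so that a left-to-right model can generate the missing middle. The sentinel strings `<PRE>`, `<SUF>`, and `<MID>`, the `<GAP>` marker, and the function names are illustrative assumptions, not the exact tokens or API of any particular tokenizer. The model call is replaced by a hard-coded string so the example runs on its own.

```python
def build_infill_prompt(prefix: str, suffix: str,
                        pre_tok: str = "<PRE>",
                        suf_tok: str = "<SUF>",
                        mid_tok: str = "<MID>") -> str:
    """Arrange the code before and after a gap into one prompt.

    The sentinel strings are placeholders chosen for illustration; a real
    tokenizer defines its own special tokens and spacing rules.
    """
    return f"{pre_tok}{prefix}{suf_tok}{suffix}{mid_tok}"


def splice(prefix: str, middle: str, suffix: str) -> str:
    """Insert the generated middle back between prefix and suffix."""
    return prefix + middle + suffix


# A source file with a hypothetical gap marker where code is missing.
source = "def add(a, b):\n    <GAP>\n\nprint(add(2, 3))\n"
prefix, suffix = source.split("<GAP>")

prompt = build_infill_prompt(prefix, suffix)

# Stand-in for the model's output; an infilling model would generate this
# text after seeing the prompt, then stop at an end-of-infill token.
middle = "return a + b"

completed = splice(prefix, middle, suffix)
```

The point of the arrangement is that the suffix appears in the prompt before the model starts generating, so the completion can be conditioned on the code that follows the gap as well as the code that precedes it.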
Code Llama mattered as one of the first strong, openly available code model families that organizations could run and fine-tune on their own infrastructure. For businesses concerned about sending source code to a third-party API, an open code model that supports infilling and long context offered a self-hosted alternative to proprietary coding assistants.