Llama 2: Open Foundation and Fine-Tuned Chat Models

Abstract: In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.

Synopsis

Overview

  • Keywords: Large Language Models, Llama 2, Fine-Tuning, Chat Models, Safety, Reinforcement Learning
  • Objective: Develop and release Llama 2, a collection of pretrained and fine-tuned large language models optimized for dialogue use cases.
  • Hypothesis: Llama 2-Chat models will outperform existing open-source models and be competitive with closed-source models in terms of helpfulness and safety.
  • Innovation: Introduction of a new fine-tuning methodology, including Ghost Attention (GAtt) for more consistent multi-turn dialogue and safety enhancements driven by iterative red-teaming and evaluation.

Background

  • Preliminary Theories:

    • Reinforcement Learning with Human Feedback (RLHF): A method for aligning model outputs with human preferences by using feedback from human evaluators to guide model training.
    • Auto-Regressive Transformers: The foundational architecture for LLMs, which predicts the next token in a sequence based on the preceding tokens (a minimal sketch follows at the end of this section).
    • Safety in AI: The importance of mitigating risks associated with LLMs, including bias, toxicity, and misinformation.
    • Instruction Tuning: A technique to enhance model performance on specific tasks by training on a dataset of annotated instructions.
  • Prior Research:

    • BLOOM (2022): An open-source LLM that set a precedent for community-driven model development.
    • LLaMA (2023): The predecessor to Llama 2, demonstrating competitive performance with closed-source models.
    • Falcon (2023): Another open-source model that provided a benchmark for evaluating Llama 2's performance.
    • GPT-3 (2020): A widely recognized closed-source model that established high standards for LLM capabilities.
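
To make the auto-regressive objective above concrete, here is a minimal sketch (not the authors' code) of next-token prediction with teacher forcing; the embedding-plus-linear "model" and toy sizes are stand-ins for a real causal transformer.

```python
# Minimal sketch of the auto-regressive next-token objective: position t
# predicts token t+1, trained with cross-entropy over the shifted sequence.
import torch
import torch.nn as nn

vocab_size, d_model = 100, 32          # toy sizes, purely illustrative

# Stand-in "language model": embedding + linear head. A real LLM would use
# stacked causal-attention transformer blocks here.
embed = nn.Embedding(vocab_size, d_model)
head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (1, 16))   # one toy token sequence
logits = head(embed(tokens))                     # (1, 16, vocab_size)

# Shift so the prediction at step t is scored against token t+1.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),      # predictions for steps 0..14
    tokens[:, 1:].reshape(-1),                   # targets are the next tokens
)
print(f"next-token cross-entropy: {loss.item():.3f}")
```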

Methodology

  • Key Ideas:

    • Supervised Fine-Tuning (SFT): Initial fine-tuning phase on a curated set of high-quality instruction–response pairs to guide model behavior (a loss-masking sketch appears at the end of this section).
    • Ghost Attention (GAtt): A fine-tuning technique that keeps an initial system instruction in effect across the turns of a dialogue, improving multi-turn consistency (see the GAtt sketch at the end of this section).
    • Iterative Reward Modeling: Reward models trained on human preference data and refreshed over successive RLHF iterations so that the reward signal keeps pace with the improving chat model (a ranking-loss sketch appears at the end of this section).
  • Experiments:

    • Human Evaluation: Over 4,000 prompts were used to assess helpfulness and safety, comparing Llama 2-Chat with both open-source and closed-source models.
    • Safety Evaluations: Conducted through red-teaming and iterative assessments to identify and mitigate potential risks in model outputs.
  • Implications: The methodology emphasizes transparency and reproducibility, encouraging further community engagement in improving LLM safety and performance.
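
For the SFT step, the paper describes concatenating prompt and answer and backpropagating only on answer tokens. The sketch below illustrates that loss masking; the token IDs and random logits are toy placeholders, not a real tokenizer or model.

```python
# Sketch of supervised fine-tuning loss masking: prompt and answer are
# concatenated, but the loss is computed only on answer tokens.
import torch
import torch.nn.functional as F

IGNORE = -100                                    # ignored by cross_entropy

prompt_ids = torch.tensor([5, 17, 42, 8])        # "instruction" tokens (toy)
answer_ids = torch.tensor([23, 7, 91, 2])        # "response" tokens (toy)

input_ids = torch.cat([prompt_ids, answer_ids]).unsqueeze(0)
labels = torch.cat([torch.full_like(prompt_ids, IGNORE), answer_ids]).unsqueeze(0)

vocab_size = 100
logits = torch.randn(1, input_ids.shape[1], vocab_size)  # stand-in model output

# Shift so position t predicts token t+1; prompt positions contribute no loss.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    labels[:, 1:].reshape(-1),
    ignore_index=IGNORE,
)
print(f"SFT loss over answer tokens only: {loss.item():.3f}")
```

Ghost Attention is described in the paper as a data-level trick: a system instruction is concatenated to the user messages of a multi-turn training dialogue, and the loss on earlier turns is zeroed out so the model learns to respect the instruction throughout the conversation. The sketch below is an illustrative reconstruction of that data preparation, not the released pipeline; the dialogue content and helper name are invented.

```python
# Illustrative reconstruction of Ghost Attention (GAtt) data preparation:
# prepend the system instruction to the user turns of a sampled dialogue,
# then keep the training loss only on the final assistant answer.
instruction = "Always answer as a pirate."
dialogue = [
    ("user", "Hi, who are you?"),
    ("assistant", "Arr, I be yer helpful assistant, matey!"),
    ("user", "What's the capital of France?"),
    ("assistant", "Arr, that be Paris, matey!"),
]

def build_gatt_sample(instruction, dialogue):
    """Attach the instruction to user turns; compute loss only on the last answer."""
    last_assistant = max(i for i, (role, _) in enumerate(dialogue) if role == "assistant")
    sample = []
    for i, (role, text) in enumerate(dialogue):
        if role == "user":
            text = f"{instruction}\n{text}"      # instruction rides along every user turn
        sample.append({
            "role": role,
            "text": text,
            "compute_loss": role == "assistant" and i == last_assistant,
        })
    return sample

for turn in build_gatt_sample(instruction, dialogue):
    flag = "LOSS" if turn["compute_loss"] else "no loss"
    print(f"[{flag:7}] {turn['role']}: {turn['text']!r}")
```

The reward models behind the iterative RLHF stage are trained on pairs of human-ranked responses; the paper uses a binary ranking loss with a margin term that grows with how strongly the annotator preferred the chosen response. A minimal sketch of that loss follows; the scalar scores are placeholders for actual reward-model outputs.

```python
# Sketch of the binary ranking loss for a preference reward model:
# L = -log(sigmoid(r(chosen) - r(rejected) - margin)).
import torch
import torch.nn.functional as F

def ranking_loss(score_chosen, score_rejected, margin=0.0):
    """Binary ranking loss with an optional preference-strength margin."""
    return -F.logsigmoid(score_chosen - score_rejected - margin).mean()

# Toy batch: reward scores for (chosen, rejected) pairs and annotator margins.
chosen = torch.tensor([1.2, 0.4, 2.0])
rejected = torch.tensor([0.3, 0.5, -1.0])
margins = torch.tensor([1.0, 0.3, 1.0])   # e.g. "significantly better" vs "slightly better"

loss_plain = ranking_loss(chosen, rejected)
loss_margin = ranking_loss(chosen, rejected, margins)
print(f"loss without margin: {loss_plain.item():.3f}")
print(f"loss with margin:    {loss_margin.item():.3f}")
```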

Findings

  • Outcomes:

    • Llama 2-Chat models consistently outperformed open-source counterparts and showed competitive performance against closed-source models like ChatGPT.
    • Significant improvements in safety metrics, with the share of toxic generations dropping to effectively zero across all Llama 2-Chat model sizes.
    • Improved truthfulness scores after fine-tuning, indicating more reliable model outputs.
  • Significance: Llama 2-Chat challenges the notion that only closed-source models can achieve high performance, demonstrating that open-source models can be equally effective with proper tuning and safety measures.

  • Future Work: Opportunities exist for expanding multilingual capabilities, refining safety protocols, and exploring the integration of external tools within the model framework.

  • Potential Impact: Advancements in Llama 2-Chat could lead to broader adoption of open-source LLMs in various applications, fostering innovation while maintaining ethical standards in AI development.

Meta

Published: 2023-07-18

Updated: 2025-08-27

URL: https://arxiv.org/abs/2307.09288v2

Authors: Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom

Citations: 5602

H Index: 635

Categories: cs.CL, cs.AI

Model: gpt-4o-mini