2505.00949 May 02, 2025 cs.AI

Llama-Nemotron: 효율적인 추론 모델

Llama-Nemotron: Efficient Reasoning Models

M. Patwary
M. Patwary
Citations: 7,282
h-index: 25
Brandon Norick
Brandon Norick
Citations: 3,309
h-index: 9
A. Bercovich
A. Bercovich
Citations: 1,739
h-index: 11
Itay Levy
Itay Levy
NVIDIA
Citations: 261
h-index: 4
Izik Golan
Izik Golan
Citations: 112
h-index: 3
Mohammad Dabbah
Mohammad Dabbah
Citations: 79
h-index: 4
Ran El-Yaniv
Ran El-Yaniv
Citations: 208
h-index: 6
Omri Puny
Omri Puny
Citations: 480
h-index: 8
Ido Galil
Ido Galil
Technion
Citations: 185
h-index: 6
Zach Moshe
Zach Moshe
Citations: 348
h-index: 6
Tomer Ronen
Tomer Ronen
Citations: 168
h-index: 5
Najeeb Nabwani
Najeeb Nabwani
Citations: 79
h-index: 4
Ido Shahaf
Ido Shahaf
Citations: 297
h-index: 8
Oren Tropp
Oren Tropp
Citations: 79
h-index: 4
Ehud Karpas
Ehud Karpas
Citations: 355
h-index: 7
Ran Zilberstein
Ran Zilberstein
Citations: 100
h-index: 4
Jiaqi Zeng
Jiaqi Zeng
Citations: 797
h-index: 10
Soumye Singhal
Soumye Singhal
NVIDIA
Citations: 353
h-index: 8
A. Bukharin
A. Bukharin
Citations: 1,422
h-index: 12
Yian Zhang
Yian Zhang
ML^2
Citations: 1,994
h-index: 11
Tugrul Konuk
Tugrul Konuk
Citations: 204
h-index: 5
Gerald Shen
Gerald Shen
Citations: 658
h-index: 9
Ameya Mahabaleshwarkar
Ameya Mahabaleshwarkar
Citations: 409
h-index: 9
Bilal Kartal
Bilal Kartal
Citations: 122
h-index: 4
Yoshi Suhara
Yoshi Suhara
Citations: 430
h-index: 9
Olivier Delalleau
Olivier Delalleau
Citations: 813
h-index: 10
Zijia Chen
Zijia Chen
Citations: 266
h-index: 6
Zhilin Wang
Zhilin Wang
Citations: 85
h-index: 3
David Mosallanezhad
David Mosallanezhad
Citations: 188
h-index: 6
Adi Renduchintala
Adi Renduchintala
Citations: 217
h-index: 7
Haifeng Qian
Haifeng Qian
Citations: 138
h-index: 5
Dima Rekesh
Dima Rekesh
Citations: 1,038
h-index: 8
Fei Jia
Fei Jia
Citations: 274
h-index: 4
Somshubra Majumdar
Somshubra Majumdar
Citations: 3,917
h-index: 23
V. Noroozi
V. Noroozi
Citations: 2,272
h-index: 18
W. Ahmad
W. Ahmad
Citations: 311
h-index: 8
Sean Narenthiran
Sean Narenthiran
Citations: 707
h-index: 10
Aleksander Ficek
Aleksander Ficek
Citations: 410
h-index: 9
Mehrzad Samadi
Mehrzad Samadi
Citations: 210
h-index: 6
Jocelyn Huang
Jocelyn Huang
Citations: 1,039
h-index: 11
Siddhartha Jain
Siddhartha Jain
Citations: 441
h-index: 6
Igor Gitman
Igor Gitman
Citations: 2,361
h-index: 14
Ivan Moshkov
Ivan Moshkov
Citations: 685
h-index: 8
Wei Du
Wei Du
Citations: 421
h-index: 7
Shubham Toshniwal
Shubham Toshniwal
Citations: 4,541
h-index: 23
George Armstrong
George Armstrong
Citations: 99
h-index: 4
B. Kisačanin
B. Kisačanin
Citations: 626
h-index: 13
Matvei Novikov
Matvei Novikov
Citations: 220
h-index: 6
Daria Gitman
Daria Gitman
Citations: 309
h-index: 5
E. Bakhturina
E. Bakhturina
Citations: 863
h-index: 13
Prasoon Varshney
Prasoon Varshney
Citations: 337
h-index: 6
Jane Scowcroft
Jane Scowcroft
Citations: 498
h-index: 8
John Kamalu
John Kamalu
Citations: 544
h-index: 7
Dan Su
Dan Su
Citations: 406
h-index: 8
Kezhi Kong
Kezhi Kong
Citations: 315
h-index: 7
Markus Kliegl
Markus Kliegl
Citations: 726
h-index: 11
Rabeeh Karimi
Rabeeh Karimi
Citations: 108
h-index: 2
Ying Lin
Ying Lin
Citations: 332
h-index: 9
S. Satheesh
S. Satheesh
Citations: 49,266
h-index: 18
Jupinder Parmar
Jupinder Parmar
Citations: 433
h-index: 9
Pritam Gundecha
Pritam Gundecha
Citations: 831
h-index: 13
Joseph Jennings
Joseph Jennings
Citations: 421
h-index: 8
Shrimai Prabhumoye
Shrimai Prabhumoye
Citations: 291
h-index: 11
Syeda Nahida Akter
Syeda Nahida Akter
Citations: 293
h-index: 8
Abhinav Khattar
Abhinav Khattar
Citations: 290
h-index: 8
Deepak Narayanan
Deepak Narayanan
Citations: 497
h-index: 8
R. Waleffe
R. Waleffe
Citations: 607
h-index: 12
Jimmy Zhang
Jimmy Zhang
Citations: 505
h-index: 7
Bor-Yiing Su
Bor-Yiing Su
Citations: 2,648
h-index: 12
Terry Kong
Terry Kong
Citations: 163
h-index: 4
Parth Chadha
Parth Chadha
Citations: 165
h-index: 4
Sahil Jain
Sahil Jain
Citations: 219
h-index: 7
Christine Harvey
Christine Harvey
Citations: 108
h-index: 2
Elad Segal
Elad Segal
Citations: 3,785
h-index: 7
Jining Huang
Jining Huang
Citations: 242
h-index: 4
Sergey Kashirsky
Sergey Kashirsky
Citations: 124
h-index: 4
Robert Mcqueen
Robert Mcqueen
Citations: 61
h-index: 1
Izzy Putterman
Izzy Putterman
Citations: 79
h-index: 3
Arun Venkatesan
Arun Venkatesan
Citations: 99
h-index: 2
Sherry Wu
Sherry Wu
Citations: 90
h-index: 3
Manoj Kilaru
Manoj Kilaru
Citations: 81
h-index: 3
Anna Warno
Anna Warno
Citations: 61
h-index: 1
Abhilash Somasamudramath
Abhilash Somasamudramath
Citations: 60
h-index: 1
Sandip Bhaskar
Sandip Bhaskar
Citations: 60
h-index: 1
Maka Dong
Maka Dong
Citations: 60
h-index: 1
Nave Assaf
Nave Assaf
Citations: 127
h-index: 4
Shahar Mor
Shahar Mor
Citations: 87
h-index: 4
Omer Ullman Argov
Omer Ullman Argov
Citations: 87
h-index: 4
Scot Junkin
Scot Junkin
Citations: 60
h-index: 1
Pedro Larroy
Pedro Larroy
Citations: 1,023
h-index: 4
Monika Katariya
Monika Katariya
Citations: 60
h-index: 1
Marco Rovinelli
Marco Rovinelli
Citations: 60
h-index: 1
Viji Balas
Viji Balas
Citations: 60
h-index: 1
Nicholas Edelman
Nicholas Edelman
Citations: 60
h-index: 1
Anahita Bhiwandiwalla
Anahita Bhiwandiwalla
Citations: 502
h-index: 11
Muthu Subramaniam
Muthu Subramaniam
Citations: 60
h-index: 1
Smita Ithape
Smita Ithape
Citations: 76
h-index: 3
Yuting Wu
Yuting Wu
Citations: 61
h-index: 1
S. Velury
S. Velury
Citations: 60
h-index: 1
Omri Almog
Omri Almog
Citations: 110
h-index: 2
Joyjit Daw
Joyjit Daw
Citations: 124
h-index: 4
Denys Fridman
Denys Fridman
Citations: 256
h-index: 4
Erick Galinkin
Erick Galinkin
Citations: 226
h-index: 6
Michael Evans
Michael Evans
Citations: 168
h-index: 4
K. Luna
K. Luna
Citations: 168
h-index: 4
Leon Derczynski
Leon Derczynski
Citations: 338
h-index: 7
Nikki Pope
Nikki Pope
Citations: 117
h-index: 4
E. Long
E. Long
Citations: 207
h-index: 6
Guillermo Siman
Guillermo Siman
Citations: 60
h-index: 1
Tomasz Grzegorzek
Tomasz Grzegorzek
Citations: 178
h-index: 2
Pablo Ribalta
Pablo Ribalta
Citations: 375
h-index: 4
Joey Conway
Joey Conway
Citations: 188
h-index: 5
Trisha Saar
Trisha Saar
Citations: 226
h-index: 3
Ann Guan
Ann Guan
Citations: 115
h-index: 4
Krzysztof Pawelec
Krzysztof Pawelec
Citations: 281
h-index: 5
Shyamala Prayaga
Shyamala Prayaga
Citations: 108
h-index: 2
Oleksii Kuchaiev
Oleksii Kuchaiev
NVIDIA
Citations: 6,372
h-index: 29
Boris Ginsburg
Boris Ginsburg
Citations: 306
h-index: 9
O. Olabiyi
O. Olabiyi
Citations: 831
h-index: 14
Kari Briski
Kari Briski
Citations: 163
h-index: 4
Jonathan Cohen
Jonathan Cohen
Citations: 60
h-index: 1
Bryan Catanzaro
Bryan Catanzaro
Citations: 3,307
h-index: 29
Jonah Alben
Jonah Alben
Citations: 2,555
h-index: 5
Yonatan Geifman
Yonatan Geifman
Citations: 1,528
h-index: 9
Eric Chung
Eric Chung
Citations: 183
h-index: 5
Guyue Huang
Guyue Huang
Citations: 792
h-index: 10
G. Lam
G. Lam
Citations: 77
h-index: 3
V. Nguyen
V. Nguyen
Citations: 65
h-index: 2
Andrew Wang
Andrew Wang
Citations: 129
h-index: 4
Seth Schneider
Seth Schneider
Citations: 110
h-index: 4
K. Ramamoorthy
K. Ramamoorthy
Citations: 98
h-index: 4
O.O. Romanenko
O.O. Romanenko
Citations: 60
h-index: 1

우리는 뛰어난 추론 능력, 추론 효율성, 그리고 기업용 오픈 라이선스를 제공하는 이기종 추론 모델의 개방형 제품군인 Llama-Nemotron 모델 시리즈를 소개합니다. 이 제품군은 Nano(8B), Super(49B), Ultra(253B)의 세 가지 크기로 제공되며, DeepSeek-R1과 같은 최첨단 추론 모델과 경쟁할 수 있는 성능을 발휘하는 동시에 더 우수한 추론 처리량과 메모리 효율성을 제공합니다. 본 보고서에서는 가속화된 추론을 위해 Llama 3 모델 기반의 신경망 아키텍처 탐색, 지식 증류, 지속적인 사전 훈련을 활용하고, 이어서 지도 미세 조정과 대규모 강화 학습이라는 두 가지 주요 부분으로 구성된 추론 중심의 사후 훈련 단계를 거치는 이 모델들의 훈련 절차에 대해 논의합니다. Llama-Nemotron 모델은 동적 추론 토글을 지원하는 최초의 오픈 소스 모델로서, 사용자가 추론 중에 표준 채팅 모드와 추론 모드 사이를 전환할 수 있도록 합니다. 열린 연구를 더욱 지원하고 모델 개발을 촉진하기 위해 우리는 다음의 리소스를 제공합니다: 1. 상업적 이용이 허용되는 NVIDIA 오픈 모델 라이선스 계약하에 Llama-Nemotron 추론 모델(LN-Nano, LN-Super, LN-Ultra)을 공개합니다. 2. 전체 사후 훈련 데이터셋인 Llama-Nemotron-Post-Training-Dataset을 공개합니다. 3. 훈련 코드베이스인 NeMo, NeMo-Aligner, Megatron-LM을 공개합니다.

Original Abstract

We introduce the Llama-Nemotron series of models, an open family of heterogeneous reasoning models that deliver exceptional reasoning capabilities, inference efficiency, and an open license for enterprise use. The family comes in three sizes -- Nano (8B), Super (49B), and Ultra (253B) -- and performs competitively with state-of-the-art reasoning models such as DeepSeek-R1 while offering superior inference throughput and memory efficiency. In this report, we discuss the training procedure for these models, which entails using neural architecture search from Llama 3 models for accelerated inference, knowledge distillation, and continued pretraining, followed by a reasoning-focused post-training stage consisting of two main parts: supervised fine-tuning and large scale reinforcement learning. Llama-Nemotron models are the first open-source models to support a dynamic reasoning toggle, allowing users to switch between standard chat and reasoning modes during inference. To further support open research and facilitate model development, we provide the following resources: 1. We release the Llama-Nemotron reasoning models -- LN-Nano, LN-Super, and LN-Ultra -- under the commercially permissive NVIDIA Open Model License Agreement. 2. We release the complete post-training dataset: Llama-Nemotron-Post-Training-Dataset. 3. We also release our training codebases: NeMo, NeMo-Aligner, and Megatron-LM.

64 Citations
5 Influential
14.5 Altmetric
146.5 Score

AI Analysis

Korean Summary

이 논문은 NVIDIA가 Llama 3 모델을 기반으로 개발한 개방형 추론 모델 시리즈인 'Llama-Nemotron(Nano, Super, Ultra)'을 소개합니다. 이 모델들은 신경망 아키텍처 탐색(NAS), 지식 증류, 그리고 대규모 강화학습(RL)을 결합하여 DeepSeek-R1과 같은 최첨단 모델과 대등한 추론 성능을 유지하면서도 추론 효율성을 극대화했습니다. 특히 'Puzzle' 프레임워크를 통해 하드웨어 효율적인 구조로 변환되었으며, 사용자가 시스템 프롬프트를 통해 '상세한 사고(detailed thinking)' 모드를 켜거나 끌 수 있는 동적 토글 기능을 제공하여 유연한 비용 관리 및 응답 스타일 제어가 가능합니다.

Key Innovations

  • Puzzle 프레임워크 기반 NAS (블록 단위 증류 및 Attention 메커니즘 제거)
  • FFN Fusion (연속된 FFN 블록을 병합하여 레이어 깊이 및 지연 시간 감소)
  • 동적 추론 토글 (Dynamic Reasoning Toggle) 기능
  • 대규모 강화학습을 위한 GRPO(Group Relative Policy Optimization) 알고리즘 적용
  • FP8 생성 및 훈련/추론 병합을 통한 인프라 메모리 최적화

Learning & Inference Impact

추론 측면에서는 NAS와 FFN Fusion 기술을 통해 불필요한 연산을 줄여, LN-Ultra(253B) 모델이 단일 8xH100 노드에서 DeepSeek-R1보다 높은 처리량으로 구동될 수 있게 하였습니다. 학습 과정에서는 강력한 교사 모델의 추론 과정을 증류(SFT)한 후, 대규모 RL을 적용하여 학생 모델이 교사 모델의 성능 한계를 뛰어넘도록(Self-improvement) 설계되었습니다. 또한, 커리큘럼 학습 방식을 도입하여 학습 안정성을 높였습니다.

Technical Difficulty

고급

Estimated implementation complexity based on methodology.

댓글

댓글을 작성하려면 로그인하세요.

아직 댓글이 없습니다. 첫 번째 댓글을 남겨보세요!