2601.01718v1 Jan 05, 2026 cs.AI

Yuan3.0 Flash: An Open Multimodal Large Language Model for Enterprise Applications

Yujia Liu (Citations: 0; h-index: 0)
Sean Wang (Citations: 5; h-index: 1)
Louie Li (Citations: 1; h-index: 1)
Darcy Chen (Citations: 1; h-index: 1)
Allen Wang (Citations: 1; h-index: 1)
Jiangang Luo (Citations: 100; h-index: 3)
Xudong Zhao (Citations: 60; h-index: 5)
Gawain Ma (Citations: 1; h-index: 1)
Marcus Mao (Citations: 1; h-index: 1)
Claire Wang (Citations: 10; h-index: 2)
Hunter He (Citations: 1; h-index: 1)
Logan Chen (Citations: 2; h-index: 1)
Qasim Meng (Citations: 1; h-index: 1)
Penn Zheng (Citations: 4; h-index: 1)
O. Zhu (Citations: 20; h-index: 1)
Tong Yu (Citations: 33; h-index: 2)
Shawn Wu (Citations: 12; h-index: 2)
Carolyn Wang (Citations: 89; h-index: 3)
Z. Zhang (Citations: 2; h-index: 1)
Jason Wang (Citations: 1,079; h-index: 5)
Leo Zhang (Citations: 10; h-index: 1)
J. Jia (Citations: 628; h-index: 12)
C. Shen (Citations: 48; h-index: 3)
J. Gong (Citations: 19; h-index: 2)
Joseph Shen (Citations: 19; h-index: 1)

Original Abstract

We introduce Yuan3.0 Flash, an open-source Mixture-of-Experts (MoE) multimodal large language model with 3.7B activated parameters and 40B total parameters, designed to improve performance on enterprise-oriented tasks while remaining competitive on general-purpose tasks. To address the overthinking phenomenon commonly observed in Large Reasoning Models (LRMs), we propose Reflection-aware Adaptive Policy Optimization (RAPO), a novel RL training algorithm that effectively regulates overthinking behaviors. On enterprise-oriented tasks such as retrieval-augmented generation (RAG), complex table understanding, and summarization, Yuan3.0 Flash consistently achieves superior performance. It also demonstrates strong reasoning capabilities in domains such as mathematics and science, attaining accuracy comparable to frontier models while requiring only approximately 1/4 to 1/2 of the average tokens. Yuan3.0 Flash has been fully open-sourced to facilitate further research and real-world deployment: https://github.com/Yuan-lab-LLM/Yuan3.0.
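
The gap between 3.7B activated and 40B total parameters is what a sparse MoE layer typically produces: a gate scores every expert, but only the top-k experts run for a given token. The sketch below illustrates generic top-k routing and is not taken from the Yuan3.0 code base. The expert count (8), the value of k (2), the scalar "experts", and the function names `top_k_route` and `moe_forward` are all assumptions made for illustration; the abstract does not state the model's actual routing configuration.

```python
import math


def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    m = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, gate_logits, k):
    """Mix the outputs of the k selected experts; the others are never called."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))


# Hypothetical toy layer: 8 scalar experts, 2 active per token (assumed values).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
gate_logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

routes = top_k_route(gate_logits, k=2)
output = moe_forward(1.0, experts, gate_logits, k=2)

# Figures from the abstract: 3.7B activated out of 40B total parameters.
active_fraction = 3.7e9 / 40e9
print(f"selected experts: {[i for i, _ in routes]}")
print(f"active parameter fraction: {active_fraction:.2%}")
```

With the abstract's figures, roughly 9% of the parameters take part in any single forward pass, which is why per-token compute tracks the activated count rather than the total.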

1 Citation

0 Influential

52.18220981415 Altmetric

261.9 Score
