arXiv:2601.21051v1 [cs.AI], Jan 28, 2026

Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report

Assaf Eisenman (citations: 15,463; h-index: 12)

Baturay Sağlam (citations: 217; h-index: 8)

Zhuoran Yang (citations: 182; h-index: 6)

Ed Li (citations: 25; h-index: 3)

Jianliang He (citations: 12; h-index: 1)

Aman Priyanshu (citations: 77; h-index: 3)

Paul Kassianik (citations: 697; h-index: 6)

Sajana Weerawardhena (citations: 51; h-index: 4)

Anu Vellore (citations: 44; h-index: 3)

Blaine Nelson (citations: 622; h-index: 5)

Neusha Javidnia (citations: 16; h-index: 2)

Arthur Goldblatt (citations: 12; h-index: 1)

Fraser Burch (citations: 40; h-index: 2)

Avi Zohary (citations: 28; h-index: 1)

Mahdi Sabbaghi (citations: 41; h-index: 3)

Supriti Vijay, Manipal Institute Of Technology (citations: 83; h-index: 5)

Rahim Dharssi (citations: 1; h-index: 1)

Dhruv Kedia (citations: 39; h-index: 2)

Kojin Oshiba (citations: 83; h-index: 5)

Yaron Singer (citations: 640; h-index: 5)

Amin Karbasi (citations: 98; h-index: 6)

Abstract

We present Foundation-Sec-8B-Reasoning, the first open-source native reasoning model for cybersecurity. Built upon our previously released Foundation-Sec-8B base model (derived from Llama-3.1-8B-Base), the model is trained through a two-stage process combining supervised fine-tuning (SFT) and reinforcement learning from verifiable rewards (RLVR). Our training leverages proprietary reasoning data spanning cybersecurity analysis, instruction-following, and mathematical reasoning. Evaluation across 10 cybersecurity benchmarks and 10 general-purpose benchmarks demonstrates performance competitive with significantly larger models on cybersecurity tasks while maintaining strong general capabilities. The model shows effective generalization on multi-hop reasoning tasks and strong safety performance when deployed with appropriate system prompts and guardrails. This work demonstrates that domain-specialized reasoning models can achieve strong performance on specialized tasks while maintaining broad general capabilities. We release the model publicly at https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning.

Citations: 1 (1 influential) · Altmetric: 26 · Score: 133.0
