2602.16953v1 Feb 18, 2026 cs.AI

LLM4Cov: Execution-Aware Agentic Learning for High-coverage Testbench Generation

Hejia Zhang (Citations: 167, h-index: 5)

Jishen Zhao (Citations: 37, h-index: 3)

Zhongming Yu (Citations: 174, h-index: 5)

Haoxing Ren (Citations: 109, h-index: 5)

Brucek Khailany (Citations: 9,218, h-index: 46)

Chia-Tung Ho (Citations: 257, h-index: 7)

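LLM agents that use execution information are a promising way to learn from tool feedback, but that feedback is usually costly and slow to obtain, which makes online reinforcement learning (RL) impractical. High-coverage hardware verification illustrates the difficulty well, because it depends on industrial simulators and produces non-differentiable execution signals. This paper proposes LLM4Cov, an offline agent-learning framework that models verification as memoryless state transitions guided by deterministic evaluators. On top of this formulation, the authors introduce execution-validated data curation, policy-aware agentic data synthesis, and worst-state-prioritized sampling, which together enable scalable learning under execution constraints. They also curate a realistic benchmark, derived from an existing verification suite, using a revised evaluation protocol. With the proposed pipeline, a compact 4B-parameter model reaches a 69.2% coverage pass rate under agentic evaluation, outperforming its teacher by 5.3% and remaining competitive with models roughly ten times larger.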

Original Abstract

Execution-aware LLM agents offer a promising paradigm for learning from tool feedback, but such feedback is often expensive and slow to obtain, making online reinforcement learning (RL) impractical. High-coverage hardware verification exemplifies this challenge due to its reliance on industrial simulators and non-differentiable execution signals. We propose LLM4Cov, an offline agent-learning framework that models verification as memoryless state transitions guided by deterministic evaluators. Building on this formulation, we introduce execution-validated data curation, policy-aware agentic data synthesis, and worst-state-prioritized sampling to enable scalable learning under execution constraints. We further curate a reality-aligned benchmark adapted from an existing verification suite through a revised evaluation protocol. Using the proposed pipeline, a compact 4B-parameter model achieves 69.2% coverage pass rate under agentic evaluation, outperforming its teacher by 5.3% and demonstrating competitive performance against models an order of magnitude larger.
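The abstract names memoryless state transitions, deterministic evaluators, and worst-state-prioritized sampling without giving details, so the following is only a minimal sketch of how those ideas might fit together. Everything in it is an assumption made for illustration: the `State` fields, the `transition` helper, the softmax-style weighting on the coverage deficit, and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical memoryless state: a design, its current testbench,
    and the coverage a deterministic evaluator reported for that pair."""
    design_id: str
    testbench: str
    coverage: float  # fraction in [0, 1]


def transition(state, propose, evaluate):
    """One memoryless step: the next state depends only on the current state.

    `propose` stands in for the policy (an LLM in the paper's setting) and
    `evaluate` for a deterministic evaluator such as a simulator run.
    Both are placeholders supplied by the caller.
    """
    new_testbench = propose(state)
    new_coverage = evaluate(state.design_id, new_testbench)
    return State(state.design_id, new_testbench, new_coverage)


def worst_state_weights(states, temperature=0.1):
    """Assumed weighting: larger coverage deficit gives a larger weight."""
    return [math.exp((1.0 - s.coverage) / temperature) for s in states]


def sample_worst_first(states, k, rng, temperature=0.1):
    """Draw k states with replacement, favoring low-coverage states."""
    weights = worst_state_weights(states, temperature)
    return rng.choices(states, weights=weights, k=k)


if __name__ == "__main__":
    pool = [
        State("fifo", "tb_v1", 0.20),
        State("alu", "tb_v1", 0.90),
        State("uart", "tb_v1", 0.55),
    ]
    batch = sample_worst_first(pool, k=5, rng=random.Random(0))
    print([s.design_id for s in batch])
```

Lowering `temperature` in this sketch concentrates sampling on the worst-covered states, while raising it moves the draw toward uniform. Whether the paper uses a weighting of this kind is not stated in the abstract.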

3 Citations

0 Influential

23 Altmetric

118.0 Score
