2601.17178v1 Jan 23, 2026 cs.CR

TrojanGYM: A Detector-in-the-Loop LLM for Adaptive RTL Hardware Trojan Insertion

Minghao Shao

Citations: 454

h-index: 11

Saideep Sreekumar

Citations: 12

h-index: 2

Zeng Wang

Citations: 132

h-index: 6

Akashdeep Saha

Citations: 86

h-index: 6

W. Xiao

Citations: 46

h-index: 4

Muhammad Shafique

Citations: 56

h-index: 4

Ozgur Sinanoglu

Citations: 227

h-index: 7

Ramesh Karri

Citations: 134

h-index: 7

J. Knechtel

Citations: 1,459

h-index: 23

Original Abstract

Hardware Trojans (HTs) remain a critical threat because learning-based detectors often overfit to narrow trigger/payload patterns and to small, stylized benchmarks. We introduce TrojanGYM, an agentic, LLM-driven framework that automatically curates HT insertions to expose detector blind spots while preserving design correctness. Given high-level HT specifications, a suite of cooperating LLM agents (instantiated with GPT-4, LLaMA-3.3-70B, and Gemini-2.5Pro) proposes and refines RTL modifications that realize diverse triggers and payloads without impacting normal functionality. TrojanGYM implements a feedback-driven benchmark generation loop co-designed with HT detectors: constraint-aware syntactic checking and GNN-based HT detectors provide feedback that iteratively refines the HT specifications and insertion strategies to better surface detector blind spots. We further propose Robust-GNN4TJ, a new implementation of GNN4TJ with improved graph extraction, training robustness, and prediction reliability, especially on LLM-generated HT designs. On the most challenging TrojanGYM-generated benchmarks, Robust-GNN4TJ raises the HT detection rate from 0% to 60% relative to a prior GNN-based detector. We instantiate TrojanGYM on SRAM, AES-128, and UART designs at the RTL level and show that it systematically produces diverse, functionally correct HTs that reach evasion rates of up to 83.33% against modern GNN-based detectors, revealing robustness gaps that are not apparent when these detectors are evaluated solely on existing TrustHub-style benchmarks. After peer review, we will release all code and artifacts.
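
The feedback loop described in the abstract can be pictured with a minimal Python sketch. This is not the authors' implementation: the names `insertion_loop`, `propose`, `syntax_ok`, `detected`, `refine`, and `evasion_rate` are hypothetical, the LLM agents and the GNN-based detector are replaced by plain callables, and functional verification of the modified design is not modeled. It only shows the control flow, in which a failed syntax check or a detection sends feedback into the next specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    spec: str  # high-level HT specification (free text in this sketch)
    rtl: str   # RTL source proposed for that specification


def insertion_loop(
    spec: str,
    propose: Callable[[str], str],      # hypothetical stand-in for the LLM agents
    syntax_ok: Callable[[str], bool],   # hypothetical stand-in for the syntax check
    detected: Callable[[str], bool],    # hypothetical stand-in for the HT detector
    refine: Callable[[str, str], str],  # (spec, feedback tag) -> refined spec
    max_rounds: int = 5,
) -> Optional[Candidate]:
    """Return the first candidate that passes the syntax check and is not
    flagged by the detector, or None if max_rounds is exhausted."""
    for _ in range(max_rounds):
        rtl = propose(spec)
        if not syntax_ok(rtl):
            spec = refine(spec, "syntax_error")
            continue
        if detected(rtl):
            spec = refine(spec, "detected")
            continue
        return Candidate(spec=spec, rtl=rtl)
    return None


def evasion_rate(detector_flags: List[bool]) -> float:
    """Percentage of inserted HTs that the detector did not flag."""
    if not detector_flags:
        raise ValueError("need at least one sample")
    missed = sum(1 for flagged in detector_flags if not flagged)
    return 100.0 * missed / len(detector_flags)


if __name__ == "__main__":
    # Toy stubs: the "detector" misses any RTL whose comment mentions "stealthy".
    result = insertion_loop(
        "counter trigger",
        propose=lambda s: f"// trojan for: {s}\nmodule m; endmodule",
        syntax_ok=lambda rtl: "endmodule" in rtl,
        detected=lambda rtl: "stealthy" not in rtl,
        refine=lambda s, fb: s + " stealthy" if fb == "detected" else s,
    )
    print(result.spec if result else "no evasive candidate found")
```

As an arithmetic illustration only, `evasion_rate` returns about 83.33 when five of six samples go unflagged; the abstract does not state the sample counts behind its reported figure, so that ratio is an assumption.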

1 Citation

0 Influential

11.5 Altmetric

58.5 Score
