2604.18206v1 Apr 20, 2026 cs.AI

A Control Architecture for Training-Free Memory Use

Yan Lu

Citations: 29

h-index: 3

Zhicheng Qian

Citations: 829

h-index: 17

Xingyu Zhou

Citations: 6

h-index: 2

M. Jiang

Citations: 3

h-index: 1

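Using memory through prompt injection can improve reasoning without updating model weights, but it also raises a control problem: retrieved content is useful only when it is applied in the right state. This work addresses the problem in a strictly training-free setting and frames it as applicability control, that is, deciding when to run a memory-assisted second reasoning pass, when to trust it, and how to maintain the memory bank over time. The method combines uncertainty-based routing, confidence-based selective acceptance, selection between rule and exemplar memory banks, and evidence-based governance of the memory bank. Under a locked training-free protocol with compute-matched controls, it improves on the baseline by +7.0 points on SVAMP and +7.67 points on ASDiv. The same architecture also yields smaller positive effects on QA and agent benchmarks, and on the main arithmetic tasks a second checkpoint shows gains in the same positive direction. The main empirical finding on arithmetic is that the control architecture itself, rather than raw memory exposure, drives the gains on SVAMP and ASDiv. Mechanistically, confidence separates helpful rule-bank interventions from harmful ones, and under fixed retrieval the repair-versus-corrupt difference appears only in rows whose retrieved set actually contains the edited entries.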

Original Abstract

Prompt-injected memory can improve reasoning without updating model weights, but it also creates a control problem: retrieved content helps only when it is applied in the right state. We study this problem in a strict training-free setting and formulate it as applicability control: when to trigger a memory-assisted second pass, when to trust it, and how to maintain the memory bank over time. Our method combines uncertainty-based routing, confidence-based selective acceptance, bank selection across rule and exemplar memory, and evidence-based governance of the memory bank over time. Under a locked training-free protocol with compute-matched controls, it improves two core arithmetic benchmarks by +7.0 points on SVAMP and +7.67 points on ASDiv over baseline. The same architecture also transfers to QA and agent benchmarks with smaller positive effects and shows the same positive direction on a second checkpoint for the main arithmetic tasks. On arithmetic, the main empirical pattern is that the control architecture, rather than raw memory exposure, drives the improvements on SVAMP and ASDiv. Mechanistically, confidence separates helpful from harmful rule-bank interventions, and under fixed retrieval the repair-versus-corrupt difference localizes to rows whose retrieved set actually contains the edited entries.
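The control flow named in the abstract (route on uncertainty, accept the memory-assisted answer only when it is more confident, then prune the bank based on observed outcomes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Answer`, `MemoryEntry`, `controlled_answer`, and `govern`, the thresholds, and the use of a single scalar confidence are all assumptions made for illustration, and the abstract does not specify how confidence is measured or how evidence is collected.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Answer:
    text: str
    confidence: float  # assumed to lie in [0, 1]; the measure itself is hypothetical


@dataclass
class MemoryEntry:
    content: str
    helped: int = 0  # times the entry was used and the final answer was right
    harmed: int = 0  # times the entry was used and the final answer was wrong


def controlled_answer(
    question: str,
    first_pass: Callable[[str], Answer],
    second_pass: Callable[[str, List[MemoryEntry]], Answer],
    retrieve: Callable[[str], List[MemoryEntry]],
    route_threshold: float = 0.6,  # hypothetical value
    accept_margin: float = 0.05,   # hypothetical value
) -> Tuple[Answer, List[MemoryEntry]]:
    """Return the chosen answer and the memory entries it relied on."""
    base = first_pass(question)
    # Routing: a confident first pass never sees memory.
    if base.confidence >= route_threshold:
        return base, []
    entries = retrieve(question)
    if not entries:
        return base, []
    assisted = second_pass(question, entries)
    # Selective acceptance: keep the memory-assisted answer only if it is
    # more confident than the first pass by at least the margin.
    if assisted.confidence >= base.confidence + accept_margin:
        return assisted, entries
    return base, []


def govern(
    bank: List[MemoryEntry],
    used: List[MemoryEntry],
    was_correct: bool,
    min_trials: int = 3,          # hypothetical value
    max_harm_rate: float = 0.5,   # hypothetical value
) -> List[MemoryEntry]:
    """Record one outcome for the used entries and drop entries that mostly harm."""
    for entry in used:
        if was_correct:
            entry.helped += 1
        else:
            entry.harmed += 1
    kept = []
    for entry in bank:
        trials = entry.helped + entry.harmed
        if trials >= min_trials and entry.harmed / trials > max_harm_rate:
            continue  # enough evidence that this entry does more harm than good
        kept.append(entry)
    return kept
```

In this sketch the model calls are passed in as plain functions, so the routing and acceptance logic can be exercised with stubs. The `govern` step assumes that some correctness signal is available after the fact, which may not match how the paper gathers its evidence.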

0 Citations

0 Influential

8.5 Altmetric

42.5 Score
