2601.19249v1 Jan 27, 2026 cs.AI

GLOVE: Global Verifier for LLM Memory-Environment Realignment

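Most existing memory-augmented large language model (LLM) approaches implicitly assume that memory validity can be established either by an external evaluator that supplies task-specific success signals or by internal model cognition, such as reflection used to edit memory entries. These assumptions often break down in real environments subject to dynamic drifts. This work proposes the Global Verifier (GLOVE), a framework that adds a new design dimension to LLM memory systems by establishing a relative notion of truth. Through active probing that detects inconsistencies between retrieved memories and fresh observations, GLOVE verifies and updates memory without ground-truth supervision or heavy reliance on model introspection, enabling memory-environment realignment. GLOVE is evaluated on diverse benchmarks spanning web navigation, planning, and control, extended with controlled environmental drifts that introduce non-stationarity beyond the original benchmark settings. The results show that GLOVE substantially improves agent success rates, suggesting a robust path toward cognitive agents capable of self-evolution.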

Original Abstract

Most existing memory-enhanced Large Language Model (LLM) approaches implicitly assume that memory validity can be established either through external evaluators that provide task-specific success signals or through internal model cognition, such as reflection, for editing memory entries. However, these assumptions often break down in practical environments with dynamic drifts. We propose the Global Verifier (GLOVE), a framework that introduces a new design dimension for LLM memory systems by establishing a relative notion of truth. Through active probing to detect inconsistencies between retrieved memories and fresh observations, GLOVE enables memory-environment realignment by verifying and updating memory without access to ground-truth supervision or strong reliance on model introspection. We evaluate GLOVE on diverse benchmarks spanning web navigation, planning, and control, augmented with controlled environmental drifts that introduce non-stationarity beyond the original benchmark settings. Our results show that GLOVE substantially improves agent success rates, suggesting a robust pathway to cognitive agents capable of self-evolving.
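The probing idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so `MemoryStore`, `probe_and_realign`, and `make_toy_env` are hypothetical names, and the rule "overwrite the entry when a fresh observation disagrees with it" is an assumed simplification of verifying memory against the environment rather than against ground-truth labels.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str]  # (state, action); hypothetical memory key


@dataclass
class MemoryStore:
    """Toy memory: maps (state, action) to the outcome the agent remembers."""
    entries: Dict[Key, str] = field(default_factory=dict)

    def retrieve(self, state: str, action: str) -> Optional[str]:
        return self.entries.get((state, action))

    def write(self, state: str, action: str, outcome: str) -> None:
        self.entries[(state, action)] = outcome


def make_toy_env(drifted: bool) -> Callable[[str, str], str]:
    """Hypothetical environment: the same action leads elsewhere after a drift."""
    def step(state: str, action: str) -> str:
        if (state, action) == ("cart", "click_submit"):
            return "login" if drifted else "checkout"
        return state  # any other action leaves the state unchanged
    return step


def probe_and_realign(memory: MemoryStore,
                      env_step: Callable[[str, str], str],
                      state: str, action: str) -> bool:
    """Probe the environment and compare against the retrieved memory.

    Returns True if the memory entry was missing or inconsistent and was
    overwritten with the fresh observation, False if it was confirmed.
    """
    remembered = memory.retrieve(state, action)
    observed = env_step(state, action)  # fresh observation from the probe
    if remembered == observed:
        return False  # memory agrees with the current environment
    memory.write(state, action, observed)  # realign memory to the environment
    return True
```

In this sketch the environment itself serves as the reference: an entry counts as valid only while a fresh probe reproduces it, which is one simple reading of a "relative notion of truth". A real agent would also need to decide when probing is worth its cost, which the abstract does not specify.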
