2602.15252v1 Feb 16, 2026 cs.GT

불완전한 기억 하의 의사 결정: 알고리즘 및 벤치마크

Decision Making under Imperfect Recall: Algorithms and Benchmarks

Emanuel Tewolde

Citations: 245

h-index: 7

I. Anagnostides

Citations: 138

h-index: 7

T. Sandholm

Citations: 25

h-index: 3

Vincent Conitzer

Citations: 232

h-index: 7

B. Zhang

Citations: 6,943

h-index: 13

게임 이론에서 불완전한 기억 의사 결정 문제는 에이전트가 과거에 보유했던 정보를 잊는 상황을 모델링합니다. 이는 '건망증이 있는 운전자' 게임 및 제한된 통신을 사용하는 팀 게임과 같은 게임을 포함합니다. 본 논문에서는 불완전한 기억 의사 결정 문제에 대한 최초의 벤치마크 세트를 소개합니다. 당사의 벤치마크는 다양한 문제 유형을 포괄하며, 여기에는 민감한 정보를 수집하는 AI 시스템의 개인 정보 보호 및 시뮬레이션 환경에서 에이전트를 테스트하여 AI 안전성을 확보하는 것과 관련된 문제들이 포함됩니다. 본 벤치마크 세트를 사용하여 생성된 61개의 문제 인스턴스에 대해, 이러한 문제에서 1차 최적 전략을 찾는 다양한 알고리즘의 성능을 평가했습니다. 특히, 비선형 제약 조건 최적화에 대한 회한 매칭(Regret Matching, RM) 알고리즘 패밀리를 소개합니다. 이 파라미터가 없는 알고리즘 패밀리는 대규모 2인 영-합 게임을 해결하는 데 큰 성공을 거두었지만, 놀랍게도 이전에 해당 분야를 넘어 널리 연구되지 않았습니다. 주요 결과는 RM 알고리즘이 일반적으로 사용되는 1차 최적화 알고리즘(예: 투영 경사 하강법)보다 일관되게 우수한 성능을 보이며, 종종 수치적으로 큰 차이를 보인다는 것입니다. 이를 통해 RM 패밀리가 대규모 제약 조건 최적화 문제에 대한 강력한 접근 방식임을 처음으로 입증했습니다.

Original Abstract

In game theory, imperfect-recall decision problems model situations in which an agent forgets information it held before. They encompass games such as the ``absentminded driver'' and team games with limited communication. In this paper, we introduce the first benchmark suite for imperfect-recall decision problems. Our benchmarks capture a variety of problem types, including ones concerning privacy in AI systems that elicit sensitive information, and AI safety via testing of agents in simulation. Across 61 problem instances generated using this suite, we evaluate the performance of different algorithms for finding first-order optimal strategies in such problems. In particular, we introduce the family of regret matching (RM) algorithms for nonlinear constrained optimization. This class of parameter-free algorithms has enjoyed tremendous success in solving large two-player zero-sum games, but, surprisingly, they were hitherto relatively unexplored beyond that setting. Our key finding is that RM algorithms consistently outperform commonly employed first-order optimizers such as projected gradient descent, often by orders of magnitude. This establishes, for the first time, the RM family as a formidable approach to large-scale constrained optimization problems.

2 Citations

0 Influential

6.5 Altmetric

34.5 Score

Original PDF

No Analysis Report Yet

This paper hasn't been analyzed by Gemini yet.

댓글을 작성하려면 로그인하세요.

아직 댓글이 없습니다. 첫 번째 댓글을 남겨보세요!