2602.05048v1 Feb 04, 2026 cs.AI

MINT: Minimal Information Neuro-Symbolic Tree for Objective-Driven Knowledge-Gap Reasoning and Active Elicitation

Zeyu Fang

Citations: 48

h-index: 5

Tian Lan

Citations: 120

h-index: 7

Mahdi Imani

Citations: 107

h-index: 6

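Joint planning through language-based interaction is a key area of human-AI teaming. Planning problems in the open world often involve incomplete information and unknowns, such as the objects involved or the human's goals and intents, which create knowledge gaps in joint planning. We address the problem of finding optimal interaction strategies that let an AI agent actively elicit human input in object-driven planning. To this end, we propose the Minimal Information Neuro-Symbolic Tree (MINT), which reasons about the impact of knowledge gaps, and we use self-play with MINT to optimize the agent's elicitation strategies and queries. Specifically, MINT builds a symbolic tree by generating propositions about possible human-AI interactions and consults a neural planning policy to estimate the uncertainty in planning outcomes caused by the remaining knowledge gaps. Finally, a large language model (LLM) searches and summarizes MINT's reasoning process and curates a set of queries that optimally elicit human input for the best planning performance. Considering a family of extended Markov decision processes with knowledge gaps, we analyze the return guarantee of MINT with active human elicitation. In evaluations on three benchmarks of increasing realism that involve unseen or unknown objects, MINT-based planning attains near-expert returns with only a limited number of questions per task and significantly improves rewards and success rates.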

Original Abstract

Joint planning through language-based interactions is a key area of human-AI teaming. Planning problems in the open world often involve various aspects of incomplete information and unknowns, e.g., objects involved, human goals/intents, thus leading to knowledge gaps in joint planning. We consider the problem of discovering optimal interaction strategies for AI agents to actively elicit human inputs in object-driven planning. To this end, we propose the Minimal Information Neuro-Symbolic Tree (MINT) to reason about the impact of knowledge gaps and leverage self-play with MINT to optimize the AI agent's elicitation strategies and queries. More precisely, MINT builds a symbolic tree by making propositions of possible human-AI interactions and by consulting a neural planning policy to estimate the uncertainty in planning outcomes caused by remaining knowledge gaps. Finally, we leverage an LLM to search and summarize MINT's reasoning process and curate a set of queries to optimally elicit human inputs for best planning performance. By considering a family of extended Markov decision processes with knowledge gaps, we analyze the return guarantee for a given MINT with active human elicitation. Our evaluation on three benchmarks involving unseen/unknown objects of increasing realism shows that MINT-based planning attains near-expert returns by issuing a limited number of questions per task while achieving significantly improved rewards and success rates.
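
The abstract gives no implementation details, so the following is only a toy sketch of the general idea it describes: ask a question only when the answer is expected to improve the planning outcome. It is not the authors' MINT algorithm. Everything here is a hypothetical stand-in: the hypotheses, the plans, the hand-written `estimated_return` (in place of a neural planning policy), the two candidate questions, and the greedy selection rule with a `min_gain` threshold.

```python
from itertools import product

# Hypothetical knowledge gap: two hidden attributes of an unknown object,
# with a uniform prior over the four combinations.
HYPOTHESES = list(product(["fragile", "sturdy"], ["hot", "cold"]))

# Hypothetical candidate plans the agent can execute.
PLANS = [{"grip": g, "gloves": w} for g in ("firm", "gentle") for w in (True, False)]

def estimated_return(plan, hyp):
    """Hand-written stand-in for a planning policy's return estimate."""
    material, temperature = hyp
    score = 1.0
    if plan["grip"] == "firm" and material == "fragile":
        score -= 0.8
    if plan["grip"] == "gentle" and material == "sturdy":
        score -= 0.1
    if not plan["gloves"] and temperature == "hot":
        score -= 0.5
    if plan["gloves"] and temperature == "cold":
        score -= 0.05
    return score

# Each candidate question maps a hypothesis to the answer a human would give.
QUESTIONS = {
    "Is it fragile?": lambda h: h[0],
    "Is it hot?": lambda h: h[1],
}

def best_plan_value(hyps):
    """Average return of the best single plan over the remaining hypotheses."""
    return max(sum(estimated_return(p, h) for h in hyps) / len(hyps) for p in PLANS)

def expected_return(asked, hyps):
    """Expected return when the agent plans after hearing answers to `asked`."""
    groups = {}
    for h in hyps:
        answers = tuple(QUESTIONS[q](h) for q in asked)
        groups.setdefault(answers, []).append(h)
    return sum(len(g) / len(hyps) * best_plan_value(g) for g in groups.values())

def pick_questions(hyps, budget, min_gain):
    """Greedily add the question with the largest expected-return gain."""
    asked = []
    current = expected_return(asked, hyps)
    while len(asked) < budget:
        remaining = [q for q in QUESTIONS if q not in asked]
        if not remaining:
            break
        best_q = max(remaining, key=lambda q: expected_return(asked + [q], hyps))
        gain = expected_return(asked + [best_q], hyps) - current
        if gain < min_gain:
            break  # the remaining gap barely affects the outcome; stop asking
        asked.append(best_q)
        current += gain
    return asked

if __name__ == "__main__":
    print(pick_questions(HYPOTHESES, budget=2, min_gain=0.03))
```

In this toy setup, raising `min_gain` trades a small amount of expected return for fewer questions, which loosely mirrors the abstract's claim of near-expert returns from a limited number of questions per task.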

3 Citations

0 Influential

3.5 Altmetric

20.5 Score
