2602.14901v1 Feb 16, 2026 cs.LG

Picking the Right Specialist: Attentive Neural Process-based Selection of Task-Specialized Models as Tools for Agentic Healthcare Systems

Pramit Saha (Citations: 133, h-index: 7)
Joshua Strong (Citations: 136, h-index: 2)
M. Alsharid (Citations: 229, h-index: 8)
J. Noble (Citations: 179, h-index: 8)
D. Mishra (Citations: 72, h-index: 3)

Abstract

Task-specialized models form the backbone of agentic healthcare systems, enabling the agents to answer clinical queries across tasks such as disease diagnosis, localization, and report generation. Yet, for a given task, a single "best" model rarely exists. In practice, each task is better served by multiple competing specialist models where different models excel on different data samples. As a result, for any given query, agents must reliably select the right specialist model from a heterogeneous pool of tool candidates. To this end, we introduce ToolSelect, which adaptively learns model selection for tools by minimizing a population risk over sampled specialist tool candidates using a consistent surrogate of the task-conditional selection loss. Concretely, we propose an Attentive Neural Process-based selector conditioned on the query and per-model behavioral summaries to choose among the specialist models. Motivated by the absence of any established testbed, we, for the first time, introduce an agentic Chest X-ray environment equipped with a diverse suite of task-specialized models (17 disease detection, 19 report generation, 6 visual grounding, and 13 VQA) and develop ToolSelectBench, a benchmark of 1448 queries. Our results demonstrate that ToolSelect consistently outperforms 10 SOTA methods across four different task families.
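To make the selection idea concrete, the following is a minimal sketch of attention-based tool selection, not the authors' ToolSelect architecture. It assumes each candidate model comes with a small "behavioral summary": embeddings of past queries paired with a quality score the model achieved on them. The query attends over each summary, and the model with the highest attention-weighted expected quality is chosen. All names (`attentive_score`, `select_tool`, `summaries`, `model_a`, `model_b`) and the toy numbers are hypothetical. A full Attentive Neural Process would add learned encoders and a decoder, and would be trained with the surrogate selection loss described in the abstract; none of that is shown here.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - x.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def attentive_score(query, ctx_keys, ctx_values):
    """Scaled dot-product cross-attention of one query over one model's summary.

    query:      (d,)   embedding of the incoming query (assumed given)
    ctx_keys:   (n, d) embeddings of past queries seen by this model
    ctx_values: (n,)   quality the model achieved on those past queries
    Returns the attention-weighted expected quality as a float.
    """
    d = query.shape[0]
    weights = softmax(ctx_keys @ query / np.sqrt(d))
    return float(weights @ ctx_values)


def select_tool(query, summaries):
    """Score every candidate model and return (best_name, all_scores)."""
    scores = {
        name: attentive_score(query, keys, values)
        for name, (keys, values) in summaries.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy summaries: two models with complementary strengths.
context_keys = np.array([[1.0, 0.0], [0.0, 1.0]])
summaries = {
    "model_a": (context_keys, np.array([0.9, 0.1])),  # strong on query type 1
    "model_b": (context_keys, np.array([0.2, 0.8])),  # strong on query type 2
}
```

With these toy summaries, a query resembling the first context point (for example `np.array([5.0, 0.0])`) is routed to `model_a`, and one resembling the second is routed to `model_b`, which mirrors the abstract's point that different specialists excel on different samples.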

Citations: 0, Influential: 0, Altmetric: 4, Score: 20.0
