2602.21864v1 Feb 25, 2026 cs.CV

DynamicGTR: Leveraging Graph Topology Representation Preferences to Improve VLM Performance on Graph Question Answering (QA)

DynamicGTR: Leveraging Graph Topology Representation Preferences to Boost VLM Capabilities on Graph QAs

James T. Kwok (Citations: 2,318; h-index: 13)

Yanbin Wei (Citations: 167; h-index: 4)

Jiangyue Yan (Citations: 80; h-index: 4)

Chun Kang (Citations: 29; h-index: 3)

Yang Chen (Citations: 23; h-index: 3)

Huaizhong Liu (Citations: 11; h-index: 2)

Yu Zhang (Citations: 14; h-index: 3)

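Vision-Language Models (VLMs) have emerged as versatile solutions for zero-shot question answering (QA) across many domains. However, getting VLMs to understand structured graphs well enough to answer questions accurately and efficiently remains difficult. Existing approaches typically rely on a single graph topology representation (GTR), such as a fixed-style visual image or a unified text description. This "one-size-fits-all" strategy often ignores model-specific and task-specific preferences, leading to inaccurate or overly long responses to graph-related queries. To address this, we propose the DynamicGTR framework, which dynamically selects the optimal GTR for each query at inference time. This improves the zero-shot graph QA capabilities of VLMs and lets users set their preferred balance between accuracy and brevity. Extensive experiments show that DynamicGTR not only improves VLM-based graph algorithm QA performance but also transfers the experience gained on synthetic graph algorithm tasks to real-world applications such as link prediction and node classification, without any additional training. DynamicGTR also generalizes strongly across tasks, domains, and models, suggesting its potential as a flexible solution for a broad range of graph scenarios.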

Original Abstract

Vision-Language Models (VLMs) have emerged as versatile solutions for zero-shot question answering (QA) across various domains. However, enabling VLMs to effectively comprehend structured graphs and perform accurate, efficient QA remains challenging. Existing approaches typically rely on one single graph topology representation (GTR), such as fixed-style visual images or unified text descriptions. This "one-size-fits-all" strategy often neglects model-specific and task-specific preferences, resulting in inaccurate or over-lengthy responses to graph-related queries. To address this, we propose the DynamicGTR framework, which dynamically selects the optimal GTR for each query during inference, thereby enhancing the zero-shot graph QA capabilities of VLMs with a customizable accuracy and brevity trade-off. Extensive experiments show that DynamicGTR not only improves VLM-based graph algorithm QA performance but also successfully transfers the experience trained from synthetic graph algorithm tasks to real-world applications like link prediction and node classification, without any additional training. Additionally, DynamicGTR demonstrates strong transferability across tasks, domains, and models, suggesting its potential as a flexible solution for broad graph scenarios.
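
The abstract describes per-query selection of a GTR under a customizable accuracy and brevity trade-off, but it does not say how the selection is scored. The Python sketch below is only an illustration of that idea under stated assumptions: the `GTRCandidate` fields, the `select_gtr` function, the `brevity_weight` parameter, the scoring formula, and the example numbers are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GTRCandidate:
    """One candidate graph topology representation (hypothetical fields)."""
    name: str            # label for a visual or textual representation
    est_accuracy: float  # assumed estimate in [0, 1] of a correct answer
    est_tokens: float    # assumed estimate of the response length in tokens


def select_gtr(candidates, brevity_weight=0.5, max_tokens=1000.0):
    """Return the candidate with the best accuracy/brevity score.

    score = (1 - w) * accuracy - w * min(tokens / max_tokens, 1)
    where w = brevity_weight. This formula is an illustrative assumption,
    not the scoring rule used by DynamicGTR.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if not 0.0 <= brevity_weight <= 1.0:
        raise ValueError("brevity_weight must be in [0, 1]")

    def score(candidate):
        # Cap the length penalty at 1 so very long answers do not dominate.
        length_penalty = min(candidate.est_tokens / max_tokens, 1.0)
        return ((1.0 - brevity_weight) * candidate.est_accuracy
                - brevity_weight * length_penalty)

    return max(candidates, key=score)


# Made-up estimates for two hypothetical representations of the same query.
CANDIDATES = [
    GTRCandidate("visual_layout", est_accuracy=0.80, est_tokens=200.0),
    GTRCandidate("edge_list_text", est_accuracy=0.90, est_tokens=800.0),
]

accuracy_first = select_gtr(CANDIDATES, brevity_weight=0.0)
brevity_aware = select_gtr(CANDIDATES, brevity_weight=0.5)
```

With these made-up numbers, a weight of 0 picks the more accurate but longer `edge_list_text`, while a weight of 0.5 picks the shorter `visual_layout`, which is the kind of user-controlled trade-off the abstract refers to.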

5 Citations

0 Influential

6.5 Altmetric

37.5 Score
