J

Jun Zhao

Total Citations
337
h-index
6
Papers
2

Publications

#1 2602.14035v1 Feb 15, 2026

FloCA: Towards Faithful and Logically Consistent Flowchart Reasoning

Flowchart-oriented dialogue (FOD) systems aim to guide users through multi-turn decision-making or operational procedures by following a domain-specific flowchart to achieve a task goal. In this work, we formalize flowchart reasoning in FOD as grounding user input to flowchart nodes at each dialogue turn while ensuring node transition is consistent with the correct flowchart path. Despite recent advances of LLMs in task-oriented dialogue systems, adapting them to FOD still faces two limitations: (1) LLMs lack an explicit mechanism to represent and reason over flowchart topology, and (2) they are prone to hallucinations, leading to unfaithful flowchart reasoning. To address these limitations, we propose FloCA, a zero-shot flowchart-oriented conversational agent. FloCA uses an LLM for intent understanding and response generation while delegating flowchart reasoning to an external tool that performs topology-constrained graph execution, ensuring faithful and logically consistent node transitions across dialogue turns. We further introduce an evaluation framework with an LLM-based user simulator and five new metrics covering reasoning accuracy and interaction efficiency. Extensive experiments on FLODIAL and PFDial datasets highlight the bottlenecks of existing LLM-based methods and demonstrate the superiority of FloCA. Our codes are available at https://github.com/Jinzi-Zou/FloCA-flowchart-reasoning.

Shuo Zhang Jinzi Zou Nuo Xu Bo Wang Liang Li +1
0 Citations
#2 2601.08311v1 Jan 13, 2026

Enhancing Image Quality Assessment Ability of LMMs via Retrieval-Augmented Generation

Large Multimodal Models (LMMs) have recently shown remarkable promise in low-level visual perception tasks, particularly in Image Quality Assessment (IQA), demonstrating strong zero-shot capability. However, achieving state-of-the-art performance often requires computationally expensive fine-tuning methods, which aim to align the distribution of quality-related token in output with image quality levels. Inspired by recent training-free works for LMM, we introduce IQARAG, a novel, training-free framework that enhances LMMs' IQA ability. IQARAG leverages Retrieval-Augmented Generation (RAG) to retrieve some semantically similar but quality-variant reference images with corresponding Mean Opinion Scores (MOSs) for input image. These retrieved images and input image are integrated into a specific prompt. Retrieved images provide the LMM with a visual perception anchor for IQA task. IQARAG contains three key phases: Retrieval Feature Extraction, Image Retrieval, and Integration & Quality Score Generation. Extensive experiments across multiple diverse IQA datasets, including KADID, KonIQ, LIVE Challenge, and SPAQ, demonstrate that the proposed IQARAG effectively boosts the IQA performance of LMMs, offering a resource-efficient alternative to fine-tuning for quality assessment.

Jun Zhao Xiongkuo Min Huiyu Duan Kang Fu Zicheng Zhang +3
0 Citations