Hongtao Liu

Total Citations
95
h-index
5
Papers
3

Publications

#1 2606.05749v1 Jun 04, 2026

MARDoc: A Memory-Aware Refinement Agent Framework for Multimodal Long Document QA

Iterative retrieval-reasoning agents have recently shown promise for multimodal long-document question answering. However, most existing systems maintain a single growing context that mixes retrieval traces, observations, and intermediate reasoning. As interactions accumulate, key evidence becomes scattered and diluted, making multi-hop reasoning noisy. We propose MARDoc, a Memory-Aware Refinement Agent framework that decouples long-document QA into three specialized agents: an Explorer for multi-granularity multimodal retrieval, a Refiner for distilling interaction traces into structured evidence and reasoning memories, and a Reflector for checking evidence sufficiency and providing targeted feedback. Across iterations, the agents rely on a dynamically updated structured memory rather than a full accumulated interaction history. This design reduces context noise while preserving answer-critical facts and their logical dependencies. Experiments on MMLongBench-Doc and DocBench show that MARDoc achieves strong results, outperforming same-backbone baselines and demonstrating the effectiveness of structured memory for agentic document QA.

Hongtao Liu Jian Yang Qiyao Peng Kaifeng Chen Yongqiang Liu +2
0 Citations
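
The Explorer, Refiner, and Reflector loop described in the MARDoc abstract can be illustrated with a toy sketch. This is not the authors' implementation: keyword overlap stands in for multimodal retrieval, and every name here (`Memory`, `explore`, `refine`, `reflect`, `answer_loop`) is hypothetical. The point it tries to show is that each iteration reads only the compact structured memory, not the full interaction history.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Structured memory carried across iterations instead of the full trace."""
    evidence: list = field(default_factory=list)   # (page index, page text) pairs
    reasoning: list = field(default_factory=list)  # short notes on why each was kept


def explore(pages, query, seen):
    """Toy Explorer: unseen pages sharing at least one word with the query."""
    terms = set(query.lower().split())
    return [(i, p) for i, p in enumerate(pages)
            if i not in seen and terms & set(p.lower().split())]


def refine(memory, hits, query):
    """Toy Refiner: keep one compact entry per hit and discard the raw trace."""
    for i, page in hits:
        memory.evidence.append((i, page))
        memory.reasoning.append(f"page {i} matched query '{query}'")
    return memory


def reflect(memory, required_terms):
    """Toy Reflector: list the required terms not yet covered by evidence."""
    covered = set()
    for _, page in memory.evidence:
        covered |= set(page.lower().split())
    return [t for t in required_terms if t not in covered]


def answer_loop(pages, question_terms, max_iters=3):
    """Iterate until the Reflector finds the evidence sufficient (assumed stop rule)."""
    memory = Memory()
    query = " ".join(question_terms)
    for _ in range(max_iters):
        seen = {i for i, _ in memory.evidence}
        memory = refine(memory, explore(pages, query, seen), query)
        missing = reflect(memory, question_terms)
        if not missing:
            break
        query = " ".join(missing)  # targeted feedback drives the next retrieval
    return memory
```

Calling `answer_loop(["revenue grew in 2023", "the ceo is alice", "unrelated text"], ["revenue", "ceo"])` keeps only the two relevant pages in memory; the unrelated page never enters the context.
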
#2 2601.10457v1 Jan 15, 2026

NSR-Boost: A Neuro-Symbolic Residual Boosting Framework for Industrial Legacy Models

Although Gradient Boosted Decision Trees (GBDTs) dominate industrial tabular applications, upgrading legacy models in high-concurrency production environments still faces prohibitive retraining costs and systemic risks. To address this problem, we present NSR-Boost, a neuro-symbolic residual boosting framework designed specifically for industrial scenarios. Its core advantage lies in being "non-intrusive": it treats the legacy model as frozen and performs targeted repairs on "hard regions" where predictions fail. The framework comprises three key stages: first, finding hard regions through residuals; then, building interpretable experts by generating symbolic code structures with a Large Language Model (LLM) and fine-tuning their parameters with Bayesian optimization; and finally, dynamically integrating the experts with the legacy model's output through a lightweight aggregator. We report on the successful deployment of NSR-Boost within the core financial risk control system at Qfin Holdings. The framework not only significantly outperforms state-of-the-art (SOTA) baselines across six public datasets and one private dataset but, more importantly, also shows excellent performance gains on real-world online data. In conclusion, it effectively captures long-tail risks missed by traditional models and offers a safe, low-cost evolutionary paradigm for industry.

Ziming Dai Dabiao Ma Jian Yang Haojun Fei J. Tong +3
0 Citations
#3 2601.10457v2 Jan 15, 2026

NSR-Boost: A Neuro-Symbolic Residual Boosting Framework for Industrial Legacy Models

Although Gradient Boosted Decision Trees (GBDTs) dominate industrial tabular applications, upgrading legacy models in high-concurrency production environments still faces prohibitive retraining costs and systemic risks. To address this problem, we present NSR-Boost, a neuro-symbolic residual boosting framework designed specifically for industrial scenarios. Its core advantage lies in being "non-intrusive": it treats the legacy model as frozen and performs targeted repairs on "hard regions" where predictions fail. The framework comprises three key stages: first, finding hard regions through residuals; then, building interpretable experts by generating symbolic code structures with a Large Language Model (LLM) and fine-tuning their parameters with Bayesian optimization; and finally, dynamically integrating the experts with the legacy model's output through a lightweight aggregator. Experimental results demonstrate that the framework significantly outperforms state-of-the-art (SOTA) baselines across six public datasets and one private dataset. More importantly, we report the successful deployment of NSR-Boost within the core financial risk control system of Qfin Holdings, where empirical results on real-world online traffic exhibit superior performance improvements and a significant reduction in the bad rate. In conclusion, it effectively captures long-tail risks missed by traditional models and offers a safe, low-cost evolutionary paradigm for industry.

Ziming Dai Dabiao Ma Jian Yang Haojun Fei J. Tong +3
0 Citations
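
The three stages named in the NSR-Boost abstract (find hard regions from residuals, fit an interpretable expert, aggregate with the frozen legacy output) can be illustrated on a one-feature toy problem. This is a sketch under loud assumptions, not the paper's method: the symbolic form `coef * x` is hand-written where the paper uses an LLM to propose it, a plain grid search replaces Bayesian optimization, a hard threshold gate replaces the lightweight aggregator, and all function names are hypothetical.

```python
def legacy_model(x):
    """Frozen legacy predictor (stand-in for a deployed GBDT); never retrained."""
    return x


def find_hard_region(xs, ys, tol=0.5):
    """Stage 1: smallest x whose absolute residual exceeds tol (None if none)."""
    hard = [x for x, y in zip(xs, ys) if abs(y - legacy_model(x)) > tol]
    return min(hard) if hard else None


def expert(x, coef):
    """Stage 2: hand-written symbolic form; an assumption, not an LLM output."""
    return coef * x


def fit_coef(xs, ys, start, candidates):
    """Grid search (standing in for Bayesian optimization) on hard-region residuals."""
    def sse(coef):
        return sum((y - legacy_model(x) - expert(x, coef)) ** 2
                   for x, y in zip(xs, ys) if x >= start)
    return min(candidates, key=sse)


def repaired_model(x, coef, start):
    """Stage 3: gated sum, so predictions outside the hard region are unchanged."""
    gate = 1.0 if x >= start else 0.0
    return legacy_model(x) + gate * expert(x, coef)


# Toy data: the legacy model is exact below x = 5 and underestimates above it.
xs = [float(i) for i in range(10)]
ys = [x if x < 5 else 1.5 * x for x in xs]

start = find_hard_region(xs, ys)
coef = fit_coef(xs, ys, start, [0.0, 0.25, 0.5, 0.75, 1.0])
```

Because the correction is added on top of the untouched `legacy_model`, inputs outside the hard region get exactly the legacy prediction, which is the "non-intrusive" property the abstract emphasizes.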