2603.00532v1 Feb 28, 2026 cs.AI

DenoiseFlow: Uncertainty-Aware Denoising for Reliable LLM Agentic Workflows

Wei Wang (Citations: 0, h-index: 0)
Ruiting Dai (Citations: 18, h-index: 3)
Junwei Peng (Citations: 0, h-index: 0)
Shijie Li (Citations: 32, h-index: 3)
Yi-Tai Shang (Citations: 0, h-index: 0)
Can Deng (Citations: 59, h-index: 4)
Yong Zhao (Citations: 22, h-index: 2)
Jiaqi Zhu (Citations: 13, h-index: 2)
Yu Huang (Citations: 1, h-index: 1)
Yandong Yan (Citations: 9, h-index: 1)

Abstract

Autonomous agents are increasingly entrusted with complex, long-horizon tasks, ranging from mathematical reasoning to software generation. While agentic workflows facilitate these tasks by decomposing them into multi-step reasoning chains, reliability degrades significantly as the chain lengthens. Specifically, minor interpretation errors in natural-language instructions tend to compound silently across steps. We term this failure mode accumulated semantic ambiguity. Existing mitigation approaches often lack runtime adaptivity, relying instead on static exploration budgets, reactive error recovery, or single-path execution that ignores uncertainty entirely. We formalize the multi-step reasoning process as a Noisy MDP and propose DenoiseFlow, a closed-loop framework that performs progressive denoising through three coordinated stages: (1) Sensing estimates per-step semantic uncertainty; (2) Regulating adaptively allocates computation by routing between fast single-path execution and parallel exploration based on estimated risk; and (3) Correcting performs targeted recovery via influence-based root-cause localization. Online self-calibration continuously aligns decision boundaries with verifier feedback, requiring no ground-truth labels. Experiments on six benchmarks spanning mathematical reasoning, code generation, and multi-hop QA show that DenoiseFlow achieves the highest accuracy on every benchmark (83.3% average, +1.3% over the strongest baseline) while reducing cost by 40–56% through adaptive branching. Detailed ablation studies further confirm the robustness and generality of the framework. Code is available at https://anonymous.4open.science/r/DenoiseFlow-21D3/.
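
The Sensing, Regulating, and self-calibration steps described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify how uncertainty is estimated, how branching is scaled, or how the decision boundary is updated, so the function names (`semantic_entropy`, `route`, `recalibrate`), the entropy-based disagreement score, and the fixed-step threshold update are hypothetical stand-ins rather than the authors' method. The Correcting stage (influence-based root-cause localization) is not sketched.

```python
import math
from collections import Counter
from typing import Sequence


def semantic_entropy(interpretations: Sequence[str]) -> float:
    """Hypothetical Sensing proxy: normalized Shannon entropy over sampled
    interpretations of one step (0.0 = all agree, 1.0 = all differ)."""
    n = len(interpretations)
    counts = Counter(interpretations)
    if n <= 1 or len(counts) == 1:
        return 0.0
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(n)  # divide by the maximum entropy for n samples


def route(uncertainty: float, threshold: float, max_branches: int = 4) -> int:
    """Hypothetical Regulating rule: return 1 (fast single path) when the
    estimated risk is at or below the threshold, otherwise a branch count
    that grows with the excess uncertainty."""
    uncertainty = min(1.0, max(0.0, uncertainty))
    if uncertainty <= threshold:
        return 1
    excess = (uncertainty - threshold) / (1.0 - threshold)
    return max(2, min(max_branches, 1 + math.ceil(excess * (max_branches - 1))))


def recalibrate(threshold: float, uncertainty: float,
                verifier_passed: bool, step: float = 0.05) -> float:
    """Hypothetical self-calibration heuristic using only verifier feedback
    (no ground-truth labels): lower the threshold after a failed single-path
    step, raise it after a verified step that was explored in parallel."""
    took_single_path = uncertainty <= threshold
    if took_single_path and not verifier_passed:
        threshold -= step  # was overconfident: explore more next time
    elif not took_single_path and verifier_passed:
        threshold += step  # exploration may have been unnecessary
    return min(0.95, max(0.05, threshold))


if __name__ == "__main__":
    samples = ["sum the list", "sum the list", "sum absolute values"]
    u = semantic_entropy(samples)
    print(f"uncertainty={u:.3f}, branches={route(u, threshold=0.5)}")
```

The threshold is clamped to [0.05, 0.95] so that neither execution mode is ever ruled out entirely; that bound, like the step size, is an arbitrary choice for the sketch.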
