2605.27361v1 May 26, 2026 cs.AI

Natural Language Query to Configuration for Retrieval Agents

Negar Arabzadeh
Matei A. Zaharia
Esha Choukse
Melissa Z. Pan
Mathew Jacob
Fiodar Kazhamiaka

Modern retrieval agents expose many configuration choices -- LLM, retriever, number of documents, number of hops, and synthesis strategy -- each shaping both answer quality and serving cost. Today, these pipelines are typically hand-tuned once per workload, leaving substantial per-query optimization untapped. We formulate the problem: given a natural-language query and either an accuracy or a budget target, select from a predefined pipeline catalog the configuration that minimizes cost or maximizes accuracy at inference time. We propose **BRANE**, which uses an LLM to convert each query into workload-specific characteristics, then trains a lightweight per-configuration predictor that estimates whether the pipeline will answer the query correctly. At inference time, **BRANE** selects the configuration that maximizes predicted correctness penalized by cost, exposing a tunable cost-quality tradeoff without retraining. Across MuSiQue, BrowseComp-Plus, and FinanceBench, **BRANE** consistently pushes the cost-quality Pareto frontier, matches the best fixed configuration's accuracy at up to 89% lower cost, and outperforms LLM-routing, rule-based, and fine-tuned Qwen3-4B baselines. These results show that per-query configuration of the full retrieval pipeline is a practical alternative to static workload-level tuning.
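The inference-time selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the cost penalty, so the linear score `p_correct - cost_weight * cost` used here is an assumption, as are the names `Candidate`, `select_config`, `cost_weight`, and the example catalog values. The per-configuration correctness predictions are taken as given inputs rather than produced by a trained predictor.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """One pipeline configuration from the catalog (hypothetical structure)."""
    name: str
    p_correct: float  # predicted probability of a correct answer for this query
    cost: float       # estimated serving cost for this query, arbitrary units


def select_config(candidates: Sequence[Candidate], cost_weight: float) -> Candidate:
    """Pick the candidate maximizing predicted correctness minus a cost penalty.

    The linear penalty is an assumption made for illustration; cost_weight = 0
    ignores cost entirely, and larger values favor cheaper configurations.
    """
    if not candidates:
        raise ValueError("catalog must contain at least one configuration")
    return max(candidates, key=lambda c: c.p_correct - cost_weight * c.cost)


# Illustrative catalog for a single query (values are made up).
catalog = [
    Candidate("small_llm_single_hop", p_correct=0.55, cost=0.1),
    Candidate("mid_llm_two_hops", p_correct=0.80, cost=0.4),
    Candidate("large_llm_multi_hop", p_correct=0.90, cost=1.0),
]

# Sweeping the weight moves along the cost-quality tradeoff without retraining.
for weight in (0.0, 0.3, 1.0):
    print(weight, select_config(catalog, weight).name)
```

Because the weight is applied only at selection time, changing it does not require retraining the per-configuration predictors, which is consistent with the tunable tradeoff the abstract describes.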


