2605.28070v1 May 27, 2026 cs.AI

Bridging the Detection-to-Abstention Gap in Reasoning Models under Insufficient Information

Jinjie Gu, Yihao Wang, Ye Chen, Yuan Wang, Renjie Gu, Yun Yue, Hansong Xiao, Chunxiao Guo, Peidan Wei, Yixin Cao, Jiaxu Li

We highlight a failure mode of large reasoning models on questions with insufficient information: models may recognize that a problem is under-specified, yet still continue reasoning and produce unsupported final answers instead of abstaining. We formalize this mismatch as the detection-to-abstention gap, where detected insufficiency fails to translate into final abstention. This gap is especially concerning in high-risk domains such as medical AI, where answers based on incomplete evidence can be more harmful than refusal. To close this gap, we propose Judge-Then-Solve (JTS), a trajectory-level reasoning-control framework that trains models to make an explicit answerability commitment before solution generation. Rather than treating abstention as a final-answer style, JTS casts it as a control decision: the model either proceeds to solve or terminates early based on its answerability judgment. We instantiate this policy through supervised warm-up and missing-premise reinforcement learning with consistency and length-shaping rewards. Experiments on dense and MoE reasoning models show that JTS substantially improves reliable abstention across datasets and pushes Abstention@Detection (A@D) to near-saturation, indicating that models not only detect missing information but also act on that detection. By terminating unanswerable trajectories immediately after the answerability judgment, JTS reduces unnecessary reasoning and improves inference efficiency when continued deliberation would amplify unsupported assumptions. We also observe that missing-premise training can alter reasoning behavior on difficult but answerable problems, reducing unproductive self-reflection. These results suggest that abstention under insufficient information is a key form of reasoning control for deploying reasoning models safely and efficiently.
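
The abstract names two ideas that a small sketch can make concrete: the Abstention@Detection (A@D) metric and the judge-then-solve control decision. The code below is a minimal illustration, not the authors' implementation. It assumes A@D is the fraction of trajectories that end in abstention among those whose reasoning flagged insufficient information; the abstract does not give the exact formula. The names `Trajectory`, `abstention_at_detection`, `judge_then_solve`, `ABSTAIN`, and the `judge`/`solve` callables are hypothetical. In the paper a single model makes the answerability commitment inside one trajectory, whereas the sketch separates the two steps into plain functions for clarity.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

ABSTAIN = "ABSTAIN"  # hypothetical marker for an abstaining final answer


@dataclass
class Trajectory:
    detected_insufficient: bool  # the reasoning flagged missing information
    abstained: bool              # the final output declined to answer


def abstention_at_detection(trajectories: Iterable[Trajectory]) -> float:
    """Assumed definition: P(abstain | insufficiency detected)."""
    detected = [t for t in trajectories if t.detected_insufficient]
    if not detected:
        return float("nan")  # undefined when nothing was detected
    return sum(t.abstained for t in detected) / len(detected)


def judge_then_solve(
    question: str,
    judge: Callable[[str], bool],  # True means "answerable" (assumed interface)
    solve: Callable[[str], str],
) -> str:
    """Commit to an answerability judgment first; solve only if answerable."""
    if not judge(question):
        return ABSTAIN  # terminate early, no solution tokens are generated
    return solve(question)
```

Under this assumed definition, a model that detects insufficiency in three trajectories but abstains in only two of them scores an A@D of 2/3; the remaining third is the detection-to-abstention gap. Because `solve` is never called when `judge` returns `False`, the sketch also mirrors the efficiency argument: unanswerable inputs incur no solving cost.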

