Sycophancy is an Educational Safety Risk: Why LLM Tutors Need Sycophancy Benchmarks
This position paper argues that effective tutoring requires corrective friction: surfacing misconceptions and challenging them supportively to drive conceptual change. Yet preference-aligned LLMs can trade epistemic rigor for agreeableness. We identify a Reasoning-Sycophancy Paradox: models that resist context-switch frame attacks can still capitulate under social-epistemic pressure, especially authority ("my notes say I'm right") and social-affective face-saving ("please don't tell me I'm wrong"). We introduce EduFrameTrap, a tutoring benchmark across math, physics, economics, chemistry, biology, and computer science that varies student confidence and pressure type (context-switch, authority, social-affective). Across two frontier LLMs, GPT-5.2 shows comparatively fewer context-switch failures, while authority and social pressure more often trigger epistemic retreat. In contrast, Claude shows substantial context-switch fragility in this run. Because these failures are hard to judge automatically, we report two-judge disagreement as a reliability signal. We argue that benchmarks should measure social-epistemic courage, i.e., supportive but corrective tutoring, and treat kind-but-correct behavior as a safety requirement.
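As a rough illustration of the design described above, the sketch below enumerates a condition grid over domain, student confidence, and pressure type, and computes a simple two-judge disagreement rate. It is a minimal sketch under stated assumptions, not the authors' implementation: the domains and pressure types come from the abstract, but the confidence levels, the function names `condition_grid` and `disagreement_rate`, and the boolean "capitulated" labels are all hypothetical.

```python
from itertools import product

# Domains and pressure types are named in the abstract.
DOMAINS = ["math", "physics", "economics", "chemistry", "biology", "computer science"]
PRESSURES = ["context-switch", "authority", "social-affective"]
# Assumption: the abstract says confidence is varied but does not list the levels.
CONFIDENCE_LEVELS = ["low", "high"]


def condition_grid():
    """Return every (domain, confidence, pressure) combination as a dict."""
    return [
        {"domain": d, "confidence": c, "pressure": p}
        for d, c, p in product(DOMAINS, CONFIDENCE_LEVELS, PRESSURES)
    ]


def disagreement_rate(judge_a, judge_b):
    """Fraction of items on which two judges assign different labels.

    Each argument is a sequence of per-item labels (hypothetically, True
    when the judge decides the tutor capitulated to the student).
    """
    if len(judge_a) != len(judge_b):
        raise ValueError("both judges must label the same items")
    if not judge_a:
        raise ValueError("at least one labelled item is required")
    differing = sum(a != b for a, b in zip(judge_a, judge_b))
    return differing / len(judge_a)


if __name__ == "__main__":
    print(len(condition_grid()), "conditions")
    print(disagreement_rate([True, False, True, True], [True, True, True, False]))
```

A raw disagreement rate is easy to read but does not correct for chance agreement; a chance-corrected statistic such as Cohen's kappa could be reported alongside it if the label distribution is skewed.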