2601.12731v1 Jan 19, 2026 cs.CL

A Shared Geometry of Difficulty in Multilingual Language Models

Stefano Civelli

Citations: 30

h-index: 3

Pietro Bernardelle

Citations: 34

h-index: 4

Nicolò Brunello

Citations: 27

h-index: 2

Gianluca Demartini

Citations: 144

h-index: 5

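In large language models (LLMs), problem-difficulty prediction means estimating how difficult a task is as judged by the model itself, and it is typically done by training linear probes on the model's internal representations. This study translates the AMC subset of the Easy2Hard benchmark into 21 languages and analyzes the multilingual geometry of problem difficulty in LLMs. Difficulty-related signals emerge at two distinct stages inside the model, in shallow (early-layer) and deep (later-layer) internal representations, and the two stages behave in functionally different ways. Probes trained on deep representations achieve high accuracy within the same language but generalize poorly across languages. In contrast, probes trained on shallow representations perform worse within the same language but generalize far better across languages. These results suggest that LLMs first form a language-independent representation of problem difficulty and then specialize it per language. This is consistent with prior findings in LLM interpretability showing that models tend to operate in an abstract conceptual space before producing language-specific outputs. The study shows that this two-stage representational process applies not only to semantic content but also to high-level meta-cognitive properties such as problem-difficulty estimation.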

Original Abstract

Predicting problem-difficulty in large language models (LLMs) refers to estimating how difficult a task is according to the model itself, typically by training linear probes on its internal representations. In this work, we study the multilingual geometry of problem-difficulty in LLMs by training linear probes using the AMC subset of the Easy2Hard benchmark, translated into 21 languages. We found that difficulty-related signals emerge at two distinct stages of the model internals, corresponding to shallow (early-layers) and deep (later-layers) internal representations, that exhibit functionally different behaviors. Probes trained on deep representations achieve high accuracy when evaluated on the same language but exhibit poor cross-lingual generalization. In contrast, probes trained on shallow representations generalize substantially better across languages, despite achieving lower within-language performance. Together, these results suggest that LLMs first form a language-agnostic representation of problem difficulty, which subsequently becomes language-specific. This closely aligns with existing findings in LLM interpretability showing that models tend to operate in an abstract conceptual space before producing language-specific outputs. We demonstrate that this two-stage representational process extends beyond semantic content to high-level meta-cognitive properties such as problem-difficulty estimation.
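The probing setup described in the abstract can be illustrated with a minimal sketch. It runs on synthetic data only and is not the authors' code or a reproduction of their results. Everything here is an assumption made for illustration: the language codes, the helper names (`make_toy_layer`, `fit_ridge_probe`, `transfer_matrix`, `run_toy_experiment`), the use of closed-form ridge regression as the linear probe, and Pearson correlation as the score. The toy "shallow" layer writes difficulty along one direction shared by all languages, while the toy "deep" layer uses a stronger but language-specific direction, so the pattern the sketch exhibits is built in by construction.

```python
import numpy as np

LANGS = ["en", "it", "ko"]  # hypothetical language codes, for illustration only


def make_toy_layer(rng, difficulty, shared, strength, dim=16, noise=1.0):
    """Synthetic 'hidden states' with difficulty written along one axis per language.

    shared=True  -> every language uses axis 0 (language-agnostic direction).
    shared=False -> language k uses its own axis k + 1 (language-specific direction).
    """
    feats = {}
    for k, lang in enumerate(LANGS):
        X = rng.normal(scale=noise, size=(len(difficulty), dim))
        axis = 0 if shared else k + 1
        X[:, axis] += strength * difficulty
        feats[lang] = X
    return feats


def fit_ridge_probe(X, y, lam=1.0):
    """Linear probe via closed-form ridge regression; returns (weights, bias)."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc, yc = X - mu_x, y - mu_y
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return w, mu_y - mu_x @ w


def probe_correlation(probe, X, y):
    """Pearson correlation between probe predictions and true difficulty."""
    w, b = probe
    return float(np.corrcoef(X @ w + b, y)[0, 1])


def transfer_matrix(feats, difficulty, n_train):
    """M[i, j]: probe trained on language i, scored on held-out problems in language j."""
    M = np.zeros((len(LANGS), len(LANGS)))
    for i, src in enumerate(LANGS):
        probe = fit_ridge_probe(feats[src][:n_train], difficulty[:n_train])
        for j, tgt in enumerate(LANGS):
            M[i, j] = probe_correlation(probe, feats[tgt][n_train:], difficulty[n_train:])
    return M


def summarize(M):
    """Return (mean within-language score, mean cross-language score)."""
    k = M.shape[0]
    within = float(np.trace(M) / k)
    cross = float((M.sum() - np.trace(M)) / (k * (k - 1)))
    return within, cross


def run_toy_experiment(seed=0, n_train=400, n_test=200):
    rng = np.random.default_rng(seed)
    # One difficulty score per problem; rows are aligned across translations.
    difficulty = rng.normal(size=n_train + n_test)
    layers = {
        "shallow": make_toy_layer(rng, difficulty, shared=True, strength=1.5),
        "deep": make_toy_layer(rng, difficulty, shared=False, strength=3.0),
    }
    return {name: summarize(transfer_matrix(f, difficulty, n_train))
            for name, f in layers.items()}


if __name__ == "__main__":
    for name, (within, cross) in run_toy_experiment().items():
        print(f"{name}: within={within:.2f} cross={cross:.2f}")
```

With real models, the feature matrices would instead come from hidden states extracted at a chosen layer for each translated problem. The train-on-one-language, score-on-every-language loop in `transfer_matrix` is the part that carries over.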

0 Citations

0 Influential

2.5 Altmetric

12.5 Score
