2601.12913v1 Jan 19, 2026 cs.AI

Actionable Interpretability Must Be Defined in Terms of Symmetries

M. Jamnik (Citations: 2,824; h-index: 26)
Pietro Barbiero (Citations: 590; h-index: 11)
M. Zarlenga (Citations: 463; h-index: 10)
Francesco Giannini (Citations: 22; h-index: 3)
Alberto Termine (Citations: 37; h-index: 2)
Giuseppe Marra (Citations: 58; h-index: 3)
F. Bonchi (Citations: 36; h-index: 3)

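This paper argues that interpretability research in artificial intelligence is fundamentally ill-posed. The reason is that existing definitions of interpretability are not "actionable": they fail to provide formal principles from which concrete modelling and inference rules can be derived. We argue that, for a definition of interpretability to be actionable, it must be given in terms of "symmetries". We hypothesise that four symmetries alone suffice to (i) ground core interpretability properties, (ii) characterize the class of interpretable models, and (iii) derive a unified formulation of interpretable inference (e.g., alignment, interventions, and counterfactuals) as a form of Bayesian inversion.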

Original Abstract

This paper argues that interpretability research in Artificial Intelligence is fundamentally ill-posed as existing definitions of interpretability are not *actionable*: they fail to provide formal principles from which concrete modelling and inferential rules can be derived. We posit that for a definition of interpretability to be actionable, it must be given in terms of *symmetries*. We hypothesise that four symmetries suffice to (i) motivate core interpretability properties, (ii) characterize the class of interpretable models, and (iii) derive a unified formulation of interpretable inference (e.g., alignment, interventions, and counterfactuals) as a form of Bayesian inversion.

Citations: 1 | Influential: 0 | Altmetric: 13 | Score: 66.0
