2602.07203v1 Feb 06, 2026 cs.LG

정확한 do-Shapley 값 계산

Exactly Computing do-Shapley Values

F. Fumagalli

Citations: 11

h-index: 3

R. Witter

Citations: 131

h-index: 6

'Alvaro Parafita

Citations: 3

h-index: 1

Tomas Garriga

Citations: 8

h-index: 2

Maximilian Muschalik

Citations: 343

h-index: 10

Axel Brando

Citations: 42

h-index: 4

Lucas Rosenblatt

Citations: 259

h-index: 8

구조적 인과 모델(SCM)은 자연 과학 전반에 걸친 복잡한 동역학을 설명하는 강력한 프레임워크입니다. SCM을 해석하는 특히 우아한 방법 중 하나는 do-Shapley 값으로, 게임 이론적 방법론을 사용하여 기하급수적으로 많은 개입 상황에서 변수 $d$개에 대한 평균 효과를 정량화합니다. Shapley 값과 마찬가지로, do-Shapley 값을 계산하려면 일반적으로 기하급수적으로 많은 항을 평가해야 합니다. 본 연구의 기반은 SCM의 불변 집합을 사용하여 do-Shapley 값을 재정의하는 것입니다. 이러한 통찰력을 활용하여, SCM의 그래프 구조에 따라 $d$부터 $2^d$까지 다양할 수 있는 불변 집합의 개수 $r$에 선형적인 시간 내에 do-Shapley 값을 정확하게 계산할 수 있습니다. $r$은 사전에 알려져 있지 않으므로, 정확한 알고리즘을 일반적인 Shapley 값 추정기처럼, 임의의 쿼리 예산으로 실행할 수 있는 추정기로 보완했습니다. 쿼리 예산이 $r$에 가까워질수록, 본 연구의 추정기는 기존 방법보다 훨씬 더 정확한 추정치를 제공하며, 예산이 $r$에 도달하면 기계 정밀도까지 Shapley 값을 반환합니다. 계산 속도 외에도, 식별 부담을 줄였습니다. 비모수적 식별 가능성은 모든 클래스가 아닌, $d$개의 단일 변수 집합에 대한 개입 효과만 식별하면 do-Shapley 값을 통해 달성될 수 있음을 증명했습니다.

Original Abstract

Structural Causal Models (SCM) are a powerful framework for describing complicated dynamics across the natural sciences. A particularly elegant way of interpreting SCMs is do-Shapley, a game-theoretic method of quantifying the average effect of $d$ variables across exponentially many interventions. Like Shapley values, computing do-Shapley values generally requires evaluating exponentially many terms. The foundation of our work is a reformulation of do-Shapley values in terms of the irreducible sets of the underlying SCM. Leveraging this insight, we can exactly compute do-Shapley values in time linear in the number of irreducible sets $r$, which itself can range from $d$ to $2^d$ depending on the graph structure of the SCM. Since $r$ is unknown a priori, we complement the exact algorithm with an estimator that, like general Shapley value estimators, can be run with any query budget. As the query budget approaches $r$, our estimators can produce more accurate estimates than prior methods by several orders of magnitude, and, when the budget reaches $r$, return the Shapley values up to machine precision. Beyond computational speed, we also reduce the identification burden: we prove that non-parametric identifiability of do-Shapley values requires only the identification of interventional effects for the $d$ singleton coalitions, rather than all classes.

3 Citations

0 Influential

5 Altmetric

28.0 Score

Original PDF

No Analysis Report Yet

This paper hasn't been analyzed by Gemini yet.

댓글을 작성하려면 로그인하세요.

아직 댓글이 없습니다. 첫 번째 댓글을 남겨보세요!