2604.11200v1 Apr 13, 2026 cs.LG

ShapShift: Explaining Model Prediction Shifts with Subgroup Conditional Shapley Values

Manuela Veloso

Salim I. Amoukou

Tom Bewley

Emanuele Albini

Saumitra Mishra

Original Abstract

Changes in input distribution can induce shifts in the average predictions of machine learning models. Such prediction shifts may impact downstream business outcomes (e.g. a bank's loan approval rate), so understanding their causes can be crucial. We propose ShapShift: a Shapley value method for attributing prediction shifts to changes in the conditional probabilities of interpretable subgroups of data, where these subgroups are defined by the structure of decision trees. We initially apply this method to single decision trees, providing exact explanations based on conditional probability changes at split nodes. Next, we extend it to tree ensembles by selecting the most explanatory tree and accounting for residual effects. Finally, we propose a model-agnostic variant using surrogate trees grown with a novel objective function, allowing application to models like neural networks. While exact computation can be intensive, approximation techniques enable practical application. We show that ShapShift provides simple, faithful, and near-complete explanations of prediction shifts across model classes, aiding model monitoring in dynamic environments.
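
To make the single-tree case concrete, the sketch below shows one possible reading of the idea, not the authors' implementation. It assumes a hypothetical two-split tree with fixed leaf predictions, treats each split node as a Shapley "player", and defines a coalition's value as the mean prediction obtained when the nodes in the coalition use their target-distribution branch probabilities while all other nodes keep their source-distribution ones. The tree, the probabilities, and all function names are invented for illustration. Coalitions are enumerated exhaustively, which is exponential in the number of split nodes and is consistent with the abstract's remark that exact computation can be intensive.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy tree (an assumption for illustration only).
# Internal nodes map to (left_child, right_child); leaves hold a constant prediction.
TREE = {"root": ("n1", "leaf_c"), "n1": ("leaf_a", "leaf_b")}
LEAF_VALUE = {"leaf_a": 0.9, "leaf_b": 0.4, "leaf_c": 0.1}

# P(go left | node reached) under the source and the target distribution (made-up numbers).
P_SRC = {"root": 0.5, "n1": 0.5}
P_TGT = {"root": 0.8, "n1": 0.25}


def mean_prediction(p_left, node="root"):
    """Mean tree output given a conditional left-branch probability per split node."""
    if node in LEAF_VALUE:
        return LEAF_VALUE[node]
    left, right = TREE[node]
    p = p_left[node]
    return p * mean_prediction(p_left, left) + (1 - p) * mean_prediction(p_left, right)


def coalition_value(coalition):
    """Mean prediction when only the nodes in `coalition` have shifted to the target."""
    mixed = {n: (P_TGT[n] if n in coalition else P_SRC[n]) for n in TREE}
    return mean_prediction(mixed)


def shapley_values():
    """Exact Shapley value of each split node, by enumerating all coalitions."""
    players = list(TREE)
    n = len(players)
    phi = {}
    for player in players:
        others = [q for q in players if q != player]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (coalition_value(s | {player}) - coalition_value(s))
        phi[player] = total
    return phi


if __name__ == "__main__":
    phi = shapley_values()
    shift = mean_prediction(P_TGT) - mean_prediction(P_SRC)
    print("prediction shift:", shift)
    print("attributions:", phi)
```

By the efficiency property of Shapley values, the per-node attributions in this sketch sum to the total prediction shift, so the explanation of a single tree is complete by construction. A node can receive a negative attribution when the change in its branch probabilities pushes the mean prediction in the opposite direction to the overall shift.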
