2603.01879v1 Mar 02, 2026 cs.LG

Diagnosing Generalization Failures from Representational Geometry Markers

Chi-Ning Chou

Citations: 85

h-index: 5

Artem Kirsanov

Citations: 13

h-index: 1

Yaodong Yang

Citations: 10

h-index: 2

SueYeon Chung

Citations: 99

h-index: 5

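Generalization, the ability to perform well beyond the training environment, is a defining feature of both biological and artificial intelligence, yet anticipating unforeseen failures remains a major challenge. Existing approaches often follow a "bottom-up" mechanistic route, reverse-engineering interpretable features or circuits to build explanatory models. These methods are useful, but they often struggle to supply the high-level predictive signals needed to anticipate failure in real-world deployment. Inspired by medical biomarkers, this work proposes a "top-down" approach: identifying system-level measurements that serve as robust indicators of a model's future performance. Instead of analyzing complex internal mechanisms, the authors systematically design and test network markers to probe structure-function links, identify prognostic indicators, and validate predictions in real-world settings. In image classification, they find that task-relevant geometric properties of in-distribution (ID) object manifolds consistently forecast poor out-of-distribution (OOD) generalization. In particular, reductions in two geometric measures, effective manifold dimensionality and utility, predict weaker OOD performance across diverse architectures, optimizers, and datasets. Applying this finding to transfer learning with ImageNet-pretrained models, they find that the same geometric patterns predict OOD transfer performance more reliably than ID accuracy does. The work shows that representational geometry can expose hidden vulnerabilities and can offer more robust guidance for model selection and AI interpretability.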

Original Abstract

Generalization, the ability to perform well beyond the training context, is a hallmark of biological and artificial intelligence, yet anticipating unseen failures remains a central challenge. Conventional approaches often take a "bottom-up" mechanistic route by reverse-engineering interpretable features or circuits to build explanatory models. While insightful, these methods often struggle to provide the high-level, predictive signals for anticipating failure in real-world deployment. Here, we propose using a "top-down" approach to studying generalization failures inspired by medical biomarkers: identifying system-level measurements that serve as robust indicators of a model's future performance. Rather than mapping out detailed internal mechanisms, we systematically design and test network markers to probe structure-function links, identify prognostic indicators, and validate predictions in real-world settings. In image classification, we find that task-relevant geometric properties of in-distribution (ID) object manifolds consistently forecast poor out-of-distribution (OOD) generalization. In particular, reductions in two geometric measures, effective manifold dimensionality and utility, predict weaker OOD performance across diverse architectures, optimizers, and datasets. We apply this finding to transfer learning with ImageNet-pretrained models. We consistently find that the same geometric patterns predict OOD transfer performance more reliably than ID accuracy. This work demonstrates that representational geometry can expose hidden vulnerabilities, offering more robust guidance for model selection and AI interpretability.
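The abstract does not define how effective manifold dimensionality is computed, so the following is only an illustrative sketch of what a per-class dimensionality marker could look like. It uses the participation ratio of the covariance spectrum, a common proxy for the effective dimension of a point cloud, which is an assumption here and not necessarily the measure used in the paper. The function names `participation_ratio` and `mean_class_dimension` are hypothetical.

```python
from typing import Dict, List, Sequence


def participation_ratio(points: Sequence[Sequence[float]]) -> float:
    """Participation ratio (sum of eigenvalues)^2 / (sum of squared eigenvalues)
    of the covariance of `points` (one feature vector per row).

    NOTE: a stand-in proxy for "effective dimensionality", assumed for
    illustration; the paper's own measure may differ.
    """
    n = len(points)
    dim = len(points[0])
    # Center the point cloud on its mean.
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    centered = [[p[k] - mean[k] for k in range(dim)] for p in points]
    # Gram matrix G = X X^T. Its trace and squared Frobenius norm equal the
    # sum and sum of squares of the covariance eigenvalues up to factors of n,
    # which cancel in the ratio.
    gram = [[sum(a * b for a, b in zip(u, v)) for v in centered] for u in centered]
    trace = sum(gram[i][i] for i in range(n))
    frob_sq = sum(g * g for row in gram for g in row)
    if frob_sq == 0.0:
        return 0.0  # all points coincide: no spread in any direction
    return trace * trace / frob_sq


def mean_class_dimension(
    features: Sequence[Sequence[float]], labels: Sequence[int]
) -> float:
    """Average participation ratio over per-class point clouds."""
    by_class: Dict[int, List[Sequence[float]]] = {}
    for vec, label in zip(features, labels):
        by_class.setdefault(label, []).append(vec)
    return sum(participation_ratio(pts) for pts in by_class.values()) / len(by_class)
```

Under this proxy, a class whose features lie along a single line scores 1, and a class spread equally over two orthogonal directions scores 2. Following the abstract's claim, one would track whether such a marker drops on ID data as a warning sign for OOD performance.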
