2604.04274v1 Apr 05, 2026 cs.AI

InferenceEvolve: Developing Automated Causal Effect Estimators with Self-Evolving AI

InferenceEvolve: Towards Automated Causal Effect Estimators through Self-Evolving AI

Yiqun T. Chen

Citations: 204

h-index: 7

Hongyu Zhao

Citations: 62

h-index: 3

Can Wang

Citations: 146

h-index: 3

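Causal inference is central to scientific discovery, but choosing an appropriate method remains difficult because both the statistical methodology and real-world data are complex. Inspired by cases where artificial intelligence has accelerated scientific discovery, we introduce InferenceEvolve, an evolutionary framework that uses large language models to discover and iteratively refine causal methods. On widely used benchmarks, InferenceEvolve consistently produces estimators that outperform established baselines. Measured against 58 human submissions in a recent community competition, our best-performing estimator lay on the Pareto frontier across both evaluation metrics. We also developed robust proxy objectives for settings without semi-synthetic outcomes and obtained competitive results. Analysis of the evolutionary trajectories shows that the agents progressively discover sophisticated strategies tailored to unrevealed data-generating mechanisms. These results suggest that language-model-guided evolution can optimize structured scientific programs such as causal inference, even when outcomes are only partially observed.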

Original Abstract

Causal inference is central to scientific discovery, yet choosing appropriate methods remains challenging because of the complexity of both statistical methodology and real-world data. Inspired by the success of artificial intelligence in accelerating scientific discovery, we introduce InferenceEvolve, an evolutionary framework that uses large language models to discover and iteratively refine causal methods. Across widely used benchmarks, InferenceEvolve yields estimators that consistently outperform established baselines: against 58 human submissions in a recent community competition, our best evolved estimator lay on the Pareto frontier across two evaluation metrics. We also developed robust proxy objectives for settings without semi-synthetic outcomes, with competitive results. Analysis of the evolutionary trajectories shows that agents progressively discover sophisticated strategies tailored to unrevealed data-generating mechanisms. These findings suggest that language-model-guided evolution can optimize structured scientific programs such as causal inference, even when outcomes are only partially observed.
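As a reading aid for the "Pareto frontier across two evaluation metrics" claim, the following is a minimal sketch of how a frontier can be identified among competing submissions. It assumes both metrics are errors where lower is better, which the abstract does not state; the function name `pareto_frontier` and the scores are hypothetical illustrations, not the paper's code or results.

```python
from typing import List, Tuple


def pareto_frontier(points: List[Tuple[float, float]]) -> List[int]:
    """Return indices of points that no other point dominates.

    Assumption: lower is better on both metrics. A point is dominated when
    another point is at least as good on both metrics and strictly better
    on at least one.
    """
    frontier = []
    for i, (a1, a2) in enumerate(points):
        dominated = any(
            (b1 <= a1 and b2 <= a2) and (b1 < a1 or b2 < a2)
            for j, (b1, b2) in enumerate(points)
            if j != i
        )
        if not dominated:
            frontier.append(i)
    return frontier


# Hypothetical (metric_1, metric_2) error scores, NOT results from the paper.
scores = [(0.30, 0.50), (0.20, 0.60), (0.25, 0.40), (0.40, 0.70)]
print(pareto_frontier(scores))  # → [1, 2]
```

Under this definition, an estimator on the frontier is one that no other submission beats on both metrics at once; it need not be the single best on either metric.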

0 Citations

0 Influential

3.5 Altmetric

17.5 Score
