2601.17920v1 Jan 25, 2026 cs.AI

Agentic AI for Self-Driving Laboratories in Soft Matter: Taxonomy, Benchmarks, and Open Challenges

Xuanzhou Chen

Audrey X. Wang

Hanyang Jiang

Dong Zhang

S. Yin

Abstract

Self-driving laboratories (SDLs) close the loop between experiment design, automated execution, and data-driven decision making. They provide a demanding testbed for agentic AI under expensive actions, noisy and delayed feedback, strict feasibility and safety constraints, and non-stationarity. This survey uses soft matter as a representative setting but focuses on the AI questions that arise in real laboratories. We frame SDL autonomy as an agent-environment interaction problem with explicit observations, actions, costs, and constraints, and we use this formulation to connect common SDL pipelines to established AI principles. We review the main method families that enable closed-loop experimentation, including Bayesian optimization and active learning for sample-efficient experiment selection, planning and reinforcement learning for long-horizon protocol optimization, and tool-using agents that orchestrate heterogeneous instruments and software. We emphasize verifiable and provenance-aware policies that support debugging, reproducibility, and safe operation. We then propose a capability-driven taxonomy that organizes systems by decision horizon, uncertainty modeling, action parameterization, constraint handling, failure recovery, and human involvement. To enable meaningful comparison, we synthesize benchmark task templates and evaluation metrics that prioritize cost-aware performance, robustness to drift, constraint-violation behavior, and reproducibility. Finally, we distill lessons from deployed SDLs and outline open challenges in multi-modal representation, calibrated uncertainty, safe exploration, and shared benchmark infrastructure.
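
The agent-environment framing in the abstract (observations, actions, costs, constraints) can be illustrated with a small sketch. This is not the paper's formulation or code. `ToyLab`, its objective, its cost model, the feasible range, and `closed_loop` are all hypothetical names and values chosen for illustration, and the proposal policy is plain random search standing in for a method such as Bayesian optimization.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyLab:
    """Hypothetical stand-in for an SDL environment (not from the paper)."""
    noise: float = 0.05
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def feasible(self, x: float) -> bool:
        # Assumed hard constraint: the instrument accepts settings in [0, 0.8].
        return 0.0 <= x <= 0.8

    def cost(self, x: float) -> float:
        # Assumed cost model: higher settings consume more material.
        return 1.0 + x

    def run(self, x: float) -> float:
        # Noisy observation of an assumed objective that peaks at x = 0.6.
        return -(x - 0.6) ** 2 + self.rng.gauss(0.0, self.noise)


def closed_loop(lab: ToyLab, budget: float, seed: int = 1):
    """Propose, check feasibility, execute, and record until the budget runs out."""
    rng = random.Random(seed)
    history = []   # (action, observation, cost) records kept for provenance
    spent = 0.0
    skipped = 0    # infeasible proposals rejected before execution
    while True:
        x = rng.uniform(0.0, 1.0)   # placeholder policy: random proposal
        if not lab.feasible(x):
            skipped += 1
            continue
        c = lab.cost(x)
        if spent + c > budget:
            break                   # stop before exceeding the budget
        y = lab.run(x)
        spent += c
        history.append((x, y, c))
    best = max(history, key=lambda r: r[1]) if history else None
    return history, best, spent, skipped


if __name__ == "__main__":
    history, best, spent, skipped = closed_loop(ToyLab(), budget=10.0)
    print(f"experiments={len(history)} spent={spent:.2f} skipped={skipped}")
    print(f"best (action, observation, cost) = {best}")
```

In this sketch the constraint is checked before an experiment runs, so an infeasible proposal costs nothing, and the history list is the only record of what was executed. A real SDL would replace the random proposal with a learned policy and would also have to cope with delayed feedback and drift, which the toy environment omits.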
