2604.16742v1 Apr 17, 2026 cs.AI

CT Open: An Open-Access, Uncontaminated, Live Platform for the Open Challenge of Clinical Trial Outcome Prediction

Yujia Liu (Citations: 0, h-index: 0)
Yang Zhang (Citations: 868, h-index: 15)
Jianyou Wang (Citations: 214, h-index: 5)
Youze Zheng, UC San Diego (Citations: 14, h-index: 2)
Longtian Bao, UC San Diego (Citations: 14, h-index: 2)
Hanyuan Zhang (Citations: 8, h-index: 1)
Yu-Heng Chen (Citations: 5, h-index: 1)
Matthew I. Feng (Citations: 41, h-index: 4)
Maxim Khan (Citations: 6, h-index: 1)
A. Sehgal (Citations: 178, h-index: 7)
Christopher D. Rosin (Citations: 13, h-index: 3)
R. Paturi (Citations: 6,500, h-index: 30)
U. Dube (Citations: 5, h-index: 1)
Leon Bergen (Citations: 82, h-index: 5)

Abstract

Scientists have long sought to accurately predict outcomes of real-world events before they happen. Can AI systems do so more reliably? We study this question through clinical trial outcome prediction, a high-stakes open challenge even for domain experts. We introduce CT Open, an open-access, live platform that will run four challenges every year. Anyone can submit predictions for each challenge. CT Open evaluates those submissions on trials whose outcomes were not yet public at the time of submission but were made public afterwards. Determining whether a trial's outcome was public on the internet before a certain date is surprisingly difficult. Outcomes posted on official registries may lag behind by years, while the first mention may appear in obscure articles. To address this, we propose a novel, fully automated decontamination pipeline that uses iterative LLM-powered web search to identify the earliest mention of trial outcomes. We validate the pipeline's quality and accuracy against annotations from human experts. Since CT Open's pipeline ensures that every evaluated trial had no publicly reported outcome when the prediction was made, it allows participants to use any methodology and any data source. In this paper, we release a training set and two time-stamped test benchmarks, Winter 2025 and Summer 2025. We believe CT Open can serve as a central hub for advancing AI research on forecasting real-world outcomes before they occur, while also informing biomedical research and improving clinical trial design. The CT Open platform is hosted at https://ct-open.net/.
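
The evaluation rule described in the abstract (score a trial only if its outcome was not public at submission time but became public afterwards) can be illustrated with a minimal sketch. This is not the CT Open implementation: the `Trial` class, the `earliest_mention` field, the function names, and the exact boundary conditions (strictly after the deadline, on or before the evaluation date) are all assumptions made for illustration. The sketch also assumes the earliest-mention date has already been produced by some decontamination step.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Trial:
    # Hypothetical record; the field names are illustrative, not CT Open's schema.
    trial_id: str
    # Earliest date on which the outcome was found publicly mentioned,
    # or None if no public mention has been found.
    earliest_mention: Optional[date]


def is_evaluable(trial: Trial, submission_deadline: date, evaluation_date: date) -> bool:
    """Return True if the outcome first became public after the deadline
    and no later than the evaluation date (assumed boundary conventions)."""
    if trial.earliest_mention is None:
        return False  # outcome still unknown, so nothing to score against
    return submission_deadline < trial.earliest_mention <= evaluation_date


def select_evaluable(
    trials: List[Trial], submission_deadline: date, evaluation_date: date
) -> List[Trial]:
    """Keep only trials that are uncontaminated and already resolved."""
    return [t for t in trials if is_evaluable(t, submission_deadline, evaluation_date)]
```

Under these assumptions, a trial whose outcome was mentioned anywhere before the submission deadline is excluded as contaminated, and a trial with no public outcome yet is deferred rather than scored.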

1 Citations

0 Influential

15 Altmetric

76.0 Score
