2604.18827v1 Apr 20, 2026 q-bio.NC

OmniMouse: Scaling properties of multi-modal, multi-task Brain Models on 150B Neural Tokens

Sophia Sanborn

Citations: 58

h-index: 4

K. Willeke

Citations: 550

h-index: 12

P. Turishcheva

Citations: 57

h-index: 4

Alex Gilbert

Citations: 5

h-index: 2

Goirik Chakrabarty

Citations: 60

h-index: 5

H. Bedel

Citations: 674

h-index: 5

Paul G. Fahey

Citations: 1,557

h-index: 20

Yongrong Qiu

Citations: 15

h-index: 2

Marissa A. Weis

Citations: 652

h-index: 10

Michaela Vystrčilová

Citations: 49

h-index: 3

Taliah Muhammad

Citations: 1,218

h-index: 16

Lydia Ntanavara

Citations: 4

h-index: 1

R. Froebe

Citations: 42

h-index: 3

Kayla Ponder

Baylor College of Medicine, Stanford School of Medicine

Citations: 470

h-index: 11

Zhengwei Tan

Citations: 10

h-index: 2

Emin Orhan

Citations: 572

h-index: 5

Erick Cobos

Citations: 1,378

h-index: 16

Katrin Franke

Citations: 103

h-index: 4

Fabian H. Sinz

Citations: 183

h-index: 7

A. Ecker

Citations: 108

h-index: 5

A. Tolias

Citations: 18,040

h-index: 65

Original Abstract

Scaling data and artificial neural networks has transformed AI, driving breakthroughs in language and vision. Whether similar principles apply to modeling brain activity remains unclear. Here we leveraged a dataset of 3.1 million neurons from the visual cortex of 73 mice across 323 sessions, totaling more than 150 billion neural tokens recorded during natural movies, images and parametric stimuli, and behavior. We train multi-modal, multi-task models that support three regimes flexibly at test time: neural prediction, behavioral decoding, neural forecasting, or any combination of the three. OmniMouse achieves state-of-the-art performance, outperforming specialized baselines across nearly all evaluation regimes. We find that performance scales reliably with more data, but gains from increasing model size saturate. This inverts the standard AI scaling story: in language and computer vision, massive datasets make parameter scaling the primary driver of progress, whereas in brain modeling -- even in the mouse visual cortex, a relatively simple system -- models remain data-limited despite vast recordings. The observation of systematic scaling raises the possibility of phase transitions in neural modeling, where larger and richer datasets might unlock qualitatively new capabilities, paralleling the emergent properties seen in large language models. Code available at https://github.com/enigma-brain/omnimouse.

3 Citations

0 Influential

50 Altmetric

253.0 Score
