2601.09105v2 Jan 14, 2026 cs.AI

AviationLMM: A Large Multimodal Foundation Model for Civil Aviation

Cong Chen (Citations: 4, h-index: 1)

Wenbin Li (Citations: 79, h-index: 4)

Jinglin Wu (Citations: 10, h-index: 2)

Xiaoyong Lin (Citations: 10, h-index: 3)

Jing Chen (Citations: 3,781, h-index: 6)

Abstract

Civil aviation is a cornerstone of global transportation and commerce, and ensuring its safety, efficiency and customer satisfaction is paramount. Yet conventional Artificial Intelligence (AI) solutions in aviation remain siloed and narrow, focusing on isolated tasks or single modalities. They struggle to integrate heterogeneous data such as voice communications, radar tracks, sensor streams and textual reports, which limits situational awareness, adaptability, and real-time decision support. This paper introduces the vision of AviationLMM, a Large Multimodal foundation Model for civil aviation, designed to unify the heterogeneous data streams of civil aviation and enable understanding, reasoning, generation and agentic applications. First, we identify the gaps between existing AI solutions and real-world requirements. Second, we describe a model architecture that ingests multimodal inputs such as air-ground voice, surveillance data, on-board telemetry, video and structured text, performs cross-modal alignment and fusion, and produces flexible outputs ranging from situation summaries and risk alerts to predictive diagnostics and multimodal incident reconstructions. To fully realize this vision, we identify key research opportunities, including data acquisition, alignment and fusion, pretraining, reasoning, trustworthiness, privacy, robustness to missing modalities, and synthetic scenario generation. By articulating the design and challenges of AviationLMM, we aim to accelerate progress on foundation models for civil aviation and catalyze coordinated research efforts toward an integrated, trustworthy and privacy-preserving aviation AI ecosystem.
