2601.00583v1 Jan 02, 2026 cs.LG

HFedMoE: Resource-aware Heterogeneous Federated Learning with Mixture-of-Experts

Senkang Hu (citations: 499; h-index: 12)

Yihang Tao (citations: 141; h-index: 8)

Yuguang Fang (citations: 806; h-index: 17)

Zihan Fang (citations: 641; h-index: 14)

Zhengyi Lin (citations: 1,210; h-index: 12)

Yanan Ma (citations: 67; h-index: 4)

Yiqin Deng (citations: 1,277; h-index: 21)

Xianhao Chen (citations: 2,876; h-index: 30)

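Federated learning (FL) enables fine-tuning of large language models (LLMs) while preserving data privacy, but the substantial size of LLMs makes on-device training difficult for resource-constrained clients such as mobile devices. Mixture-of-Experts (MoE) models have therefore emerged as a computation-efficient solution: they activate only a sparse subset of experts during training, reducing the computing burden without degrading performance. Integrating MoE into FL fine-tuning holds significant potential but still faces three key challenges. First, selecting suitable experts for each client is difficult because there is no reliable metric for measuring each expert's impact on local fine-tuning performance. Second, heterogeneous computing resources across clients severely hinder MoE-based LLM fine-tuning, because dynamic expert activation across diverse input samples can overwhelm resource-constrained devices. Third, client-specific expert subsets and routing preferences undermine global aggregation, as misaligned expert updates and inconsistent gating networks cause destructive interference. To address these challenges, we propose HFedMoE, a heterogeneous MoE-based FL fine-tuning framework that customizes a suitable subset of experts for each client to enable computation-efficient LLM fine-tuning. Specifically, HFedMoE identifies each expert's importance from its contribution to fine-tuning performance and adaptively selects a subset of experts, from an information bottleneck perspective, to fit each client's computing budget. A sparsity-aware model aggregation strategy is also designed, which aggregates the actively fine-tuned expert and gating parameters weighted by their importance. Extensive experiments show that HFedMoE outperforms state-of-the-art benchmarks in training accuracy and convergence speed.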
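The budget-aligned expert selection described above can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not specify the information-bottleneck criterion, so the code swaps in a plain greedy top-k rule that keeps the highest-importance experts that fit a client's budget. The function name `select_experts`, the uniform `cost_per_expert`, and the importance scores are all hypothetical.

```python
from typing import Dict, List


def select_experts(
    importance: Dict[int, float],
    cost_per_expert: float,
    budget: float,
) -> List[int]:
    """Greedy top-k stand-in for budget-aware expert selection.

    Assumption: every expert costs the same amount of compute, so the
    budget simply caps how many experts a client can keep.
    """
    if cost_per_expert <= 0:
        raise ValueError("cost_per_expert must be positive")
    max_experts = int(budget // cost_per_expert)
    # Rank by descending importance; break ties by expert id for determinism.
    ranked = sorted(importance, key=lambda e: (-importance[e], e))
    return sorted(ranked[:max_experts])


# Hypothetical importance scores for four experts (not from the paper).
scores = {0: 0.10, 1: 0.45, 2: 0.05, 3: 0.40}

weak_client = select_experts(scores, cost_per_expert=1.0, budget=2.0)
strong_client = select_experts(scores, cost_per_expert=1.0, budget=4.0)
print(weak_client, strong_client)
```

A client with a smaller budget ends up with a smaller expert subset, which is the heterogeneity the framework targets; how importance is actually estimated is left to the paper.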

Original Abstract

While federated learning (FL) enables fine-tuning of large language models (LLMs) without compromising data privacy, the substantial size of an LLM renders on-device training impractical for resource-constrained clients, such as mobile devices. Thus, Mixture-of-Experts (MoE) models have emerged as a computation-efficient solution, which activates only a sparse subset of experts during model training to reduce computing burden without sacrificing performance. Though integrating MoE into FL fine-tuning holds significant potential, it still encounters three key challenges: i) selecting appropriate experts for clients remains challenging due to the lack of a reliable metric to measure each expert's impact on local fine-tuning performance, ii) the heterogeneous computing resources across clients severely hinder MoE-based LLM fine-tuning, as dynamic expert activations across diverse input samples can overwhelm resource-constrained devices, and iii) client-specific expert subsets and routing preferences undermine global aggregation, where misaligned expert updates and inconsistent gating networks introduce destructive interference. To address these challenges, we propose HFedMoE, a heterogeneous MoE-based FL fine-tuning framework that customizes a subset of experts to each client for computation-efficient LLM fine-tuning. Specifically, HFedMoE identifies the expert importance based on its contributions to fine-tuning performance, and then adaptively selects a subset of experts from an information bottleneck perspective to align with each client's computing budget. A sparsity-aware model aggregation strategy is also designed to aggregate the actively fine-tuned experts and gating parameters with importance-weighted contributions. Extensive experiments demonstrate that HFedMoE outperforms state-of-the-art benchmarks in training accuracy and convergence speed.
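The idea of sparsity-aware aggregation can be sketched as follows, under loudly stated assumptions: each expert is averaged only over the clients that actually fine-tuned it, with each client's contribution scaled by an importance weight, and experts that no client trained keep their previous global parameters. The abstract does not give the actual aggregation rule or how gating parameters are handled, so `aggregate_experts`, the data layout, and the numbers below are hypothetical and cover expert parameters only.

```python
from typing import Dict, List, Tuple

Vector = List[float]
# Hypothetical layout: expert id -> (locally fine-tuned parameters, importance weight).
ClientUpdate = Dict[int, Tuple[Vector, float]]


def aggregate_experts(
    global_experts: Dict[int, Vector],
    client_updates: List[ClientUpdate],
) -> Dict[int, Vector]:
    """Importance-weighted average over the clients that trained each expert."""
    new_global: Dict[int, Vector] = {}
    for expert_id, old_params in global_experts.items():
        contributions = [u[expert_id] for u in client_updates if expert_id in u]
        total_weight = sum(weight for _, weight in contributions)
        if not contributions or total_weight <= 0:
            # No client fine-tuned this expert: keep the previous global copy.
            new_global[expert_id] = list(old_params)
            continue
        new_global[expert_id] = [
            sum(weight * params[i] for params, weight in contributions) / total_weight
            for i in range(len(old_params))
        ]
    return new_global


# Illustrative values only (not from the paper).
global_experts = {0: [0.0, 0.0], 1: [0.0, 0.0], 2: [1.0, 1.0]}
client_a: ClientUpdate = {0: ([1.0, 1.0], 1.0), 1: ([2.0, 2.0], 3.0)}
client_b: ClientUpdate = {1: ([4.0, 4.0], 1.0)}

result = aggregate_experts(global_experts, [client_a, client_b])
print(result)
```

Averaging only over the clients that hold an expert avoids diluting its update with clients that never activated it, which is one plausible reading of how misaligned expert updates could be kept from interfering.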

Citations: 9 · Influential: 0 · Altmetric: 15 · Score: 84.0
