2602.23827v1 Feb 27, 2026 cs.LG

FedNSAM: Consistency of Local and Global Flatness for Federated Learning

Junkang Liu (Citations: 99; h-index: 6)

Fanhua Shang (Citations: 3,562; h-index: 35)

Hongying Liu (Citations: 794; h-index: 14)

Yuanyuan Liu (Citations: 2,074; h-index: 25)

Yuxuan Tian (Citations: 20; h-index: 2)

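In federated learning (FL), multi-step local updates and data heterogeneity typically drive training toward sharper global minima, which degrades the performance of the global model. Popular FL algorithms address this by integrating sharpness-aware minimization (SAM) into local training. Under high data heterogeneity, however, flatness in local training does not imply flatness of the global model, so minimizing the sharpness of the local loss surface on client data does not let SAM improve the global model's generalization. We define the **flatness distance** to explain this phenomenon. By rethinking SAM in FL and theoretically analyzing the **flatness distance**, we propose a new algorithm, **FedNSAM**, which introduces global Nesterov momentum into the local update to harmonize local and global flatness. **FedNSAM** uses the global Nesterov momentum as the direction for each client's local estimate of the global perturbation and for extrapolation. Theoretically, we prove a tighter convergence bound than FedSAM by means of Nesterov extrapolation. Empirically, extensive experiments on CNN and Transformer models verify the superior performance and efficiency of **FedNSAM**. The code is available at https://github.com/junkangLiu0/FedNSAM.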

Original Abstract

In federated learning (FL), multi-step local updates and data heterogeneity usually lead to sharper global minima, which degrades the performance of the global model. Popular FL algorithms integrate sharpness-aware minimization (SAM) into local training to address this issue. However, in the high data heterogeneity setting, the flatness in local training does not imply the flatness of the global model. Therefore, minimizing the sharpness of the local loss surfaces on the client data does not enable the effectiveness of SAM in FL to improve the generalization ability of the global model. We define the **flatness distance** to explain this phenomenon. By rethinking the SAM in FL and theoretically analyzing the **flatness distance**, we propose a novel **FedNSAM** algorithm that accelerates the SAM algorithm by introducing global Nesterov momentum into the local update to harmonize the consistency of global and local flatness. **FedNSAM** uses the global Nesterov momentum as the direction of local estimation of client global perturbations and extrapolation. Theoretically, we prove a tighter convergence bound than FedSAM by Nesterov extrapolation. Empirically, we conduct comprehensive experiments on CNN and Transformer models to verify the superior performance and efficiency of **FedNSAM**. The code is available at https://github.com/junkangLiu0/FedNSAM.
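
To make the idea in the abstract concrete, here is a minimal NumPy sketch of one possible reading: each client extrapolates along a server-side momentum vector and perturbs along that same global direction before taking its gradient. It is a toy under stated assumptions, not the authors' algorithm (their repository linked above is the reference). The function names, the quadratic client losses, the hyperparameters, the rule `momentum = delta / steps`, and the fallback to the local gradient when no momentum exists yet are all hypothetical choices made for illustration.

```python
import numpy as np

def perturbed_gradient(grad_fn, w, direction, rho):
    """SAM-style gradient: evaluate grad_fn at w shifted by rho along `direction`."""
    norm = np.linalg.norm(direction)
    eps = rho * direction / (norm + 1e-12)  # a zero direction gives no perturbation
    return grad_fn(w + eps)

def client_update(grad_fn, w_global, momentum, steps=5, lr=0.1, rho=0.05, beta=0.9):
    """Hypothetical local training that reuses the server's momentum vector."""
    w = w_global.copy()
    for _ in range(steps):
        # Nesterov-style extrapolation along the global momentum (assumption).
        w_look = w + beta * momentum
        # Assumption: -momentum approximates the global ascent direction.
        # Before any momentum exists, fall back to the local gradient (plain SAM).
        direction = -momentum if np.any(momentum) else grad_fn(w_look)
        w = w_look - lr * perturbed_gradient(grad_fn, w_look, direction, rho)
    return w

def run_rounds(client_grads, w0, rounds=50, steps=5):
    """Toy server loop: average the client models and refresh the momentum."""
    w_global = np.asarray(w0, dtype=float)
    momentum = np.zeros_like(w_global)
    for _ in range(rounds):
        local = [client_update(g, w_global, momentum, steps=steps) for g in client_grads]
        delta = np.mean(local, axis=0) - w_global
        momentum = delta / steps  # average per-step global change (assumption)
        w_global = w_global + delta
    return w_global

# Two heterogeneous toy clients with losses 0.5 * ||w - c||^2 and different centers c.
centers = [np.array([1.0, 0.0]), np.array([-1.0, 2.0])]
client_grads = [lambda w, c=c: w - c for c in centers]
w_final = run_rounds(client_grads, w0=[5.0, -5.0])
print("global model after training:", w_final)
```

In this toy setup the average loss is minimized at the mean of the two centers, so the printed model should land close to that point. The sketch only illustrates the mechanics of sharing a global direction across clients; it says nothing about the flatness distance or the convergence bound discussed in the paper.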

14 Citations

0 Influential

58.523463096955 Altmetric

306.6 Score
