2604.25272v1 Apr 28, 2026 stat.ML

스펙트럴 밴딧 (Spectral Bandits)

Spectral bandits

Michal Valko

Building something new @ Stealth Startup & Inria & MVA - Ex: Llama @AIatMeta Gemini and BYOL @GoogleDeepMind

Citations: 32,150

h-index: 44

B. Kveton

Citations: 4,851

h-index: 38

Tomás Kocák

Citations: 395

h-index: 7

R. Munos

Citations: 42,441

h-index: 90

Shipra Agrawal

Citations: 5,099

h-index: 28

P. Auer

Citations: 29,353

h-index: 64

그래프 상의 매끄러운 함수는 다양체 학습 및 준지도 학습 등 다양한 분야에서 널리 활용됩니다. 본 연구에서는 그래프 상에서 각 선택지의 보상이 매끄럽게 변하는 밴딧 문제를 다룹니다. 이러한 프레임워크는 콘텐츠 기반 추천과 같이 그래프를 포함하는 온라인 학습 문제를 해결하는 데 적합합니다. 이 문제에서, 추천 가능한 각 항목은 무방향 그래프의 노드이며, 해당 항목의 예상 평점은 이웃 노드의 예상 평점과 유사합니다. 목표는 높은 예상 평점을 가진 항목을 추천하는 것입니다. 본 연구에서는 최적 정책에 대한 누적 후회를 고려할 때, 알고리즘의 성능이 노드 수에 따라 급격하게 악화되지 않도록 하는 것을 목표로 합니다. 특히, 실제 그래프에서 작은 값을 갖는 '효과적인 차원'이라는 개념을 도입하고, 이 차원에 대해 선형 및 부분 선형적인 복잡도를 갖는 세 가지 알고리즘을 제안합니다. 콘텐츠 추천 문제에 대한 실험 결과, 수천 개의 항목에 대한 사용자 선호도를 단 수십 번의 노드 평가만으로 학습할 수 있음을 확인했습니다.

Original Abstract

Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this work, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning problems that involve graphs, such as content-based recommendation. In this problem, each item we can recommend is a node of an undirected graph and its expected rating is similar to the one of its neighbors. The goal is to recommend items that have high expected ratings. We aim for the algorithms where the cumulative regret with respect to the optimal policy would not scale poorly with the number of nodes. In particular, we introduce the notion of an effective dimension, which is small in real-world graphs, and propose three algorithms for solving our problem that scale linearly and sublinearly in this dimension. Our experiments on content recommendation problem show that a good estimator of user preferences for thousands of items can be learned from just tens of node evaluations.

11 Citations

2 Influential

30 Altmetric

165.0 Score

Original PDF

No Analysis Report Yet

This paper hasn't been analyzed by Gemini yet.

댓글을 작성하려면 로그인하세요.

아직 댓글이 없습니다. 첫 번째 댓글을 남겨보세요!