2606.05552v1 Jun 04, 2026 cs.LG

Balancing Image Compression and Generation with Bootstrapped Tokenization

Jing Wang
Haozhe Chi
Wu Sheng
Yi Ma
Yadong Mu
Hao Jiang
Jinghan Li

Despite progress in image tokenization, standard methods mix all granularities of information within each token, so redundant information persists across tokens. This mixing of granularities also complicates generator training. This paper introduces SelfBootTok, a method that addresses both problems by decomposing information into global and local token groups. Through self-bootstrapped learning, the model predicts local details exclusively from global tokens, shifting the burden of visual detail from the generator to the tokenizer. Consequently, the generator requires only global tokens, which reduces computation by approximately 40% while delivering superior reconstruction and generation. Moreover, the paradigm scales: by leveraging more data or parameters to self-supervise local representation learning, SelfBootTok achieves a new state-of-the-art gFID score of 1.56 using only 64 tokens.
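
The global/local split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear predictor, the function names `predict_local` and `decode_inputs`, the token dimension, and the local-token count are all hypothetical, and treating the 64 tokens as the global group is an assumption. The sketch only shows the data flow in which local tokens are computed from global tokens alone, so a generator would need to model only the global sequence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only. Using 64 as the global-token
# count is an assumption; NUM_LOCAL and DIM are not taken from the paper.
NUM_GLOBAL, NUM_LOCAL, DIM = 64, 192, 16


def predict_local(global_tokens, weight):
    """Toy stand-in for a tokenizer-side predictor: local tokens are
    computed from global tokens alone, via one linear map over the token axis."""
    return weight @ global_tokens


def decode_inputs(global_tokens, weight):
    """Assemble what a decoder would see: global tokens followed by
    the local tokens predicted from them."""
    local_tokens = predict_local(global_tokens, weight)
    return np.concatenate([global_tokens, local_tokens], axis=0)


# Random placeholders for encoder output and predictor parameters.
global_tokens = rng.normal(size=(NUM_GLOBAL, DIM))
weight = rng.normal(size=(NUM_LOCAL, NUM_GLOBAL)) / np.sqrt(NUM_GLOBAL)

decoder_tokens = decode_inputs(global_tokens, weight)

# A generator in this setup would only have to produce the global tokens.
generator_sequence_length = global_tokens.shape[0]

print(decoder_tokens.shape, generator_sequence_length)
```

In this sketch the decoder receives the full token set, while the sequence a generator has to model stays at the global-token length; how the actual predictor is trained and structured is not specified in the abstract.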
