2602.04805v1 Feb 04, 2026 cs.GR

Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging

Jia-Peng Zhang (Citations: 94, h-index: 4)

Cheng-Feng Pu (Citations: 41, h-index: 2)

Meng-Hao Guo (Citations: 166, h-index: 6)

Yan-Pei Cao (Citations: 87, h-index: 2)

Shi-Min Hu (Citations: 204, h-index: 7)

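The rapid spread of generative 3D models has created a serious bottleneck in animation pipelines: rigging. Existing automated methods treat skinning as an ill-posed, high-dimensional regression problem, which is inefficient to optimize and usually decoupled from skeleton generation. We attribute these problems to the representation and propose SkinTokens, a learned, compact, discrete representation for skinning weights. By using an FSQ-CVAE to capture the inherent sparsity of skinning, we recast the task from continuous regression into a more tractable token sequence prediction problem. This representation enables TokenRig, a unified autoregressive framework that models the entire rig as a single sequence of skeletal parameters and SkinTokens and learns the complex dependencies between skeletons and skin deformations. The unified model then goes through a reinforcement learning stage, in which tailored geometric and semantic rewards improve generalization to complex, out-of-distribution assets. The SkinTokens representation improves skinning accuracy by 98% to 133% over state-of-the-art methods, and the full TokenRig framework, refined with reinforcement learning, improves bone prediction by 17% to 22%. This work presents a unified, generative approach to rigging that offers higher fidelity and robustness, providing a scalable solution to a long-standing challenge in 3D content creation.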

Original Abstract

The rapid proliferation of generative 3D models has created a critical bottleneck in animation pipelines: rigging. Existing automated methods are fundamentally limited by their approach to skinning, treating it as an ill-posed, high-dimensional regression task that is inefficient to optimize and is typically decoupled from skeleton generation. We posit this is a representation problem and introduce SkinTokens: a learned, compact, and discrete representation for skinning weights. By leveraging an FSQ-CVAE to capture the intrinsic sparsity of skinning, we reframe the task from continuous regression to a more tractable token sequence prediction problem. This representation enables TokenRig, a unified autoregressive framework that models the entire rig as a single sequence of skeletal parameters and SkinTokens, learning the complicated dependencies between skeletons and skin deformations. The unified model is then amenable to a reinforcement learning stage, where tailored geometric and semantic rewards improve generalization to complex, out-of-distribution assets. Quantitatively, the SkinTokens representation leads to a 98%-133% improvement in skinning accuracy over state-of-the-art methods, while the full TokenRig framework, refined with RL, enhances bone prediction by 17%-22%. Our work presents a unified, generative approach to rigging that yields higher fidelity and robustness, offering a scalable solution to a long-standing challenge in 3D content creation.
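
The abstract names FSQ as the mechanism that turns continuous skinning latents into discrete tokens. As a rough illustration of that one step, the sketch below shows generic finite scalar quantization: each latent coordinate is bounded with tanh, rounded to a small integer grid, and the per-dimension codes are packed into a single token id. This is not the authors' FSQ-CVAE. The level counts in `LEVELS` and the function names are hypothetical, only odd level counts are handled, and no encoder, decoder, or training loop is included.

```python
import math

# Hypothetical per-dimension level counts (odd, so the rounding grid is symmetric).
LEVELS = [5, 5, 5]  # implicit codebook size: 5 * 5 * 5 = 125


def fsq_quantize(z, levels=LEVELS):
    """Bound each latent coordinate with tanh, then round it to an integer grid."""
    codes = []
    for value, num_levels in zip(z, levels):
        half = (num_levels - 1) // 2  # grid is {-half, ..., 0, ..., +half}
        codes.append(round(math.tanh(value) * half))
    return codes


def codes_to_token(codes, levels=LEVELS):
    """Pack per-dimension codes into one token id (mixed-radix encoding)."""
    token, base = 0, 1
    for code, num_levels in zip(codes, levels):
        half = (num_levels - 1) // 2
        token += (code + half) * base  # shift the code into 0..num_levels-1
        base *= num_levels
    return token


def token_to_codes(token, levels=LEVELS):
    """Invert codes_to_token: recover the per-dimension codes from a token id."""
    codes = []
    for num_levels in levels:
        half = (num_levels - 1) // 2
        codes.append(token % num_levels - half)
        token //= num_levels
    return codes


if __name__ == "__main__":
    latent = [0.0, 3.0, -3.0]  # made-up latent vector, for illustration only
    codes = fsq_quantize(latent)
    token = codes_to_token(codes)
    print(codes, token, token_to_codes(token))
```

Because the codebook is implicit in the rounding grid, there is no learned codebook to look up, and every token id in `range(125)` maps back to exactly one code vector. A sequence model can then predict such ids one at a time, which is the kind of token sequence prediction the abstract describes.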

2 Citations

0 Influential

3.5 Altmetric

19.5 Score
