2603.23947v1 Mar 25, 2026 cs.SD

Variable-Length Audio Fingerprinting

Hongjie Chen

Citations: 409

h-index: 9

Hanyu Meng

Citations: 26

h-index: 3

Huimin Zeng

Citations: 48

h-index: 3

Ryan A. Rossi

Citations: 159

h-index: 6

Lie Lu

Citations: 28

h-index: 3

Josh Kimball

Citations: 4

h-index: 1

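Audio fingerprinting converts audio into a much lower-dimensional representation, so that even a distorted recording can be recognized as its original through similar fingerprints. Existing deep learning methods fingerprint fixed-length audio segments and therefore fail to account for temporal variation during segmentation. To address this limitation, the paper proposes a new method, Variable-Length Audio FingerPrinting (VLAFP). To the authors' knowledge, VLAFP is the first deep audio fingerprinting model that can process variable-length audio in both training and testing. In experiments on three real-world datasets, VLAFP outperformed the existing state-of-the-art models in live audio identification and audio retrieval.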

Original Abstract

Audio fingerprinting converts audio to much lower-dimensional representations, allowing distorted recordings to still be recognized as their originals through similar fingerprints. Existing deep learning approaches rigidly fingerprint fixed-length audio segments, thereby neglecting temporal dynamics during segmentation. To address limitations due to this rigidity, we propose Variable-Length Audio FingerPrinting (VLAFP), a novel method that supports variable-length fingerprinting. To the best of our knowledge, VLAFP is the first deep audio fingerprinting model capable of processing audio of variable length, for both training and testing. Our experiments show that VLAFP outperforms existing state-of-the-art methods in live audio identification and audio retrieval across three real-world datasets.
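
To make the fixed-length baseline described in the abstract concrete, here is a minimal sketch of that general pipeline: cut the audio into fixed-length segments, map each segment to a low-dimensional vector, and identify a distorted query by its most similar stored fingerprint. This is not VLAFP and not any model from the paper. The function names (`fixed_segments`, `toy_fingerprint`, `best_match`), the synthetic signal, and the block-energy "fingerprint" are all assumptions made for illustration; a real system would use a learned encoder in place of `toy_fingerprint`.

```python
import math
import random


def fixed_segments(samples, seg_len, hop):
    """Split a sample list into fixed-length, possibly overlapping segments."""
    return [samples[i:i + seg_len]
            for i in range(0, len(samples) - seg_len + 1, hop)]


def toy_fingerprint(segment, dims=8):
    """Hypothetical stand-in for a learned encoder:
    mean energy in `dims` equal sub-blocks, L2-normalised."""
    block = len(segment) // dims
    vec = [sum(x * x for x in segment[b * block:(b + 1) * block]) / block
           for b in range(dims)]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity of two vectors that are already L2-normalised."""
    return sum(x * y for x, y in zip(a, b))


def best_match(query_fp, database):
    """Index of the stored fingerprint most similar to the query."""
    return max(range(len(database)),
               key=lambda i: cosine(query_fp, database[i]))


rng = random.Random(0)

# Synthetic "audio": noise whose loudness changes every 125 samples,
# so that different segments have different energy profiles.
levels = [0.1 + 0.9 * ((7 * k) % 32) / 31 for k in range(32)]
reference = [rng.uniform(-1, 1) * levels[i // 125] for i in range(4000)]

# Fixed-length segmentation: 1000-sample windows with a 500-sample hop.
segments = fixed_segments(reference, seg_len=1000, hop=500)
database = [toy_fingerprint(s) for s in segments]

# A distorted query: one reference segment plus additive Gaussian noise.
query = [x + rng.gauss(0, 0.05) for x in segments[3]]
matched = best_match(toy_fingerprint(query), database)
print("query matched segment", matched)
```

Every fingerprint in this sketch covers exactly `seg_len` samples, and the segment boundaries are fixed by `hop` regardless of what the audio contains. That rigidity is what the abstract refers to; how VLAFP handles variable-length input is not described in this listing and is not modelled here.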

0 Citations

0 Influential

4.5 Altmetric

22.5 Score
