2603.04099v1 Mar 04, 2026 cs.CV

Efficient Point Cloud Processing with High-Dimensional Positional Encoding and Non-Local MLPs

Yanmei Zou (Citations: 102; h-index: 3)

Hongshan Yu (Citations: 136; h-index: 7)

Yaonan Wang (Citations: 0; h-index: 0)

Zhengeng Yang (Citations: 875; h-index: 13)

Xieyuanli Chen (Citations: 6; h-index: 1)

Kailun Yang, School of Artificial Intelligence and Robotics, Hunan University (Citations: 5,832; h-index: 40)

Naveed Akhtar (Citations: 116; h-index: 6)

Original Abstract

Multi-Layer Perceptron (MLP) models are the foundation of contemporary point cloud processing. However, their complex network architectures obscure the source of their strength and limit the application of these models. In this article, we develop a two-stage abstraction and refinement (ABS-REF) view for modular feature extraction in point cloud processing. This view elucidates that whereas the early models focused on ABS stages, the more recent techniques devise sophisticated REF stages to attain performance advantages. Then, we propose a High-dimensional Positional Encoding (HPE) module to explicitly utilize intrinsic positional information, extending the "positional encoding" concept from Transformer literature. HPE can be readily deployed in MLP-based architectures and is compatible with transformer-based methods. Within our ABS-REF view, we rethink local aggregation in MLP-based methods and propose replacing time-consuming local MLP operations, which are used to capture local relationships among neighbors. Instead, we use non-local MLPs for efficient non-local information updates, combined with the proposed HPE for effective local information representation. We leverage our modules to develop HPENets, a suite of MLP networks that follow the ABS-REF paradigm, incorporating a scalable HPE-based REF stage. Extensive experiments on seven public datasets across four different tasks show that HPENets deliver a strong balance between efficiency and effectiveness. Notably, HPENet surpasses PointNeXt, a strong MLP-based counterpart, by 1.1% mAcc, 4.0% mIoU, 1.8% mIoU, and 0.2% Cls. mIoU, with only 50.0%, 21.5%, 23.1%, 44.4% of FLOPs on ScanObjectNN, S3DIS, ScanNet, and ShapeNetPart, respectively. Source code is available at https://github.com/zouyanmei/HPENet_v2.git.
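
The abstract does not spell out how HPE or the non-local MLPs are computed, so the following is only an illustrative sketch of the general idea and not the HPENet implementation. It assumes that the positional encoding takes a Transformer-style sinusoidal form applied to relative neighbor offsets, that local aggregation is a parameter-free add-then-max-pool, and that "non-local MLP" means a layer applied once per point instead of once per neighbor. The names `positional_encoding`, `knn`, and `refine` are hypothetical.

```python
import math
import random

def positional_encoding(rel, dim):
    """Encode a relative offset (dx, dy, dz) as `dim` sin/cos values.

    Assumed form: Transformer-style sinusoids at frequencies 1, 2, 4, ...
    """
    assert dim % 6 == 0, "dim must split evenly into sin/cos pairs per axis"
    n_freq = dim // 6
    enc = []
    for coord in rel:
        for j in range(n_freq):
            angle = coord * (2.0 ** j)
            enc.extend((math.sin(angle), math.cos(angle)))
    return enc

def knn(points, i, k):
    """Indices of the k points nearest to points[i] (the point itself included)."""
    order = sorted(range(len(points)),
                   key=lambda j: math.dist(points[i], points[j]))
    return order[:k]

def refine(points, feats, weight, bias, k=4):
    """One hypothetical refinement step: encode, pool, then a per-point layer."""
    dim = len(feats[0])
    out = []
    for i, p in enumerate(points):
        pooled = [-math.inf] * dim
        for j in knn(points, i, k):
            rel = [a - b for a, b in zip(points[j], p)]
            pe = positional_encoding(rel, dim)
            # Parameter-free local step: add the encoding, then max-pool.
            pooled = [max(m, f + e) for m, f, e in zip(pooled, feats[j], pe)]
        # Linear + ReLU applied once per point (not once per neighbor),
        # followed by a residual connection.
        hidden = [max(0.0, sum(x * w for x, w in zip(pooled, row)) + b)
                  for row, b in zip(weight, bias)]
        out.append([h + f for h, f in zip(hidden, feats[i])])
    return out

random.seed(0)
points = [[random.random() for _ in range(3)] for _ in range(16)]
feats = [[random.random() for _ in range(6)] for _ in range(16)]
# One weight row per output channel (6 inputs -> 6 outputs).
weight = [[random.gauss(0, 0.1) for _ in range(6)] for _ in range(6)]
bias = [0.0] * 6
refined = refine(points, feats, weight, bias)
print(len(refined), len(refined[0]))
```

Under these assumptions, the learned layer runs N times for N points, whereas a layer applied to every grouped neighbor feature would run N·k times. This is one plausible reading of why replacing local MLPs could reduce FLOPs; the paper itself should be consulted for the actual design.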
