2604.18862v1 Apr 20, 2026 cs.SE

Human-Machine Co-Boosted Bug Report Identification with Mutualistic Neural Active Learning

Guoming Long

Shihai Wang

Hui Fang

Tao Chen

Original Abstract

Bug reports, which cover a wide range of bug types, are crucial for maintaining software quality. However, their increasing complexity and volume make it difficult to identify them and assign them to the appropriate teams purely by hand, since processing every report is time-consuming and resource-intensive. In this paper, we introduce a cross-project framework, dubbed Mutualistic Neural Active Learning (MNAL), for automated and more effective identification of bug reports from GitHub repositories, boosted by human-machine collaboration. MNAL couples a neural language model, which learns and generalizes from reports across different projects, with active learning to form neural active learning. A distinctive feature of MNAL is the purposely crafted mutualistic relation between the machine learner (the neural language model) and the human labelers (developers) when enriching the learned knowledge: the most informative human-labeled reports and their corresponding pseudo-labeled ones are used to update the model, while the reports that developers are asked to label are more readable and identifiable, which strengthens human-machine teaming. We evaluate MNAL on a large-scale dataset against state-of-the-art (SOTA) approaches, baselines, and different variants. The results indicate that MNAL achieves up to 95.8% and 196.0% effort reduction in terms of readability and identifiability during human labeling, respectively, while also achieving better bug report identification performance. Additionally, MNAL is model-agnostic: it improves model performance with various underlying neural language models. To further verify the efficacy of our approach, we conducted a qualitative case study involving 10 human participants, who rated MNAL as more effective while saving more time and monetary resources.
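
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: query a report the model is unsure about but that is cheap for a human to read, then propagate the human label to similar reports as pseudo-labels. It is not the authors' MNAL algorithm. The names `Report`, `select_query`, and `pseudo_label`, the token count used as a readability proxy, the Jaccard similarity, and the threshold values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Report:
    text: str
    prob_bug: float  # model's predicted probability that this is a bug report


def uncertainty(report: Report) -> float:
    """1.0 when the model is maximally unsure (p = 0.5), 0.0 when certain."""
    return 1.0 - abs(report.prob_bug - 0.5) * 2.0


def readability_cost(report: Report) -> int:
    """Hypothetical proxy for labeling effort: number of whitespace tokens."""
    return len(report.text.split())


def jaccard(a: str, b: str) -> float:
    """Token-set overlap, a stand-in for a learned similarity measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_query(pool: list[Report], top_k: int = 3) -> Report:
    """Shortlist the top_k most uncertain reports, then pick the cheapest to read."""
    shortlist = sorted(pool, key=uncertainty, reverse=True)[:top_k]
    return min(shortlist, key=readability_cost)


def pseudo_label(query: Report, human_label: int, pool: list[Report],
                 threshold: float = 0.4) -> list[tuple[Report, int]]:
    """Copy the human label onto other reports that are similar to the query."""
    return [(r, human_label) for r in pool
            if r is not query and jaccard(r.text, query.text) >= threshold]


# Toy pool; the probabilities are made up, not taken from the paper.
pool = [
    Report("app crashes on startup with null pointer", 0.52),
    Report("app crashes on startup with null pointer exception "
           "in the login module after upgrading", 0.50),
    Report("please add dark mode", 0.05),
    Report("app crashes on startup", 0.60),
]
query = select_query(pool, top_k=2)               # report shown to the developer
extra = pseudo_label(query, human_label=1, pool=pool)  # extra training pairs
```

In this toy pool, the two most uncertain reports describe the same crash, and the shorter one is chosen for the developer. The human label is then copied to the two overlapping reports, while the unrelated feature request is left unlabeled.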
