2603.11808v1 Mar 12, 2026 cs.AI

Automating Skill Acquisition through Large-Scale Mining of Open-Source Agentic Repositories: A Framework for Multi-Agent Procedural Knowledge Extraction

Keqian Li

Aimin Zhou

Hao Hao

Shuzhen Bi

Mengsong Wu

Wentao Liu

Siyu Song

Hongbo Zhao

Abstract

The transition from monolithic large language models (LLMs) to modular, skill-equipped agents represents a fundamental architectural shift in artificial intelligence deployment. While general-purpose models demonstrate remarkable breadth in declarative knowledge, their utility in autonomous workflows is frequently constrained by insufficient specialized procedural expertise. This report investigates a systematic framework for automated acquisition of high-quality agent skills through mining of open-source repositories on platforms such as GitHub. We focus on the extraction of visualization and educational capabilities from state-of-the-art systems including TheoremExplainAgent and Code2Video, both utilizing the Manim mathematical animation engine. The framework encompasses repository structural analysis, semantic skill identification through dense retrieval, and translation to the standardized SKILL.md format. We demonstrate that systematic extraction from agentic repositories, combined with rigorous security governance and multi-dimensional evaluation metrics, enables scalable acquisition of procedural knowledge that augments LLM capabilities without requiring model retraining. Our analysis reveals that agent-generated educational content can achieve 40% gains in knowledge transfer efficiency while maintaining pedagogical quality comparable to human-crafted tutorials.
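The three stages named in the abstract (repository analysis, semantic skill identification, translation to SKILL.md) can be pictured with a minimal sketch. This is not the authors' implementation. The file paths, the function names `embed`, `rank_candidates`, and `to_skill_md`, and the example skill name are all hypothetical. A bag-of-words cosine similarity stands in for a real dense encoder, and the SKILL.md layout shown (a front-matter block with `name` and `description`) is an assumption about the format rather than something the abstract specifies.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a dense encoder: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(query: str, files: dict[str, str]) -> list[tuple[str, float]]:
    """Score every file description against the query, best match first."""
    q = embed(query)
    scored = [(path, cosine(q, embed(body))) for path, body in files.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def to_skill_md(name: str, description: str, source_path: str) -> str:
    """Render a skill as SKILL.md text (assumed layout: front matter plus body)."""
    return (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n\n"
        f"# {name}\n\n"
        f"Extracted from `{source_path}`.\n"
    )


# Hypothetical repository contents, reduced to one-line summaries per file.
repo = {
    "src/render_scene.py": "render a manim animation scene for a theorem explanation",
    "src/utils/logging.py": "configure logging handlers and log levels",
}

ranked = rank_candidates("manim animation scene", repo)
best_path, best_score = ranked[0]
skill = to_skill_md("manim-scene-rendering", "Render a Manim animation scene.", best_path)
print(skill)
```

In a real pipeline the `embed` function would presumably be replaced by a learned sentence or code encoder, and the ranked candidates would pass through the security and evaluation checks the abstract mentions before any SKILL.md file is written.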
