2601.13559v1 Jan 20, 2026 cs.AI

AgentGC: Evolutionary Learning-based Lossless Compression for Genomics Data with LLM-driven Multiple Agent

Sun Hui (Citations: 0, h-index: 0)
Yanfeng Ding (Citations: 17, h-index: 2)
Huidong Ma (Citations: 47, h-index: 5)
Chang Xu (Citations: 18, h-index: 2)
Keyan Jin (Citations: 53, h-index: 1)
Lizheng Zu (Citations: 302, h-index: 8)
Cheng Zhong (Citations: 17, h-index: 3)
Xiaoguang Liu (Citations: 52, h-index: 5)
Gang Wang (Citations: 0, h-index: 0)
Wentong Cai (Citations: 76, h-index: 2)

Original Abstract

Lossless compression has brought significant advances to the storage, sharing, and management of Genomics Data (GD). Current learning-based methods are non-evolvable and suffer from low-level compression modeling, limited adaptability, and user-unfriendly interfaces. To this end, we propose AgentGC, the first evolutionary agent-based GD compressor, which consists of 3 layers and two kinds of agents, named Leader and Worker. Specifically, 1) the User layer provides a user-friendly interface via the Leader combined with an LLM; 2) the Cognitive layer, driven by the Leader, integrates the LLM to consider joint optimization of algorithm, dataset, and system, addressing the issues of low-level modeling and limited adaptability; and 3) the Compression layer, headed by the Worker, performs compression and decompression via an automated multi-knowledge learning-based compression framework. On top of AgentGC, we design 3 modes to support diverse scenarios: CP for compression-ratio priority, TP for throughput priority, and BM for balanced mode. Compared with 14 baselines on 9 datasets, the average compression-ratio gains are 16.66%, 16.11%, and 16.33%, and the throughput gains are 4.73x, 9.23x, and 9.15x, respectively.
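
The abstract does not say how the three modes choose a configuration, so the following is only an illustrative sketch of what a mode-driven choice could look like, not AgentGC's actual method. The `Candidate` type, the `pick` function, the scoring rule used for BM, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical compressor configuration with made-up measurements."""
    name: str
    ratio: float       # original size / compressed size (higher is better)
    throughput: float  # e.g. MB/s (higher is better)


def pick(candidates, mode):
    """Choose a candidate according to the mode (CP, TP, or BM).

    Assumption: BM is modelled here as the sum of both metrics, each
    normalised by the best value among the candidates. The abstract
    does not define how the balance is computed.
    """
    if not candidates:
        raise ValueError("no candidates given")
    if mode == "CP":  # compression-ratio priority
        return max(candidates, key=lambda c: c.ratio)
    if mode == "TP":  # throughput priority
        return max(candidates, key=lambda c: c.throughput)
    if mode == "BM":  # balanced mode
        best_ratio = max(c.ratio for c in candidates)
        best_tp = max(c.throughput for c in candidates)
        return max(
            candidates,
            key=lambda c: c.ratio / best_ratio + c.throughput / best_tp,
        )
    raise ValueError(f"unknown mode: {mode}")


# Illustrative values only; these are not results from the paper.
CANDIDATES = [
    Candidate("heavy", ratio=5.0, throughput=10.0),
    Candidate("medium", ratio=4.5, throughput=40.0),
    Candidate("light", ratio=3.0, throughput=50.0),
]

for mode in ("CP", "TP", "BM"):
    print(mode, pick(CANDIDATES, mode).name)
```

With these made-up values, CP selects the configuration with the highest ratio, TP the fastest one, and BM the one that gives up a little of each.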
