2602.17098v1 Feb 19, 2026 q-fin.PM

Deep Reinforcement Learning for Optimal Portfolio Allocation: A Comparative Study with Mean-Variance Optimization

Srijan Sood (Citations: 129, h-index: 6)

Kassiani Papasotiriou (Citations: 39, h-index: 2)

T. Balch (Citations: 1,759, h-index: 19)

M. Vaičiulis (Citations: 46, h-index: 4)

Abstract

Portfolio Management is the process of overseeing a group of investments, referred to as a portfolio, with the objective of achieving predetermined investment goals. Portfolio optimization is a key component that involves allocating the portfolio assets so as to maximize returns while minimizing risk taken. It is typically carried out by financial professionals who use a combination of quantitative techniques and investment expertise to make decisions about the portfolio allocation. Recent applications of Deep Reinforcement Learning (DRL) have shown promising results when used to optimize portfolio allocation by training model-free agents on historical market data. Many of these methods compare their results against basic benchmarks or other state-of-the-art DRL agents but often fail to compare their performance against traditional methods used by financial professionals in practical settings. One of the most commonly used methods for this task is Mean-Variance Portfolio Optimization (MVO), which uses historical time series information to estimate expected asset returns and covariances, which are then used to optimize for an investment objective. Our work is a thorough comparison between model-free DRL and MVO for optimal portfolio allocation. We detail the specifics of how to make DRL for portfolio optimization work in practice, also noting the adjustments needed for MVO. Backtest results demonstrate strong performance of the DRL agent across many metrics, including Sharpe ratio, maximum drawdowns, and absolute returns.
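As a rough illustration of the MVO baseline and the evaluation metrics named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes an unconstrained, closed-form maximum-Sharpe (tangency) portfolio, with weights proportional to the inverse covariance matrix times the excess mean returns. A practical setup would likely add constraints such as long-only weights. All function names, the zero risk-free rate, and the 252-periods-per-year annualization are assumptions made for this example.

```python
import math
from statistics import mean, stdev


def sample_mean_cov(returns):
    """Estimate per-asset mean returns and the sample covariance matrix.

    `returns` is a list of rows; each row holds one period's return per asset.
    """
    n, k = len(returns), len(returns[0])
    mu = [mean(row[j] for row in returns) for j in range(k)]
    cov = [
        [sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in returns) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]
    return mu, cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if `a` is singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def max_sharpe_weights(mu, cov, risk_free=0.0):
    """Unconstrained tangency portfolio, rescaled so the weights sum to 1.

    Assumption: no long-only or leverage constraints, so weights may be
    negative, and the rescaling is unstable if the raw weights sum near zero.
    """
    excess = [m - risk_free for m in mu]
    raw = solve(cov, excess)
    total = sum(raw)
    return [w / total for w in raw]


def sharpe_ratio(period_returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of per-period portfolio returns."""
    excess = [r - risk_free for r in period_returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


def max_drawdown(period_returns):
    """Largest peak-to-trough loss of the compounded portfolio value."""
    value, peak, worst = 1.0, 1.0, 0.0
    for r in period_returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = max(worst, 1.0 - value / peak)
    return worst
```

In a backtest, `sample_mean_cov` would be applied to a lookback window of historical returns, `max_sharpe_weights` would produce the allocation for the next period, and `sharpe_ratio` and `max_drawdown` would be computed on the resulting out-of-sample portfolio returns.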

19 Citations

4 Influential

9.5 Altmetric

74.5 Score
