A Novel Online Portfolio Management Strategy Based on Dynamic Multi-step Loss Aversion Reward

doi:10.3969/j.issn.1005-3085.2024.04.006

Chinese Journal of Engineering Mathematics, 2024, Vol. 41, Issue 4: 677-692.

MA Cong¹, CHEN Yijun²

1. School of Economics and Management, Northwest University, Xi'an 710127
2. Xi'an Aeronautical University Library, Xi'an 710077

Received: 2024-04-16; Accepted: 2024-05-17; Online: 2024-08-15; Published: 2025-03-10
Corresponding author: CHEN Yijun. E-mail: 201907034@xaau.edu.cn
Supported by:
The National Natural Science Foundation of China (72301211); the Humanities and Social Science Project of the Ministry of Education of China (22XJCZH004); the Natural Science Basic Research Program of Shaanxi Province (2023-JC-QN-0799); the Scientific Research Project of Shaanxi Provincial Department of Education (22JK0186).

Abstract:

A well-designed reward function is crucial to the performance of deep reinforcement learning algorithms. For the portfolio management task, this study identifies and addresses two major flaws in existing reward functions: first, they overemphasize short-term market fluctuations and neglect long-term trends; second, they reward gains and penalize losses symmetrically, which is inconsistent with investors' loss aversion. Drawing on loss aversion theory from behavioral finance, the paper proposes a multi-step loss aversion (MSLA) reward function that more accurately captures investors' trading behavior, and builds an online portfolio management strategy on it. Portfolios are constructed from three representative A-share market indices and backtested on historical data from 2019 to 2023. The results show that the MSLA reward function significantly improves the overall performance of the strategy, which generally outperforms existing algorithms in terms of cumulative return, Sharpe ratio, and maximum drawdown. Moreover, the strategy is applicable to portfolios composed of stocks with different market capitalizations and remains robust in rising, falling, and range-bound markets, demonstrating its effectiveness and practicality for portfolio management.

Keywords: deep reinforcement learning, portfolio management, loss aversion theory, MSLA reward function
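
The abstract does not give the MSLA formula, so the following is only a minimal sketch of the two ideas it names: aggregating returns over several steps, and penalizing losses more heavily than equal-sized gains are rewarded. The function name `msla_reward`, the log-return aggregation, and the coefficient `loss_aversion=2.25` (a commonly cited prospect-theory estimate) are all assumptions for illustration, not the authors' definition. The three helper functions compute the evaluation metrics mentioned in the abstract using their standard textbook definitions.

```python
import numpy as np


def msla_reward(step_returns, loss_aversion=2.25):
    """Hypothetical multi-step, loss-averse reward (NOT the paper's formula).

    step_returns: simple portfolio returns over the next n steps.
    loss_aversion: assumed coefficient > 1 that amplifies net losses.
    """
    r = np.asarray(step_returns, dtype=float)
    # Aggregate log returns over several steps so the signal reflects
    # the multi-step trend rather than a single-step fluctuation.
    multi_step = float(np.sum(np.log1p(r)))
    # Asymmetry: a net loss is scaled up, a net gain is passed through.
    return multi_step if multi_step >= 0 else loss_aversion * multi_step


def cumulative_return(returns):
    """Compound simple returns into a total return."""
    r = np.asarray(returns, dtype=float)
    return float(np.prod(1.0 + r) - 1.0)


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized Sharpe ratio; risk_free is an annual rate."""
    excess = np.asarray(returns, dtype=float) - risk_free / periods_per_year
    std = np.std(excess, ddof=1)
    if std == 0:
        return float("nan")
    return float(np.mean(excess) / std * np.sqrt(periods_per_year))


def max_drawdown(returns):
    """Largest peak-to-trough decline of the wealth curve, as a fraction."""
    r = np.asarray(returns, dtype=float)
    # Start the wealth curve at 1.0 so an initial loss counts as a drawdown.
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + r)))
    peak = np.maximum.accumulate(wealth)
    return float(np.max(1.0 - wealth / peak))
```

In a reinforcement learning loop, a reward of this shape would replace the single-step return at each decision point; the window length and the coefficient would need tuning and are not specified in the abstract.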

CLC number: F830

MA Cong, CHEN Yijun. A Novel Online Portfolio Management Strategy Based on Dynamic Multi-step Loss Aversion Reward[J]. Chinese Journal of Engineering Mathematics, 2024, 41(4): 677-692.
