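Journal of the China Society for Industrial and Applied Mathematics
Supervised by: Ministry of Education of the People's Republic of China
Sponsored by: Xi'an Jiaotong University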
ISSN 1005-3085  CN 61-1269/O1

Chinese Journal of Engineering Mathematics ›› 2024, Vol. 41 ›› Issue (2): 232-244. doi: 10.3969/j.issn.1005-3085.2024.02.003

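Multi-stage Bayesian Reinforcement Learning Robust Portfolio Selection Model

LI Roujia,  DUAN Qihong,  FENG Zhuohang,  LIU Jia

  1. School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049
  • Received: 2021-04-30 Accepted: 2021-08-27 Online: 2024-04-15 Published: 2024-06-15
  • Corresponding author: LIU Jia. E-mail: jialiu@xjtu.edu.cn
  • Funding:
    National Key R&D Program of China (2022YFA1004000); National Natural Science Foundation of China (11991023; 12371324).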

Multi-stage Bayesian Reinforcement Learning Robust Portfolio Selection Model

LI Roujia,  DUAN Qihong,  FENG Zhuohang,  LIU Jia   

  1. School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049
  • Received: 2021-04-30 Accepted: 2021-08-27 Online: 2024-04-15 Published: 2024-06-15
  • Contact: J. Liu. E-mail address: jialiu@xjtu.edu.cn
  • Supported by:
    The National Key R&D Program of China (2022YFA1004000); the National Natural Science Foundation of China (11991023; 12371324).

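Abstract (translated from the Chinese):

In traditional multi-stage distributionally robust portfolio selection models, estimating the uncertainty set is a challenging problem. We use Bayesian reinforcement learning to dynamically update the model parameters of the uncertainty set, such as the first- and second-order moments, and then study how to solve the mean-worst-case robust CVaR model within the Bayesian reinforcement learning framework. By combining dynamic programming with the progressive hedging algorithm, we design a two-level decomposition solution framework. The lower level solves a series of second-order cone programs to obtain the optimal policy of each subproblem under given model parameters, and the upper level applies Bayes' formula to obtain an implementable, non-anticipative investment policy. Empirical results on the US stock market show that the multi-stage robust reinforcement learning portfolio selection model achieves better out-of-sample investment performance than traditional models.

Keywords: Bayesian reinforcement learning, robust risk measure, portfolio, second-order cone programming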

Abstract:

The estimation of uncertainty sets in traditional multi-stage distributionally robust portfolio selection models is a challenging problem. This paper applies the Bayesian reinforcement learning technique to dynamically update the first- and second-order moments in the uncertainty sets of a multi-stage distributionally robust model. We study the mean-worst-case robust CVaR model in the Bayesian reinforcement learning framework and propose a two-level decomposition solution framework that combines dynamic programming with the progressive hedging algorithm. The lower level finds the optimal policies of the sub-models under given model parameters by solving a series of second-order cone programming problems, while the upper level uses Bayes' rule to find an implementable policy that satisfies the non-anticipativity constraints. Numerical results on the US stock market illustrate the superior out-of-sample investment performance of the multi-stage Bayesian reinforcement learning robust portfolio selection model.

Key words: Bayesian reinforcement learning, robust risk measure, portfolio selection, second-order cone programming
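
The abstract does not state the exact form of the uncertainty set or of the Bayesian update, so the following is only a minimal single-period sketch under stated assumptions, not the authors' method. It assumes (i) a conjugate normal prior on the mean return with a known covariance, so the posterior mean is a pseudo-count-weighted average, and (ii) an uncertainty set containing every distribution with a given mean and covariance, for which the worst-case CVaR of the portfolio loss has a well-known closed form with a second-order cone structure. The function names `posterior_mean` and `worst_case_cvar` and the parameter `prior_strength` are hypothetical. The sketch omits the dynamic programming, progressive hedging, and multi-stage parts entirely.

```python
import numpy as np


def posterior_mean(prior_mean, prior_strength, returns):
    """Conjugate update of the mean return (assumed normal model, known covariance).

    prior_strength acts as a pseudo-count: the prior weighs as much as
    that many observed return vectors.
    """
    returns = np.atleast_2d(np.asarray(returns, dtype=float))
    n_obs = returns.shape[0]
    total = prior_strength * np.asarray(prior_mean, dtype=float) + returns.sum(axis=0)
    return total / (prior_strength + n_obs)


def worst_case_cvar(weights, mean, cov, beta=0.95):
    """Worst-case CVaR at confidence level beta of the loss -w'r.

    The supremum is taken over all return distributions with the given
    mean and covariance (an assumed moment-based uncertainty set):
        -w'mu + sqrt(beta / (1 - beta)) * sqrt(w' Sigma w)
    """
    weights = np.asarray(weights, dtype=float)
    kappa = np.sqrt(beta / (1.0 - beta))
    return -weights @ mean + kappa * np.sqrt(weights @ cov @ weights)


# Illustrative numbers only (not data from the paper).
prior = np.zeros(2)
observed = np.array([[0.02, 0.04], [0.02, 0.04]])
mu_post = posterior_mean(prior, prior_strength=2.0, returns=observed)
risk = worst_case_cvar(np.array([0.5, 0.5]), mu_post, np.diag([0.02, 0.02]), beta=0.8)
print(mu_post, risk)
```

Because the risk term is a Euclidean norm of a linear function of the weights, minimizing a mean-risk objective of this form over the weights is a second-order cone program, which is consistent with the lower-level subproblems the abstract describes.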