A well-designed reward function is crucial to the performance of deep reinforcement learning (DRL) algorithms. For portfolio management, this study identifies and addresses two major flaws in existing reward functions: first, they overemphasize short-term market fluctuations and neglect long-term trends; second, they reward gains and penalize losses symmetrically, which is inconsistent with investors' loss aversion. Drawing on loss aversion theory from behavioral finance, this paper proposes a multi-step loss aversion (MSLA) reward function that more accurately captures investors' trading behavior, and constructs an online portfolio management strategy based on it. Portfolios are built from three representative indices of the A-share market, and several backtesting experiments are conducted on historical data from 2019 to 2023. The experimental results show that the MSLA reward function significantly improves the overall performance of the portfolio strategy, which outperforms existing algorithms in terms of cumulative return, Sharpe ratio, and maximum drawdown. Furthermore, the proposed strategy is applicable to portfolios composed of stocks with different market capitalizations and maintains robust performance in rising, falling, and volatile market conditions, demonstrating its effectiveness and practicality in portfolio management.
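To make the two ideas behind the reward concrete, the following is a minimal illustrative sketch and not the formulation used in this paper. It assumes that the reward at step t is the log return of the portfolio value over the next `horizon` steps, and that negative returns are scaled by a coefficient `loss_aversion` greater than 1. The function name `msla_reward`, both parameters, and their default values are hypothetical; the default of 2.25 is the loss aversion estimate commonly cited from prospect theory, not a value taken from this study.

```python
import math


def msla_reward(portfolio_values, t, horizon=5, loss_aversion=2.25):
    """Hypothetical multi-step, loss-averse reward at time step t.

    portfolio_values: sequence of positive portfolio values over time.
    horizon:          number of future steps the reward looks ahead (assumed).
    loss_aversion:    multiplier applied to losses, > 1 (assumed).
    """
    # Look ahead up to `horizon` steps, clipped to the end of the series.
    end = min(t + horizon, len(portfolio_values) - 1)
    if end <= t:
        # No future value is available, so no reward signal is given.
        return 0.0

    # Multi-step log return: reflects the trend over the window rather
    # than a single-step fluctuation.
    multi_step_return = math.log(portfolio_values[end] / portfolio_values[t])

    # Asymmetric treatment: gains pass through unchanged, losses are amplified.
    if multi_step_return >= 0:
        return multi_step_return
    return loss_aversion * multi_step_return


if __name__ == "__main__":
    values = [100.0, 101.0, 99.0, 102.0, 104.0, 110.0]
    print(msla_reward(values, t=0, horizon=5))        # gain over five steps
    print(msla_reward([100.0, 90.0], t=0, horizon=1)) # amplified loss
```

Under these assumptions, a loss is penalized `loss_aversion` times as heavily as a gain of the same log-return magnitude, and a longer `horizon` reduces the influence of single-step noise on the reward.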