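Journal of the China Society for Industrial and Applied Mathematics
Supervised by: Ministry of Education of the People's Republic of China
Sponsored by: Xi'an Jiaotong University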
ISSN 1005-3085  CN 61-1269/O1

Chinese Journal of Engineering Mathematics ›› 2026, Vol. 42 ›› Issue (6): 1149-1170. doi: 10.3969/j.issn.1005-3085.2025.06.012 cstr: 32411.14.cjem.CN61-1269/O1.2025.06.012


基于正交投影的纵向数据高维部分线性模型的变量选择

杨宜平1,2,  秦仁宇1,  赵培信1   

    1. 重庆工商大学数学与统计学院,重庆 400067

    2. 重庆工商大学统计智能计算与监测重庆市重点实验室,重庆 400067

  • 收稿日期:2023-03-22 接受日期:2024-06-10 出版日期:2025-12-15 发布日期:2026-02-15
  • 基金资助:
    重庆市自然科学基金(cstc2021jcyj-msxmX0079);重庆市教委人文社科一般项目(21SIGH118).

Variable Selection for High-dimensional Partially Linear Models with Longitudinal Data Based on Orthogonal Projection

YANG Yiping1,2,  QIN Renyu1,  ZHAO Peixin1   

    1. School of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing, 400067

    2. Chongqing Key Laboratory of Statistical Intelligent Computing and Monitoring, Chongqing Technology and Business University, Chongqing, 400067

  • Received: 2023-03-22 Accepted: 2024-06-10 Online: 2025-12-15 Published: 2026-02-15
  • Supported by:
    The Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0079); the Education Commission Humanities and Social Sciences General Project of Chongqing (21SIGH118).

摘要:

考虑纵向数据下参数维数发散时部分线性模型的变量选择问题。先通过B-样条和QR分解消除非参数分量,再结合SCAD惩罚和二次推断函数构造部分线性模型中回归系数的惩罚目标函数,进而同时对回归系数进行变量选择且获得回归系数的估计。进一步,结合二次推断函数获得非参数分量的估计。在一些正则条件下,证明了回归系数估计和非参数分量估计的渐近性质。通过模拟研究表明所提出的方法无论纵向数据相关结构是否正确指定,所得到的估计效果都很好。最后,采用所提方法对房地产上市公司经营绩效的影响因素进行了分析。

关键词: 纵向数据, QR分解, 二次推断函数, SCAD惩罚, 变量选择

Abstract:

This paper considers variable selection for partially linear models with longitudinal data when the number of parameters diverges. The nonparametric component is first eliminated by means of a B-spline approximation and QR decomposition. A penalized objective function for the regression coefficients of the partially linear model is then constructed by combining the SCAD penalty with the quadratic inference function, so that variable selection and estimation of the regression coefficients are carried out simultaneously. The nonparametric component is subsequently estimated, again using the quadratic inference function. Under some regularity conditions, the asymptotic properties of the estimators of the regression coefficients and of the nonparametric component are established. Simulation studies show that the proposed method performs well whether or not the correlation structure of the longitudinal data is correctly specified. Finally, the proposed method is applied to analyze the factors that affect the operating performance of listed real estate companies.
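
The elimination step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes independent observations (no within-subject correlation), replaces the penalized quadratic inference function with ordinary least squares, and uses a truncated-power spline basis as a stand-in for B-splines (for the same knots and degree the two bases span the same space). The names `spline_basis` and `project_out`, the simulated data, and the knot placement are all hypothetical.

```python
import numpy as np

def spline_basis(t, knots, degree=3):
    """Truncated-power spline basis (stand-in for a B-spline basis)."""
    cols = [t ** d for d in range(degree + 1)]
    cols += [np.clip(t - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

def project_out(B, X, y):
    """Remove the spline-approximated component using the full QR of B."""
    n, K = B.shape
    Q, _ = np.linalg.qr(B, mode="complete")  # Q is an n x n orthogonal matrix
    Q2 = Q[:, K:]                            # columns orthogonal to span(B)
    return Q2.T @ X, Q2.T @ y, Q2

# Simulated partially linear model: y = X beta + g(t) + noise (assumed setup)
rng = np.random.default_rng(0)
n, p = 200, 5
t = rng.uniform(0.0, 1.0, n)
X = rng.normal(size=(n, p))
beta = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
y = X @ beta + np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=n)

B = spline_basis(t, knots=np.linspace(0.2, 0.8, 4))
X_star, y_star, Q2 = project_out(B, X, y)

# After projection the model is approximately linear in beta alone;
# unpenalized least squares is used here in place of SCAD-penalized QIF.
beta_hat, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
```

Because `Q2.T @ B` is zero up to rounding error, multiplying the model by `Q2.T` removes the spline part and leaves a purely parametric regression. In the paper's setting, the penalized quadratic inference function would then be applied to this transformed model.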

Key words: longitudinal data, QR decomposition, quadratic inference function, SCAD penalty, variable selection

CLC Number: