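Official journal of the China Society for Industrial and Applied Mathematics
Supervised by: Ministry of Education of the People's Republic of China
Sponsored by: Xi'an Jiaotong University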
ISSN 1005-3085  CN 61-1269/O1

Chinese Journal of Engineering Mathematics ›› 2023, Vol. 40 ›› Issue (3): 381-397. doi: 10.3969/j.issn.1005-3085.2023.03.004

Variable Selection in Mode Regression Models Using the Mixture Skew-normal Data

ZENG Xin, WU Liucang, JU Yuanyuan

  1. Faculty of Science, Kunming University of Science and Technology, Kunming 650093
  • Received: 2020-09-25 Accepted: 2021-05-31 Online: 2023-06-15 Published: 2023-08-15
  • Corresponding author: JU Yuanyuan. E-mail: jundeyy@126.com
  • Supported by:
    The National Natural Science Foundation of China (11861041); the Academic and Technology Innovation Foundation of Kunming University of Science and Technology (2020YB208).

Abstract:

Variable selection in finite mixture of regression (FMR) models is widely used in statistical modeling. Existing studies of FMR models are mainly based on the assumption that the regression errors are normally distributed, an assumption that is unsuitable for asymmetric data. For skewed data, the mode is more representative than the mean. This paper proposes a variable selection method for mixtures of mode regression models based on mixture skew-normal data, and proves the consistency of the variable selection method and the Oracle property of the parameter estimators. A modified EM (expectation-maximization) algorithm is developed to estimate the model parameters. Simulation studies and a real-data example further illustrate the effectiveness of the proposed model and variable selection method.

Key words: mixture of skew-normal data, mode regression models, variable selection, EM algorithm
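
The abstract's remark that the mode is more representative than the mean for skewed data can be illustrated numerically. The sketch below is not the authors' method: it only compares the mean and the mode of a single standard skew-normal density with shape parameter `alpha`. The function names and the search interval are assumptions chosen for illustration.

```python
import math

def skew_normal_pdf(x, alpha):
    """Standard skew-normal density: 2 * phi(x) * Phi(alpha * x)."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    # Phi(z) = 0.5 * erfc(-z / sqrt(2)) avoids cancellation in the tails.
    cdf = 0.5 * math.erfc(-alpha * x / math.sqrt(2.0))
    return 2.0 * phi * cdf

def skew_normal_mean(alpha):
    """Closed-form mean: delta * sqrt(2 / pi), delta = alpha / sqrt(1 + alpha^2)."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    return delta * math.sqrt(2.0 / math.pi)

def skew_normal_mode(alpha, lo=-5.0, hi=5.0, tol=1e-10):
    """Find the mode by golden-section search (the density is unimodal).

    The interval [lo, hi] is an assumption that suits moderate alpha values.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    while b - a > tol:
        if skew_normal_pdf(c, alpha) > skew_normal_pdf(d, alpha):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
    return 0.5 * (a + b)

if __name__ == "__main__":
    for alpha in (0.0, 2.0, 5.0):
        print(alpha, skew_normal_mean(alpha), skew_normal_mode(alpha))
```

For positive `alpha` the mean is pulled toward the long right tail and lies above the mode, which stays at the point of highest density. For `alpha = 0` the density is the standard normal and the two coincide.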

CLC Number: