Association Journal of CSIAM
Supervised by Ministry of Education of PRC
Sponsored by Xi'an Jiaotong University
ISSN 1005-3085  CN 61-1269/O1

Chinese Journal of Engineering Mathematics ›› 2015, Vol. 32 ›› Issue (5): 677-689. doi: 10.3969/j.issn.1005-3085.2015.05.006


Boosting Variable Selection Algorithm for Linear Regression Models

LI Yu (1), ZHANG Chun-xia (2), WANG Guan-wei (3)

  1. School of Economics and Management, Xinyang Normal University, Xinyang 464000
  2. School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049
  3. School of Mechatronic Engineering, Xi'an Technological University, Xi'an 710021
  • Received: 2014-02-26  Accepted: 2014-09-01  Online: 2015-10-15  Published: 2015-12-15
  • Supported by:
    The National Natural Science Foundation of China (11201367; 91230101); the National Basic Research Program 973 (2013CB329406); the Social Sciences Planning Project of Henan Province (2014BJJ069); the Key Project of Henan Education Committee (14B910001).

Abstract:

This paper proposes a novel Boosting learning method based on a genetic algorithm for variable selection in linear regression models. The algorithm first assigns equal weights to all training examples and adopts a traditional genetic algorithm as the base learner of Boosting. The training set, together with its weight distribution, is then passed to the genetic algorithm to perform variable selection. Next, the weight distribution is updated according to the quality of the preceding variable selection result. After these steps have been repeated multiple times, the individual results are fused through a weighted combination rule. The performance of the proposed Boosting method is examined on simulated and real-world data. The experimental results show that the method significantly improves the variable selection performance of the traditional genetic algorithm and accurately identifies the relevant variables. The proposed Boosting method can therefore be regarded as an effective technique for variable selection in linear regression models.
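
The loop described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract specifies neither the fitness function, the weight update, nor the combination rule, so the sketch assumes a weighted BIC-style fitness, an AdaBoost.R2-style weight update, and a vote weighted by log(1/beta). The names `weighted_fit`, `genetic_select`, and `boosted_selection`, and all parameter defaults, are hypothetical.

```python
import numpy as np

def weighted_fit(X, y, w, mask):
    """Weighted least squares on the selected columns.
    Returns a BIC-style score (lower is better) and the coefficients.
    The BIC-style fitness is an assumption, not taken from the paper."""
    n = len(y)
    sw = np.sqrt(w * n)  # rescale so the weights sum to n
    A = np.column_stack([np.ones(n), X[:, mask]])
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    rss = float(np.sum((sw * (y - A @ beta)) ** 2))
    score = n * np.log(rss / n + 1e-12) + (mask.sum() + 1) * np.log(n)
    return score, beta

def genetic_select(X, y, w, rng, pop_size=30, n_gen=25, p_mut=0.05):
    """Plain genetic algorithm over boolean inclusion masks (base learner)."""
    p = X.shape[1]
    pop = rng.random((pop_size, p)) < 0.5
    for _ in range(n_gen):
        scores = np.array([weighted_fit(X, y, w, ind)[0] for ind in pop])
        parents = pop[np.argsort(scores)[: pop_size // 2]]  # keep the best half
        n_child = pop_size - len(parents)
        ia = rng.integers(0, len(parents), n_child)
        ib = rng.integers(0, len(parents), n_child)
        cross = rng.random((n_child, p)) < 0.5               # uniform crossover
        children = np.where(cross, parents[ia], parents[ib])
        children ^= rng.random(children.shape) < p_mut       # bit-flip mutation
        pop = np.vstack([parents, children])
    scores = np.array([weighted_fit(X, y, w, ind)[0] for ind in pop])
    return pop[np.argmin(scores)]

def boosted_selection(X, y, n_rounds=10, seed=0):
    """Boosting wrapper: returns a weighted inclusion frequency per variable."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.full(n, 1.0 / n)          # equal initial weights
    votes, total = np.zeros(p), 0.0
    for _ in range(n_rounds):
        mask = genetic_select(X, y, w, rng)
        _, beta = weighted_fit(X, y, w, mask)
        resid = np.abs(y - np.column_stack([np.ones(n), X[:, mask]]) @ beta)
        loss = resid / (resid.max() + 1e-12)        # linear loss in [0, 1]
        eps = min(max(float(np.sum(w * loss)), 1e-6), 0.499)  # keep alpha finite and positive
        b = eps / (1.0 - eps)
        alpha = np.log(1.0 / b)      # assumed combination weight for this round
        votes += alpha * mask
        total += alpha
        w = w * b ** (1.0 - loss)    # assumed update: well-fitted points lose weight
        w /= w.sum()
    return votes / total
```

Calling `boosted_selection(X, y)` on an `n x p` design matrix returns a vector of length `p` with values between 0 and 1. One plausible reading of the fused result is to keep the variables whose value exceeds a threshold such as 0.5, but that threshold is also an assumption of this sketch.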

Key words: Boosting algorithm, variable selection, ensemble learning, genetic algorithm, diversity
