Association Journal of CSIAM
Supervised by Ministry of Education of PRC
Sponsored by Xi'an Jiaotong University
ISSN 1005-3085  CN 61-1269/O1

Chinese Journal of Engineering Mathematics, 2019, Vol. 36, Issue (4): 461-477. doi: 10.3969/j.issn.1005-3085.2019.04.009


RS-BART: a Novel Technique to Boost the Prediction Ability of Bayesian Additive Regression Trees

WANG Guan-wei [1], ZHANG Chun-xia [2], YIN Qing-yan [3]

  1. School of Mechatronic Engineering, Xi'an Technological University, Xi'an 710021
  2. School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049
  3. School of Science, Xi'an University of Architecture and Technology, Xi'an 710055
  • Received: 2017-08-01  Accepted: 2019-04-15  Online: 2019-08-15  Published: 2019-10-15
  • Supported by:
    The National Natural Science Foundation of China (11671317; 11601412); the Key Science and Technology Program of Shaanxi Province (2016GY-067); the Key Laboratory Program of Science and Technology Co-ordination and Innovation Project of Shaanxi Province (2014SZS20-K04).

Abstract: In supervised learning, the central task of any algorithm is to predict future data accurately. Bayesian additive regression trees (BART), a Bayesian version of the gradient boosting algorithm, have great potential to achieve high prediction accuracy. To the best of our knowledge, however, BART has received far less attention than random forests and boosting. A comprehensive overview of BART is therefore presented first to make the method easier to understand. Because BART may over-fit in high-dimensional settings, a novel technique called RS-BART is then developed to improve its performance. All variables are first sorted by their relative importance, and several low- or medium-dimensional BART models are then trained on the important variables. The predictions of these BART models are combined to give the final result. Experiments on simulated and real data show that RS-BART performs better than, or competitively with, state-of-the-art techniques including random forests, boosting and BART. RS-BART can thus be regarded as a competitive tool for real prediction tasks, especially high-dimensional but sparse ones.
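
The three steps named in the abstract (rank the variables, fit several lower-dimensional models on the important ones, combine their predictions) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every specific in it is an assumption made for illustration: absolute Pearson correlation stands in for the importance measure, a k-nearest-neighbour regressor stands in for a BART fit, the subsets are nested sets of top-ranked variables, and the predictions are combined by a plain average. All function names are hypothetical.

```python
import math
import random


def abs_correlation(xs, ys):
    """Absolute Pearson correlation (assumed stand-in for variable importance)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def rank_variables(X, y):
    """Return column indices sorted from most to least important."""
    p = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(p)]
    return sorted(range(p), key=lambda j: scores[j], reverse=True)


def knn_fit_predict(X_train, y_train, X_test, k=5):
    """Stand-in base learner (k-nearest neighbours); a BART fit would go here."""
    preds = []
    for q in X_test:
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(row, q)), target)
            for row, target in zip(X_train, y_train)
        )
        nearest = dists[:k]
        preds.append(sum(t for _, t in nearest) / len(nearest))
    return preds


def ranked_subset_ensemble(X_train, y_train, X_test, sizes,
                           fit_predict=knn_fit_predict):
    """Fit one base model per set of top-ranked variables, then average."""
    order = rank_variables(X_train, y_train)
    all_preds = []
    for size in sizes:
        cols = order[:size]  # assumed scheme: nested top-ranked subsets
        Xtr = [[row[j] for j in cols] for row in X_train]
        Xte = [[row[j] for j in cols] for row in X_test]
        all_preds.append(fit_predict(Xtr, y_train, Xte))
    # Plain average across models; the aggregation rule is an assumption.
    return [sum(col) / len(col) for col in zip(*all_preds)]


def make_sparse_data(n, p, seed):
    """Toy high-dimensional but sparse data: only columns 0 and 1 matter."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(n)]
    y = [3 * row[0] - 2 * row[1] + rng.gauss(0, 0.1) for row in X]
    return X, y
```

Calling `ranked_subset_ensemble(X_train, y_train, X_test, sizes=(1, 2, 3))` on data from `make_sparse_data` fits three small models and averages them. Passing a different `fit_predict` callable with the same signature is the intended way to swap in a real BART implementation.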

Key words: ensemble learning, Bayesian additive regression tree, prediction accuracy, random forest, Gibbs sampling
