Chinese Journal of Engineering Mathematics
WANG Ge-hua, WANG Pu-yu, ZHANG Hai
Abstract: In the digital age, large volumes of high-dimensional data are collected across many disciplines and fields. Transforming these data into a form that can be stored, analyzed, and used as a reference for solving practical problems is a major challenge. Distributed storage has emerged to address the storage problem: the data are partitioned across different machines without duplication. This raises a further question of how to design machine learning algorithms suited to data stored in this way. Regularization methods provide an effective tool for processing and analyzing massive high-dimensional data, but existing methods are suited only to single-machine processing. Motivated by the advantages of non-convex regularization for variable selection and feature extraction, we combine distributed storage with non-convex regularization and study variable selection for data held in distributed storage. The data are stored separately on multiple computers that can communicate with each other, and we propose a distributed MCP method. Based on the ADMM algorithm, the distributed MCP algorithm exchanges information between adjacent computers, performs variable selection on the full data, and is guaranteed to converge. Its variable selection result is the same as that of the non-distributed method. Finally, experimental results show that the proposed method is suitable for processing data in distributed storage.
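The abstract does not state the MCP formulas, so the following is only a minimal sketch of the standard minimax concave penalty and its scalar thresholding rule, which is the kind of coordinate-wise update an ADMM-based solver would typically apply. The function names `mcp_penalty` and `mcp_threshold`, the parameter names `lam` and `gamma`, and the unit step size are assumptions made for illustration and are not taken from the paper.

```python
import math


def mcp_penalty(t: float, lam: float, gamma: float) -> float:
    """Standard MCP value at t, with lam > 0 and gamma > 1 (assumed notation)."""
    a = abs(t)
    if a <= gamma * lam:
        # Concave region: the L1 term is progressively relaxed.
        return lam * a - a * a / (2.0 * gamma)
    # Flat region: large coefficients receive a constant penalty.
    return 0.5 * gamma * lam * lam


def mcp_threshold(z: float, lam: float, gamma: float) -> float:
    """Minimizer of 0.5 * (t - z)**2 + mcp_penalty(t, lam, gamma) over t.

    Assumes a unit step size; gamma > 1 keeps the scalar problem convex.
    """
    if lam <= 0.0 or gamma <= 1.0:
        raise ValueError("requires lam > 0 and gamma > 1")
    a = abs(z)
    if a <= lam:
        return 0.0  # small inputs are set exactly to zero (sparsity)
    if a <= gamma * lam:
        # Rescaled soft-thresholding: shrink by lam, then scale by gamma/(gamma-1).
        return math.copysign(gamma * (a - lam) / (gamma - 1.0), z)
    return z  # large inputs are left unshrunk


if __name__ == "__main__":
    for z in (0.5, 2.0, -2.0, 5.0):
        print(z, mcp_threshold(z, lam=1.0, gamma=3.0))
```

Unlike soft-thresholding for the L1 penalty, this rule leaves inputs with magnitude above `gamma * lam` untouched, which is one common way to describe why non-convex penalties such as MCP reduce the estimation bias on large coefficients.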
Key words: distributed, sparse, MCP, ADMM
CLC Number: O213; O236.2
WANG Ge-hua, WANG Pu-yu, ZHANG Hai. Distributed Variable Selection---MCP Regularization[J]. Chinese Journal of Engineering Mathematics, doi: 10.3969/j.issn.1005-3085.2021.03.001.
URL: http://jgsx-csiam.org.cn/EN/10.3969/j.issn.1005-3085.2021.03.001
http://jgsx-csiam.org.cn/EN/Y2021/V38/I3/301