The Open Automation and Control Systems Journal

2015, 7: 873-878
Published online 2015 August 31. DOI: 10.2174/1874444301507010873
Publisher ID: TOAUTOCJ-7-873

Parallel ID3 Algorithm Based on Granular Computing and Hadoop

Liu Ping, Wu Zhenggang, Zhou Hao, Yang Junping and Taorong Qiu
School of Information Engineering, Nanchang University, Nanchang 330031, China.

ABSTRACT

Large-scale data processing has become an active research topic, and efficiently extracting useful information from large volumes of data is an important research direction in data mining. In this paper, granular concepts for decision trees are first introduced, based on the idea of granular computing. Second, drawing on granular computing, an improvement and a parallelization of the ID3 algorithm are presented. Finally, the proposed algorithms are tested on two data sets, and the results indicate that classification accuracy is improved. Tests on a Hadoop platform demonstrate that the parallel algorithm can process massive data sets efficiently.

Keywords:

Data mining, granular computing, Hadoop, ID3, large data processing.
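
For orientation, the attribute-selection step of standard ID3, which the abstract's parallelization builds on, can be sketched in a map/reduce style. This is only a minimal illustration under stated assumptions: it implements textbook information gain, not the granular-computing improvement proposed in the paper, and it runs in a single Python process rather than on Hadoop. The function names (`map_counts`, `reduce_counts`, `information_gain`, `best_attribute`) and the toy records are hypothetical.

```python
from collections import Counter
from math import log2


def map_counts(records, attributes, label):
    """Map step (assumed design): emit ((attribute, value, class), 1) per record."""
    for rec in records:
        for attr in attributes:
            yield (attr, rec[attr], rec[label]), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each (attribute, value, class) key."""
    totals = Counter()
    for key, n in pairs:
        totals[key] += n
    return totals


def entropy(counts):
    """Shannon entropy (base 2) of a collection of class counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts)


def information_gain(totals, attr):
    """Textbook ID3 gain: H(class) - sum_v p(v) * H(class | attr = v)."""
    by_value = {}             # attribute value -> Counter of class labels
    class_totals = Counter()  # overall class distribution
    for (a, value, cls), n in totals.items():
        if a != attr:
            continue
        by_value.setdefault(value, Counter())[cls] += n
        class_totals[cls] += n
    n_total = sum(class_totals.values())
    conditional = sum(
        sum(cc.values()) / n_total * entropy(cc.values())
        for cc in by_value.values()
    )
    return entropy(class_totals.values()) - conditional


def best_attribute(records, attributes, label):
    """Pick the split attribute with the highest information gain."""
    totals = reduce_counts(map_counts(records, attributes, label))
    return max(attributes, key=lambda a: information_gain(totals, a))


# Toy data (hypothetical): "outlook" determines the class, "windy" does not.
records = [
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
]
totals = reduce_counts(map_counts(records, ["outlook", "windy"], "play"))
chosen = best_attribute(records, ["outlook", "windy"], "play")
```

Because the counts for each key are simple sums, the map step can run independently on separate data partitions and the reduce step can merge the partial counts, which is the property a Hadoop implementation would rely on.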