Title: Design and implementation of parallel statiatical algorithm based on Hadoop's MapReduce model
Abstract: The rapid growth of data promotes the development of parallel computing. MapReduce, which is a simplified programming model of distributed parallel computing, is becoming more and more popular. In this paper, we design and implementation of parallel statistical algorithm based on Hadoop's MapReduce model. The algorithm, which is used to grasp the overall characteristics of massive data, involves the calculation of central tendency, dispersion and distribution tendency. By experiment, we come to the conclusion that the algorithm is suitable for dealing with large-scale data.
Publication Year: 2011
Publication Date: 2011-09-01
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 5
AI Researcher Chatbot
Get quick answers to your questions about the article from our AI researcher chatbot