Title: Scalable sequential pattern mining based on PrefixSpan for high dimensional data
Abstract: The data explosion makes it increasingly difficult to extract insight from data. In real-world applications of pattern recognition, the difficulty stems not only from the large size of the data but also from its high dimensionality. Data analysis demands that large and complex data be processed quickly and optimally to support decision making. This study proposes a scalable approach to sequential pattern extraction, using PrefixSpan implemented on the Spark platform as a distributed system, to gain more insight from the data. The goal is to handle growth in the amount of data (scalability) for complex, high-dimensional data effectively and with relatively fast performance. The experiments show that this method can make full use of cluster computing resources to accelerate the mining process, reducing the time needed to scan the database and build projected databases as the number of workers on the Spark platform increases.
Publication Year: 2016
Publication Date: 2016-10-01
Language: en
Type: article
Indexed In: ['crossref']