Title: Traversal optimizations and analysis for large graph clustering
Abstract: Graph is an extremely versatile data structure in terms of its expressiveness and flexibility to model a range of real life phenomenon, such as social, biological, sensor, and computer networks. Finding groups of vertices based on their similarity is the fundamental graph mining task to get useful insights. The existing methods suffer from scalability issues due to enormous computations of an exact similarity estimation. Therefore, we introduce Collaborative Similarity Measure (CSM) based on shortest path strategy, instead of all paths, to define structural and semantic relevance among vertices efficiently. We evaluate this measure for personalized email community detection as an application scenario. However, an abundance of structural information has resulted in non-trivial graph traversals. Shortcut construction is among the utilized techniques implemented for efficient shortest path (SP) traversals on graphs. The shortcut construction, being a computationally intensive task, required to be exclusive and offline, often produces unnecessary auxiliary data. To overcome this issue, we present Shortest Path Overlapped Region (SPORE), a performance-based initiative that improves the shortcut construction performance by exploiting SP overlapped regions. Path overlapping with empirical analysis has been overlooked by shortcut construction systems. SPORE avails this opportunity and provides a solution by constructing auxiliary shortcuts incrementally, using SP trees during traversals, instead of an exclusive step. SPORE is exposed to a graph clustering task, which requires extensive graph traversals to group similar vertices together, for realistic implications. We further suggest an optimization strategy to accelerate the performance of the clustering process using confined subgraph traversals. Leveraging the SPORE with multiple SP computations consistently reduces the latency of the entire clustering process. A parameter-free graph clustering with scalable graph traversal strategy for a billion scale graph remain an open issue.
Publication Year: 2015
Publication Date: 2015-04-01
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 1
AI Researcher Chatbot
Get quick answers to your questions about the article from our AI researcher chatbot