Publication view

Multiresolution community detection for mega-scale networks by information-based replica correlations (2008)

Abstract
We use a Potts model community detection algorithm to accurately and quantitatively evaluate the hierarchical or multiresolution structure of a graph. By calculating the correlations among multiple copies ("replicas") of the same graph over a range of resolutions, multiresolution structures manifest themselves as strong correlations between the individual replica solutions. The average Normalized Mutual Information, the Variation of Information, and other measures in principle give a quantitative estimate of the "best" resolutions and indicate the relative strength of the structures in the graph. Because the method is based on information comparisons, it can in principle be used with any community detection model that can examine multiple resolutions. As a local measure, our Potts model avoids the "resolution limit" that affects other popular models. With this model, our community detection algorithm has an accuracy that ranks among the best of currently available methods. Using it, we can examine graphs with over 40 million nodes and more than one billion edges. We further report that the multiresolution variant of our algorithm can accurately solve systems of at least 200,000 nodes and 10 million edges on a single processor. For typical cases, we find a super-linear scaling, $O(L^{1.3})$ for community detection and $O(L^{1.3}\log N)$ for the multiresolution algorithm, where $L$ is the number of edges and $N$ is the number of nodes in the system.
Comment: 14 pages, 10 figures, references added and minor text changes
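
The replica comparison described in the abstract can be illustrated with a minimal sketch. It assumes each replica solution is given as a list of community labels, one per node, and it uses the standard textbook definitions NMI = 2 I(A,B) / (H(A) + H(B)) and VI = H(A) + H(B) - 2 I(A,B); the abstract does not state which normalization the authors use, so that choice is an assumption. The function names (`entropy`, `mutual_information`, `nmi`, `vi`, `replica_averages`) are hypothetical and not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(labels):
    """Shannon entropy (in bits) of a partition given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in bits) between two partitions of the same nodes."""
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    joint = Counter(zip(a, b))  # nodes shared by community x in a and y in b
    return sum(
        (n_ab / n) * math.log2(n_ab * n / (count_a[x] * count_b[y]))
        for (x, y), n_ab in joint.items()
    )


def nmi(a, b):
    """Normalized mutual information (assumed normalization: 2I / (H_a + H_b))."""
    h = entropy(a) + entropy(b)
    if h == 0:
        return 1.0  # both partitions are a single community, hence identical
    return 2 * mutual_information(a, b) / h


def vi(a, b):
    """Variation of information: H_a + H_b - 2I."""
    return entropy(a) + entropy(b) - 2 * mutual_information(a, b)


def replica_averages(replicas):
    """Average NMI and VI over all distinct pairs of replica solutions."""
    pairs = list(combinations(replicas, 2))
    mean_nmi = sum(nmi(a, b) for a, b in pairs) / len(pairs)
    mean_vi = sum(vi(a, b) for a, b in pairs) / len(pairs)
    return mean_nmi, mean_vi
```

Under these assumptions, one would call `replica_averages` once per resolution value and look for resolutions where the average NMI is high and the average VI is low, since strongly correlated replicas suggest a well-defined structure at that scale. The pairwise loop is quadratic in the number of replicas, which is acceptable for the handful of replicas such a scan would typically use.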

Publication details
Download http://arxiv.org/abs/0812.1072
Archive arXiv (United States)
Keywords Physics - Physics and Society, Condensed Matter - Statistical Mechanics, Physics - Data Analysis, Statistics and Probability
Type text