Foster, Foster (2007)

Link and subgraph likelihoods in random undirected networks with fixed and partially fixed degree sequences

Summary

Given a "real world" network, how do we know if some particular property is the result of the network's history, or of some functional constraint? In other words, how do we know the property is not simply typical of such networks? Null models are essential tools for making this discrimination. A basic hypothesis of contemporary network research is that the degree sequence of the network generates much of its global and local structure. Hence the simplest null models capturing basic topological features of real-world networks are network ensembles of fixed degree sequence (FDS). Unfortunately, the FDS ensemble is very difficult to study analytically, despite decades of effort by combinatorialists. Consequently, most work in FDS is done numerically, using Monte Carlo techniques for sampling. In this paper we review these techniques, and propose an analytic approximation scheme for estimating local topological features of the ensemble (link likelihood, motif likelihood). We make these calculations by working in an ensemble of partially-fixed degree sequence (PFDS), which can be solved exactly. We find that PFDS ensembles, fixing only the degrees of the nodes under consideration, yield estimates comparing favorably to the results of numerical sampling for both undirected and directed networks. Calculations in the PFDS ensemble are easy to understand: the partition function can be diagrammatically expanded into a sum of sub-ensembles possessing particular subgraphs. As might be expected from this diagrammatic expansion, the PFDS approximation is useful in the study of motifs, and current work indicates that the approximation becomes better as the size of the network increases. By making a further simplifying approximation, we can calculate analytic estimates of the scaling of mean neighbor degree with degree. These compare quite favorably to Monte Carlo estimates of this scaling.

Abstract

The simplest null models for networks, used to distinguish significant features of a particular network from a priori expected features, are random ensembles with the degree sequence fixed by the specific network of interest. These "fixed degree sequence" (FDS) ensembles are, however, famously resistant to analytic attack. In this paper we introduce ensembles with partially-fixed degree sequences (PFDS) and compare analytic results obtained for them with Monte Carlo results for the FDS ensemble. These results include link likelihoods, subgraph likelihoods, and degree correlations. We find that local structural features in the FDS ensemble can be reasonably well estimated by simultaneously fixing only the degrees of few nodes, in addition to the total number of nodes and links. As test cases we use a food web, two protein interaction networks (E. coliS. cerevisiae), the internet on the autonomous system (AS) level, and the World Wide Web. Fixing just the degrees of two nodes gives the mean neighbor degree as a function of node degree, <k?>k, in agreement with results explicitly obtained from rewiring. For power law degree distributions, we derive the disassortativity analytically. In the PFDS ensemble the partition function can be expanded diagrammatically. We obtain an explicit expression for the link likelihood to lowest order, which reduces in the limit of large, sparse undirected networks with L links and with kmax ? L to the simple formula P(k,k?)= kk?/(2L+kk?). In a similar limit, the probability for three nodes to be linked into a triangle reduces to the factorized expression P(k1,k2,k3)= P(k1,k2)P(k1,k3)P(k2,k3).

BibTeX

@article{foster2007las,
title={{Link and subgraph likelihoods in random undirected networks with fixed and partially fixed degree sequences}},
author={Foster, J.G. and Foster, D.V. and Grassberger, P. and Paczuski, M.},
journal={Physical Review E},
volume={76},
number={4},
pages={46112},
year={2007},
publisher={APS}
}