Barabási–Albert model

Scale-free network generation algorithm
Display of three graphs generated with the Barabási–Albert (BA) model. Each has 20 nodes and an attachment parameter m as specified. The color of each node depends on its degree (same scale for each graph).

The Barabási–Albert (BA) model is an algorithm for generating random scale-free networks using a preferential attachment mechanism. Several natural and human-made systems, including the Internet, the World Wide Web, citation networks, and some social networks, are thought to be approximately scale-free, and they certainly contain a few nodes (called hubs) with unusually high degree compared to the other nodes of the network. The BA model tries to explain the existence of such nodes in real networks. The algorithm is named for its inventors, Albert-László Barabási and Réka Albert.

Concepts

Many observed networks (at least approximately) fall into the class of scale-free networks, meaning that they have power-law (or scale-free) degree distributions, while random graph models such as the Erdős–Rényi (ER) model and the Watts–Strogatz (WS) model do not exhibit power laws. The Barabási–Albert model is one of several proposed models that generate scale-free networks. It incorporates two important general concepts: growth and preferential attachment. Both growth and preferential attachment exist widely in real networks.

Growth means that the number of nodes in the network increases over time.

Preferential attachment means that the more connected a node is, the more likely it is to receive new links. Nodes with a higher degree have a stronger ability to grab links added to the network. Intuitively, the preferential attachment can be understood if we think in terms of social networks connecting people. Here a link from A to B means that person A "knows" or "is acquainted with" person B. Heavily linked nodes represent well-known people with lots of relations. When a newcomer enters the community, they are more likely to become acquainted with one of those more visible people rather than with a relative unknown. The BA model was proposed by assuming that in the World Wide Web, new pages link preferentially to hubs, i.e. very well known sites such as Google, rather than to pages that hardly anyone knows. If someone selects a new page to link to by randomly choosing an existing link, the probability of selecting a particular page would be proportional to its degree. The BA model claims that this explains the preferential attachment probability rule.

The later Bianconi–Barabási model refines this rule by introducing a "fitness" parameter, so that a node's ability to attract links depends on its fitness as well as its degree. Preferential attachment is an example of a positive feedback cycle in which initially random variations (one node initially having more links, or having started accumulating links earlier, than another) are automatically reinforced, thus greatly magnifying differences. This is also sometimes called the Matthew effect, "the rich get richer". See also autocatalysis.

Algorithm

The steps of the growth of the network according to the Barabási–Albert model ($m_0 = m = 2$).

The only parameter in the BA model is $m$, a positive integer. The network is initialized with $m_0 \geq m$ nodes.

At each step, add one new node, then sample $m$ neighbors among the existing vertices of the network, each with a probability proportional to the number of links that the existing node already has. (The original papers did not specify how to handle cases where the same existing node is chosen multiple times.) Formally, the probability $p_i$ that the new node is connected to node $i$ is

$$p_i = \frac{k_i}{\sum_j k_j},$$

where $k_i$ is the degree of node $i$ and the sum runs over all pre-existing nodes $j$ (i.e. the denominator equals twice the current number of edges in the network). This step can be performed by first sampling one edge uniformly, then sampling one of the two vertices on that edge.

Heavily linked nodes ("hubs") tend to quickly accumulate even more links, while nodes with only a few links are unlikely to be chosen as the destination for a new link. The new nodes have a "preference" to attach themselves to the already heavily linked nodes.
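The growth procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `barabasi_albert`, the star-shaped seed network on $m + 1$ nodes, and the policy of redrawing until $m$ distinct targets are found are all choices made here, since the original papers leave these details open. Redrawing duplicates makes the selection only approximately proportional to degree within a single step.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow a BA-style graph with n nodes; return a list of undirected edges."""
    if m < 1 or n <= m:
        raise ValueError("need m >= 1 and n > m")
    rng = random.Random(seed)
    # Assumption: the seed network is a star on m + 1 nodes (node 0 is the hub),
    # so that every initial node already has nonzero degree.
    edges = [(0, j) for j in range(1, m + 1)]
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list selects node i with probability k_i / sum_j k_j.
    # This is the "pick a random edge, then one of its ends" trick.
    endpoints = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        # Assumption: repeated picks are redrawn until m distinct targets exist.
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges
```

With this seed network the result always has $m(n - m)$ edges; for example, `barabasi_albert(200, 2, seed=1)` returns 396 edges.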

A tree network generated according to the Barabási–Albert model. The network is made of 50 vertices with initial degrees $m_0 = 1$.

Properties

Degree distribution

The distribution of the vertex degrees of a BA graph with 200000 nodes and 2 new edges per step. Plotted in log-log scale. It follows a power law with exponent -2.78.

The degree distribution resulting from the BA model is scale-free; in particular, it is a power law of the form

$$P(k) \sim k^{-3}$$
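The exponent 3 can be motivated by a standard mean-field (continuum) heuristic, sketched here under simplifying assumptions: degrees are treated as continuous, the initial $m_0$ nodes are ignored, and $t_i$ denotes the time at which node $i$ joins with degree $m$. At time $t$ the total degree is approximately $2mt$, so the attachment rule gives

$$\frac{dk_i}{dt} = m\,\frac{k_i}{2mt} = \frac{k_i}{2t}, \qquad k_i(t_i) = m \quad\Longrightarrow\quad k_i(t) = m\left(\frac{t}{t_i}\right)^{1/2}.$$

If birth times are uniformly distributed over $(0, t]$, then

$$\Pr\left[k_i(t) < k\right] = \Pr\left[t_i > \frac{m^2 t}{k^2}\right] = 1 - \frac{m^2}{k^2}, \qquad P(k) = \frac{d}{dk}\left(1 - \frac{m^2}{k^2}\right) = \frac{2m^2}{k^3}.$$

The prefactor depends on the approximation used, but the $k^{-3}$ tail agrees with the power law stated above.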

Hirsch index distribution

The h-index or Hirsch index distribution was shown to also be scale-free and was proposed as the lobby index, to be used as a centrality measure:

$$H(k) \sim k^{-6}$$

Furthermore, an analytic result for the density of nodes with h-index 1 can be obtained in the case where $m_0 = 1$:

$$H(1)\Big|_{m_0=1} = 4 - \pi$$

Node degree correlations

Correlations between the degrees of connected nodes develop spontaneously in the BA model because of the way the network evolves. The probability, $n_{k\ell}$, of finding a link that connects a node of degree $k$ to an ancestor node of degree $\ell$ in the BA model for the special case of $m = 1$ (BA tree) is given by

$$n_{k\ell} = \frac{4(\ell-1)}{k(k+1)(k+\ell)(k+\ell+1)(k+\ell+2)} + \frac{12(\ell-1)}{k(k+\ell-1)(k+\ell)(k+\ell+1)(k+\ell+2)}.$$

This confirms the existence of degree correlations, because if the distributions were uncorrelated, we would get $n_{k\ell} = k^{-3}\ell^{-3}$.
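To see the difference between the two expressions numerically, the formula for the BA tree can be evaluated with exact rational arithmetic. The sketch below is only an illustration; the function names `n_kl` and `uncorrelated` are hypothetical and chosen here for convenience.

```python
from fractions import Fraction


def n_kl(k, l):
    """Joint probability n_{kl} for the BA tree (m = 1), as an exact fraction."""
    s = k + l
    tail = s * (s + 1) * (s + 2)
    first = Fraction(4 * (l - 1), k * (k + 1) * tail)
    second = Fraction(12 * (l - 1), k * (s - 1) * tail)
    return first + second


def uncorrelated(k, l):
    """Product form k^-3 * l^-3 quoted in the text for uncorrelated degrees."""
    return Fraction(1, k**3 * l**3)
```

For instance, `n_kl(1, 2)` evaluates to 2/15 while `uncorrelated(1, 2)` is 1/8, and `n_kl(k, 1)` vanishes for every `k` because of the factor $(\ell - 1)$.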

For general $m$, the fraction of links that connect a node of degree $k$ to a node of degree $\ell$ is

$$p(k,\ell) = \frac{2m(m+1)}{k(k+1)\ell(\ell+1)} \left[ 1 - \frac{\binom{2m+2}{m+1}\binom{k+\ell-2m}{\ell-m}}{\binom{k+\ell+2}{\ell+1}} \right].$$

Also, the nearest-neighbor degree distribution $p(\ell \mid k)$, that is, the degree distribution of the neighbors of a node with degree $k$, is given by

$$p(\ell \mid k) = \frac{m(k+2)}{k\ell(\ell+1)} \left[ 1 - \frac{\binom{2m+2}{m+1}\binom{k+\ell-2m}{\ell-m}}{\binom{k+\ell+2}{\ell+1}} \right].$$

In other words, if we select a node with degree $k$, and then select one of its neighbors randomly, the probability that this randomly selected neighbor will have degree $\ell$ is given by the expression $p(\ell \mid k)$ above.
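The two expressions share the same bracketed binomial factor, which makes them straightforward to evaluate. The following sketch assumes the formulas involve the binomial coefficients $\binom{2m+2}{m+1}$, $\binom{k+\ell-2m}{\ell-m}$ and $\binom{k+\ell+2}{\ell+1}$, and it is valid only for $k, \ell \geq m$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def _bracket(k, l, m):
    """Shared factor 1 - C(2m+2, m+1) * C(k+l-2m, l-m) / C(k+l+2, l+1)."""
    ratio = Fraction(
        comb(2 * m + 2, m + 1) * comb(k + l - 2 * m, l - m),
        comb(k + l + 2, l + 1),
    )
    return 1 - ratio


def joint(k, l, m):
    """p(k, l): fraction of links joining degrees k and l (needs k, l >= m)."""
    prefactor = Fraction(2 * m * (m + 1), k * (k + 1) * l * (l + 1))
    return prefactor * _bracket(k, l, m)


def neighbor(l, k, m):
    """p(l | k): probability that a neighbor of a degree-k node has degree l."""
    prefactor = Fraction(m * (k + 2), k * l * (l + 1))
    return prefactor * _bracket(k, l, m)
```

As a sanity check, for $m = 1$ and $k = 1$ the values `neighbor(l, 1, 1)` sum to 1 over all $\ell \geq 1$ (the partial sums approach 1 slowly, with a tail of order $3/L$), and `joint` is symmetric in its two degree arguments.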

Clustering coefficient

An analytical result for the clustering coefficient of the BA model was obtained by Klemm and Eguíluz and proven by Bollobás. A mean-field approach to study the clustering coefficient was applied by Fronczak, Fronczak and Hołyst.

In the BA model the clustering coefficient depends on the system size. This behavior is distinct from that of small-world networks, where clustering is independent of system size. In the case of hierarchical networks, clustering as a function of node degree also follows a power law,

$$C(k) = k^{-1}.$$

This result was obtained analytically by Dorogovtsev, Goltsev and Mendes.

Spectral properties

The spectral density of the BA model has a different shape from the semicircular spectral density of a random graph. It has a triangle-like shape, with the top lying well above the semicircle and edges decaying as a power law. Preciado and Rahimian (2017, Section 5.1) proved that the shape of this spectral density is not an exact triangular function by analyzing the moments of the spectral density as a function of the power-law exponent.

Dynamic scaling

Generalized degree distribution $F(q,t)$ of the BA model for $m = 1$.
The same data plotted in the self-similar coordinates $t^{1/2} F(q,t)$ and $q/t^{1/2}$ give an excellent collapse, revealing that $F(q,t)$ exhibits dynamic scaling.

By definition, the BA model describes a time-developing phenomenon; hence, besides its scale-free property, one can also look for a dynamic scaling property. In the BA network, nodes can also be characterized by a generalized degree $q$, the product of the square root of the birth time of each node and its degree $k$, instead of the degree $k$ alone, since the time of birth matters in the BA network. The generalized degree distribution $F(q,t)$ has some non-trivial features and exhibits dynamic scaling:

$$F(q,t) \sim t^{-1/2} \phi(q/t^{1/2}).$$

This implies that the distinct plots of $F(q,t)$ versus $q$ would collapse into a universal curve if one plots $F(q,t)\,t^{1/2}$ versus $q/t^{1/2}$.
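A minimal simulation sketch of the generalized degree for the BA tree ($m = 1$) is given below. It rests on assumptions made here for illustration: the network starts from a single edge between nodes 0 and 1, node $i$ is assigned birth time $t_i = i + 1$, and the function name `generalized_degrees` is hypothetical.

```python
import math
import random


def generalized_degrees(n, seed=None):
    """Grow a BA tree (m = 1) with n nodes; return q_i = k_i * sqrt(t_i)."""
    if n < 2:
        raise ValueError("need at least two nodes")
    rng = random.Random(seed)
    degree = [1, 1]      # assumption: start from a single edge 0-1
    endpoints = [0, 1]   # each node listed once per incident edge
    for new in range(2, n):
        # A uniform draw from `endpoints` is a degree-proportional choice.
        target = rng.choice(endpoints)
        degree[target] += 1
        degree.append(1)
        endpoints.extend((new, target))
    # Assumption: node i is assigned birth time t_i = i + 1.
    return [k * math.sqrt(i + 1) for i, k in enumerate(degree)]
```

To probe the data collapse, one could histogram the returned values for several network sizes $t = n$ and plot $t^{1/2}$ times the histogram against $q/t^{1/2}$; the curves for different sizes would then be expected to overlap.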

Limiting cases

Model A

Model A retains growth but does not include preferential attachment. The probability of a new node connecting to any pre-existing node is equal. The resulting degree distribution in this limit is geometric, indicating that growth alone is not sufficient to produce a scale-free structure.
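The geometric form can be recovered from a short rate-equation argument, sketched here under the assumptions that each new node brings $m$ links, that the number of nodes at time $t$ is approximately $t$, and that multiple links to the same node within one step are neglected. Writing $N_k(t) = t\,p_k$ for the number of nodes of degree $k$, each existing node gains a link with probability approximately $m/t$ per step, so

$$N_k(t+1) - N_k(t) = \frac{m}{t}\left[N_{k-1}(t) - N_k(t)\right] + \delta_{k,m} \quad\Longrightarrow\quad p_k = m\,(p_{k-1} - p_k) + \delta_{k,m}.$$

Solving the recursion gives

$$p_m = \frac{1}{m+1}, \qquad p_k = \frac{m}{m+1}\,p_{k-1} \quad\Longrightarrow\quad p_k = \frac{1}{m+1}\left(\frac{m}{m+1}\right)^{k-m}, \quad k \geq m,$$

which decays exponentially in $k$ rather than as a power law.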

Model B

Model B retains preferential attachment but eliminates growth. The model begins with a fixed number of disconnected nodes and adds links, preferentially choosing high degree nodes as link destinations. Though the degree distribution early in the simulation looks scale-free, the distribution is not stable, and it eventually becomes nearly Gaussian as the network nears saturation. So preferential attachment alone is not sufficient to produce a scale-free structure.

The failure of models A and B to lead to a scale-free distribution indicates that growth and preferential attachment are needed simultaneously to reproduce the stationary power-law distribution observed in real networks.

Non-linear preferential attachment

Main article: Non-linear preferential attachment

The BA model can be thought of as a specific case of the more general non-linear preferential attachment (NLPA) model. The NLPA algorithm is identical to the BA model with the attachment probability replaced by the more general form

$$p_i = \frac{k_i^{\alpha}}{\sum_j k_j^{\alpha}},$$

where $\alpha$ is a constant positive exponent. If $\alpha = 1$, NLPA reduces to the BA model and is referred to as "linear". If $0 < \alpha < 1$, NLPA is referred to as "sub-linear" and the degree distribution of the network tends to a stretched exponential distribution. If $\alpha > 1$, NLPA is referred to as "super-linear" and a small number of nodes connect to almost all other nodes in the network. For both $\alpha < 1$ and $\alpha > 1$, the scale-free property of the network is broken in the limit of infinite system size. However, if $\alpha$ is only slightly larger than $1$, NLPA may result in degree distributions which appear to be transiently scale-free.
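The generalized attachment rule is easy to compute for a given degree sequence. The helper below is a small illustrative sketch; the name `attachment_probabilities` is hypothetical and not taken from any library.

```python
def attachment_probabilities(degrees, alpha=1.0):
    """Return p_i = k_i**alpha / sum_j k_j**alpha for a list of degrees."""
    weights = [k ** alpha for k in degrees]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one node must have nonzero degree")
    return [w / total for w in weights]
```

For degrees `[1, 2, 3]`, the linear case `alpha=1` gives probabilities 1/6, 2/6 and 3/6, while the super-linear case `alpha=2` gives 1/14, 4/14 and 9/14, shifting weight toward the highest-degree node.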

History

Preferential attachment made its first appearance in 1923 in the celebrated urn model of the Hungarian mathematician György Pólya. The master equation method, which yields a more transparent derivation, was applied to the problem by Herbert A. Simon in 1955 in the course of studies of the sizes of cities and other phenomena. It was first applied to explain citation frequencies by Derek de Solla Price in 1976. Price was interested in the accumulation of citations of scientific papers, and the Price model used "cumulative advantage" (his name for preferential attachment) to generate a fat-tailed distribution. In the language of modern citation networks, Price's model produces a directed network, i.e. the directed version of the Barabási–Albert model. The name "preferential attachment" and the present popularity of scale-free network models are due to the work of Albert-László Barabási and Réka Albert, who discovered that a similar process is present in real networks and, in 1999, applied preferential attachment to explain the numerically observed degree distributions on the web.

References

  1. Albert, Réka; Barabási, Albert-László (2002). "Statistical mechanics of complex networks" (PDF). Reviews of Modern Physics. 74 (1): 47–97. arXiv:cond-mat/0106096. Bibcode:2002RvMP...74...47A. CiteSeerX 10.1.1.242.4753. doi:10.1103/RevModPhys.74.47. S2CID 60545. Archived from the original (PDF) on 2015-08-24.
  2. Korn, A.; Schubert, A.; Telcs, A. (2009). "Lobby index in networks". Physica A. 388 (11): 2221–2226. arXiv:0809.0514. Bibcode:2009PhyA..388.2221K. doi:10.1016/j.physa.2009.02.013. S2CID 1119190.
  3. Fotouhi, Babak; Rabbat, Michael (2013). "Degree correlation in scale-free graphs". The European Physical Journal B. 86 (12): 510. arXiv:1308.5169. Bibcode:2013EPJB...86..510F. doi:10.1140/epjb/e2013-40920-6. S2CID 7520124.
  4. Klemm, K.; Eguíluz, V. C. (2002). "Growing scale-free networks with small-world behavior". Physical Review E. 65 (5): 057102. arXiv:cond-mat/0107607. Bibcode:2002PhRvE..65e7102K. doi:10.1103/PhysRevE.65.057102. hdl:10261/15314. PMID 12059755. S2CID 12945422.
  5. Bollobás, B. (2003). "Mathematical results on scale-free random graphs". Handbook of Graphs and Networks. pp. 1–37. CiteSeerX 10.1.1.176.6988.
  6. Fronczak, Agata; Fronczak, Piotr; Hołyst, Janusz A (2003). "Mean-field theory for clustering coefficients in Barabasi-Albert networks". Phys. Rev. E. 68 (4): 046126. arXiv:cond-mat/0306255. Bibcode:2003PhRvE..68d6126F. doi:10.1103/PhysRevE.68.046126. PMID 14683021. S2CID 2372695.
  7. Dorogovtsev, S.N.; Goltsev, A.V.; Mendes, J.F.F. (25 June 2002). "Pseudofractal scale-free web". Physical Review E. 65 (6): 066122. arXiv:cond-mat/0112143. Bibcode:2002PhRvE..65f6122D. doi:10.1103/PhysRevE.65.066122. PMID 12188798. S2CID 13996254.
  8. Farkas, I.J.; Derényi, I.; Barabási, A.-L.; Vicsek, T. (20 July 2001). "Spectra of "real-world" graphs: Beyond the semicircle law". Physical Review E. 64 (2): 026704. arXiv:cond-mat/0102335. Bibcode:2001PhRvE..64b6704F. doi:10.1103/PhysRevE.64.026704. hdl:2047/d20000692. PMID 11497741. S2CID 1434432.
  9. Preciado, V.M.; Rahimian, A. (December 2017). "Moment-Based Spectral Analysis of Random Graphs with a Given Expected Degree Sequence". IEEE Transactions on Network Science and Engineering. 4 (4): 215–228. arXiv:1512.03489. doi:10.1109/TNSE.2017.2712064. S2CID 12187100.
  10. Hassan, M. K.; Hassan, M. Z.; Pavel, N. I. (2011). "Dynamic scaling, data-collapse and self-similarity in Barabási–Albert networks". Journal of Physics A: Mathematical and Theoretical. 44 (17): 175101. https://dx.doi.org/10.1088/1751-8113/44/17/175101
  11. Pekoz, Erol; Rollin, A.; Ross, N. (2012). "Total variation and local limit error bounds for geometric approximation". Bernoulli. Archived from the original on 2015-09-23. Retrieved 2012-10-25.
  12. Krapivsky, P. L.; Redner, S.; Leyvraz, F. (20 November 2000). "Connectivity of Growing Random Networks". Physical Review Letters. 85 (21): 4629–4632. arXiv:cond-mat/0005139. Bibcode:2000PhRvL..85.4629K. doi:10.1103/PhysRevLett.85.4629. PMID 11082613. S2CID 16251662.
  13. Krapivsky, Paul; Krioukov, Dmitri (21 August 2008). "Scale-free networks as preasymptotic regimes of superlinear preferential attachment". Physical Review E. 78 (2): 026114. arXiv:0804.1366. Bibcode:2008PhRvE..78b6114K. doi:10.1103/PhysRevE.78.026114. PMID 18850904. S2CID 14292535.
  14. Barabási, Albert-László (2012). "Luck or reason". Nature. 489 (7417): 507–508. doi:10.1038/nature11486. PMID 22972190. S2CID 205230706.
  15. Simon, Herbert A. (December 1955). "On a Class of Skew Distribution Functions". Biometrika. 42 (3–4): 425–440. doi:10.1093/biomet/42.3-4.425.
  16. Price, D.J. de Solla (September 1976). "A general theory of bibliometric and other cumulative advantage processes". Journal of the American Society for Information Science. 27 (5): 292–306. CiteSeerX 10.1.1.161.114. doi:10.1002/asi.4630270505. S2CID 8536863.
  17. Barabási, Albert-László; Albert, Réka (October 1999). "Emergence of scaling in random networks" (PDF). Science. 286 (5439): 509–512. arXiv:cond-mat/9910332. Bibcode:1999Sci...286..509B. doi:10.1126/science.286.5439.509. PMID 10521342. S2CID 524106. Archived from the original (PDF) on 2012-04-17.
