Network depth: a demo

This is a small demo related to our recent work (Giulia Bertagnolli, Claudio Agostinelli and Manlio De Domenico), Network depth: identifying median and contours in complex networks, Journal of Complex Networks 8 (4). doi: 10.1093/comnet/cnz041, arXiv:1904.05060.

Network Scientists 2010

The Network Scientists 2010 network (download data) is a co-authorship network with \(N=552\) nodes.

The node size in the following plot depends on degree (quantiles).
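As a minimal sketch (not the actual code behind the figure), node sizes by degree quantiles can be obtained with igraph as follows; the file name is hypothetical.

```r
# Minimal sketch: size nodes by degree quartile with igraph.
# "netsci2010.graphml" is a hypothetical file name, not the actual data file.
library(igraph)

g   <- read_graph("netsci2010.graphml", format = "graphml")
deg <- degree(g)
qs  <- quantile(deg, probs = c(0, 0.25, 0.5, 0.75, 1))
# map each node to its degree quartile (1-4) and use it as a size factor
node_size <- 2 * cut(deg, breaks = unique(qs), include.lowest = TRUE, labels = FALSE)
plot(g, vertex.size = node_size, vertex.label = NA)
```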

Depth Patterns

The following plot(ly) shows depth patterns in three diffusion embeddings:

  • \(p = 5, 10, 15\): the embedding dimension, which can be controlled through the slider at the bottom of the figure;
  • \(t\): the diffusion time, on the x-axis;
  • \(PTD(D_t, t, p)\): the depth values, on the y-axis.

Each node corresponds to a line.

Explore the plot through plotly sliders and interaction tools!
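For readers who want to play with something similar, here is a minimal sketch of the diffusion-distance matrix \(D_t\), assuming a continuous-time random walk with transition matrix \(T = D^{-1}A\) and Laplacian \(L = I - T\) (see the paper for the precise definition; the package choices are mine).

```r
# Sketch of the diffusion distance D_t: run a continuous-time random walk with
# Laplacian L = I - T (T the row-normalised adjacency matrix) from every node and
# take Euclidean distances between the resulting probability distributions.
library(igraph)
library(expm)   # matrix exponential

diffusion_distance <- function(g, t) {
  A <- as.matrix(as_adjacency_matrix(g))
  T_rw <- A / rowSums(A)                    # transition matrix of the random walk
  P <- expm(-t * (diag(nrow(A)) - T_rw))    # row i: distribution of a walker started in i
  as.matrix(dist(P))                        # D_t(i, j) = ||P[i, ] - P[j, ]||_2
}
```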

Diffusion Distance with \(t = 10\)

The Network Scientists 2010 network. Node colour and size depend on the Projected Tukey Depth w.r.t. the diffusion distance, \(PTD(D_t, t, p)\). Plots for \(t = 10\) and different values of \(p\), the dimension of the embedding space.
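As an illustration only (the embedding and the depth routine below are my choices, not necessarily the authors' implementation), the depth values can be sketched by embedding \(D_t\) in \(\mathbb{R}^p\) via classical MDS and computing the halfspace (Tukey) depth of each node in that embedding:

```r
# Sketch: embed the distance matrix in R^p (classical MDS, an assumption) and
# compute the Tukey/halfspace depth of every node w.r.t. the embedded data cloud.
library(ddalpha)

projected_tukey_depth <- function(D, p) {
  X <- cmdscale(D, k = p)       # p-dimensional embedding of the distances
  depth.halfspace(X, X)         # approximate halfspace depth of each point
}

D10  <- diffusion_distance(g, t = 10)
ptd5 <- projected_tukey_depth(D10, p = 5)   # corresponds to PTD(D_t, t = 10, p = 5)
```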

Pressing the animation button "deepest" will scale the node sizes according to the depth region (i.e. depth quantile interval) they belong to. We consider the following percentiles: 99%, 97.5%, 95%, 90%, 75%, 50%, 25% and below 25%. To go back, simply refresh the page!
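Just to make the depth regions concrete, here is one way (my own illustration, not the demo code) to bin the depth values by those percentiles and use the bin as a size factor:

```r
# Sketch: assign each node to a depth region (depth quantile interval) and use the
# region index as a node-size factor; deeper nodes get larger markers.
depth_region <- function(d) {
  brks <- quantile(d, probs = c(0, 0.25, 0.5, 0.75, 0.9, 0.95, 0.975, 0.99, 1))
  cut(d, breaks = unique(brks), include.lowest = TRUE, labels = FALSE)
}
region <- depth_region(ptd5)   # 1 = shallowest region, 8 = deepest region
```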

Embedding in \(\mathbb{R}^5\)

\(PTD(D_t, t = 10, p = 5)\)


Embedding in \(\mathbb{R}^{10}\)

\(PTD(D_t, t = 10, p = 10)\)


Embedding in \(\mathbb{R}^{15}\)

\(PTD(D_t, t = 10, p = 15)\)


Words of Complex Networks

A corpus was built from all arXiv abstracts concerning complex networks and then, through word2vec, concepts were retrieved. We can compute similarities and distances on these \(N=95\) words, which allows us to embed the words in space.
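Assuming \(W\) is the \(N \times d\) matrix of word2vec vectors (one row per word, a hypothetical object here), a sketch of the similarities and of one possible cosine dissimilarity:

```r
# Sketch: cosine similarity between all word pairs from a (hypothetical) matrix W
# of word2vec vectors, and a simple cosine dissimilarity derived from it.
cosine_similarity <- function(W) {
  Wn <- W / sqrt(rowSums(W^2))   # L2-normalise each word vector
  Wn %*% t(Wn)                   # S[i, j] = cosine of the angle between words i and j
}
S <- cosine_similarity(W)
D <- 1 - S                       # one common choice of dissimilarity
```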

To visualise the words and the relations among them, we build an undirected weighted network by thresholding the cosine similarity matrix at its 98th percentile. In the following plot, the network structure reflects cosine similarity, node size depends on degree and node colour (brewer.pal("PuOr")) on betweenness centrality.
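A sketch of how such a visualisation network can be built (my reading of the description above, not the exact code):

```r
# Sketch: keep only similarities above the 98th percentile and build an
# undirected weighted igraph object from the thresholded matrix.
library(igraph)

thr <- quantile(S[upper.tri(S)], probs = 0.98)
A_words <- S * (S >= thr)
diag(A_words) <- 0
g_words <- graph_from_adjacency_matrix(A_words, mode = "undirected", weighted = TRUE)
```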


Since this network is built upon a thresholded similarity matrix, we work directly on the matrix (without thresholding) to get distances/dissimilarities and to embed this word network into space.

For \(p \geq 8\) the depth space reduces to two depth values, and in \(\mathbb{R}^p\) with dimension higher than 10 all the words lie on a convex shell, all having the same depth w.r.t. the data cloud. For \(p \geq 3\) the depth ranking is “stable”, in that the depth-induced order between points remains the same except for nodes in the outer contours.

Both \(p = 3\) and \(p = 4\) are good choices, since they are the smallest dimensions displaying the stable depth pattern for the top-ranking words.
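One simple way to quantify this stability (my own check, not taken from the paper) is to compare the depth-induced rankings across embedding dimensions with a rank correlation:

```r
# Sketch: compute the depths for several embedding dimensions and compare the
# induced rankings; high rank correlation = the depth-induced order is stable.
ptd_by_p <- sapply(2:10, function(p) projected_tukey_depth(D, p))
cor(ptd_by_p, method = "kendall")
```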

The median word is dynamics.

Embedding in \(\mathbb{R}^{3}\)


Embedding in \(\mathbb{R}^{4}\)

