Principled approach to the selection of the embedding dimension of networks

Authors: Weiwei Gu, Aditya Tandon, Yong-Yeol Ahn, Filippo Radicchi

Published: 2021-06-18

DOI: 10.1038/s41467-021-23795-5

Source: Full article


Abstract

Network embedding is a general-purpose machine learning technique that encodes network structure in vector spaces with tunable dimension. Choosing an appropriate embedding dimension – small enough to be efficient and large enough to be effective – is challenging but necessary to generate embeddings applicable to a multitude of tasks. Existing strategies for the selection of the embedding dimension rely on performance maximization in downstream tasks. Here, we propose a principled method such that all structural information of a network is parsimoniously encoded. The method is validated on various embedding algorithms and a large corpus of real-world networks. The embedding dimension selected by our method in real-world networks suggests that efficient encoding in low-dimensional spaces is usually possible.