Source: https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book-Chapter_3-Node_Embeddings.pdf
@article{hamilton2020grl,
  author={Hamilton, William L.},
  title={Graph Representation Learning},
  journal={Synthesis Lectures on Artificial Intelligence and Machine Learning},
  volume={14},
  number={3},
  pages={1--159},
  publisher={Morgan and Claypool},
  year={2020}
}
Learning node embeddings: the goal of these methods is to encode nodes as low-dimensional vectors that summarize their graph position and the structure of their local graph neighborhood. In other words, we want to project nodes into a latent space where geometric relations in this latent space correspond to relationships (e.g., edges) in the original graph or network.
Node embedding methods for simple and weighted graphs.
An Encoder-Decoder Perspective
In the encoder-decoder framework, we view the graph representation learning problem as involving two key operations. First, an encoder model maps each node in the graph into a low-dimensional vector or embedding. Next, a decoder model takes the low-dimensional node embeddings and uses them to reconstruct information about each node’s neighborhood in the original graph.
The encoder takes node IDs as input to generate the node embeddings. In most work on node embeddings, the encoder relies on what we call the shallow embedding approach, where this encoder function is simply an embedding lookup based on the node ID.
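As a minimal sketch of the shallow embedding approach (not code from the book; the class and parameter names ShallowEncoder, num_nodes, and dim are illustrative), the encoder reduces to a lookup into a trainable embedding matrix:

import numpy as np

class ShallowEncoder:
    def __init__(self, num_nodes, dim, seed=0):
        # One free d-dimensional vector per node; the matrix Z is the only
        # set of parameters, and it is what training optimizes.
        rng = np.random.default_rng(seed)
        self.Z = rng.normal(scale=0.1, size=(num_nodes, dim))

    def encode(self, node_id):
        # Encoding is just a row lookup by node ID; no node features or
        # graph structure are used at this stage.
        return self.Z[node_id]

z_u = ShallowEncoder(num_nodes=100, dim=16).encode(5)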
The encoder can instead use node features or the local graph structure around each node as input to generate an embedding. These generalized encoder architectures are often called graph neural networks (GNNs).
The role of the decoder is to reconstruct certain graph statistics from the node embeddings that are generated by the encoder.
Pairwise decoders can be interpreted as predicting the relationship or similarity between pairs of nodes. For instance, a simple pairwise decoder could predict whether two nodes are neighbors in the graph. To achieve the reconstruction objective, the standard practice is to minimize an empirical reconstruction loss L over a set of training node pairs D ... where the loss function measures the discrepancy between the decoded (i.e., estimated) similarity values dec(z_u, z_v) and the true values S[u, v]. Depending on the definition of the decoder (dec) and the similarity function (S), the loss function might be a mean-squared error or even a classification loss, such as cross-entropy.
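A small sketch of this setup, assuming an inner-product decoder and a squared-error discrepancy summed over the training pairs (both choices, and the function names, are illustrative; as noted above, other decoders and losses such as cross-entropy are equally valid):

import numpy as np

def dec(z_u, z_v):
    # One possible pairwise decoder: similarity as the dot product of embeddings.
    return z_u @ z_v

def reconstruction_loss(Z, pairs, S):
    # Empirical reconstruction loss over the training pairs D: squared error
    # between the decoded similarity dec(z_u, z_v) and the true value S[u, v].
    return sum((dec(Z[u], Z[v]) - S[u, v]) ** 2 for u, v in pairs)

# Toy usage: 4 nodes, 8-dimensional embeddings, S taken to be the identity.
S = np.eye(4)
Z = np.random.default_rng(0).normal(size=(4, 8))
print(reconstruction_loss(Z, [(0, 1), (2, 3)], S))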
Factorization-based approaches
We can view this task as using matrix factorization to learn a low-dimensional approximation of a node-node similarity matrix S, where S generalizes the adjacency matrix and captures some user-defined notion of node-node similarity.
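In matrix form, with the embeddings stacked as the rows of Z, the objective these methods (approximately) minimize can be written as a low-rank factorization of S:

\[
\mathcal{L} \approx \lVert \mathbf{Z}\mathbf{Z}^\top - \mathbf{S} \rVert_2^2
\]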
Laplacian eigenmaps: In this approach, we define the decoder based on the L2-distance between the node embeddings:

(1) \(\text{dec}(\mathbf{z}_u, \mathbf{z}_v) = \lVert \mathbf{z}_u - \mathbf{z}_v \rVert_2^2\)

The loss function then weighs pairs of nodes according to their similarity in the graph:

(2) \(\mathcal{L} = \sum_{(u,v) \in \mathcal{D}} \text{dec}(\mathbf{z}_u, \mathbf{z}_v) \cdot \mathbf{S}[u,v]\)

The intuition behind this approach is that Equation (2) penalizes the model when very similar nodes have embeddings that are far apart.
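The same loss in a few lines of illustrative code (Z is the embedding matrix, S the pairwise similarity/weight matrix; nothing here is specific to the book's implementation):

import numpy as np

def laplacian_eigenmaps_loss(Z, S):
    # Sum over node pairs of the squared L2 distance between embeddings,
    # weighted by S[u, v]: highly similar pairs are penalized for being far apart.
    n = S.shape[0]
    return sum(np.sum((Z[u] - Z[v]) ** 2) * S[u, v]
               for u in range(n) for v in range(n))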
Inner-product methods ... more recent work generally employs an inner-product based decoder ...
Here, we assume that the similarity between two nodes (e.g., the overlap between their local neighborhoods) is proportional to the dot product of their embeddings. These methods are referred to as matrix-factorization approaches, since their loss functions can be minimized using factorization algorithms, such as the singular value decomposition (SVD). ... Intuitively, the goal of these methods is to learn embeddings for each node such that the inner product between the learned embedding vectors approximates some deterministic measure of node similarity.
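To make the matrix-factorization connection concrete, here is a hedged sketch using a truncated SVD; taking S to be the adjacency matrix, and splitting the factors into "node" and "context" embeddings (a common practical choice), are assumptions of this example rather than prescriptions from the book:

import numpy as np

def factorize_similarity(S, dim):
    # Truncated SVD: keep the top `dim` singular directions of S so that the
    # product of the two low-dimensional factors approximates S.
    U, sigma, Vt = np.linalg.svd(S)
    Z = U[:, :dim] * np.sqrt(sigma[:dim])   # "node" embeddings
    W = Vt[:dim].T * np.sqrt(sigma[:dim])   # "context" embeddings
    return Z, W

# Toy example: S taken to be the adjacency matrix of a 4-cycle.
S = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
Z, W = factorize_similarity(S, dim=2)
print(np.round(Z @ W.T, 2))  # close to S: inner products recover similarity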
Random walk embeddings
DeepWalk and node2vec ... DeepWalk and node2vec use a shallow embedding approach and an inner-product decoder. The key distinction in these methods is in how they define the notions of node similarity and neighborhood reconstruction. Instead of directly reconstructing the adjacency matrix A—or some deterministic function of A—these approaches optimize embeddings to encode the statistics of random walks.
Here, we use D to denote the training set of random walks, which is generated by sampling random walks starting from each node. Evaluating the decoder is expensive, since its normalization (a softmax) runs over all nodes in the graph; there are different strategies to overcome this computational challenge, and this is one of the essential differences between the original DeepWalk and node2vec algorithms. DeepWalk employs a hierarchical softmax to approximate the decoder, which involves leveraging a binary-tree structure to accelerate the computation. The node2vec approach also distinguishes itself from the earlier DeepWalk algorithm by allowing for a more flexible definition of random walks.
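As an illustrative sketch of how D might be built (uniform walks as in DeepWalk; node2vec would instead bias the neighbor choice with its return/in-out parameters; the function and argument names here are assumptions):

import random

def sample_walks(adj, walk_length=10, walks_per_node=5, seed=0):
    # adj: dict mapping each node to the list of its neighbors.
    # Returns D, the training set of random walks (one list of nodes per walk).
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:
                    break
                walk.append(rng.choice(neighbors))  # uniform next-step choice
            walks.append(walk)
    return walks

D = sample_walks({0: [1, 2], 1: [0, 2], 2: [0, 1]})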
The Large-scale Information Network Embeddings (LINE) algorithm is often discussed within the context of random-walk approaches. The LINE approach does not explicitly leverage random walks, but it shares conceptual motivations with DeepWalk and node2vec. The basic idea in LINE is to combine two encoder-decoder objectives.
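Very roughly, and only as an assumption-laden sketch (the notes above do not spell the two objectives out): LINE's first objective scores directly connected node pairs with a sigmoid of the inner product of their embeddings, while its second objective plays an analogous role for two-hop (context) information. The snippet below shows only the first, with illustrative names:

import numpy as np

def line_first_order_loss(Z, edges):
    # For each observed edge (u, v), push sigma(z_u . z_v) toward 1 by
    # minimizing the negative log-likelihood of the edge.
    loss = 0.0
    for u, v in edges:
        prob = 1.0 / (1.0 + np.exp(-(Z[u] @ Z[v])))
        loss -= np.log(prob + 1e-12)
    return loss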
Limitations of Shallow Embeddings
- They do not share any parameters between nodes in the encoder, since the encoder directly optimizes a unique embedding vector for each node. This lack of parameter sharing is both statistically and computationally inefficient: the number of parameters grows as O(|V|).
- They do not leverage node features in the encoder.
- These methods can only generate embeddings for nodes that were present during the training phase. Generating embeddings for new nodes, which are observed after the training phase, is not possible unless additional optimizations are performed to learn the embeddings for these nodes.