
Graph Representation Learning - Chapter 3 - Neighborhood Reconstruction Methods

Source: https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book-Chapter_3-Node_Embeddings.pdf

@article{hamilton2020grl,
author={Hamilton, William L.},
title={Graph Representation Learning},
journal={Synthesis Lectures on Artificial Intelligence and Machine Learning},
volume={14},
number={3},
pages={1-159},
year={2020},
publisher={Morgan and Claypool}
}

Learning node embeddings: the goal of these methods is to encode nodes as low-dimensional vectors that summarize their graph position and the structure of their local graph neighborhood. In other words, we want to project nodes into a latent space, where geometric relations in this latent space correspond to relationships (e.g., edges) in the original graph or network.

Node embedding methods for simple and weighted graphs.

An Encoder-Decoder Perspective

In the encoder-decoder framework, we view the graph representation learning problem as involving two key operations. First, an encoder model maps each node in the graph into a low-dimensional vector or embedding. Next, a decoder model takes the low-dimensional node embeddings and uses them to reconstruct information about each node’s neighborhood in the original graph.

 
The encoder takes node IDs as input to generate the node embeddings. In most work on node embeddings, the encoder relies on what we call the shallow embedding approach, where this encoder function is simply an embedding lookup based on the node ID.
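A minimal sketch of the shallow embedding approach (the graph size and embedding dimension below are illustrative): the "encoder" is nothing more than a row lookup in a trainable matrix Z.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 5, 3  # hypothetical graph size and embedding dimension

# Shallow encoder: a single trainable matrix Z with one row per node.
Z = rng.normal(size=(num_nodes, dim))

def encode(u):
    """ENC(u) = Z[u]: an embedding lookup keyed only by the node ID."""
    return Z[u]

z_u = encode(2)
print(z_u.shape)  # (3,)
```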
 
The encoder can use node features or the local graph structure around each node as input to generate an embedding. These generalized encoder architectures are often called graph neural networks (GNNs).
 
The role of the decoder is to reconstruct certain graph statistics from the node embeddings that are generated by the encoder.
 
Pairwise decoders can be interpreted as predicting the relationship or similarity between pairs of nodes. For instance, a simple pairwise decoder could predict whether two nodes are neighbors in the graph. To achieve the reconstruction objective, the standard practice is to minimize an empirical reconstruction loss L over a set of training node pairs D ... a loss function measuring the discrepancy between the decoded (i.e., estimated) similarity values dec(zu,zv) and the true values S[u,v]. Depending on the definition of the decoder (dec) and similarity function (S), the loss function might be a mean-squared error or even a classification loss, such as cross entropy
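A sketch of this reconstruction objective, with a toy adjacency-based similarity S, a dot-product decoder, and a mean-squared-error loss (all chosen here purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 5, 3
Z = rng.normal(size=(num_nodes, dim))  # shallow node embeddings

# Toy similarity matrix S: here simply the adjacency matrix of a small graph.
S = np.zeros((num_nodes, num_nodes))
for u, v in [(0, 1), (1, 2), (2, 3)]:
    S[u, v] = S[v, u] = 1.0

def dec(z_u, z_v):
    # One possible pairwise decoder: the dot product of the two embeddings.
    return z_u @ z_v

# Empirical reconstruction loss over a set D of training node pairs,
# using mean-squared error between dec(z_u, z_v) and S[u, v].
D = [(0, 1), (1, 2), (0, 3)]
loss = sum((dec(Z[u], Z[v]) - S[u, v]) ** 2 for u, v in D) / len(D)
```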

Factorization-based approaches

We can view this task as using matrix factorization to learn a low-dimensional approximation of a node-node similarity matrix S, where S generalizes the adjacency matrix and captures some user-defined notion of node-node similarity.
 
Laplacian eigenmaps. In this approach, we define the decoder based on the L2-distance between the node embeddings:

dec(z_u, z_v) = ||z_u − z_v||²  (1)

The loss function then weighs pairs of nodes according to their similarity in the graph:

L = Σ_{(u,v) ∈ D} dec(z_u, z_v) · S[u, v]  (2)

The intuition behind this approach is that Equation (2) penalizes the model when very similar nodes have embeddings that are far apart.
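A small sketch of the Laplacian eigenmaps objective on a toy weighted graph (the weights and sizes are illustrative): the decoder is a squared L2 distance, and pairs with high similarity weight are penalized when their embeddings are far apart.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 4, 2
Z = rng.normal(size=(num_nodes, dim))

# Toy symmetric similarity weights S[u, v] (e.g., edge weights).
S = np.array([[0., 1., 0., 0.],
              [1., 0., 2., 0.],
              [0., 2., 0., 1.],
              [0., 0., 1., 0.]])

def dec(z_u, z_v):
    # Decoder: squared L2 distance between the two embeddings.
    return np.sum((z_u - z_v) ** 2)

# Loss: a pair with a large similarity weight S[u, v] incurs a large
# penalty when its embeddings are far apart.
loss = sum(dec(Z[u], Z[v]) * S[u, v]
           for u in range(num_nodes) for v in range(num_nodes))
```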
 
Inner-product methods ... more recent work generally employs an inner-product based decoder ...
Here, we assume that the similarity between two nodes - e.g., the overlap between their local neighborhoods - is proportional to the dot product of their embeddings. These methods are referred to as matrix-factorization approaches, since their loss functions can be minimized using factorization algorithms, such as the singular value decomposition (SVD). ... Intuitively, the goal of these methods is to learn embeddings for each node such that the inner product between the learned embedding vectors
approximates some deterministic measure of node similarity.
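A sketch of the factorization view on a toy symmetric similarity matrix: for a symmetric S the SVD reduces to an eigendecomposition, so this sketch keeps the top-d eigenpairs (clipping negative eigenvalues, an illustrative choice) so that the inner products Z Zᵀ approximate S.

```python
import numpy as np

# Toy node-node similarity matrix: the adjacency matrix of a 4-node graph.
S = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 0.],
              [1., 1., 0., 1.],
              [0., 0., 1., 0.]])

d = 2  # embedding dimension

# For symmetric S, SVD reduces to an eigendecomposition; keep the top-d
# eigenpairs and fold the square root of the (clipped) eigenvalues into Z.
vals, vecs = np.linalg.eigh(S)
idx = np.argsort(vals)[::-1][:d]
Z = vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

# The inner product Z[u] @ Z[v] now approximates S[u, v].
error = np.linalg.norm(Z @ Z.T - S)
```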

Random walk embeddings

DeepWalk and node2vec ... DeepWalk and node2vec use a shallow embedding approach and an inner-product decoder. The key distinction in these methods is in how they define the notions of node similarity and neighborhood reconstruction. Instead of directly reconstructing the adjacency matrix A—or some deterministic function of A—these approaches optimize embeddings to encode the statistics of random walks.
 
Here, we use D to denote the training set of random walks, which is generated by sampling random walks starting from each node. Evaluating the decoder naively is expensive, since it requires normalizing over every node in the graph. There are different strategies to overcome this computational challenge, and this is one of the essential differences between the original DeepWalk and node2vec algorithms. DeepWalk employs a hierarchical softmax to approximate the decoder, leveraging a binary-tree structure to accelerate the computation. The node2vec approach also distinguishes itself from the earlier DeepWalk algorithm by allowing for a more flexible definition of random walks.
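A sketch of how the training set D of random walks might be sampled (uniform DeepWalk-style walks on a toy adjacency list; node2vec would instead bias the neighbor choice with its return and in-out parameters p and q):

```python
import random

# Toy graph as an adjacency list (hypothetical example).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def random_walk(graph, start, length, rng):
    """DeepWalk-style walk: repeatedly move to a uniformly sampled neighbor."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

# D: the training set of random walks, sampled starting from every node.
rng = random.Random(0)
D = [random_walk(graph, u, length=5, rng=rng) for u in graph]
```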

The Large-scale Information Network Embeddings (LINE) algorithm is often discussed within the context of random-walk approaches. LINE does not explicitly leverage random walks, but it shares conceptual motivations with DeepWalk and node2vec. The basic idea in LINE is to combine two encoder-decoder objectives.
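A rough sketch of LINE's two objectives on a toy graph (all names and sizes here are illustrative): a first-order term that pushes the sigmoid of the dot product of adjacent nodes toward 1, and a second-order term in which each node "predicts" its neighbors through a softmax over separate context embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, dim = 4, 2
Z = rng.normal(size=(num_nodes, dim))  # node embeddings
C = rng.normal(size=(num_nodes, dim))  # context embeddings (second order)
edges = [(0, 1), (1, 2), (2, 3)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# First-order objective: adjacent nodes should have sigmoid(z_u . z_v) near 1.
first_order = -sum(np.log(sigmoid(Z[u] @ Z[v])) for u, v in edges)

# Second-order objective: each node predicts its neighbors via a softmax
# over the context embeddings of all nodes.
def log_prob(u, v):
    scores = Z[u] @ C.T
    return scores[v] - np.log(np.sum(np.exp(scores)))

second_order = -sum(log_prob(u, v) for u, v in edges)
```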

Limitations of Shallow Embeddings

  1. They do not share any parameters between nodes in the encoder, since the encoder directly optimizes a
    unique embedding vector for each node. This lack of parameter sharing is both statistically and computationally inefficient: the number of parameters grows as O(|V|).
  2. They do not leverage node features in the encoder.
  3. These methods can only generate embeddings for nodes that were present during the training phase. Generating embeddings for new nodes, which are observed after the training phase, is not possible unless additional optimizations are performed to learn the embeddings for these nodes.

 
