Fonte: https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book-Chapter_1-Intro.pdf
@article{
author={Hamilton, William L.},
title={Graph Representation Learning},
journal={Synthesis Lectures on Artificial Intelligence and Machine Learning},
volume={14},
number={3},
pages={1-159},
publisher={Morgan and Claypool}
}
The power of the graph formalism lies both in its focus on relationships between points (rather than the properties of individual points), as well as in its generality. The same graph formalism can be used to represent social networks, interactions between drugs and proteins, the interactions between atoms in a molecule, or the connections between terminals in a telecommunications network—to name just a few examples.
This book is about how we can use machine learning to tackle this challenge.
Multi-relational Graphs
Beyond the distinction between undirected, directed and weighted edges, we will also consider graphs that have di↵erent types of edges.
Heterogeneous graphs In heterogeneous graphs, nodes are also imbued with types, meaning that we can partition the set of nodes into disjoint sets. Edges in heterogeneous graphs generally satisfy constraints according to the node types, most commonly the constraint that certain edges only connect nodes of certain types.
Multiplex graphs In multiplex graphs we assume that the graph can be decomposed in a set of k layers. Every node is assumed to belong to every layer, and each layer corresponds to a unique relation, representing the intra-layer edge type for that layer. We also assume that inter-layer edges types can exist, which connect the same node across layers.
Machine learning is inherently a problem-driven discipline. We seek to build models that can learn from data in order to solve particular tasks, and machine learning models are often categorized according to the type of task they seek to solve.
Node classification
node classification, where the goal is to predict the label yu - which could be a type, category, or attribute - associated with all the nodes u 2V, when we are only given the true labels on a training set of nodes Vtrain ⇢V. Example: classifying the topic of documents based on hyperlink or citation graphs
At first glance, node classification appears to be a straightforward variation of standard supervised classification, but there are in fact important differences. The most important difference is that the nodes in a graph are not independent and identically distributed (i.i.d.). Usually, when we build supervised machine learning models we assume that each datapoint is statistically independent from
all the other datapoints; otherwise, we might need to model the dependencies between all our input points. We also assume that the datapoints are identically distributed; otherwise, we have no way of guaranteeing that our model will generalize to new datapoints. Node classification completely breaks this i.i.d. assumption. Rather than modeling a set of i.i.d. datapoints, we are instead modeling an interconnected set of nodes
Based on the notion of homophily we can build machine learning models that try to assign similar labels to neighboring nodes in a graph [Zhou et al., 2004]. Beyond homophily there are also concepts such as structural equivalence [Donnat et al., 2018], which is the idea that nodes with similar local neighborhood structures will have similar labels, as well as heterophily, which presumes that nodes will be preferentially connected to nodes with different labels
Supervised or semi-supervised? Due to the atypical nature of node classification, researchers often refer to it as semi-supervised [Yang et al., 2016]. This terminology is used because when we are training node classification models, we usually have access to the full graph, including all the unlabeled (e.g., test) nodes. The only thing we are missing is the labels of test nodes. However, we can still use information about the test nodes (e.g., knowledge of their neighborhood in the graph) to improve our model during training. This is different from the usual supervised setting, in which unlabeled datapoints are completely unobserved during training. The general term used for models that combine labeled and unlabeled data during traning is semi-supervised learning
Relation prediction
Can we use machine learning to infer the edges between nodes in a graph?
This task goes by many names, such as link prediction, graph completion, and relational inference, depending on the specific application domain. We will simply call it relation prediction here. Along with node classification, it is one of the more popular machine learning tasks with graph data and has countless real-world applications: recommending content to users in social platforms [Ying et al., 2018a], predicting drug side-effects [Zitnik et al., 2018], or inferring new facts in a relational databases [Bordes et al., 2013] - all of these tasks can be viewed as special cases of relation prediction.
For instance, in simple graphs, such as social networks that only encode “friendship” relations, there are simple heuristics based on how many neighbors two nodes share that can achieve strong performance [L‹u and Zhou, 2011]. On the other hand, in more complex multi-relational graph datasets, such as biomedical knowledge graphs that encode hundreds of different biological interactions, relation prediction can require complex reasoning and inference strategies [Nickel et al., 2016]
Clustering and community detection
In other words, we would expect this network- like many real-world networks - to exhibit a community structure, where nodes are much more likely to form edges with nodes that belong to the same community.
This is the general intuition underlying the task of community detection. The challenge of community detection is to infer latent community structures given only the input graph G= ( V, E). The many real-world applications of community detection include uncovering functional modules in genetic interaction networks [Agrawal et al., 2018] and uncovering fraudulent groups of users in financial transaction networks [Pandit et al., 2007].
Graph classification, regression, and clustering
In these graph classification or regression applications, we seek to learn over graph data, but instead of making predictions over the individual components of a single graph ... , we are instead given a dataset of multiple different graphs and our goal is to make independent predictions specific to each graph. In the related task of graph clustering, the goal is to learn an unsupervised measure of similarity between pairs of graphs. Of all the machine learning tasks on graphs, graph regression and classification are perhaps the most straightforward analogues of standard supervised learning.
Comentários
Postar um comentário
Sinta-se a vontade para comentar. Críticas construtivas são sempre bem vindas.