
Knowledge Graph Embedding: A Survey of Approaches and Applications - Article Reading 3

 Q. Wang, Z. Mao, B. Wang and L. Guo, "Knowledge Graph Embedding: A Survey of Approaches and Applications," in IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 12, pp. 2724-2743, 1 Dec. 2017, doi: 10.1109/TKDE.2017.2754499.

  1. Techniques that conduct embedding using only facts observed in the KG
  2. Techniques that further incorporate additional information besides facts. 
  3. How embeddings can be applied to and benefit a wide variety of tasks (in-KG applications and out-of-KG applications)

(3)

In-KG applications: link prediction, triple classification, entity classification, and entity resolution.

Link prediction is typically referred to as the task of predicting an entity that has a specific relation with another given entity, i.e., predicting h given (r,t) or t given (h,r), with the former denoted as (?,r,t) and the latter as (h,r,?). This is essentially a KG completion task, i.e., adding missing knowledge to the graph.

With entity and relation representations learned beforehand, link prediction can be carried out simply by a ranking procedure. Various evaluation metrics have been designed based on such ranks, e.g., mean rank (the average of predicted ranks), mean reciprocal rank (the average of reciprocal ranks), Hits@n (the proportion of ranks no larger than n), and AUC-PR (the area under the precision-recall curve).
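As an illustration of this ranking procedure, here is a minimal sketch assuming a simple TransE-style score fr(h,t) = −‖h + r − t‖ and toy random embeddings (all names and values are illustrative, not from the survey):

```python
import numpy as np

# Toy embeddings: 5 entities and 2 relations in a 4-dimensional space (illustrative only).
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 4))   # entity embeddings
R = rng.normal(size=(2, 4))   # relation embeddings

def score(h, r, t):
    """TransE-style score: higher means more plausible (negated distance)."""
    return -np.linalg.norm(E[h] + R[r] - E[t])

def rank_tail(h, r, true_t):
    """Rank all entities as candidate tails for (h, r, ?) and return the rank of the true tail."""
    scores = np.array([score(h, r, t) for t in range(len(E))])
    order = np.argsort(-scores)                       # best candidate first
    return int(np.where(order == true_t)[0][0]) + 1   # 1-based rank of the true entity

# Evaluate a small set of test triples (h, r, t).
test_triples = [(0, 0, 1), (2, 1, 3), (4, 0, 2)]
ranks = np.array([rank_tail(h, r, t) for h, r, t in test_triples])

print("Mean rank:           ", ranks.mean())
print("Mean reciprocal rank:", (1.0 / ranks).mean())
print("Hits@3:              ", (ranks <= 3).mean())
```

In practice the candidate set is the whole entity vocabulary, and ranks are usually also reported in a "filtered" setting where other known true triples are removed before ranking.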

Triple classification consists in verifying whether an unseen triple fact (h,r,t) is true or not. This task, again, can be seen as some sort of completion of the input KG. Triples with higher scores tend to be true facts. Specifically, we introduce for each relation r a threshold δr. Then any unseen fact from that relation, say (h,r,t), will be predicted as true if its score fr(h,t) is higher than δr, and as false otherwise. In this way, we obtain a triple classifier for each relation. Since for each triple a real-valued score is output along with the binary label, ranking metrics can also be used here, e.g., mean average precision.
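A minimal sketch of the per-relation threshold δr, assuming scores fr(h,t) computed as above and a small, made-up validation set for one relation:

```python
import numpy as np

def fit_threshold(scores_pos, scores_neg):
    """Pick the threshold delta_r that maximizes accuracy on labeled validation triples of relation r."""
    candidates = np.concatenate([scores_pos, scores_neg])
    best_delta, best_acc = None, -1.0
    for delta in candidates:
        acc = ((scores_pos >= delta).sum() + (scores_neg < delta).sum()) / (len(scores_pos) + len(scores_neg))
        if acc > best_acc:
            best_delta, best_acc = delta, acc
    return best_delta

# Illustrative validation scores f_r(h, t) for true and false triples of one relation.
pos = np.array([-0.8, -1.1, -0.9])
neg = np.array([-2.5, -3.0, -1.9])
delta_r = fit_threshold(pos, neg)

def classify(score, delta):
    """An unseen triple (h, r, t) is predicted true if f_r(h, t) >= delta_r."""
    return score >= delta

print(delta_r, classify(-1.0, delta_r), classify(-2.7, delta_r))
```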

Entity classification aims to categorize entities into different semantic categories, e.g., AlfredHitchcock is a Person, and Psycho a CreativeWork. Given that in most cases the relation encoding entity types (denoted as IsA) is contained in the KG and has already been included into the embedding process, entity classification can be treated as a specific link prediction task, i.e., (x,IsA,?).
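Under that reduction, classifying an entity is just the ranking procedure restricted to candidate class entities; a toy sketch with made-up embeddings:

```python
import numpy as np

# Toy setup (illustrative): entity x, the IsA relation, and two candidate class entities.
rng = np.random.default_rng(1)
x, isa = rng.normal(size=4), rng.normal(size=4)
classes = {"Person": rng.normal(size=4), "CreativeWork": rng.normal(size=4)}

# Treat (x, IsA, ?) as link prediction: pick the class whose embedding best completes the triple.
predicted = max(classes, key=lambda c: -np.linalg.norm(x + isa - classes[c]))
print("Predicted category:", predicted)
```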

Entity resolution consists in verifying whether two entities refer to the same object; in some KGs many nodes actually refer to identical objects. Consider a scenario where the KG already contains a relation stating whether two entities are equivalent (denoted as EqualTo) and an embedding has been learned for that relation. In this case, entity resolution degenerates to a triple classification problem, i.e., judging whether the triple (x,EqualTo,y) holds, or how likely it is to hold.

An alternative strategy is to perform entity resolution solely on the basis of entity representations. More specifically, given two entities x, y and their vector representations x, y, the similarity between x and y is computed as k(x,y) = exp(−‖x − y‖₂² / σ), and this similarity score is used to measure the likelihood that x and y refer to the same entity. This strategy works even if the EqualTo relation is not encoded in the input KG. AUC-PR is the most widely adopted evaluation metric for this task.

About the formula ‖x − y‖: the norm (or modulus) of a vector is a real number representing the length of that vector, and the fraction in the exponent divides its square by a user-given constant. But this was an experimental choice made by the authors of another paper, not a definition or proof, as the excerpt extracted below shows:

Based on this representation we compute the similarity between two entities x, y by using the heat kernel k(x,y) = exp(−‖x − y‖²/δ), where δ is a user-given constant, and use this similarity score as a measure for the likelihood that x and y refer to the same entity. This is a relatively ad hoc approach to entity resolution, but the focus of this experiment is again rather on assessing the collective learning capabilities of our approach than conducting a full entity resolution experiment.
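A minimal sketch of this heat-kernel similarity over learned entity vectors (the vectors below are made up; σ/δ is the user-given bandwidth):

```python
import numpy as np

def heat_kernel_similarity(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / sigma): close embeddings give similarity near 1, distant ones near 0."""
    return np.exp(-np.sum((x - y) ** 2) / sigma)

# Two entity embeddings that (hypothetically) refer to the same real-world object.
x = np.array([0.9, 0.1, 0.4])
y = np.array([0.85, 0.15, 0.38])
print("Likelihood that x and y are the same entity:", heat_kernel_similarity(x, y))
```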

Out-of-KG applications are those which break through the boundary of the input KG and scale to broader domains. We introduce three such applications as examples, including relation extraction, question answering, and recommender systems. 

Relation extraction aims to extract relational facts from plain text where entities have already been detected. For example, given a sentence “Alfred Hitchcock directed Psycho” with the entities h=AlfredHitchcock and t=Psycho detected, a relation extractor should predict the relation DirectorOf between these two entities. Relation extraction has long been a crucial task in natural language processing, and provides an effective means to enrich KGs. Traditional approaches, however, are purely text-based extractors, ignoring the capability of the KG itself to infer new facts.

One line of work performs relation extraction by jointly embedding plain text and KGs. Text and KGs are represented in the same matrix: each row of the matrix stands for a pair of entities, and each column for a textual mention or a KG relation. If two entities co-occur with a mention in plain text or with a relation in the KG, the corresponding entry is set to one, and otherwise to zero.
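A small sketch of how such an entity-pair × column matrix could be assembled; the entity pairs, textual patterns, and the factorization comment are illustrative assumptions, not the exact construction in the cited work:

```python
import numpy as np

# Columns: textual mentions and KG relations share one column space.
columns = ["<h> directed <t>", "<h> 's film <t>", "DirectorOf"]       # two text patterns + one KG relation
pairs = [("AlfredHitchcock", "Psycho"), ("JamesCameron", "Titanic")]  # rows: entity pairs

# Observed co-occurrences: the corresponding (entity pair, column) entries are set to one.
observed = {
    (("AlfredHitchcock", "Psycho"), "<h> directed <t>"),
    (("AlfredHitchcock", "Psycho"), "DirectorOf"),
    (("JamesCameron", "Titanic"), "<h> 's film <t>"),
}

M = np.zeros((len(pairs), len(columns)))
for i, pair in enumerate(pairs):
    for j, col in enumerate(columns):
        if (pair, col) in observed:
            M[i, j] = 1.0

print(M)
# Learning low-rank embeddings for rows and columns of M puts textual mentions and KG relations
# in the same space, so missing KG entries (e.g., DirectorOf for the second pair) can be predicted.
```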

Question answering over KGs. Given a question expressed in natural language, the task is to retrieve the correct answer supported by a triple or set of triples from a KG. The use of KGs simplifies question answering by organizing a great variety of answers in a structured format. However, it remains a challenging task because of the great variability of natural language and of the large scale of KGs.
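One common embedding-based strategy (a hedged sketch of the general mechanism, not any specific system from the survey) embeds the question and scores it against candidate triples, returning the entity from the best-matching one:

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 4

# Illustrative embeddings for words and for KG triples (in practice both would be learned jointly).
word_emb = {w: rng.normal(size=dim) for w in ["who", "directed", "psycho"]}
triples = {
    ("AlfredHitchcock", "DirectorOf", "Psycho"): rng.normal(size=dim),
    ("JamesCameron", "DirectorOf", "Titanic"): rng.normal(size=dim),
}

def embed_question(question):
    """Bag-of-words question embedding: sum of word vectors (a common simple choice)."""
    return sum(word_emb[w] for w in question.lower().split())

q = embed_question("who directed psycho")
best = max(triples, key=lambda t: float(np.dot(q, triples[t])))  # highest dot-product score
# With trained embeddings the supporting triple would score highest; random vectors only show the mechanism.
print("Supporting triple:", best, "-> answer:", best[0])
```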

A hybrid recommendation framework leverages heterogeneous information in a KG to improve the quality of collaborative filtering. Specifically, three types of information stored in the KG are used, namely structural knowledge (triple facts), textual knowledge (e.g., the textual summary of a book or a movie), and visual knowledge (e.g., a book's front cover or a movie's poster image), to derive semantic representations for items.
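A hedged sketch of the general idea: each knowledge source yields an item embedding, the three are combined into one item representation, and that representation is matched against a user vector for collaborative filtering (concatenation and dot-product scoring are my own simplifying assumptions, not the exact model):

```python
import numpy as np

rng = np.random.default_rng(3)

# Item-side embeddings derived from the KG (all illustrative): structural (triples),
# textual (summary), and visual (cover or poster image) knowledge.
structural = rng.normal(size=8)
textual = rng.normal(size=8)
visual = rng.normal(size=8)

# Combine the three views into a single semantic item representation.
item_vec = np.concatenate([structural, textual, visual])

# Collaborative-filtering side: a user preference vector in the same space.
user_vec = rng.normal(size=item_vec.shape[0])

# Recommendation score for this (user, item) pair.
print("Score:", float(np.dot(user_vec, item_vec)))
```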
