
Knowledge Graph Question Answering with Ambiguous Query - Paper reading

https://dl.acm.org/doi/pdf/10.1145/3543507.3583316

Lihui Liu, Yuzhong Chen, Mahashweta Das, Hao Yang, and Hanghang Tong. 2023. Knowledge Graph Question Answering with Ambiguous Query. In Proceedings of the ACM Web Conference 2023 (WWW '23). Association for Computing Machinery, New York, NY, USA, 2477–2486. https://doi.org/10.1145/3543507.3583316

Abs -> 1 -> 6 -> 2 -> 5 -> 4 -> 3

ABSTRACT

In the vast majority of the existing works, the input queries are considered perfect and can precisely express the user’s query intention. However, in reality, input queries might be ambiguous and elusive which only contain a limited amount of information.

[Keyword queries? Queries phrased as "complete" questions? Queries in the GQL language?]
[In my research we assume that queries of any kind are potentially incomplete with respect to the (implicit) context, since the user or application may itself be unaware of the context that applies to the claims of interest. The answers, however, will be as contextualized as possible, by mapping the explicit context and applying rules to infer the implicit context.]

In this paper, we propose PReFNet which focuses on answering ambiguous queries with pseudo relevance feedback on knowledge graphs. In order to leverage the hidden (pseudo) relevance information existed in the results that are initially returned from a given query, .... The inferred high quality queries will be returned to the users to help them search with ease.

[Is this query rewriting or query expansion to suggest to the user? Rewriting with answer generation.]

1 INTRODUCTION

A knowledge graph is a graph data structure which contains a multitude of triples denoting real world facts.

[Their definition of a KG for this research. It does not consider Dual OWA, treats statements as facts rather than claims, does not mention context, and is triple-based.]

Despite the great progress, most works focus on answering defectless queries on knowledge graphs. These queries are assumed to be perfect and can precisely express users’ query intentions. However, this is not true most of the time in real cases for the following reasons. (1) First, the vocabulary of different users can vary dramatically. According to a prominent study on the human vocabulary problem [8], about 80-90% of the times two persons will give different representations when they are asked to name the same concept [21]. This means the input queries of different users could be very different from each other. (2) Second, some KGQA methods (e.g., [28] [4]) need to transform the natural language questions to graph queries, and then search the results according to these query graphs. The transformation algorithm may generate queries with inaccurate graph structure. Last but not the least, allowing users to input query graphs directly may introduce additional structural noise or inaccuracy due to their lack of full background knowledge of the underlying KG [21].

[The terminology mismatch and the process of converting the information need or question into a graph query, items (1) and (2), are outside the scope of my research. The user's possible lack of knowledge about the context when formulating the query is handled by the "Best" Possible Answer approach, since the claims in the answers are contextualized with explicit and implicit (inferred) context.]

To address these issues, query ambiguity and vagueness need to be correctly resolved, which in turn requires new information in addition to the query itself. Relevance feedback (short for ReF) is one promising solution. The general idea behind relevance feedback is to take the results that are initially returned from a given query, to gather user feedback, and to use information about whether or not those results are relevant to form a new query. ... Finally, the newly inferred queries will be used to re-rank the original candidate answers.

[It is rewriting, but there is no user interaction to produce the answer, since the feedback is pseudo.]
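
The feedback loop described above can be sketched on a toy graph. This is only an illustration of the general pseudo relevance feedback idea, not PReFNet itself: the triples, the KEYWORD_SCORES table standing in for a first-pass retriever, and the function names are all invented for the example.

```python
from collections import Counter

# Toy knowledge graph as (head, relation, tail) triples; all data is invented.
TRIPLES = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Inception", "written_by", "Christopher Nolan"),
    ("Inception", "starred_actor", "Leonardo DiCaprio"),
    ("Inception", "has_genre", "Sci-Fi"),
]

# Hypothetical first-pass relevance of each relation to an ambiguous question
# such as "who made Inception?".
KEYWORD_SCORES = {"directed_by": 0.5, "written_by": 0.4, "starred_actor": 0.3}


def score_answers(anchor, relation_weights):
    """Rank the anchor's neighbours by the summed weight of their relations."""
    scores = {}
    for head, relation, tail in TRIPLES:
        if head == anchor:
            scores[tail] = scores.get(tail, 0.0) + relation_weights.get(relation, 0.0)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def infer_relations(anchor, top_answers):
    """Pseudo feedback: treat the top answers as relevant and count which
    relations connect the anchor to them, normalised to a distribution."""
    counts = Counter(
        relation
        for head, relation, tail in TRIPLES
        if head == anchor and tail in top_answers
    )
    total = sum(counts.values())
    return {relation: count / total for relation, count in counts.items()}


first_pass = score_answers("Inception", KEYWORD_SCORES)
top_1 = {first_pass[0][0]}                    # no user is asked: feedback is "pseudo"
inferred = infer_relations("Inception", top_1)
second_pass = score_answers("Inception", inferred)  # re-rank with inferred relations
```

The key point is that no user labels are collected: the top of the first ranking is simply assumed to be relevant, and the relations it suggests drive the re-ranking.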

2 PROBLEM DEFINITION

Knowledge graph question answering aims to answer a question with the help of knowledge graphs. According to the study in [21], most users formulate queries using their own knowledge and vocabulary during the search process. They might not have a fairly good understanding of the underlying data schema and the knowledge graph structure. This means that the users’ true intentions behind the queries may be frequently misinterpreted or misrepresented.

[The semantics of converting the information need into a query]

In this paper, we focus on answering ambiguous one-hop question over knowledge graph. We assume the input ambiguous query Q contains a topic/anchor entity vq ∈ V and a sequence of words Q = (q1, q2, ..., qn). Ideally, each question can be mapped to a unique relation rq in the knowledge graph. The goal of question answering over knowledge graph is to identify a set of nodes aq ∈ V which can answer the ambiguous question. We assume that all the answer entities exist in the knowledge graph, each question only contains a single topic/anchor entity vq ∈ V and vq is given.

[Assumptions: (1) one-hop query, (2) each query Q has a topic or anchor entity vq that is known (it does not need to be discovered or mapped), and (3) each question can be mapped to a unique relation rq of the KG]

Problem Definition. Answering Ambiguous Query:
Given: (1) A knowledge graph G, (2) an ambiguous one-hop natural language question;
Output: (1) The answer of the question, (2) Top-k most likely correct query relations of the input query.

[Given a KG H that is potentially incomplete with respect to context and a graph query that is potentially incomplete with respect to context, retrieve a set of contextualized claims that answer the query]
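
To make the input/output contract of the paper's problem definition concrete, here is a minimal sketch under its stated assumptions (one-hop query, known anchor entity, one relation per question). The Triple type, the function names, and the example scores are hypothetical; how relation scores are obtained is exactly what the method has to learn and is not shown here.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


def one_hop_answers(graph: Iterable[Triple], anchor: str, relation: str) -> Set[str]:
    """Output (1): entities reachable from the anchor through one relation."""
    return {t.tail for t in graph if t.head == anchor and t.relation == relation}


def top_k_relations(relation_scores: Dict[str, float], k: int) -> List[str]:
    """Output (2): the k most likely query relations, best first."""
    ranked = sorted(relation_scores, key=lambda r: (-relation_scores[r], r))
    return ranked[:k]


# Invented example data.
graph = [
    Triple("Inception", "directed_by", "Christopher Nolan"),
    Triple("Inception", "has_genre", "Sci-Fi"),
]
scores = {"directed_by": 0.7, "written_by": 0.2, "has_genre": 0.1}

answers = one_hop_answers(graph, "Inception", top_k_relations(scores, 1)[0])
```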

3 PROPOSED METHOD

3.1 Model Overview

Given the ambiguous query and its anchor entity, we give the following lemma to decompose the problem of question answering over knowledge graph (KGQA).

Lemma 1. (KGQA Decomposition) Given an ambiguous query Q and its anchor node vq, let Pr(T|Q, vq) denote the probability that query relation T is generated from Q and let Pr(a|Q, vq) denote the probability that candidate answer a found by T is the true answer

... the main idea of question answering over knowledge graph. The KGQA system first transforms the input natural language question Q to a high quality query relation T, then finds the answer according to T and the anchor node vq.
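
These notes do not reproduce the formula of the lemma, so the following is only one plausible reading of the two-stage idea, not the paper's statement: the answer probability is obtained by summing over candidate relations T, weighting each by Pr(T|Q, vq). The uniform Pr(a|T, vq) over the answers reachable through T is an assumption made purely for the example, as are all names and numbers.

```python
def answer_probabilities(relation_probs, answers_by_relation):
    """Marginalise over candidate relations T:
    Pr(a | Q, vq) = sum over T of Pr(T | Q, vq) * Pr(a | T, vq).
    """
    probs = {}
    for relation, p_relation in relation_probs.items():
        answers = answers_by_relation.get(relation, set())
        for answer in answers:
            # Assumption: answers reachable through T are equally likely.
            probs[answer] = probs.get(answer, 0.0) + p_relation / len(answers)
    return probs


# Invented numbers: stage 1 output, Pr(T | Q, vq).
relation_probs = {"directed_by": 0.5, "written_by": 0.25, "starred_actor": 0.25}
# Stage 2: answers found by following each T from the anchor node.
answers_by_relation = {
    "directed_by": {"Christopher Nolan"},
    "written_by": {"Christopher Nolan"},
    "starred_actor": {"Leonardo DiCaprio", "Elliot Page"},
}

probs = answer_probabilities(relation_probs, answers_by_relation)
```

An answer supported by several plausible relations accumulates probability from each of them, which is why a better estimate of the relation distribution can change the answer ranking.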

3.2 Query Inference: Posterior of True Query

When the input query is a one-hop query, this problem is equivalent to the link prediction problem in the knowledge graph.

[Reuse of link prediction on graph embeddings]
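
As an illustration of how link prediction over embeddings can rank candidate relations between an anchor and a candidate answer, here is a sketch using a TransE-style score. TransE is swapped in here only because it is simple; these notes do not say which embedding model the paper uses, and the two-dimensional vectors are hand-picked rather than trained.

```python
import math

# Hand-picked toy embeddings (not trained); names are illustrative only.
ENTITY = {
    "Inception": [0.0, 0.0],
    "Christopher Nolan": [1.0, 0.0],
    "Sci-Fi": [0.0, 1.0],
}
RELATION = {
    "directed_by": [1.0, 0.0],
    "has_genre": [0.0, 1.0],
}


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative distance between head + relation and tail."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_relations(head, tail):
    """Link prediction over relations: most plausible relation first."""
    return sorted(RELATION, key=lambda r: -transe_score(head, r, tail))


ranking = rank_relations("Inception", "Christopher Nolan")
```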

3.3 Query Inference Training

3.4 Query Ranking: Likelihood of Ambiguous Query

3.5 Answering Re-ranking

4 EXPERIMENTS

4.1 Experimental Setting

WebQuestionsSP - Freebase.
SimpleQuestions - Freebase.
MetaQA - domain KG contains information about directors, movies, genres and actors.

[Does not use WD]

In the experiment, we test the effectiveness of PReFNet on complete KG and incomplete KG with 50% and 20% missing edges respectively. All missing edges are randomly deleted.

[Incomplete KG]
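
The incomplete-KG setting can be reproduced in spirit by deleting a random fraction of the triples. This is a sketch of that idea only; the function name, the seeding, and the rounding rule are assumptions, since the notes do not describe the authors' exact procedure.

```python
import random


def drop_edges(triples, missing_fraction, seed=0):
    """Return a copy of the graph with a random fraction of its edges deleted."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    n_keep = len(triples) - round(len(triples) * missing_fraction)
    return rng.sample(triples, n_keep)


# Invented chain graph with 10 edges.
full_kg = [(f"e{i}", "related_to", f"e{i + 1}") for i in range(10)]

kg_50 = drop_edges(full_kg, 0.5)  # 50% of the edges missing
kg_20 = drop_edges(full_kg, 0.2)  # 20% of the edges missing
```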

We test the query ranking performance on 4 baselines

[Ranking of the generated queries]

We test question answering performance on 3 baselines

[Comparison of the generated answers]

4.2 Performance of Query Ranking

Traditional KBQA methods usually transform the natural language query to a query graph, and then find the answer according to the query graph. However, because of the ambiguity in the input query, the generated query graph is usually inaccurate. The pseudo relevance feedback, on the other hand, can infer queries according to the top candidate answers.

[Transforming NL into GQL]

4.3 Performance of Question Answering

Among all the methods, EmbedKGQA with relation matching can achieve the highest accuracy. PReFNet further increases the accuracy by 1% on average.

[The gain is small; is it worth the effort?]

4.4 Efficiency

4.5 Ablation Study
A - Query Inference. In this subsection, we show the effectiveness of the query inference module. We first pretrain the module only on the background knowledge graph of each dataset, and then retrain the module on the question training dataset.

B - Query Ranking.
In this subsection, we show the effectiveness of the query ranking module. Some examples are shown in Table 7. As we can see, when the input question is ambiguous, it is very hard to correctly predict its true query intention.

C - Question Answering
More specifically, the relation matching process of EmbedKGQA finds the shortest path between the anchor node vq and candidate answer a, and orders the candidate answers according to their shortest paths.
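
The shortest-path step can be sketched with a breadth-first search that records the relations along the path. This is not EmbedKGQA's code: the function names are hypothetical, edges are traversed in both directions as an assumption, and ordering candidates by path length is a stand-in, because the quoted sentence does not state the exact ordering criterion.

```python
from collections import deque


def shortest_relation_path(triples, source, target):
    """BFS returning the relation sequence of a shortest path, or None if unreachable."""
    adjacency = {}
    for head, relation, tail in triples:
        adjacency.setdefault(head, []).append((relation, tail))
        # Assumption: edges may also be traversed against their direction.
        adjacency.setdefault(tail, []).append((relation + "^-1", head))
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for relation, neighbour in adjacency.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [relation]))
    return None


def order_candidates(triples, anchor, candidates):
    """Order reachable candidates by shortest-path length (stand-in criterion)."""
    paths = {c: shortest_relation_path(triples, anchor, c) for c in candidates}
    reachable = [c for c in candidates if paths[c] is not None]
    return sorted(reachable, key=lambda c: (len(paths[c]), c))


# Invented example data.
kg = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Dunkirk", "directed_by", "Christopher Nolan"),
]
ordered = order_candidates(kg, "Inception", ["Dunkirk", "Christopher Nolan", "Tenet"])
```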

5 RELATED WORK

5.1 Knowledge Graph Question Answering

Knowledge graph has many applications [2, 3, 6, 7, 9, 12, 15, 19, 24–26]. Among them, Knowledge Graph Question Answering has been studied for a long time. When the input query is a natural language sentence, a general strategy to answer the question is to transform the question to a query graph, and search the answer according to the query graph.
For example, in [28], Xi et.al. propose a model which contains candidate query graphs ranking component and a true query graph generation component. By iteratively updating these two components, both components’ performance can be improved. The query graph is finally generated by the second component and can be used to search the KG.
In [14], Liu et.al. propose a multi-task model to tackle KGQA and KGC at the same time.
Other methods, e.g., [20], [16], directly learn an embedding from the natural language sentence and search answers in the embedding space. When the input query is a graph query, [18] models different operations in the query graph as different neural networks and transforms the query process to an entity search problem in the embedding space. In principle, all of them can be used as the query system in our method. That is, the top-k answers of these methods can be treated as the pseudo relevance feedback of our method.

[Mapping to the GQL query: this can follow existing KGQA approaches. The same holds for the "Best" Possible Answer, although it would always be without the context (?)]

5.2 Relevance Feedback

Relevance feedback is a widely studied topic in Information Retrieval. However, it has not been well studied for graph data. In [21], Su et.al. use relevance feedback to infer additional information and use them to enrich the query. The original ranking function is re-tuned according to the results in relevance feedback. In [11], Matteo et.al. concentrate on assisting the user by expanding the query according to the additional information in relevance feedback to provide a more informative (full) query that can retrieve more detailed and relevant answers. However, different from our work which aims to infer the true intention of users, they expand the query graph at each round until they find the answer. In other words, the setting is different.

[This is why this work is not expansion; it is rewriting that retrieves a set of relevant answers]

5.3 Variational Inference

The goal of variational inference is to approximate difficult-to-compute posterior density. In [29], Zhang et.al. treat the topic entity in the input question as a latent variable and utilize variational reasoning network to handle noise in questions, and learn multi-hop reasoning simultaneously. In [17], Qu et.al. propose a probabilistic model called RNNLogic which treats logic rules as latent variables, and simultaneously trains a rule generator as well as a reasoning predictor with logic rules. These logic rules are similar to the latent paths in our model. There are many other works using variational inference. Different from these works, we are the first to utilize variational inference in relevance feedback on graph data.

6 CONCLUSION

[The problem is related to ours because the query is incomplete, vague, and ambiguous. However, we only address this aspect with respect to context, not other parts of the query. Still, starting from the (context-free, one-hop) query and the top-k answers generated by this method, it would be possible to add context handling to produce the best answer, identifying the explicit context in the KG and the implicit context inferred by the rules]

 
