
TempQA-WD: a benchmark for KGQA using Wikidata and temporal context

Sumit Neelam, Udit Sharma, Hima Karanam, Shajith Ikbal, Pavan Kapanipathi, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Young-Suk Lee, Santosh K. Srivastava, Cezar Pendus, Saswati Dana, Dinesh Garg, Achille Fokoue, G. P. Shrivatsa Bhargav, Dinesh Khandelwal, Srinivas Ravishankar, Sairam Gurajada, Maria Chang, Rosario Uceda-Sosa, Salim Roukos, Alexander G. Gray, Guilherme Lima, Ryan Riegel, Francois P. S. Luus, L. Venkata Subramaniam:

A Benchmark for Generalizable and Interpretable Temporal Question Answering over Knowledge Bases. CoRR abs/2201.05793 (2022)

Abstract

Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most existing KBQA datasets focus primarily on generic multi-hop reasoning over explicit facts, largely ignoring other reasoning types such as temporal, spatial, and taxonomic reasoning.

[Reasoning in the sense of inferring which context is of interest]

In this paper, we present a benchmark dataset for temporal reasoning, TempQA-WD, to encourage research in extending the present approaches to target a more challenging set of complex reasoning tasks. Specifically, our benchmark is a temporal question answering dataset with the following advantages: (a) it is based on Wikidata, which is the most frequently curated, openly available knowledge base, (b) it includes intermediate sparql queries to facilitate the evaluation of semantic parsing based approaches for KBQA, and (c) it generalizes to multiple knowledge bases: Freebase and Wikidata.

[Yet another benchmark based on Wikidata]

The TempQA-WD dataset is available at https://github.com/IBM/tempqa-wd.

1 Introduction

The goal of KBQA systems is to answer natural language questions by retrieving and reasoning over facts in a Knowledge Base (KB).

Currently, there is a lack of approaches and datasets that address other types of complex reasoning, such as temporal and spatial reasoning. In this paper, we focus on a specific category of questions called temporal questions, where answering a question requires reasoning about points and intervals in time.

[It is not only "when" questions; it also covers facts that hold at the same time and chains of facts]

Our aim in this paper is to fill the above mentioned gaps by adapting the TempQuestions dataset to Wikidata and by enhancing it with additional SPARQL query annotations. Having SPARQL queries for temporal dataset is crucial to refresh ground truth answers as the KB evolves. We choose Wikidata for this dataset because it is well structured, fast evolving, and the most up-to-date KB, making it a suitable candidate for temporal KBQA.

[In my case it has to be Wikidata because it is a hypergraph, i.e., the facts are contextualized because the edges carry qualifiers]

... help drive research towards development of generalizable approaches, i.e., those that could easily be adapted to multiple KBs.

[Generalize to other contexts]

2 Related Work

Table 2: This table compares most of the KBQA datasets based on features relevant to the work (Multi-hop x Temporal Context)

3 Dataset

TempQuestions (Jia et al., 2018a) was the first KBQA dataset intended to focus specifically on temporal reasoning

We adapt TempQuestions to Wikidata to create a temporal QA dataset that has three desirable properties. First, in identifying answers in Wikidata, we create a generalizable benchmark that has parallel annotations on two KBs. Second, we take advantage of Wikidata’s evolving, up-to-date knowledge. Lastly, we enhance TempQuestions with SPARQL, entity, and relation annotations so that we may evaluate intermediate outputs of KBQA systems.

[Having the question and the answer is not enough; the SPARQL query is needed to check whether the system generated an identical or similar query]

3.1 Wikidata

We chose Wikidata as our knowledge base as it has many temporal facts with appropriate knowledge representation encoded.

It supports reification of statements (triples) to add additional metadata with qualifiers such as start date, end date, point in time, location etc. ... With such representation and the availability of up-to-date information, Wikidata makes it a good choice to build benchmark datasets to test different kinds of reasoning including temporal reasoning.

[The qualifiers add context to the facts, but the reification happens when the data is loaded into Blazegraph, because RDF does not support qualifiers on triples natively]
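To make the qualifier idea concrete, here is a minimal sketch of how a reified statement can be queried in the Wikidata RDF model. It only builds the query text and does not call any endpoint. The function name `position_interval_query` is my own, and the example identifiers (Q76 for Barack Obama, Q11696 for President of the United States, P39 position held, P580 start time, P582 end time) are chosen for illustration and are not taken from the paper.

```python
def position_interval_query(person_qid: str, position_qid: str) -> str:
    """Build a SPARQL query that reads the start/end qualifiers of a
    'position held' (P39) statement. Hypothetical helper for illustration."""
    return f"""
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

SELECT ?start ?end WHERE {{
  wd:{person_qid} p:P39 ?statement .        # item -> statement node (the reified edge)
  ?statement ps:P39 wd:{position_qid} .     # statement node -> main value
  OPTIONAL {{ ?statement pq:P580 ?start . }}  # qualifier: start time
  OPTIONAL {{ ?statement pq:P582 ?end . }}    # qualifier: end time
}}
""".strip()


query = position_interval_query("Q76", "Q11696")
print(query)
```

The point of the sketch is the `p:` / `ps:` / `pq:` split: the fact itself becomes a node, and the temporal context hangs off that node as qualifiers.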

3.2 Dataset Details

We took all the questions from TempQuestions dataset (of size 1271) and chose a subset for which we could find Wikidata answers. This subset has 839 questions that constitute our new dataset, TempQA-WD. We annotated this set with their corresponding Wikidata SPARQL queries and the derived answers.

Within this dataset, we also chose a smaller subset (of size 175) for more detailed annotations. ... The goal of these additional annotations is to encourage improved interpretability of the temporal KBQA systems, i.e., to evaluate accuracy of outputs expected at intermediate stages of the system.

[Would interpretability be the same as explainability? Could the user have access to the query in order to understand the underlying subgraph that produced the results?]

3.2.1 Question Complexity Categorization

In this dataset, we also labeled questions with complexity category based on the complexity of the question in terms of temporal reasoning required to answer.

[This is not graph pattern complexity: BGP vs. CGP]

1) Simple: Questions that involve one temporal event and need no temporal reasoning to derive the answer. For example, questions involving simple retrieval of a temporal fact or simple retrieval of other answer types using a temporal fact.

[BGP lookup]

2) Medium: Questions that involve two temporal events and need temporal reasoning (such as overlap/before/after) using time intervals of those events. We also include those questions that involve single temporal event but need additional non-temporal reasoning.

[CGP with reasoning]

3) Complex: Questions that involve two or more temporal events, need one temporal reasoning and also need an additional temporal or non-temporal reasoning like teenager or spatial or class hierarchy

[CGP with reasoning over more than one aspect]
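My reading of the three categories can be written down as a small decision function. This is only a sketch of how I interpret the definitions quoted above, not the annotation procedure used by the authors. The function name, the three count parameters, and the "unspecified" fallback for combinations the text does not cover are all my assumptions.

```python
def categorize(temporal_events: int, temporal_steps: int, other_steps: int) -> str:
    """Map counts of temporal events and reasoning steps to a category.

    temporal_steps: temporal reasoning steps (e.g. overlap/before/after).
    other_steps: non-temporal reasoning steps (e.g. spatial, class hierarchy).
    Hypothetical helper; the thresholds follow my reading of the definitions.
    """
    if temporal_events < 1:
        raise ValueError("a temporal question needs at least one temporal event")
    # Complex: two or more events, one temporal step plus one additional step.
    if temporal_events >= 2 and temporal_steps >= 1 and temporal_steps + other_steps >= 2:
        return "complex"
    # Medium: two events compared through a single temporal step.
    if temporal_events == 2 and temporal_steps >= 1:
        return "medium"
    # One event and no temporal reasoning: medium if other reasoning is needed.
    if temporal_events == 1 and temporal_steps == 0:
        return "medium" if other_steps >= 1 else "simple"
    # Combinations the quoted definitions do not cover.
    return "unspecified"


print(categorize(1, 0, 0))
print(categorize(2, 1, 0))
print(categorize(2, 1, 1))
```

Writing it this way makes the gaps visible: for instance, a single event that needs temporal reasoning is not addressed by the definitions as quoted.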

In addition to the constants and logical connectives, we introduced some new temporal functions and instance variables to avoid function nesting.

interval, overlap, before, after, teenager, year; where interval gets time interval associated with event and overlap, before, after are used to compare temporal events ...

[Rules for parsing]
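The temporal functions listed above can be sketched over closed date intervals. This is my own minimal interpretation, not the paper's formal definitions: the `Interval` class, the dictionary of events, the closed-interval comparison, and the reading of `teenager` as the span from the 13th birthday to the day before the 20th are all assumptions, and the dates are made up.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Interval:
    start: date
    end: date


def interval(event: str, facts: dict) -> Interval:
    """Get the time interval associated with an event (here: a dict lookup)."""
    return facts[event]


def overlap(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one day."""
    return a.start <= b.end and b.start <= a.end


def before(a: Interval, b: Interval) -> bool:
    """True if a ends strictly before b starts."""
    return a.end < b.start


def after(a: Interval, b: Interval) -> bool:
    """True if a starts strictly after b ends."""
    return a.start > b.end


def year(d: date) -> int:
    return d.year


def _add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def teenager(birth: date) -> Interval:
    """Assumed reading: from the 13th birthday to the day before the 20th."""
    return Interval(_add_years(birth, 13), _add_years(birth, 20) - timedelta(days=1))


# Made-up events, only to exercise the functions.
facts = {
    "event_a": Interval(date(2000, 1, 1), date(2004, 12, 31)),
    "event_b": Interval(date(2003, 6, 1), date(2008, 1, 1)),
    "event_c": Interval(date(2010, 1, 1), date(2011, 1, 1)),
}
a, b, c = (interval(name, facts) for name in ("event_a", "event_b", "event_c"))
print(overlap(a, b), before(a, c), after(c, b))
print(teenager(date(1990, 5, 17)))
```

Because each function takes already-resolved intervals, a question such as "what happened during X" reduces to one `interval` lookup per event followed by a single comparison, which matches the idea of avoiding function nesting.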

4 Evaluation

4.2 Metrics
We use GERBIL

[same as QALD-9 Plus]

We use standard performance metrics typically used for KBQA systems, namely macro precision, macro recall and F1.
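As a reminder of what the macro metrics mean, here is a minimal sketch that scores each question on its answer set and then averages over questions. It is not GERBIL's implementation. The function names, the convention that an empty gold set matched by an empty prediction scores 1.0, and the choice to report F1 as the mean of per-question F1 values are my assumptions; the answer identifiers are placeholders.

```python
def prf(gold: set, pred: set) -> tuple:
    """Precision, recall and F1 for one question, computed on answer sets."""
    if not gold and not pred:
        return 1.0, 1.0, 1.0  # assumed convention for "no answer expected, none given"
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_scores(pairs: list) -> tuple:
    """Average per-question precision, recall and F1 over (gold, pred) pairs."""
    if not pairs:
        raise ValueError("need at least one (gold, predicted) pair")
    scores = [prf(gold, pred) for gold, pred in pairs]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Placeholder answers: one exact match, one partial match, one miss.
pairs = [
    ({"Q1"}, {"Q1"}),
    ({"Q2", "Q3"}, {"Q2"}),
    ({"Q4"}, set()),
]
macro_p, macro_r, macro_f1 = macro_scores(pairs)
print(round(macro_p, 3), round(macro_r, 3), round(macro_f1, 3))
```

Macro averaging gives every question the same weight regardless of how many answers it has, which matters here because some temporal questions have a single date as the answer and others have a list of entities.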

4.3 Results & Discussion

How can I use this in my research?

1) Go through the questions and understand this classification

2) Adapt it to other contextual dimensions such as spatial, provenance, and thematic

3) Explore reasoning further in the sense of subclass/instance hierarchy



Comments

  1. New URLs

    https://github.com/IBM/tempqa-wd

    https://ibm.github.io/neuro-symbolic-ai/toolkit/tempqa-wd
