TempQA-WD: a benchmark for KGQA using Wikidata and temporal context

Sumit Neelam, Udit Sharma, Hima Karanam, Shajith Ikbal, Pavan Kapanipathi, Ibrahim Abdelaziz, Nandana Mihindukulasooriya, Young-Suk Lee, Santosh K. Srivastava, Cezar Pendus, Saswati Dana, Dinesh Garg, Achille Fokoue, G. P. Shrivatsa Bhargav, Dinesh Khandelwal, Srinivas Ravishankar, Sairam Gurajada, Maria Chang, Rosario Uceda-Sosa, Salim Roukos, Alexander G. Gray, Guilherme Lima, Ryan Riegel, Francois P. S. Luus, L. Venkata Subramaniam:

A Benchmark for Generalizable and Interpretable Temporal Question Answering over Knowledge Bases. CoRR abs/2201.05793 (2022)

Abstract

Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most existing KBQA datasets focus primarily on generic multi-hop reasoning over explicit facts, largely ignoring other reasoning types such as temporal, spatial, and taxonomic reasoning.

[Reasoning in the sense of inferring which context is of interest]

In this paper, we present a benchmark dataset for temporal reasoning, TempQA-WD, to encourage research in extending the present approaches to target a more challenging set of complex reasoning tasks. Specifically, our benchmark is a temporal question answering dataset with the following advantages: (a) it is based on Wikidata, which is the most frequently curated, openly available knowledge base, (b) it includes intermediate SPARQL queries to facilitate the evaluation of semantic parsing based approaches for KBQA, and (c) it generalizes to multiple knowledge bases: Freebase and Wikidata.

[Another one based on Wikidata]

The TempQA-WD dataset is available at https://github.com/IBM/tempqa-wd.

1 Introduction

The goal of KBQA systems is to answer natural language questions by retrieving and reasoning over facts in a Knowledge Base (KB).

Currently, there is a lack of approaches and datasets that address other types of complex reasoning, such as temporal and spatial reasoning. In this paper, we focus on a specific category of questions called temporal questions, where answering a question requires reasoning about points and intervals in time.

[It is not only "when" questions; it also covers facts that hold at the same time and chains of facts]

Our aim in this paper is to fill the above-mentioned gaps by adapting the TempQuestions dataset to Wikidata and by enhancing it with additional SPARQL query annotations. Having SPARQL queries for a temporal dataset is crucial to refresh ground truth answers as the KB evolves. We choose Wikidata for this dataset because it is well structured, fast evolving, and the most up-to-date KB, making it a suitable candidate for temporal KBQA.

[In my case it has to be Wikidata because it is a hypergraph, that is, the facts are contextualized because the edges carry qualifiers]

... help drive research towards development of generalizable approaches, i.e., those that could easily be adapted to multiple KBs.

[Generalize to other contexts]

2 Related Work

Table 2: This table compares most of the KBQA datasets based on features relevant to the work (Multi-hop x Temporal Context)

3 Dataset

TempQuestions (Jia et al., 2018a) was the first KBQA dataset intended to focus specifically on temporal reasoning.

We adapt TempQuestions to Wikidata to create a temporal QA dataset that has three desirable properties. First, in identifying answers in Wikidata, we create a generalizable benchmark that has parallel annotations on two KBs. Second, we take advantage of Wikidata’s evolving, up-to-date knowledge. Lastly, we enhance TempQuestions with SPARQL, entity, and relation annotations so that we may evaluate intermediate outputs of KBQA systems.

[Having the question and the answer is not enough; the SPARQL query is needed to evaluate whether the system generated an identical or similar query]

3.1 Wikidata

We chose Wikidata as our knowledge base as it has many temporal facts with appropriate knowledge representation encoded.

It supports reification of statements (triples) to add additional metadata with qualifiers such as start date, end date, point in time, location etc. ... With such representation and the availability of up-to-date information, Wikidata makes it a good choice to build benchmark datasets to test different kinds of reasoning including temporal reasoning.

[Qualifiers add context to the facts, but the reification happens when the data is loaded into Blazegraph, because RDF does not support them]
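To make the qualifier idea concrete, here is a minimal sketch of how a query over reified statements could be assembled. It only builds the query text and does not call any endpoint. The function name `build_qualified_query` is hypothetical, the identifiers in the usage note are placeholders for illustration, and the query shape is my assumption rather than something taken from the dataset. It relies on the Wikidata RDF prefixes `p:`, `ps:` and `pq:` and on the qualifier properties P580 (start time) and P582 (end time).

```python
import re


def build_qualified_query(subject_qid: str, property_pid: str) -> str:
    """Build a SPARQL query that returns the values of one property
    together with the start/end time qualifiers of each statement.

    Hypothetical helper: it only assembles text, it does not run the query.
    """
    # Reject anything that is not a plain Wikidata identifier.
    if not re.fullmatch(r"Q\d+", subject_qid):
        raise ValueError(f"not an item id: {subject_qid!r}")
    if not re.fullmatch(r"P\d+", property_pid):
        raise ValueError(f"not a property id: {property_pid!r}")

    return f"""
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

SELECT ?value ?start ?end WHERE {{
  wd:{subject_qid} p:{property_pid} ?statement .   # the reified statement node
  ?statement ps:{property_pid} ?value .            # the main value of the fact
  OPTIONAL {{ ?statement pq:P580 ?start . }}       # qualifier: start time
  OPTIONAL {{ ?statement pq:P582 ?end . }}         # qualifier: end time
}}
""".strip()
```

For example, `print(build_qualified_query("Q42", "P69"))` prints a query whose `?statement` node is where the temporal context of each fact is attached.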

3.2 Dataset Details

We took all the questions from TempQuestions dataset (of size 1271) and chose a subset for which we could find Wikidata answers. This subset has 839 questions that constitute our new dataset, TempQA-WD. We annotated this set with their corresponding Wikidata SPARQL queries and the derived answers.

Within this dataset, we also chose a smaller subset (of size 175) for more detailed annotations. ... The goal of these additional annotations is to encourage improved interpretability of the temporal KBQA systems, i.e., to evaluate accuracy of outputs expected at intermediate stages of the system.

[Would interpretability be the same as explainability? Could the user have access to the query in order to understand the underlying subgraph that produced the results?]

3.2.1 Question Complexity Categorization

In this dataset, we also labeled questions with a complexity category based on the complexity of the question in terms of the temporal reasoning required to answer it.

[This is not graph pattern complexity: BGP vs. CGP]

1) Simple: Questions that involve one temporal event and need no temporal reasoning to derive the answer. For example, questions involving simple retrieval of a temporal fact or simple retrieval of other answer types using a temporal fact.

[BGP lookup]

2) Medium: Questions that involve two temporal events and need temporal reasoning (such as overlap/before/after) using time intervals of those events. We also include those questions that involve single temporal event but need additional non-temporal reasoning.

[CGP with reasoning]

3) Complex: Questions that involve two or more temporal events, need one temporal reasoning step, and also need additional temporal or non-temporal reasoning, such as teenager, spatial, or class hierarchy reasoning.

[CGP with reasoning over more than one aspect]
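The three categories above can be read as a small decision rule. The sketch below is my own reading of the definitions, not code from the paper: the function `categorize` and its three counters are hypothetical, and combinations the definitions do not cover are reported as "unspecified" rather than guessed.

```python
def categorize(num_events: int, temporal_steps: int, other_steps: int) -> str:
    """Map a question to Simple / Medium / Complex.

    num_events     -- temporal events mentioned in the question
    temporal_steps -- temporal reasoning steps (overlap, before, after, teenager, ...)
    other_steps    -- non-temporal reasoning steps (spatial, class hierarchy, ...)

    Assumption: this is one possible reading of the category definitions.
    """
    total_steps = temporal_steps + other_steps

    # Complex: two or more events, one temporal step plus at least one more step.
    if num_events >= 2 and temporal_steps >= 1 and total_steps >= 2:
        return "complex"
    # Medium (a): two events compared through a single temporal step.
    if num_events == 2 and temporal_steps >= 1:
        return "medium"
    # Medium (b): one event plus additional non-temporal reasoning.
    if num_events == 1 and other_steps >= 1:
        return "medium"
    # Simple: one event, plain retrieval with no temporal reasoning.
    if num_events == 1 and temporal_steps == 0:
        return "simple"
    # Anything else is not covered by the definitions as summarized here.
    return "unspecified"
```

Under this reading, a question with one event and no reasoning is "simple", two events joined by one overlap check is "medium", and two events with an overlap check plus a class hierarchy step is "complex".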

In addition to the constants and logical connectives, we introduced some new temporal functions and instance variables to avoid function nesting.

The new functions are interval, overlap, before, after, teenager, and year, where interval gets the time interval associated with an event, and overlap, before, and after are used to compare temporal events ...

[Rules for parsing]
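To get a feel for the temporal functions listed above, here is a minimal sketch of what they could compute. The semantics are my assumptions, not the paper's definitions: intervals are closed date ranges, `teenager` spans from the 13th birthday to the day before the 20th birthday, and `year` covers January 1 to December 31. The `Interval` and `Event` classes are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Interval:
    """Closed time interval [start, end] (assumed representation)."""
    start: date
    end: date

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("interval ends before it starts")


@dataclass(frozen=True)
class Event:
    """Hypothetical event record carrying its own time span."""
    name: str
    start: date
    end: date


def interval(event: Event) -> Interval:
    """Time interval associated with an event."""
    return Interval(event.start, event.end)


def overlap(a: Interval, b: Interval) -> bool:
    """True when the two closed intervals share at least one day."""
    return a.start <= b.end and b.start <= a.end


def before(a: Interval, b: Interval) -> bool:
    """True when a ends strictly before b starts."""
    return a.end < b.start


def after(a: Interval, b: Interval) -> bool:
    """True when a starts strictly after b ends."""
    return before(b, a)


def year(y: int) -> Interval:
    """Interval covering a whole calendar year."""
    return Interval(date(y, 1, 1), date(y, 12, 31))


def _add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def teenager(birth: date) -> Interval:
    """Assumed teenage span: 13th birthday up to the day before the 20th."""
    return Interval(_add_years(birth, 13), _add_years(birth, 20) - timedelta(days=1))
```

A question of the form "who held position X when person Y was a teenager" would then reduce to `overlap(interval(term), teenager(birth))`, with `term` being an `Event` built from the start and end qualifiers of the statement.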

4 Evaluation

4.2 Metrics

We use GERBIL

[same as in QALD-9 Plus]

We use standard performance metrics typically used for KBQA systems, namely macro precision, macro recall and F1.
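As a reminder of what the macro metrics mean, here is a minimal sketch that scores each question on its answer set and then averages over questions. It is not GERBIL's implementation. Two conventions are my assumptions: a question where both the gold and the predicted answer sets are empty scores 1.0 on all three metrics, and macro F1 is taken as the mean of the per-question F1 values. The function names are hypothetical.

```python
from typing import Iterable, Set, Tuple


def prf(gold: Set[str], pred: Set[str]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the answer sets of one question."""
    # Assumed convention: nothing expected and nothing returned is a perfect score.
    if not gold and not pred:
        return 1.0, 1.0, 1.0
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_scores(
    pairs: Iterable[Tuple[Set[str], Set[str]]]
) -> Tuple[float, float, float]:
    """Average per-question precision, recall and F1 over (gold, pred) pairs."""
    scores = [prf(set(gold), set(pred)) for gold, pred in pairs]
    if not scores:
        raise ValueError("no questions to score")
    n = len(scores)
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )
```

Because the average is taken per question, a question with many answers weighs the same as a question with a single answer.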

4.3 Results & Discussion

How can I use this in my research?

1) Evaluate the questions and understand this classification

2) Adapt it to other contextual dimensions, such as spatial, provenance, and thematic

3) Explore reasoning further, in the sense of subclass/instance hierarchies



Comments

  1. New URLs

    https://github.com/IBM/tempqa-wd

    https://ibm.github.io/neuro-symbolic-ai/toolkit/tempqa-wd
