
Automatic Question-Answer Generation for Long-Tail Knowledge - Paper Reading

https://knowledge-nlp.github.io/kdd2023/papers/Kumar5.pdf

https://github.com/isunitha98selvan/odqa-tail

ABSTRACT
Pretrained Large Language Models (LLMs) have gained significant attention for addressing open-domain Question Answering (QA). While they exhibit high accuracy in answering questions related to common knowledge, LLMs encounter difficulties in learning about uncommon long-tail knowledge (tail entities).  

[Entities with little available information, not especially popular or common in the general public's interest]

1 INTRODUCTION

However, the impressive achievements of LLMs in QA tasks are primarily observed with regard to common concepts that frequently appear on the internet (referred to as "head entities"), which are
thus more likely to be learned effectively by LLMs during pretraining time. Conversely, when it comes to dealing with long-tail knowledge, which encompasses rarely occurring entities (referred to as "tail entities"), LLMs struggle to provide accurate answers and often exhibit hallucination issues [5]. Due to the predominant focus of most QA datasets on head entities [3, 6, 10], research investigating the performance of LLMs on long-tail knowledge has been limited.

[The long-tail concept and its impact on LLMs. KGs can cover both tail and head entities, and can also represent claims in recurring (default) contexts and in specific contexts]

In this study, we propose a novel approach to defining tail entities based on their degree information in Wikidata, as opposed to relying on Wikipedia [7]. By doing so, we generate QA datasets with distributions distinct from previous works [7], thus fostering diversity within tail-knowledge QA datasets. Within the context of Wikidata, the degrees of entities reflect their level of engagement with general knowledge. Hence, we leverage this degree information to define tail entities.

[Metric for defining tail entities]

 Moreover, we investigate strategies to enhance the performance of pretrained LLMs by incorporating external resources, such as external documents or knowledge graphs, during inference time on our automatically-generated long-tail QA datasets.

[Integrate LLM and KG]

Introduction of novel tail knowledge QA datasets derived from the Wikidata knowledge graph 

[Would this dataset include examples of claims with context?]

2 RELATED WORK

Kandpal et al. [7] show that an LLM’s ability to answer a question is affected by how many times it has
seen relevant documents related to the question in its pre-training data. They show that LLMs struggle to reason accurately over rarer entities in the pre-training data.

In this work, instead of using the pre-training corpus, we define tail entities using Wikidata knowledge
graphs and construct a long-tail knowledge dataset that can be used to study the open-domain QA performance of LLMs.

3 AUTOMATIC GENERATION OF QA DATASETS FOR LONG-TAIL KNOWLEDGE

 

We define tail entities based on each entity's node degree (i.e., the number of triplets that have the target entity as a subject node) in the knowledge graph. We first sample tail entities based on their degree information and extract all triplets that have the tail entities as the subject entity from Wikidata (proper degree bounds of tail entities will be discussed in the following section). Then we generate factoid questions by prompting LLMs with triplets.
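The degree-based sampling step above can be sketched as follows; function and variable names here are ours, not the paper's, and the toy triplets are illustrative only:

```python
from collections import Counter

def sample_tail_entities(triplets, max_degree):
    """Count each entity's out-degree (triplets where it is the subject)
    and keep all triplets whose subject is at or below the degree bound."""
    degree = Counter(subj for subj, _, _ in triplets)
    tails = {e for e, d in degree.items() if d <= max_degree}
    return [t for t in triplets if t[0] in tails]

# Toy knowledge graph as (subject, property, object) triplets.
kg = [
    ("david peel yates", "conflict", "world war ii"),
    ("david peel yates", "occupation", "army officer"),
    ("london", "instance of", "city"),
    ("london", "capital of", "united kingdom"),
    ("london", "part of", "greater london"),
]
print(sample_tail_entities(kg, max_degree=2))
```

Here "london" (degree 3) is excluded, while "david peel yates" (degree 2) qualifies as a tail entity and all of its triplets are kept for question generation.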

(Figure: LLM prompt used for question generation)

3.2.1 Degree bounds for tail entities. There is no strictly formulated, widely accepted definition of tail entities. Degree bounds that immediately produce differences in model performance are also hard to decide in advance. As a result, degree bounds for tail entities must be selected somewhat arbitrarily. In our experiments, we classify entities with node degrees between 15 and 100 as coarse-tail entities and entities with node degrees below 3 as fine-tail entities, and compare LLM performance on them.

[Degree does not take into account the qualifiers or references of the statements attached to the subject node]

Ambiguous entities: Multiple entities can have the same surface forms. 

[Differentiate by the Entity's Identity, which would not be the QNode, since that is an artificial key]

Ambiguous properties: In Wikidata, a large number of properties cannot be used to generate sensible questions. For instance, subclass of, instance of, or part of would generate questions that are too vague to answer even for humans.
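Filtering such properties amounts to a blacklist over the triplets; the set below contains only the three properties the text names, and the function name is our assumption:

```python
# Properties the paper cites as producing questions too vague to answer.
AMBIGUOUS_PROPERTIES = {"subclass of", "instance of", "part of"}

def filter_triplets(triplets):
    """Drop triplets whose property would yield an unanswerable question."""
    return [(s, p, o) for s, p, o in triplets
            if p not in AMBIGUOUS_PROPERTIES]

kg = [
    ("lovelyz", "instance of", "girl group"),
    ("lovelyz", "location of formation", "seoul"),
]
print(filter_triplets(kg))
```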

[Part of for spatial objects can be a Location/Locality Context]

3.2.3 Difficulty control. Questions generated from different properties can have different levels of difficulty. 

[Number of possible answers]

3.2.4 LLM prompt for question generation. While the answer entity of a triplet is not part of the generated question, we find that the quality of generated questions improves when the complete triplet is provided in the prompt, instead of the first two elements (i.e., subject entity and property). For instance, given a triplet [david peel yates, conflict, world war ii], we get "What conflict was David Peel Yates involved in?" from GPT3 when using just the subject entity and property in prompt. On the contrary, when we use all subject, property, and object entities, the generated question becomes "What conflict did David Peel Yates serve in?".
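The two prompting variants can be contrasted in a short sketch; the wording of the templates is our assumption, not the paper's actual prompt, which is what matters here is that the answer entity is included in the full-triplet variant:

```python
def qgen_prompt(subj, prop, obj=None):
    """Build a question-generation prompt from a Wikidata triplet.
    Including the object entity (the eventual answer) in the prompt
    tends to yield better-phrased questions, per the paper."""
    if obj is None:
        # Variant 1: subject entity and property only.
        return (f"Write a question about the entity '{subj}' "
                f"and its property '{prop}'.")
    # Variant 2: full triplet, with the answer visible to the LLM.
    return (f"Write a question whose answer is '{obj}', about the "
            f"entity '{subj}' and its property '{prop}'.")

print(qgen_prompt("david peel yates", "conflict"))
print(qgen_prompt("david peel yates", "conflict", "world war ii"))
```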

[The LLM needs to know the answer in order to formulate the best question. How could a person in an exploratory process do this with little domain knowledge? Only through successive refinements based on what they learn from previous answers]

3.2.5 Granularity of questions. Given a question, there could be several correct answers with different granularity. Unless the question specifies the granularity of the answer (e.g., which country or which city), QA datasets and models could easily pick different granularity of answers. For instance, when asked Where was Lovelyz formed?, a model could answer South Korea while the QA dataset has Seoul (the capital of South Korea) as the correct answer and marks the predicted answer wrong.
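The granularity problem shows up directly in exact-match scoring against the gold answer and its aliases; a minimal sketch (function name and matching policy are our assumptions):

```python
def is_correct(prediction, answer, aliases=()):
    """Exact-match scoring against the gold answer and its aliases.
    Granularity mismatches (e.g. country vs. city) still count as wrong
    unless the coarser answer happens to be listed as an alias."""
    gold = {answer.lower(), *(a.lower() for a in aliases)}
    return prediction.strip().lower() in gold

# "South Korea" is a correct-but-coarser answer for "Where was Lovelyz
# formed?", yet exact match fails because only "Seoul" is gold.
print(is_correct("Seoul", "Seoul", aliases=["Seoul, South Korea"]))  # True
print(is_correct("South Korea", "Seoul"))                            # False
```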

4 EVALUATION WITH LLMS AND EXTERNAL RESOURCES

Wikidata: The Wikidata knowledge graph consists of 103,305,143 entities and 11,007 properties. We access Wikidata using the Sling tool [17] in a triplet format (subject, property, object).

[They did not use qualifiers or references]

Tail-entity datasets: We sample triplets from Wikidata to create Coarse-tail and Fine-tail datasets. Each dataset has 27,691 triplets and 422 unique properties after difficulty control (details in Section 3.2.3). One question-answer pair consists of a GPT3-generated question, an answer (i.e., the object entity in the original triplet), and associated aliases for the answer.
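One record of such a dataset might look like the following; the field names are our assumption based on the description above, not the paper's released schema:

```python
# Hypothetical layout of one generated QA pair.
qa_pair = {
    "question": "What conflict did David Peel Yates serve in?",  # GPT3-generated
    "answer": "world war ii",  # object entity of the source triplet
    "aliases": ["WWII", "WW2", "Second World War"],  # answer aliases
    "triplet": ("david peel yates", "conflict", "world war ii"),
}
print(sorted(qa_pair))
```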

4.4 LLM prompting with DPR and knowledge graphs

Knowledge graphs (KG) have been widely used to augment LLMs [19, 25]. In this section, we examine how external knowledge graphs can cooperate with another external resource, Wikipedia, to improve LLM performance for tail entities. We use Wikidata as our external knowledge graph after removing all triplets used for the QA generation.
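The general recipe of combining both resources at inference time is to prepend the retrieved evidence to the question before querying the LLM. A sketch under that assumption (the paper's exact prompt format may differ, and the function name is ours):

```python
def augmented_prompt(question, kg_triplets, passages):
    """Prepend Wikidata triplets and DPR-retrieved Wikipedia passages
    to the question, forming a single evidence-augmented LLM prompt."""
    facts = "\n".join(f"({s}, {p}, {o})" for s, p, o in kg_triplets)
    docs = "\n".join(passages)
    return (f"Knowledge graph facts:\n{facts}\n\n"
            f"Retrieved passages:\n{docs}\n\n"
            f"Question: {question}\nAnswer:")

print(augmented_prompt(
    "What conflict did David Peel Yates serve in?",
    [("david peel yates", "conflict", "world war ii")],
    ["David Peel Yates was a British Army officer."],
))
```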

5 CONCLUSION

Our work highlights the limitations of pre-trained LLMs in handling long-tail knowledge in open-domain Question Answering. To investigate this limitation, we first propose to generate QA datasets specialized for tail entities automatically, using degree information from the Wikidata knowledge graph. Our automatic QA generation approach aims to overcome the resource-intensive nature of manual dataset construction, allowing for the creation of diverse long-tail QA datasets.

[Did not use WD qualifiers and references. Possible future work: consider context in the metric for selecting the long-tail entities]

