Exploratory Querying of Extended Knowledge Graphs - Paper Reading

Mohamed Yahya, Klaus Berberich, Maya Ramanath, and Gerhard Weikum. 2016. Exploratory querying of extended knowledge graphs. Proc. VLDB Endow. 9, 13 (September 2016), 1521–1524. https://doi.org/10.14778/3007263.3007299

ABSTRACT
... However, querying a KG to explore entities and discover facts is difficult and tedious, even for users with skills in SPARQL. First, users are not familiar with the structure and labels of entities, classes and relations. Second, KGs are bound to be incomplete, as they capture only major facts about entities and their relationships and miss out on many of the more subtle aspects.
We demonstrate TriniT, a system that facilitates exploratory querying of large KGs, by addressing these issues of "vocabulary" mismatch and KG incompleteness.

TriniT supports query relaxation rules that are invoked to allow for relevant answers which are not found otherwise. 

The incompleteness issue is addressed by extending a KG with additional text-style token triples obtained by running Open IE on Web and text sources.

[Two research problems: (symbolic) vocabulary mismatch and KB incompleteness. Addressed through rule-based query relaxation and by adding triples from external text sources at query time.]

1. MOTIVATION & INTRODUCTION

The power of this model comes from the flexibility with which new data can be added without the need for a schema to be defined upfront. ... The flexibility with which data can be added to a KG comes at a price: users who are not familiar with the detailed structure and node and edge labels of the KG often have a hard time formulating proper queries.

[If the schema were known and defined a priori, would it be easier to formulate queries? Would it really?]

The vocabulary of the KG, though, is not rich enough to support all these subtleties, resulting in an empty answer.

[Would a thesaurus help? Do word embeddings solve this?]

The examples point out two major problems that users have in exploratory querying of KGs. First, the user is not familiar with the structure and vocabulary of the KG (users A, B, C). Second, the KG itself is incomplete, lacking specific knowledge required to satisfy a user's query or not supporting the query at all (users C, D). In some cases, multiple rounds of query reformulation may eventually lead to the desired answers, but users will likely view this as tedious.

[The two problems illustrated by the examples (nice). If the process is tedious, user satisfaction will tend to be low.]

2. FRAMEWORK

Extended Knowledge Graph (XKG): No KG will ever be complete. ... However, we can run Open Information Extraction tools (e.g. ReVerb/OLLIE [2]) on these sources and collect textual triples consisting of two noun phrases (for S and O) and a verbal phrase connecting them (for P).

[But here the KG is being extended at query time. It could instead be extended through engineering tasks: use the query history to extract information from text corpora and grow the KG.]
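To make the idea of an XKG more concrete, here is a minimal sketch in Python. It assumes a toy in-memory store that mixes curated KG triples with textual Open IE triples, and a naive matcher for patterns that combine variables, KG identifiers, and quoted text tokens. The names `Triple`, `XKG`, `term_matches` and `match`, the sample facts, and the token-containment semantics are all hypothetical illustrations; they are not taken from TriniT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    s: str
    p: str
    o: str
    source: str  # "KG" for curated facts, "OpenIE" for textual triples


# Toy data invented for illustration only.
XKG = [
    Triple("AlbertEinstein", "bornIn", "Ulm", "KG"),
    Triple("AlbertEinstein", "affiliation", "IAS", "KG"),
    Triple("Albert Einstein", "gave lectures at", "Princeton University", "OpenIE"),
]


def is_var(term):
    return term.startswith("?")


def term_matches(pattern_term, value):
    """Match one pattern term against one triple component."""
    if is_var(pattern_term):
        return True  # a variable matches anything
    if pattern_term.startswith("'") and pattern_term.endswith("'"):
        # Assumed semantics for text-style tokens: every token of the
        # pattern must occur (case-insensitively) in the value.
        tokens = pattern_term.strip("'").lower().split()
        words = value.lower().split()
        return all(t in words for t in tokens)
    return pattern_term == value  # exact match on KG identifiers


def match(pattern, triples):
    """Return (variable bindings, matching triple) pairs for one pattern."""
    results = []
    for t in triples:
        values = (t.s, t.p, t.o)
        if all(term_matches(pt, v) for pt, v in zip(pattern, values)):
            binding = {pt: v for pt, v in zip(pattern, values) if is_var(pt)}
            results.append((binding, t))
    return results
```

For example, `match(("'einstein'", "'lectures'", "?x"), XKG)` would be answered by the Open IE triple only, while `match(("AlbertEinstein", "bornIn", "?x"), XKG)` is answered by the curated KG fact.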

Extended Triple Patterns

3. QUERY RELAXATION

A relaxation rule replaces a set of triple patterns in the original query with a set of new patterns. Each rule has a weight w ∈ [0, 1] that reflects the semantic similarity between the original set of triple patterns and their replacement, and is used to score answers and guide top-k query processing. The decision about which rules to invoke is adaptively made at run-time during top-k query processing.

[For an equivalent query, the similarity is 1.0. Predicate rewriting. The rules are mined or can be specified manually.]
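A minimal sketch of weighted relaxation, under simplifying assumptions: each rule rewrites a single predicate (the paper allows replacing whole sets of triple patterns), and all one-step relaxations are enumerated eagerly instead of being chosen adaptively during top-k processing. The names `Rule` and `relax`, the sample rules, and their weights are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    old_predicate: str
    new_predicate: str
    weight: float  # in [0, 1]; similarity between original and replacement


def relax(query, rules):
    """Return (query, weight) candidates, best weight first.

    The unmodified query keeps weight 1.0; each one-step rewrite
    inherits the weight of the rule that produced it.
    """
    candidates = [(tuple(query), 1.0)]
    for i, (s, p, o) in enumerate(query):
        for rule in rules:
            if rule.old_predicate == p:
                rewritten = list(query)
                rewritten[i] = (s, rule.new_predicate, o)
                candidates.append((tuple(rewritten), rule.weight))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


# Hypothetical rules: rewrite a KG predicate into text-style predicates.
rules = [
    Rule("affiliation", "'worked at'", 0.8),
    Rule("affiliation", "'lectured at'", 0.5),
]
query = [("?x", "affiliation", "PrincetonUniversity")]
candidates = relax(query, rules)
```

Here `candidates` holds the original query first, followed by the two rewrites in decreasing order of weight, so answers from stronger rewrites can be considered before weaker ones.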

4. ANSWER SCORING

Computing relevance scores and providing the user with a list of ranked results is crucial for knowledge exploration and general usability. We use a query-likelihood approach for scoring answers, which is standard in IR [18]. We adapt and extend this approach for our triple-pattern setting [14]. A triple pattern is viewed as a document that emits triples with certain probabilities. The probability assigned to an SPO fact in response to a triple pattern is proportional to the frequency with which the fact is observed (a tf-like effect) and inversely proportional to the total number of matches for the triple pattern (an idf-like effect corresponding to selectivity). Additionally, the scoring model takes into consideration the weights associated with a relaxation rule,...

[A metric similar to TF-IDF in IR]
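The scoring idea can be sketched as follows. This is a deliberately simplified reading of the quoted description, not the model from [14]: the probability of a fact is its observed frequency divided by the total frequency of all matches for the pattern, multiplied by the weight of the relaxation rule that produced the pattern. Smoothing and multi-pattern queries are omitted, and combining scores across relaxations by taking the maximum is an assumption made here for illustration. The function names `score_matches` and `rank` are hypothetical.

```python
def score_matches(match_counts, rule_weight=1.0):
    """Score each matching fact for one triple pattern.

    match_counts maps a fact to the number of times it was observed.
    Frequent facts score higher (tf-like effect); patterns with many
    matches spread the probability mass thinner (idf-like effect).
    """
    total = sum(match_counts.values())
    if total == 0:
        return {}
    return {fact: rule_weight * count / total
            for fact, count in match_counts.items()}


def rank(scored_lists, k):
    """Merge scores from the original and relaxed patterns, keep the top k.

    Assumption: a fact reached through several patterns keeps its best score.
    """
    best = {}
    for scores in scored_lists:
        for fact, score in scores.items():
            best[fact] = max(score, best.get(fact, 0.0))
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Invented observation counts for illustration.
original = score_matches({"fact_a": 3, "fact_b": 1})
relaxed = score_matches({"fact_c": 4}, rule_weight=0.5)
top = rank([original, relaxed], k=2)
```

With these invented counts, `fact_a` scores 0.75 under the original pattern and `fact_c` scores 0.5 under the relaxed one, so both outrank `fact_b` at 0.25.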

5. DEMONSTRATION

 

The TriniT interface allows users to pose queries on the XKG, with a mix of traditional-SPARQL triple patterns and text-style token triple patterns. Answers can be explored in the browser, with links to the XKG. Users can define their own relaxation rules.

[The input is neither a keyword query nor a natural-language question; it is SPARQL and triple patterns]

In particular, we can handle queries that connect multiple entities by their relationships, potentially returning lists of entity pairs or entire tuples.

[It does not retrieve only triples]

Answer Explanation: Since the user's original query can be modified by relaxation rules to obtain more relevant answers, it is important for the user to understand how a specific answer was obtained. TriniT's answer explanation interface shows an answer's provenance.
The answer explanation provides three important pieces of information: (i) the KG triples that contributed to an answer, (ii) the XKG triples that contributed to an answer and their provenance, and (iii) the relaxation rules that were invoked to obtain an answer. In addition to showing the changes made to the user's query to obtain answers, answer explanation helps users better understand the schema of the underlying KG and its shortcomings. In the long term, this allows her to formulate queries better aligned with the KG.

Query Suggestion: Finally, TriniT suggests alternative formulations of the user's original query that are more suitable for the KG. This helps the user to learn more about the structure and node/edge labels of the underlying KG, making future queries easier to formulate.

[The user learns the KG schema as they interact with it]

This way, the user gradually gains a better understanding of the KG. 

6. RELATED WORK

Exploratory querying of databases in general has recently received much attention [7]. ... None of these approaches address the pain points of formulating queries and of coping with incomplete data or knowledge bases.

[Dealing with incompleteness and with the user's unfamiliarity with the schema]

In fact, we plan to use it as back-end for our own work on QA [13].

[Develop in layers; this work solved the conversion / adjustment of the GQL]

[7] G. Koutrika, L. V. S. Lakshmanan, M. Riedewald, K. Stefanidis. Exploratory Search in Databases and the Web. EDBT/ICDT Workshop 2014

Consequently, there is a need to develop novel paradigms for exploratory user-data interactions that emphasize user context and interactivity with the goal of facilitating exploration, interpretation, retrieval, and assimilation of information.
A huge number of applications need an exploratory form of querying. Ranked retrieval techniques for relational databases, XML, RDF and graph databases, text and multimedia databases, scientific and statistical databases, social networks and many others, is a first step towards this direction. Recently, several new aspects for exploratory search, such as preferences, diversity, novelty and surprise, are gaining increasing importance.
From a different perspective, recommendation applications tend to anticipate user needs by automatically suggesting the information which is most appropriate to the users and their current context.
Also, a new line of research in the area of exploratory search is fueled by the growth of online social interactions within social networks and web communities. Many useful facts about entities (e.g. people, locations, organizations, products) and their relationships can be found in a multitude of semi-structured and structured data sources such as Wikipedia, Linked Data cloud, Freebase, and many others.
Therefore, novel discovery methods are required to provide highly expressive discovery capabilities over large amounts of entity-relationship data, which are yet intuitive for end-users.

[14] M. Yahya, D. Barbosa, K. Berberich, Q. Wang, G. Weikum. Relationship Queries on Extended Knowledge Graphs. WSDM 2016: 605–614
