Mohamed Yahya, Klaus Berberich, Maya Ramanath, and Gerhard Weikum. 2016. Exploratory querying of extended knowledge graphs. Proc. VLDB Endow. 9, 13 (September 2016), 1521–1524. https://doi.org/10.14778/3007263.3007299
ABSTRACT
... However, querying a KG to explore entities and discover facts is difficult and tedious, even for users with skills in SPARQL. First, users are not familiar with the structure and labels of entities, classes and relations. Second, KGs are bound to be incomplete, as they capture only major facts about entities and their relationships and miss out on many of the more subtle aspects.
We demonstrate TriniT, a system that facilitates exploratory querying of large KGs, by addressing these issues of "vocabulary" mismatch and KG incompleteness.
TriniT supports query relaxation rules that are invoked to allow for relevant answers which are not found otherwise.
The incompleteness issue is addressed by extending a KG with additional text-style token triples obtained by running Open IE on Web and text sources.
[Two research problems: (symbolic) vocabulary mismatch and KB incompleteness. Both are handled by relaxing queries with rules and by adding triples at query time from external text sources.]
1. MOTIVATION & INTRODUCTION
The power of this model comes from the flexibility with which new data can be added without the need for a schema to be defined upfront. ... The flexibility with which data can be added to a KG comes at a price: users who are not familiar with the detailed structure and node and edge labels of the KG often have a hard time formulating proper queries.
[If the schema were known and defined a priori, would it be easier to formulate queries? Would it?]
The vocabulary of the KG, though, is not rich enough to support all these subtleties, resulting in an empty answer.
[Would a thesaurus help? Do word embeddings solve this?]
The examples point out two major problems that users have in exploratory querying of KGs. First, the user is not familiar with the structure and vocabulary of the KG (users A, B, C). Second, the KG itself is incomplete, lacking specific knowledge required to satisfy a user's query or not supporting the query at all (users C, D). In some cases, multiple rounds of query reformulation may eventually lead to the desired answers, but users will likely view this as tedious.
[The two problems presented in the examples (nice). If the process is tedious, user satisfaction will tend to be low.]
2. FRAMEWORK
Extended Knowledge Graph (XKG) No KG will ever be complete. ... However, we can run Open Information Extraction tools (e.g. ReVerb/OLLIE [2]) on these sources and collect textual triples consisting of two noun phrases (for S and O) and a verbal phrase connecting them (for P).
[But here the KG is being extended at query time. It could instead be extended in engineering tasks: use the query history to extract information from text corpora and grow the KG.]
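To make the XKG idea concrete for myself, here is a minimal sketch in Python of how curated KG facts and Open IE token triples could sit side by side in one collection. It is not TriniT's storage layer; the `Triple` class, the `source` field, the `is_textual` helper and the toy facts are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One SPO triple plus where it came from (hypothetical layout)."""
    s: str
    p: str
    o: str
    source: str  # "KG" for curated facts, otherwise an id of the text source


# Toy data, invented for this sketch.
kg_triples = [Triple("AlbertEinstein", "bornIn", "Ulm", "KG")]
open_ie_triples = [
    # Two noun phrases (S and O) connected by a verbal phrase (P).
    Triple("Einstein", "lectured at", "Princeton University", "doc-17"),
]

# The extended knowledge graph is simply the union of both kinds of triples.
xkg = kg_triples + open_ie_triples


def is_textual(triple: Triple) -> bool:
    """True for triples that came from Open IE rather than the curated KG."""
    return triple.source != "KG"


textual_predicates = [t.p for t in xkg if is_textual(t)]
```

Keeping the source on every triple is what would later allow an answer explanation to show provenance for the text-derived part.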
Extended Triple Patterns
3. QUERY RELAXATION
A relaxation rule replaces a set of triple patterns in the original query with a set of new patterns. Each rule has a weight w ∈ [0, 1] that reflects the semantic similarity between the original set of triple patterns and their replacement, and is used to score answers and guide top-k query processing. The decision about which rules to invoke is adaptively made at run-time during top-k query processing.
[A query equivalent to the original has similarity 1.0. Predicate rewriting. The rules are mined or can be specified manually.]
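A relaxation rule as described above can be sketched as a small data structure plus a rewrite function. This is only an illustration under simplifying assumptions: the rule must use the same variable names as the query (no variable unification), and the `RelaxationRule` class, the `relax` function, the example predicates and the weight 0.8 are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Pattern = Tuple[str, str, str]  # (subject, predicate, object); "?x" marks a variable


@dataclass(frozen=True)
class RelaxationRule:
    original: Tuple[Pattern, ...]     # patterns to be replaced
    replacement: Tuple[Pattern, ...]  # patterns that take their place
    weight: float                     # similarity in [0, 1]; 1.0 means equivalent


def relax(query: List[Pattern], rule: RelaxationRule) -> Optional[Tuple[List[Pattern], float]]:
    """Apply the rule if all of its original patterns occur in the query.

    Returns the rewritten query together with the rule weight, or None
    when the rule does not apply.
    """
    if not all(p in query for p in rule.original):
        return None
    kept = [p for p in query if p not in rule.original]
    return kept + list(rule.replacement), rule.weight


# Hypothetical rule: relax a KG predicate into a text-style token predicate.
rule = RelaxationRule(
    original=(("?x", "affiliation", "?y"),),
    replacement=(("?x", "worked at", "?y"),),
    weight=0.8,
)
query = [("?x", "type", "physicist"), ("?x", "affiliation", "?y")]
relaxed = relax(query, rule)
```

The returned weight is what the scoring step would use to discount answers that were only reachable through the relaxed query.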
4. ANSWER SCORING
Computing relevance scores and providing the user with a list of ranked results is crucial for knowledge exploration and general usability. We use a query-likelihood approach for scoring answers, which is standard in IR [18]. We adapt and extend this approach for our triple-pattern setting [14]. A triple pattern is viewed as a document that emits triples with certain probabilities. The probability assigned to an SPO fact in response to a triple pattern is proportional to the frequency with which the fact is observed (a tf-like effect) and inversely proportional to the total number of matches for the triple pattern (an idf-like effect corresponding to selectivity). Additionally, the scoring model takes into consideration the weights associated with a relaxation rule,...
[A metric similar to TF-IDF from IR]
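The scoring description can be turned into a toy calculation. The sketch below assumes the simplest possible reading: the probability of a fact is its observation count divided by the total count of all matches for the pattern, multiplied by the weight of the relaxation rule that was used. The actual model in [14] may include smoothing or other terms not covered in the excerpt; `score_answers` and its parameters are hypothetical names.

```python
from typing import Dict


def score_answers(match_counts: Dict[str, int], rule_weight: float = 1.0) -> Dict[str, float]:
    """Score each matching fact for one triple pattern.

    match_counts maps a fact to how often it was observed (tf-like effect).
    Dividing by the total over all matches penalizes unselective patterns
    (idf-like effect). rule_weight is 1.0 when no relaxation was applied.
    """
    total = sum(match_counts.values())
    if total == 0:
        return {}
    return {fact: rule_weight * count / total for fact, count in match_counts.items()}


# Toy counts, invented for this sketch: fact "a" was observed 3 times, "b" once,
# and the pattern was reached through a relaxation rule of weight 0.5.
scores = score_answers({"a": 3, "b": 1}, rule_weight=0.5)
```

With these numbers, fact "a" gets 0.5 * 3 / 4 and fact "b" gets 0.5 * 1 / 4, so frequently observed facts rank higher and relaxed matches are discounted uniformly.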
5. DEMONSTRATION
The TriniT interface allows users to pose queries on the XKG, with a mix of traditional-SPARQL triple patterns and text-style token triple patterns. Answers can be explored in the browser, with links to the XKG. Users can define their own relaxation rules.
[It is neither keyword search nor an NL question; it is SPARQL and triple patterns]
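To picture how a single query can mix the two kinds of triple patterns, here is a rough sketch of a matcher. It assumes a convention of my own: terms starting with `?` are variables, terms wrapped in single quotes are text-style token phrases matched by token containment, and anything else is a KG resource matched exactly. The real TriniT syntax and matching semantics may differ; `matches_term` and `match_pattern` are hypothetical helpers.

```python
from typing import Dict, List, Tuple

TripleT = Tuple[str, str, str]


def matches_term(term: str, value: str, bindings: Dict[str, str]) -> bool:
    """Match one pattern term against one triple component."""
    if term.startswith("?"):
        # Variable: bind on first use, otherwise require the same value.
        if term not in bindings:
            bindings[term] = value
            return True
        return bindings[term] == value
    if len(term) >= 2 and term.startswith("'") and term.endswith("'"):
        # Token phrase: every query token must appear in the triple component.
        wanted = term[1:-1].lower().split()
        have = value.lower().split()
        return all(tok in have for tok in wanted)
    # KG resource: exact match only.
    return term == value


def match_pattern(pattern: TripleT, triples: List[TripleT]) -> List[Dict[str, str]]:
    """Return one variable binding per triple that matches the pattern."""
    results = []
    for triple in triples:
        bindings: Dict[str, str] = {}
        if all(matches_term(t, v, bindings) for t, v in zip(pattern, triple)):
            results.append(bindings)
    return results


# Toy XKG, invented for this sketch.
toy_xkg = [
    ("AlbertEinstein", "bornIn", "Ulm"),
    ("Einstein", "lectured at", "Princeton University"),
]
sparql_style = match_pattern(("AlbertEinstein", "bornIn", "?c"), toy_xkg)
token_style = match_pattern(("?x", "'lectured'", "?y"), toy_xkg)
```

The first call behaves like an ordinary SPARQL triple pattern, while the second only needs the token "lectured" to occur in the predicate phrase.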
In particular, we can handle queries that connect multiple entities by their relationships, potentially returning lists of entity pairs or entire tuples.
[It does not retrieve only triples]
Answer Explanation: Since the user's original query can be modified by relaxation rules to obtain more relevant answers, it is important for the user to understand how a specific answer was obtained. TriniT's answer explanation interface shows an answer's provenance.
The answer explanation provides three important pieces of information: (i) the KG triples that contributed to an answer, (ii) the XKG triples that contributed to an answer and their provenance, and (iii) the relaxation rules that were invoked to obtain an answer. In addition to showing the changes made to the user's query to obtain answers, answer explanation helps users better understand the schema of the underlying KG and its shortcomings. In the long term, this allows her to formulate queries better aligned with the KG.
Query Suggestion: Finally, TriniT suggests alternative formulations of the user's original query that are more suitable for the KG. This helps the user to learn more about the structure and node/edge labels of the underlying KG, making future queries easier to formulate.
[The user learns the KG schema as they interact with it]
This way, the user gradually gains a better understanding of the KG.
6. RELATED WORK
Exploratory querying of databases in general has recently received much attention [7]. ... None of these approaches address the pain points of formulating queries and of coping with incomplete data or knowledge bases.
[Dealing with incompleteness and with the user's unfamiliarity with the schema]
In fact, we plan to use it as back-end for our own work on QA [13].
[Develop in layers; here they solved the conversion/adjustment of the GQL]
[7] G. Koutrika, L. V. S. Lakshmanan, M. Riedewald, K. Stefanidis. Exploratory Search in Databases and the Web. EDBT/ICDT Workshop 2014
A huge number of applications need an exploratory form of querying. Ranked retrieval techniques for relational databases, XML, RDF and graph databases, text and multimedia databases, scientific and statistical databases, social networks and many others, is a first step towards this direction. Recently, several new aspects for exploratory search, such as preferences, diversity, novelty and surprise, are gaining increasing importance.
From a different perspective, recommendation applications tend to anticipate user needs by automatically suggesting the information which is most appropriate to the users and their current context.
Also, a new line of research in the area of exploratory search is fueled by the growth of online social interactions within social networks and web communities. Many useful facts about entities (e.g. people, locations, organizations, products) and their relationships can be found in a multitude of semi-structured and structured data sources such as Wikipedia, Linked Data cloud, Freebase, and many others.
Therefore, novel discovery methods are required to provide highly expressive discovery capabilities over large amounts of entity-relationship data, which are yet intuitive for end-users.
[14] M. Yahya, D. Barbosa, K. Berberich, Q. Wang, G. Weikum. Relationship Queries on Extended Knowledge Graphs. WSDM 2016: 605–614