
Hierarchical knowledge graphs: A novel information representation for exploratory search tasks - Paper Reading

Sarrafzadeh, B., Roegiest, A., & Lank, E. (2020). Hierarchical knowledge graphs: A novel information representation for exploratory search tasks. arXiv preprint arXiv:2005.01716. ACM Transactions on Information Systems, Vol. 4, No. TOIS, Article 1. Publication date: April 2020.


5 IMPACT OF INFORMATION EXTRACTION ERRORS ON HKGS

In this section, we evaluate the performance of HKGs in light of errors in information extraction. To understand why we wish to explore the impact of errors in information extraction, consider Figure 5. In typical web search, users formulate queries, inspect retrieved documents, and either view documents or, if they find that the returned documents are not exactly appropriate, reformulate queries to refine the set of documents retrieved. Because a user can directly examine the results of a query retrieval operation, the user can refine the search query to modify the retrieved documents as needed. However, when performing information extraction, one challenge that the user faces is a limited ability to influence the quality of extracted information. Even if the set of retrieved documents is correct, errors in information extraction propagate through the representation of the entity-relationship tuples.

[Errors introduced by IE tools affect the visualization hierarchy]

5.1 Experimental Design

To examine the impact that different levels of precision and recall have on exploratory search while using HKGs, we use two different information extraction outputs. One set represents the raw, uncorrected output of an IE algorithm; the second represents human-corrected output, used in the previous section to evaluate the potential of HKGs. We use these two outputs to populate our hierarchical knowledge graphs and leverage the interface that we designed (described in Section 3) to support interaction with these HKGs.  

[Run the tasks on two versions of the KG: one curated and one uncurated]

Characterizing Precision and Recall of Automatic vs Hand-Tuned IE.
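
As a rough illustration of what characterizing precision and recall could look like for entity-relationship tuples, here is a minimal sketch. It assumes the human-corrected tuples are treated as the reference set and that tuples are compared by exact match. Both are simplifying assumptions, and the function name `precision_recall` and the toy tuples are hypothetical, not taken from the paper.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) of extracted tuples against a reference set.

    Assumption: tuples match only when subject, relation, and object are
    all identical. A real benchmark would likely need fuzzier matching.
    """
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data standing in for the two IE outputs.
curated = {
    ("Lincoln", "signed", "Emancipation Proclamation"),
    ("Lincoln", "was president of", "United States"),
    ("Congress", "passes", "laws"),
    ("President", "can veto", "laws"),
}
automatic = {
    ("Lincoln", "signed", "Emancipation Proclamation"),
    ("Congress", "passes", "laws"),
    ("Lincoln", "passes", "laws"),  # spurious tuple: lowers precision
}

precision, recall = precision_recall(automatic, curated)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the spurious tuple lowers precision, while the two missing tuples lower recall. The missing tuples correspond to the omissions that the notes further down ask about.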

5.1.9 Hypotheses and Research Questions. Quantitative data allows us to test the following hypotheses:
• Automatically generated hierarchical knowledge graphs result in a lower performance (i.e. task outcomes - measured by essay qualities) than do manually curated HKGs.
• Automatically generated hierarchical knowledge graphs result in more document views and more time spent reading documents (i.e. proxies for effort) than do manually curated HKGs.

5.4 Discussion

The goal of this section was to explore the impact of error-prone information extraction on exploratory search tasks supported with HKGs. From our quantitative results, we note that Group, rather than Error condition or Task, resulted in significantly different performance on exploratory search tasks, as highlighted by the dependent measure Mark. To investigate this further, we looked more closely at any confounds within each group that could potentially impact the variance we see in search performance and behavior.

Summarizing these observations, while we observe no initial effect of error on performance, combining qualitative data with post-hoc statistical analysis, we find some evidence that precision and recall rates may impact one of our tasks, the History task, more than the Politics task. This potentially makes sense: because the History task is an investigate task with, as noted qualitatively by our participants, a set of answers that are targeted rather than open-ended, errors in precision and/or recall might result in concepts useful to the search task being omitted from the data set.

[The investigate task appears to have been more affected. Which other types of exploratory search tasks would also be affected?]

6 SYNTHESIS, IMPLICATIONS, AND FUTURE WORK

Inspired by our earlier work contrasting the support networks and hierarchies provide for exploratory search, this paper explores a novel data structure, hierarchical knowledge graphs. The goal of HKGs is to combine the complementary advantages of individual data structures. Specifically, we observed that hierarchies provide better sensemaking for searchers new to a topic area by structuring the information space; whereas networks contain greater information within the data structure, thus reducing the need to read documents to acquire information.

[The hierarchy made more of a difference for those who knew less about the subject]

Alongside this demonstration of the complementary nature of networks and hierarchies, in our second experiment, we demonstrate a disassociation between output and outcomes within our exploratory search system. While benchmarking demonstrates error-prone information extraction, our measures of user performance demonstrate only limited impacts, and then only qualitative, of errors in information extraction (the output) with respect to user performance (the outcomes). Obviously, there exist a number of domains where output accuracy is highly relevant: specific document lookup, legal document discovery, and research reference lookup are all examples of such. However, it is possible that other domains, such as exploratory search, may be more resilient to errors in output accuracy.

[Could errors also be omissions? KGs are incomplete. The domain could make a difference because users bring their own prior knowledge.]

6.1 Limitations and Future Work

... another obvious area of future work is to add additional exploratory search tasks from Marchionini’s taxonomy [41]. Our emergent qualitative results in our second experiment indicate that, within the broad category of exploratory search tasks, different types of exploratory search may be impacted differently by errors. Understanding where and how the current levels of precision for IE algorithms impact each of these different task types will help to clarify where and how useful variable IE accuracy is for different types of tasks at a more fine-grained level.

[Relationship between the remaining exploratory search tasks and the accuracy of IE algorithms]

... future work is to investigate other potential factors that may affect user performance in search tasks. For example, cognitive biases can result in irrational search behavior and influence searchers’ relevance judgment of information. As well, bounded rationality impacts the way information seekers optimize their information processing efforts even at the cost of achieving a sub-optimal outcome. Our qualitative data provides some evidence that biases might be salient: participants who approached the exploratory Politics task with an intention of finding evidence to support the power of the president they believed is more powerful limited browsing behaviors because participants felt ‘already informed’ on the topic. Given the difficulty in fully controlling for cognitive biases and ability, this final research question would require extensive studies to both identify factors and to model them in a tractable way to measure their effects.

[How to isolate a possible bias from what users believe they already know when performing an exploratory search: scout mindset vs. soldier mindset ... Beliefs]

7 CONCLUSION

In the field of information retrieval, alongside document retrieval, issues of representations that allow users to make sense of information and interfaces that allow users to interact with search results are important areas of inquiry. In this paper, we explore hierarchical knowledge graphs, an extension of knowledge graphs that leverages connectivity to generate hierarchies from the underlying knowledge graphs. Our mixed method experimental results argue that hierarchical knowledge graphs support the overview advantages of hierarchical representations, the information content advantages of knowledge graphs, and exhibit resilience to information extraction error rates common in contemporary information extraction algorithms.  

[Hierarchical knowledge graphs are an extension of KGs that leverages connectivity to generate hierarchies from the KG using vertex degree. They offer the overview advantages of hierarchical representations and the information content advantages of KGs at the detail level, and they exhibit resilience to the information extraction error rates common in contemporary information extraction algorithms.]
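
To make the idea of deriving a hierarchy from vertex degree more concrete, here is a minimal sketch. It is not the authors' algorithm. It assumes the KG is treated as undirected, the highest-degree entity becomes the root, and a breadth-first traversal attaches each remaining entity to the first already-placed neighbor that reaches it, visiting higher-degree neighbors first. The function name `build_hierarchy` and the toy tuples are hypothetical.

```python
from collections import defaultdict, deque


def build_hierarchy(triples):
    """Map each entity to its parent (None for a root) using vertex degree.

    Assumption: edges are treated as undirected and relation labels are
    ignored when computing degree.
    """
    neighbors = defaultdict(set)
    for subj, _relation, obj in triples:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)

    degree = {node: len(adjacent) for node, adjacent in neighbors.items()}

    def rank(node):
        # Higher degree first; ties broken alphabetically for determinism.
        return (-degree[node], node)

    parent = {}
    # Each disconnected component gets its own root: its top-ranked node.
    for root in sorted(degree, key=rank):
        if root in parent:
            continue
        parent[root] = None
        queue = deque([root])
        while queue:
            node = queue.popleft()
            for child in sorted(neighbors[node], key=rank):
                if child not in parent:
                    parent[child] = node
                    queue.append(child)
    return parent


# Hypothetical toy KG, not taken from the paper.
toy_triples = [
    ("President", "heads", "Executive Branch"),
    ("President", "can veto", "Bill"),
    ("President", "appoints", "Judge"),
    ("Congress", "passes", "Bill"),
    ("Congress", "confirms", "Judge"),
]

hierarchy = build_hierarchy(toy_triples)
for entity, parent_entity in hierarchy.items():
    print(entity, "<-", parent_entity)
```

Under these assumptions the best-connected entity sits at the top as the overview level, and less-connected entities appear deeper as detail. A tuple dropped by an extraction error can therefore change degrees and, through them, the shape of the hierarchy.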

      


 

