Exploring Knowledge Graphs for Exploratory Search

Exploring Knowledge Graphs for Exploratory Search - Paper Reading (2014)

Bahareh Sarrafzadeh, Olga Vechtomova, and Vlado Jokic. 2014. Exploring knowledge graphs for exploratory search. In Proceedings of the 5th Information Interaction in Context Symposium (IIiX '14). Association for Computing Machinery, New York, NY, USA, 135–144. https://doi.org/10.1145/2637002.2637019

ABSTRACT
In order to provide the user with more support in performing exploratory activities, recent research has been focused on identifying the types of tasks users perform, and understanding the nature of these tasks. However, most of the proposed models focus on either traditional document retrieval or the use of linked data for finding relevant information. We believe neither of these two types of information resources can offer sufficient support for complex search tasks on their own. We propose that a hybrid approach that combines the coherent content of text with the organized structure of graphs should be taken to better support information finding and sense making.

[Combining text with a KG would be better suited to exploration. A KG alone does not solve the problem.]

Currently, there is limited insight into the types of information seeking activities performed when a knowledge graph is combined with document retrieval to support exploratory search. This paper describes a general framework that provides the first step towards examining users’ exploratory search behaviour when interacting with knowledge graphs and their corresponding documents.

[Framework for Exploratory Search]

We conducted a user study that suggests searchers perform different information seeking activities for a complex search task compared with a simple search task. These findings provide insights that can be used to inform the design of a new search framework, which enables more effective information finding and analysis.

[Qualitative study: users, questionnaire, screen capture, annotation of search actions.]

1. INTRODUCTION

There is a growing realization in the IR community that the current paradigm of retrieving a ranked list of documents is inadequate in solving complex information needs. Examples of exploratory search tasks include: learning about a new domain (e.g., “astronomy 101”) or finding hidden connections between two events or concepts (e.g., “impacts of WWI on economy”). It can be argued that current search engines are generally sufficient when the need is well-defined in the searcher’s mind. However, when information is sought to address broad curiosities, for learning and other complex mental activities, retrieval is necessary but not sufficient [18].

[Context motivating the proposal: IR is not sufficient; it does not serve other information seeking activities]

In order to bridge the gap between what search engines currently offer and the support needed for more complex search activities, different extensions have been proposed (Section 2). These solutions focus on retrieving information as opposed to documents to address the user’s information need. A dominant technique towards automatic retrieval of information is Information Extraction. ... The outcome can be represented as a Knowledge Graph, that is, a network of some domain knowledge represented by labelled nodes and labelled links between them.

[Extracting information from a text corpus to generate a KG would be an alternative to search itself]

There have been some efforts (e.g., [7]) to utilize Linked Data to enable user-oriented exploratory search systems.

[LOD in exploratory search, see [7]]

The main goal of this paper is to develop a better understanding of how users search for relevant information using a new design based on Knowledge Graphs that are derived from text. We conducted a user study which is exploratory and observational in nature and provides the opportunity to document and analyze interesting interaction patterns (Section 4). We also identified frequent interaction patterns performed during an information seeking session (Section 4.4). Further investigation of the similarities and differences observed between simple and complex search tasks can be utilized to understand the reasons behind the lack of support from the current search engines for complex search tasks. Finally, we examined the obstacles and challenges faced by the participants during their exploration and propose future directions that can lead to better understanding of the requirements of a new search model that supports information seeking activities (Section 5).

[Qualitative study of Exploratory Search; it evaluated the type of user interaction with the system for each class of task]

2. RELATED WORK

... Information Extraction (IE), Question Answering (QA) and Summarization all generate a focused response to a user’s information need in the form of entities, sentences or text snippets. ... However, there is a growing realization that search in the real world is inherently interactive and the users thus have to be at the heart of the search process. ... Currently, researchers have developed numerous theoretical models of how people go about doing search tasks. The vast majority of these models represent information seeking as an interactive, evolving and learning behaviour.

[Snippets; also look at how they are used. Information Seeking.]

2.1 Interactive User Modeling

There is a body of work that focuses on observing users’ behaviour and identifying the challenges searchers face during their search session, common information seeking activities among them and gaining insight into how to support these activities.

[Studies on interfaces for Exploratory Search]

2.2 Utilizing Graphs for Search

The models most similar to our work are those which make use of entities and the relations between them to support search. Dimitrova et al. [7] designed a semantic data browser based on external Linked Data resources to support exploratory tasks.

[LOD ... KG only]

Yogev et al. [21] describe an extended faceted search solution that allows to index, search and browse rich Entity-Relationship (ER) data. The output of the search system is a ranked list of entities that are distributed over different facets. These facets can be used by the user to focus the search on a specific entity type or to explore another direction by navigating to another related entity in the ER graph. ...

[This should be really nice because the model will be preserved]

The “Knowledge Graph” enhances Google’s search in three main ways: query disambiguation, providing a summary of related facts to the user’s query, and exploratory search suggestions (based on what other users explored next).

[Google search is no longer only about strings]

However, we extract a broader set of entities and concepts and we identify semantic relations based on dependencies between them. These relations are not limited to a predefined set of predicates and provide context for understanding the connections between entities. (2) We generate graphs automatically using the documents collection retrieved for the user’s query. Our knowledge graphs thus are derived from the same information space that the searcher is interested to explore.

[At query time, or is there preprocessing? For example, generating each document's graph in batch and generating the connections between graphs online, according to the retrieved documents]

[In the experiment, the output of the OIE tool was processed and corrected.]

3. ENABLING A NEW SEARCH PARADIGM

We propose a new search framework that takes advantage of knowledge graphs to mitigate the problem of information overload by providing a semantic organization of the information space. We also argue that knowledge graphs cannot enable an effective framework for supporting complex search tasks if applied in isolation.

[Do not use KGs in isolation]

3.1 The Proposed Framework

Although Knowledge Graphs could be powerful tools to support navigation and learning for exploratory search, they cannot replace the document search and retrieval for searchers.

[Combine KGs with documents. The criticism against using KGs in isolation is strong and concerns the loss of the logical connection between statements once they are removed from the text. Would this be a loss of context? Or a loss caused by using OIE to build the KG?]

When we extract information from text and restructure it as a knowledge graph to visualize semantic relations between concepts, we lose discourse relations (i.e., information on how two segments of discourse are logically connected to one another) which are crucial for comprehension and inference from a text.

[It would be like removing a sentence from its original context / paragraph]

... we identified different types of edges and connections:
- Connecting each document to its corresponding graph: In each document, only the sentences containing an extracted triple (entity1, relation, entity2) are linked to their corresponding part of the graph and vice versa. ...
- Connecting the graphs: Documents fetched for the user’s query may discuss different aspects of the same topic or provide different perspectives. Therefore, the corresponding graphs are not independent of one another. The same domain terms and entities appear in these graphs and they have to either be represented as a single node in the aggregated graph, or mapped through a set of inter-graph links to preserve these connections.

[Annotate the documents with the triples and interlink each document's subgraph]
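
The two kinds of connections described above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' implementation: the function name `build_mappings`, the input layout, the lowercase normalization used to decide that two mentions are the same entity, and the toy triples are all assumptions made for this note.

```python
from collections import defaultdict


def build_mappings(doc_triples):
    """Link sentences to triples and merge per-document graphs.

    doc_triples maps a document id to a list of
    (sentence_index, (entity1, relation, entity2)) pairs.
    """
    sentence_to_triples = defaultdict(list)  # (doc_id, sentence_index) -> triples
    triple_to_sentences = defaultdict(list)  # triple -> [(doc_id, sentence_index)]
    merged_edges = defaultdict(set)          # (entity1, entity2) -> relation labels
    entity_docs = defaultdict(set)           # entity -> documents mentioning it

    for doc_id, items in doc_triples.items():
        for sentence_index, (entity1, relation, entity2) in items:
            # Assumption: two mentions denote the same node when their
            # lowercased, stripped labels match. Real entity resolution
            # would need something stronger.
            node1 = entity1.strip().lower()
            node2 = entity2.strip().lower()
            triple = (node1, relation, node2)

            # Document <-> graph mapping, in both directions.
            sentence_to_triples[(doc_id, sentence_index)].append(triple)
            triple_to_sentences[triple].append((doc_id, sentence_index))

            # Aggregated graph: one node per normalized entity.
            merged_edges[(node1, node2)].add(relation)
            entity_docs[node1].add(doc_id)
            entity_docs[node2].add(doc_id)

    # Entities that connect the graphs of different documents.
    shared_entities = {e for e, docs in entity_docs.items() if len(docs) > 1}
    return sentence_to_triples, triple_to_sentences, merged_edges, shared_entities


# Toy input written for this note (not data from the paper).
example_docs = {
    "d1": [(0, ("Napoleon", "invaded", "Russia"))],
    "d2": [
        (3, ("Treaty of Tilsit", "signed by", "Russia")),
        (4, ("Treaty of Tilsit", "signed by", "France")),
    ],
}

s2t, t2s, edges, shared = build_mappings(example_docs)
print(shared)  # entities appearing in more than one document
```

Representing a shared entity as a single node is only one of the two options the paper mentions; the alternative of keeping separate graphs joined by inter-graph links would store `shared_entities` as explicit links instead of merging.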

3.1.1 A Sample Search Scenario

Consider the following scenario: while reading an article about “Napoleon’s invasion of Russia”, the user comes across the entity “Treaty of Tilsit”. By traversing to this node in the graph, he analyzes a set of related facts, which includes:
- Which countries first signed this treaty? Which countries followed later?
- When was this initially signed and why?
- What were the terms of this agreement?

[Other questions may arise during the search, and answering them also involves the user's learning. The example deals with a topic or entity (the treaty), and the questions could come from a Q&A]

3.2 Evaluating the New Framework

We formulated a list of research questions to investigate our hypothesis ... In order to find answers to these questions we designed a user study in three steps: (1) Extracting Knowledge Graphs from Text, (2) Mapping Graphs and Documents and (3) Employing search tasks with different levels of complexity.

[Evaluation steps; they included building the graph, correcting the graph, and annotating the documents before the search itself]

3.2.1 Generating Knowledge Graphs and Mappings

3.2.2 Simple and Complex Search Tasks

People’s day-to-day search activities can vary greatly in their motivations, objectives, and outcomes. These search activities can be broadly classified into two groups: “Simple” and “Complex”. Simple search tasks are similar to “known-item” search tasks and usually involve looking up some discrete, well-structured information object: for example numbers, names and facts [12]. Complex search tasks, on the other hand, are seen to be more exploratory and involve investigating, learning and synthesis of information [19].

[Lookup can be part of an Exploratory Search process, but it is not enough on its own]

There are two activities which mediate the exploration process: information foraging theory [13] describes how searchers collect relevant pieces of information; sense making [6] describes the process through which people assimilate new knowledge into their existing understanding.

[13] P. Pirolli and S. Card. Information foraging. Psychological review, 106(4):643, 1999.

What Is Information Foraging?

Information foraging is the fundamental theory of how people navigate on the web to satisfy an information need. It essentially says that, when users have a certain information goal, they assess the information that they can extract from any candidate source of information relative to the cost involved in extracting that information and choose one or several candidate sources so that they maximize the ratio:

Rate of gain = Information value / Cost associated with obtaining that information

In other words, if people have a question, they will decide which webpage to go to based on (1) how likely it is that the page will provide an answer to their question, and (2) how long it’s going to take to get the answer if they go to that page.
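
The ratio above can be turned into a tiny worked example. This is a sketch under stated assumptions: the `Source` class, the numeric scale for information value, and the use of seconds for cost are all illustrative choices, not part of the theory as quoted.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    information_value: float  # assumed: estimated usefulness on an arbitrary scale
    cost: float               # assumed: expected seconds needed to get the answer


def rate_of_gain(source):
    """Rate of gain = information value / cost of obtaining it."""
    if source.cost <= 0:
        raise ValueError("cost must be positive")
    return source.information_value / source.cost


def best_source(sources):
    """Pick the candidate that maximizes the rate of gain."""
    return max(sources, key=rate_of_gain)


# Hypothetical candidates: a long, rich page versus a short, focused one.
candidates = [
    Source("long survey page", information_value=8.0, cost=40.0),
    Source("short FAQ page", information_value=5.0, cost=10.0),
]

chosen = best_source(candidates)
print(chosen.name, rate_of_gain(chosen))  # short FAQ page 0.5
```

The long page carries more information in total, but the short page wins because its value per unit of cost is higher (0.5 versus 0.2).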

[6] B. Dervin. Sense-making theory and practice: an overview of user interests in knowledge seeking and use. Journal of knowledge management, 2(2):36–46, 1998.

Sense making is the integration of reasons into an argument for understanding and believing what something means in answer to a question.

Laursen's draft theory of sense making in formal inquiry settings (Copyright: Bethany Laursen)

[Here the argument (thinking) can be formed from Contextualized Statements retrieved in an Exploratory Search]

4. DESIGNING THE EXPERIMENTS

We designed a within-subject study in which each participant needed to complete two search tasks using the same interface. We conducted our experiments within the framework provided by the TREC 2007 Question Answering track.

4.1 Construction of Results Lists

We created a list of 10 documents for each search task. In order to investigate the contribution of the graphs in identifying “relevant” documents we used different procedures to construct “artificial” result lists. For the “complex” task we were interested to see how many “nuggets” (i.e., a piece of text containing relevant information) would be found by the searchers. Therefore, we retrieved all documents related to the given topic and ordered them based on the number of nuggets they contained (as identified by NIST assessors).
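
The construction of the result list for the complex task can be sketched as a simple sort. The descending order, the tie-break on document id, and the function name `build_result_list` are assumptions for illustration; the excerpt only says the documents were ordered by their number of nuggets.

```python
def build_result_list(nugget_counts, size=10):
    """Order documents by nugget count and keep the first `size`.

    nugget_counts maps a document id to the number of nuggets the
    assessors found in it. Assumption: more nuggets ranks higher,
    and ties are broken by document id to keep the order stable.
    """
    ranked = sorted(nugget_counts.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked[:size]]


# Hypothetical counts, not taken from the TREC data.
counts = {"doc_a": 2, "doc_b": 5, "doc_c": 0, "doc_d": 5}
top3 = build_result_list(counts, size=3)
print(top3)  # ['doc_b', 'doc_d', 'doc_a']
```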

4.2 Experiment Setup

We recruited 20 (6 female) participants from a very diverse pool for this study, all of whom use the Internet on a regular basis to search for information. They ranged from students to senior engineers and faculty members. Their study area covered many major areas of Computer Science, Architecture, Chemistry, Biology and Physics. The participants were given 10 minutes to complete each task. They also needed to complete two post-task questionnaires that would assess their familiarity with the topic and the experience they just had.

[Qualitative and quantitative evaluation]

4.3 Search Interface

4.4 Results and Observations

While finding relevant information is the goal of all information seeking systems, we cannot successfully design an effective approach unless we learn about searcher’s information seeking patterns. By testing the designed interface we were mainly interested to learn how people go about finding information.

We defined a set of actions by observing the activities performed by different participants over the course of their interaction with the system. Table 2 lists the more prominent actions. For each participant sequences of actions (one per search task) were generated by using the logs of screen videos and observing the users’ interaction with the system during the experiment.

[Annotate the user's actions and the navigation in the graph and in the document set]
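
Once each session is coded as a sequence of actions, frequent interaction patterns can be counted as contiguous sub-sequences. The sketch below is one simple way to do that, not necessarily the analysis used in the paper; the action labels are hypothetical, since Table 2 is not reproduced in these notes.

```python
from collections import Counter


def frequent_patterns(sequences, n=2, min_count=2):
    """Count contiguous runs of n actions across all sessions.

    Returns (pattern, count) pairs with count >= min_count,
    most frequent first.
    """
    counts = Counter()
    for sequence in sequences:
        for start in range(len(sequence) - n + 1):
            counts[tuple(sequence[start:start + n])] += 1
    return [(p, c) for p, c in counts.most_common() if c >= min_count]


# Hypothetical coded sessions (one list per participant and task).
sessions = [
    ["click_node", "read_text", "click_edge", "read_text"],
    ["click_node", "read_text", "found_answer"],
]

patterns = frequent_patterns(sessions, n=2, min_count=2)
print(patterns)  # [(('click_node', 'read_text'), 2)]
```

Running the same function separately on the simple-task and complex-task sessions would give a rough way to compare the two sets of patterns.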

4.4.1 General Characteristics of Interaction Patterns in Simple and Complex Tasks

4.4.2 Switching between Graphs and Documents

4.4.3 Click Patterns: Nodes vs. Edges

This user study revealed that different searchers take advantage of the provided graph in a variety of ways: while some participants found the edges more effective to locate relevant pieces of information, others made use of the mappings between nodes and text to find important terms more quickly in text.

[Visualize the result as a graph]

4.4.4 Starting the Exploration

We also identified the common activities performed by the group who start their exploration from the Graph side (SfG) as compared with the group who start from the Document side (SfT).

In fact, “query nodes” was identified as the main starting point for exploring the graph. One should note that since the participants did not submit a query to the system, we refer to the main entities in the task description as “query nodes”.

4.4.5 Common Patterns for Finding Answers

One of the most interesting outcomes of analyzing the interaction patterns was to identify the ones that led to locating an “answer”. While for the simple task an answer is a known factoid (mostly an entity), for the Complex task an answer is a snippet of text that contains some evidence or support for a given statement.

[Text snippets for complex search tasks]

Discussion.

The more complex structure of the diagram in Figure 3 indicates that searchers take a more diverse set of paths to an answer. This observation can justify why the complex search tasks are not well supported by the current search engines. In fact, different searchers exhibit different information seeking behaviour in order to locate the relevant pieces of information in retrieved documents. A better understanding of the common interaction patterns can help the search engines to identify and facilitate these activities.

The second observation was also stated by the participants in the provided questionnaires. They found the graphs are more helpful for the simple task as they had a better idea of what they were looking for.

[KGs alone would not satisfy more complex information needs. Or were the documents accessed only because they were available?]

The third observation is intuitive. Since the answers for the simple task are entities, nodes should be more helpful to locate the factoid information in text. On the other hand, relevant evidence supporting the “position of California w.r.t Stemcell research” is expressed by sentences / text snippets and more context is required to judge and identify these answers by the searchers.

[Text snippets from the documents were used more to reach the answer when the task was more complex]

4.4.6 Summary of Findings

1 (c-d). Overall, the participants started their exploration from the graphs more than the documents. However, this trend was significantly stronger for the simple task.

2 (c-d-e). Overall, the graphs provided more support for the simple task as compared with the complex task. Also, nodes proved more useful in locating an answer for the simple task, while edges appeared more frequently in the paths that led to an answer during the complex task.

[Complex searches benefit from the interconnections between the graph's entities]

5. CONCLUSION AND FUTURE WORK

We conclude that utilizing graphs of concepts and relationships, which are derived from documents, can be effective for finding relevant information when the information need is well defined. Our findings also demonstrate that providing meaningful relations that explain how different entities of a domain are connected is crucial for supporting more complex search tasks.

Ranking and Suggestion Generation.

We identified a major barrier to effective application of automatically generated knowledge graphs to complex search scenarios. As noted by many searchers, for the larger graphs, it was not clear where to start and where to go next in the graph.

Since the users of exploratory search systems are usually engaged in complex search scenarios it is easy for them to get lost or frustrated in the middle of a search session and just abandon their exploration. It is also very difficult for them to keep track of what they have browsed so far and what is there to explore further.

[Ranking helps with the starting point for exploring the graph. Does the best answer need some ranking or not?]

Finally, as we monitored the searchers finding relevant information about a topic they were not very familiar with (e.g., “Stemcell research”), we realized they were making use of the graphs to learn basic facts (e.g., “Stemcells are undifferentiated biological cells”) about the salient entities or the query terms.

[Learning about the information need happens during the search]

Connecting the Documents.

One of the main challenges for conducting an effective exploratory search is to fight the information overload.

As identified by many participants, the poor visibility of labels for the large graphs was the main barrier for utilizing the graphs for exploration and information finding.

[It is important to obtain a subgraph that can be assimilated at each iteration. How to limit its size?]
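
One possible way to keep the displayed subgraph small, offered here only as a sketch and not as something the paper proposes, is a breadth-first expansion from the current node that stops at a hop limit or a node budget. The function name, the adjacency-dictionary layout, and the default limits are assumptions.

```python
from collections import deque


def bounded_neighborhood(adjacency, start, max_nodes=15, max_hops=2):
    """Collect nodes around `start`, capped by hop distance and node count.

    adjacency maps a node to an iterable of neighbouring nodes.
    Neighbours are visited in sorted order so the result is deterministic.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(seen) < max_nodes:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
                if len(seen) >= max_nodes:
                    break  # node budget reached
    return seen


# Hypothetical toy graph.
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": ["f"]}

print(sorted(bounded_neighborhood(toy_graph, "a", max_nodes=4, max_hops=2)))
# ['a', 'b', 'c', 'd']
```

Choosing which neighbours to keep when the budget is tight is exactly where a ranking of nodes or edges would come in; the sorted order used here is just a placeholder for such a ranking.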

[7] V. Dimitrova, L. Lau, D. Thakker, F. Yang-Turner, and D. Despotakis. Exploring exploratory search: a user study with linked semantic data. In Proc. of the 2nd IESD, page 2. ACM, 2013.
