
Exploratory Search: Beyond the Query-Response Paradigm - Book Reading II

Exploratory Search: Beyond the Query-Response Paradigm
Ryen W. White and Resa A. Roth
Synthesis Lectures on Information Concepts, Retrieval, and Services, 2009, Vol. 1, No. 1, Pages 1-98
(https://doi.org/10.2200/S00174ED1V01Y200901ICR003) 

Related Work

Bates (2004) suggests that browsing is a cognitive and behavioral expression of exploratory behavior and claims that it has four elements: (1) glimpse a scene; (2) target an element of a scene visually and/or physically (if two or more elements are of interest, they are examined serially, not in parallel); (3) examine item(s) of interest; and (4) physically or conceptually acquire or abandon examined item(s). This sequence is repeated indefinitely as people explore in satisfaction of their curiosity. To this end, exploratory search systems should offer collection overviews (glimpses), the ability to traverse trails through the collection (exploratory browsing), and document examination/retention.

Bates (1989) developed the berrypicking approach to information-seeking behavior. The term “berrypicking” is an analogy to picking berries in a forest; berries are scattered on bushes, not in bunches. People must pick the berries singly. In a similar way to information foraging and wayfinding (Lynch, 1960), the approach views the searcher as moving through an information space, gathering fragments of information as they move and seeking cues that aid navigation decisions. However, the emphasis in berrypicking is on the dynamism of needs during the search, rather than the act of searching (foraging) itself.

[Change in the needs, in what is being sought, and not only in the act of searching]
[Revisit the thesis by Daniel's student; this concept is there]

Berrypicking states that new information encountered gives the searcher new ideas and directions to follow and, consequently, a new conception of the query. At each stage of the search process, searchers are not just modifying the search terms but the query itself. Bates described the approach as an “evolving search”; as the search progresses, the desired outcome may also change.

At each stage of the search, with each different conception of the query, the user may identify useful information and references. The query is not satisfied by a single final retrieved set, but by a series of selections of individual references and fragments of information at each stage of the ever-modifying search. Searchers’ understanding of their information need is enhanced as they encounter additional information during a search.

Sense-making is the creation of situational awareness and understanding in situations of high complexity or uncertainty in order to make decisions. It is “a motivated, continuous effort to understand connections (which can be among people, places, and events) in order to anticipate their trajectories and act effectively” ... Sense-making generally involves the following steps identified through observation and cognitive task analysis: (1) knowledge gap recognition; (2) generation of an initial structure or model of the knowledge needed to complete the task—concepts, relationships, and hypotheses; (3) search for information; (4) analysis and synthesis of information to create insight and understanding; and (5) creation of a knowledge product or direct action based on the insight or understanding.

In the model, users who have a particular task and situation encounter a trouble spot or a “gap” that impedes their progress. The user, or human actor, must overcome the gap by finding help or making sense of the current situation in order to attain their desired outcome.

In a similar way to sense-making, the anomalous state of knowledge (ASK) hypothesis states that there is a gap between what one knows and what they would like to know, and the need to fill the gap is what drives one to seek and retrieve information.

[The individual's knowledge gap]

Exploratory searchers are constantly engaged in sense-making activities as they move through the information space. These movements are interrupted when a gap is encountered that requires information to be bridged. Sense-making is an individual process of construction, not a process of utilizing existing information. Exploratory searches typically involve a prolonged engagement in which individuals iteratively look up and learn new concepts and facts. The knowledge acquisition causes the searcher to dynamically change and refine their information goals, and to ask more informed questions that probe deeper into the problem and the information space. Exploratory search can be viewed as a subcomponent of sense-making.

Exploratory search is a class of information seeking.


Wilson’s model shows that information-seeking behavior influences the person in context and their informational needs, which is also true of exploratory search. In other words, as a person searches, they may decide to investigate information other than that which they were initially seeking. In exploratory search, the search process profoundly influences users’ task perceptions. Wilson’s model illustrates that intervening variables (e.g., cognitive abilities, demographics, task and related environmental constraints) affect information-seeking behavior, which is true of exploratory search to a greater extent, given the focus on intelligence amplification.

In addition to Wilson’s model of IIR, Ingwersen (1992) contributed to the understanding of IR interaction with his own conceptual model. Ingwersen’s model incorporates the context, or the socio-organizational environment, of the information seeker. To further elaborate, context includes the scientific or professional domains with information preferences, and the strategies and work tasks that shape the user’s awareness.

[Context of the task or of the individual performing the search; here these are the preconditions or constraints]

It is important when establishing exploratory search as a viable subdiscipline of information seeking that we position it relative to existing disciplines such as IR, information visualization, information foraging, and sense-making.


Features of Exploratory Search Systems


The development of new search tools requires novel research and collaborative efforts among computer scientists, social scientists, psychologists, library and information scientists, and practitioners who may lead the way with novel search applications on the Web. The provision of tools to support the exploration of such information spaces can yield great rewards for users, especially when contextual factors such as user emotion, task constraints, and dynamism of information needs are considered.

... a set of features that must be present in systems that support exploratory search activities.

1. Support querying and rapid query refinement: Systems must help users formulate queries and adjust queries and views on search results in real time.

2. Offer facets and metadata-based result filtering: Systems must allow users to explore and filter results through the selection of facets and document metadata.

3. Leverage search context: Systems must leverage available information about their user, their situation, and the current exploratory search task.

[Easier in specialized systems such as Quem@PUC or busc@NIMA, but users may use the systems to search for information for other purposes, and there is no way to control that]

4. Offer visualizations to support insight and decision making: Systems must present customizable visual representations of the collection being explored to support hypothesis generation and trend spotting.

[The WDQS visualization interface has a timeline, map, charts, etc ... ]

5. Support learning and understanding: Systems must help users acquire both knowledge and skills by presenting information in ways amenable to learning given the user’s current knowledge/skill level.

[Know the user's profile; keep a record of users?]

6. Facilitate collaboration: Systems must facilitate synchronous and asynchronous collaboration between users in support of task division and knowledge sharing.

[The search log is a form of collaboration between those who search and those who maintain the KG]
[Explicit feedback from the searching user could be another form, but the user who is learning can judge the relevance of the answers]

7. Offer histories, workspaces, and progress updates: Systems must allow users to backtrack quickly, store and manipulate useful information fragments, and provide updates of progress toward an information goal.

[Would a shopping cart analogy be possible?]

8. Support task management: Systems must allow users to store, retrieve, and share search tasks in support of multisession and multiuser exploratory search scenarios.

Information needs can be expressed in the form of keyword queries or as natural language statements such as fully formed questions.

Queries are used by the ESS to retrieve a set of information objects (documents, Web pages, fragments) presented to users in descending order of relevance. Support for queries and their subsequent refinement allows users to navigate to potentially relevant parts of the information space.

[What would relevance mean in an ESS?]

However, user-defined queries may be based on users’ existing knowledge and create limited opportunity for exploratory search. The presentation of query suggestions may help users select additional query terms.

Techniques such as relevance feedback (RF) in the IR community and query-by-example in the database community can help users choose additional query terms. Users provide the system with examples of relevant documents and, in turn, the system presents a set of related queries or documents.
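A minimal sketch of how example documents can be turned into suggested query terms. The book does not name a specific algorithm here; a Rocchio-style weighting is swapped in for illustration, and the function name `expand_query`, the weights `alpha` and `beta`, and the whitespace tokenizer are all assumptions.

```python
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()

def expand_query(query, relevant_docs, alpha=1.0, beta=0.75, top_k=3):
    """Rocchio-style expansion: q' = alpha * q + beta * centroid(relevant docs).

    Returns the top_k highest-weighted terms that are not already in the query.
    """
    weights = Counter()
    for term in tokenize(query):
        weights[term] += alpha
    for doc in relevant_docs:
        for term, count in Counter(tokenize(doc)).items():
            weights[term] += beta * count / len(relevant_docs)
    original = set(tokenize(query))
    candidates = [(t, w) for t, w in weights.items() if t not in original]
    # Sort by weight (descending), then alphabetically for a stable order.
    candidates.sort(key=lambda tw: (-tw[1], tw[0]))
    return [t for t, _ in candidates[:top_k]]

# Toy feedback: the user marked two (made-up) documents as relevant.
suggested = expand_query(
    "hubble telescope",
    ["hubble telescope mirror repair mission", "telescope mirror flaw"],
    top_k=1,
)
print(suggested)
```

Terms that recur across the marked documents rise to the top, which is why the suggestion reflects what the examples have in common rather than what the user already typed.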

[I have a related work that uses QBE. Old techniques. Vague Queries could be added.]

Large Web search engine companies can use their historical query log data to find queries commonly issued by other users immediately following the current query, and offer them as query suggestions to other Web searchers ...
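As a rough illustration of log-based suggestions, the sketch below counts which query most often comes immediately after another one within a session. The session data is invented, and the helper names `build_followups` and `suggest` are hypothetical; real search engines use far richer signals.

```python
from collections import Counter, defaultdict

def build_followups(sessions):
    """Count, for each query, which queries immediately follow it in a session."""
    followups = defaultdict(Counter)
    for queries in sessions:
        for current, nxt in zip(queries, queries[1:]):
            if current != nxt:
                followups[current][nxt] += 1
    return followups

def suggest(followups, query, k=2):
    """Return up to k most frequent follow-up queries, or [] for unseen queries."""
    if query not in followups:
        return []
    return [q for q, _ in followups[query].most_common(k)]

# Toy log: each inner list is one user's session, in order (made-up data).
sessions = [
    ["hubble telescope", "hubble telescope pictures"],
    ["hubble telescope", "hubble telescope pictures", "nebula wallpaper"],
    ["hubble telescope", "hubble telescope launch date"],
]
followups = build_followups(sessions)
print(suggest(followups, "hubble telescope", k=1))
print(suggest(followups, "rarely issued query"))
```

A query that never appears in the log gets no suggestions at all, since the approach depends entirely on other users having issued it before.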

[But this does not cover long-tail cases]

Such suggestions are generally of most use for narrowing a search to target a particular subtopic (e.g., from [Hubble telescope] to [Hubble telescope pictures]) rather than supporting exploration (which may lead users to learn more about the telescope, for example). The reason for narrow suggestions is that most searches on the Web are for known items or are revisitations to previously encountered Web pages.

Dynamic queries allow users to see an overview of the database, rapidly explore and conveniently filter out unwanted information. Users move through information spaces by incrementally adjusting a query (with sliders, buttons, and other filters) while continuously viewing changing results.
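The idea can be sketched as a filter that is re-run on every control adjustment, so the visible result set always matches the current slider positions. The item fields (`year`, `pages`), the data, and the function name `dynamic_filter` are assumptions made for the example.

```python
def dynamic_filter(items, year_range, max_pages):
    """Return the items that match the current slider positions."""
    low, high = year_range
    return [
        item for item in items
        if low <= item["year"] <= high and item["pages"] <= max_pages
    ]

# Made-up collection with two numeric attributes a UI could expose as sliders.
items = [
    {"title": "A", "year": 1989, "pages": 18},
    {"title": "B", "year": 2004, "pages": 40},
    {"title": "C", "year": 2009, "pages": 98},
]

# Each tuple simulates one incremental adjustment made by the user.
adjustments = [((1980, 2010), 100), ((2000, 2010), 100), ((2000, 2010), 50)]
for year_range, max_pages in adjustments:
    titles = [i["title"] for i in dynamic_filter(items, year_range, max_pages)]
    print(year_range, max_pages, "->", titles)
```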

[KG summarization appears in another related work that discusses hierarchical KGs and focuses on the interface]

Information seekers often express a desire for interfaces that organize search results into meaningful groups, in order to help make sense of the results and decide on actions.

[Group the statements by context]

Clustering refers to the grouping of items according to some measure of similarity. In document clustering, similarity is commonly computed using associations and commonalities among features, where features are typically words and phrases.
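A small sketch of similarity-based grouping, assuming bag-of-words features, cosine similarity, and a greedy single-pass strategy. These are illustrative choices rather than anything the book prescribes, and the threshold value and sample texts are made up.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (words are the features here)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster(docs, threshold=0.4):
    """Greedy single-pass clustering: a document joins the first cluster
    whose seed (first member) is similar enough, else it starts a new one."""
    vectors = [vectorize(d) for d in docs]
    clusters = []  # lists of document indices; index 0 of each list is the seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

docs = [
    "jaguar car dealer",
    "jaguar big cat habitat",
    "used jaguar car prices",
    "big cat habitat loss",
]
print(cluster(docs))
```

With these texts the two senses of "jaguar" end up in separate groups because the shared word alone does not reach the similarity threshold.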

Faceted categories are a set of meaningful labels organized in such a way as to reflect the concepts relevant to a domain (Hearst, 2006). They are usually created manually, although assignment of documents to categories can be automated to a certain degree of accuracy.
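To make facet-based filtering concrete, here is a minimal sketch in which each document carries manually assigned facet values, and the interface shows per-value counts and narrows the results as values are selected. The records, the facet names (`topic`, `format`, `decade`), and the helper names are assumptions for illustration.

```python
from collections import Counter

def facet_counts(docs, facet):
    """Count how many documents carry each value of the given facet."""
    return Counter(doc[facet] for doc in docs if facet in doc)

def apply_facets(docs, selections):
    """Keep only the documents that match every selected facet value."""
    return [d for d in docs if all(d.get(f) == v for f, v in selections.items())]

# Made-up records with manually assigned facet labels.
docs = [
    {"id": "d1", "topic": "search", "format": "pdf", "decade": "2000s"},
    {"id": "d2", "topic": "search", "format": "html", "decade": "2000s"},
    {"id": "d3", "topic": "visualization", "format": "pdf", "decade": "1990s"},
]

print(facet_counts(docs, "format"))            # counts shown next to each facet value
narrowed = apply_facets(docs, {"topic": "search"})
print([d["id"] for d in narrowed])             # results after selecting one facet value
print(facet_counts(narrowed, "format"))        # counts update for the narrowed set
```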

[Thematic context for the statement's domain, and accuracy context for the association of that statement with the domain. Here it would be the context of the context: nesting!!!]

The Relation Browser (RB) is a general-purpose search interface that can be applied to a variety of data sets (Marchionini and Brunk, 2003). The RB aims to facilitate exploration of the relationships between (among) different data facets, display alternative partitions of the database with mouse actions, and serve as an alternative to existing search and navigation tools. RB provides searchers with a small number of facets such as topic, time, space, or data format. Each of the facets is limited to a small number of attributes that will fit on the screen. Simple mouse-brushing capabilities allow users to explore relationships among the facets and attributes, and dynamically update results as brushing continues.

[Relationships between parts of the data. Facets such as topic, space, time, or data format would be the contextual dimensions]

Facets and keyword searching allow users to easily move between searching and browsing strategies.

Similar to many successful faceted search interfaces, Phlat combines keyword search and metadata browsing in a seamless manner, allowing people to quickly and flexibly find their own content based on desired result properties. In addition, Phlat provides a facility for tagging items with a uniform system of user-created metadata.

[What could the searching user add to the KG?]

... Web search results lack personal context, making rank the only reasonable alternative for ordering results.

[A specialized search tool could use an ordering that is related to this context]

LEVERAGE SEARCH CONTEXT

Tools to support result retrieval using contextual information are valuable because information needs during exploratory searches are ill-defined. Context can be used directly during search and retrieval for tasks such as: (1) query disambiguation (e.g., a query for “jaguar” may mean the car manufacturer or a species of animal; Glover et al., 1999); (2) query expansion based on analysis of the top-ranked documents (Xu and Croft, 2000); (3) result ranking in link analysis algorithms using anchor text; or (4) document selection support through query-biased summarization (Tombros and Sanderson, 1998).
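As one illustration of item (4), the sketch below builds a query-biased summary by keeping the sentences that share the most terms with the query. This simple overlap score is an assumption chosen for brevity and is not the method of Tombros and Sanderson; the function name `query_biased_summary` is hypothetical.

```python
import re

def query_biased_summary(document, query, max_sentences=1):
    """Pick the sentences sharing the most terms with the query, in original order."""
    query_terms = set(query.lower().split())
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    scored = []
    for position, sentence in enumerate(sentences):
        words = set(re.findall(r"\w+", sentence.lower()))
        scored.append((len(words & query_terms), position, sentence))
    # Highest overlap first; earlier sentences win ties.
    best = sorted(scored, key=lambda s: (-s[0], s[1]))[:max_sentences]
    # Restore document order for readability.
    return " ".join(s for _, _, s in sorted(best, key=lambda s: s[1]))

document = (
    "The jaguar is a large cat native to the Americas. "
    "Jaguar is also a British car manufacturer. "
    "Both share the same name."
)
print(query_biased_summary(document, "jaguar car"))
print(query_biased_summary(document, "jaguar cat"))
```

The same document yields a different snippet for each query, which is what helps a searcher decide whether a result matches the sense they intended.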

Context can be captured explicitly by asking searchers to mark useful queries or search results over time to build (e.g., Bharat, 2000) or to indicate useful text fragments (e.g., Finkelstein et al., 2001), or by implicitly mining contextual information from users’ interaction behavior (e.g., Dumais et al., 2004; Kelly and Belkin, 2004; Shen et al., 2005). The selection of a domain-specific search engine rather than a general-purpose search engine also provides valuable implicit contextual information (Lawrence, 2000).

Watson (Budzik and Hammond, 2000) uses contextual information in the form of text in the active document and proactively retrieves documents from distributed information repositories via a new query.

OFFER VISUALIZATIONS TO SUPPORT INSIGHT/DECISION MAKING

Exploratory search systems must provide overviews of the searched collection and large-scale result sets to allow information to be visualized and manipulated in a variety of ways. Information visualization and the use of graphical techniques help people understand and analyze the data, and they are important during hypothesis generation.

Information visualization tools provide users with the ability to explore a range of data dimensions seamlessly. These capabilities of information visualization, combined with computational data analysis, can be applied to analytic reasoning to support the sense-making process and exploratory search.

[Example: IBM's Many Eyes]

SUPPORT LEARNING AND UNDERSTANDING

Learning and understanding are important aspects of exploratory search that goes beyond IR; systems can no longer only deliver the relevant documents, but must also provide facilities for deriving meaning from those documents.

Exploratory search systems must also offer topic coverage and controllable result diversity to allow users to learn more about an entire subject area topic or focus their search on a particular subtopic.

FACILITATE COLLABORATION

Collaboration goes beyond the user interface: information that one team member finds is not only presented to other members in pursuit of shared learning, but used by the underlying system in real-time to improve the effectiveness of all team members, while allowing each to work at their own pace. Exploratory search systems need to utilize collaboration between searchers attempting the same task, either at the same time or with latency from delays between Web page postings or bookmarking activities. Individuals may know each other before the task begins or be matched by the system once their interests become clear. Searching as part of a group with common goals and interests is a mutually beneficial activity that can help all members navigate a complex information landscape more effectively.

OFFER HISTORIES, WORKSPACES, AND PROGRESS UPDATES

Google’s Notebook application allows users to collect snippets of content from several Web pages and combine them in a single document.

[http://www.google.com/notebook ... In July 2012, Google Notebook was discontinued]

White and colleagues (2006c) proposed the use of searcher-constructed concept maps to support oral history search. Oral history archives are rich in named entities and inter-entity relationships that can be tagged and made accessible to a search system. Concept maps containing these entities and relationships may therefore be a reasonable way to facilitate search and use in these archives. Figure 4.10 shows an example of a concept map. It has been constructed interactively and maintains its state for an entire search session, or longer, if explicitly saved by the searcher. Searchers can annotate nodes in the map and create relationships between them.

[Here there is a graph in the interface]

[Figure 4.10: example of a concept map]

Concept maps allow searchers to build a representation of their interests that may be helpful to their search. Searchers can store information fragments on a canvas, link these fragments interactively to create a concept map, and use the map to drive future searches or help to better understand their information problem.
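A concept map of this kind can be sketched as a small labeled graph: annotated nodes, named relationships, and a way to turn a node's neighbourhood into a new query. The class `ConceptMap` and its method names are hypothetical, and the example entities are borrowed from these notes rather than from an oral history archive.

```python
class ConceptMap:
    """Searcher-built map: annotated nodes plus labeled relationships."""

    def __init__(self):
        self.notes = {}   # node -> free-text annotation
        self.edges = []   # (source, relation, target) triples

    def annotate(self, node, note):
        self.notes[node] = note

    def relate(self, source, relation, target):
        # Make sure both endpoints exist as nodes before linking them.
        for node in (source, target):
            self.notes.setdefault(node, "")
        self.edges.append((source, relation, target))

    def neighbours(self, node):
        """Nodes directly linked to `node`, in either direction, sorted."""
        linked = {t for s, _, t in self.edges if s == node}
        linked |= {s for s, _, t in self.edges if t == node}
        return sorted(linked)

    def to_query(self, node):
        """Turn a node and its neighbours into a keyword query string."""
        return " ".join([node] + self.neighbours(node))

cmap = ConceptMap()
cmap.relate("Bates", "developed", "berrypicking")
cmap.relate("berrypicking", "resembles", "information foraging")
cmap.annotate("berrypicking", "needs evolve during the search")
print(cmap.to_query("berrypicking"))
```

Using the map to generate the next query is one simple way the stored fragments could drive future searches.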

SUPPORT TASK MANAGEMENT

Since exploratory searches likely transcend multiple search sessions, it is important that exploratory search systems provide a mechanism for users to save their state and allow them to return to previous search sessions later. The state not only includes the documents viewed, but includes all other contextual variables such as queries issued, relevant documents marked, paths followed, and potentially other applications opened.
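One way to picture saving and restoring search state is a small record that is serialized between sessions. The sketch below assumes a JSON payload and a hypothetical `SearchSession` class; the field names follow the contextual variables listed above, and the document identifier is made up.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class SearchSession:
    """State of one exploratory search task across sessions."""
    task: str
    queries: list = field(default_factory=list)          # queries issued, in order
    viewed: list = field(default_factory=list)           # documents viewed
    marked_relevant: list = field(default_factory=list)  # documents marked relevant
    paths: list = field(default_factory=list)            # navigation paths followed

    def save(self) -> str:
        """Serialize the session so it can be stored or shared."""
        return json.dumps(asdict(self))

    @classmethod
    def load(cls, payload: str) -> "SearchSession":
        """Rebuild a session from a previously saved payload."""
        return cls(**json.loads(payload))

session = SearchSession(task="learn about exploratory search")
session.queries.append("berrypicking model")
session.viewed.append("doc-17")
session.marked_relevant.append("doc-17")

# Later, possibly in another session or for another user:
restored = SearchSession.load(session.save())
print(restored == session)
```

Because the payload is plain JSON, the same record could also be handed to another user, which touches on the multiuser scenario as well.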

SUMMARY

When conducting an exploratory search, it may be necessary for searchers to employ multiple interaction modes such as textual queries, query-by-example, facets/selections, dynamic queries, and guided tours to obtain the new understanding they seek. These techniques should exist harmoniously at the interface, with a balance between analytic and browsing strategies. To support intelligence amplification, exploratory search systems need to increase user responsibility as well as control; they must require human intellectual effort and must reward users for effort expended.
