Exploratory Search: Beyond the Query-Response Paradigm
Ryen W. White and Resa A. Roth
Synthesis Lectures on Information Concepts, Retrieval, and Services, 2009, Vol. 1, No. 1, Pages 1-98
(https://doi.org/10.2200/S00174ED1V01Y200901ICR003)
Evaluation of Exploratory Search Systems
When evaluating exploratory search systems (ESSs), it is impossible to completely separate human behavior from system effects, because the tools are so closely tied to human actions that the two become symbiotic. This symbiosis is intentional; exploratory search systems act as cognitive prosthetics and must be closely coupled to the user and their intentions.
[How will I be able to evaluate the modeling of the database (Contextual KG) without it being affected by the eventual interface?]
While search systems are expanding beyond supporting simple lookup toward complex information-seeking behaviors, the evaluation of search systems has remained limited to settings that involve minimal human–machine interaction.
TREC provides a medium for the evaluation of algorithms underlying the analytical aspects of IR systems, yet it struggles because the experimental methods of batch retrieval are not suited to studies of how search systems are used by human searchers. Search systems are not used in isolation from their surrounding context; they are used by people who are influenced by environmental and situational constraints such as their current task.
[Essentially quantitative methods would not be adequate]
It is unclear if the evaluation of exploratory search will blossom within the TREC paradigm; however, researchers are increasingly turning their attention toward new ways to systematically investigate ESS effectiveness and the information-seeking process.
High levels of interaction, integral to exploratory search, pose an evaluation challenge: there is potential for confounding effects from the different exploration tools, the desired learning effect is difficult to measure, and the potential effect of fatigue limits evaluation to a small number of topics. All of these attributes make it difficult to achieve the statistical significance required for a meaningful quantitative analysis. A key component of exploration is human learning, a topic studied extensively by cognitive psychologists (e.g., Landauer, 2002). Subject-matter learning, measured as a function of exploration time and effort expended, is a viable way to evaluate ESSs.
[Quantitative studies with few cases are not generalizable]
Support for more-rapid learning across a number of users and a range of tasks is indicative of a system that is more effective at supporting exploratory search activities.
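This idea of comparing systems by knowledge gained per unit of exploration time can be sketched as a simple computation. Everything below is an illustrative assumption, not a prescription from the lecture: the Session record, the 0..1 pre/post test scores, and the sample numbers are all invented.

```python
from dataclasses import dataclass

@dataclass
class Session:
    """One user's exploratory session with a given system (hypothetical record)."""
    pre_score: float   # topic-knowledge test score before exploring (0..1)
    post_score: float  # same test after exploring (0..1)
    minutes: float     # exploration time spent

def learning_rate(sessions):
    """Mean knowledge gain per minute across users; a higher value suggests
    more-rapid learning with that system."""
    rates = [(s.post_score - s.pre_score) / s.minutes for s in sessions]
    return sum(rates) / len(rates)

system_a = [Session(0.2, 0.6, 30.0), Session(0.3, 0.7, 25.0)]
system_b = [Session(0.2, 0.5, 30.0), Session(0.3, 0.6, 25.0)]
print(learning_rate(system_a) > learning_rate(system_b))  # True: A supports faster learning
```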
Similarities between exploratory search, sense-making, and information foraging suggest that an analysis of the costs involved in the process, in terms of the gain obtained for time spent representing/understanding the task and finding/selecting information, may also be useful for comparing exploratory search systems.
[Exploration time to reach a goal is a possible metric for comparison against a control group, but it is not a good one]
Ultimately, researchers studying exploratory search must measure the depth and effectiveness of learning rather than focus on efficiency. Time may be less appropriate as a measure of outcome.
Subjective measures such as user satisfaction, engagement, information novelty, and task outcomes are important, but it is through measurement of interaction behaviors, cognitive load, and learning that one can truly evaluate the effectiveness of ESSs.
[User satisfaction as a qualitative metric]
METRICS
To evaluate exploratory search systems, we must target not only the user’s current task performance but also the longer-term effect that using the cognitive prosthetic has on them.
A workshop entitled “Evaluating Exploratory Search Systems,” organized by Ryen White, Gary Marchionini, and Gheorghe Muresan, was held in conjunction with the 2006 ACM SIGIR Conference; the candidate metrics discussed there include the following ....
[Is it worth looking at the workshop publications?]
Engagement and enjoyment: The degree to which users are engaged and are experiencing positive emotions can be a strong indicator of system performance. The amount of interaction required during exploration, the extent to which the user is focused on the task, and their contentment with the system’s responses can indicate whether the system is fulfilling its role in supporting search activity. The number of actionable events (purchases or forms filled, bookmarking, feedback or forwarding events, etc.) can be used as a metric to approximate levels of engagement and enjoyment.
[A subjective measure; how to isolate the effect of the participant’s bias toward the researcher?]
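As a rough illustration of the actionable-events proxy described above, a minimal sketch follows; the event vocabulary and the (event, timestamp) log format are assumptions, not anything specified in the lecture.

```python
# Approximate engagement by the share of "actionable events" in a session log.
# Event names and log format are illustrative assumptions.
ACTIONABLE = {"purchase", "form_submit", "bookmark", "feedback", "forward"}

def engagement_proxy(log):
    """Fraction of logged events that are actionable (0..1)."""
    if not log:
        return 0.0
    return sum(1 for event, _ in log if event in ACTIONABLE) / len(log)

log = [("query", "t1"), ("click", "t2"), ("bookmark", "t3"), ("forward", "t4")]
print(engagement_proxy(log))  # 0.5
```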
Information novelty: Since the goal of exploration is to encounter information not seen before, it is appropriate to include the amount of new information encountered as a way of measuring the effectiveness of an exploratory search system. The rate at which users encounter new information is therefore an important indicator of how effectively the system provides it.
[An objective measure, but how to know that the user did not already know the information? Ask at every interaction?]
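A minimal sketch of the novelty-rate idea. How “previously known” is established (pre-task questioning, interaction history) is exactly the open question in the note above, and is simply assumed away here.

```python
def novelty_rate(viewed_docs, previously_known):
    """Share of viewed documents the user had not encountered before."""
    if not viewed_docs:
        return 0.0
    new = [d for d in viewed_docs if d not in previously_known]
    return len(new) / len(viewed_docs)

print(novelty_rate(["d1", "d2", "d3", "d4"], {"d2"}))  # 0.75
```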
Task success: Task success should not only be based on whether the user reaches a particular target document, but also on whether they were able to encounter a sufficient amount of information and detail en route to reaching their goal. As one workshop participant remarked, “[exploratory search] is more about the journey than the destination.” Since task success may depend on the difficulty of the task, metrics such as the clarity measure may also be appropriate.
[Storytelling so that the user can recount the search experience?]
Task time: Time spent to reach a state of task completeness is an effective way to assess efficiency of exploration activities. Task time can include total time spent, time spent looking at irrelevant documents, and proportion of time spent engaged in directed search versus amount of time spent exploring. Task completeness would be indicated by experimental subjects based on their own perceptions of their task state.
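The decomposition of task time described above could be tallied from an annotated interaction log along these lines. The labels and their assignment are assumptions; in practice, judging relevance and distinguishing directed from exploratory activity are themselves hard problems.

```python
def time_breakdown(events):
    """Proportion of session time per activity label, plus total time.
    Labels ('directed', 'exploring', 'irrelevant') are assumed annotations."""
    totals = {"directed": 0.0, "exploring": 0.0, "irrelevant": 0.0}
    for label, seconds in events:
        totals[label] += seconds
    total = sum(totals.values())
    return {k: v / total for k, v in totals.items()}, total

events = [("directed", 120), ("exploring", 300), ("irrelevant", 60)]
proportions, total_time = time_breakdown(events)
print(proportions["exploring"], total_time)  # 0.625 480
```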
Pirolli (2007) suggested that exploratory search systems could be evaluated through cost structure analysis by finding metrics of learning or expertise and then by comparing how exploration with one system versus another produces better or worse gains against those metrics. It may not be possible to compute a goal for each task. In such cases, one must compare searcher knowledge before and after the task, ask them for feedback about their experience, and focus on their perceptions of task completeness.
Measuring how quickly users reach a particular state of knowledge may contradict other measures of system performance such as engagement, enjoyment, and learning. In such cases, it may be in the interests of users to maximize rather than minimize time spent on task.
[For reaching the goal of a task that involves learning, more time may be better than less. Learning can be measured by comparing what is known before with what is known after.]
Learning and cognition: Learning is key to exploratory search. By measuring cognitive and mental loads, the attainment of learning outcomes, the richness/completeness of a user’s post-exploration perspective, the amount of the topic space covered, and the number of insights users acquire, we can compare exploratory search systems in terms of learning and cognition.
The “informativeness” measure (Tague-Sutcliffe, 1992) took a first step in this direction by combining subjective user responses regarding information utility with a penalty derived from the system’s ability to return relevant items ranked in descending order of relevance. Toms and colleagues (2005) proposed the use of factor analysis (FA), and O’Brien (2008) proposed the use of structural equation modeling (SEM), as ways to examine the interrelationships between multiple search-system evaluation metrics, allowing different types of data (e.g., user attitudes, observed behaviors, system performance) to be examined simultaneously for a more holistic approach to evaluating exploratory search system performance.
[Look up FA and SEM]
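For intuition only: the sketch below combines per-document utility ratings with a rank-based discount, in the spirit of the description above. It is not Tague-Sutcliffe’s actual informativeness formula, just a simplified stand-in using an assumed DCG-style logarithmic discount.

```python
import math

def rank_discounted_utility(utility_ratings):
    """Sum of per-document user utility ratings (0..1), given in ranked order,
    discounted logarithmically by rank position (DCG-style assumption).
    Not Tague-Sutcliffe's actual informativeness formula."""
    return sum(u / math.log2(rank + 2) for rank, u in enumerate(utility_ratings))

# Useful items ranked high score better than the same items ranked low.
print(rank_discounted_utility([0.9, 0.4, 0.7]))  # ~1.50
print(rank_discounted_utility([0.4, 0.7, 0.9]))  # ~1.29
```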
METHODOLOGIES
The role of evaluation in exploratory search is primarily to assess the success of the information-seeking process at reaching the information target(s) for the current session, if those exist, and at achieving higher-order learning objectives for the searcher, such as the ability to apply their gained knowledge to related situations or to design a new product resulting from knowledge synthesis. Evaluation methodologies are tightly connected to how user interaction behavior is represented and to the metrics adopted for measuring success.
[How to isolate the effect of the interface? How to focus only on the effect of the presence or absence of the statements’ context? Is what I want to measure the positive effect of context on exploratory searches (which seems to me to have already been defended by argument) or the effect of a Contextualized KG on those searches?]
Effective evaluation of exploratory search systems requires researchers to first learn about the range of information-seeking tasks, processes, and search strategies which users engage in during exploratory search scenarios.
In the evaluation of interactive search systems, data can only be collected from small numbers of users and about small numbers of tasks. Small sample sizes limit the generalizability of such studies’ findings. The use of complementary methods, such as laboratory studies, log analyses, and ethnographic observations, provides clarity in understanding how systems support users in the search process (Grimes et al., 2007). Without the use of these methods, exploratory search system evaluation will require the development of longitudinal research designs involving larger numbers of more diverse users. One way to address the need for such research designs is to create a “living laboratory” on the Web that contains evaluation resources and infrastructure for bringing researchers and users together (Kelly et al., 2009).
Crowdsourcing marketplaces, such as Amazon’s Mechanical Turk, are emerging as a popular way to obtain relevance assessments for IR experimentation (Alonso et al., 2008). Crowdsourcing can also be used to solicit research participants for exploratory search evaluation. To do so, the community will need to develop economic models to incentivize participation and develop infrastructure to recruit, retain, and experiment with participants.
Methodological rigor in the study of exploratory search has not yet been attained. However, as the field matures, a set of methods accepted by the research community is expected to emerge. To reach that point, a wide variety of candidate methods must be employed, with the expectation that the best methods and practices will eventually prevail. For example, naturalistic, longitudinal studies should be employed alongside lab experiments in controlled conditions. Both techniques are useful for different reasons. Naturalistic, longitudinal studies are better suited to observing the information seeker’s behavior and search strategies, as well as changes in information needs and behavior that occur over time. Moreover, they are invaluable for developing and testing interaction models, and for ensuring that assumptions in user models hold, in general or in certain contexts.
Designing tasks to study exploratory search can be difficult because of the need to induce an exploratory rather than directed style of search. Also difficult is the need for tasks to be constructed in such a way that the results can be compared between subjects in a single study and across multiple studies by different research groups.
[Difficulties in the experimental design]
The proposed characteristics are that an exploratory task: (1) indicates uncertainty, ambiguity in the information need, and/or a need for discovery; (2) suggests a knowledge acquisition, comparison, or discovery task; (3) provides a low level of specificity about the information needed and how to find it; and (4) provides sufficient imaginative context for the test persons to be able to relate to and apply the situation.
[How to formulate the tasks to be proposed in the experiment]
Traditional measures of IR performance based on retrieval accuracy may be inappropriate for the evaluation of these systems. The use of metrics based on engagement, enjoyment, novelty, task time and success, and learning provides an opportunity for understanding exploratory search system performance and for the comparison of different systems. Exploratory search evaluation methodologies must include a mixture of controlled laboratory studies and naturalistic, longitudinal studies.
FA: Toms, E. G., O’Brien, H., Kopak, R., and Freund, L. (2005). Searching for relevance in the relevance of search. In Proceedings of the 5th International Conference on Conceptions of Library and Information Sciences, pp. 59–78.
SEM: O’Brien, H. (2008). Defining and Measuring Engagement in User Experiences with Technology. Unpublished doctoral dissertation, Dalhousie University, Halifax, Canada.
Future Directions and Concluding Remarks
6.1.2 Context Awareness
Like many other search activities, exploratory search activities occur within a work task context. At present, search is typically regarded as a means of finding the information necessary to complete an aspect or aspects of a task. Given the dynamic and uncertain nature of exploratory searches, it will be necessary to embed support for exploratory search in many existing desktop applications. ... Coupling information search with use allows systems to take advantage of knowledge about users’ immediate task context (and less immediate context communicated through a semipermanent user profile) to tailor search results or the user experience. Over time, the search system could also keep track of the user’s current search skill and knowledge level and adapt the search results displayed to its estimate of that level.
[As when context is inferred from the searcher’s location or language]
6.1.3 Task Adaptation
General-purpose search systems will no longer suffice for the complex search tasks in which users engage. These systems may rely on users to select the most appropriate interface for their current task, or the search interface could make recommendations.
[Thematic search engines]
6.1.4 Decision-Making Support and What-If Analysis
Exploratory search systems must provide users with the ability to reason about the data they view to support decision making. Systems that target decision making will offer overviews and summaries of the data, dynamic queries, and “what-if ” analyses. These systems will allow users to see the possible effects of their decisions and assign probabilities to each of the outcomes. In the context of search, decision-making support tools will help users select the optimal paths to follow through the information space. Systems will also gather information from disparate sources to provide users with enough information to make decisions regarding the task at hand. One of the most important decisions that users make when engaged in exploratory search activities is completion of the search task. Exploratory search systems will support choices about the finality of one’s search by offering details on subtopics yet to be explored. Subtopics will be identified through automatic clustering of documents viewed and related documents based upon a crawl of corpora such as the Web.
[This is what the Tuning tool does, and what the social network analysis tool may possibly be able to do as well]
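A minimal sketch of what-if support as probability-weighted outcomes, as described above. The option names, probabilities, and payoffs are invented for illustration; a real system would elicit or estimate them.

```python
def best_option(options):
    """Rank decision options by probability-weighted payoff."""
    scored = {name: sum(p * v for p, v in outcomes)
              for name, outcomes in options.items()}
    return max(scored, key=scored.get), scored

options = {
    "follow_path_A": [(0.7, 10), (0.3, -5)],    # (probability, payoff)
    "follow_path_B": [(0.5, 20), (0.5, -10)],
}
print(best_option(options))  # ('follow_path_A', {'follow_path_A': 5.5, 'follow_path_B': 5.0})
```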
6.1.6 Collaborative and Social Search
Although search is often a solitary activity, the search task often involves multiple individuals. As such, it may be in the searcher’s interests to collaboratively explore the information space and participate in shared learning. Aspects of the task can be allocated to different individuals or groups, making task completion more efficient. However, division of the task has the potential to hinder aspects of the learning for team members, making the attainment of a shared learning objective difficult.
Collaborative exploratory search systems will provide a way to summarize (or facilitate rapid access to) already encountered information. The systems could tailor these summaries (or sets of links) to the respective skill levels of team members. This would allow the team to move rapidly toward task completion with minimal interruption from backtracking in review of information encountered by other team members. Immersive chat-rooms with high-quality streaming video and audio will let users converse in real time with those with similar interests and goals, from remote locations.
[The definition I had of social search was different: using social networks, physical or virtual, to ask experts or people who would have more knowledge about a topic, for example, about traveling to a place]
6.1.8 New Evaluation Paradigms
The deployment of exploratory search interfaces at Web scale opens up new opportunities for their evaluation. In this lecture, we have discussed the evaluation of such interfaces by using small numbers of users engaged in laboratory settings or in longitudinal, naturalistic studies. The ability to monitor the use of exploratory search applications by thousands, if not millions, of users allows system designers to monitor how these systems are used and adapt their components to suit the needs of their user population. Exploratory search may occur over multiple search sessions, and it can be difficult to evaluate exploratory search system effectiveness in a laboratory setting. The creation of shared data sets, pluggable search, indexing, and interaction components (such as crawling mechanisms, ranking algorithms, and interaction logging instrumentation) will cut the lag time from conception to implementation for system developers. More effective evaluation methodologies will improve the quality of the exploratory search systems that reach users.
[There is no dataset, no predetermined search tasks as in benchmarks, and no well-established evaluation metrics]