VAGUE QUERIES
https://youtu.be/7tmqQ-y-hNQ
Vague queries: queries that accept results approximating what is being sought
https://dl.acm.org/doi/pdf/10.1145/45945.48027
Uses distance and similarity metrics to compute the result
1988
Requirements
- Conceptual simplicity
- Adaptability
- Externality to the DBMS
Extends the relational model with a single concept: a similarity metric in the query language, which acts as a new comparator
The user chooses which similarity / distance metric to use
Externality so that it can be incorporated later (as Daniel commented, this is usual in databases)
Interactive: asks the user which interpretation of "similar" applies, which criterion orders the result, and whether to relax the query further (when there is no result)
It is not natural language; it is the database language (SQL), extended
Excerpts from the text
A specific query establishes a rigid qualification and is concerned only with data that match it precisely. A vague query establishes a target qualification and is concerned also with data that are close to this target.
To determine similarity between data values we introduce the notion of distance. Each database domain is provided with a definition of distance between its values called data metric.
Often, distances between values of a given domain may be measured according to various metrics.
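The idea of one domain carrying several data metrics, with the user picking one, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `PRICE` domain, the two metric functions, the `METRICS` registry, and the `similar_to` helper are all assumed names introduced here.

```python
from typing import Callable, Dict

Metric = Callable[[float, float], float]


def absolute_difference(a: float, b: float) -> float:
    """Distance as the plain absolute difference."""
    return abs(a - b)


def relative_difference(a: float, b: float) -> float:
    """Distance as the difference relative to the larger magnitude."""
    largest = max(abs(a), abs(b))
    return 0.0 if largest == 0 else abs(a - b) / largest


# Hypothetical registry: each domain offers one or more named metrics.
METRICS: Dict[str, Dict[str, Metric]] = {
    "PRICE": {
        "absolute": absolute_difference,
        "relative": relative_difference,
    },
}


def similar_to(domain: str, metric_name: str, value: float,
               target: float, radius: float) -> bool:
    """'Soft' comparator: true when value lies within radius of target."""
    distance = METRICS[domain][metric_name](value, target)
    return distance <= radius
```

The same pair of values can be similar under one metric and not under another, which is why the choice is left to the user: `similar_to("PRICE", "absolute", 95, 100, 10)` holds, while `similar_to("PRICE", "relative", 95, 100, 0.01)` does not.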
In our model, vague queries are distinct from specific queries only by their “soft” selection qualification. Thus, the same level of expertise is required to issue specific or vague queries.
Another kind of vague request occurs when the user does not possess the knowledge required for formulating a proper query (this may be because the user is not familiar with the data model, the query language, the organization of the particular database, or because the user does not have a well-defined retrieval goal). This problem has been approached in two ways.
(1) Interactive query constructors help users crystalize their requests. A notable example is RABBIT [26], which applies a paradigm of repetitive reformulation of an initial goal. At each iteration in the construction process the user is presented with the answer to the current query. Having observed the answer, the user can then refine the query by critiquing it in one of several ways available.
(2) Browsers, such as TIMBER [22], SDMS [7], BAROQUE [15] or KIVIEW [18], provide users with a variety of features for exploratory searches. Often, the information is represented as a network, and the retrieval process is iterative. At each iteration the user is presented with information that corresponds to the current location on the network. The user can then issue a new command to advance the search in a particular direction. Elements of browsers are also present in the ME system [9]. The ME database is a network of files connected through links which represent weighted terms. A retrieval request is a set of terms, and a spreading activation process is used to match the files that are most relevant. As the user changes the terms of the query in one terminal window, the window that shows the matched files is updated dynamically.
A user interface to databases that is capable of handling vague requests appears to be more “intelligent.” This is because answering questions with information that is only close to what was requested, or somehow related to it, is a common feature of human interaction. Such interaction is known as cooperative behavior, and there has been much focus on how to improve man-machine interaction by emulating such behavior through various techniques. Various cooperative interfaces (including those mentioned above) are discussed in [11]. Not surprisingly, this added intelligence is made possible by including additional semantic information in the database, namely distances.
In particular, we distinguish between attributes and domains. An attribute is a named column in a relation. A domain is a set of values (possibly infinite). Each attribute is associated with one domain. The domain contains all the values that may appear in that attribute.
Often, database domains are numerical, and the absolute value distance is satisfactory. Sometimes, although a domain is nonnumerical, its values are strictly ordered (for example, a domain RANK with values such as Excellent, Good, Fair, and Poor). Such domains are easily metricized by mapping the domain onto a range of integers (while preserving the order), using the absolute value metric to derive distances, and then storing the distances in a table. Metrics can also be derived from domain partitions. Assume that a domain can be partitioned into a collection of disjoint sets called clusters, each containing values that are judged to be similar. A tabular metric can then be defined as follows: All intracluster distances (distances between two values that are in the same cluster) are set to 0, and all intercluster distances (distances between two values in different clusters) are set to 1. This metric can be refined if a hierarchical partitioning of the domain is available (i.e., clusters are possibly partitioned into further subclusters). The metric is derived from the clusters at the bottom level (level 0). Again, the distance between two values that are in the same cluster is set to 0. The distance between two values that are not in the same cluster is set to the level of the cluster that contains both.
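The three constructions in this excerpt (ordered domain, flat clusters, hierarchical clusters) can be sketched as below. The function names and the example domains (colours, cuisines) are assumptions made for illustration. For the hierarchical case the sketch assumes that every value is described by a path of cluster labels from the top level down to its bottom-level cluster, that all paths have the same length, and that "the cluster that contains both" means the smallest such cluster.

```python
from itertools import product


def ordered_metric_table(ordered_values):
    """Map an ordered domain onto integers and tabulate absolute distances."""
    position = {value: i for i, value in enumerate(ordered_values)}
    return {
        (a, b): abs(position[a] - position[b])
        for a, b in product(ordered_values, repeat=2)
    }


def cluster_metric(clusters):
    """Flat partition: distance 0 inside a cluster, 1 across clusters."""
    cluster_of = {v: idx for idx, cluster in enumerate(clusters) for v in cluster}

    def distance(a, b):
        return 0 if cluster_of[a] == cluster_of[b] else 1

    return distance


def hierarchical_metric(paths):
    """Hierarchical partition: distance is the level of the smallest
    cluster containing both values (bottom-level clusters are level 0)."""

    def distance(a, b):
        path_a, path_b = paths[a], paths[b]
        shared = 0
        for label_a, label_b in zip(path_a, path_b):
            if label_a != label_b:
                break
            shared += 1
        return len(path_a) - shared

    return distance


# Illustrative data (not from the paper, apart from the RANK values).
RANK_TABLE = ordered_metric_table(["Excellent", "Good", "Fair", "Poor"])
colour_distance = cluster_metric([{"red", "crimson"}, {"blue", "navy"}])
cuisine_distance = hierarchical_metric({
    "sushi": ("asian", "japanese"),
    "ramen": ("asian", "japanese"),
    "dim sum": ("asian", "chinese"),
    "pizza": ("european", "italian"),
})
```

With this data, `RANK_TABLE[("Excellent", "Poor")]` is 3, `colour_distance("red", "navy")` is 1, and `cuisine_distance` gives 0 for sushi and ramen, 1 for sushi and dim sum, and 2 for sushi and pizza.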
The extensions to the relational data model that have been described in this article should become an integral part of the database system. However, it is also possible to provide similar functionalities by constructing a simple system “on top” of existing database systems. The advantage of this approach is that it can also be implemented in cases in which the database system in use cannot be modified.
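A rough sketch of the "on top" approach, using Python's standard `sqlite3` module as the unmodified database system. The `hotel` table, its rows, and the `vague_select` function are hypothetical. The layer issues an ordinary query, applies the distance test outside the database, orders the answer by distance, and widens the radius when nothing qualifies. The system described in the notes asks the user before relaxing the query; here a fixed `step` stands in for that dialogue.

```python
import sqlite3


def vague_select(conn, target_price, radius, max_radius, step):
    """Return (radius_used, rows) for hotels priced near target_price.

    The database only answers a plain SELECT; similarity is evaluated
    in this external layer. Returns (None, []) if nothing is found
    within max_radius.
    """
    rows = conn.execute("SELECT name, price FROM hotel").fetchall()
    while radius <= max_radius:
        matches = sorted(
            (abs(price - target_price), name, price)
            for name, price in rows
            if abs(price - target_price) <= radius
        )
        if matches:
            return radius, [(name, price) for _, name, price in matches]
        radius += step  # relax the query and try again
    return None, []


# Hypothetical demo data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotel (name TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO hotel VALUES (?, ?)",
    [("Alpha", 80), ("Beta", 120), ("Gamma", 130)],
)

result = vague_select(conn, target_price=100, radius=5, max_radius=50, step=5)
# result == (20, [('Alpha', 80), ('Beta', 120)])
```

Fetching every row and filtering in Python is only reasonable for small tables; a real layer would translate the radius into a range predicate so the database can prune candidates first.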
In recent years (!) there has been much interest in issues regarding databases with incomplete information (for a review of this topic see [13, chap. 12]). Incomplete information in metricized databases involves two new issues, first, how the availability of distances affects the conventional approaches to incomplete information, and, second, how to deal with incompleteness of the distance information itself.
1. BOLC, L., AND JARKE, M., Eds. Cooperative Interfaces to Information Systems. Topics in Information Systems, Springer-Verlag, Berlin, West Germany, 1986.
11. KAPLAN, S. J. Cooperative responses from a portable natural language query system. Artif. Intell. 19, 2 (Oct. 1982), 165-187.
13. MAIER, D. The Theory of Relational Databases. Computer Science Press, Rockville, Md., 1983.
=======================================================================
https://www.vldb.org/conf/1990/P696.PDF
1990
Uses VAGUE
4 variations: Crisp Data, Crisp Query and Crisp Result .... Fuzzy Data, Fuzzy Query and Fuzzy Result
VQL language: adds more concepts to SQL
Has an architecture for extending a DBMS
Excerpts from the text
Instead of retrieving only a set of answers, our approach yields a ranking of objects from the database in response to a query. By using relevance judgements from the user about the objects retrieved, the ranking for the actual query as well as the overall retrieval quality of the system can be further improved.
On the other hand, handling of user requests that cannot be expressed in two-valued logic is difficult with current DBMSs.
In all these applications, the query languages of current DBMSs offer little support. Mostly, users are forced to submit a series of queries in order to retrieve some objects that are possible solutions to their problem. Moreover, they often cannot be sure if they tried the query that retrieves the optimum solution.
For a vague query, a system based on our approach first will yield an initial ranking of possible answers. Then the user is asked to give relevance judgements for some of the answers, that is, he must decide whether an answer is an acceptable solution to his problem. From this relevance feedback data, the system can derive an improved ranking of the answers for the current request.
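The feedback loop in this excerpt (initial ranking, relevance judgements, improved ranking) can be illustrated with a deliberately simple heuristic. This is not the probabilistic model of the paper: the sketch assumes each object already has a score in [0, 1] per query condition, and it reweights a condition by how much higher it scores on the answers judged relevant than on those judged non-relevant. All names and numbers are hypothetical.

```python
def rank(scores, weights):
    """Order object ids by weighted sum of per-condition scores."""
    return sorted(
        scores,
        key=lambda obj: sum(w * s for w, s in zip(weights, scores[obj])),
        reverse=True,
    )


def weights_from_feedback(scores, judgements):
    """Derive condition weights from relevance judgements.

    scores: {object_id: [score per condition]}
    judgements: {object_id: True if judged relevant, else False}
    """
    n_conditions = len(next(iter(scores.values())))
    weights = []
    for i in range(n_conditions):
        relevant = [scores[o][i] for o, ok in judgements.items() if ok]
        rejected = [scores[o][i] for o, ok in judgements.items() if not ok]
        mean_rel = sum(relevant) / len(relevant) if relevant else 0.0
        mean_rej = sum(rejected) / len(rejected) if rejected else 0.0
        weights.append(max(mean_rel - mean_rej, 0.0))
    if sum(weights) == 0:
        weights = [1.0] * n_conditions  # no usable signal: keep equal weights
    return weights


# Hypothetical per-condition scores for four candidate answers.
scores = {"a": [0.9, 0.2], "b": [0.3, 0.9], "c": [0.8, 0.3], "d": [0.2, 0.8]}

initial = rank(scores, [1.0, 1.0])           # first ranking, equal weights
feedback = {"a": True, "b": False}           # user accepts "a", rejects "b"
improved = rank(scores, weights_from_feedback(scores, feedback))
```

With equal weights "b" is ranked first; after the user rejects "b" and accepts "a", the first condition dominates and the order becomes a, c, b, d.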
A second kind of weighting (called query condition weighting below) refers to the different criteria specified by the user, which may not be of equal importance for him.
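Query condition weighting can be sketched as a weighted average of per-condition closeness scores. This is an assumed, simplified scoring scheme rather than the paper's formulation; the apartment data, the `closeness` function, and the tolerances are made up for the example.

```python
def closeness(value, target, tolerance):
    """Score in [0, 1]: 1 at the target, falling linearly to 0 at tolerance."""
    return max(0.0, 1.0 - abs(value - target) / tolerance)


def rank_by_weighted_conditions(objects, conditions):
    """Rank objects by the weighted average of their condition scores.

    conditions: list of (weight, scoring_function) pairs, where the
    weight expresses how important that criterion is to the user.
    """
    total_weight = sum(weight for weight, _ in conditions)
    scored = [
        (sum(weight * score(obj) for weight, score in conditions) / total_weight,
         obj["name"])
        for obj in objects
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical apartments: monthly rent and distance to the centre in km.
apartments = [
    {"name": "A", "rent": 900, "km": 5},
    {"name": "B", "rent": 1100, "km": 1},
]


def rent_score(apartment):
    return closeness(apartment["rent"], target=900, tolerance=400)


def centre_score(apartment):
    return closeness(apartment["km"], target=0, tolerance=10)


rent_matters_most = rank_by_weighted_conditions(
    apartments, [(3, rent_score), (1, centre_score)])
location_matters_most = rank_by_weighted_conditions(
    apartments, [(1, rent_score), (3, centre_score)])
```

The same two conditions produce opposite rankings depending on the weights: A comes first when rent carries weight 3, and B comes first when distance to the centre does.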
=====================================================================
Old concepts that may be useful for the proposal .....