THE OPEN WORLD ASSUMPTION: ELEPHANT IN THE ROOM
author:
Source: https://www.mkbergman.com/852/the-open-world-assumption-elephant-in-the-room/
The main argument is that the closed world assumption (CWA), and the mindset it fosters in traditional database systems, has hindered enterprises and the vendors that support them from adopting incremental, low-risk approaches to knowledge systems and knowledge management. CWA, in turn, has led to over-engineered schemas, overly complicated architectures and massive specification efforts, and these have produced high deployment costs, blown schedules and brittleness.
[CWA would be a reason why data integration projects fail]
Relational Approach - Closed World Assumption (CWA)
Whatever is not known to be true is presumed to be false; a statement must be explicitly asserted to count as true. Negation as failure (NAF) is a related assumption: it treats as false every predicate that cannot be proven to be true. Everything is prohibited until it is permitted.
(Open) Semantic Web Approach - Open World Assumption (OWA)
The lack of a given assertion or fact being available does not imply whether that possible assertion is true or false: it simply is not known. In other words, lack of knowledge does not imply falsity.
Everything is permitted until it is prohibited.
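The difference between the two assumptions can be sketched as two ways of answering the same question against the same facts. This is only an illustrative sketch: the fact base, the names `ask_closed_world` and `ask_open_world`, and the use of `None` for "unknown" are assumptions made for the example, not part of the original article.

```python
from typing import Optional

# Hypothetical fact base: (subject, predicate, object) triples known to be true.
FACTS = {
    ("acme", "locatedIn", "paris"),
    ("globex", "locatedIn", "berlin"),
}


def ask_closed_world(fact: tuple) -> bool:
    """CWA: anything not stated as true is presumed false."""
    return fact in FACTS


def ask_open_world(fact: tuple, known_false: frozenset = frozenset()) -> Optional[bool]:
    """OWA: True if asserted, False only if explicitly denied, else None (unknown)."""
    if fact in FACTS:
        return True
    if fact in known_false:
        return False
    return None


query = ("acme", "locatedIn", "london")
print(ask_closed_world(query))  # False
print(ask_open_world(query))    # None
```

Under CWA the missing fact is answered with a definite "no"; under OWA the same gap yields "unknown" unless the fact has been explicitly denied.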
Relational Approach - Unique Name Assumption (UNA)
The unique name assumption (UNA) presumes that different names always refer to different entities in the world.
(Open) Semantic Web Approach - Duplicate Labels Allowed
OWL allows different synonym labels to be used for the same object, and the same name may refer to different objects. Identity assertions must be explicitly stated.
[sameAs in ontologies]
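One practical consequence is that counting entities works differently. The sketch below contrasts a count under UNA with a count that merges names linked by explicit identity assertions. The function names and the sample labels are hypothetical, and the pairs are assumed to mention only names in the list. Under OWA the second count is only an upper bound unless distinctness is also asserted.

```python
def count_under_una(names):
    """UNA: every distinct name is a distinct entity."""
    return len(set(names))


def count_with_identity(names, same_as_pairs):
    """Merge names linked by explicit identity assertions (union-find)."""
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        parent[find(a)] = find(b)
    return len({find(name) for name in names})


names = ["IBM", "International Business Machines", "Big Blue", "Apple"]
same_as = [("IBM", "International Business Machines"), ("Big Blue", "IBM")]

print(count_under_una(names))               # 4
print(count_with_identity(names, same_as))  # 2
```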
Relational Approach - Complete Information
The data system at hand is assumed to be complete. (Missing information is often handled via the null statement in SQL, but that has been controversial and contentious in its own right.) This is also known as the domain-closure assumption.
(Open) Semantic Web Approach - Incomplete Information
A central tenet of OWA is that information is incomplete. A corollary is that the attributes of specific objects or instances may also be incomplete or partially known.
[KBs are incomplete by nature]
Relational Approach - Single Schema (one world)
A single schema is necessary to define the scope and interpretation of the world (domain at hand).
(Open) Semantic Web Approach - Many World Interpretations
Schema and data instance assertions are kept separate. Multiple interpretations (worlds) for the same data are possible.
[Schema-full vs. schemaless]
Relational Approach - Integrity Constraints
Integrity constraints prevent “incorrect” values from being asserted in the relational model. It is useful for validation/parsing/data input and is related to the single model that contains only the facts asserted. Strict cardinality is used for checking validation.
(Open) Semantic Web Approach - Logical Axioms (restrictions)
Logical axioms provide restrictions through property domains and ranges. Everything can be true unless proven otherwise, and multiple possible models can satisfy the axioms. This provides more powerful inferencing, though it can also be unintuitive at times. Cardinality and range restrictions exhibit different behavior for objects (inferred) or datatypes.
[Early binding vs. late binding]
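The contrast between a constraint that rejects data and an axiom that draws an inference can be sketched as follows. The example assumes a hypothetical property `teaches` whose domain is `Teacher`; the function names and data structures are invented for illustration and do not reproduce any particular database or reasoner API.

```python
def insert_closed(teachers, rows, person, course):
    """Integrity constraint: reject the row if the person is not a known teacher."""
    if person not in teachers:
        raise ValueError(f"{person} is not a registered teacher")
    rows.append((person, course))


def assert_open(types, triples, person, course):
    """Domain axiom: asserting 'teaches' lets us infer the subject is a Teacher."""
    triples.add((person, "teaches", course))
    types.setdefault(person, set()).add("Teacher")


teachers = {"alice"}
rows = []
try:
    insert_closed(teachers, rows, "bob", "logic")
except ValueError as err:
    print("rejected:", err)

types = {"alice": {"Teacher"}}
triples = set()
assert_open(types, triples, "bob", "logic")
print(types["bob"])  # {'Teacher'}
```

The same input is an error in the first case and new knowledge in the second, which is why domain and range axioms can feel unintuitive to someone expecting validation.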
Relational Approach - Non-monotonic Logic
The set of conclusions warranted on the basis of a given knowledge base need not grow (in fact, it may shrink) as the knowledge base grows [5].
(Open) Semantic Web Approach - Monotonic Logic
The hypotheses of any derived fact may be freely extended with additional assumptions. Additional assertions narrow the set of possible models that satisfy the knowledge base, but they never invalidate an earlier entailment: a new piece of knowledge cannot reduce what is known [5]. New knowledge can arise through inference.
[Inference, deduction]
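A small sketch can make the monotonic and non-monotonic behavior concrete. It assumes a toy knowledge base of string facts, negation-as-failure conclusions over a fixed candidate list, and simple positive rules of the form (body, head); all names are hypothetical and chosen only for the illustration.

```python
def naf_conclusions(facts, candidates):
    """CWA/NAF: every candidate that is not an asserted fact is concluded false."""
    return {("not", c) for c in candidates if c not in facts}


def forward_chain(facts, rules):
    """Monotonic inference: add a rule's head whenever its whole body is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if set(body) <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


candidates = ["bird(tweety)", "penguin(tweety)"]
rules = [(("bird(tweety)",), "animal(tweety)")]

small = {"bird(tweety)"}
large = small | {"penguin(tweety)"}

# Non-monotonic: adding a fact retracts an earlier conclusion.
print(naf_conclusions(small, candidates))  # {('not', 'penguin(tweety)')}
print(naf_conclusions(large, candidates))  # set()

# Monotonic: everything derived from the smaller base is still derived.
print(forward_chain(small, rules) <= forward_chain(large, rules))  # True
```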
Relational Approach - Fixed and Brittle
Changing the schema requires re-architecting the database; not inherently extensible.
(Open) Semantic Web Approach - Reusable and Extensible
Designed from the ground up to reuse existing ontologies (axioms) and to be extensible. Database design and management can be more agile, with schema evolving incrementally.
[Reuse ontologies; integrate with sources that use those ontologies]
Relational Approach - Flat Structure; Strong Typing
Information organized into flat tables; linkages and connections between tables based on foreign keys or joins. Strong data typing orientation.
(Open) Semantic Web Approach - Graph Structure; Open Typing
Inherent graph structure, supporting linkage and connectivity analysis. Datatypes are inherently loose, though axioms can add strong types. Datatypes are treated in the same way as classes, and datatype values are treated in the same way as individual identifiers (i.e., a data value is treated as referring to an object).
[Data types]
Relational Approach - Querying and Tooling
SQL and query optimizers well developed. Tooling well developed. Disjunction not supported; negation must be accommodated through approaches such as NAF. Sums and counts are easier due to the unique name premise. Answer closure (one answer passable to a next calculation) is easier than under OWA. Most tools are not suitable for any arbitrary schema.
(Open) Semantic Web Approach - Querying and Tooling
SPARQL and emerging rule languages used for querying; performance at scale and with broad distribution a concern. Queries require contextual information for proper set selection. Negation and disjunction are allowed and are powerful constructs. Tools generally less developed. Exciting opportunities for ontology-driven applications working against a small set of generic tools.
[SQL vs. SPARQL: maturity and adoption]
The number of negative facts about a given domain is typically much greater than the number of the positive ones. So, in many bounded applications, the number of negative facts is so large that their explicit representation can become practically impossible [7]. In such cases, it is simpler and shorter to state known “true” statements than to enumerate all “false” conditions.
[CWA: whatever is not in the relation is in the relation's complement and does not need to be represented explicitly]
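The imbalance between positive and negative facts is easy to see with a little arithmetic. The sketch below assumes a hypothetical bounded domain of 100 suppliers and 50 cities in which each supplier is located in exactly one city; the sizes and names are invented for the example.

```python
from itertools import product


def negative_facts(subjects, objects, positives):
    """Every (subject, object) pair in the bounded domain that is not asserted."""
    return set(product(subjects, objects)) - set(positives)


suppliers = [f"s{i}" for i in range(100)]
cities = [f"c{j}" for j in range(50)]

# Hypothetical positive facts: one city per supplier.
located_in = {(s, cities[i % len(cities)]) for i, s in enumerate(suppliers)}

negatives = negative_facts(suppliers, cities, located_in)
print(len(located_in))  # 100
print(len(negatives))   # 4900
```

Storing only the 100 positive pairs and treating the remaining 4900 as implicitly false is what makes CWA compact in a bounded application.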
However, the relational model is a paradigm where the information must be complete and it must be described by a single schema. ... This makes CWA and its related assumptions a very poor choice when attempting to combine information from multiple sources, to deal with uncertainty or incompleteness in the world, or to try to integrate internal, proprietary information with external data.
OWA allows suppliers without cities and names to be stored alongside suppliers with that information. ... Duplicate checking now occurs based on the logic of the system and not unique name evaluations.
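As a rough illustration of storing partial and complete descriptions side by side, suppose each source holds (subject, property, value) triples. Combining sources is then a set union, and a missing attribute simply stays absent rather than forcing a null or a rejected row. The supplier identifiers, property names and helper functions below are hypothetical.

```python
def merge_sources(*sources):
    """Combine triple sets from several sources; nothing already known is lost."""
    merged = set()
    for triples in sources:
        merged |= set(triples)
    return merged


def describe(triples, subject):
    """Collect whatever is currently known about one subject."""
    description = {}
    for s, p, o in triples:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


internal = {
    ("s1", "name", "Smith"),
    ("s1", "city", "London"),
    ("s2", "status", "20"),  # no name or city recorded yet
}
external = {("s2", "name", "Jones")}

merged = merge_sources(internal, external)
print(describe(merged, "s2") == {"status": {"20"}, "name": {"Jones"}})  # True
print("city" in describe(merged, "s2"))  # False
```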
- Knowledge is never complete
- Knowledge is found in structured, semi-structured and unstructured forms
- Knowledge can be found anywhere
- Knowledge structure evolves with the incorporation of more information
- Knowledge is contextual — the importance or meaning of given information changes by perspective and context. Further, exactly the same information may be used differently or given different importance depending on circumstance. Still further, what is important to describe (the “attributes”) about certain information also varies by context and perspective. Large knowledge management initiatives that attempt to use the relational model and single perspectives or schema to capture this information are doomed in one of two ways: either they fail to capture the relevant perspectives of some users; or they take forever and massive dollars and effort to embrace all relevant stakeholders’ contexts.
- Knowledge should be coherent
- Knowledge is about connections
- Knowledge is about its users defining its structure and use — since knowledge is a state of understanding by practitioners and experts in a given domain, it is also important that those very same users be active in its gathering, organization (structure) and use.
Open world is simply a way to think about the information we have and how we act on it. OWA technologies are neutral to the question of open or public sources.
Thus, open world frameworks provide some incredibly important benefits for knowledge management applications in the enterprise:
• Domains can be analyzed and inspected incrementally
• Schema can be incomplete and developed and refined incrementally
• The data and the structures within these open world frameworks can be used and expressed in a piecemeal or incomplete manner
• We can readily combine data with partial characterizations with other data having complete characterizations
• Systems built with open world frameworks are flexible and robust; as new information or structure is gained, it can be incorporated without negating the information already resident, and
• Open world systems can readily bridge or embrace closed world subsystems.
There are also questions about performance and scalability with open semantic technologies.