
Knowledge Graph OLAP: A Multidimensional Model and Query Operations for Contextualized Knowledge Graphs - Paper Reading

Video -> https://youtu.be/d8BNdEaL9bU

Git -> https://jku-win-dke.github.io/KG-OLAP/appendix/

Harald Sack, Christoph G. Schuetz, Loris Bozzato, Bernd Neumayr, Michael Schrefl, and Luciano Serafini. 2021. Knowledge Graph OLAP. Semant. web 12, 4 (2021), 649–683. https://doi.org/10.3233/SW-200419

KR 2021 - Knowledge Graph OLAP: A Multidimensional Model and Query Operations for Contextualized Knowledge Graphs
 
Abstract:
 
A knowledge graph (KG) represents real-world entities and their relationships.
 
The represented knowledge is often context-dependent, leading to the construction of contextualized KGs. The multidimensional and hierarchical nature of context invites comparison with the multidimensional OLAP cube model from data analysis.
 
Traditional systems for online analytical processing (OLAP) employ cube models to represent numeric values for further analysis using dedicated query operations.
 
In this paper, along with an adaptation of the OLAP cube model for KGs, we introduce an adaptation of the traditional OLAP query operations for the purposes of performing analysis over KGs. In particular, we decompose the roll-up operation from traditional OLAP into a merge and an abstraction operation. The merge operation corresponds to the selection of knowledge from different contexts whereas abstraction replaces entities with more general entities. The result of such a query is a more abstract, high-level view -- a management summary -- of the knowledge.

1. Introduction
 
The majority of a KG’s contents are facts/instances or assertional knowledge (ABox), although KGs may also include terminological/ontological knowledge (TBox) representing “the vocabulary used in the knowledge graph” in order to allow for “ontological reasoning and query answering” over the facts.
 
In striving for successful management, KGs are increasingly subject to contextualization, i.e., the enrichment of facts with context metadata such as time and location.
 
Frameworks such as the Contextualized Knowledge Repository (CKR) serve to organize knowledge within hierarchically ordered contexts along multiple contextual dimensions, e.g., spatial and temporal.
 
[CKR used named graphs, reification]
 
Similarly, context dimensions span a multidimensional space where each cell represents a context that comprises facts of a KG.
 
Based on the CKR framework, KG-OLAP extends the idea of Graph OLAP to the management of contextualized KGs. Unlike Graph OLAP, which deals with more structured graphs focused on the relationships between simple entities, KG-OLAP deals with more complex, semi-structured KGs with assertional and terminological components that must be adequately dealt with.
 
2. Use Case: Air Traffic Management
 
In this regard, situational awareness refers to a “person’s knowledge of particular task-related events and phenomena” [26], i.e., knowledge about the world relevant for ATM, which must be accurately represented and conveyed to the various stakeholders.
 
[Knowledge of a specific domain]
 
3. Multidimensional Model
 
In this section, we introduce the KG-OLAP cube model for the management of contextualized KGs. We first introduce the model informally before providing a formal definition. We define the model as a specialization of the Contextualized Knowledge Repository (CKR) framework.
 
3.1. KG-OLAP Cube Model
 
KG-OLAP adapts the multidimensional modeling paradigm from data warehousing in order to organize multidimensional KGs. Hence, the KG-OLAP cube is the central modeling element. Following the basic structure of the CKR framework, the KG-OLAP cube consists of two distinct layers: an upper and a lower layer. The upper layer describes the structure and properties of a cube’s cells; the lower layer specifies cell contents. The two layers employ distinct and possibly disjoint languages.
 
The dimensions are hierarchically organized into levels. The definition of a cube’s dimensions and their hierarchical organization – the cube’s multidimensional structure – into levels is referred to as KG-OLAP cube schema.
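
The two-layer structure described above can be pictured with a small Python sketch. This is only an illustration under assumptions, not the paper's formal model: the class names, the "location" and "time" dimensions, their levels, and all member names are hypothetical, and triples are plain string tuples rather than real RDF terms.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]


@dataclass
class Dimension:
    name: str
    levels: list[str]          # ordered from finest to coarsest level
    level_of: dict[str, str]   # dimension member -> level it belongs to


@dataclass
class Cube:
    # Upper layer: the schema (dimensions and levels) and the cell coordinates.
    dimensions: list[Dimension]
    # Lower layer: each cell (one member per dimension) holds a set of triples.
    cells: dict[tuple[str, ...], set[Triple]] = field(default_factory=dict)

    def granularity(self, cell: tuple[str, ...]) -> tuple[str, ...]:
        """Return the level of each coordinate of a cell."""
        return tuple(d.level_of[m] for d, m in zip(self.dimensions, cell))


# Hypothetical schema with two dimensions.
location = Dimension(
    "location",
    ["Airport", "Country", "AllLocation"],
    {"LOWW": "Airport", "Austria": "Country", "AllLocation": "AllLocation"},
)
time = Dimension(
    "time",
    ["Day", "Month", "AllTime"],
    {"2020-02-12": "Day", "2020-02": "Month", "AllTime": "AllTime"},
)

cube = Cube([location, time])
cube.cells[("LOWW", "2020-02-12")] = {("ex:rwy16", "ex:status", "ex:closed")}
cube.cells[("Austria", "2020-02")] = {("ex:Runway", "rdfs:subClassOf", "ex:Facility")}
```

Keeping the schema and the cell contents in separate fields mirrors the idea that the two layers may use different languages.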
 
3.2. Formalization
 
In the following, we adapt and extend the definitions of the CKR framework – building on the CKR definition in a generic description logic (DL) language – in order to fit the needs of KG-OLAP and its query operations (see Section 4).
 
3.2.1. Basic Definitions
 
We first define the basic notions of a KG-OLAP cube before relating the KG-OLAP cube definitions to the CKR framework. The multidimensional structure is expressed using a cube vocabulary $\Omega$, which is a DL signature. $\Omega$ is composed of the mutually disjoint sets $NR_\Omega$ of atomic roles, $NC_\Omega$ of atomic concepts, and $NI_\Omega$ of individual names. The vocabulary further specifies a set $F \subseteq NI_\Omega$ of cell names, a set $D \subseteq NR_\Omega$ of dimensions, a set $L \subseteq NI_\Omega$ of levels, a set $I \subseteq NI_\Omega$ of dimension members, and for every dimension $E \in D$, a set $D_E \subseteq I$ of dimension members of $E$. The cube language $\mathcal{L}_\Omega$ for expressing a KG-OLAP cube’s multidimensional structure is thus a DL language over cube vocabulary $\Omega$.
 
For every dimension $A \in D$, we define the role $\prec_A$ of dimensional ordering for $A$ as a strict partial order relation over dimension members $D_A$, i.e., an irreflexive, transitive, and antisymmetric role over couples $\langle d, d' \rangle \in D_A \times D_A$. In the following, we also employ the non-strict dimensional ordering $\preceq_A$ over $D_A$. In general, we assume that each dimension is ordered in a simple hierarchy (or tree). Thus, if we denote with $\dot{\prec}_A$ the direct successor relation in the dimensional ordering, we require that $d \mathrel{\dot{\prec}_A} e_1$ and $d \mathrel{\dot{\prec}_A} e_2$ implies $e_1 = e_2$, i.e., $\dot{\prec}_A$ is functional, and we assume that, for every $D_A$, there is a maximum, i.e., an all level with one all member.
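
As a reading aid for the ordering just defined, here is a minimal Python sketch. It assumes the direct successor relation is stored as a dictionary, which makes it functional by construction (each member has at most one parent); the time members are hypothetical examples, not taken from the paper.

```python
# Direct successor relation of a hypothetical time dimension:
# member -> its single direct parent. "All" is the maximum and has no parent.
parent = {
    "2020-02-12": "2020-02",
    "2020-02-13": "2020-02",
    "2020-02": "2020",
    "2020": "All",
}


def ancestors(member):
    """Yield every e such that member is strictly below e, walking up the tree."""
    while member in parent:
        member = parent[member]
        yield member


def strictly_below(d, e):
    """Strict dimensional ordering: d is strictly below e."""
    return e in ancestors(d)


def below_or_equal(d, e):
    """Non-strict dimensional ordering: d equals e or is strictly below e."""
    return d == e or strictly_below(d, e)
```

Because the hierarchy is a tree, the strict ordering is simply the transitive closure of the parent map, so irreflexivity and transitivity follow from walking upward.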
 
We further formally define for every dimension $A \in D$ its set $L_A \subseteq L$ of levels. We define the role $\prec^L_A$ as a strict order relation over $L_A$ and a role $\mathit{lev}$ associating dimension members in $D_A$ to levels in $L_A$.
 
[I can't follow this part, or formalizations in general]
 
4. Query Operations
 
In this section, we introduce a set of query operations for working with KG-OLAP cubes. We distinguish between contextual and graph operations. Contextual operations alter the multidimensional structure of a cube. Graph operations modify the RDF graph in the knowledge modules of the cells. Formally, the operations are defined as transformations of KG-OLAP cubes.
 
4.1. Contextual Operations
 
4.1.1. Slice and Dice
 
The slice-and-dice operation restricts a cube to a set of cells with a specific subset of dimension attribute values; the operation selects a subcube of an input KG-OLAP cube. The slice-and-dice operation selects a partition of the cube for subsequent manipulation.
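
A rough Python sketch of this selection follows. It assumes one possible reading of the operation, namely that a cell is kept when each of its coordinates equals or rolls up to the corresponding selected member; the paper's formal definition may treat coarser cells differently. The hierarchy, cell coordinates, and triples are all hypothetical.

```python
# Hypothetical roll-up hierarchy: member -> direct parent (one parent per member).
parent = {
    "LOWW": "Austria", "LOWS": "Austria", "EDDM": "Germany",
    "Austria": "AllLocation", "Germany": "AllLocation",
    "2020-02-12": "2020-02", "2020-03-01": "2020-03",
    "2020-02": "AllTime", "2020-03": "AllTime",
}


def rolls_up_to(member, target):
    """True if member equals target or lies below it in the hierarchy."""
    while member != target and member in parent:
        member = parent[member]
    return member == target


def slice_dice(cells, selector):
    """Keep the cells whose coordinates all roll up to the selector's members."""
    return {
        coord: triples
        for coord, triples in cells.items()
        if all(rolls_up_to(m, s) for m, s in zip(coord, selector))
    }


cells = {
    ("LOWW", "2020-02-12"): {("ex:rwy16", "ex:status", "ex:closed")},
    ("LOWS", "2020-03-01"): {("ex:rwy15", "ex:status", "ex:open")},
    ("EDDM", "2020-02-12"): {("ex:rwy08L", "ex:status", "ex:open")},
}

# Slice on location only: the time selector is the all member.
austria = slice_dice(cells, ("Austria", "AllTime"))
# Dice on both dimensions.
austria_feb = slice_dice(cells, ("Austria", "2020-02"))
```

Using the all member for a dimension leaves that dimension unrestricted, which is how a slice differs from a dice in this sketch.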
 
4.1.2. Merge
 
The merge (or contextual roll-up) operation changes the granularity of a cube and its dimensions. Given an argument granularity specified as a vector of dimension levels l, the merge operation combines the contents of knowledge modules at granularities that are more specific than the given granularity.
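
The following Python sketch illustrates a union-style merge under simplifying assumptions: each cell's triples are moved to the ancestor cell at the target granularity, and cells that are not below the target level in some dimension are left untouched. The hierarchy, the level names, and the `merge` signature are hypothetical and do not reproduce the paper's formal definition.

```python
# Hypothetical hierarchy and level assignment for two dimensions.
parent = {
    "LOWW": "Austria", "LOWS": "Austria", "Austria": "AllLocation",
    "2020-02-12": "2020-02", "2020-02-13": "2020-02", "2020-02": "AllTime",
}
level_of = {
    "LOWW": "Airport", "LOWS": "Airport", "Austria": "Country",
    "AllLocation": "AllLocation",
    "2020-02-12": "Day", "2020-02-13": "Day", "2020-02": "Month",
    "AllTime": "AllTime",
}


def ancestor_at(member, level):
    """Walk up until a member at `level` is found; None if there is none."""
    while member is not None and level_of[member] != level:
        member = parent.get(member)
    return member


def merge(cells, target_levels):
    """Union the triples of finer cells into their ancestor at the target levels."""
    merged = {}
    for coord, triples in cells.items():
        target = tuple(ancestor_at(m, l) for m, l in zip(coord, target_levels))
        if None in target:
            target = coord  # not below the target granularity: keep as is
        merged.setdefault(target, set()).update(triples)
    return merged


cells = {
    ("LOWW", "2020-02-12"): {("ex:rwy16", "ex:status", "ex:closed")},
    ("LOWS", "2020-02-13"): {("ex:rwy15", "ex:status", "ex:closed")},
    ("Austria", "2020-02"): {("ex:Runway", "rdfs:subClassOf", "ex:Facility")},
}
rolled_up = merge(cells, ("Country", "Month"))
```

After the call, the two day-level airport cells have been folded into the single (Austria, 2020-02) cell.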
 
4.2. Graph Operations
 
Graph operations – abstraction, pivoting, and reification – alter the structure of the RDF graphs inside the knowledge modules of a cell.
Abstraction replaces sets of entities with individual and more abstract entities.
Pivoting moves metaknowledge (contextual information) inside the modules.
Reification allows relations to be represented as individuals.
 
4.2.1. Abstraction
 
Abstraction serves as an umbrella term for a class of graph operations that, broadly speaking, replace entities in an RDF graph with more abstract entities. This abstraction is based on various types of ontological information, e.g., class membership and grouping properties.
We also refer to abstraction as ontological roll-up.
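
One simple variant of abstraction can be sketched in Python as follows. It assumes a grouping property links each individual to its more abstract entity and rewrites every other triple accordingly; the property `ex:partOf` and the runway data are hypothetical, and the paper describes further variants that this sketch does not cover.

```python
def abstract(triples, grouping):
    """Replace individuals by the entity they are grouped under via `grouping`."""
    # Map each grouped individual to its more abstract entity.
    group_of = {s: o for s, p, o in triples if p == grouping}
    # Rewrite the remaining triples; the set removes resulting duplicates.
    return {
        (group_of.get(s, s), p, group_of.get(o, o))
        for s, p, o in triples
        if p != grouping
    }


triples = {
    ("ex:rwy16", "ex:partOf", "ex:LOWW"),
    ("ex:rwy11", "ex:partOf", "ex:LOWW"),
    ("ex:rwy16", "ex:status", "ex:closed"),
    ("ex:rwy11", "ex:status", "ex:closed"),
}
summary = abstract(triples, "ex:partOf")
```

Two runway-level facts collapse into one airport-level fact, which is the kind of higher-level view the operation is meant to produce.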
 
4.2.2. Pivoting
 
The pivoting operation attaches dimensional properties (dimension attribute values) of a cell to a specified set of individuals inside the cell’s object knowledge. Pivoting allows for the preservation of contextual knowledge in case of a merge operation.
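
A minimal Python sketch of pivoting is shown here, assuming that each dimension is turned into a property named after it. The `ex:has...` property names, the dimension names, and the `pivot` signature are hypothetical illustrations.

```python
def pivot(coord, dimensions, triples, selected):
    """Attach the cell's dimension members to the selected individuals."""
    added = {
        (individual, f"ex:has{dim.capitalize()}", member)
        for individual in selected
        for dim, member in zip(dimensions, coord)
    }
    return triples | added


cell = ("LOWW", "2020-02-12")
content = {("ex:rwy16", "ex:status", "ex:closed")}
pivoted = pivot(cell, ("location", "time"), content, {"ex:rwy16"})
```

Once the coordinates are stored inside the module, they survive a later merge into a coarser cell.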
 
4.2.3. Reification
 
The reification operation takes “triples” in the object knowledge of a cell and creates individuals that represent such triples. Reification allows for the preservation of duplicates in case of a union merge, which facilitates subsequent counting of occurrences in the course of the analysis. Furthermore, in combination with pivoting, the reification operation allows for attaching contextual information to context-dependent knowledge, preserving information about the context of a triple in case of a merge union.
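
The effect on duplicates can be illustrated with a small Python sketch. It uses the standard RDF reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`) as an assumption; the paper's operation may create individuals differently, and the node naming scheme with a cell label is hypothetical.

```python
def reify(triples, predicate, cell_label):
    """Replace triples using `predicate` by statement individuals."""
    result = {t for t in triples if t[1] != predicate}
    matching = sorted(t for t in triples if t[1] == predicate)
    for i, (s, p, o) in enumerate(matching):
        # The cell label keeps individuals from different cells distinct.
        node = f"ex:{cell_label}-stmt{i}"
        result |= {
            (node, "rdf:type", "rdf:Statement"),
            (node, "rdf:subject", s),
            (node, "rdf:predicate", p),
            (node, "rdf:object", o),
        }
    return result


# The same fact holds in two different cells (two days).
fact = ("ex:rwy16", "ex:status", "ex:closed")
day1 = reify({fact}, "ex:status", "d1")
day2 = reify({fact}, "ex:status", "d2")

# A plain union of the raw triples keeps one copy; the reified union keeps both.
plain_union = {fact} | {fact}
reified_union = day1 | day2
closures = [t for t in reified_union if t[1] == "rdf:object" and t[2] == "ex:closed"]
```

Counting the statement individuals after the union gives the number of occurrences, which the raw set union would have lost.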
 
5. Proof-of-Concept Implementation
 
In this section we sketch the foundations of a proof-of-concept implementation of a KG-OLAP system using off-the-shelf quad stores.

[Named graphs are supported by triplestores]
 
5.1. Architecture, Model, and Operations
 
A mapping of the formal language to an actual RDF representation allows for the storage of KG-OLAP cubes in off-the-shelf quad stores with SPARQL realizations of the query operations. Context-aware rules serve to materialize roll-up relationships for levels and cells as well as inference and propagation of knowledge.
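
To give a feel for such a mapping, here is a toy Python sketch that stores quads as 4-tuples. It assumes, purely for illustration, that cell metadata lives in one graph and that each cell's knowledge module is its own named graph; the graph names and the properties `ex:hasLocation` and `ex:hasModule` are hypothetical and are not the vocabulary used by the actual implementation.

```python
# Quads are (subject, predicate, object, graph).
quads = [
    # Upper layer: cell metadata kept in a hypothetical global graph.
    ("ex:cell1", "ex:hasLocation", "ex:LOWW", "ex:global"),
    ("ex:cell1", "ex:hasModule", "ex:cell1-mod", "ex:global"),
    # Lower layer: the cell's contents in its own named graph.
    ("ex:rwy16", "ex:status", "ex:closed", "ex:cell1-mod"),
]


def module_of(cell):
    """Look up the named graph that holds a cell's knowledge module."""
    for s, p, o, g in quads:
        if g == "ex:global" and s == cell and p == "ex:hasModule":
            return o
    return None


def triples_in(graph):
    """Return the triples stored in a given named graph."""
    return {(s, p, o) for s, p, o, g in quads if g == graph}


contents = triples_in(module_of("ex:cell1"))
```

In a real quad store the two lookups would typically be a single SPARQL query that uses a `GRAPH` pattern.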
 
6. Related Work
 
Semantic technologies have been used for a variety of tasks in the context of OLAP. Related to KG-OLAP are techniques for data analysis over RDF data. The RDF data cube vocabulary (QB) [46] and its extension, QB4OLAP [47], provide an RDF representation format for publishing traditional OLAP cubes with numeric measures on the semantic web, often with SPARQL-based operators that emulate traditional OLAP queries ...
 
Other work has suggested “lenses” over RDF data for the purpose of RDF data analysis, i.e., analytical schemas which can be used for OLAP queries on RDF data. Similarly, superimposed multidimensional schemas define a mapping between a multidimensional model and a KG in order to allow for the formulation of OLAP queries.
 
Fusion cubes supplement traditional OLAP cubes with external data in RDF format, particularly linked open data where typically the data are not owned by the analyst. Fusion cubes are traditional OLAP cubes with numeric measures that can be populated dynamically with statistical data from RDF sources.
...
 
Closely related to KG-OLAP is Graph OLAP (also known as InfoNetOLAP) [17, 18], which through its informational and topological OLAP queries provides rich query facilities suitable for graph analysis. In Graph OLAP, graphs are associated with dimensional attributes, which yields a graph cube. The edges of the graphs themselves are weighted; the weights represent the measures to be analyzed. Typical applications of Graph OLAP are analysis of co-author and similar social graphs from different time periods, geographic locations, and so on.
 
....
 
Unlike KG-OLAP, existing work on graph and KG summarization largely ignores contextuality in KGs. In fact, existing work on KG summarization is orthogonal to the KG-OLAP approach. Consequently, future work may adapt summarization algorithms to serve as graph operators in KG-OLAP.
 
[Context is little explored in other approaches]

