
Graph Model Definitions - Hyper-relational

Paper 1 - Improving Hyper-Relational Knowledge Graph Completion

Hyper-relational KGs (HKGs) go beyond conventional KGs by representing facts with more complex semantic information, e.g., using relation-entity pairs as the qualifiers of triplets. The combination of a triplet and its qualifiers together is called a statement.

In a hyper-relational KG G_H, we denote the set of entities and relations as V and R respectively. The total number of entities is N and the number of relations is M. The edge connecting them, which we call a statement (or fact), is expressed in the domain V × R × V × P(R × V), where P denotes the power set.

It is usually written as (m_h, m_r, m_t, Q), where (m_h, m_r, m_t) is the main triplet of the statement, containing head entity m_h ∈ V, relation m_r ∈ R and tail entity m_t ∈ V respectively. Q is the set of qualifiers, consisting of n relation-entity pairs {(qr_i, qe_i)}_{i=1}^{n}, where qr_i ∈ R and qe_i ∈ V.

(m_h, m_r, m_t, {(qr_i, qe_i)})
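
A minimal sketch of how one such statement could be held in memory, assuming Python. The class name `Statement`, its field names, and the sample entities are illustrative choices and do not come from the paper.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# One qualifier (qr_i, qe_i): a qualifier relation and a qualifier entity.
Qualifier = Tuple[str, str]


@dataclass(frozen=True)
class Statement:
    """One hyper-relational statement (m_h, m_r, m_t, Q)."""
    head: str        # m_h, an entity in V
    relation: str    # m_r, a relation in R
    tail: str        # m_t, an entity in V
    qualifiers: FrozenSet[Qualifier] = frozenset()  # Q, a subset of R x V

    def main_triplet(self) -> Tuple[str, str, str]:
        """Return the main triplet, ignoring the qualifiers."""
        return (self.head, self.relation, self.tail)


# Illustrative data only.
stmt = Statement(
    head="Albert Einstein",
    relation="educated at",
    tail="ETH Zurich",
    qualifiers=frozenset({
        ("academic degree", "Bachelor"),
        ("academic major", "Mathematics"),
    }),
)

print(stmt.main_triplet())
print(sorted(stmt.qualifiers))
```

Using a frozen dataclass with a `frozenset` keeps the statement hashable, so statements can themselves be stored in a set, which matches the view of a KG as a set of statements.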

Paper 2 - Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction

Despite its broad adoption, the triple-based representation of a KG often oversimplifies the complex nature of the data stored in the KG, in particular for hyper-relational data (a.k.a. multi-fold [38] or n-ary [14] relational data), where each fact contains multiple relations and entities.

Such hyper-relational data is ubiquitous in KGs.

However, representing a KG using triplets only often oversimplifies the complex nature of the data stored in the KG, in particular for hyper-relational data, where each fact contains multiple relations and entities.

Hyper-relational fact: A hyper-relational fact contains a base triplet (h, r, t) and a set of associated key-value pairs (k_i, v_i), i = 1, ..., n.

(h, r, t, {(k_i, v_i)})
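
To illustrate the oversimplification described above, the sketch below stores two facts in the form (h, r, t, {(k_i, v_i)}) and then strips the key-value pairs. The helper name `to_triplet`, the key names, and the choice of sample facts are assumptions made for illustration, not part of the paper.

```python
from typing import List, Set, Tuple

KeyValue = Tuple[str, str]                            # (k_i, v_i)
Fact = Tuple[str, str, str, Tuple[KeyValue, ...]]     # (h, r, t, {(k_i, v_i)})

# Two distinct facts that share the same base triplet.
facts: List[Fact] = [
    ("Marie Curie", "received", "Nobel Prize",
     (("field", "Physics"), ("year", "1903"))),
    ("Marie Curie", "received", "Nobel Prize",
     (("field", "Chemistry"), ("year", "1911"))),
]


def to_triplet(fact: Fact) -> Tuple[str, str, str]:
    """Drop the key-value pairs and keep only the base triplet (h, r, t)."""
    h, r, t, _ = fact
    return (h, r, t)


# In a triple-only KG both facts collapse into a single edge.
triplets: Set[Tuple[str, str, str]] = {to_triplet(f) for f in facts}

print(len(facts), "hyper-relational facts")
print(len(triplets), "distinct base triplet(s)")
```

The two facts differ only in their key-value pairs, so the triple-only view can no longer tell them apart.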

Paper 3 - Logic on MARS: Ontologies for Generalised Property Graphs

We give a formalisation of a generalised notion of Property Graphs, called multi-attributed relational structures (MARS).

There is no standard definition of what constitutes a KG, and the formats used in practice vary. The basis of KGs typically are directed graphs with labelled nodes and edges. What distinguishes them from plain graphs is their enriched structure that includes additional annotations to provide contextual information for every edge or node. Examples include provenance (source information) and temporal validity, but there can be many other types of annotations.

A popular data model for such KGs is the Property Graph model, used by the Neo4J graph database ... It allows sets of attribute–value pairs to be associated with the nodes and edges in a directed graph. Such graphs are also known as attributed graphs.

In fact, even the underlying data model of a multi-attributed graph lacks proper formalisation. Property Graph and Wikidata are highly implementation bound and have no formal specification.

We consider a finite set P of predicates, where each p ∈ P has an associated arity ar(p) ≥ 0. If not otherwise stated, we assume this signature to be fixed and refrain from mentioning it. For the following definition, let P_fin(S) be the set of all finite subsets of a set S.

Definition 1. A multi-attributed relational structure (MARS) M consists of a non-empty set ∆^M of domain elements and, for each n-ary predicate p ∈ P, an (n + 1)-ary relation p^M ⊆ (∆^M)^n × P_fin(∆^M × ∆^M).

In other words, a MARS behaves like a relational structure (i.e., hypergraph) over a domain ∆^M, where each relation tuple (i.e., hyperedge) is annotated with a finite binary relation over ∆^M. We view this relation as a set of attribute–value pairs. There might be multiple values for each attribute, justifying our terminology. Also note that the same relational tuple may occur with different attribute–value collections within a single MARS. Thus MARS generalise Property Graphs, where attributes are functional and relations are unary and binary. The unary relations can be used to assign attribute–value collections to nodes.
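
A rough sketch of this definition in Python, under the assumption that domain elements are plain strings. The type aliases, the function names `is_functional` and `is_property_graph_like`, and the sample data are all illustrative and not taken from the paper. The check covers only the two restrictions named above (unary or binary relations, functional attributes).

```python
from typing import Dict, FrozenSet, Set, Tuple

Annotation = FrozenSet[Tuple[str, str]]       # finite set of attribute-value pairs
Fact = Tuple[Tuple[str, ...], Annotation]     # (relational tuple, its annotation)
MARS = Dict[str, Set[Fact]]                   # predicate name p -> relation p^M


def is_functional(annotation: Annotation) -> bool:
    """True if every attribute in the annotation has at most one value."""
    attributes = [attribute for attribute, _ in annotation]
    return len(attributes) == len(set(attributes))


def is_property_graph_like(mars: MARS) -> bool:
    """True if all relations are unary or binary and all attributes are functional."""
    for facts in mars.values():
        for args, annotation in facts:
            if len(args) not in (1, 2):
                return False
            if not is_functional(annotation):
                return False
    return True


# Illustrative data only.
mars: MARS = {
    # The same relational tuple occurs twice, with different annotations.
    "spouse": {
        (("taylor", "burton"), frozenset({("start", "1964"), ("end", "1974")})),
        (("taylor", "burton"), frozenset({("start", "1975"), ("end", "1976")})),
    },
    # A unary relation annotates a node; "occupation" has two values.
    "person": {
        (("taylor",), frozenset({("occupation", "actor"),
                                 ("occupation", "producer")})),
    },
}

print(is_property_graph_like(mars))
print(is_property_graph_like({"spouse": mars["spouse"]}))
```

The first call fails the check because `occupation` is multi-valued; restricted to the `spouse` relation, the structure satisfies both restrictions.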

Paper 4 - Wikidata on MARS

As noted in [9], Wikidata's custom data model supports attributed statements (with the attributes referred to as qualifiers), and allows attributes with multiple values.

“In spite of the huge practical significance of these data models ..., there is practically no support for using such data in knowledge representation.”

Wikidata’s custom data model goes beyond the Property Graph data model, which associates sets of attribute-value pairs with the nodes and edges of a directed graph, by allowing for attributes with multiple values.

As Wikidata, like RDF [2], has a single domain for everything, including predicates, we will be extending MARS in this direction, but in a way that does not permit some of the strange situations possible in RDF.

Definition 1. A datatype theory, D, consists of a finite set of named datatypes, D, each of which has a finite or infinite set of data values; a finite set of named and typed datatype relations, R, over D; and a finite set of named and typed datatype functions, F, over D. The relations are closed under negation.
So a datatype theory for the rationals and integers would have as data values all the rational numbers (with the integers as a subset). The datatype functions and relations could include the comparison relations (both within each datatype and between the two datatypes) and arithmetic functions.

Definition 2. An extended MARS (eMARS), M, is a MARS extended with a datatype theory, D. All datatypes, datatype relations, and datatype functions of D as well as all predicates are distinct elements of the domain of M, ∆^M. All data values in D are also elements of ∆^M. Each datatype is a unary predicate of M which is true on the data values of the datatype.

The domain elements for datatypes, datatype relations, datatype functions, and predicates are all distinct, thus eliminating several unusual situations that can occur in RDF and can be forced in extensions of RDF.

Objects in Wikidata are items, which include predicates (properties). Facts in Wikidata are statements, consisting of a subject (an item) and a main snak. Snaks are predicate-object pairs, or some-value snaks, or no-value snaks. Statements also have associated qualifiers, which are also snaks. Statements have a rank, which is regular, preferred, or deprecated. Wikidata also provides optional typing information for the values of properties. We also use a characterization for each property used in a qualifier in Wikidata; a set of ontological rules for Wikidata; and a set of constraints.
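
The statement structure described above could be sketched as follows, assuming Python. The class and field names are hypothetical, this is not the Wikibase API, and the rank names follow the wording of these notes. The sample statement uses labels instead of Wikidata identifiers and is illustrative only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class SnakKind(Enum):
    VALUE = "value"            # predicate-object pair
    SOME_VALUE = "some-value"  # a value exists but is not known
    NO_VALUE = "no-value"      # no value exists


class Rank(Enum):
    REGULAR = "regular"
    PREFERRED = "preferred"
    DEPRECATED = "deprecated"


@dataclass(frozen=True)
class Snak:
    prop: str                            # the predicate (a property)
    kind: SnakKind = SnakKind.VALUE
    value: Optional[str] = None          # the object, only for VALUE snaks

    def __post_init__(self) -> None:
        # A value must be present exactly when the snak is a predicate-object pair.
        if (self.kind is SnakKind.VALUE) != (self.value is not None):
            raise ValueError("value must be set exactly when kind is VALUE")


@dataclass(frozen=True)
class WikidataStatement:
    subject: str                         # an item
    main_snak: Snak
    qualifiers: Tuple[Snak, ...] = ()    # qualifiers are also snaks
    rank: Rank = Rank.REGULAR


# Illustrative data only.
stmt = WikidataStatement(
    subject="Douglas Adams",
    main_snak=Snak("educated at", value="St John's College"),
    qualifiers=(
        Snak("start time", SnakKind.SOME_VALUE),
        Snak("end time", value="1974"),
    ),
)

print(stmt.rank.value, len(stmt.qualifiers))
```

Typing information, ontological rules and constraints are left out of this sketch.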

Presentation on MARS -> https://web.stanford.edu/class/cs520/abstracts/pfps.pdf

Paper 5 - MillenniumDB: A Persistent, Open-Source, Graph Database - Paper Reading

... it (domain graph model) generalizes existing graph data models such as RDF and property graphs. We also show its utility in concisely modeling real-world knowledge graphs that contain higher-arity relations, such as Wikidata [43]. 

Formally, assume a universe Obj of objects (ids, strings, numbers, IRIs, etc.). We define domain graphs as follows: 

Definition 2.1. A domain graph 𝐺 = (𝑂, 𝛾) consists of a finite set of objects 𝑂 ⊆ Obj and a partial mapping 𝛾 : 𝑂 → 𝑂 × 𝑂 × 𝑂.

Intuitively, 𝑂 is the set of objects that appear in our graph database, and 𝛾 models edges between objects. If 𝛾(𝑒) = (𝑛_1, 𝑡, 𝑛_2), this states that the edge (𝑛_1, 𝑡, 𝑛_2) has id 𝑒, type 𝑡, and links the source node 𝑛_1 to the target node 𝑛_2.

We can analogously define our model as a relation ... where eid (edge id) is a primary key of the relation 

DomainGraph(source, type, target, eid)
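
A small sketch of Definition 2.1 in Python, assuming objects are hashable values. The class and method names are made up for illustration and are not MillenniumDB's API; the sample edges are illustrative as well. Because edge ids are themselves objects, an edge can be the source of another edge, which is one way to attach a qualifier to a statement.

```python
from typing import Dict, Hashable, List, Set, Tuple

Obj = Hashable
Edge = Tuple[Obj, Obj, Obj]   # (source n_1, type t, target n_2)


class DomainGraph:
    def __init__(self) -> None:
        self.objects: Set[Obj] = set()     # O, a finite set of objects
        self.gamma: Dict[Obj, Edge] = {}   # partial mapping O -> O x O x O

    def add_object(self, obj: Obj) -> None:
        """Add an object that is not (yet) part of any edge."""
        self.objects.add(obj)

    def add_edge(self, eid: Obj, source: Obj, edge_type: Obj, target: Obj) -> None:
        """Define gamma(eid) = (source, edge_type, target)."""
        if eid in self.gamma:
            # eid acts as a primary key: each id names at most one edge.
            raise KeyError(f"edge id {eid!r} is already in use")
        # The id, the endpoints and the type must all be objects of the graph.
        self.objects.update((eid, source, edge_type, target))
        self.gamma[eid] = (source, edge_type, target)

    def rows(self) -> List[Tuple[Obj, Obj, Obj, Obj]]:
        """Relational view with columns (source, type, target, eid)."""
        return [(s, t, o, e) for e, (s, t, o) in self.gamma.items()]


# Illustrative data only.
g = DomainGraph()
g.add_edge("e1", "Michelle Bachelet", "position held", "President of Chile")
g.add_edge("e2", "e1", "start time", "2006")   # an edge whose source is edge e1

for row in g.rows():
    print(row)
```

Objects that are not keys of `gamma` play the role of plain nodes or values, which reflects the mapping being partial.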

Domain graphs can directly capture property graphs, ... However, given a legacy property graph, there are some potential “incompatibilities” with the resulting domain graph


 

  
