
Designing Graph Databases With GRAPHED - Article Reading

While browsing my advisor's Lattes CV, I found a 2019 article on conceptual modeling for graph databases, written in partnership with researchers from UnB.

The article proposes a diagram notation for the conceptual model of connected data.

Journal of Database Management
Volume 30 • Issue 1 • January-March 2019


Designing Graph Databases With GRAPHED

Motivation: Although there have been advances in graph database technology, a notation to represent the conceptual graph model continues to present a challenge. There is no data modeling approach widely accepted by the academic and business community for graph databases. One advantage of data models is the ability to verify whether some queries can be resolved with their structures, and this is important to validate the model requirements as well.

Definitions:

  • multigraphs are graphs that can have several edges between the same pair of vertices.
  • hypergraphs are graphs that enable more than two vertices in an edge (called hyperedge).
  • nested graphs are graphs with nodes built from other graphs, that is, they allow groups of vertices that represent an aggregation-level concept (hypernode).
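A minimal sketch of these three structures in plain Python. The vertex names and labels (`alice`, `CALLS`, `family`, and so on) are hypothetical and do not come from the article; the representation is only one of many possible ones.

```python
from collections import Counter

# Multigraph: the same pair of vertices may be connected by several edges,
# so edges are kept in a list (not a set) of (source, target, label) tuples.
multigraph_edges = [
    ("alice", "bob", "CALLS"),
    ("alice", "bob", "EMAILS"),
    ("alice", "carol", "CALLS"),
]
edges_per_pair = Counter((src, dst) for src, dst, _ in multigraph_edges)

# Hypergraph: a single hyperedge may connect more than two vertices.
hyperedge = frozenset({"alice", "bob", "carol"})

# Nested graph: a hypernode is a vertex that itself contains a graph.
hypernode = {
    "name": "family",
    "vertices": {"alice", "bob"},
    "edges": [("alice", "bob", "MARRIED_TO")],
}

print(edges_per_pair[("alice", "bob")])  # 2
print(len(hyperedge))  # 3
print(sorted(hypernode["vertices"]))  # ['alice', 'bob']
```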

Proposal: An implementation-independent notation for conceptual graph data modeling named GRAPHED (Graph Description Diagram for Graph Databases) that covers the four types of graphical structures: simple, with attributes, hyperedge and nested (Angles, 2012), besides cardinality, weight and types.

Evaluation: Effectiveness and compatibility verification of GRAPHED in two case studies: fraud identification, and a biological network model.

Solution Details:

  • Simple and Attributed Vertex

The optional Type information (long in the Person example) is used to indicate the identifier’s domain.


  • Hypervertex (Hypernode) for Nested Graphs

  • Simple and Attributed Edges

The difference between solid and dashed arrows indicates whether the relationship was derived from other relationships. It works as a notation for views or inferred relationships, such as inferring that people with the same mother are siblings (how would the inference rule be defined?).
Other optional information that complements the edge is included as part of the label. This information includes the cardinality between the vertices for that relationship, represented by numbers in parentheses; whether the type of relationship is a domain and not a constant string value; the weight description indicating the value type; and the attributes of the relationship.
Edges can have attributes. Therefore, they can be written as in a vertex, appearing below the label with their names and types. 
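The passages quoted here do not say how the rule behind a derived (dashed) relationship is written down. One possible reading, as a sketch only: treat the derived edge as a view computed from stored edges. The names `mother_of`, `derive_siblings` and the sample people are hypothetical.

```python
from itertools import combinations

# Stored (solid-arrow) relationships as (child, mother) pairs.
mother_of = [
    ("ana", "maria"),
    ("bruno", "maria"),
    ("carla", "julia"),
]


def derive_siblings(child_mother_pairs):
    """Return sibling pairs inferred from a shared mother."""
    children_by_mother = {}
    for child, mother in child_mother_pairs:
        children_by_mother.setdefault(mother, []).append(child)
    siblings = set()
    for children in children_by_mother.values():
        # Every unordered pair of children of the same mother is a sibling pair.
        for a, b in combinations(sorted(children), 2):
            siblings.add((a, b))
    return siblings


derived = derive_siblings(mother_of)
print(derived)  # {('ana', 'bruno')}
```

The derived pairs are never stored; they are recomputed from the `mother_of` edges, which is what makes the relationship behave like a view.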

The other feature for edges was created to indicate not only the labels of the edge but also the domain for relationships if the edge is written with a type. In this case, the relationship is identified not only by a label but with a specific type, which specializes the edges for the specific relationship. 

PARTNER_QUALIFICATION={SHAREHOLDER, DIRECTOR}
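A typed edge like this can be read as an edge whose type must come from a closed domain. A minimal sketch, assuming an enum-based encoding; the class names `PartnerQualification` and `PartnerEdge`, the helper `make_partner_edge` and the sample values are hypothetical and not part of the GRAPHED notation itself.

```python
from dataclasses import dataclass
from enum import Enum


class PartnerQualification(Enum):
    """Domain of allowed types for the partner relationship."""
    SHAREHOLDER = "SHAREHOLDER"
    DIRECTOR = "DIRECTOR"


@dataclass(frozen=True)
class PartnerEdge:
    person: str
    company: str
    qualification: PartnerQualification


def make_partner_edge(person, company, qualification):
    # Enum lookup by value raises ValueError for anything outside the domain.
    return PartnerEdge(person, company, PartnerQualification(qualification))


edge = make_partner_edge("alice", "acme", "DIRECTOR")
print(edge.qualification.name)  # DIRECTOR
```

Calling `make_partner_edge("bob", "acme", "OWNER")` would raise `ValueError`, because `OWNER` is not in the domain.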

 

  • Hyperedges

It also covers hyperedges and hypernodes by respectively generalizing the simple edge and using a restructured concept from the hypernode. Hypervertices are the structures used to create nested graphs: they join vertices and edges in subgraphs and work as nodes as well. If the definition of an edge is extended from a binary relation to a finite subset of V, allowing more than two elements in e, the edges will be able to link several vertices in the same relationship. This new type of subset is called a “hyperedge” and the graph that supports it is considered a “hypergraph”. Therefore, a hypergraph H(V, E) is a set of vertices V with a set of relations E, where each e ∈ E has the form e = {v1, v2, …, vn} with {v1, v2, …, vn} ⊆ V. Hypergraphs enable several instances to be grouped in a single hyperedge, including instances of the same entity.
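The subset condition in the definition can be checked directly. A small sketch, assuming hyperedges are represented as frozensets of vertex identifiers; the function name `is_valid_hypergraph` and the identifiers `p1`, `c1`, `x9` are hypothetical.

```python
def is_valid_hypergraph(vertices, hyperedges):
    """Check that every hyperedge is a non-empty subset of the vertex set."""
    vertex_set = set(vertices)
    return all(len(e) > 0 and set(e) <= vertex_set for e in hyperedges)


V = {"p1", "p2", "p3", "c1"}
E = [
    frozenset({"p1", "p2", "p3"}),  # groups three instances of the same entity
    frozenset({"p1", "c1"}),        # an ordinary binary edge is a special case
]

print(is_valid_hypergraph(V, E))  # True
print(is_valid_hypergraph(V, [frozenset({"p1", "x9"})]))  # False
```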



Additional considerations regarding other conceptual models for graphs:


1) The ER model was not created with graph structures in mind, yet it adequately covers relationships, even when several entities are involved. However, since an ER model covers simple and attributed graphs using the entities to represent nodes and relations for the edges, at least one extension should be defined to include directions for directed graphs, which are the most common data structure in graph databases.

A regular ER model could be used to represent the hyperedges linking the entities together. However, the ER cardinality notation describes the number of instances of a relationship that an instance of some entity can take part in. It does not specify how many instances might appear inside one instance of a relationship.

>> Can the ER model be extended to cover hypergraphs, yet not allow a differentiated cardinality specification?
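To make the two notions of cardinality concrete, here is a sketch over hypothetical data (the `meetings` list and the names in it are my own, not from the article). The first count is what ER-style cardinality constrains; the second is the number of instances inside one relationship instance, which the quoted passage says ER does not specify.

```python
from collections import Counter

# Each set is one relationship instance (a hyperedge) grouping several people.
meetings = [
    {"ana", "bruno", "carla"},
    {"ana", "bruno"},
]

# ER-style cardinality: in how many relationship instances does each
# entity instance take part?
participation = Counter(person for meeting in meetings for person in meeting)

# Not expressed by ER cardinality: how many entity instances appear
# inside a single relationship instance.
sizes = [len(meeting) for meeting in meetings]

print(participation["ana"])  # 2
print(sizes)  # [3, 2]
```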

2) RDF describes both schema and instances together, and it has remained a relevant theme in semantic web research (Wylot, Hauswirth, Cudré-Mauroux, and Sakr, 2018). Anything in RDF is modeled as a Resource, and the RDF graph is a set of triples, composed of subjects connected to objects by predicates. It is suited to modeling graphs where even attributes are described as vertices linked by the predicates. However, it still does not support graphs where vertices and relationships can have attributes inside them. In this way, a more complex graph needs some notation extensions.

>> The attributes of a predicate can be specified through reification or by using quads (g, s, p, o)
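A sketch of both options using plain tuples instead of an RDF library. The `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` terms are the standard RDF reification vocabulary; everything under the `ex:` prefix, including the `ex:default` graph name used to annotate the named graph, is a hypothetical assumption for illustration.

```python
# One attributed edge: alice -[knows {since: 2015}]-> bob
edge = {
    "s": "ex:alice",
    "p": "ex:knows",
    "o": "ex:bob",
    "attrs": {"ex:since": "2015"},
}


def reify(edge, statement_id):
    """RDF reification: describe the triple as an rdf:Statement resource."""
    triples = [
        (edge["s"], edge["p"], edge["o"]),
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", edge["s"]),
        (statement_id, "rdf:predicate", edge["p"]),
        (statement_id, "rdf:object", edge["o"]),
    ]
    # Edge attributes become ordinary triples about the statement resource.
    triples += [(statement_id, k, v) for k, v in edge["attrs"].items()]
    return triples


def as_quads(edge, graph_id):
    """Quad alternative: place the triple in graph g, then annotate g."""
    quads = [(graph_id, edge["s"], edge["p"], edge["o"])]
    quads += [("ex:default", graph_id, k, v) for k, v in edge["attrs"].items()]
    return quads


print(len(reify(edge, "ex:stmt1")))  # 6
print(as_quads(edge, "ex:g1")[0])  # ('ex:g1', 'ex:alice', 'ex:knows', 'ex:bob')
```

Reification costs five triples per edge before any attribute is added, while the quad form keeps the original triple intact and attaches attributes to the graph name.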

Interesting references to explore

Angles, R. (2012). A comparison of current graph database models. In 2012 IEEE 28th international conference on data engineering workshops (ICDEW) (pp. 171-177). IEEE. doi:10.1109/ICDEW.2012.31

Gil, D., & Song, I. Y. (2016). Modeling and Management of Big Data. Future Generation Computer Systems, 63(C), 96–99. doi:10.1016/j.future.2015.07.019

Kaur, K., & Rani, R. (2013). Modeling and querying data in NoSQL databases. In Proceedings of the IEEE International Conference on Big Data 2013 (pp. 1-7). IEEE. doi:10.1109/BigData.2013.6691765


Comments

  1. None of the 3 references appears in my SLR, nor was the Journal of Database Management on the list of journals for selection. The Angles, R. (2012) reference is in the monograph for the NOSQL Graph Databases course.

  2. The Journal of Database Management could be another option for submitting the SLR on Graph Data Modeling.

