Video -> https://youtu.be/8nbb6CwpPMA
RDF-star and SPARQL Path-Search Leveled the Advantages of LPG
RDF Keeps 3 Key Benefits: Standards, Semantics and Interoperability
Historical advantages of LPG:
o Attaching properties to the edges of a graph
o Efficient graph traversal
- Gremlin would be more efficient for path queries
RDF completely leveled those over the last 3 years:
o RDF-star – simple mechanism to attach metadata to the edges
o SPARQL extensions for exploration of multi-hop relationships in graphs
- RDF-star to add metadata to triples
o RDF-star offers more than edge properties
o Graph traversal boosted by reasoning
o Knowledge graphs’ added value
o RDF is better for knowledge graphs
RDF-star allows edge descriptions: statements about statements
✓ Allows multiple levels of nesting
✓ Backward compatible
- Reification adds complexity to the data and increases storage space and load time
Use case: Access control for vocabulary management ... as in the Allegro example, although Allegro does not support RDF-star
Not all products support RDF-star with triple nesting, references to non-asserted triples, and other operations
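To make the reification cost concrete, here is a minimal Python sketch that counts the triples needed to annotate one edge with standard reification versus an RDF-star style embedded triple. Plain tuples stand in for RDF terms, and every name (ex:alice, ex:since, the helper functions) is an assumption made for illustration. It does not model whether the base triple is also asserted.

```python
# Plain tuples stand in for RDF terms; all identifiers are illustrative only.

def reify(triple, stmt_id, annotations):
    """Standard RDF reification: four bookkeeping triples plus one per annotation."""
    s, p, o = triple
    out = [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
    ]
    out += [(stmt_id, ap, ao) for ap, ao in annotations]
    return out

def annotate_star(triple, annotations):
    """RDF-star style: the triple itself is the subject of each annotation."""
    return [(triple, ap, ao) for ap, ao in annotations]

edge = ("ex:alice", "ex:worksFor", "ex:acme")
meta = [("ex:since", "2019"), ("ex:source", "ex:hrSystem")]

reified = reify(edge, "_:stmt1", meta)   # 4 bookkeeping triples + 2 annotations
starred = annotate_star(edge, meta)      # 2 annotations only

# Nesting: a statement about a statement about a statement.
nested = annotate_star(starred[0], [("ex:assertedBy", "ex:bob")])

print(len(reified), len(starred))
print(nested[0][0][0] == edge)
```

The extra bookkeeping triples in the reified form are one plausible source of the storage and load-time overhead noted above.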
Graphs and Path Traversal, Common tasks:
o Check if two nodes are connected
o Find the shortest path between two nodes
o Find all paths between two nodes
o Find all neighboring nodes of distance X
- BFS, DFS, etc ...
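The four common tasks above can be sketched with textbook BFS and DFS. This is a minimal Python illustration over a hypothetical toy graph (GRAPH and all function names are assumptions), not how any particular graph database implements them.

```python
from collections import deque

# Hypothetical toy graph as an adjacency mapping (directed edges).
GRAPH = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}

def shortest_path(graph, src, dst):
    """BFS; returns the nodes of one shortest path, or None if unreachable."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:      # walk parent pointers back to src
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

def connected(graph, src, dst):
    """True if dst is reachable from src."""
    return shortest_path(graph, src, dst) is not None

def all_paths(graph, src, dst, visited=()):
    """DFS enumerating every simple (cycle-free) path from src to dst."""
    visited = visited + (src,)
    if src == dst:
        return [list(visited)]
    paths = []
    for nxt in graph.get(src, []):
        if nxt not in visited:
            paths.extend(all_paths(graph, nxt, dst, visited))
    return paths

def nodes_at_distance(graph, src, x):
    """Nodes whose shortest distance from src is exactly x hops."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if dist[node] == x:
            continue                     # no need to expand beyond x hops
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return {n for n, d in dist.items() if d == x}

print(connected(GRAPH, "A", "E"))
print(shortest_path(GRAPH, "A", "E"))
print(all_paths(GRAPH, "A", "E"))
print(nodes_at_distance(GRAPH, "A", 2))
```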
Graph path search use cases
o Road navigation
o Knowledge graph analysis
o Supply Chain analysis
o Causality mining
o Recommendation
o Social network analysis
Limitations of SPARQL for path search
o Possible with SPARQL 1.1 property paths, HOWEVER:
✓ They uncover the start and end nodes, but not the intermediate ones
✓ Shortest path is tough
- Converting fixed-length path queries (NGP) into BGP/CGP gets complex
o Workarounds are ugly and slow
o To address this, all major triplestores made SPARQL extensions
Graph path search: Querying SHORTEST PATH
Directional (by default) as well as bidirectional search
- Extensions are exposed through SERVICE; GraphDB has a path:search extension where you can specify the source node, destination node, maximum distance, the path predicates, etc.
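As a rough illustration of what those parameters control, the Python sketch below runs a shortest-path search over labeled edges with a source, a destination, a hop limit, a predicate filter and an optional bidirectional mode. It is not GraphDB's path:search API: the function name, its parameters and the sample triples are all hypothetical. It also assumes that "bidirectional" means edges may be followed in either direction.

```python
from collections import deque

# Hypothetical labeled edges (subject, predicate, object).
TRIPLES = [
    ("ex:ann", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:cid"),
    ("ex:dan", "ex:knows", "ex:cid"),
    ("ex:ann", "ex:livesIn", "ex:oslo"),
    ("ex:dan", "ex:livesIn", "ex:oslo"),
]

def find_shortest_path(triples, source, destination,
                       predicates=None, max_hops=None, bidirectional=False):
    """Return the edges of one shortest path, or None if there is none.

    predicates: optional set restricting which edge labels may be followed.
    max_hops: optional cap on the number of edges in the path.
    bidirectional: if True, edges may also be traversed object -> subject.
    """
    adjacency = {}
    for s, p, o in triples:
        if predicates is not None and p not in predicates:
            continue
        adjacency.setdefault(s, []).append((o, (s, p, o)))
        if bidirectional:
            adjacency.setdefault(o, []).append((s, (s, p, o)))

    came_from = {source: None}   # node -> (previous node, edge used)
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            edges = []
            while came_from[node] is not None:
                node, edge = came_from[node]
                edges.append(edge)
            return edges[::-1]   # intermediate edges, in path order
        if max_hops is not None and depth[node] >= max_hops:
            continue             # hop limit reached, do not expand further
        for nxt, edge in adjacency.get(node, []):
            if nxt not in came_from:
                came_from[nxt] = (node, edge)
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return None

# Directed search finds nothing; following edges both ways finds a 2-hop path.
print(find_shortest_path(TRIPLES, "ex:ann", "ex:dan"))
print(find_shortest_path(TRIPLES, "ex:ann", "ex:dan", bidirectional=True))
```

Unlike a plain property path, the result here lists every intermediate edge, which is the gap the vendor extensions are meant to close.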
LDBC Social Network Benchmark (SNB)
Linked Data Benchmarking Council: TPC-like body for graph databases
LDBC SNB is the most comprehensive graph analytics benchmark
✓ Analytics-oriented loads designed to simulate operations in a social network platform
✓ Lots of research work invested in a sophisticated data generator, making sure the data
distributions and connectivity are “good”: both realistic and challenging
- Neo4j was the first to be certified on SNB
- entities: Persons, Cities, Companies, Universities
- GraphDB is the first triplestore to be certified on SNB (audit in progress)
- the shortest path function uses inference
KG - Unified view across diverse information
Solving such problems requires unified view across:
o Diverse databases w/o centralized control
o Text documents and other unstructured content
o Both proprietary & global knowledge
- diversity of sources and formats
All these problems require comprehensive domain knowledge:
o Awareness of over 100k concepts
o Highly interconnected reference data
o Relationships whose semantics really matter
- use the company's taxonomy
What’s the added value of knowledge graphs?
Enterprise KGs serve as hubs for data, metadata and content.
All sorts of data and metadata!
Enriched with even more semantic metadata
- standard metadata and semantic metadata about relationships with other entities in the graph
How does the knowledge graph magic work?
1. Link data to knowledge to put it in context
✓ When connected, two entities describe each other
✓ Conceptual networks allow better interpretation than tables
✓ 100K connected entity descriptions amount to PhD-level knowledge
2. Overlay semantic metadata to assure unambiguous interpretation
✓ “Diverse data”: data used for different purposes in different contexts
✓ Data is likely to be misinterpreted outside its primary context
✓ Clear formal specification of the meaning of data pieces is a must
3. Semantic data model + Stable reference data => Easy updates + Reuse
✓ Semantic metadata lowers the cost to discover and reuse data & models
How is RDF better than any other DM paradigm?
o Explicit formal semantics
✓ To align the different modelling assumptions of the different IT systems
✓ Data validation to maintain good quality
o Interoperability
✓ Federation and remote-access protocols
✓ Web-native syntax and global identifiers
✓ Thousands of datasets available as linked data
o Standards
✓ For everything: syntax, schema, query and update languages, ...
✓ Future proof data management
✓ Reduced vendor lock-in
- RDF and SHACL to specify semantics at the schema level, standard access protocols, diversity and volume of public sources (LOD), W3C standards
What do LPGs lack to serve knowledge graphs? They have none of the 3 above
“Schema support and metadata management are crucial aspects of enterprise data management systems. RDF has advantages in both of these areas”
“The need to manage URIs and metadata, as well as perform SEO functions, makes this use case an ideal match for RDF”
“Dependency tracking can be dealt with equally well by LPG and RDF solutions. When it comes to metadata management, however, we believe this is an area in which RDF solutions have traditionally emphasized, and excelled in”