
A Survey on Knowledge Graphs: Representation, Acquisition and Applications - Paper Reading Notes

Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2021). A survey on knowledge graphs: Representation, acquisition, and applications. IEEE Transactions on Neural Networks and Learning Systems, 33(2), 494-514.
 
Abstract— 1) knowledge graph representation learning, 2) knowledge acquisition and completion, 3) temporal knowledge graph, and 4) knowledge-aware applications, and summarize recent breakthroughs and perspective directions to facilitate future research. 
 
We further explore several emerging topics, including meta relational learning, commonsense reasoning, and temporal knowledge graphs. 
 
To facilitate future research on knowledge graphs, we also provide a curated collection of datasets and open-source libraries on different tasks. 
 
[Look at the datasets]

I. INTRODUCTION
 
A knowledge graph is a structured representation of facts, consisting of entities, relationships, and semantic descriptions. Entities can be real-world objects and abstract concepts, relationships represent the relations between entities, and semantic descriptions of entities and their relationships contain types and properties with a well-defined meaning. Property graphs or attributed graphs, in which nodes and relations have properties or attributes, are widely used.
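As a rough illustration of this definition, the sketch below stores facts as (head, relation, tail) triples and keeps entity types as a separate semantic description. The `Triple` class, the `neighbors` helper, and the toy facts are hypothetical names chosen for this note, not something taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One fact: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


# Hypothetical toy facts, for illustration only.
facts = {
    Triple("Albert_Einstein", "wasBornIn", "Ulm"),
    Triple("Albert_Einstein", "winnerOf", "Nobel_Prize_in_Physics"),
}

# Semantic descriptions: a type attached to each entity (assumed schema).
entity_types = {
    "Albert_Einstein": "Person",
    "Ulm": "City",
    "Nobel_Prize_in_Physics": "Award",
}


def neighbors(entity):
    """Return the sorted (relation, tail) pairs of facts whose head is `entity`."""
    return sorted((t.relation, t.tail) for t in facts if t.head == entity)
```

Calling `neighbors("Albert_Einstein")` lists the two outgoing facts of that entity; a property graph would additionally attach attributes to the edges themselves.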
 
[The definition is built on facts. Right after it they relate knowledge graphs to knowledge bases, but they do not relate them directly to semantic networks]
 
For simplicity and following the trend of the research community, this paper uses the terms knowledge graph and knowledge base interchangeably.
 
Recent advances in knowledge-graph-based research focus on knowledge representation learning (KRL) or knowledge graph embedding (KGE) by mapping entities and relations into low-dimensional vectors while capturing their semantic meanings [5], [9]. Specific knowledge acquisition tasks include knowledge graph completion (KGC), triple classification, entity recognition, and relation extraction.
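To make the idea of mapping entities and relations to low-dimensional vectors concrete, here is a minimal sketch using the TransE scoring function, a well-known translation-based embedding model that the excerpt above does not name. The 2-d vectors are hand-picked rather than learned, and the `classify` helper with its threshold is a hypothetical stand-in for triple classification.

```python
import math


def transe_score(h, r, t):
    """Negative L2 distance between h + r and t; higher means more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Hand-picked 2-d embeddings (hypothetical, not learned from data).
entity_vec = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Japan": [-1.0, 1.0],
}
relation_vec = {"capitalOf": [0.0, 1.0]}


def classify(head, rel, tail, threshold=-0.5):
    """Triple classification: accept the triple if its score reaches the threshold."""
    score = transe_score(entity_vec[head], relation_vec[rel], entity_vec[tail])
    return score >= threshold
```

With these vectors, Paris + capitalOf lands exactly on France, so that triple is accepted, while (Paris, capitalOf, Japan) is at distance 2 and is rejected. In practice the vectors would be learned and the threshold tuned on validation data.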
 
II. OVERVIEW
 
 
V. TEMPORAL KNOWLEDGE GRAPH

Current knowledge graph research mostly focuses on static knowledge graphs where facts do not change with time, while the temporal dynamics of a knowledge graph are less explored. However, temporal information is of great importance because structured knowledge only holds within a specific period, and the evolution of facts follows a time sequence. Recent research has begun to incorporate temporal information into KRL and KGC, which is termed temporal knowledge graph in contrast to the previous static knowledge graph. Research efforts have been made to learn temporal and relational embeddings simultaneously.
 
[Temporal context is being included in KRL approaches. Could the same approach be applied to other contexts? This movement has already happened in databases]

B. Entity Dynamics

Real-world events change entities’ state, and consequently, affect the corresponding relations. To improve temporal scope inference, the contextual temporal profile model [181] formulates the temporal scoping problem as state change detection and utilizes the context to learn state and state change vectors.
 
[Temporal context applied to entities / nodes]
 
C. Temporal Relational Dependency

There exist temporal dependencies in relational chains following the timeline, for example,
wasBornIn -> graduateFrom -> workAt -> diedIn.
 
[A semantic rule could infer temporal context when information is missing]
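The ordering in the relational chain above can be turned into a simple rule that bounds a missing timestamp. The sketch below is only an illustration of that idea, not a method from the survey; `ORDER`, `infer_bounds`, and the years are hypothetical.

```python
# Assumed timeline order of relations, taken from the example chain.
ORDER = ["wasBornIn", "graduateFrom", "workAt", "diedIn"]


def infer_bounds(known_years, relation):
    """Bound the year of `relation` using known years of earlier and later relations.

    Returns (lower, upper); either bound is None when nothing constrains it.
    """
    idx = ORDER.index(relation)
    before = [known_years[r] for r in ORDER[:idx] if r in known_years]
    after = [known_years[r] for r in ORDER[idx + 1:] if r in known_years]
    lower = max(before) if before else None
    upper = min(after) if after else None
    return lower, upper


# Hypothetical entity whose graduation year is missing.
known = {"wasBornIn": 1950, "workAt": 1980, "diedIn": 2020}
bounds = infer_bounds(known, "graduateFrom")
```

Here the missing graduation year can only be bounded to the interval between birth and first job; the rule narrows the search space but does not recover the exact value.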
 
VI. KNOWLEDGE-AWARE APPLICATIONS

Rich structured knowledge can be useful for AI applications. However, how to integrate such symbolic knowledge into the computational framework of real-world applications remains a challenge. The application of knowledge graphs is twofold: 1) in-KG applications such as link prediction and named entity recognition; and 2) out-of-KG applications, including relation extraction and more downstream knowledge-aware applications such as question answering and recommendation systems.
 
[Does not mention exploratory search]
 
B. Question Answering

Knowledge-graph-based question answering (KG-QA) answers natural language questions with facts from knowledge graphs. Neural network-based approaches represent questions and answers in distributed semantic space, and some also conduct symbolic knowledge injection for commonsense reasoning.
 
VII. FUTURE DIRECTIONS
 
C. Interpretability

Interpretability of knowledge representation and injection is a vital issue for knowledge acquisition and real-world applications. ... However, recent neural models have limitations on transparency and interpretability, although they have gained impressive performance. Some methods combine black-box neural models and symbolic reasoning by incorporating logical rules to increase interpretability. Interpretability can convince people to trust predictions. Thus, further work should go into interpretability and improve the reliability of predicted knowledge.
 
[Would this be explainability?]
 
APPENDIX D

KRL MODEL TRAINING

Open world assumption (OWA) and closed world assumption (CWA) [214] are considered when training knowledge representation learning models. During training, a negative sample set F' is randomly generated by corrupting a golden triple set F under the OWA. Mini-batch optimization and Stochastic Gradient Descent (SGD) are carried out to minimize a certain loss function. Under the OWA, negative samples are generated with specific sampling strategies designed to reduce the number of false negatives.
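The excerpt does not say which loss is minimized. One common choice for this setup is a margin-based ranking loss, sketched below under assumptions: f_r is a plausibility score where higher means more plausible, γ is a margin hyperparameter, and F and F' are the golden and corrupted triple sets from the paragraph above. The symbols f_r and γ are introduced here for illustration only.

```latex
\mathcal{L} \;=\; \sum_{(h,r,t) \in F} \; \sum_{(h',r,t') \in F'}
  \max\bigl(0,\; \gamma - f_r(h,t) + f_r(h',t')\bigr)
```

Each term is zero once the golden triple outscores its corrupted counterpart by at least γ, so SGD only updates embeddings for pairs that are still ranked too closely.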
 
A. Open and Closed World Assumption

The CWA assumes that unobserved facts are false. By contrast, the OWA has a relaxed assumption that unobserved ones can be either missing or false. Generally, OWA has an advantage over CWA because of the incomplete nature of knowledge graphs. RESCAL [49] is a typical model trained under the CWA, while more models are formulated under the OWA.
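The difference between the two assumptions can be shown as a three-valued lookup. This is a minimal sketch with a hypothetical `truth_value` helper and toy data; `None` stands for "unknown".

```python
# Hypothetical set of observed facts.
observed = {("Paris", "capitalOf", "France")}


def truth_value(triple, assumption):
    """Return True, False, or None (unknown) for a triple under CWA or OWA."""
    if triple in observed:
        return True
    if assumption == "CWA":
        return False  # unobserved facts are treated as false
    if assumption == "OWA":
        return None   # unobserved facts may be missing or false
    raise ValueError(f"unknown assumption: {assumption}")


unseen = ("Tokyo", "capitalOf", "Japan")
cwa_label = truth_value(unseen, "CWA")
owa_label = truth_value(unseen, "OWA")
```

The unseen triple is true in the real world but absent from the toy graph, so the CWA labels it false while the OWA leaves it open. This is why the OWA suits incomplete graphs better.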
 
C. Negative Sampling

Several heuristics of sampling distribution are proposed to corrupt the head or tail entities. The most widely applied one is uniform sampling [16], [17], [39], which uniformly replaces entities. However, it leads to the sampling of false-negative labels. More effective negative sampling strategies are required to learn semantic representation and improve predictive performance.
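A minimal sketch of uniform negative sampling follows, with an optional filter that rejects corruptions already known to be true. The `corrupt` function, its parameters, and the toy data are hypothetical. The filter only removes known positives, so under the OWA a sampled "negative" can still be a true fact that is simply missing from the graph.

```python
import random


def corrupt(triple, entities, known, rng, filtered=True, max_tries=100):
    """Uniformly replace the head or the tail of `triple` with a random entity.

    With filtered=True, candidates that appear in `known` are rejected,
    which avoids the most obvious false negatives.
    """
    h, r, t = triple
    for _ in range(max_tries):
        e = rng.choice(entities)
        # Corrupt the head or the tail with equal probability.
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if candidate == triple:
            continue  # not a corruption at all
        if filtered and candidate in known:
            continue  # known positive, would be a false negative
        return candidate
    raise RuntimeError("no valid negative found")


# Hypothetical toy graph.
known = {("a", "r", "b"), ("a", "r", "c")}
entities = ["a", "b", "c", "d"]
rng = random.Random(0)
negative = corrupt(("a", "r", "b"), entities, known, rng)
```

Without the filter, `("a", "r", "c")` could be drawn as a negative even though it is a known fact, which is exactly the false-negative problem the paragraph describes.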
 
[Training and the CWA/OWA question]
 
APPENDIX F

DATASETS AND LIBRARIES

In this section, we introduce and list useful resources of knowledge graph datasets and open-source libraries.

A. Datasets

Many public datasets have been released. We introduce and summarize general, domain-specific, task-specific, and temporal datasets.

1) General Datasets: Datasets with general ontological knowledge include WordNet [234], Cyc [235], DBpedia [236], YAGO [237], Freebase [238], NELL [73] and Wikidata [239]. It is hard to compare them within a table as their ontologies are different.
 
2) Domain-Specific Datasets: Some knowledge bases on specific domains are designed and collected to evaluate domain-specific tasks. Some notable domains include life science, health care, and scientific research, covering complex domains and relations such as compounds, diseases, and tissues. Examples of domain-specific knowledge graphs are ResearchSpace, a cultural heritage knowledge graph; UMLS [240], a unified medical language system; SNOMED CT, a commercial clinical terminology; and a medical knowledge graph from Yidu Research.
 
More biological databases with domain-specific knowledge include STRING, protein-protein interaction networks; SKEMPI, a Structural Kinetic and Energetic database of Mutant Protein Interactions [241]; Protein Data Bank (PDB) database, containing biological molecular data [242]; GeneOntology, a gene ontology resource that describes protein function; and DrugBank, a pharmaceutical knowledge base [243], [244].

3) Task-Specific Datasets: A popular way of generating task-specific datasets is to sample subsets from large general datasets. Statistics of several datasets for tasks on the knowledge graph itself are listed in Table VIII. Notice that WN18 and FB15k suffer from test set leakage [55]. For KRL with auxiliary information and other downstream knowledge-aware applications, texts and images are also collected, for example, WN18-IMG [71] with sampled images and textual relation extraction datasets including the SemEval 2010 dataset, NYT [245] and Google-RE. IsaCore [246], an analogical closure of Probase for opinion mining and sentiment analysis, is built by common knowledge base blending and multi-dimensional scaling. Recently, the FewRel dataset [247] was built to evaluate the emerging few-shot relation classification task. There are also more datasets for specific tasks such as cross-lingual DBP15K [128] and DWY100K [127] for entity alignment, and multi-view knowledge graphs of YAGO26K-906 and DB111K-174 [119] with instances and ontologies.
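The excerpt does not explain the test set leakage. A commonly reported cause, assumed here, is that many test triples are inverses of training triples, such as a hyponym edge whose hypernym counterpart is in the training set. The sketch below flags such triples on hypothetical toy data; `leaked_test_triples` is an illustrative helper, not part of any dataset tooling.

```python
def leaked_test_triples(train, test):
    """Return test triples whose reversed (tail, head) pair occurs in training."""
    train_pairs = {(h, t) for h, _, t in train}
    return [(h, r, t) for h, r, t in test if (t, h) in train_pairs]


# Hypothetical toy splits.
train = [("dog", "hypernym", "animal")]
test = [
    ("animal", "hyponym", "dog"),   # inverse of a training triple
    ("cat", "hypernym", "animal"),  # no counterpart in training
]
leaked = leaked_test_triples(train, test)
```

A model can answer the flagged test triple by simply reversing the training edge, so a high score on such a split says little about real generalization.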
 
We provide an online collection of knowledge graph publications, together with links to some open-source implementations of them, hosted at
https://shaoxiongji.github.io/knowledge-graphs/.


 
