
Paper Reading - A Comprehensive Approach to Assess Trustworthiness and Completeness of Knowledge Graphs

 A Comprehensive Approach to Assess Trustworthiness and Completeness of Knowledge Graphs

 International Journal of Knowledge Engineering, Vol. 10, No. 1, 2024

 

 Abstract

Completeness and trustworthiness are two dimensions that are used to assess the quality of KGs. Estimation of the completeness and trustworthiness of a large-scale knowledge graph often requires humans to annotate samples from the graph.

Estimating quality metrics that depend on human intervention.

Nowadays, to reduce the costs of the manual construction of knowledge graphs, many KGs have been constructed automatically from sources with varying degrees of trustworthiness. Therefore, possible noises and conflicts are inevitably introduced in the process of construction, which severely interferes with the quality of constructed KGs. 

Automatic construction, used to increase completeness, can degrade trustworthiness: it introduces incorrect or conflicting "facts".

we propose a new approach to automatically evaluate and assess existing KGs in terms of completeness and trustworthiness.

How can these metrics be assessed automatically?

INTRODUCTION

There are several ways to define trustworthiness. For instance, the user’s acceptance of the information as right, genuine, real, and credible is defined by its trustworthiness; trustworthiness also refers to an entity’s or KG’s reputation, which is based on personal experience or third-party recommendations

Right vs. wrong is tied to the idea of absolute truth.

To evaluate the quality of KGs, some papers explore several main evaluation dimensions of KG quality, such as accuracy, completeness, consistency, timeliness, trustworthiness, and availability [5]. Nevertheless, when we improve the accuracy, timeliness, and consistency of a KG, we also increase its trustworthiness.

Quality dimensions that can influence trustworthiness.

we evaluate the entire KG to come up with an accurate trust score, which has been shown in the results of our experiments. To the best of our knowledge, this work is among the first to propose a new approach to evaluate KGs and assign a certain trustworthiness factor score to compare these KGs in terms of their degrees of trustworthiness.

A metric that enables comparison when choosing which KG to use for a task.

Completeness can be subjective because it implies that the quantity of data is adequate for the user’s needs, which might vary considerably. In this context, completeness can be measured as the percentage of available data divided by the required data.

Completeness can vary by task, goal, or intended use.

Unlike other quality dimensions of a KG, the evaluation of KG completeness needs a reference or gold standard to compare results against

In this case it would be independent of the task.

PTrustE: J. Ma, C. Zhou, Y. Wang, Y. Guo, G. Hu, Y. Qiao, and Y. Wang, “PTrustE: A high-accuracy knowledge graph noise detection method based on path trustworthiness and triple embedding,” Knowledge-Based Systems, vol. 256, 109688, 2022.

CKRL suggests three different sorts of triple confidences based on local triple and global path information.

One metric per triple. In a multi-layer KG it would be possible to attach this metric to each triple, but this key/value pair does not correspond to a context.

CKRL: R. Xie, Z. Liu, F. Lin, and L. Lin, “Does William Shakespeare really write Hamlet? Knowledge representation learning with confidence,” in Proc. the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, Apr. 2018.

Therefore, in this research paper, we first identify the noisy triples using CKRL, then we assess the trustworthiness of KGs by calculating the percentage of correct triples in the KG. Furthermore, we evaluate the completeness of knowledge graphs by measuring the percentage of found triples divided by the queried triples.

Method for computing the two metrics.

DESIGN OF EXPERIMENTAL STUDY

we use three standard datasets, namely, FB15K, WN18, and NELL995.

FB = Freebase

In other words, to construct a negative triple, they randomly alter one of the head or tail entities for a given positive triple in KG. It is required for the formation of negative triples that the new head or tail exists in the head or tail position with the same relation in the KG to make it harder and more confusing. ....

To be more precise, to add a noisy triple from an initial positive triple (h, r, t) in KG, either h or t was randomly switched to generate a negative triple. Following this idea, three noisy KGs based on the aforementioned datasets were acquired, with noisy triples making up 10%, 20%, and 40% of positive triples, respectively.

Noise was introduced into the datasets.
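
A minimal Python sketch of this noise-injection step, assuming triples are (head, relation, tail) tuples of strings. The function name `add_noise` and its parameters are hypothetical and not taken from the paper. It swaps the head or the tail of a positive triple for another entity that already occurs in that position with the same relation. Skipping candidates that are themselves positive triples is an assumption of this sketch.

```python
import random
from collections import defaultdict


def add_noise(triples, ratio, seed=0):
    """Build noisy triples by corrupting the head or tail of positive triples.

    Hypothetical helper: `ratio` is the number of noisy triples expressed as a
    fraction of the positive triples (0.1, 0.2 or 0.4 in the setup quoted above).
    May return fewer triples than requested if no valid corruption exists.
    """
    rng = random.Random(seed)
    positives = set(triples)

    # Entities observed in head / tail position for each relation.
    heads, tails = defaultdict(set), defaultdict(set)
    for h, r, t in positives:
        heads[r].add(h)
        tails[r].add(t)

    target = int(len(positives) * ratio)
    candidates = sorted(positives)
    rng.shuffle(candidates)

    noisy = set()
    for h, r, t in candidates:
        if len(noisy) >= target:
            break
        corrupt_head = rng.random() < 0.5
        pool = sorted((heads[r] - {h}) if corrupt_head else (tails[r] - {t}))
        rng.shuffle(pool)
        for e in pool:
            cand = (e, r, t) if corrupt_head else (h, r, e)
            # Assumption: a noisy triple must not already be a known fact.
            if cand not in positives and cand not in noisy:
                noisy.add(cand)
                break
    return noisy
```

Restricting the replacement pool to entities already seen with the same relation is what makes the negatives "harder": a corrupted triple still looks type-consistent.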

To calculate the trust score for the entire KG, we calculate the number of trusted triples over the total number of triples, including noise.

If the trust is greater than 0.5 the triple is a fact; otherwise it is noise.
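
A small Python sketch of the KG-level trust score, assuming each triple already has a confidence value in [0, 1] (for example from a CKRL-style model, which is not implemented here). The function name `kg_trust_score` and the dictionary input are hypothetical. The 0.5 threshold follows the note above.

```python
def kg_trust_score(confidences, threshold=0.5):
    """Fraction of triples whose confidence is above `threshold`.

    `confidences` maps each triple (including noisy ones) to a score in [0, 1].
    How the scores are produced is outside this sketch.
    """
    if not confidences:
        raise ValueError("the knowledge graph has no triples")
    trusted = sum(1 for score in confidences.values() if score > threshold)
    return trusted / len(confidences)


scores = {
    ("Shakespeare", "wrote", "Hamlet"): 0.93,
    ("Shakespeare", "born_in", "Stratford"): 0.81,
    ("Hamlet", "genre", "Tragedy"): 0.66,
    ("Shakespeare", "wrote", "Faust"): 0.12,  # treated as noise
}
print(kg_trust_score(scores))  # 3 trusted triples out of 4 -> 0.75
```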

We removed 10, 20, and 40 percent of the triples to have three KGs with different levels of completeness. Then, we run random queries on these three KGs. If there is a matching triple for the query in the knowledge graph, we increase the completeness score. We query 40% of the triples to get a great estimate of the completeness of each KG. To calculate the completeness score for each KG, we divide the number of found triples by the total number of queried triples. The results show that the calculated completeness score mirrors the level of completeness of a knowledge graph.

Triples were removed and queries were run at random. The query results are compared against the answer key obtained by querying the complete KG.

Could the queries correspond to the competency questions the KG is expected to answer?
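
A rough Python sketch of the completeness experiment, assuming the queries are triples sampled from the complete KG, which is how these notes read the setup. The names `completeness_score` and `simulate` are hypothetical. Because the queries are passed in as a parameter, a fixed set of competency questions (expressed as expected triples) could be used in place of the random sample.

```python
import random


def completeness_score(kg, queries):
    """Number of queried triples found in `kg`, divided by the number queried."""
    if not queries:
        raise ValueError("at least one query is required")
    kg = set(kg)
    found = sum(1 for q in queries if q in kg)
    return found / len(queries)


def simulate(full_kg, removed_ratio, query_ratio=0.4, seed=0):
    """Remove a share of the triples, then query a random share of the full KG."""
    rng = random.Random(seed)
    full = sorted(set(full_kg))
    removed = set(rng.sample(full, int(len(full) * removed_ratio)))
    reduced = [t for t in full if t not in removed]
    # Assumption: queries come from the complete KG, which acts as the gold standard.
    queries = rng.sample(full, int(len(full) * query_ratio))
    return completeness_score(reduced, queries)


full_kg = [(f"e{i}", "related_to", f"e{i + 1}") for i in range(100)]
for removed_ratio in (0.1, 0.2, 0.4):
    print(removed_ratio, simulate(full_kg, removed_ratio))
```

With a random sample the score only approximates 1 minus the removed share. Querying every triple (`query_ratio=1.0`) gives that value exactly.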


TO READ


X. Wang, L. Chen, T. Ban, M. Usman, Y. Guan, S. Liu, and H. Chen, “Knowledge graph quality control: A survey,” Fundamental Research, 2021.


 

 


