Exploring Scholarly Data by Semantic Query on Knowledge Graph Embedding Space

Exploring Scholarly Data by Semantic Query on Knowledge Graph Embedding Space - Paper Reading

Tran H.N., Takasu A. (2019) Exploring Scholarly Data by Semantic Query on Knowledge Graph Embedding Space. In: Doucet A., Isaac A., Golub K., Aalberg T., Jatowt A. (eds) Digital Libraries for Open Knowledge. TPDL 2019. Lecture Notes in Computer Science, vol 11799. Springer, Cham. https://doi.org/10.1007/978-3-030-30760-8_14

Abstract.

... In recent years, the knowledge graph has emerged as a universal data format for representing knowledge about heterogeneous entities and their relationships. The knowledge graph can be modeled by knowledge graph embedding methods, which represent entities and relations as embedding vectors in semantic space, then model the interactions between these embedding vectors.

... In this paper, we propose to analyze these semantic structures based on the well-studied word embedding space and use them to support data exploration.

We also define the semantic queries, which are algebraic operations between the embedding vectors in the knowledge graph embedding space, to solve queries such as similarity and analogy between the entities on the original datasets. We then design a general framework for data exploration by semantic queries and discuss the solution to some traditional scholarly data exploration tasks.

1 Introduction

In recent years, digital libraries have moved towards open science and open access with several large scholarly datasets being constructed.

Notably, instead of using knowledge graphs directly in some tasks, we can model them by knowledge graph embedding methods, which represent entities and relations as embedding vectors in semantic space, then model the interactions between them to solve the knowledge graph completion task.

2 Related Work

2.1 Knowledge graph for scholarly data
2.2 Knowledge graph embedding
2.3 Word embedding

3 Theoretical analysis

3.2 Semantic query

We are mainly concerned with the following two structures of the embedding space.

– Semantic similarity structure: Semantically similar entities are close to each other in the embedding space, and vice versa. This structure can be identified by a vector similarity measure, such as the dot product between two embedding vectors.

– Semantic direction structure: There exist semantic directions in the embedding space, by which only one semantic aspect changes while all other aspects stay the same. It can be identified by a vector difference, such as the subtraction between two embedding vectors.
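The two structures above can be illustrated with a minimal sketch. The vectors and entity names here are invented for illustration and do not come from the paper; the dot product is used as the similarity measure because the text names it as one option.

```python
def dot(u, v):
    """Dot product, the similarity measure named in the text."""
    return sum(a * b for a, b in zip(u, v))


def difference(u, v):
    """Vector difference u - v, read as a candidate semantic direction."""
    return [a - b for a, b in zip(u, v)]


# Hypothetical 3-d embeddings, invented for illustration only.
author_a = [0.9, 0.1, 0.2]
author_b = [0.8, 0.2, 0.1]
venue_x = [0.1, 0.9, 0.3]

# Similarity structure: a larger score means the entities are closer.
sim_ab = dot(author_a, author_b)
sim_ax = dot(author_a, venue_x)

# Direction structure: the step that leads from author_a to author_b.
a_to_b = difference(author_b, author_a)
```

With these toy values, `sim_ab` is larger than `sim_ax`, so AuthorB would be treated as more similar to AuthorA than VenueX is.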

Definition 1. Semantic queries on knowledge graph embedding space are defined as the algebraic operations between the knowledge graph embedding vectors to approximate a given data exploration task on the original dataset.

4 Semantic query framework

Task processing: converting data exploration tasks to algebraic operations on the embedding space by following task-specific conversion templates. Some important tasks and their conversion templates are discussed in Section 5.

5 Exploration tasks and semantic queries conversion

5.1 Similar entities

Given an entity e ∈ E, find entities that are similar to e. For example, given AuthorA, find authors, papers, and venues that are similar to AuthorA. Note that the search can be restricted to specific entity types. This is a traditional task in scholarly data exploration, whereas the other tasks below are new.
Semantic query: We can solve this task by looking for the entities with the highest similarity to e.
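A minimal sketch of this query, assuming the dot product as the similarity measure and a plain dictionary of embeddings. The function name `most_similar`, the `types` mapping, and all vectors are hypothetical, introduced only for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def most_similar(name, embeddings, types, k=2, allowed_type=None):
    """Rank the other entities by dot-product similarity to `name`."""
    query = embeddings[name]
    candidates = [
        other for other in embeddings
        if other != name
        and (allowed_type is None or types[other] == allowed_type)
    ]
    candidates.sort(key=lambda other: dot(embeddings[other], query), reverse=True)
    return candidates[:k]


# Hypothetical embeddings and entity types, invented for illustration.
embeddings = {
    "AuthorA": [0.9, 0.1, 0.2],
    "AuthorB": [0.8, 0.2, 0.1],
    "AuthorC": [0.1, 0.2, 0.9],
    "VenueX": [0.7, 0.3, 0.1],
}
types = {
    "AuthorA": "author",
    "AuthorB": "author",
    "AuthorC": "author",
    "VenueX": "venue",
}

any_type = most_similar("AuthorA", embeddings, types)
authors_only = most_similar("AuthorA", embeddings, types, allowed_type="author")
```

Passing `allowed_type` restricts the results to one entity type, as the task description allows.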

5.2 Similar entities with bias

Given an entity e ∈ E and some positive bias entities A = {a1, ..., ak} known as expected results, find entities that are similar to e following the bias in A. For example, given AuthorA and some successfully collaborating authors, find other similar authors that may also result in good collaborations with AuthorA.
Semantic query: We can solve this task by looking for the entities with the highest similarity to both e and A. For example, denoting the arithmetic mean of embedding vectors in A as Ā,
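The exact formula is not reproduced in these notes, so the sketch below rests on an assumption: the query vector is taken to be e plus the mean of the bias vectors, and candidates are ranked by dot product against it. The function names and all vectors are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean(vectors):
    """Arithmetic mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def similar_with_bias(name, positive, embeddings, k=1):
    """Rank entities against e + mean(A) (an assumed combination rule)."""
    a_bar = mean([embeddings[p] for p in positive])
    query = [x + y for x, y in zip(embeddings[name], a_bar)]
    exclude = {name, *positive}  # do not return the inputs themselves
    ranked = sorted(
        (other for other in embeddings if other not in exclude),
        key=lambda other: dot(embeddings[other], query),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 2-d embeddings, invented for illustration.
embeddings = {
    "AuthorA": [1.0, 0.0],
    "Collab1": [0.0, 1.0],
    "Collab2": [0.2, 0.8],
    "AuthorD": [0.9, 0.1],  # close to AuthorA only
    "AuthorE": [0.6, 0.7],  # close to AuthorA and to the collaborators
}

biased = similar_with_bias("AuthorA", ["Collab1", "Collab2"], embeddings)
```

In this toy data, a plain similarity search from AuthorA would rank AuthorD first, while the biased query ranks AuthorE first because it is also close to the known collaborators.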

5.3 Analogy query

Given an entity e ∈ E, positive bias A = {a1, ..., ak}, and negative bias B = {b1, ..., bk}, find entities that are similar to e following the biases in A and B. The essence of this task is tracing along a semantic direction defined by the positive and negative biases. For example, starting with AuthorA, we can trace along the expertise direction to find authors that are similar to AuthorA but with higher or lower expertise.
Semantic query: We can solve this task by looking for the entities with the highest similarity to e and A but not B. For example, denoting the arithmetic mean of embedding vectors in A and B as Ā and B̄, respectively, note that Ā − B̄ defines the semantic direction along the positive and negative biases.
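A minimal sketch of an analogy query, under the assumption that the query vector is e plus the semantic direction (mean of A minus mean of B) and that candidates are ranked by dot product. This mirrors the familiar word-embedding analogy operation and is not necessarily the paper's exact formula; all names and vectors are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def analogy_query(name, positive, negative, embeddings, k=1):
    """Rank entities against e + mean(A) - mean(B) (an assumed rule)."""
    a_bar = mean([embeddings[p] for p in positive])
    b_bar = mean([embeddings[n] for n in negative])
    query = [e + a - b for e, a, b in zip(embeddings[name], a_bar, b_bar)]
    exclude = {name, *positive, *negative}
    ranked = sorted(
        (other for other in embeddings if other not in exclude),
        key=lambda other: dot(embeddings[other], query),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 2-d embeddings: first axis ~ topic, second axis ~ expertise.
embeddings = {
    "AuthorA": [1.0, 0.5],
    "Senior1": [0.2, 0.9],
    "Senior2": [0.4, 1.0],
    "Junior1": [0.2, 0.1],
    "Junior2": [0.4, 0.2],
    "AuthorHigh": [0.9, 0.9],
    "AuthorLow": [1.0, 0.1],
}

seniors, juniors = ["Senior1", "Senior2"], ["Junior1", "Junior2"]
higher = analogy_query("AuthorA", seniors, juniors, embeddings)
lower = analogy_query("AuthorA", juniors, seniors, embeddings)
```

Swapping the positive and negative sets reverses the direction, so the same function traces toward higher or lower expertise.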

5.4 Analogy browsing

This task is an extension of the above analogy query task, by tracing along multiple semantic directions defined by multiple pairs of positive and negative biases. This task can be implemented as an interactive data analysis tool. For example, starting with AuthorA, we can trace to authors with higher expertise, then continue tracing to a new domain to find all authors similar to AuthorA with high expertise in the new domain. As another example, starting with Paper1, we can trace to papers with higher quality, then continue tracing to a new domain to look for papers similar to Paper1 with high quality in the new domain.
Semantic query: We can solve this task by simply repeating the semantic query for analogy query with each pair of positive and negative biases. Note that we can also combine different operations in different orders to support flexible browsing.
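Repeating the analogy step can be sketched as a small loop. The assumptions are that each step moves to the single best match for current + mean(A) − mean(B) under the dot product, and that already visited entities and the bias entities are skipped; the function names, reference entities, and vectors are all hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def analogy_step(current, positive, negative, embeddings, visited):
    """One analogy query from `current`; returns the best unvisited match."""
    a_bar = mean([embeddings[n] for n in positive])
    b_bar = mean([embeddings[n] for n in negative])
    query = [e + a - b for e, a, b in zip(embeddings[current], a_bar, b_bar)]
    skip = set(visited) | set(positive) | set(negative)
    candidates = [n for n in embeddings if n not in skip]
    return max(candidates, key=lambda n: dot(embeddings[n], query))


def browse(start, steps, embeddings):
    """Apply one analogy step per (positive, negative) pair, in order."""
    path = [start]
    for positive, negative in steps:
        path.append(analogy_step(path[-1], positive, negative, embeddings, path))
    return path


# Hypothetical 3-d embeddings: axes ~ (database topic, ML topic, expertise).
embeddings = {
    "AuthorA": [1.0, 0.0, 0.2],
    "DbExpert": [1.0, 0.0, 0.9],
    "MlExpert": [0.0, 1.0, 0.9],
    "MlNovice": [0.0, 1.0, 0.1],
    "HighRef": [0.5, 0.5, 1.0],
    "LowRef": [0.5, 0.5, 0.0],
    "DbRef": [1.0, 0.0, 0.5],
    "MlRef": [0.0, 1.0, 0.5],
}

steps = [
    (["HighRef"], ["LowRef"]),  # trace toward higher expertise
    (["MlRef"], ["DbRef"]),     # then trace toward the new domain
]
path = browse("AuthorA", steps, embeddings)
```

Reordering or extending `steps` gives a different browsing path, which is the kind of flexibility the text describes.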

Other sources

This paper is related to the benchmark below

KG20C: A scholarly knowledge graph benchmark dataset

Generated from the Microsoft Academic Graph (MAG). Queries answered with embeddings:

Who may work at this organization?
Where may this author work at?
Who may write this paper?
What papers may this author write?
Which papers may cite this paper?
Which papers may this paper cite?
Which papers may belong to this domain?
Which may be the domains of this paper?
Which papers may publish in this conference?
Which conferences may this paper publish in?

https://www.kaggle.com/tranhungnghiep/kg20c-scholarly-knowledge-graph/version/88
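Questions of this kind are usually framed as link prediction: fix one entity and a relation, then rank candidates for the missing side by a triple score. The sketch below uses a DistMult-style score as a stand-in; it is not claimed to be the model used with the benchmark, and the relation name, entity names, and vectors are hypothetical.

```python
def distmult_score(h, r, t):
    """DistMult-style triple score: sum over dimensions of h * r * t."""
    return sum(a * b * c for a, b, c in zip(h, r, t))


def predict_tails(head, relation, candidates, entities, relations, k=1):
    """Rank candidate tails for the query (head, relation, ?)."""
    h, r = entities[head], relations[relation]
    ranked = sorted(
        candidates,
        key=lambda t: distmult_score(h, r, entities[t]),
        reverse=True,
    )
    return ranked[:k]


def predict_heads(tail, relation, candidates, entities, relations, k=1):
    """Rank candidate heads for the query (?, relation, tail)."""
    t, r = entities[tail], relations[relation]
    ranked = sorted(
        candidates,
        key=lambda h: distmult_score(entities[h], r, t),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings, invented for illustration.
entities = {
    "AuthorA": [1.0, 0.0],
    "AuthorB": [0.0, 1.0],
    "Paper1": [0.9, 0.1],
    "Paper2": [0.2, 0.8],
}
relations = {"author_write_paper": [1.0, 1.0]}

# "What papers may this author write?"
papers = predict_tails("AuthorA", "author_write_paper",
                       ["Paper1", "Paper2"], entities, relations)
# "Who may write this paper?"
authors = predict_heads("Paper2", "author_write_paper",
                        ["AuthorA", "AuthorB"], entities, relations)
```

One design caveat: the DistMult score is symmetric in head and tail, so it cannot tell "cites" from "is cited by"; a model with an asymmetric score would be needed for directional questions such as the two citation queries.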
