
AMIE: Association Rule Mining under Incomplete Evidence in Ontological Knowledge Bases - Article Reading - Year 2013

ABSTRACT: Inductive Logic Programming (ILP) can be used to mine logical rules from the KB. These rules can help deduce and add missing knowledge to the KB. While ILP is a mature field, mining logical rules from KBs is different in two aspects: First, current rule mining systems are easily overwhelmed by the amount of data .... Second, ILP usually requires counterexamples. KBs, however, implement the open world assumption (OWA), meaning that absent data cannot be used as counterexamples.   

[The same problem faced by ML algorithms for Link Prediction]

1. INTRODUCTION  

Making these KBs complete requires great effort to extract facts, check them for correctness, and add them to the KB. However, KBs themselves often already contain enough information to derive and add new facts. ... As for any rule, there can be exceptions, but in the vast majority of cases, the rule will hold.

[Even if the rules do not predict every fact correctly, since there are exceptions, they still hold for the majority. Link Prediction can also generate "wrong" triples.]

Finding such rules can serve four purposes: First, by applying such rules on the data, new facts can be derived that make the KB more complete. Second, such rules can identify potential errors in the knowledge base. ... Third, the rules can be used for reasoning. Many reasoning approaches rely on other parties to provide rules (e.g., [27, 31]). Last, rules describing general regularities can help us understand the data better.

[Would it be possible to derive rules to infer missing context? The case of the preferred rank and the context of a date being current could be an example.]

We focus on RDF-style KBs in the spirit of the Semantic Web, such as YAGO [35], Freebase, and DBpedia [5]. These KBs provide binary relationships in the form of RDF triples. Since RDF has only positive inference rules, these KBs contain only positive statements and no negations.

[Based on triples and does not handle other models; what would be the impact of reification on this work? How are rules generated in KBs with reification? In ML work it has already been acknowledged that reification hampers model generation.]

Furthermore, they operate under the Open World Assumption (OWA). Under the OWA, a statement that is not contained in the KB is not necessarily false; it is just unknown. This is a crucial difference to many standard database settings that operate under the Closed World Assumption (CWA). Consider an example KB that does not contain the information that a particular person is married. Under CWA we can conclude that the person is not married. Under OWA, however, the person could be either married or single.  

[Different paradigms and the counterexample issue. But it still assumes that what is stated is a fact and not a claim that depends on a context, such as the source, and on the context of use to be considered true and useful]

Association rule mining [3] is well-known in the context of sales databases. It can find rules such as "If a client bought beer and wine, then he also bought aspirin". The confidence of such a rule is the ratio of cases where beer and wine was actually bought together with aspirin. Association rule mining inherently implements a closed world assumption: A rule that predicts new items that are not in the database has a low confidence. It cannot be used to (and is not intended to be used to) add new items to the database.

[How to complete the KG with new statements? Not with association rules, but they could still indicate potential errors and incompleteness in the existing statements. If X% of the claims of the form <subject1 predicate1 object1> also imply <subject2 predicate2 object2>, then the absence of the second triple would be an incompleteness. In Wikidata there is the constraint "item requires statement"]
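To make the confidence ratio concrete, here is a minimal sketch over a made-up transaction table (the items come from the quoted beer/wine/aspirin example; the data itself is invented for illustration):

```python
# Toy illustration of association-rule confidence:
# confidence({beer, wine} -> {aspirin}) =
#   (#transactions with beer, wine AND aspirin) / (#transactions with beer and wine)

transactions = [
    {"beer", "wine", "aspirin"},
    {"beer", "wine"},
    {"beer", "wine", "aspirin", "bread"},
    {"wine", "cheese"},
]

body = {"beer", "wine"}
head = {"aspirin"}

support_body = sum(1 for t in transactions if body <= t)          # 3
support_rule = sum(1 for t in transactions if (body | head) <= t)  # 2
confidence = support_rule / support_body                           # 2/3
print(confidence)
```

Note how the measure only looks at what is present in the database: a prediction absent from the data simply lowers the ratio, which is exactly the closed-world behavior the paragraph describes.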

ILP approaches deduce logical rules from ground facts. Yet, current ILP systems cannot be applied to semantic KBs for two reasons: First, they usually require negative statements as counter-examples. Semantic KBs, however, usually do not contain negative statements. The semantics of RDF are too weak to deduce negative evidence from the facts in a KB. Because of the OWA, absent statements cannot serve as counter-evidence either.

[OWA and counterexamples. Training ML models with negative examples is also a problem. See here]

Second, today's ILP systems are slow and cannot handle the huge amount of data that KBs provide.  

[Scalability and the efficiency (performance) problem]

More precisely, our contributions are as follows: (1) A method to simulate negative examples for positive KBs (the Partial Completeness Assumption). (2) An algorithm for the efficient mining of rules. (3) A system, AMIE, that mines rules on millions of facts in a few minutes without the need for parameter tuning or expert input.

[The PCA in relation to the OWA and the CWA]

2. RELATED WORK  

Logical Rule Mining. Sherlock [32] is an unsupervised ILP method to learn first-order Horn clauses from a set of extracted facts for a given target relation. It uses probabilistic graphical models (PGMs) to infer new facts. It tackles the noise of the extracted facts by extensive filtering in a preprocessing step and by penalizing longer rules in the inference part. For mining the rules, Sherlock uses 2 heuristics: statistical significance and statistical relevance.  

[This system was already discussed in Casanova's course]

These approaches are not tailored to deal with large KBs under the Open World Assumption.

We compare our system, AMIE, to WARMR and ALEPH, which are the only ones available for download. Our experiments do not only show that these systems mine less sensible rules than AMIE, but also that it takes them much longer to do so.  

Generating Schemas. In this paper, we aim to generate Horn rules on a KB. Other approaches use rule mining to generate the schema or taxonomy of a KB. [7] applies clustering techniques based on context vectors and formal concept analysis to construct taxonomies.  Some approaches use rule mining for ontology merging and alignment  ....

3. PRELIMINARIES  

The facts of an RDF KB can usually be divided into an A-Box and a T-Box. While the A-Box contains instance data, the T-Box is the subset of facts that define classes, domains, ranges for predicates, and the class hierarchy. Although T-Box information can also be used by our mining approach, we are mainly concerned with the A-Box, i.e., the set of facts relating one particular entity to another.

[AMIE's rule mining does not use the schema]

Manual inspection shows, however, that relations in semantic KBs tend to be more functional than inverse functional. Intuitively, this allows us to consider a fact r(x, y) as a fact about x.

[It is not about the inverse functional property]

Rules. An atom is a fact that can have variables at the subject and/or object position. A (Horn) rule consists of a head and a body, where the head is a single atom and the body is a set of atoms.  An instantiation of a rule is a copy of the rule, where all variables have been substituted by entities. A prediction of a rule is the head atom of an instantiated rule if all body atoms of the instantiated rule appear in the KB. For example, the above rule can predict isCitizenOf(Lisa,USA) if the KB knows a parent of Lisa (hasChild(Elvis,Lisa)) who is American (isCitizenOf(Elvis,USA)).  

[IF-THEN rules]
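The prediction mechanism from the excerpt can be sketched in a few lines. This is not AMIE's implementation, just an illustration of how the rule hasChild(x, y) ∧ isCitizenOf(x, z) ⇒ isCitizenOf(y, z) produces the head atom of every instantiation whose body atoms are in the KB:

```python
# Toy KB as a set of (subject, relation, object) triples (data from the
# paper's Elvis/Lisa example).
kb = {
    ("Elvis", "hasChild", "Lisa"),
    ("Elvis", "isCitizenOf", "USA"),
}

def predictions(kb):
    # Instantiate hasChild(x, y) ∧ isCitizenOf(x, z) and emit isCitizenOf(y, z)
    preds = set()
    for (x, r1, y) in kb:
        if r1 != "hasChild":
            continue
        for (x2, r2, z) in kb:
            if r2 == "isCitizenOf" and x2 == x:
                preds.add((y, "isCitizenOf", z))
    return preds

print(predictions(kb))  # {('Lisa', 'isCitizenOf', 'USA')}
```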

Language Bias. Like most ILP systems, AMIE uses a language bias to restrict the search space. We say that two atoms in a rule are connected if they share a variable or an entity. A rule is connected if every atom is connected transitively to every other atom of the rule. AMIE mines only connected rules, i.e., it avoids constructing rules that contain unrelated atoms. We say that a rule is closed if every variable in the rule appears at least twice. Such rules do not predict merely the existence of a fact (e.g., diedIn(x, y) ⇒ ∃z : wasBornIn(x, z)), but also concrete arguments for it (e.g., diedIn(x, y) ⇒ wasBornIn(x, y)). AMIE mines only closed rules. We allow recursive rules that contain the head relation in the body.

[Rules with connected atoms]

4. MINING MODEL  

We distinguish 4 types of facts: True facts that are known to the KB (KBtrue), true facts that are unknown to the KB (NEWtrue), facts that are known to be false in the KB (KBfalse), and facts that are false but unknown to the KB (NEWfalse). The rule will make certain predictions (blue circle). These predictions can be known to be true (A), known to be false (C), or unknown (B and D). When they are unknown to the KB, they can still be true (B) or false (D) with respect to the real world.

[Four regions: what is in the KG and what is not, what was predicted correctly and incorrectly. But for the trust layer it is not enough to be in the KG; it must be contextualized: Dual Open World Assumption]

[The scope for defining something as true is missing, and this truth also depends on the task, which is not represented in the KG]

Goal. Our goal is to find rules that make true predictions that go beyond the current KB. In the figure, we wish to maximize the area B and to minimize the area D. There are two obvious challenges in our context: First, the areas NEWtrue and NEWfalse are unknown. So if we wish to maximize B at the expense of D, we are operating in an area outside our KB. We would want to use the areas KBtrue and KBfalse to estimate the unknown area. This, however, leads to the second challenge: Semantic KBs do not contain negative evidence. Thus, the area KBfalse is empty.

[How to measure the success of the approach without a gold standard and without knowing the KBfalse space?]

[In Wikidata there is the option of creating a claim that represents a negation, e.g., <Fulano child no_value>, and also of stating that the value for a given property or relation is unknown but must exist, <Fulano bornIn unknown_value>. See here. This feature could be used for the rule "If X% of the claims of the form <subject1 predicate1 object1> also imply <subject1 predicate2 object2>", or when the constraint "item requires statement" exists as (predicate1, predicate2). One could then create <subject1 predicate2 unknown_value>]

We will now present different measures that address these challenges.  

Support. The support of a rule quantifies the number of correct predictions, i.e., the size of A.  

Head Coverage. Support is an absolute number. ... Therefore, we propose to use the notion of head coverage. This is the proportion of pairs from the head relation that are covered by the predictions of the rule  
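The two measures can be sketched on a toy KB. This is an illustration under the paper's definitions, not AMIE's code: support counts the head-relation facts that the rule body predicts and that are in the KB (area A), and head coverage normalizes that by the total number of head-relation facts:

```python
# Rule: hasChild(x, y) ∧ isCitizenOf(x, z) ⇒ isCitizenOf(y, z). Toy data.
kb = {
    ("Elvis", "hasChild", "Lisa"),
    ("Elvis", "isCitizenOf", "USA"),
    ("Lisa", "isCitizenOf", "USA"),        # predicted and known: area A
    ("Angela", "isCitizenOf", "Germany"),  # head fact the rule does not cover
}

def body_predictions(kb):
    # (y, z) pairs predicted by the rule body
    return {
        (y, z)
        for (x, r1, y) in kb if r1 == "hasChild"
        for (x2, r2, z) in kb if r2 == "isCitizenOf" and x2 == x
    }

head_facts = {(s, o) for (s, r, o) in kb if r == "isCitizenOf"}
support = len(body_predictions(kb) & head_facts)  # 1
head_coverage = support / len(head_facts)         # 1/3
print(support, head_coverage)
```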

Negative Examples. The central challenge of our setting is to provide counter-examples for the rule mining. These can take the role of KBfalse, so that we can estimate the areas NEWtrue and NEWfalse.  There are several approaches to this problem: The standard confidence, the standard positive-only learning evaluation score of ILP, and our new partial completeness assumption.

Standard Confidence. The standard confidence measure takes all facts that are not in the KB (i.e., NEWtrue and NEWfalse) as negative evidence. [Wouldn't this be Closed World? YES]
The standard confidence is blind to the distinction between "false" and "unknown". Thus, it implements a closed world setting.

Positive-Only Learning. For cases where the KB does not contain negative examples, Muggleton has developed a positive-only learning evaluation score for ILP ...  

Partial Completeness. We propose to generate negative evidence by the partial completeness assumption (PCA). This is the assumption that
if r(x, y) ∈ KBtrue for some x, y, then ∀y′ : r(x, y′) ∈ KBtrue ∪ NEWtrue ⇒ r(x, y′) ∈ KBtrue.
In other words, we assume that if the database knows some r-attribute of x, then it knows all r-attributes of x. This assumption is certainly true for functional relations r, such as birth dates, capitals, etc.

[The triples (s, p, o) are true in KBtrue ∪ NEWtrue, and false if they do not appear in KBtrue ∪ NEWtrue but there is a triple (s, p, o′) that belongs to KBtrue. But in the case of multiple values, how can this be assumed? Considering the single-value and single-best-value constraints? But AMIE does not use the schema to mine the rules, only the instances.]

PCA Confidence. Under the PCA, we normalize the confidence not by the entire set of facts, but by the set of facts of which we know that they are true, together with the facts of which we assume that they are false.  

[The rule's confidence metric depends on this definition]
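The contrast between the two confidence measures can be sketched as follows (a toy illustration, not AMIE's implementation): under the PCA, a prediction isCitizenOf(y, z) only counts in the denominator if the KB knows at least one citizenship for y:

```python
# Rule: hasChild(x, y) ∧ isCitizenOf(x, z) ⇒ isCitizenOf(y, z). Toy data.
kb = {
    ("Elvis", "hasChild", "Lisa"),
    ("Elvis", "isCitizenOf", "USA"),
    ("Lisa", "isCitizenOf", "USA"),        # confirmed prediction
    ("Angela", "hasChild", "Bob"),
    ("Angela", "isCitizenOf", "Germany"),  # predicts isCitizenOf(Bob, Germany)
}

predictions = {
    (y, z)
    for (x, r1, y) in kb if r1 == "hasChild"
    for (x2, r2, z) in kb if r2 == "isCitizenOf" and x2 == x
}  # {(Lisa, USA), (Bob, Germany)}

head = {(s, o) for (s, r, o) in kb if r == "isCitizenOf"}
known_subjects = {s for (s, _) in head}
positives = predictions & head

# Standard confidence: every prediction outside the KB is a counterexample.
std_conf = len(positives) / len(predictions)
# PCA confidence: only predictions about subjects with a known citizenship
# count; Bob has none, so (Bob, Germany) is "unknown", not "false".
pca_body = {(y, z) for (y, z) in predictions if y in known_subjects}
pca_conf = len(positives) / len(pca_body)
print(std_conf, pca_conf)  # 0.5 1.0
```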

5. AMIE 

After having outlined the basic definitions and the mining model in Sections 3 and 4, we now outline the core algorithm of our framework and its implementation.

5.1 Algorithm

Goal. Our goal is to mine rules of the form defined in Section 3. One of the main problems of any mining approach is to find an efficient way to explore the search space. The naive algorithm of enumerating all possible rules is infeasible for large KBs. Hence, we explore the search space by iteratively extending rules by mining operators.

Mining Operators. We see a rule as a sequence of atoms. The first atom is the head atom and the others are the body atoms.  
1. Add Dangling Atom (OD)
This operator adds a new atom to a rule. The new atom uses a fresh variable for one of its two arguments. The other argument (variable or entity) is shared with the rule, i.e., it occurs in some other atom of the rule.  
[ rel1(A, B) || rel2(B, Y) ... atoms connected through B]
[ rel1(A, b) || rel2(b, Y) ... atoms connected through the entity b]

2. Add Instantiated Atom (OI )
This operator adds a new atom to a rule that uses an entity for one argument and shares the other argument (variable or entity) with the rule.  
[ rel1(A, B) || rel2(B, y) ... atoms connected through B]
[ rel1(A, b) || rel2(b, y) ... atoms connected through the entity b]

3. Add Closing Atom (OC)
This operator adds a new atom to a rule so that both of its arguments are shared with the rule.  
[ rel1(a, B) || rel2(B, a) ... atoms connected through B and the entity a]
[ rel1(A, b) || rel2(b, A) ... atoms connected through the entity b and A]

Algorithm. We mine rules with Algorithm 1.
The algorithm maintains a queue of rules, which initially just contains the empty rule.
The algorithm iteratively dequeues a rule from the queue. If the rule is closed (see Section 3), it is output; otherwise it is not.
Then, the algorithm applies all operators to the rule and adds the resulting rules to the queue (unless they are pruned out, see below).
This process is repeated until the queue is empty.

[Does it apply one operator at a time?]
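The queue-driven exploration described above can be sketched generically. This is a minimal, hypothetical rendering of the loop, not the actual Algorithm 1: `refine` stands for the three operators applied together, and `prune` for the head-coverage cutoff:

```python
from collections import deque

def mine(initial_rules, refine, is_closed, prune, max_len=3):
    """Breadth-first rule exploration: dequeue, output closed rules,
    refine with the mining operators, enqueue unpruned refinements."""
    queue = deque(initial_rules)
    output = []
    while queue:
        rule = queue.popleft()
        if is_closed(rule):
            output.append(rule)
        if len(rule) < max_len:              # bound the rule length
            for refined in refine(rule):     # O_D, O_I and O_C go here
                if not prune(refined):
                    queue.append(refined)
    return output
```

A trivial usage, with rules mocked as tuples: `mine([(0,)], lambda r: [r + (x,) for x in (0, 1)], lambda r: len(r) == 2, lambda r: False, max_len=2)` explores one level of refinements and outputs the two length-2 "rules".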


We parallelize this process by maintaining a centralized queue, from which the threads dequeue and enqueue.

[Data structure optimization]

We do not feed predictions of the rules back into the KB. All measures (such as confidence and support) are always computed on the original KB.  

[The new triples could generate further triples. When an inference engine is applied, this repeats until no new triple is inferred.]

Pruning. If executed naively, our algorithm will have prohibitively high run-times. ... We first observe that we are usually not interested in rules that cover only very few facts of the head relation. Rules that cover, for example, less than 1% of the facts of the head relation can safely be assumed to be marginal. Therefore, we set θ = 0.01 as a lower bound for the head coverage.

[Early pruning of rules with low scores]

Projection Queries. No matter what operator is applied in particular, the algorithm needs to choose a relation for the new atom that is added to the rule. In addition, the instantiation operator OI also allows the choice of an entity. In order to select only relations and entities that will fulfill the head coverage constraint, we rely on the KB to answer projection queries.  

[Queries to identify the relations and entities that can be used in the "best" rules]
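The idea of a projection query can be sketched on a toy KB. This is an illustrative reading, not AMIE's SQL/SPARQL formulation: for each candidate relation r, count how many facts of the head relation would be covered if the dangling atom r(X, W) were added to the rule:

```python
from collections import Counter

kb = {
    ("Elvis", "livesIn", "USA"),
    ("Elvis", "isCitizenOf", "USA"),
    ("Angela", "livesIn", "Germany"),
    ("Angela", "isCitizenOf", "Germany"),
    ("Bob", "isCitizenOf", "Canada"),
}

# Head: isCitizenOf(X, Y); candidate dangling atom: r(X, W) for each r.
counts = Counter()
for (x, r_head, y) in kb:
    if r_head != "isCitizenOf":
        continue
    # Relations that would connect to this head fact through X
    for r in {r for (s, r, o) in kb if s == x and r != "isCitizenOf"}:
        counts[r] += 1
print(counts)  # Counter({'livesIn': 2})
```

Only relations whose count clears the head-coverage bound need to be considered, which is why answering these grouped counts efficiently matters so much.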

5.2 Implementation

SQL and SPARQL. Projection queries are essential for the efficiency of our system. Yet, standard database implementations do not provide special support for these types of queries. Our experience shows that already running the non-nested query on a database of a few million facts can easily take several minutes on an off-the-shelf RDBMS.

In-Memory Database. We have implemented a vanilla in-memory database for semantic KBs. Our implementation indexes the facts aggressively with one index for each permutation of subject, relation, and object.  

[This was optimized in version 3]
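The aggressive indexing idea can be sketched as follows (a simplified illustration, not AMIE's actual data structure): one hash-based index per permutation of subject, relation, and object, so that any triple pattern with bound positions becomes a dictionary lookup:

```python
from collections import defaultdict
from itertools import permutations

class InMemoryKB:
    def __init__(self, triples):
        # One index per permutation of (subject, relation, object): 6 in total.
        self.orders = list(permutations(range(3)))
        self.indexes = {order: defaultdict(set) for order in self.orders}
        for t in triples:
            for order in self.orders:
                a, b, c = (t[i] for i in order)
                self.indexes[order][(a, b)].add(c)

    def objects(self, subject, relation):
        # Pattern (subject, relation, ?o), answered via the (S, R, O) index.
        return self.indexes[(0, 1, 2)][(subject, relation)]

kb = InMemoryKB([
    ("Elvis", "isCitizenOf", "USA"),
    ("Elvis", "hasChild", "Lisa"),
])
print(kb.objects("Elvis", "isCitizenOf"))  # {'USA'}
```

The trade-off is memory for speed: six indexes multiply the storage, but every lookup needed by the mining operators is constant-time on average.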

Summary. We have identified projection queries as the crucial type of queries for rule mining. Since standard database systems and standard SPARQL systems provide no specifically tuned support for these queries, we have implemented a vanilla in-memory database, which has specific support for projection queries.  

[Would these be Look Up queries?]

6. EXPERIMENTS

6.1 Overview

Settings. By default, AMIE finds all rules whose head coverage exceeds the default threshold of θ = 1%. AMIE ranks the resulting rules by decreasing PCA confidence. There is no need to deviate from this default configuration when a user runs AMIE. There are no parameters to tune. All experiments with AMIE on all datasets are run in this setting, unless otherwise mentioned.

[Requires no configuration tuning]

Knowledge Bases. We run our experiments on different KBs. In all cases, we removed the rdf:type relationship, because it inflates the size of the KBs. We are aware that the rdf:type relationship can be very helpful for rule mining. However, currently no approach (including ours) makes specific use of it. ... Furthermore, we removed all facts with literals (numbers and strings) from the KBs.

[ML approaches to Link Prediction also remove triples with literals]

Evaluations. In all experiments, our goal is twofold: First, we want to produce as many predictions as possible beyond the current KB. Second, the percentage of correct predictions shall be as large as possible. The particular challenge is that we want to evaluate predictions that go beyond the current KB. We are not interested in describing the existing data, but in generating new data. ... Then we compare the remaining predicted facts to the successor of that dataset (YAGO2s [34]). A prediction is "correct" if it appears in the newer KB. A prediction is "incorrect" if it has a highly functional or highly inverse functional relation and contradicts an existing fact in the newer KB, e.g., a different birth place. 

Outlook. We note that with the project of predicting beyond current knowledge, we are entering a new, and very risky area of research. We do not expect Horn rules to have extraordinary precisions in the unknown region. Rules can only yield hypotheses about possible facts.  

[Infer new claims, not facts]

6.2 AMIE vs. WARMR and ALEPH  
6.3 AMIE with Different Metrics  
6.4 AMIE on Different Datasets  

7. CONCLUSION

In this paper, we have presented an approach for mining Horn rules on large RDF knowledge bases. We have introduced a formal model for rule mining under the Open World Assumption, a novel measure to simulate counter-examples, and a scalable algorithm for the mining. In contrast to state of-the-art approaches, our system (AMIE) requires no input other than the KB, and does not need configurations or parameter tuning.  

In our future work, we plan to consider also the T-Box of the KB in order to produce more precise rules.

[Use the schema]
