
A Tripartite Perspective on GraphRAG

 arXiv:2504.19667v1 [cs.LG] 28 Apr 2025

 

Large Language Models (LLMs) have shown remarkable capabilities across various domains, yet they struggle with knowledge-intensive tasks in areas that demand factual accuracy, such as in industrial automation and healthcare. Key limitations include their tendency to hallucinate, lack of source traceability (provenance), and challenges in timely knowledge updates. Retrieval Augmented Generation (RAG) techniques have attempted to address these issues by incorporating external knowledge, but they face their own limitations ... Combining language models with knowledge graphs (GraphRAG) offers promising avenues for overcoming these deficits. However, a major challenge lies in creating such a knowledge graph in the first place.

Building the KG with an LLM only pushes the problem into the KG.

While language models (LLMs) have demonstrated impressive capabilities, they still have their limitations in knowledge-intensive tasks, especially in areas where factually correct information is essential ...

Factual, but CoaKG contains contextualized claims.

Another central problem is the lack of provenance, i.e. the lack of traceability of the information source, which makes it difficult to assess the trustworthiness of its output.

The KG needs to carry provenance, since it is not the primary source.
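As a rough illustration of what "the KG carries provenance" could mean in practice, here is a minimal sketch in which every edge stores the source it was extracted from. The `Statement` schema, the `sources_for` helper, and the sample data are all hypothetical; they come neither from the paper nor from CoaKG.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One KG edge plus the source it was extracted from (hypothetical schema)."""
    subject: str
    predicate: str
    obj: str
    source: str  # where the claim came from, e.g. a document identifier


def sources_for(statements, subject, predicate):
    """Return the distinct sources backing any statement with this subject and predicate."""
    return sorted({s.source for s in statements
                   if s.subject == subject and s.predicate == predicate})


# Made-up sample data, for illustration only.
kg = [
    Statement("pump_A", "max_pressure", "16 bar", source="manual_v2.pdf#p12"),
    Statement("pump_A", "max_pressure", "10 bar", source="datasheet_2019.pdf#p3"),
    Statement("pump_A", "located_in", "hall_3", source="plant_layout.csv#row8"),
]

print(sources_for(kg, "pump_A", "max_pressure"))
```

Because the source travels with each edge, two conflicting values for the same property can coexist in the graph and still be traced back to where each one came from.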

A major challenge of common RAG approaches is the selection of appropriate text chunks w.r.t. a given query, chiefly because of the reliance on embedding-based similarity for textual chunk selection, which can be unreliable or lack coverage due to fluctuating chunk sizes and topic diversity within queries. To this end, our approach provides a natural remedy. Applying a transformation to the constructed knowledge graph, we formulate LLM prompt creation as an unsupervised node classification problem.
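To make the chunk-selection step concrete, here is a minimal sketch of the embedding-based baseline that the passage criticizes. It is not the paper's method: `embed` is a toy bag-of-words stand-in for a real embedding model, and all function names and sample chunks are assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_k_chunks(query, chunks, k=2):
    """Rank chunks by similarity to the query and keep the k best."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Made-up sample chunks, for illustration only.
chunks = [
    "the pump pressure limit is sixteen bar",
    "the pump is located in hall three",
    "maintenance is scheduled every six months",
]
print(top_k_chunks("what is the pressure limit of the pump", chunks, k=1))
```

Even in this toy form, the ranking depends entirely on how the text was split into chunks and on how many topics a single query mixes, which is the weakness the quoted passage points to.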

GQL would not be needed in the CoaKG Engine if an LLM handled queries in natural language. But would the LLM be able to deduce explicit context, such as temporal and spatial relations, that is not represented in the graph? Could the Codomain Algebra be replaced by prompt engineering?

 
