
Wikidata: The Making Of - Article Reading Notes

Video -> https://youtu.be/P3-nklyrDx4

Denny Vrandečić, Lydia Pintscher, and Markus Krötzsch. 2023. Wikidata: The Making Of. In Companion Proceedings of the ACM Web Conference 2023 (WWW '23 Companion). Association for Computing Machinery, New York, NY, USA, 615–624. https://doi.org/10.1145/3543873.3585579

1 INTRODUCTION

(5) Verifiability, not truth: Wikidata relies on external sources for confirmation; statements can come with references; conflicting or debated standpoints may co-exist

The data collected in most of these projects can also be considered knowledge graphs, i.e., structured data collections that encode meaningful information in terms of (typed, directed) connections between concepts. Nevertheless, the actual data sets are completely different, both in their vocabulary and their underlying data model. In comparison to other approaches, Wikidata has one of the richest graph formats, where each statement (edge in the graph) can have user-defined annotations (e.g., validity time) and references.
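To make this statement model concrete, here is a minimal sketch in Python of a single Wikidata statement carrying both a qualifier (validity time) and a reference. The dictionary shape only loosely mirrors Wikidata's JSON export; the field names are simplified for illustration, and the officeholder QID is a placeholder.

```python
# A simplified sketch of one Wikidata statement (an edge in the graph).
# The shape loosely mirrors Wikidata's JSON data model; the field names
# are illustrative, not the exact serialization.
statement = {
    "subject": "Q64",           # Berlin
    "property": "P6",           # head of government
    "value": "Q_OFFICEHOLDER",  # placeholder QID for the officeholder
    "qualifiers": {
        "P580": "2014-12-11",   # start time: when the claim became valid
    },
    "references": [
        {"P854": "https://www.berlin.de"},  # reference URL backing the claim
    ],
}

# "Verifiability, not truth": conflicting statements may co-exist in the
# graph, each judged by the references it carries rather than by fiat.
def has_reference(stmt: dict) -> bool:
    return bool(stmt.get("references"))

print(has_reference(statement))  # True
```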

3 SEMANTIC WIKIPEDIA

4 MOVING SIDEWAYS (2005–2010)

5 EVOLUTION OF AN IDEA

Another important realization was that verifability would have to play a central role.

The project developed ideas for handling contradicting and incomplete knowledge, and analyzed Wikipedia to understand the necessity for such approaches [63].

6 PROJECT PROPOSAL

Thanks to the ongoing collaboration in RENDER, Pavel Richter, then Executive Director of Wikimedia Deutschland, took the proposal to WMDE’s Board, which decided to accept Wikidata as a new Wikimedia project in June 2011, provided that sufficient funding would be available. For Richter and Wikimedia Deutschland this was a major step, as the planned development team would significantly enlarge Wikimedia Deutschland, and necessitate a sudden transformation of the organization, which Richter managed in the years to come.

While looking for funding, at least one major donor dropped out because the project proposal insisted that the ontology of Wikidata had to be community-controlled, and would be neither pre-defined by professional ontologists nor imported from existing ontologies.

7 EARLY DEVELOPMENT AND LAUNCH

8 EARLY WIKIDATA (2013–2015)

The editor community rallied around the tasks that could be done with the limited functionality and formed task forces (later becoming WikiProjects) to collect and expand data around topics such as countries and Pokémon, or to improve the language coverage for certain languages.

It has been a challenge to make the idea of a knowledge graph accessible and attractive to an audience that is not familiar with the ideas of the Semantic Web. Data is abstract, and it takes creativity and effort to see the potential in linking this data and making it machine-readable. A few key applications were instrumental in sparking excitement by showing what was already possible and what would become possible as Wikidata grew. Chief among the people who made this possible was Magnus Manske, who developed Reasonator,27 an alternative view on Wikidata; Wiri,28 an early question answering demo; and Wikidata Query, the first query tool for Wikidata.

27 https://reasonator.toolforge.org 

28 https://magnus-toolserver.toolforge.org/thetalkpage

WDQS is a Blazegraph-based SPARQL endpoint that gives access to the RDF-ized version [16, 21] of the data in Wikidata in real time, through live updates [37]. Its goal is to enable applications and services on top of Wikidata, as well as to support the editor community, especially in improving data quality.
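As a concrete illustration of what WDQS enables, here is a minimal sketch (not from the paper) that sends a simple SPARQL query to the public endpoint at https://query.wikidata.org/sparql using Python's requests library; the query just lists a handful of sovereign states.

```python
# A minimal sketch: querying the Wikidata Query Service (WDQS).
# The query below is an illustrative example.
import requests

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Ask for a few items that are instances of "sovereign state" (Q3624078).
query = """
SELECT ?country ?countryLabel WHERE {
  ?country wdt:P31 wd:Q3624078 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
"""

response = requests.get(
    WDQS_ENDPOINT,
    params={"query": query, "format": "json"},
    headers={"User-Agent": "wdqs-example/0.1 (reading notes)"},
)
response.raise_for_status()

for row in response.json()["results"]["bindings"]:
    print(row["country"]["value"], "-", row["countryLabel"]["value"])
```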

9 TEENAGE WIKIDATA (2015–2022)

10 OUTLOOK

Indeed, even the original concept has not been fully realized yet. The initial Wikidata proposal (Section 6) was split into three phases: first sitelinks, second statements, third queries. The third phase, though, has not yet been realized. It was planned to allow the community to define queries, to store and visualize the results in Wikidata, and to include these results in Wikipedia. This would have served as a forcing function to increase the uniformity of Wikidata’s structure.

By selecting a flexible, statement-centric data model – inspired by SMW, and in turn by RDF – Wikidata does not enforce a fixed schema upon groups of concepts.

Wikifunctions in turn is envisioned as a wiki-based repository of executable functions, described in community-curated source code. These functions will in particular be used to access and transform data in Wikidata, in order to generate views on the data. These views – tables, graphs, text – can then be integrated into Wikipedia. This is a return to the goals of the original Phase 3, which would increase both the incentives to make the data more coherent, and the visibility and reach of the data as such. This may then lead to improved correctness and completeness of the data, since only data that is used is data that is good (a corollary to Linus’s law of “given enough eyeballs, all bugs are shallow” [54]).
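As a toy illustration of the Wikifunctions idea (entirely hypothetical code, not the Wikifunctions implementation), a community-curated function might turn Wikidata-style data into a textual view for transclusion into Wikipedia:

```python
# A toy sketch of the Wikifunctions idea: a small, reusable function
# that transforms structured Wikidata-style data into a textual view.
# Function name, signature, and figures are hypothetical illustrations.

def render_population(entity_label: str, population: int, as_of: str) -> str:
    """Render a population statement as an English sentence."""
    return f"As of {as_of}, {entity_label} had a population of {population:,}."

# Such a view could be generated per language and transcluded into
# Wikipedia, keeping the underlying data in Wikidata.
print(render_population("Berlin", 3_645_000, "2021"))  # illustrative figure
# -> As of 2021, Berlin had a population of 3,645,000.
```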

Another aspect of Wikidata that we think needs further development is how to more effectively share semantics – within Wikidata itself, with other Wikimedia projects, and with the world in general. Wikidata is not based on a standard semantics such as OWL [22], although community modeling is strongly inspired by some of the expressive features developed for ontologies. The intended modeling of data is communicated through documentation on wikidata.org, shared SPARQL query patterns, and Entity Schemas in ShEx [52]. Nevertheless, the intention of modeling patterns and individual statements often remains informal, vague, and ambiguous. As Krötzsch argued in his ISWC 2022 keynote [32], a single, fixed semantic model cannot be enough for all uses and perspectives required for Wikidata (or the Web as a whole), yet some sufficiently formal, unambiguous, and declarative way of sharing intended interpretations is still needed. A variety of powerful knowledge representation languages could be used for this purpose, but we still lack both infrastructure and best practices to use them effectively in such complex applications.


