Hoda Sepehri Rad and Denilson Barbosa. 2012. Identifying controversial articles in Wikipedia: a comparative study. In Proceedings of the Eighth Annual International Symposium on Wikis and Open Collaboration (WikiSym '12). Association for Computing Machinery, New York, NY, USA, Article 7, 1–10. https://doi.org/10.1145/2462932.2462942
ABSTRACT
Wikipedia articles are the result of the collaborative editing of a diverse group of anonymous volunteer editors, who are passionate and knowledgeable about specific topics. One can argue that this plurality of perspectives leads to broader coverage of the topic, thus benefitting the reader. On the other hand, differences among editors on polarizing topics can lead to controversial or questionable content, where facts and arguments are presented and discussed to support a particular point of view. Controversial articles are manually tagged by Wikipedia editors, and span many interesting and popular topics, such as religion, history, and politics, to name a few.
[The same happens in WD. Some claims are marked as disputed, but others lack this identification]
Recent works have been proposed on automatically identifying controversy within unmarked articles. However, to date, no systematic comparison of these efforts has been made. This is in part because the various methods are evaluated using different criteria and on different sets of articles by different authors, making it hard for anyone to verify the efficacy and compare all alternatives. We provide a first attempt at bridging this gap. We compare five different methods for modelling and identifying controversy, and discuss some of the unique difficulties and opportunities inherent to the way Wikipedia is produced.
[Comparison of methods for automatically detecting controversies in WP articles]
1. INTRODUCTION
(WP) openness is both essential for quality control in Wikipedia and a source of concern with respect to content quality. One aspect of quality is credibility: although Wikipedia is generally trustworthy (because of good intentions of most users and because of a dedicated group of administrators that always monitor edits to prevent vandalism and revert inappropriate revisions), it still happens that some revisions are not necessarily trustworthy.
[Concern with the veracity of what is in WP. WP articles are generally trustworthy, but that does not prevent controversies]
Another concern, which is the focus of this paper, is that of controversy. Controversy arises as soon as there are sufficiently different and/or contradictory views about a subject, especially when it is hard or even impossible for one to judge where the truth lies. Controversy is unavoidable, as it comes naturally in many topics such as religion, history and politics. Often, opposing views of editors lead to polarization of opinions, resulting in heated disputes rooted in irreconcilable differences in background, belief and perspective. Note that controversy and trust are not synonyms: while it is reasonable to deem a controversial article as untrustworthy, the reverse is not necessarily the case (as there are other reasons that make an article untrustworthy). Unlike trust, controversy arises from the sequence of actions and edits in the article, and, thus, does not apply to a single revision of an article.
[Controversy may show up in the edit history of the WP article and not only in its current version. This reinforces the definition that Truth is Relative]
2. BACKGROUND
2.1 Controversy in Wikipedia
The New Oxford American Dictionary defines controversy as “disagreement, typically when prolonged, public, and heated”. In the context of Wikipedia, this loosely translates into articles whose edit histories contain one or more “edit wars” amongst editors. Typically, these articles infringe Wikipedia’s Neutral Point of View (NPOV) policy as well ...
[Neutral Point of View policy: understand what it is about. If ideology and bias exist in everything, how would it be possible to be truly Neutral?]
[Editing from a neutral point of view (NPOV) means representing fairly, proportionately, and as far as possible without bias, all significant views that have been published by reliable sources. All Wikipedia articles and other encyclopedic content must be written from a neutral point of view. NPOV is a fundamental principle of Wikipedia and of other Wikimedia projects. This is non-negotiable and expected of all articles and all editors.
"Neutral point of view", "Verifiability", and "No original research" are Wikipedia's three core content policies. Jointly, these policies determine the type and quality of material that is acceptable in Wikipedia articles. They should not be interpreted in isolation from one another, and editors should familiarize themselves with all three. The principles upon which this policy is based cannot be superseded by other policies or guidelines, or by editors' consensus.]
The issues of bias and promotion of individual points of view in Wikipedia have been studied recently. Flock et al. [7] point out several problems, such as resistance towards new content from occasional editors, the difficulty of changing the content of stable and mature articles, and cases of strong feelings of ownership and defensive behaviour on the part of some editors, arguing that all of them can affect the diversity and neutrality of points of view in Wikipedia. Brandes et al. [5] consider factors that lead to editor drop-out and found that contributing to controversial articles increases the probability of drop-out: editors stop contributing because they are frustrated by long debates and by having to fight vandalism and edit wars.
[Trying to impose a neutral view also generates conflicts. How should the diversity of views be handled? Do editors want to be "right"? Do they want to "win" the discussion? Would this neutral view presuppose an Absolute Truth?]
2.2 Problem Definition
This paper provides a comparative study of five different methods for modelling controversy in Wikipedia in a framework of identifying controversial articles. More specifically, for studying these methods, we consider the problem of binary classification of articles, where the goal is to label articles as controversial or not. Our evaluation for this task is based on using articles listed in the list of controversial articles in Wikipedia. We considered articles on this list as positive examples, and used articles that did not appear on this list and did not have any history of having dispute tags or long, debated discussions on their discussion pages as our negative examples.
[There is a list of articles manually classified as controversial, and this would serve as a "gold standard"]
By viewing controversy identification as a binary classification task, we need to convert the continuous scores obtained from the output of some of the methods we study. Scores in all of these methods are numeric functions, where higher values indicate more controversy. Mapping continuous outputs to binary outputs is a common problem in the medical domain such as in diagnosing diseases.
[Convert the classifier's output, which is a score, into a binary value: controversial or not controversial]
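The excerpt does not say how the cutoff between "controversial" and "not controversial" is chosen. As a minimal sketch, assuming labelled scores are available, one option borrowed from diagnostic testing is to pick the cutoff that maximizes Youden's J (true positive rate minus false positive rate). The function names and the toy data below are illustrative assumptions, not taken from the paper.

```python
def best_threshold(scores, labels):
    """Return the cutoff that maximizes Youden's J = TPR - FPR.

    scores: controversy scores (higher means more controversial).
    labels: 1 for controversial, 0 for non-controversial.
    Assumes both classes are present in `labels`.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_j, best_cut = -1.0, None
    for cut in sorted(set(scores)):
        # An article is predicted controversial when its score is >= cut.
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut


def is_controversial(score, cut):
    """Map a continuous score to a binary label using the chosen cutoff."""
    return score >= cut


# Toy data (hypothetical): two non-controversial and two controversial articles.
cut = best_threshold([1, 2, 3, 4], [0, 0, 1, 1])
```

With the toy data, the selected cutoff separates the two classes perfectly; on real scores the classes overlap and the cutoff trades false positives against false negatives.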
3. EXAMINED METHODS
We now discuss the five methods we compare. What is common in all of these methods is that they all rely on simple numeric features extracted from the revision history of the article or article discussion page without analyzing the textual content of the pages.
[The classifiers are based on numeric features (embeddings) extracted from the revision histories, not from the text content of the article itself]
4. DISCRIMINATIVE POWER
We compare different models in terms of their effectiveness in distinguishing controversial from non-controversial articles, which we refer to as the discriminative power of the methods. The results we report were obtained on the same dataset of 240 controversial and 240 non-controversial articles used in a previous work.
5. TRAINING COST
This section studies the effect of the amount of training data on the accuracy of the methods. The costs of collecting training data and training a model are usually very high as they typically involve human efforts. Therefore, it is natural to seek trade-offs between accuracy and amount of training data. It should be noted that the cost of applying a model is not limited only to the cost of providing training samples and can be extended to the cost of complexity and the availability of required resources to extract features and statistics related to that model.
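The trade-off between accuracy and amount of training data can be pictured as a learning curve. The sketch below is only an illustration under stated assumptions: the "model" is a hypothetical one-feature threshold classifier (midpoint between the class means), and the balanced sampling scheme is my choice, not the paper's experimental protocol.

```python
import random


def fit_threshold(train):
    """Fit a toy model: the midpoint between the mean scores of the two classes.

    train: list of (score, label) pairs containing both labels 0 and 1.
    """
    pos = [s for s, y in train if y == 1]
    neg = [s for s, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(cut, data):
    """Fraction of (score, label) pairs classified correctly by `score >= cut`."""
    return sum((s >= cut) == bool(y) for s, y in data) / len(data)


def learning_curve(train, test, per_class_sizes, seed=0):
    """Return [(training set size, test accuracy), ...] for growing training sets.

    Each training set holds n positive and n negative examples, so both
    classes are always represented.
    """
    rng = random.Random(seed)
    pos = [ex for ex in train if ex[1] == 1]
    neg = [ex for ex in train if ex[1] == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    curve = []
    for n in per_class_sizes:
        cut = fit_threshold(pos[:n] + neg[:n])
        curve.append((2 * n, accuracy(cut, test)))
    return curve


# Hypothetical, well-separated toy scores.
train = [(8, 1), (9, 1), (10, 1), (1, 0), (2, 0), (3, 0)]
test = [(0, 0), (11, 1)]
curve = learning_curve(train, test, per_class_sizes=[1, 2, 3])
```

On real data the curve would typically rise and then flatten, and the point where it flattens indicates how many labelled articles are worth collecting.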
6. MONOTONICITY
We say that a controversy score C(·) fulfills the monotonicity criterion if it assigns a lower or equal score to an article p when some parts of the article are removed from it. That is, if p′ is obtained by excluding some parts of the content of p, then C(p′) ≤ C(p) must hold for C to satisfy the monotonicity criterion. The intuition behind monotonicity is that removing some parts of an article cannot increase the absolute global controversy level of that article, as doing so can only remove some of the sources of dispute.
[But what if it is based on the edit history and not on the content itself????]
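The criterion can be probed empirically by removing parts and comparing scores. The sketch below assumes an article is represented as a list of parts (for example sections, or revisions when the score is history-based) and that `score` is any function over such a list; both are illustrative assumptions. A check like this can only find counterexamples, it cannot prove that a score is monotone.

```python
from itertools import combinations


def find_monotonicity_violation(score, parts, max_removed=2):
    """Search for a reduced article p' with score(p') > score(p).

    Tries every way of removing up to `max_removed` parts and returns the
    first reduced list that scores higher than the full one, or None if no
    violation is found among the tested removals.
    """
    full = score(parts)
    for k in range(1, max_removed + 1):
        for removed in combinations(range(len(parts)), k):
            reduced = [p for i, p in enumerate(parts) if i not in removed]
            if score(reduced) > full:
                return reduced
    return None


def total(parts):
    """Toy additive score: removing a non-negative part can never raise it."""
    return sum(parts)


def mean(parts):
    """Toy averaged score: removing a low-valued part raises it."""
    return sum(parts) / len(parts) if parts else 0.0


# Hypothetical per-part dispute counts.
parts = [5, 1, 1]
```

With these toy scores, the additive one passes the check while the averaged one fails it, since dropping a quiet part raises the average. This suggests why scores that normalize by article or history size may be non-monotone.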
7. DISCUSSION
Wikipedia is one of the well-known examples of social media which has been studied based on different aspects in recent years. Analyzing controversy in this medium faces a set of challenges which are more or less specific to this domain. In particular, identifying controversial cases based on the revision history of articles requires detecting arguments and opinions that can be expressed in a more implicit way. An editor might express his disagreement by simply deleting some content without replacing it with an alternative, or providing any reasoning for his actions. Also, arguments, debates, and biases can be implicit in the differences between two snippets that are very similar. For instance, a snippet of “Most scholars believe that ...” was changed to “Some scholars believe that...” to change the tone and support of an opinion in a Wikipedia article. These implicit, subtle disagreement cases are in contrast to the different vocabulary usage of different opinion camps that were found in forums or news.
[User-generated content in WP may contain controversies in passages of text that are removed or even rewritten to accommodate different perspectives or to soften the discourse. In the case of WD, qualifiers can be used to differentiate the context of potentially controversial Claims.]
On the other hand, by allowing public access to log of all activities of editors, Wikipedia (and similar wiki systems) provides a valuable source of knowledge that is hardly seen in other domains.
[The change log is public and makes the existence of controversies transparent; the identification methods are based on this]
Besides the score-based vs. classification-based categorization of controversy models discussed throughout the different experiments in previous sections, depending on the considered aspects and resources used, these models can be further categorized into the following four groups:
• Meta-driven: the methods in this group rely on extracting a set of simple numeric statistics from the revision history of the article and/or its discussion page. These statistics can be combined into a score, or into a set of feature vectors to be learned in a machine learning framework (as in the meta classifier) or used in a rule-based system;
• User-driven: in this category, controversy is modeled based on editors' interactions and their positive or negative collaborations. The structure classifier and bipolarity are examples of models that build a network of editors based on a notion of agreement/disagreement between editors and extract a score or a set of features, respectively;
• Content-driven: the methods in this third category model controversy by analyzing the content of revisions, comments, or the discussion pages. The content analysis can ignore semantics by applying simple techniques such as tracking authorship and deleted words in the revision history of the article, as in the basic method. Alternatively, it can depend on the semantics of the text by applying Natural Language Processing techniques such as textual entailment of changed versions, or discourse analysis of the discussion pages, which, given some recent attempts at annotating discussion pages [2, 16], seems more practical than before;
• Pattern-driven: the basis of methods in this group is analyzing patterns of edits over a history of revisions. The MR method, which looks at mutual reverts in the revision history as a sign of edit wars, is an example of these methods.
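To make the pattern-driven idea concrete, the sketch below finds pairs of editors who have reverted each other. It is not the MR method's scoring formula, which the excerpt does not give. Two assumptions are mine: a revert is detected when a revision's text is identical to an earlier, non-adjacent revision, and the revision history is available as a chronological list of (editor, text) pairs.

```python
import hashlib


def find_reverts(revisions):
    """Return (reverter, reverted) editor pairs from a chronological history.

    revisions: list of (editor, text) pairs, oldest first.
    A revision counts as a revert when its text matches an earlier revision;
    the authors of the revisions in between are the reverted editors.
    """
    last_seen = {}  # text digest -> index of the latest revision with that text
    reverts = []
    for i, (editor, text) in enumerate(revisions):
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if digest in last_seen and last_seen[digest] < i - 1:
            for victim, _ in revisions[last_seen[digest] + 1:i]:
                if victim != editor:
                    reverts.append((editor, victim))
        last_seen[digest] = i
    return reverts


def mutual_revert_pairs(revisions):
    """Return the set of editor pairs who have each reverted the other."""
    pairs = set(find_reverts(revisions))
    return {frozenset((a, b)) for a, b in pairs if (b, a) in pairs}


# Hypothetical history: ana and bob keep restoring their own versions.
history = [
    ("ana", "v1"), ("bob", "v2"), ("ana", "v1"), ("bob", "v2"), ("cy", "v3"),
]
edit_war_pairs = mutual_revert_pairs(history)
```

The number of such pairs, possibly weighted by how experienced the editors involved are, could then feed a score. Any such weighting would be a further assumption beyond what the excerpt states.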
In addition to these possible improvements, the research on analysis of controversy in Wikipedia can be extended by giving more insight about controversy beyond simply identifying controversial articles. For instance, as briefly mentioned in section 6, ranking the controversy level of different text units of an article and identifying the most contested issues can be an interesting direction for this kind of work. The opposing views and positioning of editors towards each of these issues can give more insight about controversial topics.
[Would the controversial cases in WD have any relation to the controversies in WP?]
8. CONCLUSION
In terms of monotonicity, we found that most methods, including the classifier-based ones, did not satisfy this property. Non-monotone behavior of a controversy model can limit its usefulness when, in addition to discrimination, ranking of different articles or of different parts within the same article is needed.