Wikipedia Newspapers

Wikipedia can help researchers and readers judge the credibility of news sources. To that end, the unreliable sources list now links each Iffy site to its Wikipedia article, if one exists.

About 25% of all Iffy sites, and 50% of those with the highest site ranks, are in Wikipedia, either as an article or in the List of fake news websites.

WikiCred (a partnership of MisinfoCon, The Credibility Coalition, and the Wikimedian community) supported this work connecting Iffy with Wikipedia data. A second part of the project is determining which legit news sources have Wikipedia entries. Thanks to News on Wiki for introducing me to the Wikidata collaborative database and helping me speak SPARQL.

The result is:

  • A public spreadsheet listing the news outlets in Wikipedia.
  • This map of newspapers with Wikipedia articles, color-coded by language, if known. (It takes 10-20 seconds to run the Wikidata query and render the map.)

Methodology

The map and spreadsheet display results from Wikidata queries. Wikidata is a collaborative database that stores structured data from the Wikimedia projects, like Wikipedia, Wikiquote, and the Wikimedia Commons.

The Wikidata repository houses what it calls items: entities such as a person, a place, or, in our case, a news publication. Each item has a unique ID and a label, e.g., 'The Denver Post' or 'HuffPost'. An item can be classified as an instance of another item ('newspaper'), which can in turn have subclasses ('daily newspaper').

The map above lists only instances of 'newspaper' (or its subclasses). The spreadsheet also lists instances of 'news agency', 'news magazine', 'newscast', 'news broadcasting', and 'United States cable news' (or their subclasses). Other sheet data includes each outlet's website, owner, year founded, and Facebook/Twitter IDs (view JSON | download CSV). Both the map and sheet include only news outlets with articles in the English version of Wikipedia.
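To make the "instance of (or subclass of)" and "English Wikipedia only" filters concrete, here is a minimal Python sketch that builds a query of that shape. It is an illustration, not the query used for the map or spreadsheet. It assumes the standard Wikidata identifiers P31 (instance of), P279 (subclass of), and Q11032 (newspaper); the names build_query and run_query, and the User-Agent string, are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"
# Wikidata item for 'newspaper' (assumed class for this sketch).
NEWSPAPER_QID = "Q11032"


def build_query(class_qid=NEWSPAPER_QID, limit=10):
    """Build a SPARQL query for items that are instances of class_qid,
    or of any of its subclasses, and that have an English Wikipedia article."""
    return f"""
SELECT ?item ?itemLabel ?article WHERE {{
  ?item wdt:P31/wdt:P279* wd:{class_qid} .
  ?article schema:about ?item ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


def run_query(query):
    """Send the query to the endpoint and return the list of result bindings.
    Needs network access; the User-Agent value is a placeholder."""
    url = WIKIDATA_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(
        url, headers={"User-Agent": "wikidata-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["results"]["bindings"]
```

Calling run_query(build_query(limit=5)) should return a handful of newspapers with their English Wikipedia URLs. Swapping class_qid for another class, such as the item for 'news agency', would follow the same pattern.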

The SPARQL queries that generate the lists are: