Iffy.news: An index of unreliable news sites
Misinformation is a thriving internet industry, supported by social media shares, advertising dollars and political donations.
In the United States, scores of research reports try to measure how falsehoods spread online. These studies often require lists of untrustworthy "news" sources. But many of these lists have grown out of date and incomplete.
Better data means better results for researchers, reporters and readers. So I've compiled a more complete dataset: Iffy, an index of unreliable news sites.
The index combines five major lists of unreliable news sites, then removes inactive sites (404s, domains for sale, etc.). Teams of reviewers curated each of these lists, and each clearly states its inclusion criteria (see Methodology):
- BuzzFeed News: Top 50 Biggest Fake News Hits On Facebook — "lists of sites that publish completely fabricated stories."
- FactCheck.org: Misinformation Directory — "a list of websites that have posted deceptive content." (FactCheck was created by the Annenberg Public Policy Center at the University of Pennsylvania.)
- Media Bias/Fact Check: Factual Reporting — a "low" or "very low" rating "means the source rarely uses credible sources and is simply not trustworthy for reliable information."
- NewsGuard: News Website Reliability Index — a score below 60 (out of 100) means the site "generally fails to meet basic standards of credibility and transparency."
- PolitiFact: Fake News Almanac — "a list of every website on which we’ve found deliberately false or fake news stories." (PolitiFact is a joint project of the Tampa Bay Times and Poynter.)
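Combining the lists comes down to normalizing each entry to a bare domain and taking the union. The following is a minimal sketch of that idea, not the actual Iffy pipeline: the function names, the `list_a`/`list_b` labels and the example.com entries are all placeholders, and the normalization rule (lowercase, drop a leading "www.") is an assumption.

```python
from typing import Dict, List, Set
from urllib.parse import urlsplit


def normalize_domain(entry: str) -> str:
    """Reduce a URL or bare hostname to a lowercase domain without 'www.'."""
    entry = entry.strip().lower()
    # urlsplit only fills in the host when a scheme separator is present.
    if "//" not in entry:
        entry = "//" + entry
    host = urlsplit(entry).hostname or ""
    return host[4:] if host.startswith("www.") else host


def combine_lists(lists: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each unique domain to the set of source lists that name it."""
    combined: Dict[str, Set[str]] = {}
    for source, entries in lists.items():
        for entry in entries:
            domain = normalize_domain(entry)
            if domain:
                combined.setdefault(domain, set()).add(source)
    return combined


# Placeholder entries only; these are not sites from the real lists.
sample = {
    "list_a": ["http://www.example.com/story", "Example.org"],
    "list_b": ["example.com", "https://example.net/"],
}
index = combine_lists(sample)
```

Keeping the set of source lists per domain, rather than a flat set of domains, makes it easy to see which sites appear on more than one list.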
Index
Methodology
Our index compiles existing site lists, curated by academics and journalists. For now, we depend on their expertise for accuracy. (A protocol to review and add sites is in the works.)
The site tags above come from those assigned by the original list curators. We grouped their differing labels into our set of six tags.
The combined lists had 1,043 unique domain names. Of these, as of November 2018, the 515 above were still active and the other 528 (51 percent) were inactive: either no longer online or no longer posting stories. We detected inactive sites programmatically by retrieving HTTP status codes (404s or 301s), using auto-generated screenshots, and, in some cases, by visual inspection.
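The status-code check and the arithmetic above can be sketched as follows. This is an illustration under stated assumptions, not the script we ran: `fetch_status` and `flag_for_review` are hypothetical names, the use of HEAD requests is an assumption, and a 301 is treated only as a hint, since many live sites redirect (for example from http to https). That is why flagged sites still need a screenshot or a visual check.

```python
import urllib.error
import urllib.request
from typing import Optional

INACTIVE_CODES = {301, 404}  # the two status codes mentioned in the text


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Do not follow redirects, so a 3xx code surfaces as an HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if the site is unreachable."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses arrive here
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout


def flag_for_review(status: Optional[int]) -> bool:
    """True when a site looks inactive and deserves a closer look."""
    return status is None or status in INACTIVE_CODES


# Checking the counts reported above.
total, active = 1043, 515
inactive = total - active                       # 528
inactive_share = round(100 * inactive / total)  # 51 (percent)
```

A call such as `flag_for_review(fetch_status("http://example.com/"))` would run the check against one site; example.com is a placeholder here.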
We then trimmed the resulting list, removing several sites whose stories, though highly politicized, were mostly not fake: alternet.org, cato.org, heritage.org, nationalreview.com, thedailybeast.com, theintercept.com, thinkprogress.org, and weeklystandard.com. We determined this by checking their stories at PolitiFact and Snopes.
Several sites we reviewed had mostly false fact-check judgments. These stayed on the list (links go to examples of their failed fact checks): addictinginfo.org, breitbart.com, dailycaller.com, dailykos.com, and judicialwatch.org.
Our Google spreadsheet has additional data: the year of domain registration and the number of scripts each site uses for advertising and tracking (thanks to BuiltWith). There's also a sheet of correlations between factors and averages for individual factors.
Corrections?
If you have additions or corrections, please use this form to notify us. Remember, our list includes only sites whose stories are demonstrably false, not merely biased or partisan. Send links to fact-checks demonstrating whether the site you'd like us to review publishes fake or fact-based news.