Iffy.news: An index of unreliable news sites

Misinformation is a thriving internet industry, supported by social media shares, advertising dollars and political donations.

In the United States, scores of research reports try to measure how falsehoods spread online. These studies often require lists of untrustworthy "news" sources. But many of these lists have grown out of date and incomplete.

Better data means better results for researchers, reporters and readers. So I've compiled a more complete dataset: Iffy, an index of unreliable news sites.

The index combines five major lists of unreliable news sites, then removes inactive sites (404s, domains for sale, etc.). Teams of reviewers curated all these lists and clearly state their inclusion criteria (see Methodology).

Index

Methodology

Our index compiles existing site lists, curated by academics and journalists. For now, we depend on their expertise for accuracy. (A protocol to review and add sites is in the works.)
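As a rough illustration of the compiling step, here is a minimal Python sketch of merging several source lists into one set of unique domains. The list names and entries are hypothetical placeholders, and the normalization rules (lowercasing, dropping a leading "www.") are assumptions, not a record of how the index was actually built.

```python
from urllib.parse import urlsplit


def normalize_domain(entry: str) -> str:
    """Reduce a URL or bare hostname to a lowercase domain without 'www.'."""
    entry = entry.strip().lower()
    # urlsplit only recognizes the host when a scheme or '//' is present.
    if "//" not in entry:
        entry = "//" + entry
    host = urlsplit(entry).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host


def merge_lists(lists: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each unique domain to the names of the source lists that include it."""
    merged: dict[str, set[str]] = {}
    for list_name, entries in lists.items():
        for entry in entries:
            domain = normalize_domain(entry)
            if domain:
                merged.setdefault(domain, set()).add(list_name)
    return merged


# Hypothetical input: two placeholder lists with overlapping entries.
sources = {
    "list_a": ["Example.com", "http://www.example.com/"],
    "list_b": ["example.com", "other.example"],
}
merged = merge_lists(sources)
print(len(merged), "unique domains")
```

Keeping track of which lists name each domain makes it easy to see where the sources agree.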

The site tags above come from those assigned by the original list curators. We grouped their differing labels into our set of six tags.

The combined lists had 1,043 unique domain names. Of these, as of November 2018, the 515 above were still active and another 528 (51 percent) were inactive, either no longer online or no longer posting stories. We detected inactive sites programmatically by retrieving HTTP status codes (404s or 301s), using auto-generated screenshots, and, in some cases, by visual inspection.
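The status-code check can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script we ran: the function names, the timeout, and the three result labels are hypothetical. A 301 can also be a harmless redirect (for example, from http to https), so a flagged site is only a candidate for the screenshot and visual checks described above.

```python
import urllib.error
import urllib.request
from typing import Optional

# Status codes the text names as signs of an inactive site.
INACTIVE_STATUSES = {301, 404}


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so a 301 is reported as a 301."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(domain: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status of the site's home page, or None if unreachable."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(f"http://{domain}/", timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses arrive here
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, etc.


def classify(status: Optional[int]) -> str:
    """First-pass label for a site; flagged sites still need a manual look."""
    if status is None:
        return "unreachable"
    if status in INACTIVE_STATUSES:
        return "possibly inactive"
    return "active"
```

A call such as `classify(fetch_status("example.com"))` would label a single domain; looping over the merged list gives a first-pass split into active and inactive sites.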

We then trimmed the resulting list, removing several sites whose stories, though highly politicized, were mostly not fake: alternet.org, cato.org, heritage.org, nationalreview.com, thedailybeast.com, theintercept.com, thinkprogress.org, and weeklystandard.com. We determined this by checking their stories at PolitiFact and Snopes.

Several sites we reviewed had mostly false fact-check judgments. These stayed on the list (links go to examples of their failed fact checks): addictinginfo.org, breitbart.com, dailycaller.com, dailykos.com, and judicialwatch.org.

Our Google spreadsheet has additional data: the year of domain registration and the number of scripts each site uses for advertising and tracking (thanks to BuiltWith). There's also a sheet of correlations between factors and averages for individual factors.
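For readers who want to reproduce a correlation between two factors from an export of the spreadsheet, here is a minimal Python sketch of the Pearson coefficient. We assume Pearson correlation purely for illustration; the column names and the numbers below are made up and are not taken from our data.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two columns of equal length with at least 2 rows")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return covariance / math.sqrt(var_x * var_y)


# Made-up values for illustration only; not taken from the spreadsheet.
registration_year = [2009, 2012, 2015, 2016, 2017]
script_count = [4, 9, 7, 12, 15]
r = pearson(registration_year, script_count)
print(f"correlation: {r:.2f}")
```

A value near 1 or -1 indicates a strong linear relationship between the two factors, and a value near 0 indicates little or none.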

Corrections?

If you have additions or corrections, please use this form to notify us. Remember, our list includes only sites whose stories are demonstrably false -- not merely biased or partisan. Send links to fact-checks demonstrating whether the site you'd like us to review publishes fake or fact-based news.