How to fact-check research for your reports and presentations

Sunday, April 28, 2024

Research is a valuable tool when you're sizing markets for a pitch or justifying an argument in a blog post. It's easy to Google a few bits of data or copy something from a post on your social feeds, and doing so gives you a sense of instant credibility: it must be true because x% of people said it is.

Yet much of this data is outdated, taken out of context, or just plain wrong. Repeating it risks damaging your reputation, losing business to better-informed competitors, and potentially being labeled a "bad actor" by fact-checkers.

In other words, check before you use a number.

Why you should look deeper before you use a statistic

Before repeating stats out of context, I encourage you to look beyond the headline-grabbing post. Find the underlying source and check what it says.

First, you'll understand the stat's context and how it was obtained. Is it relevant to your market, biased in its sources, focused on a geographic location, or simply out of date?

Second, you might find other insights in the research that are more valuable than a regurgitated headline stat.

Third, you boost your credibility by referencing the primary research and drawing your own conclusions. 

How to find The Source

If the author has done their job correctly, they will have linked to their source. Follow the link and check that you've landed on the press release, blog post or landing page for the research itself. That simple check is enough.

This can be a challenge when a number has become an anchor for SEO. Authors link to other high-ranking articles, running around in circles until the source research is lost.

Fortunately, with a little Google-fu, the source can usually be found by tweaking how you use your favorite search engine:

When information is missing (like who published the research), I'll visit other articles referencing it until I can find someone who has at least mentioned the researcher's name. Another approach I'll try is to set a cut-off date on the search one day before the article was published to exclude it and others who piled in behind it.

These techniques work on Google, Bing and DuckDuckGo.
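As a rough illustration of the cut-off date technique, the sketch below builds a query that only asks for pages dated up to one day before the article you want to exclude. It assumes Google's "before:" date operator; other engines may expose date filters differently, so treat the syntax as an assumption to verify. The function name and the example search terms are hypothetical.

```python
from datetime import date, timedelta


def query_before(terms: str, published: date) -> str:
    """Build a search query cut off one day before `published`."""
    cutoff = published - timedelta(days=1)
    # Assumption: the engine understands Google's `before:YYYY-MM-DD` operator.
    return f"{terms} before:{cutoff.isoformat()}"


# Hypothetical search terms; the date is the day the recycled stat was posted.
print(query_before('"remote work" survey', date(2024, 4, 28)))
```

Paste the resulting string into the search box. If the engine ignores the operator, its date-range tool in the results page is the fallback.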

A word about Statista

A website called "Statista" is often cited as the source when it isn't. The site pulls together research from different sources and sometimes creates its own. Thanks to helpful "share" features, adding a Statista chart to content is easy.

Unfortunately, the site doesn't show its sources unless you have an account, which runs against best practice in journalism and research. As a result, other people's work is often attributed to Statista when it shouldn't be.

If I see "Statista" as the source, I'll add "-statista" to the search term. This tells the search engine to omit anything with Statista mentioned.
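The exclusion trick is simple enough to do by hand, but if you run the same search often, a small helper can append the minus terms for you. This is a minimal sketch; the function name and example query are hypothetical, and it assumes the engine treats a leading minus as "exclude this word".

```python
def exclude_terms(query: str, *terms: str) -> str:
    """Append a minus-prefixed exclusion for each unwanted term."""
    # Assumption: the search engine reads `-word` as "omit results mentioning word".
    return " ".join([query, *(f"-{term}" for term in terms)])


# Hypothetical query; the output is the original query followed by -statista.
print(exclude_terms("hydrogen market size 2030", "statista"))
```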

Is it relevant?

With mounting pressure to "post something! Anything!", old stats and bad takes can take on a new life. Inevitably, this has led to out-of-date or irrelevant information being accepted as gospel and circulated widely. The worst example I've seen in recent months is a pitch deck built on market analysis that was more than five years old and from a different country.

Before committing to a statistic, ask how relevant it is to your objective.

First, check how old the research is. My rule of thumb is to discount anything older than a year and to treat stats published more than six months ago with caution.

Second, check for geographic relevance. I see a lot of US-centric research used to justify market dynamics in other countries. If you're using data from North America to justify investment in Southern Asia, you will have a problem.

Third, look for bias. This is where research is guided toward a particular outcome. Sometimes bias is inevitable - Brand X is unlikely to publish something claiming Brand Y is better. Other times, it can be in how data sets are selected, such as claiming 98% of people want to "work remotely" when the only people you've asked are remote workers. 

Finally, is the topic relevant? Plenty of numbers are inserted into decks and reports that don't belong there. There's no harm in discarding stats you worked hard to find when they turn out to have nothing to do with your topic.
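The age and geography checks are mechanical enough to sketch in code. The example below is only an illustration of the rule of thumb above, with a year approximated as 365 days and six months as 182 days. The Stat class, its fields, and the function names are hypothetical. Bias and topic relevance still need a human read of the research.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Stat:
    published: date  # when the underlying research was published
    region: str      # market the research covers


def age_verdict(stat: Stat, today: date) -> str:
    """Apply the rule of thumb: over a year, discount; over six months, caution."""
    age_days = (today - stat.published).days
    if age_days > 365:   # assumption: "a year" approximated as 365 days
        return "discount"
    if age_days > 182:   # assumption: "six months" approximated as 182 days
        return "caution"
    return "ok"


def review(stat: Stat, target_region: str, today: date) -> list[str]:
    """Collect the flags worth a second look before the stat is used."""
    flags = []
    verdict = age_verdict(stat, today)
    if verdict != "ok":
        flags.append(f"age: {verdict}")
    if stat.region != target_region:
        flags.append(f"region: research covers {stat.region}, not {target_region}")
    return flags


# Hypothetical stat: research from one region used to justify another.
stat = Stat(published=date(2023, 10, 1), region="North America")
for flag in review(stat, "Southern Asia", today=date(2024, 4, 28)):
    print(flag)
```

An empty list doesn't mean the stat is safe to use. It only means it passed the two checks that can be automated.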

Generative AI

Unfortunately, generative AI is not making this easier. Content farms increasingly rely on tools like ChatGPT to churn out hundreds of "SEO Optimized" articles, which are rarely fact-checked and prone to error.

For example, Bing's Copilot based its estimates of the hydrogen market's size by 2030 on data from 2021 and 2022, ignoring more recent studies.

Any time saved in generating content can easily be lost during fact-checking and rewriting when the AI gets it wrong.

Final words: be careful out there

Whether in pitch decks, blog articles, or social media posts, data can add credibility to what we want to say. However, relying on a few numbers copied from a blog or an influencer could be your undoing if the underlying research is old, irrelevant, or biased.

Before you use numbers, find the original source, then check its age, geography, bias, and relevance to your topic.

