[Image: word cloud of 100k top keywords]


I’m Dan! I think SEO is a research-based discipline. One of my favorite concepts is Garbage In, Garbage Out (GIGO), which I’m going to link to rather than explain, but I still expect you to read it! Since bad data begets bad research begets bad tactics begets bad outcomes, I think it’s important to have intellectually honest and valid research.

If only our industry were open to peer review. For those interested in peer reviewing other research, this review took me ~60 min all in.

Today I’m going to peer review this study put out by Ryan Jones and Sapient Nitro on Twitter and offer up some contradictory, and better, research.


Here is the study I’m going to review. I’ll be upfront: it’s problematic research. Here’s why:

  1. It didn’t do basic data processing (e.g. removing stop words and other common words). This means the most common words in everyday speech are what surface in the research, not insights from keyword choices. While there were later claims that the stop words were the point, I honestly don’t understand why that would ever be the case, and without more explanation from the authors I don’t think it’s a good justification. For theme classification, stop words are useless (this includes things like intent, which is itself a theme classification). Anyway, here at LSG we use the NLTK library to pre-process our data. Removing stop and other common words is a…
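
To make the stop word point concrete, here is a minimal sketch of the kind of pre-processing described above. It is not LSG’s actual pipeline: the stop word list is a tiny hand-written stand-in (NLTK ships a much fuller one via `nltk.corpus.stopwords.words("english")`), the `term_counts` helper is a hypothetical name, and the sample keywords are made up for illustration rather than taken from the study.

```python
from collections import Counter

# Assumption: a deliberately tiny, hand-written stop word list for illustration.
# With NLTK installed and its stopwords corpus downloaded, you could swap in
# set(nltk.corpus.stopwords.words("english")) instead.
STOP_WORDS = frozenset({"a", "an", "the", "to", "for", "in", "of", "is", "how", "what", "and"})


def term_counts(keywords, stop_words=STOP_WORDS):
    """Count terms across keyword phrases, skipping anything in stop_words."""
    counts = Counter()
    for phrase in keywords:
        for token in phrase.lower().split():
            if token not in stop_words:
                counts[token] += 1
    return counts


# Made-up sample keywords, not data from the study under review.
sample = [
    "how to fix a flat tire",
    "what is the best tire for snow",
    "tire shop near me",
]

raw = term_counts(sample, stop_words=frozenset())  # no filtering at all
filtered = term_counts(sample)                     # stop words removed

print(raw.most_common(5))
print(filtered.most_common(5))
```

Without filtering, words like “how”, “to” and “the” are counted right alongside the terms that actually describe what people are searching for. With a real 100k keyword set, those function words would dominate the top of the list, which is the problem described in point 1.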

