How Web Marketers Should Classify Search Queries

For years Web marketers and search specialists have used a fairly rigid classification system for identifying queries by type. You should have these three types burned into your soul by now:

  1. Informational queries
  2. Navigational queries
  3. Transactional queries

I’ve identified other types of queries, too, including “discovery queries”, “spectator queries”, and “research queries”.

[Image: a grid used to divide queries into revenue or non-revenue by platform.]

Web marketers should use their own query classification models instead of the models used by search engineers.

But all of this is wrong for Web marketing. To understand how search engines work you must understand how search engineers think and speak. So you should know the difference between informational and transactional queries, even if things like Featured Snippets blur the distinctions that were once so clear. But none of that actually works for Web marketing.

Web marketing analysts should focus on which queries achieve their goals. A different classification system simplifies the task of winnowing down the types of queries deserving your attention. To some extent both search engineers and Web marketers need to track search behavior by platform (desktop, mobile, tablet, other) but even platform-based analysis of query traffic creates inefficiencies for Web marketing.

The search engine’s context is transitional. The search engine’s purpose or goal is to guide the searcher from a state of ignorance to a state of knowledge (with no guarantee of satisfaction). Commercial Websites are transactional in nature, not transitional. So you must consider search queries in a transactional context rather than a transitional context.

This illustrative table divides all queries into four categories, or quadrants:

              Mobile      Desktop
  Revenue
  No Revenue

Assume the above grid includes tablets in “desktop” and voice in “mobile”. The percentage of all queries assigned to each quadrant differs by industry and Website. Consumer interests and idiom influence the industry-level percentages. Search visibility influences the Website-level percentages.

These four categories represent Quadrants of Potential Value. The intrinsic value of the “Revenue” row is obvious. The intrinsic value of the “No Revenue” row, if there is any, must be assessed in non-monetary terms of “brand value” and/or “good will”.

You can analyze your revenue via a similar grid, but that’s not the same as measuring potential. Potential can only be estimated, and your estimates will always be crude. As an example: you may find a recently published statistic that says “30% of all mobile queries are local”. How many of those local queries will you be able to monetize? If the answer is “none”, then that 30% of mobile queries goes into your “no revenue, mobile” box. Heart-breaking, I know, but the sooner you ignore the mobile queries you have no financial incentive to compete for, the more focused your monetization strategies become.
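If it helps to see the quadrants as a concrete structure, here is a minimal sketch in Python. The query records, the field names, and the is_monetizable() rule are all hypothetical stand-ins for whatever your own data and data model actually contain.

```python
from collections import Counter

# Hypothetical query records exported from an analytics tool. The
# field names ("query", "platform") and the is_monetizable() rule
# are illustrative assumptions, not a standard schema.
queries = [
    {"query": "buy cheap shoes", "platform": "mobile"},
    {"query": "pizza near me", "platform": "mobile"},
    {"query": "running shoe size chart", "platform": "desktop"},
]

def is_monetizable(record):
    """Return True if you have a financial incentive to compete for
    this query (e.g. it maps to a product the site actually sells)."""
    return "shoes" in record["query"]  # placeholder rule

# Tally each query into one of the four Quadrants of Potential Value:
# (value row, platform column).
quadrants = Counter(
    ("Revenue" if is_monetizable(r) else "No Revenue", r["platform"])
    for r in queries
)

for (value_row, platform), count in sorted(quadrants.items()):
    print(f"{value_row:<11} / {platform:<8}: {count}")
```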

While this may all seem obvious, your query classification does not have to be (and should not be) this simple. You can move to the next step and sub-classify the revenue row. You can filter by language, geography, product, industry, etc.
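As a rough illustration of that sub-classification step, you might tag each revenue-row query with extra dimensions and group on them. The “geo” and “product” tags below are assumptions made for the sake of the example, not a required schema.

```python
from collections import defaultdict

# Hypothetical revenue-row queries, each tagged with extra dimensions
# ("geo", "product") during export and annotation.
revenue_queries = [
    {"query": "buy trail shoes", "platform": "mobile", "geo": "US", "product": "trail shoes"},
    {"query": "trail shoes price", "platform": "desktop", "geo": "UK", "product": "trail shoes"},
    {"query": "buy sandals", "platform": "mobile", "geo": "US", "product": "sandals"},
]

# Nested grouping: platform -> geography -> product -> query count.
grid = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
for r in revenue_queries:
    grid[r["platform"]][r["geo"]][r["product"]] += 1

for platform, geos in grid.items():
    for geo, products in geos.items():
        for product, count in products.items():
            print(f"{platform} / {geo} / {product}: {count}")
```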

You’re Creating a Data Model

As you refine your understanding of the types of queries your sites depend upon, you add more rules to your grid. The purpose of using a table of rules is to help you see just how small a piece of the search pie you’re truly optimizing for.

People are easily impressed by large numbers. Sooner or later you find yourself saying, “Well, Google serves more queries every day than anyone else, so I should not worry about Bing, Facebook, Pinterest, or Twitter queries”. But what if your data model shows you that those platforms serve more of the revenue-producing queries you target than Google does?

This happens in more industries than Web marketers realize. In fact, among people who tell me they only optimize for Google, about 2/3 have never bothered to look at data for other search platforms. They have no idea how large a market they are walking away from just because they assume Google’s overall dominance applies in their verticals. You can’t reliably determine a platform’s potential from untargeted referral data or other people’s anecdotal claims.

The roughly 1/3 of Google-preferring marketers who do look at other search platforms tend to base their assessments on PPC performance. They may see better conversion rates on Bing, for example, but receive less (converting) traffic overall. So at least they have some consistent data to work with.

Without that data your data model is crude, incomplete, and highly inaccurate. It paints a distorted picture of your search context. It’s a self-imposed box, effectively a penalty on your potential revenue built from deliberate ignorance. You cannot look at search market share estimates, or even your own analytics referral data, and correctly estimate the potential value another platform offers. Because you only optimize for Google, your search referral data is biased toward Google. You have no perspective, just a self-reinforcing tunnel of biased data.

Using bad data models costs you money every day, all day long, in lost opportunity, because you’re expending your resources inefficiently.

You Do Not Need Data to Create a Data Model

The data model represents an idea, perhaps the idea of what you believe you are working for. We developed the art of data modeling to help us improve our algorithms. Data modeling speeds up the development of large, complex application systems. We can use the data models to set our expectations and thus judge the quality and performance of our applications. Imagine a program that takes a long series of random numbers and reorders them in an ascending sorted sequence. The data model for such an algorithm is pretty simple: it’s a string or chain of random numbers organized into a sequence. There are many ways to represent the data model but most people would be able to understand it without possessing any coding skill.
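For the sorting example above, the data model really is that simple; a few lines of Python are enough to show the input state and the desired final state.

```python
import random

# The data model for the sorting example: a sequence of random
# numbers (the input state) and the same numbers reordered into an
# ascending sequence (the desired final state).
unsorted_numbers = [random.randint(1, 100) for _ in range(10)]
sorted_numbers = sorted(unsorted_numbers)

print("input state:  ", unsorted_numbers)
print("desired state:", sorted_numbers)
```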

The data model creates a visual map of what data you want to use and what you want to do with that data. Think of it as a road map for designing processes. The data model shows you what processes are required to achieve the final, desired state of your data.

It’s a bit like an algebraic equation because you have variables for which you don’t actually have any real values.

Before you analyze your query traffic you should create a data model that describes the data you’re using and the processes you use on that data. This way you can filter out the unnecessary query data before you start crunching numbers.
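A minimal sketch of that pre-filtering step follows, assuming your data model names a few platforms you track and a few terms with no revenue potential for your site. The specific rules are purely illustrative.

```python
# Exclusion rules drawn from a (hypothetical) data model. Adjust the
# terms and platforms to whatever your own model actually specifies.
EXCLUDE_TERMS = {"free", "jobs", "login"}
PLATFORMS_TRACKED = {"desktop", "mobile"}

def keep(record):
    """Keep only queries the data model says are worth analyzing."""
    words = set(record["query"].lower().split())
    return record["platform"] in PLATFORMS_TRACKED and not (words & EXCLUDE_TERMS)

raw_export = [
    {"query": "buy hiking boots", "platform": "mobile"},
    {"query": "free boots", "platform": "mobile"},
    {"query": "hiking boots review", "platform": "tablet"},
]

filtered = [r for r in raw_export if keep(r)]
print(filtered)  # only the first record survives the filter
```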

An informational Website that earns passive income uses a different data model from an ecommerce site. The ecommerce site earns active income through sales to visitors (and in some cases residual income through other offers). For an ecommerce site, a smaller percentage of informational queries will drive sales than transactional queries will. An informational site, in turn, is not likely to attract many transactional searchers.

How Do You Categorize the Queries?

Although every “buy cheap shoes” query easily fits into a transactional category, not every transactional query includes trigger words like “buy”, “how much”, “price”, or even “cheap”. You must begin sorting the queries as best you can and look at data for how visitors arrive at and use your site. But a high abandonment rate may mean you have a poor value proposition, a broken conversion page, or some other problem. Abandonment in itself isn’t sufficient to help you classify queries for revenue analysis.
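One way to start that first-pass sort is a simple trigger-word check, leaving everything else unlabeled until behavioral data fills the gap. The word list here is an assumption, not a definitive vocabulary.

```python
# Common transactional trigger words (an illustrative list, not a
# complete vocabulary). Queries without them stay "unknown" until
# visitor-behavior data says otherwise.
TRIGGER_WORDS = ("buy", "price", "cheap", "how much", "coupon", "discount")

def first_pass_label(query):
    q = query.lower()
    if any(term in q for term in TRIGGER_WORDS):
        return "transactional"
    return "unknown"

for q in ["buy cheap shoes", "red trail shoes", "how much are trail shoes"]:
    print(q, "->", first_pass_label(q))
```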

On the other hand, you can analyze your referral traffic specifically to identify problems such as high abandonment rates (not necessarily the same thing as bounce rates). If you believe certain products should be selling well but they are not despite receiving a lot of traffic to their pages, the abandonment rate is a red flag waving in your face.

If you’re reasonably confident that you know which of your queries are transactional and which are informational, then begin there. Your confidence may be misplaced but you can test that later and adjust your data models. The important point is to start somewhere without worrying about the accuracy of your classifications.

Looking at the conversion rates for your queries should give you a better idea of where your transactional and informational traffic comes from. In general, the fewer conversions a set of queries leads to, the more likely those queries are informational; conversely, the more conversions a set of queries leads to, the more likely they are transactional. Let the data guide you.
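As a sketch of letting the data guide you, you could re-label query groups by their measured conversion rate. The 2% threshold and the sample figures below are arbitrary placeholders, not benchmarks.

```python
# Re-label query groups using conversion data. The threshold is an
# arbitrary placeholder; use your own site's baseline rate instead.
CONVERSION_THRESHOLD = 0.02  # 2%

def label_by_conversions(sessions, conversions):
    if sessions == 0:
        return "unknown"
    rate = conversions / sessions
    return "transactional" if rate >= CONVERSION_THRESHOLD else "informational"

query_stats = {              # hypothetical (sessions, conversions) per query
    "buy trail shoes": (400, 24),
    "trail shoe reviews": (900, 3),
}

for query, (sessions, conversions) in query_stats.items():
    print(query, "->", label_by_conversions(sessions, conversions))
```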

How to Prioritize the Query Analysis

Although I divided the query traffic into quadrants in my illustrative table above, if you have the data you can categorize it with more options. Some examples to consider include:

Platforms (columns): Desktop, Mobile, Tablet, Voice, IoT

Value (rows): Revenue, Non-revenue, Signups (or Potential revenue)

You can assign columns and rows in any way you wish but I think it would be easier to create super-rows for Geographical Regions, Major Search Engines, Paid/Unpaid Search, Site Search, Non-search advertising, etc.

Each super-row would be sub-divided into the 2 or 3 value rows I suggest above. In this way you can drill down to the queries that drive the most revenue from each source of traffic, by platform.
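If it helps to picture that expanded grid as a data structure, here is a minimal sketch keyed by (super-row, value row, platform). The specific labels are just examples of the categories suggested above, not a prescribed taxonomy.

```python
# An expanded grid keyed by (super_row, value_row, platform). The
# specific labels are examples of the categories suggested above.
SUPER_ROWS = ["North America", "Europe", "Google", "Bing", "Paid search", "Site search"]
VALUE_ROWS = ["Revenue", "Potential revenue", "Non-revenue"]
PLATFORMS = ["Desktop", "Mobile", "Tablet", "Voice", "IoT"]

grid = {
    (super_row, value_row, platform): []  # each cell collects matching queries
    for super_row in SUPER_ROWS
    for value_row in VALUE_ROWS
    for platform in PLATFORMS
}

# Drill-down example: the revenue cells for one traffic source.
for key in grid:
    if key[0] == "Bing" and key[1] == "Revenue":
        print(key)
```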

You only need to diversify the categorization to the point where you feel you can do something with the data. If you’re not sure of what you can or should do, add another row or super-row. Keep drilling down into the details until you see something that you feel deserves your attention.

You Cannot Do This In a Standard Reporting Tool

Regardless of where you capture your data, you’ll need to export it to a spreadsheet or a database. You need to manually classify the queries according to context because no search engine does that for you. You’ll have to sort the queries into groups according to the way you tagged them.
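Here is a sketch of that grouping step after export, assuming you have already added a tag column by hand. The file name and column names are hypothetical; match them to whatever your tool actually produces.

```python
import csv
from collections import defaultdict

# Group an exported query report by the context tag you assigned by
# hand. "query_export.csv" and its column names ("tag", "query") are
# hypothetical; match them to your own export.
groups = defaultdict(list)
with open("query_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        groups[row["tag"]].append(row["query"])

for tag, queries in groups.items():
    print(tag, len(queries))
```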

It may be worth taking the extra step of normalizing similar queries (treating them all as if they are the same query). Then again, you may inadvertently mask some distinction between two similar queries. In other words, if you normalize your query data aggressively, you may accidentally classify a transactional query as informational, or vice versa. Misclassification distorts your distributions and changes the percentages. Some misclassifications will be trivial. Others will not. Unfortunately there is no simple rule I can share with you to prevent or reduce misclassifications. You’ll have to figure out where everything goes on a per-Website basis.
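If you do normalize, a conservative pass like the sketch below (lowercasing, stripping punctuation, collapsing whitespace) is less likely to mask the distinctions that matter than aggressive word-dropping or stemming would be.

```python
import re

def normalize(query):
    """Conservative normalization: lowercase, strip punctuation,
    collapse whitespace. Deliberately does NOT drop or reorder words,
    because that is where transactional/informational distinctions
    tend to hide."""
    q = query.lower()
    q = re.sub(r"[^\w\s]", " ", q)
    q = re.sub(r"\s+", " ", q).strip()
    return q

print(normalize("Buy  cheap shoes!"))                            # "buy cheap shoes"
print(normalize("cheap shoes") == normalize("buy cheap shoes"))  # False: kept distinct
```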

What You’ll Gain from Query Classification for Marketing

You’ll see more clearly where you need to improve optimization or content coverage. You’ll also identify parts of a Website that only need to be monitored for disruptions in the flow of traffic. As you improve your data models (they will never be perfect) you’ll find it’s easier to identify content that won’t perform as you hoped. You’ll identify new opportunities for optimization, sometimes faster than via traditional methods. And you’ll also see where you’ve been wasting a lot of time and resources.

That last part may be the single greatest value query classification strategies offer you. It’s easy to see where you’re making money. It’s not so easy to see where you’re wasting money. That is because the search engine optimization process is by its very nature experimental. The SEO Method is to experiment, evaluate, and adjust. So you can’t know in advance what won’t work (and therefore wastes your resources). But query classification uses your own referral data to show you when a page or Website is not achieving what you hoped it would. In many cases you can see the need to adjust or abandon a strategy sooner.

Time is money for everyone in this business. If you have a lot of query data to work with you are right to ask if it’s worthwhile to set up this kind of data model. My suggestion is to start out small, beginning with a subset of queries you know well. Teach yourself to see the (sometimes subtle) distinctions between query types in your own data.

Take your time to gradually build a more complex, sophisticated data model. The data isn’t going anywhere and your competitors will never have access to it. You can study it at your leisure. Scale it up from “this isn’t adequate” to “I think I can work with this”. Allocate only as much time as you need to understand what you have created a structure for. Don’t worry about the rest until you’re ready to integrate it into your data model.

That should help you manage your time and see a return on investment as quickly as possible. Some people will do better than others, but everyone should be able to adopt this type of analysis with relatively little effort.

Or you can keep trying to optimize for every possible query because you don’t know any other way to do it. I suspect some people will always be satisfied with that.
