Three Ways You’re Ruining Your GA Data, and How to Stop: Tips from a Data Scientist

Marketers who want to be more data-driven often start with Google Analytics. It’s ubiquitous, free, and comparatively easy to use.  It’s also the first client data I have access to in many of my consulting projects. Since I see it all the time, I’ve noticed that most companies’ GA is afflicted with serious hygiene problems.

Here are three of the common issues ruining the usability of GA data, and instructions for fixing them.

Issue 1: Your Data is Mislabeled, so You Probably Don’t Know What You’re Looking At

I am an engineer by training and loved physics. One of the most important lessons I learned grinding out problem set after problem set was that I had to keep track of units. Miles-per-hour or kilometers-per-hour? It makes a difference! (May you rest in peace, Mars Climate Orbiter.)

In GA, unit labeling is usually pretty bad. Most GA users fail to keep track of units, and they often don’t even know what the units mean.  Everyone can agree that a mile is 5280 feet (as long as we’re talking about a statute/land mile not a nautical mile), but what is a GA “Unique Visitor?”  “Bounce Rate” has a nice ring to it, but how is it different from a GA “Exit Rate?”

This is obviously problematic. If you don’t know how these terms are defined, how can you have meaningful data-driven discussions? If you don’t really understand your GA definitions, you have probably just stopped backing your hunches with opinions and beliefs and started backing them with ambiguous and (possibly) misinterpreted numbers, which really just boils down to your opinions and beliefs again. Misunderstood data is worse than no data, because it gives the impression of truth and authority.

The Fix: Make Sure You Know What You’re Measuring

With GA’s great power comes great responsibility. The first thing you need to do is make sure your units and measurements mean what you think they do. GA helpfully provides definitions for Bounce Rate, Exit Rate, and lots of other things on its official website.

Once you understand what each term means, the second step in the fix is to make sure that everyone else using GA with you understands the definitions too.
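To make the Bounce Rate versus Exit Rate distinction concrete, here is a minimal sketch that computes both for one page from a handful of made-up sessions. It assumes the commonly documented definitions: bounce rate is the share of sessions starting on a page that viewed only that page, and exit rate is the share of a page's pageviews that were the last in their session. The `Session` type, `ratesForPage` function, and sample data are illustrative, not anything GA exposes.

```typescript
// Each session is the ordered list of pages one visitor viewed.
// The sessions below are invented purely for illustration.
type Session = string[];

interface PageRates {
  bounceRate: number; // single-page sessions / sessions that started on the page
  exitRate: number;   // pageviews that ended a session / all pageviews of the page
}

function ratesForPage(sessions: Session[], page: string): PageRates {
  let pageviews = 0;
  let exits = 0;
  let entrances = 0;
  let bounces = 0;

  for (const session of sessions) {
    if (session.length === 0) continue;

    // Count every view of the page, and whether that view ended the session.
    for (let i = 0; i < session.length; i++) {
      if (session[i] !== page) continue;
      pageviews++;
      if (i === session.length - 1) exits++;
    }

    // Bounces only count sessions that started on this page.
    if (session[0] === page) {
      entrances++;
      if (session.length === 1) bounces++;
    }
  }

  return {
    bounceRate: entrances === 0 ? 0 : bounces / entrances,
    exitRate: pageviews === 0 ? 0 : exits / pageviews,
  };
}

const sampleSessions: Session[] = [
  ["/home"],
  ["/home", "/pricing"],
  ["/home", "/blog"],
  ["/home", "/signup"],
  ["/blog", "/home"],
  ["/pricing", "/home"],
];

const homeRates = ratesForPage(sampleSessions, "/home");
console.log(homeRates); // bounceRate is 0.25, exitRate is 0.5
```

In this toy data the same page has a 25% bounce rate and a 50% exit rate, which is why the two numbers cannot be used interchangeably in a discussion.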

Issue 2: Your GA Configuration is Wrong

Since GA is ubiquitous, it must be set up correctly on your site! Right? Wrong! You know what they say about assuming . . .

From the GA client data I see, there are two obvious issues with most GA configurations. The first is that, much of the time, no one knows that GA isn’t set up properly. The second is that when they do know something is off, reconfiguring GA correctly is non-trivial, particularly for larger sites.

The Fix: Inventory Your Site Features and Make These Adjustments

Making sure your GA Configuration is correct means taking inventory of the following and then adjusting accordingly:

  • Does your site have multiple subdomains? If so, is GA set up to track these?
  • Does your site have a search capability? If so, is GA set up to track site search?
  • Do you have at least one GA profile totally devoid of filters, so that you collect all possible data? If not, create one—you need it.

Even if those questions scare you off, a common-sense evaluation can still help. By that, I mean:

  • Have you drilled down through GA’s litany of options (Audience, Traffic Sources, and Content come to mind) and asked yourself, do these numbers make sense?
  • Do you see a day-of-week effect in your data (typically, dips on the weekends)?
  • And, finally, from an organizational perspective, is it clear who at your company owns GA (even if the answer is everyone or no one)?
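The day-of-week check in the list above can be roughed out in a few lines once you have exported daily visit counts. This is a sketch under assumptions: the `DailyVisits` shape, the `weekendToWeekdayRatio` function, and the numbers are all hypothetical, and the export step itself is not shown.

```typescript
interface DailyVisits {
  date: string;   // ISO date such as "2012-10-01"
  visits: number; // visits reported for that day
}

// Returns average weekend visits divided by average weekday visits.
// Returns NaN when the data lacks weekend days, weekday days, or weekday traffic.
function weekendToWeekdayRatio(days: DailyVisits[]): number {
  let weekendTotal = 0;
  let weekendCount = 0;
  let weekdayTotal = 0;
  let weekdayCount = 0;

  for (const day of days) {
    // Parse as UTC so the weekday does not shift with the machine's time zone.
    const weekday = new Date(day.date + "T00:00:00Z").getUTCDay(); // 0 = Sunday, 6 = Saturday
    if (weekday === 0 || weekday === 6) {
      weekendTotal += day.visits;
      weekendCount++;
    } else {
      weekdayTotal += day.visits;
      weekdayCount++;
    }
  }

  if (weekendCount === 0 || weekdayCount === 0 || weekdayTotal === 0) return NaN;
  return (weekendTotal / weekendCount) / (weekdayTotal / weekdayCount);
}

// One made-up week, Monday 2012-10-01 through Sunday 2012-10-07.
const sampleWeek: DailyVisits[] = [
  { date: "2012-10-01", visits: 1000 },
  { date: "2012-10-02", visits: 1000 },
  { date: "2012-10-03", visits: 1000 },
  { date: "2012-10-04", visits: 1000 },
  { date: "2012-10-05", visits: 1000 },
  { date: "2012-10-06", visits: 500 },
  { date: "2012-10-07", visits: 500 },
];

console.log(weekendToWeekdayRatio(sampleWeek)); // 0.5
```

A ratio well below 1 is the typical weekend dip. Whether a ratio near 1 points to a tracking problem or just to your audience's habits is a judgment call for your own site.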

Issue 3: You’re Not Tracking Your Entire Web Site, So Your Data Has Big Holes

GA captures snapshots of a user’s behavior as they path through your online property. An interesting (and sort of creepy) analogy is to think of someone wandering about your house in total darkness.  The only time you know what your visitor is doing or even which room he or she is in is when a flash of light goes off, like a strobe light. GA provides that strobe light for your website.

Every time a visitor loads a new page containing your GA tracking code, the strobe flashes and data is captured. If you want to piece together how users behave, you need to shine as much light as possible on their actions.
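One rough way to find the rooms where the strobe never fires is to scan each page's HTML for your web property ID. The sketch below assumes you have already fetched the page sources by some other means and that the ID appears literally in the markup, which may not hold if the snippet is loaded by an external script or tag manager. `PageSource`, `pagesMissingTracking`, and the placeholder ID `UA-XXXXX-Y` are illustrative.

```typescript
// Hypothetical input: a page URL paired with the HTML source fetched for it.
interface PageSource {
  url: string;
  html: string;
}

// Returns the URLs whose HTML does not mention the given property ID.
function pagesMissingTracking(pages: PageSource[], propertyId: string): string[] {
  const missing: string[] = [];
  for (const page of pages) {
    if (page.html.indexOf(propertyId) === -1) {
      missing.push(page.url);
    }
  }
  return missing;
}

// Made-up pages; "UA-XXXXX-Y" is a placeholder, not a real property ID.
const samplePages: PageSource[] = [
  { url: "/home", html: "<script>_gaq.push(['_setAccount', 'UA-XXXXX-Y']);</script>" },
  { url: "/pricing", html: "<script>_gaq.push(['_setAccount', 'UA-XXXXX-Y']);</script>" },
  { url: "/contact", html: "<p>No tracking snippet on this page.</p>" },
];

console.log(pagesMissingTracking(samplePages, "UA-XXXXX-Y")); // [ '/contact' ]
```

A plain substring check like this only proves the ID is present, not that the snippet runs correctly, so treat it as a first pass rather than a full audit.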

The Fix:  Instrument Everything

To monitor site visitors, you’ll need to instrument everything. Adding your tracking code to every page isn’t hard, although you will probably want to audit your site just to check. However, making sure GA catches every JavaScript action that signifies a meaningful site interaction is a bit trickier. For starters, ask yourself:

  • Am I tracking JavaScript events in GA? Am I sure about that?
  • Do I have a consistent framework for categorizing and labeling JavaScript events?

To get the full benefit of GA tracking, you need to be able to answer yes to all three of these questions: you are tracking JavaScript events, you have verified that the tracking works, and you categorize and label those events consistently.
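One way to keep event categories and labels consistent is to route every event through a single helper. The sketch below assumes the classic asynchronous ga.js snippet, where events are sent by pushing a `_trackEvent` command onto the `_gaq` queue. The category and action lists, the label normalization rule, and the `trackEvent` helper are a hypothetical house convention, and a local array stands in for the real queue so the example runs on its own.

```typescript
// Stand-in for the command queue that the classic ga.js snippet defines.
// On a real page you would push onto that existing queue, not a local array.
const _gaq: Array<Array<string | number>> = [];

// Hypothetical convention: only these categories and actions are allowed,
// so a typo fails at compile time instead of creating a new row in reports.
type EventCategory = "video" | "download" | "form";
type EventAction = "play" | "pause" | "click" | "submit";

function trackEvent(category: EventCategory, action: EventAction, label: string): void {
  // Normalize labels so "Pricing PDF" and " pricing pdf" land in the same row.
  const cleanLabel = label.trim().toLowerCase().replace(/\s+/g, "-");
  _gaq.push(["_trackEvent", category, action, cleanLabel]);
}

trackEvent("download", "click", "  Pricing PDF ");
trackEvent("video", "play", "Product Tour");

console.log(_gaq);
// [ [ '_trackEvent', 'download', 'click', 'pricing-pdf' ],
//   [ '_trackEvent', 'video', 'play', 'product-tour' ] ]
```

The specific categories matter less than having one place where they are defined, so everyone who adds tracking uses the same vocabulary.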

So Now Your GA Data is Clean(er)!

Checking your use of GA to identify these issues and then fixing them will make your website data much cleaner, and give you a significantly more complete and reliable picture of your web analytics. This is a great start. If you’re keen to learn more, or you’re a heavy user or consumer of GA, I can’t recommend Brian Clifton’s Advanced Web Metrics with Google Analytics enough. The latest edition covers the massive V5 update that rolled out in the back half of 2011.

Sean Murphy is a Data Scientist at Johns Hopkins University, an Oxford MBA, and a Data Entrepreneur. He is much too familiar with Google Analytics and likes empanadas. He also organizes the Data Business DC Meetups, advises several startups, and provides data science and data product consulting.



  • Javier Rincón

    What debugger do you use for GA on a mac? Charles maybe? Any other suggestions on how to validate GA settings are right and that data collected is “right”?

