A great new feature, Tag Firing Priority, was rolled out inside Google Tag Manager around July 1, along with the updated and redesigned debug mode. It is a seemingly small feature, located under ‘Advanced Settings’ in the Tag (see below).
It’s an exciting update not only because of the application of setting priority, but also because it proves the direction Tag Manager has been heading – toward giving marketers and analysts more comprehensive control over the Tags they load on their site. Without any extra coding on the site, users can now control the firing priority of their Tags within Google Tag Manager’s interface.
Priority affects Tags that have the same firing Rule and is especially relevant for sites that have many Tags and third-party scripts like DoubleClick, Bounce Exchange, and search conversions that fire when the page loads. Tags marked with a higher priority are fired first, followed by lower priority Tags.
The elements on a page are not loaded simultaneously (painfully obvious whenever you experience a slow internet connection). A browser typically ‘reads’ the code of a page like we do – from top to bottom. That’s why style and script references go in the header, to be loaded first. The Google Tag Manager container is also toward the top of the page, directly after the opening body tag.
Tags loaded from Google Tag Manager are loaded asynchronously. This is a good thing because a slow-loading Tag will not block other Tags from firing. It also means that if Google Analytics or Tag Manager has trouble loading for some reason, it won’t prevent the rest of your site from loading. This is true for setting priority as well: changing the priority of a Tag will not prevent other Tags from firing, even if one with a higher priority does not fire at all.
Testing Tag Firing Priority
So, why is this necessary? How drastically can prioritizing Tags really affect firing order and how can we find out? An experiment!
For this test, we created ten separate Tags with identical firing Rules and edited the priority to compare firing times.
Before setting it up, we thought about how we’d need to look at the data in Google Analytics. To see how long it took from the first Tag to the last, we needed to include the time each was fired. We also needed to be able to group them together by visitor, so we took advantage of the GA cookie ID.
Next, we created the 10 Tags in Google Tag Manager, named “Event – Priority Test Tag – [01-10].” These are Universal Analytics event Tag types, and all have the firing Rule “All Pages” so that they fire as soon as the container loads. Note that this firing Rule does not wait for all the elements of the page to load, so there may be page elements, other Tags, and other on-page scripts loading while our Tags are firing.
As with any event that automatically fires on page load, we made sure to set Non-Interaction Hit to “True” so our bounce rate wasn’t affected. We also created a Rule to block even-numbered cookies so that the events fire about half the time.
For the event parameters, the Tag number [1-10] was used as the action and the Macros we created were used as the label.
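For reference, here is a rough sketch of the kind of Custom JavaScript Macro that can pull the client ID out of the Universal Analytics `_ga` cookie, along with the parity check behind our even-numbered-cookie blocking Rule. Both function names are ours, and the cookie format is an assumption based on standard Universal Analytics behavior:

```javascript
// Parse the Universal Analytics client ID out of a cookie string.
// The _ga cookie looks like "GA1.2.1234567890.1400000000";
// the last two fields together form the client ID.
function getGaClientId(cookieString) {
  var match = cookieString.match(/(?:^|;\s*)_ga=([^;]+)/);
  if (!match) return null;
  var parts = match[1].split('.');
  if (parts.length < 4) return null;
  return parts[2] + '.' + parts[3];
}

// Inside a GTM Custom JavaScript Macro you would return it from document.cookie:
// return getGaClientId(document.cookie);

// The blocking Rule for even-numbered cookies can key off the parity
// of the random component of the client ID.
function isEvenCookie(clientId) {
  return parseInt(clientId.split('.')[0], 10) % 2 === 0;
}
```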
For the Tag Firing Priority, we set up the Tags below with 1 and 2 having the highest priority, 9 and 10 the lowest, and 3-8 sharing a priority of 0.
So what did our grand experiment tell us?
As expected, the time between the Tags firing was very small. The mean difference between the first and the last was about 50 milliseconds.
The priority feature worked as expected – the first two Tags were always the first to fire and the last two were almost always the last. Here’s an example of what the data looks like in Google Analytics.
Priority actually mattered… a little. While each site is different, we did have a very small percentage of users where we saw a drop-off happening. They were only served the first Tags and presumably left the page before the lower-priority Tags could fire.
The surprising outlier was over 10 seconds between the first Tag and the last. Outliers aside, the high end of the time difference was closer to 1 or 2 seconds. So while this may seem small, keep in mind this graph from KISSmetrics about how page load time affects bounce rate, and the difference a single second can make.
How can I use this in my implementation?
When working on a Google Tag Manager implementation, we recommend the following best practice: when there are multiple Tags firing, give Analytics Tags higher numbers in firing priority (over remarketing, for example). We’re not just saying this because we are Analytics consultants and trainers!
While the chance is very small, it is possible that some transaction or conversion data might be missed if a user leaves the page before the Tag is fired. So why not play it safe and make sure Tags related to Analytics and Ecommerce data are given the highest priority? You want to catch every hit so you know you can trust your data when you’re reporting on the KPIs of your site.
Internal site search is often overlooked, but its importance shouldn’t be underestimated. Recently, as I was exploring our company’s website, I noticed that our internal search results weren’t as helpful as I anticipated.
I conducted a search on our site for “google analytics”, a term very significant to us at LunaMetrics. I was shocked to see that all the top listed results were blog posts.
While blogging is important to us, it’s also important for our visitors to know that we offer trainings around the country and Google Analytics services to clients. All the relevant content we had created through our blog was coming back and actually overpowering our other results, hardly an ideal situation.
We, as marketers, do a lot to get people to our site. From search engine marketing to analysis of internal analytics, we make it a top priority to ensure our website is extremely visible across all channels of the internet. Why then does it seem that we tend to slack when it comes to internal search results of our own site?
Not being able to quickly see our Google Analytics trainings after my query was a definite problem. If you’re in a similar position, here’s how I set out to address it.
Let’s Break Optimization into Two Steps:
- Checking to see if you have a problem
- Addressing your problem areas
Checking to See If You Have a Problem
The first step in checking to see if you have an internal search optimization problem is to create a list of terms you would like to test. The easiest way to do this is to export the list of site search terms from Google Analytics.
Once you have a list of site search terms, take the most popular (we chose the top 25) and determine what your top results are. You can give this project to an intern (bad idea) or do it programmatically like we did!
After reading Michael’s recent blog post and talking over the problem with LunaMetrics’ Jon Meck, we came up with a solution. Using SEO Tools for Excel, we created a site search scraper. This Excel document uses the SEO Tools XPathOnURL function to crawl pages for specific content.
We used it to diagnose issues with our internal site search by crawling our internal search results page for the first result that appeared and storing it in the Excel file.
Internal site search varies depending on your website, so you will have to look through your own results to figure out where to pull your search results from. For us, we found that the first search result was always within the H3 tag, so that’s what we concentrated on. Once the file loads and the site scrape is complete, you will have a list of search results.
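SEO Tools for Excel handled the crawling for us, but the same idea can be sketched in JavaScript: fetch each results page and grab the first H3. This is only an illustration – the H3 assumption is specific to our theme, and `firstResultTitle` is a name we made up:

```javascript
// Grab the text of the first <h3> on a search results page - the spot
// where our theme prints the top search result. Adjust the pattern to
// wherever your own results live.
function firstResultTitle(html) {
  var match = html.match(/<h3[^>]*>([\s\S]*?)<\/h3>/i);
  if (!match) return null;
  // Strip any tags inside the heading (e.g. the result link) and trim.
  return match[1].replace(/<[^>]+>/g, '').replace(/\s+/g, ' ').trim();
}
```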
The next step is the important one. From the list of search results you need to identify problem areas, or areas where you think the top search result should actually be something else. As I mentioned before, our first problem area was “google analytics” which wasn’t returning results for either Services or Trainings. Once you have identified the problem areas, you are going to address them.
Addressing Your Problem Areas
There are many ways to address content problems with internal search. Depending on what you use for site search, you could dynamically change your search results so they better encompass the problem terms, or you could look for a new site search that would allow you to do so.
If you don’t have that option, you can try the method I tested, which involves using Google Tag Manager to fire suggestions every time one of these “problem terms” is queried, containing the information that you deem more relevant to your company.
I am favoring the GTM route because it still works even when you add more content to your site. If you are always adding new content to your site, you never know if something you just added suddenly becomes more relevant than the result you optimized prior to adding the new content.
Here’s the basic gist of how we’ll address the situation. A person arrives on the search results page, we’ll check to see if they searched for a keyword that we’re optimizing for, and if so, we’ll use GTM to insert a suggestion.
Before I discuss how to implement GTM, you should look at your problem terms and group anything that is similar together, as these keywords will become the terms that trigger your Tag Manager suggestion to fire.
Once grouped, you need to write a new message for each group that better promotes the more relevant content. For us, we included a brief description and HTML links to more relevant content. Here is what it looks like inside of the suggestion wrapper:
We will be using a Lookup Table Macro inside of GTM, so we’re also limited to just 255 characters for this new message.
How to Implement
1. Create a Macro to pull the site search term from the URL. If your site search term doesn’t show up in the URL with query parameters, you may need to find a creative solution here. For us, it was as simple as using a URL macro and entering the site search parameter, “s”.
2. Create a Lookup Table Macro to match the search query term to the suggestion text. Check out this link to automate the Lookup Table Macro.
3. Create a Rule for when the page is equal to the Search Results page and where the Lookup Table has returned a value. This way it will only fire when someone searches for a term that we listed in the Lookup Table.
4. Create a Custom HTML Tag for the shell that goes around the search term suggestion. I modified the Expand Message example from this blog post about inserting Ads on your site through Tag Manager.
5. Insert the Lookup Table Macro into the Custom HTML Tag to populate the message inside the suggestion box.
6. Set the firing rule to be the Search Results rule we just created.
7. Lastly, you’ll want to add some sort of tracking to this, to see if your optimization was worthwhile. For us, we used a class on the links and a Link Click Listener to fire a Google Analytics Event every time someone clicked on our suggested links.
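To make the moving parts concrete, here is a minimal sketch of how the steps above fit together. In GTM the mapping lives in the Lookup Table Macro and the search term comes from the URL Macro; the plain object and query parser below stand in for both, and every name and URL is illustrative:

```javascript
// Pull the search term out of the URL ("s" is our search parameter).
function getSearchTerm(url, param) {
  var match = url.match(new RegExp('[?&]' + param + '=([^&#]*)'));
  if (!match) return null;
  return decodeURIComponent(match[1].replace(/\+/g, ' ')).toLowerCase();
}

// Stand-in for the Lookup Table Macro: problem term -> suggestion message.
var suggestions = {
  'google analytics': 'We offer <a class="search-suggestion" href="/services/">' +
    'Google Analytics services</a> and <a class="search-suggestion" ' +
    'href="/training/">trainings around the country</a>.'
};

// Build the suggestion box, or return null when the lookup misses -
// mirroring the Rule that only fires on a Lookup Table match.
function buildSuggestionBox(url) {
  var term = getSearchTerm(url, 's');
  var message = term && suggestions[term];
  if (!message) return null;
  return '<div class="suggestion-box">' + message + '</div>';
}

// On the live page the Custom HTML Tag would insert the box above the results:
// document.querySelector('.search-results')
//   .insertAdjacentHTML('afterbegin', buildSuggestionBox(location.href));
```

The `search-suggestion` class on the links is what the Link Click Listener in step 7 would key off to fire the Google Analytics Event.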
Make sure you test and debug as much as needed to get this box to look right, then Publish to your site!
After successfully implementing this Google Tag Manager Suggestion Box, here is what it looks like now when I search for Google Analytics.
Good luck with your own optimization and let us know what you are doing to stay on top of your internal site search in the comments below!
Have you ever tried to use the “plot rows” feature in Google Analytics and it literally falls flat?
It happens because you can’t keep the chart from graphing the metric total. That thick blue line across the top of your chart flattens everything else. It keeps the size of the chart static, rendering it useless.
Wouldn’t it be great if you could graph only the rows you want and the chart would dynamically resize?
Here’s the key to turning those flat, plotted rows into dynamic data visualizations: motion charts.
I used to think that motion charts were all flash and no substance, and then I found out they were more than a bunch of colorful, moving bubbles. Motion charts deliver insight “in motion”, and they plot rows better than “plot rows” can.
Resizing data is as easy as Alice’s “Drink Me” potion. Read on to find out how it works.
The Problem: Flat Data
In case you’ve never seen this before, let’s look at an example. Suppose you want a visual comparison of transactions for the first two weeks of July, when you targeted users from four states. You go to the Audience > Geo > Location report and drill into the row for United States.
Here’s what you get if you tick the check boxes next to each state and then click “plot rows”: four very flat lines.
The blue line (total transactions) controls the size of the chart and cannot be removed.
The Solution: Plot Rows with Motion Charts
With a few simple steps, you can change the chart to show exactly what you want.
Step 1: Decide how many rows to show.
Change “show rows” so you can see everything you might want to compare. If the states you want are already visible in the top 10 rows, you can skip this step.
But if you need states past the top 10, or you need to compare other states, too, go ahead and show more rows now.
Step 2: Click the motion charts icon.
The motion charts icon has three bubbles and is just above the chart area, on the far right.
Step 3: Select the line graph tab.
Change from bubbles to lines by choosing the tab with the line graph in the top right corner of the chart.
Step 4: Change opacity settings (wrench) to 0%.
Click the wrench just below the chart, on the far right, and drag the slider all the way to 0% opacity. When you start selecting states, all the other states will appear to vanish, leaving you with just the data you want.
Step 5: Select your metric on the left side of the chart (sideways).
In this example, you want to change the chart metric to Transactions. The chart metric is the one that appears sideways on the left side of the chart.
Step 6: Click states to compare.
Ready for the good stuff? Start clicking the states you want to compare. What a difference! Each trend line, and several daily spikes, are now clear.
Step 7: Curiouser and curiouser!
What more will you discover, now that the basic setup is done? Try these:
- Hover for details or to focus on one trend line
- Switch back and forth quickly by checking and unchecking rows
- Motion charts are not limited to 6 lines (plotted rows are)
- Magnify lower trends by switching from linear to log scale!
Campaigns, keywords, landing pages, products, articles… what would you like to visualize? Have you used motion charts like this before? Do you have tips to share? Let me know in the comments.
You know what’s been grinding my gears lately? No matter how long I’ve been in the search field, or what happens out there in the industry, some myths continue to persist. Wishful thinking? Lack of education? I say both.
Let’s clear up some common misconceptions with the help of some industry experts from Google+. If you’re an average web user, Google+ probably doesn’t have a place in your life. However, I’ve found it to be a thriving locale for search industry discussion! Add one of these experts to your circles and join the conversation today.
Myth 1: Social Media EQUALS Higher Rankings.
In reality, social media LEADS TO higher rankings. There is a big difference between correlation and causation, which is what study after study proves and Google confirms.
I urge people to think about the process instead:
More shares -> more eyeballs -> more link opportunities -> more links and digital authority -> higher rankings.
But racking up tweets in the name of SEO is not going to take you far until search engines find a consistent way to monitor social signals. Eric Enge of Stone Temple Consulting has some thoughts:
Myth 2: SEO is JUST Links and Words
This myth is a product of SEO evolution because it was not long ago that this industry was built on links and words. Authority from links and keyword research are still essential components, but Google looks to established brands more than ever, and that trend promises to continue.
Want to see what I mean? Search for “masters of public health.” The top 10 pages do not have the most links and probably performed zero keyword research. But do you recognize any names? Harvard, Johns Hopkins, NYU. Investing in your brand will help you more today and tomorrow than building cheap links.
Building your business has never been more in fashion. Rand Fishkin of Moz weighs in on this idea in the video below. SEO is no longer a few simple strategies.
Myth 3: Technical SEO is Dead
Google and Bing have invested heavily in their Webmaster Tools services to make technical SEO easier for every site owner, but that does not give you permission to forget about redirects, canonical tags and indexation. One-third of my sales calls are with companies after a redesign goes wrong and search traffic flatlines.
It is still important for every webmaster to understand the basics of technical SEO, if only as an insurance policy. Dan Shure of Evolving SEO shared one of our pieces on the great SEO uses of Google Webmaster Tools. Make sure to follow him!
Myth 4: Yeah I Did All of That SEO Once, I’m Done Now
In reality, you’re never finished. This is a world of curveballs and current best practices because nothing stays still. Take Google Authorship photos, for example: one minute the pros are telling clients to add them for authority and higher organic click-through rates; the next, they are gone and strategies are shifting.
See Luna’s own Andrew Garberson’s take on Google removing Author photos from SERPs:
Myth 5: Paying for AdWords GUARANTEES Higher Rankings
This is an oldie but a goodie. There certainly ARE benefits to using Paid Search for data and coverage during your organic campaign – reviewing your Matched Search Query report and the Paid/Organic report, for instance – but this myth definitely doesn’t hold water.
Here Barry Schwartz shares a video of Matt Cutts (Google’s Webspam Lead) covering his favorite myths.
Myth 6: Your Measurement Strategy Will Never Change
We are creatures of routine, and that can be to our detriment in the search game. Monitor, analyze, repeat. But then issues like (not provided) happen. Lindsey Wassle shared a great new feature in Moz to help with that. Your measurement strategy should reflect the best data you have access to, and it should be fluid.
Add me on Google+ and let me know what myths and misconceptions you debunk for businesses.
A fellow LunaMetrician recently returned from SMX Advanced and said it was refreshing to hear how much user experience (UX) and conversion rate optimization (CRO) were included in the SEO conversation this year.
The days of simply ranking for a high-volume keyword or getting visitors to the site have been eclipsed by metrics that more closely resemble offline business objectives. Now SEOs think in terms of sales leads and keep a close eye on landing page bounce rates, conversion rates and direct impact to the bottom line.
But before diving into the world of A/B and multivariate testing, it’s crucial to know where you stand. This 7-minute UX audit for landing pages should be the first step.
Which pages should I test?
Landing pages are (in a simple sense) places on your website dedicated to welcoming a visitor and efficiently completing a goal, or converting. If your website is designed to capture sales leads, these might be service pages, case studies or other informational pages that allow people to contact you to continue the sales process.
The UX Audit
This is the UX audit. Open your landing page(s) in another tab or window and answer the following seven questions. Remember: objectivity is key.
1. Digestible Text
Too much text and someone won’t even start the first line. Too little and they won’t receive enough information to complete the goal. Let’s see where your landing pages fall on that spectrum.
2. Strong Calls to Action
Call-to-action statements are short phrases near the conversion button or link that tell visitors what to do. “Learn more today!” might sound cheesy, until you see how many more people do click to learn more.
3. Above the Fold
Harvard recently redesigned its online courses page after UX testing revealed that only a small fraction of visitors clicked on links below the fold. And that’s Harvard, where students would kill for an opportunity to take an Ivy League course. Chances are your clients are not beating down the door in quite the same way.
4. Easy on the Eyes
Landing page graphics provide a welcoming first impression and help visitors decide whether or not to read the accompanying text. The right image can make all of the difference, whereas the wrong one, well, you get the idea.
5. Mobile-Friendly
Not every industry or company or product needs to position itself for mobile traffic, but conversion rates from mobile devices are rising in many sectors, so it is at least worth asking the question: Are my landing pages mobile-friendly?
6. Crawlable and Indexed
Site operators act as quick indicators that search engines can crawl and index pages on the website. Copy your URL and paste it into Google with “site:” immediately before it. For example: site:http://www.domain.com/folder/page
* Note: PPC-only landing pages are sometimes designed not to appear in the search results.
7. Page Speed
Scoring Your Results
A perfect landing page score is 12 points: >9 is good, 6-9 is weak, and <6 means, well, it’s time to dedicate some attention to landing pages in need.
Other Landing Page Resources
Did you ever want micro-level geographic information inside Google Analytics? What if you really need “street level” knowledge about your users – like where they are, or what neighborhood they’re in? Often, when we talk and write about Google Analytics, we’re thinking about the big guys: national or even international traffic, filtering by country, comparing one region to another. We’re thinking macro, not micro.
I wrote previously comparing DMA areas to gain insight, but that’s really only helpful if you have a true national or bigger presence. What if you’re just a local Seattle business, and don’t really have much call for looking at traffic outside the Seattle-Tacoma metro area?
Well, the first thing you should do is think about taking our Seattle Google Analytics, AdWords, and Tag Manager Training (shameless plug). Second, read on…
Seattle is actually ahead of the game when it comes to data, which is the real reason I’m using them as an example. The city has a Chief Technology Officer, and data.seattle.gov was started in 2010 as a central hub for all local Seattle data. In fact, a number of businesses claimed that the use of this local data helped them with their businesses.
How so? Well, if you’re a local business then the traffic from, and information about, the Queen Anne neighborhood of Seattle might be more important to you than Downtown or Riverview.
But how can you use Google Analytics to help you on this sort of granular level? Also what if you DO care about national level data, but you care about it on a very granular local level as well, maybe looking for interest in your brand to help place billboards, or expand your franchising? The truth is that you can’t, at least not right out of the box. But with a few very easy additions, you can start getting some great local data that can let you make street level decisions about your business in Google Analytics.
Last month I talked about how you can use APIs to add insight to your data. I received great feedback on the post, and more than one person asked for more specific examples and ways that they could use APIs to expand their data in Google Analytics.
One of the things I mentioned in the previous post was using the Geocoding API to get more information about a person’s location from their address on a form, but there’s actually a cooler way to do it.
HTML5 has a Geolocation feature built in, which can grab a user’s latitude and longitude without needing to ask for their address or any other input (other than clicking OK). First, let’s look at what you currently get if you want to look at Seattle in Google Analytics. One close-up is the Metro map.
If you are a local business, this is next to useless. Maybe you care specifically about the metro and have a view focusing on it, rather than on random traffic from elsewhere that will never come to your storefront, but it’s still very broad. So you can instead drill down to the city level and look at actual traffic from within that Metro.
Once again, would it be a huge surprise to anyone that the traffic from Seattle far outweighs the other cities and towns on the map? If you’re looking for specific insight into the people in your city, and within different neighborhoods, this is again, fairly useless. If you’re a restaurant maybe you care about traffic from a few specific neighborhoods more than others. Maybe your own neighborhood traffic is more important. Maybe you’re a huge mega conglomerate and you want to see where people are searching for hardware supplies so you can gain insight on where a great new mega hardware store location in Seattle might best go. This map does not help you in those questions.
Enter the HTML5 Geolocation feature. It’s built into most modern browsers such as IE9+, Chrome, Firefox, Safari, Opera, and more. You can easily query the location information with a few lines of code and learn the user’s position. (I’ll show how in a bit.)
It works on desktop and mobile, and uses various ways to nail down where the user is currently browsing your site to a fairly precise and usually accurate location. You’ve probably seen it in action: Ever been on a website before and have it ask for your location, and then you can hit ok or decline it? That’s this feature.
That’s also the one major caveat: the browser MUST throw that alert question to users, they MUST agree to share their location, and in general you have no say in what that prompt says. Otherwise you don’t get the data. Consider your particular site and whether this may have an effect on conversion. You WILL lose some people when you pop open this feature; how many depends on your site and your audience. It’s a number you can figure out relatively easily by testing. Just be aware that it could affect your conversion, and those that stick around might decline to give you their location.
But what if you ask, and they allow? Well, you’d suddenly get the latitude and longitude of the users who approve of its use.
Our map of Seattle just got a lot more interesting using a great map tool provided by Darrin Ward. Instead of a big bubble with the number 521, we get a large number of coordinates which we can easily throw on a map of our own. Now we can start looking for neighborhood level patterns instead of just city or metro level ones.
For instance, maybe there is a ton of activity coming from the Queen Anne neighborhood, much more so than from other areas. Maybe this makes sense because that’s where your restaurant is, or maybe it means you want to think about putting a new store location there, or that your billboards there are working. Who knows what the insight is, that’s up to you and your business.
The point is that now you can start focusing on a neighborhood level, even a street-to-street one, in a big city, rather than just throwing your hands up in dismay at a blob on a map. So far, though, this is just the coordinates, exported and mapped completely outside of Google Analytics. We can also make the data easier to read in Google Analytics (though not on a map inside GA) by leveraging another API, like one from GeoNames, to grab the neighborhood names and their zip codes.
Import into Google Analytics
Now we’ve turned those coordinates into the actual neighborhood names and postal codes, and passed them into Google Analytics as Custom Dimensions. What makes postal codes interesting is that they are also targetable in Google AdWords. Thanks to this geodata now a local paid search campaign can target on a zip code level with actual data in Google Analytics backing them up.
Based on these numbers it looks like the 98119 zip code is where most of the action is at. You can track this in various ways, for instance you could send an event as well and have the label set to be the zip code. I’d recommend using an event anyway, just to ensure the API calls have been completed and to make sure you don’t hold up your primary tracker.
Or you can pass in the coordinates as a concatenated item (rounded down a bit to preserve a smidgeon of location anonymity for the paranoid and litigious… 3 decimal places keeps you within about a block). This list also makes it easy to export out the coordinates for placing in a map such as the one above.
How to Implement
So what does this look like in code? Like this (or thereabouts):
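The original snippet was embedded as an image, so here is a hedged reconstruction of the approach, assuming Universal Analytics (`ga`) is already on the page. The dimension indexes (10-13) match the setup described below, but the GeoNames lookup is only sketched in comments – check the API documentation for the real endpoints and response fields:

```javascript
// Round a coordinate to ~3 decimal places, which keeps the reported
// location within about a block.
function roundCoord(value) {
  return Math.round(value * 1000) / 1000;
}

// Set the session-level custom dimensions and fire a non-interaction
// event so the dimensions are carried without holding up the pageview.
function sendGeoData(lat, lon, neighborhood, postalCode) {
  ga('set', 'dimension10', String(lat));
  ga('set', 'dimension11', String(lon));
  ga('set', 'dimension12', neighborhood);
  ga('set', 'dimension13', postalCode);
  ga('send', 'event', 'Geolocation', 'Located',
     roundCoord(lat) + ',' + roundCoord(lon), {nonInteraction: true});
}

// Ask the browser for the user's position (this is what triggers the
// permission prompt), then enrich it before sending to Google Analytics.
if (typeof navigator !== 'undefined' && navigator.geolocation) {
  navigator.geolocation.getCurrentPosition(function (position) {
    var lat = position.coords.latitude;
    var lon = position.coords.longitude;
    // Look up the neighborhood name and postal code here, e.g. via the
    // GeoNames web services, then pass the results along:
    // sendGeoData(lat, lon, result.neighborhood, result.postalCode);
  });
}
```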
Ok, so here’s what we did:
- We set up four session-level custom dimensions in Google Analytics (in our case, dimensions 10-13) and named them Latitude, Longitude, Neighborhood, and Postal Code.
- We put the script above on the first page that users would hit.
You don’t really need to do anything else, besides obviously making sure that Universal Analytics is already on your page. When the user hits OK, the browser immediately has access to their coordinates. Our script then passes these to a couple of APIs, which return the postal code and neighborhood name. The script then sends this into Google Analytics by setting custom dimensions and firing an event.
As I mentioned above, I recommend separating this from your main tracker. Sometimes it takes a bit for the Geolocation to return information, or the APIs to send back their results. It’s best to track the pageview first on the page separate from this code, then send an event when you get information back from the API calls to set those session level dimensions. If you try and tie it to the pageview you’ll either end up holding up the pageview and losing tons of your tracking, or you’ll set up a race condition, and get zero of the geolocation information. So don’t try doing that.
Also, I don’t recommend doing this on every page on your site. Once you’ve gotten and set this data, a good option is to set a cookie on the user session. If the cookie is present, then you don’t execute the script. You don’t need to be processing this and hitting the APIs on every page. If you have a huge site, this also might overload some of the free APIs or ones with service limits, and they’ll shut you down. You’ll need to come up with a modified solution to take your traffic levels into account.
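A minimal sketch of that cookie guard, assuming a session cookie (the `geo_done` name is ours – pick anything that won’t collide):

```javascript
// True when the guard cookie is already present in the cookie string.
function hasGeoCookie(cookieString) {
  return /(?:^|;\s*)geo_done=1(?:;|$)/.test(cookieString);
}

// Session cookie (no expires), scoped to the whole site.
function geoCookieHeader() {
  return 'geo_done=1; path=/';
}

// In the page:
// if (!hasGeoCookie(document.cookie)) {
//   runGeolocation();                    // the script from above
//   document.cookie = geoCookieHeader(); // skip the APIs on later pages
// }
```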
On the whole though if you can get this information, it can be very useful for lots of different businesses. It’s like taking the Google Analytics geo reports and applying another level of magnification onto it (micro not macro, get it?). Plus it’s insanely easy to implement.
What do you think? How would this apply to your business as a local business, or a national business? Are you doing something similar on your site already? What other insight do you think this local drilldown could provide on yours? Comment below and let me know what you think!
Duplicate transactions are by far the most common issue I’ve come across with ecommerce sites; they can inflate revenue and ecommerce metrics, alter your attribution reports, and make you question your data integrity.
When talking about where to put the ecommerce tracking code, Google suggests the following for Universal Analytics:
“… If successful, the server redirects the user to a ‘Thank You’ or receipt page with transaction details and a receipt of the purchase. You can use the analytics.js library to send the ecommerce data from the ‘Thank You’ page to Google Analytics.”
The missing step here is to ensure that either A) the user cannot access the page more than once or B) you have logic in place to make sure the transaction is only sent once. The biggest issues I’ve seen are when this receipt page is automatically emailed to the customer, with the ability for them to return as frequently as they please, each time sending a duplicate transaction.
Many people incorrectly assume this is something that is handled through Google Analytics processing, or that if it does occur, it is a bug. In reality, it is simply an implementation issue and one that is often overlooked.
Within a session, Google Analytics will filter out duplicate transactions provided they have the same information. But if a visitor comes back later that day, or two weeks later, and another transaction is sent, then these will show up in your reports.
Check Your Data
How do we check if this issue exists in your ecommerce data? There’s a fairly simple Custom Report you can create to check for this. I’ve created a template which, if you have the appropriate permissions, you can import via this link, or you can search the Solutions Gallery for “Duplicate Transactions.” If you cannot import the Custom Report for some reason, simply create the report yourself with this setup (jpg).
Adjust your date range to at least a month. If you have Transaction IDs that have multiple transactions, then you’re either A) sending in duplicate transactions or B) reusing transaction IDs, both of which should be corrected.
When Do Duplicate Transactions Occur?
The following scenarios are the most likely culprits for sending in the duplicate information:
- Returning to the page via emailed link or bookmark
- Refreshing the page
- Navigating to a different page, and returning via back button
- Page restoring from a closed browser session or on a smartphone
As you implement your solution, try checking each of these scenarios to make sure you’re completely covered!
Server-Side Is Better
There are several schools of thought around fixing duplicate transactions. My background is more on client-side implementations, specifically with Google Tag Manager. However, in general, if you have the resources and time to spend, I would recommend handling this issue server-side.
Without going into specifics, I would add in some sort of server-side logic to ensure that the ecommerce analytics code is only delivered once to the page. This could be using a database to record and check to see if the ecommerce info has already been sent.
It could also be some sort of server-side variable that is similarly checked. Another option I’ve seen is to redirect the user away from the receipt page after the ecommerce info has been sent to Google Analytics, then preventing the user from returning to that page.
Sometimes a page refresh doesn’t require fully reloading the page from the server, however, so make sure to test all of the above scenarios.
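Without prescribing a stack, here is a minimal Node-style sketch of that server-side idea; the in-memory Set stands in for a real database table, and every name here is illustrative rather than a prescribed implementation:

```javascript
// Illustrative only: in production this check would query persistent
// storage keyed by transaction ID, not an in-memory Set.
var sentTransactions = new Set();

// Returns true only the first time a transaction ID is seen, so the
// ecommerce tracking snippet is rendered on the first receipt view and
// skipped on refreshes, back-button visits, and emailed-link returns.
function shouldRenderEcommerceTag(transactionId) {
  if (sentTransactions.has(transactionId)) {
    return false;
  }
  sentTransactions.add(transactionId);
  return true;
}
```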
A Two-Pronged Approach
Not all of us have access to the server though, and sometimes we just need a solution. My tactic for dealing with duplicate transactions uses two different methods to attempt to determine if the transaction has already been sent.
- A browser cookie records the transaction ID
- A timestamp on the transaction serves as a backup
Cookies by themselves can filter out most duplicate transactions, but can be less than 100% effective due to privacy settings and user preferences. Someone can clear their cookies, browse in incognito mode, or pull up the same receipt on two different devices.
For that purpose, I also use a timestamp to help determine how old the transaction is. This timestamp should come from the page immediately before the receipt page, so very little time should pass. We can set this to be 15 or 30 minutes to be safe, just in case there’s some kind of validation check or third party system before they hit the receipt.
Here is the general user flow that we’ll follow. We’ll check to see if a cookie with this transaction ID exists. If it does, then we know it’s a repeat transaction, and we won’t send the ecommerce information to Google Analytics.
If there’s no cookie, we’ll check the timestamp. If there’s no timestamp, then we know the transaction is days or weeks old, from before our new process went into place, so we’ll label it as missing.
If there is a timestamp, how old is it? If it’s more than 30 minutes old, then we’ll assume it’s an old transaction and we’ll label this as expired.
Lastly, if there’s no cookie and it’s been less than 30 minutes, we’ll call this a new transaction. We’ll set a new cookie on this computer and then proceed with the checkout as normal.
Stopping Duplicate Transactions via Google Tag Manager
To get the full functionality of this solution, you will need access to update the site or have a developer you can call to help you get the timestamp into place.
We will need to create the following inside of Tag Manager:
- MACRO – Data Layer Variable – “transactionId”
- MACRO – Data Layer Variable – “timeStamp”
- RULE – “Receipt Page – Transaction Present”
- TAG – Custom HTML – “Duplicate Transaction Checking”
- MACRO – Data Layer Variable – “transactionType”
- RULE – “New Transactions Only”
- TAG – Google Analytics – “GA Ecommerce Transaction”
MACRO – Data Layer Variable – “transactionId”
This macro will simply return the transaction ID from the correctly formatted data layer on the receipt page.
MACRO – Data Layer Variable – “timeStamp”
This will be something we need to add to the site itself. We need to get the timestamp of the transaction, either from some server-side code, or by passing this value through the submit form. Either way, this piece does require you to update the site itself.
Then we can use a simple macro to pull out this value.
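For illustration, the receipt page’s data layer might look like the following (classic GTM syntax; the timestamp format and the `timeStamp` field name are assumptions, and the value must be written by your server or carried through the checkout form):

```javascript
// Receipt-page data layer, declared before the GTM container snippet.
// timeStamp records when the order was actually placed.
var dataLayer = [{
  'transactionId': '1234',
  'transactionTotal': 38.26,
  'timeStamp': '2014-07-01T15:32:00Z'
}];
```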
RULE – “Receipt Page – Transaction Present”
This rule checks to see if we’re on the receipt page and if there is a transaction present.
TAG – Custom HTML – “Duplicate Transaction Checking”
Here is where the magic happens. This Custom HTML will take care of all of the work, checking for cookies, setting cookies, and checking the timestamp. The result is then pushed to the data layer with a custom event.
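The logic inside that tag boils down to something like this sketch; the cookie prefix, event name, and 30-minute window are my own assumptions, which you can adjust:

```javascript
// Classify the transaction as new / repeat / expired / missing.
// cookieString is document.cookie; now is the current time in ms.
function classifyTransaction(transactionId, timeStamp, cookieString, now) {
  if (cookieString.indexOf('trans_' + transactionId + '=') > -1) {
    return 'repeat';    // cookie found: this transaction was already sent
  }
  if (!timeStamp) {
    return 'missing';   // no timestamp: predates the new process
  }
  var ageMinutes = (now - new Date(timeStamp).getTime()) / 60000;
  if (ageMinutes > 30) {
    return 'expired';   // older than our 30-minute window: old transaction
  }
  return 'new';         // safe to send, and time to set the cookie
}

// Inside the Custom HTML tag you would then set the cookie for new
// transactions and push the result to the data layer, e.g.:
// document.cookie = 'trans_' + id + '=1; path=/';
// dataLayer.push({ 'event': 'transactionChecked', 'transactionType': type });
```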
MACRO – Data Layer Variable – “transactionType”
Now that the Tag has checked if the transaction is a duplicate, we’ll use this macro to pull out the result.
RULE – “New Transactions Only”
We’ll create a rule that uses the event and transaction type that gets pushed from the Duplicate Transaction Tag.
TAG – Google Analytics – “GA Ecommerce Transaction”
Finally, put it all together with a Google Analytics transaction Tag, with the Firing Rule set to “New Transactions Only.”
That’s all there is to it! It’s a little complicated, but when you break it down step by step, it should make sense logically. I would recommend setting up a test property to send these ecommerce transactions to until you’re sure that this is working properly, then, make the switch at a time when there are few people using the site.
Questions/comments? Did you have duplicate transactions on your site?
If you haven’t heard about Google AdWords Remarketing by this point (1) get out from under that rock and (2) get back to the basics. Generally speaking, advertisers place a tag of some sort on their website. When a user reaches your website, the code snippet is fired on each pageview and subsequently cookies the user’s browser. The advertiser then creates audience lists which reference these cookies by defining particular eligibility conditions.
If you’re familiar with Remarketing then you might want to consider your options moving forward. Perhaps there’s a different implementation that is more appealing to you now. Maybe you didn’t even know that you have options! Check out the list below to learn what you can do.
Many of the options in the overview below are closely related and it’s very important that you understand your current setup before implementing any new Remarketing tags on your site. Consider each of these before you implement any new Remarketing or make sure that you weigh your options before choosing one over another. Of course, you should always review code changes with your development team before changing any code on your site.
Google AdWords Remarketing
This is the most traditional Remarketing setup. Advertisers generate the Remarketing pixel in the Audiences section of their account. The Audience section can easily be found in the Shared Library in your left-hand navigation.
After you generate the code, log in to your website’s CMS and paste the Remarketing tag before the closing </body> tag on EVERY page across your website. The best way to do this is to paste the tag into a common footer element, but if you do not have one you will need to edit each page individually.
Once you’ve completed these steps you are ready to start building audiences in Google AdWords. If you’re new to Remarketing you may want to review these essential rules.
Google AdWords + Google Tag Manager Remarketing
This option is only available if you have Google Tag Manager implemented on your website (or if you intend to do so in the future). Google Tag Manager enhances the marketing-development process by allowing quick edits to small snippets of code on your website – in this case, Remarketing tags.
I can’t count the number of times that I’ve had marketing projects held up because of a development issue. Thus, Google Tag Manager. Learn more about Google Tag Manager.
To set up Google AdWords Remarketing via Google Tag Manager, create a new tag using the AdWords Remarketing tag type, enter your Conversion ID, and set it to fire on all pages.
And, just like that, Remarketing is now implemented across all pages of your website. I bet after doing that brief exercise that you can see how easy and flexible Google Tag Manager really is.
Google Analytics Remarketing
Why stop with just Google AdWords and Google Tag Manager? Their big brother, Google Analytics, also wants in on the Remarketing action.
With some small changes to your Google Analytics tracking code, you can take the Remarketing experience to the next level. Imagine building audiences based on almost any dimension or metric available. Well, you can realize that dream with Remarketing through Google Analytics. There are almost infinite possibilities that can be created with Remarketing through GA.
There are currently two ways we can enable Remarketing in Google Analytics. The first method, as explained by my captious colleague Sayf Sharif, requires users to upgrade tracking code to the dc.js variation. We won’t go into much detail here as Sayf has provided a thorough explanation.
Option two, and the better of the two, is to upgrade your Google Analytics to Universal Analytics. Universal Analytics has Remarketing built-in, so if you’re looking to upgrade soon (and we’d recommend upgrading soon) this might be the best option for you at this time. If you’re looking for reasons to upgrade, view this handy overview made available by another of my GA cohorts, Jon Meck.
Universal Analytics & Google Tag Manager Remarketing
Finally, if you’re looking for the most streamlined and sophisticated option out there, implement Universal Analytics via Google Tag Manager to start off your Remarketing strategy. This option will provide you with all the power of Google Analytics Remarketing lists while still providing the opportunity to take advantage of the AdWords/GTM combination in the future if you see fit.
To get started with Universal Analytics and Google Tag Manager, you will create a Universal Analytics tag type in GTM and publish it across all pages of your site.
You’ll of course need to fill in the important details like Web Property ID, etc. If you currently have Google Analytics on your site, then you’ll also want to plan this transition carefully. Check out Alex Moore’s Universal Analytics Survival Guide for thoughts on making the switch.
Of course if you need assistance with your Universal implementation or advice on which form of Remarketing to use, you can always contact us.
Which Remarketing option is right for you? Discuss in the comments below.
You’ve heard the term “statistical significance.” But what does it really mean? I’m going to try to explain it as clearly and plainly as possible.
Suppose you run two different versions of an ad, and you want to know if the click-through rate was different (or you are comparing two different landing pages on bounce rate, or two campaigns on conversion rate). Ad A has a click-through rate of 1.1%, Ad B is 1.3%. Which one is better?
Seems like an easy answer: 1.3% > 1.1%, so Ad B is better, right? Well, not necessarily.
Consider a quarter
Suppose you have a quarter (and it’s a fair quarter, no tricks). The rate of getting heads when you flip should be 50%, right? If you flipped the coin an infinite number of times, you could expect it to come out heads half the time. Unfortunately in web analytics, we don’t have time to flip the quarter an infinite number of times. So maybe we only flip it 1000 times, and we get 505 heads and 495 tails. Do we conclude that heads are more likely than tails? What if we only flip it 100 times, or 10?
You can see that sometimes, the difference we measure is merely due to chance, not to a real difference.
When we measure something, there are two ways we could be wrong:
- There could be no difference between A and B, but we think we see a difference (false positive)
- There could be a real difference between A and B, but we fail to see it (false negative)
The significance level tells us the chance of making a false positive: that is, how likely it is that a difference we saw was due merely to chance. This number is expressed as a probability (between 0 and 1, or a percentage between 0% and 100%) and is referred to by the letter p.
In our quarter example above with 1000 flips, p = 0.76, meaning there’s a 76% probability the difference was due merely to chance (we’ll see how to compute this in a minute). That’s pretty high! You get to decide what chance is low enough for you to be comfortable with, but a common choice is p < 0.05, meaning there’s less than a 5% chance we’re wrong (or we’re wrong less than about 1 time in 20). Given that standard, we would say the difference in the rate of heads and tails is “not statistically significant.”
So, statistical significance is related to the chance of a false positive. What about false negatives? There’s a related concept called the statistical power or sensitivity, which can help you estimate up front how large a sample size you need to detect a difference (how many times you need to flip the coin). Statistical power is a bit more complicated, so we’ll save it for another time.
Doing the calculation: the chi-squared test
OK, so how can we find the p-value for our ad test? We can use a statistical test called “Pearson’s chi-squared (χ²) test” or the “chi-squared test of independence” or just “chi-squared test”. Don’t worry; you don’t actually have to know any fancy math or Greek letters to do this. (To sound smart, you should know that “chi” rhymes with pie and starts with a “k”. Not like Chi-Chis, the sadly defunct Mexican restaurant.)
The chi-squared test only applies to a categorical variable (yes/no or true/false, for example), not to a continuous variable (a number). Although you might look at your web analytics and say, “I have all numbers!”, in fact many of your metrics are hidden categorical variables. Click-through rate is just a percentage measurement for a yes/no situation: they clicked or they didn’t. Bounce rate: they bounced or they didn’t. Conversion rate: they converted or they didn’t. And conveniently, those are probably three of the most common metrics you’d want to test.
To perform the chi-squared test, you need the number of successes and failures for each variation, arranged in what’s called a contingency table: one row per variation (Ad A, Ad B), and one column each for successes (clicks) and failures (no clicks).
Here’s an easy web page where you can fill in your categories and numbers and it will calculate the chi-squared test and give you the p-value. (Remember, lower is better for the p-value, and you should pick some threshold like p < 0.05 that you are going to consider as significant.)
You can also do this in Excel using the CHISQ.TEST function, should you need to.
Either way, the results you get will be something like this:
The “chi-squared statistic” and “degrees of freedom” are just the values the fancy (#notthatfancy) math of the test uses to calculate the part we really care about: the p-value. In this case, you can see that if my ad test was based on only 1000 impressions for each of Ad A and Ad B, p = 0.68. Not significant at the p < 0.05 level.
If my ad test was instead based on 100,000 impressions for each of Ad A and Ad B, p = 0.00004. Definitely significant at the p < 0.05 level! The number of times we flipped the coin makes a big difference.
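You can also do this calculation by hand with the shortcut formula for a 2×2 table, chi-squared = n(ad - bc)² / ((a+b)(c+d)(a+c)(b+d)), and compare the statistic to 3.841, the critical value for p < 0.05 at 1 degree of freedom. A sketch using the ad numbers above:

```javascript
// Pearson's chi-squared statistic for a 2x2 contingency table:
// a, b = successes/failures for variation A; c, d = for variation B.
function chiSquared2x2(a, b, c, d) {
  var n = a + b + c + d;
  var numerator = n * Math.pow(a * d - b * c, 2);
  var denominator = (a + b) * (c + d) * (a + c) * (b + d);
  return numerator / denominator;
}

var CRITICAL_05 = 3.841; // critical value for p < 0.05 at 1 degree of freedom

// 1,000 impressions each: Ad A 11 clicks, Ad B 13 clicks
var small = chiSquared2x2(11, 989, 13, 987);         // ~0.17: not significant
// 100,000 impressions each at the same click-through rates
var large = chiSquared2x2(1100, 98900, 1300, 98700); // ~16.9: significant
```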
Be careful of repeated comparisons
Remember, p = 0.05 would mean we’d see a difference where there actually isn’t one about 1 time out of 20, on average. You need to be careful of this if you do repeated comparisons on the same data. For example, suppose we throw Ad C into the mix in addition to Ad A and Ad B. Now we can make 3 comparisons: A vs. B, A vs. C, and B vs. C. But we’ve also compounded the chances we’ve made an error. If we made 20 comparisons, we’d likely be wrong on one of them. (That’s slightly simplistic, since it depends on whether there really are differences between the ads and how big the differences are, but you get the idea.)
You can take care of this with the fancy sounding “Bonferroni correction”, which is actually very simple. It says this: if you are making n comparisons, divide your p-value threshold by n. So if we were looking for p < 0.05 but we were making 3 comparisons, we’d divide 0.05 / 3 = 0.017, using that as our threshold for significance.
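The correction is literally one division (this sketch just restates the arithmetic above):

```javascript
// Bonferroni-corrected significance threshold for n comparisons:
// divide the usual threshold (alpha) by the number of comparisons.
function bonferroniThreshold(alpha, comparisons) {
  return alpha / comparisons;
}

// bonferroniThreshold(0.05, 3) is roughly 0.017, as in the example above
```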
Really the only test I need to know?
OK, so no. The chi-squared test only works for categorical variables, not for continuous variables like pages per session or time on site. So for variables like those, you probably need a t-test instead (we’ll save that for a future installment).
And, if you get enough statisticians together in a room, they will tell you about all sorts of ways your chi-squared test is inadequate. You should use the Yates correction, or a G-test, or Fisher’s exact test. You need McNemar’s test for repeated observations on the same subjects.
To this I say: yeah, yeah, yeah. Fisher’s exact test is better, but really only matters for small sample sizes (like, say, 20). What are you doing testing ads with only 20 impressions? We’re not looking for cancer in rats here. You’re wasting your time. Repeated observation with the same subjects? Again, maybe in a clinical trial, not so much in web analytics. Your chi-squared test is just fine, plus it has the advantage of being widely known, used, and understood, so you can communicate the results to others without spending time haggling over your statistical methods and just get to the point.
In any case, I hope this has been an illuminating look at statistical significance, so that you can finally start to compare those numbers in your web analytics and say, are these really different?
MozCon is a three-day marketing conference put on by Moz.com. The conference brings together next-level speakers to talk about everything from SEO to brand development to analytics. This year Erica McGillivray and team will bring 29 speakers to the Emerald City to give their expert opinions on the future of marketing. It is a jam-packed three days, so I have outlined eleven of the people I am most excited to see, along with some of their own reasons you should watch them.
When: July 14-16, 2014
1. Kerry Bodine – Broken Brand Promises: The Disconnect Between Marketing and Customer Experience
Description: Companies chase the business benefits of customer experience, but advertising and marketing communications that aren’t aligned with the true capabilities of the organization foil these efforts.
Reason to Watch: She applies her knowledge of human-computer interaction to design interfaces for websites, mobile apps, wearables, and robots and aligns it with marketing communications. Very cool. Given the increased focus on mobile this year, I will be interested to see how much of her presentation revolves around mobile and cross-device experiences.
Can’t Make MozCon? Check out her presentation for free on July 9th through EventBrite.
2. Lindsay Wassell – Improve Your SEO by Mastering These Core Principles
Description: Discover how SEO tactics that win in the long run complement web-friendly business practices and core principles, and how to incorporate this approach into optimization strategies for changes in search results.
Reason to Watch: Reviewing the basics is an essential part of continued success, especially in an industry where basic practices change so often. I am looking forward to seeing how she meets SEO basics with a business’s core principles.
Lindsay’s Reason to Watch:
— Lindsay Wassell (@lindzie) June 27, 2014
3. Cindy Krum – Mobile SEO Geekout: Key Strategies and Concepts
Description: Learn all the technical nuances necessary to make your websites rank and perform well in mobile and tablet search!
Reason to Watch: One of the big takeaways LunaMetrics’ own Chris Vella had from SMX Advanced this month was the focus on mobile. Admittedly I am a little weak in this area, so I look forward to learning a lot about optimizing for a mobile user.
Cindy’s Reason to Watch
@seanmcquaide Google says they are v.close to getting more than 50% of their searches from mobile, but lots-o-people still ignore it in SEO
— Cindy Krum (@Suzzicks) June 25, 2014
4. Mike Ramsey – Local Lessons from Small Town USA
Description: Whether your audience is in one region or thousands of major metros across the world, these small town lessons will guide you through the complex world of local search.
Reason to Watch: These days the user’s location has a big effect on the results they see. As Google’s contextual search grows, so will the need for marketers to take advantage of location to stay in front of their clients’ customers. I’ll be interested to see the similarities and differences between single- and multi-location businesses.
Mike’s Reason to Watch:
@seanmcquaide my boyish good charm? And some sweet research that will blow some local minds.
— Mike Ramsey (@MikeRamsey) June 27, 2014
5. Pete Meyers – How to Never Run Out of Great Ideas
Description: Learn how to stay afloat in the coming flood of content, as Dr. Pete provides concrete tactics for sustainably creating high-value content.
Reason to Watch: “Creativity is the power to connect the seemingly unconnected.” -William Plomer. Yes I pulled that from BrainyQuotes and had to Google who William Plomer was, but it is exactly what I’m looking forward to in Dr. Pete’s presentation.
6. Mark Traphagen – Google+ Game of Thrones: Claiming Your Kingdom for Brand Dominance
Description: Be the ruler of your vertical by claiming uncharted ground in Google+ to dragon-power your brand’s Google influence.
Reason to Watch: Who isn’t a sucker for a Game of Thrones themed presentation? The power of Google+ is well-known in the SEO industry, but few are leveraging its full potential, including me. We all know building an audience on Google+ gives you an edge in search results. I am looking forward to hearing Mark’s strategy for finding and growing that audience and then leveraging them like a Lannister. (Of course you should never “leverage” an audience, but I couldn’t resist the alliteration)
Mark’s Reason to Watch:
@seanmcquaide Let's go for funny. How 'bout "Has singlehandedly kept Google+ alive since 2011"
— Mark Traphagen (@marktraphagen) June 25, 2014
7. Justin Briggs – Talking Back to Conversational Search
Description: Looking at how conversational search and knowledge graph are changing how users search and engage with content, Justin will talk about implementing entities at enterprise scale.
Reason to Watch: I’ve been hooked on the idea of conversation search since the first time I set my eyes on the MotoX. Although it’s cool and useful for the end user, it generates many challenges and questions when it comes to SEO. As Google continues to expand the knowledge graph I will be interested to hear how we can take advantage of Google’s growing intelligence.
8. Paddy Moogan – Beyond SEO – Tactics for Delivering an Integrated Marketing Campaign
Description: Everyone talks about the need for SEOs to diversify, but Paddy will give you actionable tips to go away and do it, no matter what your current role is.
Reason to Watch: Diversifying as a professional is a difficult process that takes a lot of thought and energy. Knowing how to take the first step can be the difference between powerhouse and scatterbrained. I look forward to hearing Paddy’s thoughts on how to start along the right path.
@seanmcquaide it will give MozCon attendees lots of actionable tips they can go away and use straight away
— Paddy Moogan (@paddymoogan) June 26, 2014
9. Richard Baxter – Developing Your Own Great Interactive Content – What You’ll Need to Know
Description: Even if you’re not a technical genius when it comes to interactive, front end web development projects, Richard will show you how to make something the Internet loves from ideation and conceptualization to rapid prototyping, launch, and huge coverage.
Reason to Watch: You know what Google can’t suck into a knowledge graph? Unique interactive content. Did I just jinx it? I look forward to taking the skills learned in Dr. Pete’s “How to Never Run Out of Great Ideas” presentation and bringing them to life with this one.
10. Dana DiTomaso – Prove Your Value
Description: Dana will show you how to report so there’s no doubt in your client’s mind that they’d be lost without you.
Reason to Watch: “Make everything as simple as possible, but not simpler.” -Albert Einstein. I have made great strides at increasing the quality of our SEO reports at LunaMetrics. But still I am always looking for what’s missing or rather what needs to be removed. Reporting is a topic which rarely floats across news feeds, but is an essential part of how we prove our worth to clients. I’m looking forward to any and all “Why didn’t I think of that!” moments Dana may deliver.
11. Rand Fishkin – Mad Science Experiments in SEO & Social Media
Description: Whether it’s anchor text or sharing on Google+ instead of Facebook, Rand’s spent the last few months formulating hypotheses and running tests, and now he’ll share these fascinating results to help you.
Reason to Watch: Last but certainly not least is Experiments in SEO. Rand will be presenting results from his own experiments, as well as insights from IMEC Labs and others. I participate in IMEC so I’ll be excited to see the impact of all the random queries and link clicks I’ve done in the past month.
@seanmcquaide Because the experiments I'll be showing off in my presentation will confirm a few theories and debunk a number of others.
— Rand Fishkin (@randfish) June 25, 2014
Tweets around #MozCon
The news that Google plans to drop Authorship photos from the search results was unexpected, but probably not shocking to many SEOs. It will be done in an effort to provide a better mobile experience by decluttering the results, said Google’s John Mueller.
All speculation and mourning aside, we are curious how this affects Google users that have grown accustomed to Authorship photos and Circle information.
Our office is divided:
“The author influences my click a lot for work-related informational searches,” Reid Bandremer said.
“If I’m being honest, no. I scan the title and description to see if the result matches my true intent then look at the author to make a click decision,” Sean McQuaide said.
Do Authorship photos and information influence your behavior when searching on Google? Please complete the poll below to view the results.
We avoided gossip in the post so please use the comments to let us know what you’re thinking about Google’s update.
Big Query and Big Query Export for Google Analytics give us the power to visualize and explore virtually any trend in our GA data. It’s really quite powerful stuff. Because this tool is still very new, I want to get the conversation started on how advanced reporting can augment our digital analytics.
In this post I discuss data mining and the advanced reporting of Google Analytics data. I provide an R script for generating an E-commerce report with visualizations that are not possible within Google Analytics.
My colleague Jonathan Weber has a nice article on Big Query and Google Analytics Data Export. Check this out for an overall refresher on Big Query, including billing information.
What are the Use Cases for Big Query in GA?
There are two big reasons to use Big Query to process and report on Google Analytics data:
- Sampling. Even for Premium customers, usage of advanced segments and non-standard secondary dimensions can lead to sampling. This can greatly affect the accuracy of quarterly, and even monthly, reporting.
- Data mining and advanced reporting. Google Analytics is, in our opinion at LunaMetrics, the best overall technology for digital analytics. But, like anything, it has its shortcomings. One of these is the customizability of reporting. GA’s ease of use and shallow learning curve also mean that additional tools are needed for data mining and more-complex analysis.
Cue Big Query. Our ability to generate complex metrics and mine trends is limited only by our knowledge of SQL, and the flexibility of our visualizations only by the limits of R, ShufflePoint, Tableau…a wide range of solutions.
Note: As a general cloud service, Big Query is not exclusive to GA Premium customers. However, Google Analytics Premium IS required in order to export complete GA data to Big Query. So, if you are not Premium, you can still upload data to Big Query and use it as a cloud solution for SQL-like data analysis. But you need Premium for working with GA data.
Overview of the Report
The R script should function out-of-the-box for any GA Premium customers with E-commerce data (and with the Big Query export enabled).
If you don’t have Google Analytics Premium, but are still interested in data mining and advanced reporting for Google Analytics, you can still interface with R! Check out the R for Google Analytics library. By combining their starter template with the code for the graphs from my script, you should be able to achieve similar functionality. You will also need to conduct additional processing of this data in R, since you are unable to take advantage of the SQL-like querying of Big Query. Additionally, those with a high traffic volume in GA will likely experience sampling.
One last note. I owe a shout-out to Hadley Wickham for writing the Big Query R library.
Here is the full report generated from the R script (scroll down for a walk-through):
Figure 1: Average pageviews with transaction versus without transaction
You could also get this information by using advanced segments in Google Analytics, but it’s nice to get it unsampled and to be able to generate box-and-whisker plots.
Figure 2: Average Revenue/Session by Medium
This is something you could calculate from Google Analytics, but it is not readily available as a metric. In this chart we graph two metrics simultaneously; average revenue is given by height, and the weight of each channel (sessions) is given by color.
Figure 3: Top Campaigns by Revenue/Session
This chart is similar to the prior, except that we compare campaigns rather than medium. Again, the height shows the average revenue/session, and the color displays the weight of each campaign.
Figure 4: Product Category of Purchase by Medium (a)
Here, we group products purchased by their category. Then we break down the sales data for each category by medium.
Figure 5: Product Category of Purchase by Medium (b)
This displays the same information as the prior chart, but visualized differently. The darkness of the tile indicates the number of products sold for a given product category and medium.
Figure 6: Likelihood of product to indicate and/or lead to additional transactions
This is my favorite chart. It is something we would never be able to calculate in Google Analytics. Here, we examine the products associated with a user’s initial transaction. We then calculate the percentage of users who made at least one additional transaction. And we segment this data by the initial product purchased. Thus, (pending a statistical test), we see that certain products may be indicators and/or causes of additional future purchases. Maybe they’re really great products and win the customer’s loyalty.
R Script for Big Query E-commerce Report
# Install and load the required packages
install.packages('devtools')
install.packages('httpuv')
devtools::install_github("hadley/assertthat")
devtools::install_github("hadley/bigrquery")
library(bigrquery)
require(gridExtra)
require(ggplot2)

project <- "INSERT PROJECT ID HERE"
dataset <- "INSERT DATASET ID HERE"

#####################################
# Big Query queries
# note: you also need to set your
# dataset id for TABLE_DATE_RANGE in
# each query below. replace "dataset"
# with the id
#####################################

sql1 <- "SELECT date, totals.pageviews AS total_pageviews_per_user,
FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
  TIMESTAMP('2014-06-01'), TIMESTAMP('2014-06-14')))
WHERE totals.transactions >= 1 AND totals.pageviews >= 0
ORDER BY fullVisitorId LIMIT 1000"
data1 <- query_exec(project, dataset, sql1, billing = project)

sql2 <- "SELECT date, totals.pageviews AS total_pageviews_per_user,
FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
  TIMESTAMP('2014-06-01'), TIMESTAMP('2014-06-14')))
WHERE totals.transactions IS NULL AND totals.pageviews >= 0
ORDER BY fullVisitorId LIMIT 1000"
data2 <- query_exec(project, dataset, sql2, billing = project)

sql3 <- "SELECT trafficSource.medium AS medium,
count(*) AS sessions,
sum(totals.transactionRevenue)/1000000 AS total_rev,
sum(totals.transactionRevenue)/(count(*)*1000000) AS avg_rev_per_visit
FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
  TIMESTAMP('2014-06-01'), TIMESTAMP('2014-06-07')))
GROUP BY medium
ORDER BY avg_rev_per_visit DESC LIMIT 10;"
data3 <- query_exec(project, dataset, sql3, billing = project)

sql4 <- "SELECT trafficSource.campaign AS campaign,
count(*) AS sessions,
sum(totals.transactionRevenue)/1000000 AS total_rev,
sum(totals.transactionRevenue)/(count(*)*1000000) AS avg_rev_per_visit
FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
  TIMESTAMP('2014-06-01'), TIMESTAMP('2014-06-07')))
GROUP BY campaign
HAVING sessions >= 10
ORDER BY avg_rev_per_visit DESC LIMIT 10;"
data4 <- query_exec(project, dataset, sql4, billing = project)

sql5 <- "SELECT trafficSource.medium AS medium,
hits.item.productCategory AS category,
count(*) AS value
FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
  TIMESTAMP('2014-06-01'), TIMESTAMP('2014-06-02')))
WHERE hits.item.productCategory IS NOT NULL
GROUP BY medium, category
ORDER BY value DESC;"
data5 <- query_exec(project, dataset, sql5, billing = project)

# First purchases by users who went on to make more than one transaction
sql7 <- "SELECT prod_name, count(*) AS transactions
FROM (
  SELECT fullVisitorId, min(date) AS date, visitId,
    hits.item.productName AS prod_name
  FROM (
    SELECT fullVisitorId, date, visitId, totals.transactions,
      hits.item.productName
    FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
      TIMESTAMP('2014-06-01'), TIMESTAMP('2014-06-14')))
  )
  WHERE fullVisitorId IN (
    SELECT fullVisitorId
    FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
      TIMESTAMP('2014-06-01'), TIMESTAMP('2014-06-14')))
    GROUP BY fullVisitorId
    HAVING SUM(totals.transactions) > 1
  )
  AND hits.item.productName IS NOT NULL
  GROUP BY fullVisitorId, visitId, prod_name
  ORDER BY fullVisitorId DESC
)
GROUP BY prod_name
ORDER BY transactions DESC;"
data7 <- query_exec(project, dataset, sql7, billing = project)

# All transactions by product
sql8 <- "SELECT hits.item.productName AS prod_name,
count(*) AS transactions
FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
  TIMESTAMP('2014-06-01'), TIMESTAMP('2014-06-14')))
WHERE hits.item.productName IS NOT NULL
GROUP BY prod_name
ORDER BY transactions DESC;"
data8 <- query_exec(project, dataset, sql8, billing = project)

#################################
# processing
#################################

# Share of each product's transactions made by repeat purchasers
data9 <- merge(data7, data8, by.x = "prod_name", by.y = "prod_name")
data9$perc <- data9$transactions.x / data9$transactions.y
data9 <- data9[with(data9, order(-transactions.y, perc)), ]
data9 <- data9[1:10, ]

##################################
# BEGIN PLOTTING
##################################

pdf("my first bigquery-r report.pdf", width = 8.5, height = 11)

# shared y-axis limit so plots 1 and 2 are directly comparable
y_max <- max(c(data1$total_pageviews_per_user,
               data2$total_pageviews_per_user), na.rm = TRUE)

# plot 1
p1 <- ggplot(data1, aes(paste(substr(date, 1, 4), "-", substr(date, 5, 6), "-",
  substr(date, 7, 8), sep = ""), total_pageviews_per_user)) + geom_boxplot()
p1 <- p1 + labs(title = "Avg Pageviews for Users with Purchase",
  x = "Date", y = "Pageviews")
p1 <- p1 + ylim(c(0, y_max))
p1 <- p1 + theme(axis.text.x = element_text(angle = 35, hjust = 1))

# plot 2
p2 <- ggplot(data2, aes(paste(substr(date, 1, 4), "-", substr(date, 5, 6), "-",
  substr(date, 7, 8), sep = ""), total_pageviews_per_user)) + geom_boxplot()
p2 <- p2 + labs(title = "Avg Pageviews for Users without Purchase",
  x = "Date", y = "Pageviews")
p2 <- p2 + ylim(c(0, y_max))
p2 <- p2 + theme(axis.text.x = element_text(angle = 35, hjust = 1))
grid.arrange(p1, p2, nrow = 2)

# plot 3
p3 <- ggplot(data3, aes(x = medium, y = avg_rev_per_visit,
  fill = sessions, label = sessions)) +
  geom_bar(stat = "identity") +
  scale_fill_gradient(low = "grey", high = "blue")
p3 <- p3 + geom_text(data = data3, aes(x = medium,
  label = paste("$", round(avg_rev_per_visit, 2)), vjust = -0.5))
p3 <- p3 + labs(title = "Avg Revenue/Session by Medium",
  x = "medium", y = "Avg Revenue per Session")
p3 + theme(plot.title = element_text(size = rel(2)))

# plot 4
p4 <- ggplot(data4, aes(x = campaign, y = avg_rev_per_visit,
  fill = sessions, label = sessions)) +
  geom_bar(stat = "identity") +
  scale_fill_gradient(low = "grey", high = "blue")
p4 <- p4 + geom_text(data = data4, aes(x = campaign,
  label = paste("$", round(avg_rev_per_visit, 2)), vjust = -0.5))
p4 <- p4 + labs(title = "Top Campaigns by Rev/Session (min 10 sess)",
  x = "campaign", y = "Avg Revenue per Session")
p4 <- p4 + theme(plot.title = element_text(size = rel(2)))
p4 + theme(axis.text.x = element_text(angle = 25, hjust = 1))

# plot 5
p5 <- qplot(category, data = data5, weight = value, geom = "histogram",
  fill = medium) + coord_flip()
p5 <- p5 + labs(title = "Category of Product Purchase by Medium (a)",
  x = "Product Category", y = "Transactions")
p5 <- p5 + theme(plot.title = element_text(size = rel(2)))
p5 + theme(axis.text.x = element_text(angle = 25, hjust = 1))

# plot 6
p6 <- ggplot(data5, aes(medium, category)) +
  geom_tile(aes(fill = value)) +
  scale_fill_gradient(name = "transactions", low = "grey", high = "blue")
p6 <- p6 + labs(title = "Category of Product Purchase by Medium (b)",
  x = "medium", y = "product category")
p6 + theme(plot.title = element_text(size = rel(2)))

# plot 7
p7 <- ggplot(data9, aes(x = prod_name, y = perc,
  fill = transactions.y, label = perc)) +
  geom_bar(stat = "identity") +
  scale_fill_gradient(name = "transactions", low = "grey", high = "blue")
p7 <- p7 + geom_text(data = data9, aes(x = prod_name,
  label = paste(round(perc * 100, 1), "%"), vjust = -0.5))
p7 <- p7 + labs(title = "Percentage of Transactions resulting in Future\nAdditional Transactions by same User",
  x = "Product Name", y = "Percentage of Transactions")
p7 + theme(axis.text.x = element_text(angle = 45, hjust = 1))

dev.off()
The SEO sessions at SMX Advanced 2014 were filled with great wisdom and advice from several of the industry’s most authoritative thought leaders. After attending two full days of back-to-back sessions, I found that the multitude of topics discussed in depth made it challenging to distinguish the key takeaways.
I reviewed my notes and live tweets and found a few quotes that stuck with me following the conference. Here were the most intriguing thoughts at SMX Advanced 2014:
“Mobile is important, and coming faster than most people in this room realize.”
- Matt Cutts (@mattcutts)
Matt Cutts is the head of Google’s Webspam team, but by the way the attendees hung on his every word during the “You&A With Matt Cutts” session, one could easily have confused him for a superstar of sorts. Throughout the session, Matt’s tone stayed uniformly mellow, with the exception of a few statements like the one above. When this simple sentence was delivered, it struck me as a warning.
Perhaps I read too far into it, but for a moment I had a glimpse into the future, a wake-up call. My thinking shifted from seeing mobile optimization as an important aspect of SEO to seeing it as the future of search. If your business, website, or team doesn’t have a mobile optimization strategy in place by now, you’re already late to the game. Heed Matt’s warning and realize, today and now, that mobile IS important and that its heyday may have already dawned.
“People want the story, not the data.”
- Kerry Dean (@kerrydean)
Sometimes it’s easy for us to get lost in the sea of numbers and charts we digest on a daily basis as analysts of one thing or another. Kerry Dean reminded me during his energetic and hilarious presentation that a data analyst’s job is not just spitting out charts and tracking growth. We’re employed to tell the data’s story, and that’s what makes us valuable.
We answer the who, what, when, where and why when few others can. We turn data into insight and that’s what people and organizations want, not just the numbers. Tell the story.
“Publish data that has meaning to machines, beyond keywords.”
- Jay Myers (@jaymyers)
If you work in search marketing, keywords are your currency, and cash is king indeed. Publishing data that has meaning to machines beyond keywords, however, cannot be discounted as a means of increasing a search engine’s understanding of your site’s content. Jay Myers of Best Buy was referring specifically to structured data when he summarized his presentation’s message with this statement.
Because of opportunities like rich snippets in search engine results and Google’s progressing ability to understand and use structured data, it’s becoming increasingly beneficial to publish structured data markup. Marshall Simmonds (@mdsimmonds) added the following adage, “If you see rich snippets in SERPs, that’s usually a strong indicator of a healthy site.”
“Consider rewriting your title tag to what Google is changing it to in the SERPs.”
- Greg Boser (@GregBoser)
Upon first hearing this tidbit of advice from Greg Boser, I was skeptical. I believe the operative word in his statement is consider. I tweeted the quote just moments after he said it, however, and Matt Cutts favorited the tweet. That gave me enough confidence to deem the suggestion kosher.
Greg also stated, “Google is rewriting your title tags in SERPs far more than anyone realizes to get you a better CTR.” So it only makes sense that if Google’s intention is to increase your CTR for any given keyword, then their altered title tag must perform better than your custom title tag. Otherwise, Google wouldn’t change it. The difficulty is knowing which keywords cause the title to display differently for each page on your site. Still, the idea is worth mulling over.
“Everybody thinks their website is above average.”
- Matt Cutts (@mattcutts)
Lastly, another perspective shift with Matt Cutts. Whether this was said jokingly or not, it made for a great reminder that every website can be improved. If you spend a lot of time working on one website, it’s easy to rest on your laurels. You see the progress that you’ve made and you are satisfied, but this could be a costly mistake.
Matt is right. Everybody does think that their website is above average. It’s your job to keep everyone grounded, and constantly be in the pursuit of excellence for your website. So shake things up, make a change, try something new – and you might just end up with an above average site after all.
Don’t fall for that old Jedi mind trick and simply ignore what Universal Analytics tells you to ignore… they might be the referrals you are looking for.
Did you know that Universal Analytics’ default setting is not to count referrals from your domain? That’s right, Universal Analytics is going to ignore self-referrals by default. This may not be a good thing if you need the information to fix coding errors, but that’s another story.
Today’s story is how to make sure that your idea of self-referrals matches what Universal Analytics is calling a self-referral. If it doesn’t, you may be ignoring some referrals that you didn’t want to ignore. Depending on your situation, you may need to change a setting or even add some custom code to see all the referrals you want.
Read more to understand what’s really going on with referrals in Universal Analytics so you can make an informed decision about what to ignore.
What’s in your referral exclusion list?
When you start a new property, Universal Analytics automatically adds your domain to that property’s referral exclusion list. This list controls which referrals are ignored. Right now you should take a look at what’s in there, if you don’t already know.
Here’s how Universal Analytics decides what to put in your referral exclusion list by default: It takes the domain you entered when you created the property, such as jobs.tatooine.com, and uses only the second-level and top-level parts, so that the default domain in the list would be tatooine.com. If you’re upgrading to Universal, this may not happen automatically, but you should still check out the list to confirm.
Sounds reasonable, right? In fact, this default setting works out fine as long as you want to ignore referrals from every hostname that contains tatooine.com.
If there are third-level or higher domains, the default setting ignores them all: www.tatooine.com, vacations.tatooine.com, as well as jobs.tatooine.com. The default setting also ignores referrals from second-level domains that include tatooine.com, such as visittatooine.com or cantinatatooine.com.
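To make that matching behavior concrete, here is a rough model (illustrative only, not Google’s actual code) of how an exclusion entry appears to match any referrer hostname that contains it:

```javascript
// Illustrative model of the matching described above: an entry in the
// referral exclusion list matches any referrer hostname containing it.
function isExcluded(hostname, exclusionList) {
  return exclusionList.some(function (entry) {
    return hostname.indexOf(entry) > -1;
  });
}

// With the default entry 'tatooine.com', all of these are ignored:
isExcluded('www.tatooine.com', ['tatooine.com']);   // true
isExcluded('jobs.tatooine.com', ['tatooine.com']);  // true
isExcluded('visittatooine.com', ['tatooine.com']);  // true
isExcluded('alderaan.com', ['tatooine.com']);       // false
```

This is exactly why the default entry sweeps up visittatooine.com and cantinatatooine.com along with the subdomains you actually meant to cover.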
Whoa, whoa, whoa… you may be thinking, “I didn’t want to go that far. I’ll just change my referral exclusion list to specify exactly the domain I want to ignore.” And so you might enter jobs.tatooine.com – if that’s the property you’re collecting data for and you want to see referrals from everywhere else – and think your job is done.
Unfortunately, the result is that now you’re not excluding any referrals because the Referral Exclusion List accepts only the second-level and top-level parts of any domain. If you enter only third-level or higher domains, it’s like entering nothing at all.
Do you need to ignore any referrals?
In the simplest, ideal scenario, your tracking code would be implemented perfectly and your traffic reports would contain nothing or next to nothing in the way of self-referrals. It’s possible you wouldn’t need to enter anything in your referral exclusion list, i.e. you could remove the default entry. Or you might need only the domain of your third-party shopping cart in the list.
Here are a couple of scenarios with related domains where things could be more complicated:
Scenario 1: Ignore one subdomain, but not main domain or other subdomains
Your property collects data for jobs.tatooine.com. You want to ignore referrals only for that hostname, and you want to see referral traffic from everywhere else, including tatooine.com.
Scenario 2: Ignore main domain and one subdomain but not another
Your property collects data for tatooine.com and vacations.tatooine.com. You want to ignore referrals from these two hostnames, because your tracking counts a user who views pages from both as part of the same session. On the other hand, you want to see referral traffic from other hostnames such as jobs.tatooine.com.
What are your options in each case? As mentioned earlier, the default exclusion tatooine.com is going to ignore referrals from every hostname that contains tatooine.com, and entering anything more specific is like entering nothing at all.
Things look pretty hopeless. Maybe someday Universal Analytics will allow you to specify exact hostnames in the referral exclusion list, but until then, there’s a workaround.
What makes the workaround possible is Universal Analytics’ processing flow for campaigns and traffic sources.
When Universal Analytics is deciding what traffic information to record for a single hit, one of the things it may look at is the document referrer, i.e. the page the user was viewing when he or she clicked to arrive at the current page. Universal Analytics writes the document referrer into its own parameter named documentReferrer (clever, eh?) and then compares that parameter with your referral exclusion list.
Our trick is to set documentReferrer to a different name – one we still recognize, but that will not match the name(s) in the referral exclusion list. This can be anything that we make up, as long as WE know what it means. In this case, I’ll use tatooinetheplanet.com instead of tatooine.com.
Step 1: Keep the default domain in the referral exclusion list.
For either of my two scenarios, I want to have tatooine.com in the referral exclusion list. Remember, this is going to match everything with tatooine.com, so the next step is the tricky part.
Step 2: Conditionally rewrite the document referrer.
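For Scenario 1, this step might be sketched as follows with analytics.js. The ga('set', 'referrer', …) field is the standard way to override the document referrer; the rewriteReferrer helper and the tatooinetheplanet.com alias are our own made-up names:

```javascript
// Hypothetical helper: if the referrer contains the excluded domain but is
// NOT the one hostname we actually want excluded, rewrite it to an alias
// that we still recognize but that won't match the referral exclusion list.
function rewriteReferrer(ref, excludedDomain, keepHostname, alias) {
  if (ref.indexOf(excludedDomain) > -1 && ref.indexOf(keepHostname) === -1) {
    return ref.replace(excludedDomain, alias);
  }
  return ref;
}

// In the tracking snippet, before the pageview is sent (Scenario 1:
// only jobs.tatooine.com should stay excluded):
//   ga('create', 'UA-XXXXX-Y', 'auto');
//   ga('set', 'referrer', rewriteReferrer(document.referrer,
//       'tatooine.com', 'jobs.tatooine.com', 'tatooinetheplanet.com'));
//   ga('send', 'pageview');
```

A referral from vacations.tatooine.com is rewritten to vacations.tatooinetheplanet.com, so it no longer matches the tatooine.com exclusion entry and shows up in your reports, while jobs.tatooine.com is left untouched and stays excluded.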