Online marketing channels such as Paid Search and Display Advertising are used by scores of organizations to improve outreach. Being able to improve your visibility by simply purchasing traffic is very useful. What is not so pleasant, at least for most marketing organizations, is coming to grips with whether the money for that traffic was spent as efficiently as possible. Most organizations will simply use the Acquisition reporting in Google Analytics to learn how much traffic their marketing campaigns generate (bad). Some will even venture to see how many conversions or how much revenue they produce (better, but still bad).
Only the savvy organization will employ the technique known as “multi-session marketing analytics.” This technique uses user and session data to analyze the activity of users across multiple sessions. This improves on the simplistic “last click” attribution model used in the regular Google Analytics Acquisition reporting. Google provides some reporting tools for this type of analysis in the Multi-Channel Funnels and Attribution reports found under the “Conversions” tab in Google Analytics. The Model Comparison Tool report can even be used to compare different models (“last click”, “first click”, etc.).
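To make the difference between these models concrete, here is a minimal base R sketch of first click, last click and linear attribution. The paths, conversion counts and the `credit_by_channel` helper are made up for illustration; this is not how Google Analytics computes its reports, just the arithmetic behind each rule.

```r
# Hypothetical conversion paths in "A > B > C" form, with conversions per path
paths <- data.frame(
  path = c("Paid Search > Display > Direct",
           "Display > Direct",
           "Paid Search",
           "Organic Search > Paid Search"),
  conversions = c(2, 1, 3, 4),
  stringsAsFactors = FALSE
)

# Split each path into its ordered touchpoints
touches <- strsplit(paths$path, " > ", fixed = TRUE)

# Illustrative helper: sum credit per channel, returning a named numeric vector
credit_by_channel <- function(channel, credit) {
  vapply(split(credit, channel), sum, numeric(1))
}

# First click: all credit goes to the first touchpoint of each path
first_touch <- credit_by_channel(
  vapply(touches, function(p) p[1], character(1)),
  paths$conversions
)

# Last click: all credit goes to the last touchpoint of each path
last_touch <- credit_by_channel(
  vapply(touches, function(p) p[length(p)], character(1)),
  paths$conversions
)

# Linear: credit is split equally across every touchpoint in the path
linear_touch <- credit_by_channel(
  unlist(touches),
  rep(paths$conversions / lengths(touches), times = lengths(touches))
)

print(first_touch)
print(last_touch)
print(round(linear_touch, 2))
```

Every model hands out the same 10 conversions; only the split between channels changes, which is exactly why the choice of model matters.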
The unfortunate thing about using these models is that there is no such thing as a one-size-fits-all model for analyzing a site’s marketing channels. Each site has its own flavor and its own user base with its own marketing behavior.
To deal with this issue, an organization could either pay for Google Analytics Premium (which uses a machine learning algorithm to predict the best model to use) or run a probabilistic model on its GA data. A Markov chain is one of the easier models to use. It is also a model that is used by a number of marketing attribution analytics consultancies.
I’m no statistician, but I believe the simplest way to explain a Markov chain is that it is a way of describing the probability of the next event based only on the most recent event state. In this case each event is a session with an assigned medium, and the result is a re-assignment of conversion credit based on how much each channel contributes to the probability of conversion. Read more on Markov Chain here.
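For the curious, here is a rough base R sketch of the idea on a handful of made-up journeys. It estimates transition probabilities between channels, then measures each channel by its "removal effect": how much the overall conversion probability drops when that channel is taken out. As I understand it this is the general approach behind Markov attribution, but the journeys, state names and helper functions below are my own assumptions and not the ChannelAttribution implementation.

```r
# Hypothetical journeys; each ends in "conv" (converted) or "null" (did not)
journeys <- list(
  c("start", "Paid Search", "Display", "conv"),
  c("start", "Display", "null"),
  c("start", "Paid Search", "conv"),
  c("start", "Display", "Paid Search", "conv")
)
states <- c("start", "Paid Search", "Display", "conv", "null")

# Count observed transitions, then row-normalise into probabilities
P <- matrix(0, length(states), length(states), dimnames = list(states, states))
for (j in journeys) {
  for (i in seq_len(length(j) - 1)) {
    P[j[i], j[i + 1]] <- P[j[i], j[i + 1]] + 1
  }
}
P["conv", "conv"] <- 1  # absorbing state: journey ended in a conversion
P["null", "null"] <- 1  # absorbing state: journey ended without one
P <- P / rowSums(P)

# Probability of eventually reaching "conv" when starting from "start"
conversion_prob <- function(P) {
  transient <- setdiff(rownames(P), c("conv", "null"))
  Q <- P[transient, transient, drop = FALSE]
  b <- solve(diag(length(transient)) - Q, P[transient, "conv"])
  as.numeric(b)[transient == "start"]
}

# Remove a channel: every transition into it is sent to "null" instead
remove_channel <- function(P, channel) {
  P[, "null"] <- P[, "null"] + P[, channel]
  P[, channel] <- 0
  P[channel, ] <- 0
  P[channel, "null"] <- 1  # keep the row a valid probability distribution
  P
}

channels <- c("Paid Search", "Display")
base_prob <- conversion_prob(P)
removal <- sapply(channels, function(ch) {
  1 - conversion_prob(remove_channel(P, ch)) / base_prob
})

# Share out the 3 observed conversions in proportion to the removal effects
attributed <- removal / sum(removal) * 3
print(round(removal, 3))
print(attributed)
```

In this toy data Paid Search ends up with more credit than Display, because removing it cuts off more of the routes that lead to a conversion.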
Fortunately for those of us without an advanced math degree, there is an R package called ChannelAttribution that lets us run the Markov chain model on data pulled directly from the Google Analytics API. It also compares the Markov chain model to other models such as last touch, first touch and linear without much fuss. There is a great tutorial on using ChannelAttribution on the Lunametrics Blog. Read below to see how this technique can be used with data direct from the Google Analytics API in R. The script seems a bit verbose, but it works very well, thanks to Kaelin Harmon!
library(RGA)
library(ChannelAttribution)
library(data.table)
library(ggplot2)
library(scales)

profile <- {{replace this with your google analytics view id}}
start.date <- {{replace with your start date in YYYY-MM-DD}}
end.date <- {{replace with your end date in YYYY-MM-DD}}

#pull path data from Google Analytics
mcf_data <- get_mcf(profile, start.date, end.date,") + 1

#remove multi word channel names (problematic for ChannelAttribution script)
mcf_data$basicChannelGroupingPath <- gsub("(?<=\\w)\\s(?=\\w|\\$)", "", mcf_data$basicChannelGroupingPath, perl = TRUE)

#create heuristic multi-channel models
H <- heuristic_models(mcf_data, 'basicChannelGroupingPath', 'totalConversions', var_value = 'totalConversionValue')

#round to convert from scientific notation (if needed)
H$linear_touch_conversions <- round(H$linear_touch_conversions,0)
H$linear_touch_value <- round(H$linear_touch_value,0)

#create markov model
M <- markov_model(mcf_data, 'basicChannelGroupingPath', 'totalConversions', var_value = 'totalConversionValue', order = 1)
M$total_conversion <- round(M$total_conversion,0)
M$total_conversion_value <- round(M$total_conversion_value,0)

#merge data frames and order by total conversion value
R <- merge(H, M, by = 'channel_name')
R <- R[order(-R$total_conversion_value),]

#select and rename columns
R1 <- R[, (colnames(R) %in% c('channel_name', 'first_touch_conversions', 'last_touch_conversions', 'linear_touch_conversions', 'total_conversion'))]
colnames(R1) <- c('channel_name', 'first_touch', 'last_touch', 'linear_touch', 'markov_model')

#transform data from wide to long version
R1 <- melt(R1, id = 'channel_name')
#("") + scale_y_continuous(labels = comma)
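The gsub() step in the script is the least obvious part, so here is a tiny base R check of what that regular expression does. The two example paths are made up; the pattern is the one used in the script.

```r
# Hypothetical channel grouping paths, in the format the MCF data uses
paths <- c("Paid Search > Organic Search > Direct",
           "Social Network > Email")

# Drop a space only when it sits between two word characters, so
# "Paid Search" collapses to "PaidSearch" while " > " separators survive
cleaned <- gsub("(?<=\\w)\\s(?=\\w|\\$)", "", paths, perl = TRUE)

print(cleaned)
```

The lookbehind and lookahead are what protect the " > " separators: the spaces around ">" never have a word character on both sides, so they are left alone.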
This produces a data frame and a plot that compare each of the heuristic models (last touch, first touch, linear) with the Markov model.
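If you want to build a similar comparison chart yourself, a grouped bar chart is one reasonable way to do it. The sketch below uses made-up numbers in the same long format that melt() produces (channel_name, variable, value); the chart type, labels and values are my assumptions for illustration rather than the exact plot from the script.

```r
library(ggplot2)
library(scales)

# Hypothetical long-format results, shaped like R1 after melt()
R1 <- data.frame(
  channel_name = rep(c("PaidSearch", "Display", "Direct"), times = 4),
  variable = rep(c("first_touch", "last_touch", "linear_touch", "markov_model"),
                 each = 3),
  value = c(1200, 300, 500,
            700, 400, 900,
            950, 350, 700,
            1000, 380, 620)
)

# One group of bars per channel, one bar per attribution model
p <- ggplot(R1, aes(x = channel_name, y = value, fill = variable)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = comma) +
  labs(x = "Channel", y = "Conversions", fill = "Model")

print(p)
```

Putting the models side by side per channel makes it easy to spot where last touch over- or under-credits a channel compared with the Markov model.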
If you liked this post, you might want to take a look at my last post on using the GA API and R as an alternative to Google Analytics Premium or just leave me a comment below.
Well… consider yourself added to my blogroll. I have like six other blogs I read on a weekly basis, guess that number just increased to seven! Keep writing!
Thanks for the comment Brian!