Gathering meaningful data during the user journey

Since I started looking into omni-channel metrics last year, I’ve been learning how to best gather meaningful data at each step of the user journey. I recently came across a great piece by Gary Angel titled “A Data Model for the User Journey”. In his article, Gary aims to address the multi-source nature of our data touchpoints, and the issues brought about by the differences in the level and type of detail across these data sources. He rightly points out that these differences make any kind of meaningful analysis of the user journey virtually impossible. Gary provides a number of useful steps to tackle this problem:

  1. Create a level of abstraction – Gary’s first suggestion is to get to a level of abstraction where each data touchpoint can be represented equally. One way of doing this is to apply Gary’s “2-tiered segmentation” model. In a 2-tiered segmentation model, the first tier is the visitor type. This is the traditional visitor segmentation based on persona or relationship. The second tier is a visit or unit-of-work based segmentation that is behavioural and is designed to capture the visit intent. It changes with each new touch. Gary summarises this two-tiered approach as follows: “Describing who somebody is (tier 1) and what they are trying to accomplish (tier 2).”
  2. Capture visit intent – One of the key things that I learned from Gary’s article is the significance of ‘visit intent’ with respect to creating a user-journey model. Visit intent offers an aggregated view of what a visit was about and how successful it was. Both the goal and the success of a visit are important items when analysing a user journey.
  3. 2-tiered segmentation and omni-channel – Gary points out how well his 2-tiered segmentation model lends itself to an omni-channel setup. The idea of 2-tiered segments can be used across any touchpoint, whether it’s online or offline. The intent-based segmentation can be applied relatively easily to calls, branch or store visits and social media posts. The model can also be applied – albeit less easily – to display advertising and email (see Fig. 1 below).
  4. Good starting point for journey analysis – When you look at the sample data structure as outlined in Fig. 1 below, with one data row per touchpoint (visit or unit of work), you can start doing interesting further analysis. For example, with this abstract data structure you can analyse multi-channel paths or enhance user journey personalisation.
  5. Combine visitor level data with user journey data – It sounds quite complex, but I like Gary’s suggestion to model the key customer journeys in the abstract. This can then be used to create a visitor level data structure in which the individual touchpoints are rolled up. Gary’s example below helps clarify how you can best map different data touchpoints to related stages in the user journey (see Fig. 2 below).

Main learning point: The main thing that I’m taking away from Gary Angel’s great piece is the two segments to focus on when measuring the user journey: the visitor and their goals. The data structure suggested by Gary lends itself really well to an omni-channel user experience, as it combines visitor data and user journey data in a single view.

Fig. 1 – Sample data structure when applying the 2-tiered segmentation to a user journey data model – Taken from: http://semphonic.blogs.com/semangel/2015/03/a-data-model-for-the-user-journey.html

  • TouchDateTime Start
  • TouchType (Channel)
  • TouchVisitorID
  • TouchVisitorSegmentCodes (Tier 1)
  • TouchVisitSegmentCode (Tier 2)
  • TouchVisitSuccessCode
  • TouchVisitSuccessValue
  • TouchTimeDuration
  • TouchPerson (Agent, Rep, Sales Associate, etc.)
  • TouchSource (Campaign)
  • TouchDetails
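
To make this concrete, here is a minimal relational sketch of the structure in Fig. 1, with one row per touch. The table name, column names and types are my own illustrative assumptions; the fields mirror Gary’s list above:

  create table touch (
    touch_datetime_start   timestamp,      -- when the touch began
    touch_type             varchar(32),    -- channel: 'Web', 'Call', 'Store', ...
    touch_visitor_id       varchar(32),    -- stable visitor ID across channels
    visitor_segment_codes  varchar(64),    -- tier 1: who the visitor is
    visit_segment_code     varchar(32),    -- tier 2: what the visit tried to accomplish
    visit_success_code     varchar(32),    -- how the visit ended
    visit_success_value    decimal(10,2),  -- value attached to that outcome
    touch_time_duration    integer,        -- duration in seconds
    touch_person           varchar(64),    -- agent, rep, sales associate, etc.
    touch_source           varchar(64),    -- campaign
    touch_details          text
  );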

Fig. 2 – Example of modelling the acquisition journey for a big screen TV – Taken from: http://semphonic.blogs.com/semangel/2015/03/a-data-model-for-the-user-journey.html

  • Initial research to Category Definition (LED vs. LCD vs. Plasma – Basic Size Parameters)
  • Feature Narrowing (3D, Curved, etc.)
  • Brand Definition (Choosing Brands to Consider)
  • Comparison Shopping (Reviews and Product Detail Comparison)
  • Price Tracking (Searching for Deals)
  • Buying

With an abstract model like this in hand, you can map your touchpoint types to these stages in the user journey and capture the user journey at the visitor level in a data structure that looks something like this (a relational sketch follows the outline):

  • VisitorID
  • Journey Sub-structure
    • Journey Type (Acquisition)
    • Current Stage (Feature Narrowing)
    • Started Journey On (Initial Date)
    • Time in Current Stage (Elapsed)
    • Last Touch Channel in this Stage (Channel Type – e.g. Web)
    • Last Touch Success
    • Last Touch Value
    • Stage History Sub-Structure
      • Stage (e.g. Initial Research) Start
      • Stage Elapsed
      • Stage Success
      • Stage Started In Channel
      • Stage Completed in Channel
      • Channel Usage Sub-Structure
        • Web Channel Used for this Journey Recency
        • Web Channel Used for this Journey Frequency
        • Call Channel Used for this Journey Recency
        • Call Channel Used for this journey Frequency
        • Etc.
    • Stage Value
    • Etc.
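
A relational database won’t nest these sub-structures directly, so a rough sketch of the same roll-up (table and column names are again my own assumptions) would split it into a journey table plus a child table for the stage history:

  create table journey (
    visitor_id          varchar(32),
    journey_type        varchar(32),    -- e.g. 'Acquisition'
    current_stage       varchar(32),    -- e.g. 'Feature Narrowing'
    started_on          date,           -- date the journey began
    days_in_stage       integer,        -- elapsed time in the current stage
    last_touch_channel  varchar(32),    -- e.g. 'Web'
    last_touch_success  varchar(32),
    last_touch_value    decimal(10,2)
  );

  create table journey_stage_history (
    visitor_id            varchar(32),
    journey_type          varchar(32),
    stage                 varchar(32),  -- e.g. 'Initial Research'
    stage_start           date,
    stage_elapsed_days    integer,
    stage_success         varchar(32),
    started_in_channel    varchar(32),
    completed_in_channel  varchar(32)
  );

The channel usage sub-structure would hang off the journey in the same way, with one row per channel holding that channel’s recency and frequency for the journey.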

This stage mapping structure is a really intuitive representation of a visitor’s journey. It’s powerful for personalisation, targeting and statistical analysis of journey optimisation. With a structure like this, think how easy it would be to answer these sorts of questions (see the example query after the list):

  • Which channel does this visitor like to do [Initial Product Research] in?
  • How often do visitors do comparison shopping before brand narrowing?
  • When people have done brand narrowing, can they be re-interested in a brand later?
  • How long does [visitor type x] typically spend price shopping?
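
For instance, against the hypothetical touch table sketched earlier, the first question boils down to counting touches by channel for one visitor and one tier 2 segment:

  select touch_type, count(*) as touches
  from touch
  where touch_visitor_id = '12345'              -- hypothetical visitor
    and visit_segment_code = 'InitialResearch'  -- tier 2: initial product research
  group by touch_type
  order by touches desc;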

Related links for further learning:

  1. http://semphonic.blogs.com/semangel/2015/03/a-data-model-for-the-user-journey.html
  2. http://semphonic.blogs.com/semangel/2015/03/in-memory-data-structures-for-real-time-personalization.html
  3. http://semphonic.blogs.com/semangel/2011/04/semphonics-two-tiered-segmentation-segmentation-for-digital-analytics-done-right.html
  4. http://semphonic.blogs.com/semangel/2015/02/the-visit-is-dead-long-live-the-visit.html
  5. http://semphonic.blogs.com/semangel/2015/02/statistical-etl-and-big-data.html

SQL – Learning about the basic “SELECT” statement

I’m still doing my Stanford online course on relational databases. Today, I learned about the basics of SQL, a special-purpose programming language designed for managing data held in a relational database, or for stream processing in a relational data stream management system.

The teacher of the class, Jennifer Widom, kicked off the class by talking about the difference between a Data Definition Language (‘DDL’) and a Data Manipulation Language (‘DML’):

Data Definition Language (‘DDL’)

  • Create a table in the database
  • Drop a table from the database

Data Manipulation Language (‘DML’)

  • Query the database -> “Select” statement
  • Modify the database -> “Insert”, “Delete” or “Update” statement

Jennifer then told us about the basic “Select” statement (see Fig. 1 below), explaining that such a statement returns a relation with a set of data attributes. For example, take a simple college admissions database as a starting point, with 3 relations, each having its own set of attributes (sketched as “create table” statements below the list):

  • College (College Name, State and Enrollment)
  • Student (Student ID, Student Name, GPA and Size High School)
  • Apply (Student ID, College Name, Major and Decisions)
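
This is also where the DDL side from earlier comes in. As a sketch, the three relations could be created along these lines (the attribute names follow the course; the exact types are my assumption):

  create table College (cName varchar(64), state char(2), enrollment integer);
  create table Student (sID integer, sName varchar(64), GPA real, sizeHS integer);
  create table Apply (sID integer, cName varchar(64), major varchar(64), decision char(1));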

Jennifer then gave us the following examples:

Query involving a single relation

select sID, sName, GPA

from Student

where GPA > 3.6

This query will give you the IDs, names and GPAs of those students with a GPA higher than 3.6.

Query combining two relations

select sName, Major

from Student, Apply

where Student.sID = Apply.sID

This query joins the Student and Apply relations, returning the name and major for each student who has applied.

Jennifer pointed out that SQL is a multi-set model and it therefore allows duplicates. You can eliminate duplicate values by adding the keyword “distinct” to your query. Jennifer also mentioned that SQL is an unordered model, which means that results come back in no guaranteed order; if you want sorted results, you have to ask for them explicitly.
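
For example, a student who applied for several majors at the same college would show up more than once in a plain query on Apply; adding “distinct” collapses those duplicate rows:

  select distinct sID, cName
  from Apply;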

You can include an “order by” clause in your query and add “desc” to sort the results in descending order. For example, completing the fragment from the class so that it runs against the relations above:

select Student.sID, sName, GPA, Apply.cName, Enrollment

from Student, College, Apply

where Apply.sID = Student.sID

and Apply.cName = College.cName

order by GPA desc, Enrollment;

Main learning point: I found this class about creating a basic “select” statement particularly helpful, as it helped me to get a better understanding of how basic SQL queries are constructed.

Fig. 1 – Elements of the basic “Select” statement in SQL – Taken from: http://www.w3resource.com/sql/sql-syntax.php

 


Learning more about running A/B tests

I love running A/B and multivariate (‘MVT’) tests. These are experiments designed to evaluate different design or copy variants, based on actual performance data. Instead of comparing or deciding on product options based on gut feel, an A/B or multivariate test allows you to compare alternatives based on objective data and predefined success criteria.

However, running these kinds of tests can be quite tough. These are some of the reasons why:

  • Insufficient traffic – Time and traffic are two important prerequisites if you want to be able to draw meaningful conclusions from your experiments. But what do you do if you don’t have a large user base yet, or when your traffic starts faltering?
  • Not sure which metric(s) to focus on – One of the things that I’ve learned the hard way is the importance of being clear upfront about the exact goal of the test and ensuring that you’ve selected the relevant metric(s) to focus on.
  • Determining the required sample size – Working out the sample size you need in order to reach a “point of statistical significance” with your test can be tricky. Luckily, most A/B testing tools have an automated function for calculating this.

The other day I came across a great post by Optimizely titled “Stats with Cats: 21 Terms Experimenters Need to Know”. Reading through this piece really helped me understand how to best design an experiment and tackle some of the common issues I outlined above.

These are the main things that I learned from “Stats with Cats: 21 Terms Experimenters Need to Know”:

  1. Statistical significance – Significance is a statistical term that tells you how sure you are that a difference or relationship exists. For example, if you want to be able to confidently tell whether there’s a difference between version A and B, you need a threshold (e.g. 95%) to describe the level of error you’re comfortable with in a given A/B test. Significant differences can be large or small, depending on your sample size.
  2. Confidence interval – This is a computed range used to describe the certainty of an estimate of some underlying parameter. In the case of A/B testing, these underlying parameters are conversion rates or improvement rates.
  3. Bayesian – This is a statistical method that takes a bottom-up approach to data analytics when calculating statistical significance. It encodes the past knowledge of similar, previous experiments into a prior, which is a statistical device. You can use this prior in combination with current experiment data to make a conclusion on a currently running experiment.
  4. Effect size – The effect size (also known as “improvement” or “lift”) is the amount of difference between the original version (the ‘control’) and a variant. This could be an increase in conversion rate (a positive improvement) or a decrease in conversion (a negative improvement). The effect size is a common input into many sample size calculators. For example, Optimizely’s A/B Test Sample Size Calculator lets you enter an expected conversion rate for your control version (see the query sketch after this list).
  5. Error rate – The error rate stands for the chance of finding a conclusive difference between a control version and a variation when there is none, or of not finding a difference where there is one. This encompasses “type 1” and “type 2” errors. A type 1 error occurs when a conclusive outcome (winner or loser) is declared while the test is actually inconclusive. This is often referred to as a “false positive”. With a type 2 error, no conclusive result (winner or loser) is declared, failing to discover a conclusive difference between a control and a variation when there was one. This is also referred to as a “false negative” (see Fig. 1 below).
  6. Hypothesis test – Sometimes called a “t-test”, a hypothesis test is a statistical inference methodology used to determine if an experiment result was likely due to chance alone. Hypothesis tests try to disprove a null hypothesis, i.e. the assumption that two variations are the same. In the context of A/B testing, hypothesis tests will help determine the probability that one variation is better than the other, supposing the variations were actually the same.
  7. Fixed horizon hypothesis test – The key thing with a fixed horizon test is that it’s designed to come to a decision about version A or B at a set moment in time, ideally after reaching the point of statistical significance.
  8. Sequential hypothesis test – A sequential hypothesis test is the opposite of a fixed horizon hypothesis test, as the underlying principle of this test is that the experimenter can make a decision on the test at any point in time.
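
To make the effect size idea a little more concrete: assuming a hypothetical ab_events table with one row per visitor, their assigned variation and a 0/1 converted flag, the observed conversion rate per variation can be read off with a query like this; the lift is then simply the relative difference between the two rates:

  select variation,
         count(*) as visitors,
         sum(converted) as conversions,
         avg(converted * 1.0) as conversion_rate  -- multiply by 1.0 to avoid integer division
  from ab_events
  group by variation;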

Main learning point: Even though I’m not a statistician or a data analyst, I found it really helpful to learn more about some of the terms that experimenters need to know about. Especially given some of the challenges with respect to running successful experiments, I believe it’s important to think through things such as a null hypothesis or desired effect size before you design and run your experiment.

Fig. 1 – Possible outcomes of A/B experiments – Taken from: http://blog.optimizely.com/2015/02/04/stats-with-cats-21-terms-experimenters-need-to-know/

Related links for further learning:

  1. http://blog.optimizely.com/2015/02/04/stats-with-cats-21-terms-experimenters-need-to-know/
  2. http://us6.campaign-archive2.com/?u=ad9057edac5b98ad4892b6a6f&id=bffd19fcf8&e=3c8b6fa69a
  3. http://www.optimizesmart.com/understanding-ab-testing-statistics-to-get-real-lift-in-conversions/
  4. https://www.optimizely.com/resources/multivariate-testing/
  5. http://blog.hubspot.com/marketing/how-to-run-an-ab-test-ht
  6. http://www.wordstream.com/blog/ws/2014/02/26/
  7. http://en.wikipedia.org/wiki/Bayesian_probability

The what and why of programmatic marketing

The term “programmatic marketing” is relatively new. Ben Plomion, VP Marketing at Chango, first wrote about programmatic marketing back in 2012. In this article he expands on the ‘what’ and the ‘why’ of programmatic marketing. Ben’s piece formed a great starting point for me to learn more about what programmatic marketing means and what its benefits are.

Let’s start with the ‘what’:

Wikipedia provides a nice and concise definition of programmatic marketing: “In digital marketing, programmatic marketing campaigns are automatically triggered by any type of event and deployed according to a set of rules applied by software and algorithms. Human skills are still needed in programmatic campaigns as the campaigns and rules are planned beforehand and established by marketers.”

I’ve broken this down into some specific elements:

  1. Events – Marketers can set rules around specific ‘events’ which they expect to trigger specific marketing activities (e.g. a display ad or an email). An abandoned online shopping cart is a good example of such an event: I might receive an email with the subject line “Do you still want to buy that white pair of Converse All Stars?” after abandoning the product in my shopping basket (see the query sketch after this list).
  2. Automatic triggers – Once an event has been selected, an automatic trigger can be created. For instance, if I search for “blue cashmere” jumpers, I’ll be presented with display ads for the blue cashmere jumpers on other applications or sites that I visit or browse.
  3. Rules set by marketers – There’s a strong human element to programmatic marketing. Marketers need to fully understand the customer journeys and metrics related to their product or service. This understanding will help you to make sure the right marketing activity is triggered, for the right customer and at the right time.
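
As a rough sketch of what such an event rule might look like in practice, assuming hypothetical cart_items and orders tables and Postgres-style date arithmetic, a scheduled query like this would surface carts abandoned for more than a day:

  select c.user_id, c.product_id
  from cart_items c
  left join orders o
    on o.user_id = c.user_id
   and o.product_id = c.product_id
  where o.user_id is null                                   -- never checked out
    and c.added_at < current_timestamp - interval '1 day';  -- sat untouched for over a day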

Why? What are the benefits of programmatic marketing?

  1. It’s automated – By automating buying decisions, marketers remove the friction of the sales process (including humans placing buying orders) and reduce their marketing costs.
  2. Organising data – A programmatic marketing platform allows marketers to better organise their data and create highly targeted marketing campaigns. The goal is to avoid wasted clicks or impressions. Programmatic marketing helps to target those consumers who have expressed an intent to buy, and who are likely to convert into the desired behaviour.
  3. Targeting and personalisation – Programmatic marketing helps in targeting specific user types or segments, having a better understanding of user activity and interests. Programmatic marketing increases the likelihood of consumer action by showing each user a personalised message. The goal is to present users with a more customised call-to-action based on their recent browsing behaviour, for example, or other anonymised data that you know about them.
  4. Reaching consumers across channels and devices – Similar to marketing based on user behavioural data (see my previous point), you can use programmatic marketing to understand and tap into which channels and devices customers use as part of their experience.

Some programmatic marketing techniques to consider:

  1. Dynamic Creative Optimisation – Dynamic Creative Optimisation (‘DCO’) allows marketers to break an online ad apart into individual pieces, and to create different pieces for different audiences. With these dynamic elements, you can easily rotate the layout of the ad based on user data (see Fig. 1 below). For example, if we know that a user has been looking at cheap flights to Orlando, we can tailor the ad accordingly (see the Travelocity example in Fig. 1 below).
  2. Shopping cart abandonment email campaigns – Every retail or transactional site collects data on users who don’t complete the checkout process. Abandoned shopping cart emails are sent to those customers who added products to their cart but failed to check out. Customers can fail to purchase for a whole number of reasons, varying from deliberate (e.g. a decision not to purchase) to circumstantial (e.g. the website crashed or the session timed out). Sending users an email to remind them of their abandoned shopping cart is a great way for businesses to act on this data (see some examples in Fig. 2 and 3 below).
  3. Programmatic site retargeting – Programmatic site retargeting (‘PSR’) is designed to increase revenue from someone who has already visited your site or expressed an interest in your product. As the aforementioned Ben Plomion explains here: “PSR crunches all that data and creates a score that determines how much to bid to serve an impression for that user via an ad exchange, allowing marketers to target leads on the cheap”. It’s about using data such as the pages on your site a person has visited, or where the user came from, to serve a highly targeted and relevant ad on the user’s favourite site or application (a toy query follows this list).
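
The raw inputs to such a bid score are often just recency and frequency. As a toy illustration, assuming a hypothetical page_views table and Postgres-style date arithmetic, the per-user signals a bidding model might consume could be pulled like this:

  select user_id,
         count(*) as views_last_30_days,     -- frequency
         max(viewed_at) as most_recent_view  -- recency
  from page_views
  where viewed_at > current_date - interval '30 days'
  group by user_id;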

Main learning point: After having dipped my toe into programmatic marketing, I feel that there’s much more to learn about how programmatic marketing works and how to do it effectively. Some of the programmatic marketing techniques seem fairly obvious. However, I suspect the challenge will be in collecting, understanding and selecting the right data to drive your programmatic marketing activity.

Fig. 1 – Good examples of Dynamic Creative Optimisation – Taken from: http://www.adopsinsider.com/ad-ops-basics/dynamic-creative-optimization-where-online-data-meets-advertising-creative/


Fig 2 – Example of an email to remind people of their abandoned shopping cart – Taken from: http://www.shopify.co.uk/blog/12522201-13-amazing-abandoned-cart-emails-and-what-you-can-learn-from-them


Fig. 3 – Example of an email to remind people of their abandoned shopping cart – Taken from: http://www.whatcounts.com/wp-content/uploads//Hofstra.png


Related links for further learning:

  1. https://www.thinkwithgoogle.com/intl/en-gb/collection/programmatic-marketing/
  2. http://en.wikipedia.org/wiki/Programmatic_marketing
  3. http://www.adopsinsider.com/ad-ops-basics/dynamic-creative-optimization-where-online-data-meets-advertising-creative/
  4. http://digiday.com/platforms/why-programmatic-marketing-is-the-future/
  5. https://www.thinkwithgoogle.com/intl/en-gb/interview/moneysupermarketcom-activating-customer-data-with-programmatic-marketing/
  6. http://www.mediaweek.co.uk/article/1227382/programmatic-marketing-trends-watch-2014
  7. http://www.fanatica.co.uk/blog/programmatic-marketing/
  8. http://blog.clickwork7.com/2014/12/11/programmatic-versus-native/
  9. http://www.256media.ie/2014/10/smart-content-marketing/
  10. https://medium.com/@ameet/demystifying-programmatic-marketing-and-rtb-83edb8c9ba0f
  11. http://www.shopify.co.uk/blog/12522201-13-amazing-abandoned-cart-emails-and-what-you-can-learn-from-them
  12. http://www.clickz.com/clickz/column/2302627/how-programmatic-site-retargeting-can-give-marketing-automation-a-superboost
  13. https://retargeter.com/blog/general/real-time-bidding-and-programmatic-progress

Book review: “Thinking with Data”

It’s oh so easy to get immersed in analytics or big data sets without a clear idea of the questions one wants answered through data. The book Thinking with Data – How to Turn Information into Insights by Max Shron talks about how to get the most out of data and how to go about looking for the right data. Max Shron is the founder of Polynumeral, a New York based applied data strategy consultancy. The first chapter of “Thinking with Data” is aptly titled “Scoping: Why Before How” and covers the main concept behind the book: “CoNVO”. CoNVO stands for context, need, vision, outcome:

  1. Context (Co) – Context emerges from understanding who we are working with and why they’re doing what they are doing. Who are the people with an interest in the results of the project? What are they trying to achieve and why? Shron offers some good examples of context (see Fig. 1 below). The context provides a project with larger goals and helps to keep us on track when working with data. Contexts include larger relevant details, like deadlines and business objectives, which help to prioritise.
  2. Needs (N) – It’s useful to see how Shron looks at “needs” from a data perspective: “what are the specific needs that could be fixed by intelligently using data? If our method will be to build a model, the need is not to build a model. The need is to solve the problem that having the model will solve.” Shron goes on to explain that “when we correctly explain a need, we are clearly laying out what it is that could be improved by better knowledge.” I’ve included some good examples of needs in Fig. 2 below.
  3. Vision (V) – Shron describes the vision as “a glimpse of what it will look like to meet the need with data”. The vision could consist of a mockup describing the intended results, or a sketch of the argument that we’re going to make, or some particular questions that narrowly focus our aims (see Fig. 3 below).
  4. Outcome (O) – For a data scientist, the “outcome” is all about understanding how the work will actually make it back to the rest of the business and what will happen once it’s there. How will the data and/or insights be used? How will it be integrated into the organisation? Who will use it and why? Shron stresses that the outcome is distinct from the vision; the vision is focused on what form the work will take at the end, while the outcome is focused on what will happen when the work is done (see Fig. 4 below).

Main learning point: Even though I got the sense that “Thinking with Data” is more aimed at data scientists and analysts, I found the book very useful for me as a ‘non-data professional’. Despite it being a very short book, Shron gets his main “CoNVO” concept across very effectively. A good use of data starts with properly scoping the problem that you want to solve. An unstructured scope will make it hard to gather the right insights and to use large data sets intelligently. Using Shron’s CoNVO model will help you gather and analyse data in a very targeted and efficient way.

Fig. 1 – Examples of Context – Taken from Max Shron – “Thinking with Data”, p. 3

  • This department in a large company handles marketing for a shoe manufacturer with a large online presence. The department’s goal is to convince new customers to try its shoes and to convince existing customers to return again. The final decision maker is the VP of Marketing.
  • This news organisation produces stories and editorials for a wide audience. It makes money through advertising and through premium subscriptions to its content. The main decision maker for this project is the head of online business.

Fig. 2 – Examples of Needs – Taken from Max Shron – “Thinking with Data”, p. 5

  • Our customers leave our website too quickly, often after reading only one article. We don’t understand who they are, where they are from, or when they leave, and we have no framework for experimenting with new ideas to retain them.
  • Is this email campaign effective at raising revenue?
  • We want to place our ads in a smart way. What should we be optimising? What is the best choice, given those criteria?
  • We want to sell more goods to pregnant women. How do we identify them from their shopping habits?

Fig. 3 – Examples of mockups and argument sketches – Taken from Max Shron – “Thinking with Data”, pp. 9 – 13

Mockups:

Mockups can take the form of a few sentences reporting the outcome of an analysis, a simplified graph that illustrates a relationship between variables, or a user interface sketch that captures how people might use a tool.

Examples of sentence mockups:

  • The probability that a female employee asks for a flexible schedule is roughly the same as the probability that a male employee asks for a flexible schedule.
  • There are 10,000 users who shopped with service X. Of those 10,000, 2,000 also shopped with service Y. The ones who shopped with service Y skew older, but they also buy more.

Argument sketches:

A mockup shows what we should expect to take away from a project. In contrast, an argument sketch tells us roughly what we need to do to be convincing at all. It is a loose outline of the statements that will make our work relevant and correct. While they are both collections of sentences, mockups and argument sketches serve very different purposes. Mockups give a flavour of the finished product, while argument sketches give us a sense of the logic behind the solution.

Example of the differences between a mockup and an argument sketch:

Mockup – After making a change to our marketing, we hit an enrolment goal this week that we’ve never hit before, but it isn’t being reflected in the success measures.

Argument sketch – The nonprofit is doing well (or poorly) because it has high (or low) values for key performance indicators. After seeing the key performance indicators, the reader will have a good sense of the state of the nonprofit’s activities and will be able to adjust accordingly.

Summary of the differences between a mockup and an argument sketch:

In mocking up the outcomes and laying out the argument, we are able to understand what success could look like. The most useful part of making mockups or argument sketches is that they let us work backward to fill in what we actually need to do.

Fig. 4 – Examples of an outcome – Taken from Max Shron – “Thinking with Data”, pp. 14 – 16

  • The metrics email for the nonprofit needs to be set up, verified, and tweaked. Sysadmins at the nonprofit need to be briefed on how to keep the email system running. The CTO and CEO need to be trained on how to read the metrics emails, which will consist of a document written to explain it.
  • The marketing team needs to be trained in using the model (or software) in order to have it guide their decisions, and the success of the model needs to be gauged in its effect on the sales.
