Cohort analysis: retention
One of the best metrics for identifying whether users love your product is retention. It tells you whether the visitors you worked so hard to acquire are going to come back. Retention evaluation nicely highlights the advantages of a cohort analysis. Unfortunately, while this startup metric can be invaluable, cohort analysis in Google Analytics requires some work to set up.
The graph below shows the daily distribution of visitors who start using your product in each of three separate weeks.
There are numerous ways to calculate the week of the year. The week index reported by Google Analytics follows a convention common in North America: each week starts on a Sunday, and week 1 is the (possibly partial) week that begins on January 1st. This is in contrast to the ISO 8601 date and time standard popular in many countries (including most of Europe), which specifies Monday as the start of the week and defines week 1 as the one containing the first Thursday of the year.
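To make the difference concrete, here’s a small Python sketch contrasting ISO 8601 week numbers with a Sunday-start scheme like the one described above. The `sunday_start_week` function is an illustration of that convention, not Google Analytics’ actual implementation:

```python
from datetime import date

def sunday_start_week(d: date) -> int:
    """Week number where weeks run Sunday-Saturday and week 1 is the
    (possibly partial) week containing January 1st."""
    jan1 = date(d.year, 1, 1)
    jan1_dow = (jan1.weekday() + 1) % 7   # convert to Sunday=0 .. Saturday=6
    day_of_year = d.timetuple().tm_yday   # 1-based day of year
    return (day_of_year + jan1_dow - 1) // 7 + 1

d = date(2022, 1, 2)                      # a Sunday
print(sunday_start_week(d))               # Sunday-start scheme: week 2
print(d.isocalendar()[1])                 # ISO 8601: still week 52 (of 2021)
```

The same calendar date can land in different weeks (and even different years) depending on the convention, which is worth remembering when comparing weekly cohorts across tools.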
Take a look at the blue line in the above graph. When the new week starts, visitors are tagged and counted as members of the Week 43 cohort. By mid-week, the number of unique visitors to your product tends to rise as some of those acquired earlier in the week come back. Depending on your traffic, the cohort count may peak on the last day of the week. On the next day, visitors begin to be assigned to the next cohort group (Week 44), shown in orange above. Without the constant injection of new visitors each day, the visit contribution from the Week 43 cohort falls off dramatically and then decays slowly. Keeping this decline in check is what retention is all about. Retention analysis makes the long-term value of the newly acquired visitors abundantly clear.
There are better ways to visualize how well you’re retaining new visitors. In the cohort example below, each interval shows the users retained as a percentage of the cohort’s initial size.
What does a graph like this tell us? First of all, it says that a large percentage of the first-time users of our product do not return the day after they try it. After that, users tend to leave at a much lower, but consistent rate. In other words, if a user returns once or (better) twice then they are reasonably likely to keep doing so. As it turns out, this is a reasonably common observation that’s valid across a number of product types and analysis intervals.
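A day-by-day retention curve like this can be computed directly from raw visit logs. The sketch below assumes a simple list of `(user_id, day)` visit events for a single cohort; the data and field names are illustrative:

```python
from collections import defaultdict

# Hypothetical visit log for one cohort: (user_id, day index since launch)
visits = [("a", 0), ("b", 0), ("c", 0), ("d", 0),
          ("a", 1), ("a", 2), ("b", 2)]

first_day = {}              # user -> day of first visit
returns = defaultdict(set)  # user -> set of days-since-first-visit offsets
for user, day in sorted(visits, key=lambda v: v[1]):
    if user not in first_day:
        first_day[user] = day
    returns[user].add(day - first_day[user])

cohort_size = len(first_day)
# Percentage of the cohort seen again at each offset (day 0 is 100% by definition)
retention = {
    offset: 100.0 * sum(1 for u in first_day if offset in returns[u]) / cohort_size
    for offset in range(4)
}
print(retention)
```

Day-N retention as computed here counts users seen on exactly that day, so the curve can be bumpy; some teams prefer a cumulative “returned on or before day N” variant instead.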
For example, the graph below is taken from a Mixpanel blog post on the importance of tutorials in social games.
Each line represents the weekly retention of a different social game. Obviously, one-week retention varies significantly among these titles. Notice, however, that retention is mostly flat after that first week. Again, initial engagement is critical for long-term retention.
Cohort analysis: engagement
Grouping users into cohorts based on their signup date can also be used to measure changes in engagement. It’s much easier to make informed decisions about new features and refinements when you can correlate cohort behavior changes in your startup analytics with product alterations. For example, you can determine if users that sign up now are more likely to tell a friend about your product than users that signed up a month ago. By combining this with a split test, in which you, say, offer a new sharing feature to only half of your users, you’ll better understand whether improvements can be attributed to the introduction of the new feature.
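A split test like the one described needs a stable assignment of each user to a variant, so that a user sees the same experience on every visit. One common approach, sketched here with hypothetical variant names, is deterministic hash-based bucketing:

```python
import hashlib

def variant(user_id: str) -> str:
    """Deterministically assign a user to 'control' or 'sharing_feature'.
    Hashing the user id keeps the assignment stable across sessions
    without storing any per-user state."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return "sharing_feature" if digest[0] % 2 == 0 else "control"

# The same user always lands in the same bucket, and a large population
# splits roughly 50/50 between the two variants.
print(variant("user-42"))
```

You can then compare the referral behavior of each weekly cohort within each bucket, separating the effect of the new feature from week-to-week drift.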
Let’s look at a cohort data analysis example. In early November you decide to remove a few seemingly redundant instructions from your web service that explain how users can connect and share with their friends. Now you want to ensure that the change was benign. The graph below is a non-cohort trend of your referral rate. This metric consistently hovers around 30%, suggesting that the changes made to your product during this time have not had a significant effect on referrals. Phew!
However, while it may be true that the overall referral rate hasn’t changed significantly, it’s possible that your large returning user base (visitors who read your referral instructions before they were removed) is masking the impact of your change. While there are other approaches, the cleanest way to chart the impact of your change is a basic cohort analysis. Instead of plotting the percentage of users who refer their friends each week, we’ll graph the percentage of users who sign up each week and then, at some point in the future, decide to refer their friends. This allows us to focus consistently on the experience of new users, who aren’t tainted by prior experience with the product.
Looking at the cohort for September 1st in the graph above, we would read the data like this: 30% of users who signed up during the week of September 1st eventually decided to refer their friends. There’s an important distinction between this and the previous graph: the users from each weekly cohort who refer friends may do so in the week that they join or any week thereafter. This means that the values for each weekly cohort will continue to change over time.
This is different from cohort retention analysis where the period over which a user may return is usually fixed. Of course, you should feel free to limit the engagement analysis window to whatever is appropriate for your application. It’s possible, for example, that friend referrals from a weekly cohort are only valuable in your product if they occur within 2 weeks of the user signing up.
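The per-cohort referral rate described above, including an optional analysis window, can be sketched as follows. The data shape and names are illustrative, not a real Google Analytics export:

```python
from collections import defaultdict

# Hypothetical users: user -> (signup_week, week_of_first_referral or None)
users = {
    "a": (35, 35),    # signed up week 35, referred the same week
    "b": (35, 38),    # referred three weeks after signing up
    "c": (35, None),  # never referred
    "d": (36, 37),
    "e": (36, None),
}

def cohort_referral_rate(users, window_weeks=None):
    """Percentage of each weekly signup cohort that refers a friend.
    If window_weeks is set, only referrals within that many weeks of
    signup count (e.g. 2, as in the example above)."""
    signups, referrers = defaultdict(int), defaultdict(int)
    for signup_week, referral_week in users.values():
        signups[signup_week] += 1
        if referral_week is not None and (
            window_weeks is None or referral_week - signup_week <= window_weeks
        ):
            referrers[signup_week] += 1
    return {w: 100.0 * referrers[w] / signups[w] for w in signups}

print(cohort_referral_rate(users))                  # lifetime referral rate
print(cohort_referral_rate(users, window_weeks=2))  # referrals within 2 weeks only
```

Note that with no window, each cohort’s value keeps changing as late referrals trickle in, which is exactly the behavior described for the graph above.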
The cohort graph reveals a couple of things that weren’t obvious in the non-cohort analysis:
- Starting the week of September 22nd, the referral rate of each weekly cohort consistently exceeded the average referral rate. Is this statistically significant? What changed? If it’s something you did then you need to do more of it.
- Something tragic happened in early November. Perhaps your change wasn’t as benign as you thought or maybe the correlation is coincidental. In any case, there’s a clear need to investigate further.
In both cases, a key performance metric for your startup changed substantially, but the change was masked by the size of your existing user base. In the same way, a large influx of new users can mask changes in your existing user base. Cohort analysis clarifies these changes by focusing your attention on the user groups that provide the most insight into the impact of product changes.