A/B test with Clickstream data

In an A/B test, the essential data points are:

* startA
* successA
* startB
* successB

## Granularity

We then have the option to look at the data at different levels of granularity:

1. Event level
2. Session (or visit) level
3. User (or visitor) level

A user can go to the start step many times (as with web pages), and an event is recorded every time. From these counts we calculate the success rate of each variant, for example: successA / startA

## Which Granularity is best ?

**Events:** The raw sum of events is biased, because a user can go to start many times, and every time we count startA + 1. That lowers the measured success rate, even though in reality it is still the same attempt. So, no good…

## But between session vs user level ?

Depends on the test:

**User:** When we are OK with giving the user a longer period to complete the flow, like a week of being exposed to it that eventually ends in a success.

- ex: a marketing promotion that reaches the user through different channels, continuously for a week, where maybe each channel contributes a little in nudging the user towards a sale.

**Session:** When we want to measure the success or failure of a single attempt, session is the best choice: each time you try, we measure whether it succeeded or failed.

- ex: attempting to log in

Practicalities:

* I’ve been finding that session level is what is needed most often.
* When only **user** level data is available for an experiment where **session** level is desirable, group the data by a small time granularity, for example assuming users had 1 day to attempt it.
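To make the difference between the three granularities concrete, here is a minimal sketch in plain Python. The row layout `(user_id, session_id, variant, event)`, the sample data, and the helper `success_rate` are all hypothetical, made up for illustration; real clickstream data will look different.

```python
# Hypothetical clickstream rows: (user_id, session_id, variant, event)
events = [
    ("u1", "s1", "A", "start"),
    ("u1", "s1", "A", "start"),    # same attempt, start page visited again
    ("u1", "s1", "A", "success"),
    ("u2", "s2", "A", "start"),    # first session, no success
    ("u2", "s3", "A", "start"),    # second session, a new attempt
    ("u2", "s3", "A", "success"),
    ("u3", "s4", "A", "start"),    # never succeeds
]


def success_rate(rows, variant, level):
    """Return success / start for one variant.

    level is "event", "session" or "user" and decides what counts as one unit.
    """
    started, succeeded = set(), set()
    for i, (user_id, session_id, row_variant, event) in enumerate(rows):
        if row_variant != variant:
            continue
        # At event level every row is its own unit; otherwise rows are
        # deduplicated by session or by user.
        key = {"event": i, "session": session_id, "user": user_id}[level]
        if event == "start":
            started.add(key)
        elif event == "success":
            succeeded.add(key)
    if level != "event":
        # Only count successes from units that also recorded a start.
        succeeded &= started
    return len(succeeded) / len(started) if started else float("nan")


for level in ("event", "session", "user"):
    print(level, round(success_rate(events, "A", level), 2))
```

On this toy data the event-level rate is 2 / 5, the session-level rate is 2 / 4, and the user-level rate is 2 / 3. The repeated start events drag the event-level number down, and the user-level number forgives u2's failed first session.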

Alexandre Matos Martins
