Cracking A/B Testing For Interview

What is A/B Testing?

  1. Customer Funnel (Funnel Analysis)
  2. Define Metric (KPIs, Conversion Rate, etc.)
  3. Form Hypothesis
  4. Formulate Testing Plan
  5. Create Variation
  6. Run Experiment
  7. Analyze Testing Result
  8. Make Conclusion

Designing Qs

How long to run an A/B test?

  • First, determine the required sample size. This needs three parameters: power (power = 1 - type II error rate), significance level, and minimum detectable effect.
  • Determine the sample size: as a rule of thumb, the sample size per group is approximately 16 multiplied by the sample variance divided by delta squared, where delta is the difference between treatment and control. Alternatively, use an online sample size calculator.
  • Next, divide the sample size by the number of users entering each group per day to get an approximate duration for the experiment in days. The duration is usually rounded up to whole weeks so that every day of the week is covered equally.
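
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names, the 10% baseline conversion rate, the 1 percentage point minimum detectable effect, and the 2,000 daily users per group are all hypothetical values chosen for the example.

```python
import math


def rule_of_thumb_sample_size(variance: float, delta: float) -> int:
    """Per-group sample size from the rule of thumb n ~= 16 * variance / delta^2."""
    return math.ceil(16 * variance / delta ** 2)


def duration_in_weeks(n_per_group: int, daily_users_per_group: int) -> int:
    """Days needed to collect n_per_group users, rounded up to whole weeks."""
    days = math.ceil(n_per_group / daily_users_per_group)
    return math.ceil(days / 7)


# Hypothetical inputs: a conversion metric with a 10% baseline rate.
baseline_rate = 0.10
variance = baseline_rate * (1 - baseline_rate)  # variance of a 0/1 metric
minimum_detectable_effect = 0.01                # used in place of delta

n_per_group = rule_of_thumb_sample_size(variance, minimum_detectable_effect)
weeks = duration_in_weeks(n_per_group, daily_users_per_group=2000)
print(f"about {n_per_group} users per group, run for {weeks} week(s)")
```

With these inputs the rule of thumb asks for roughly 14,400 users per group, which takes a little over a week of traffic and is therefore rounded up to two weeks.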

How does each parameter influence the sample size?

  • If the sample variance is larger, more samples are needed.
  • If the delta (difference between treatment and control) is larger, fewer samples are needed.
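
To see where the factor of 16 comes from and how each parameter moves the sample size, the sketch below uses the standard two-sample z-test formula n = 2 (z_{1-alpha/2} + z_{power})^2 * variance / delta^2. It assumes equal group sizes and equal variance in both groups, which the text does not state explicitly; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(variance, delta, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample z-test with equal group sizes."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance level term
    z_power = NormalDist().inv_cdf(power)          # power term
    return math.ceil(2 * (z_alpha + z_power) ** 2 * variance / delta ** 2)


# With alpha = 0.05 and power = 0.80 the multiplier is close to 16.
multiplier = 2 * (NormalDist().inv_cdf(0.975) + NormalDist().inv_cdf(0.80)) ** 2
print(f"multiplier: {multiplier:.2f}")

base = sample_size_per_group(variance=1.0, delta=0.1)
print("base:", base)
print("larger variance:", sample_size_per_group(variance=2.0, delta=0.1))
print("larger delta:", sample_size_per_group(variance=1.0, delta=0.2))
print("higher power:", sample_size_per_group(variance=1.0, delta=0.1, power=0.90))
```

The multiplier evaluates to about 15.7, which is rounded to 16 in the rule of thumb. Doubling the variance doubles the required sample size, while doubling delta cuts it to roughly a quarter.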

How to estimate parameters?

  • Sample variance can be estimated from existing data.
  • Delta is not known before the experiment, so the minimum detectable effect is used in its place. The minimum detectable effect is the smallest difference that matters in practice and is usually agreed on by multiple stakeholders.

Multiple Testing Qs

A company is running 10 tests to try different versions of a web page. One version wins with a p-value less than 0.05. Should the company make this change?

Solution:

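  • Not on this evidence alone. With 10 tests at a 0.05 significance level, the probability of at least one false positive is 1 - 0.95^10, or about 40%.
  • Bonferroni correction: it divides the significance level by the number of tests (here 0.05 / 10 = 0.005).
  • Drawback: it is a conservative method.
  • Control the false discovery rate (FDR): it is the expected value of the number of false positives divided by the number of rejections.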
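
A small sketch of both ideas follows. The Bonferroni function divides the significance level by the number of tests. For the false discovery rate, the sketch uses the Benjamini-Hochberg step-up procedure, which is one common way to control it and is an assumption here, since the text does not name a specific procedure. The p-values are made up for illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a test only if its p-value is at most alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for controlling the FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most (k / m) * alpha.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_rank = rank
    # Reject every hypothesis ranked at or below that k.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            reject[idx] = True
    return reject


# Hypothetical p-values for the 10 page variants; only one is below 0.05.
p_values = [0.04, 0.20, 0.35, 0.41, 0.52, 0.60, 0.68, 0.77, 0.85, 0.93]

# Chance of at least one false positive across 10 independent tests at 0.05.
family_wise_error = 1 - (1 - 0.05) ** len(p_values)

print(f"family-wise error rate: {family_wise_error:.2f}")
print("Bonferroni rejections:", sum(bonferroni_reject(p_values)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg_reject(p_values)))
```

With these made-up numbers neither method rejects anything, because a single p-value of 0.04 out of 10 tests is well within what chance alone would produce.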

Novelty and Primacy Qs

Primacy Effect / Change Aversion:

Novelty Effect:

Solution:

  1. Rule out the possibility of these effects by running the test only on first-time users
  2. Compare first-time users to experienced users in the treatment group to get an actual estimate of the impact of the primacy or novelty effect
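
The second option can be sketched as a simple segment comparison. The record layout (`group`, `is_first_time`, `converted`) and the data are hypothetical; a real analysis would also need enough users per segment and a significance test on the gap.

```python
from collections import defaultdict


def treatment_rates_by_tenure(records):
    """Conversion rate of treatment-group users, split by first-time vs experienced."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [conversions, users]
    for record in records:
        if record["group"] != "treatment":
            continue  # only the treatment group is compared here
        segment = "first_time" if record["is_first_time"] else "experienced"
        totals[segment][0] += record["converted"]
        totals[segment][1] += 1
    return {segment: conv / users for segment, (conv, users) in totals.items()}


# Hypothetical per-user records.
records = (
    [{"group": "treatment", "is_first_time": True, "converted": c} for c in (1, 0, 0, 0)]
    + [{"group": "treatment", "is_first_time": False, "converted": c} for c in (1, 1, 0, 0)]
    + [{"group": "control", "is_first_time": False, "converted": c} for c in (0, 1)]
)

rates = treatment_rates_by_tenure(records)
gap = rates["experienced"] - rates["first_time"]
print(rates, "gap:", gap)
```

A clearly positive gap would hint at a novelty effect among experienced users and a clearly negative gap at a primacy effect, although other differences between the two segments can also contribute.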

Groups Interference Qs

Solution:

For social network market:

  1. Network Clusters
  • split users into clusters of people who interact mostly with one another, then assign whole clusters randomly to treatment or control
  • A cluster is composed of an “ego” (a user/individual) and “alters” (the user’s/individual’s direct contacts)
  • measure the one-out network effect (the effect of the “alters” treatment on the “ego”); each user either has the feature or not
  • simpler and more scalable
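
Cluster-level assignment can be sketched as follows. It assumes the clusters have already been computed by some community detection step that is not shown; the `user_to_cluster` mapping and the function name are hypothetical.

```python
import random


def assign_clusters(user_to_cluster, seed=0):
    """Randomize at the cluster level so every user in a cluster shares one arm."""
    rng = random.Random(seed)
    clusters = sorted(set(user_to_cluster.values()))
    rng.shuffle(clusters)
    treated = set(clusters[: len(clusters) // 2])  # first half goes to treatment
    return {
        user: "treatment" if cluster in treated else "control"
        for user, cluster in user_to_cluster.items()
    }


# Hypothetical mapping from user id to a precomputed cluster id.
user_to_cluster = {
    "u1": "c1", "u2": "c1",
    "u3": "c2", "u4": "c2",
    "u5": "c3", "u6": "c4",
}
assignment = assign_clusters(user_to_cluster, seed=42)
print(assignment)
```

Because the unit of randomization is the cluster rather than the user, the analysis should also be done at the cluster level, which reduces the effective sample size.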

For two-sided markets:

  1. Geo-based Randomization
  • split the sample by geolocation (this isolates users from each other, but the variance will be large because each market is unique in certain ways)
  • split the sample by time, for example by day of the week, and assign all users in a given period to either the treatment or the control group (this only works when the treatment effect is short-term)
  • Don’t use the time-based split for something with a long-lasting effect, like a referral program
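
A time-based split can be sketched by assigning whole days to one arm. The start date, the 14-day window, and the function names are hypothetical, and the balanced shuffle is just one possible way to pick the days.

```python
import datetime
import random


def assign_days(start_date, num_days, seed=0):
    """Assign each whole day to one arm, keeping the arms as balanced as possible."""
    arms = ["treatment", "control"] * (num_days // 2)
    arms += ["control"] * (num_days % 2)  # odd leftover day goes to control
    rng = random.Random(seed)
    rng.shuffle(arms)
    return {
        start_date + datetime.timedelta(days=i): arm
        for i, arm in enumerate(arms)
    }


def arm_for(event_date, schedule):
    """Every user active on a given day gets that day's arm."""
    return schedule[event_date]


schedule = assign_days(datetime.date(2024, 1, 1), num_days=14)
print(arm_for(datetime.date(2024, 1, 3), schedule))
```

Running for a multiple of seven days makes it easier to check that each arm covers a mix of weekdays and weekends.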

Case Study: Red vs. Green Button

Goal:

Hypotheses:

  • Compared with a red CTA button, a green CTA button will attract more clicks from users
  • A fraction of these additional clicks will complete the transaction, thus increasing revenue
  • There is a bigger lift for this change on mobile

Null Hypothesis:

  • The green button will make no difference to the click-through rate (number of clicks / number of users who see the button) or to other user behaviors

KPI to measure:

Data to be collected:

  • 90% of visitors → control group
  • 10% of visitors → treatment group
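
One way to test the null hypothesis on click-through rate is a two-proportion z-test, sketched below. The choice of test and the visitor and click counts are assumptions made for illustration; only the 90% / 10% split comes from the case study.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(clicks_c, users_c, clicks_t, users_t):
    """Two-sided z-test for a difference in click-through rate (pooled variance)."""
    rate_c = clicks_c / users_c
    rate_t = clicks_t / users_t
    pooled = (clicks_c + clicks_t) / (users_c + users_t)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_c + 1 / users_t))
    z = (rate_t - rate_c) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts with a 90% control / 10% treatment split of 100,000 visitors.
z, p_value = two_proportion_z_test(
    clicks_c=9000, users_c=90000,   # red button: 10% click-through rate
    clicks_t=1100, users_t=10000,   # green button: 11% click-through rate
)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

An uneven split limits the exposure of the new variant but makes the test less sensitive than a 50/50 split with the same total traffic, because the standard error is dominated by the smaller group.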
