Common A/B Testing Questions Asked During Interviews
Introduction
For many people, applying for jobs and preparing for several rounds of interviews with several companies is more stressful than their current job. The anticipation of what might be asked, and how, can cause sleepless nights. Today, I am going to cover one small topic from the perspective of interview questions for a data scientist role. I have already covered some key A/B testing concepts in previous posts. Let’s look at the kinds of questions that might be asked to test whether you have practical knowledge as well as theoretical knowledge.
A/B Testing Interview Questions
- When should we do an A/B test?
An A/B test is usually run to measure the impact of a change to an existing feature or the impact of a new feature. To borrow an analogy from Udacity’s A/B testing course: an A/B test can help you climb to the peak of your current mountain, but it cannot help you decide which mountain to climb.
- What is the first step in running an A/B test?
Once a product manager comes to you with an idea to test, do not jump straight to setting up a campaign. Before setting up and running an A/B test, there are some key steps to complete:
  - Define the null and alternative hypotheses
  - Define your north star metric and guardrail metrics
  - Run a power analysis to determine either the sample size or the minimum detectable effect for your north star metric
  - Create a test plan
  - Work with engineers/instrumentation teams to get appropriate tags in place
  - Make sure the tags are working
  - Get a sign-off on the test plan from product managers and get the tags validated once again by engineers
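To make the power analysis step concrete, here is a minimal sketch of a sample size calculation. It assumes the north star metric is a conversion rate and uses the standard normal approximation for a two-sided two-proportion test. The function name, the 10% baseline, and the 1 percentage point minimum detectable effect are illustrative assumptions, not values from any real test.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, mde_abs, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH variant (normal approximation).

    baseline_rate: expected conversion rate in control (assumed input)
    mde_abs: minimum detectable effect as an absolute difference in rates
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


# Illustrative inputs only: 10% baseline, detect a 1 point absolute change.
n_small_effect = sample_size_per_variant(baseline_rate=0.10, mde_abs=0.01)
n_large_effect = sample_size_per_variant(baseline_rate=0.10, mde_abs=0.02)
print(n_small_effect, n_large_effect)
```

Halving the minimum detectable effect roughly quadruples the required sample size, which is why the MDE is worth agreeing on with the product manager before the test starts.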
- What is a null and alternative hypothesis?
The null hypothesis states that there is no difference between test and control. The alternative hypothesis states that there is a difference between test and control.
- What is the difference between one-tailed vs. two-tailed tests?
One-tailed tests check for a change in only one direction, while two-tailed tests check for a change in either direction, positive or negative.
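As a small illustration, the same test statistic gives different p-values depending on the number of tails. The sketch below assumes a z-statistic from a large-sample test and a one-tailed alternative of "test is greater than control"; the value 1.8 is made up for the example.

```python
from statistics import NormalDist


def p_values_from_z(z):
    """Return (one-tailed, two-tailed) p-values for a z-statistic.

    The one-tailed value assumes the alternative is 'test > control'.
    """
    std_normal = NormalDist()
    one_tailed = 1 - std_normal.cdf(z)
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))
    return one_tailed, two_tailed


one_sided, two_sided = p_values_from_z(1.8)  # illustrative z-statistic
print(round(one_sided, 4), round(two_sided, 4))
```

With this z-statistic and an alpha of 0.05, the one-tailed test rejects the null hypothesis while the two-tailed test does not, so the choice of tails should be made before looking at the data.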
- How would you explain the p-value to a layman?
Assume the null hypothesis is true, meaning there is really no difference between test and control. The p-value is then the chance of seeing a difference at least as large as the one we observed purely through random variation. A small p-value means our result would be surprising if nothing had actually changed.
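One way to make this definition tangible is a permutation test: if there is truly no difference, the test/control labels are interchangeable, so we can shuffle them many times and count how often a difference at least as large as the observed one shows up. This is a minimal sketch with made-up conversion counts, not the method used by any particular experimentation platform.

```python
import random


def permutation_p_value(control, test, n_permutations=1000, seed=0):
    """Two-sided permutation p-value for a difference in conversion rates."""
    rng = random.Random(seed)
    n_test = len(test)
    observed = abs(sum(test) / len(test) - sum(control) / len(control))
    pooled = list(control) + list(test)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel visitors at random (null is true)
        fake_test = pooled[:n_test]
        fake_control = pooled[n_test:]
        diff = abs(sum(fake_test) / len(fake_test)
                   - sum(fake_control) / len(fake_control))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / n_permutations


# Illustrative data: 1 = converted, 0 = did not convert.
control_outcomes = [1] * 100 + [0] * 900  # 10% conversion
test_outcomes = [1] * 130 + [0] * 870     # 13% conversion
p_value = permutation_p_value(control_outcomes, test_outcomes)
print(p_value)
```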
- What are alpha and beta?
Alpha, also called the significance level, is the probability of a type I error, i.e. rejecting the null hypothesis when it’s true. Beta is the probability of a type II error, i.e. failing to reject the null hypothesis when it’s false. The power of a test is 1 − beta.
- What are type I and type II errors?
A type I error (false positive) means rejecting the null hypothesis when it’s true, i.e. there wasn’t any difference between test and control but we concluded that there was. A type II error (false negative) means failing to reject the null hypothesis when it’s false, i.e. there was a difference but we couldn’t pick it up.
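Both error rates can be estimated by simulation. The sketch below repeatedly simulates an experiment and counts how often a pooled two-proportion z-test rejects the null hypothesis. With identical true rates (an A/A test), the rejection rate estimates the type I error rate; with different true rates, it estimates power, and one minus that is the type II error rate. All rates and sample sizes here are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:  # no variation at all, nothing to detect
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(p_control, p_test, n_per_variant,
                   alpha=0.05, n_sims=1000, seed=0):
    """Share of simulated experiments in which the null is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        conv_c = sum(rng.random() < p_control for _ in range(n_per_variant))
        conv_t = sum(rng.random() < p_test for _ in range(n_per_variant))
        p = two_proportion_p_value(conv_c, n_per_variant,
                                   conv_t, n_per_variant)
        if p < alpha:
            rejections += 1
    return rejections / n_sims


# A/A test: no true difference, so every rejection is a type I error.
type_1_rate = rejection_rate(0.10, 0.10, n_per_variant=500)
# True difference exists: rejections are correct, misses are type II errors.
power = rejection_rate(0.10, 0.16, n_per_variant=500)
type_2_rate = 1 - power
print(type_1_rate, type_2_rate)
```

The A/A rejection rate should land close to the chosen alpha, which is a useful sanity check on a testing setup.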
- How long should you run a test?
You can calculate the test duration from the estimated daily visitors, the required sample size per variant, and the number of variants. For example, if your website gets 10k visitors a day, the required sample size is 100k per variant, and there are 2 variants (test and control), then the test should run for 20 days: (100k × 2) / 10k.
It is also advisable to run an A/B test for at least 2 weeks to account for differences between weekdays and weekends.
- How do you draw conclusions from the results of your test?
It depends on a few things:
  - The north star metric should be significantly positive (or neutral, depending on what you are testing)
  - The p-value should be less than the alpha value
  - The confidence interval should be narrow, i.e. its upper and lower bounds should be close together
  - The lift% should not be much lower than the minimum detectable effect that we care about for our primary metric
  - The daily lift% of the metrics should have the same sign on most days (sign test)
  - Finally, the guardrail metrics should be neutral, if not positive
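Two of the checks above, the confidence interval and the sign test, can be sketched in a few lines. This assumes a conversion-rate metric, a normal-approximation (Wald) interval for the difference in rates, and an exact two-sided binomial sign test on daily lifts. The counts and daily lift values are made up for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def diff_confidence_interval(conv_c, n_c, conv_t, n_t, confidence=0.95):
    """Wald confidence interval for (test rate - control rate)."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    se = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p_t - p_c
    return diff - z * se, diff + z * se


def sign_test_p_value(daily_lifts):
    """Exact two-sided sign test: are positive and negative days balanced?"""
    nonzero = [x for x in daily_lifts if x != 0]  # ties carry no sign
    n = len(nonzero)
    positives = sum(x > 0 for x in nonzero)
    k = max(positives, n - positives)
    # Probability of k or more same-sign days if each sign were a coin flip.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Illustrative counts: 1,000 of 10,000 convert in control, 1,100 in test.
ci_low, ci_high = diff_confidence_interval(1_000, 10_000, 1_100, 10_000)
# Illustrative daily lift% over 14 days: 12 positive days, 2 negative.
lifts = [1.2, 0.8, -0.3, 1.5, 0.9, 1.1, 0.4, 2.0, -0.1, 0.7, 1.3, 0.6, 1.0, 0.5]
sign_p = sign_test_p_value(lifts)
print(ci_low, ci_high, sign_p)
```

An interval that excludes zero agrees with a significant p-value, and a small sign test p-value suggests the lift is consistent across days instead of being driven by one or two outlier days.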
- Can your treated samples be different from the assigned samples?
Yes. The assigned sample is everyone who became part of the campaign (test or control). The treated sample is a subset of the assigned sample, defined by additional conditions. For example, if you are running an A/B test on the Search pages of your website, you may want to measure results only for visitors who actually saw a Search page within their assigned campaign.
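In code, the distinction is just a filter applied before computing metrics. The sketch below uses a tiny hand-written dataset; the field names (`variant`, `saw_search`, `converted`) are hypothetical and would depend on how your tags are set up.

```python
# Hypothetical assigned sample: every visitor bucketed into the campaign.
assigned = [
    {"visitor_id": 1, "variant": "control", "saw_search": True, "converted": False},
    {"visitor_id": 2, "variant": "control", "saw_search": False, "converted": False},
    {"visitor_id": 3, "variant": "control", "saw_search": True, "converted": True},
    {"visitor_id": 4, "variant": "test", "saw_search": True, "converted": True},
    {"visitor_id": 5, "variant": "test", "saw_search": False, "converted": False},
    {"visitor_id": 6, "variant": "test", "saw_search": True, "converted": True},
]


def treated_only(visitors):
    """Keep only visitors who actually saw the page under test."""
    return [v for v in visitors if v["saw_search"]]


def conversion_rate_by_variant(visitors):
    """Conversion rate per variant for whichever sample is passed in."""
    rates = {}
    for variant in sorted({v["variant"] for v in visitors}):
        group = [v for v in visitors if v["variant"] == variant]
        rates[variant] = sum(v["converted"] for v in group) / len(group)
    return rates


treated = treated_only(assigned)
assigned_rates = conversion_rate_by_variant(assigned)
treated_rates = conversion_rate_by_variant(treated)
print(assigned_rates)
print(treated_rates)
```

Visitors who never reached a Search page could not have been affected by the change, so including them dilutes the measured effect.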
Conclusion
Knowing the theory is usually not enough. The more experiments you run, the better you’ll get at it, and every experiment teaches something new. Treat this as a cheat sheet for revising the concepts. I hope this list of A/B testing interview questions helps reduce some of that anxiety!