Personal study repo for A/B and A/B/n testing.
The Electronic House company is an online store (e-commerce) that sells computer products for homes and offices. Customers can buy anything from small items such as mice, headphones and HDMI cables to desktop computers and laptops through the company's website, and the products are delivered to their homes.
The UX design team has been working on a new sales page with the goal of increasing the conversion rate of one product in the store: the Bluetooth keyboard. The product manager said that the page's current average conversion rate is 13% and that he expects the new page to raise it by 2 percentage points.
The Bluetooth keyboard can be purchased for 4,500 in cash or in 12 interest-free installments on a credit card.
Before the new page replaces the current one, the product manager wants to test it on a smaller group of customers, to avoid a drop in conversion or any other problem the page might cause in production. The results from this smaller group also make it possible to estimate the potential and expected revenue of the new page.
For this problem, the success metric is the conversion rate. Starting from the status quo conversion (13%), the P.M. expects a higher conversion from the UX design team's new page: 15% ([13% status quo] + [2% from the new page]) for the Bluetooth keyboard.
The word "conversion" has several meanings; in this problem it means exactly one thing: the purchase of the Bluetooth keyboard.
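As a rough illustration of what the expected lift could mean in money, the sketch below multiplies visitors, conversion rate and price. The price (4,500) and the two rates (13% and 15%) come from the text above; the visitor count and the function name `expected_revenue` are hypothetical and only serve the example.

```python
PRICE = 4_500          # keyboard price, from the problem statement
BASELINE_RATE = 0.13   # status quo conversion rate
EXPECTED_RATE = 0.15   # status quo + expected 2 percentage points


def expected_revenue(visitors: int, conversion_rate: float, price: float = PRICE) -> float:
    """Expected revenue = visitors * conversion rate * price."""
    return visitors * conversion_rate * price


visitors = 1_000  # hypothetical traffic, not a figure from the source
baseline = expected_revenue(visitors, BASELINE_RATE)
expected = expected_revenue(visitors, EXPECTED_RATE)
print(f"current page: {baseline:,.0f}")
print(f"new page:     {expected:,.0f}")
print(f"difference:   {expected - baseline:,.0f}")
```

This only shows the expected values; whether the new page really reaches 15% is what the test has to decide.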
For this problem, the success metric is the "price" or the "number of items purchased". It is not a proportion (0.15% -> 0.03% Lift) but an absolute number: 100 is the mean sale price for page A, and the expected new mean sale price is 110 (status_quo * 1.10). It is the same problem, but with a different expected metric for the A/B test.
For this problem, we are working with several page titles. Comparing all of them, the question to answer is: which one is better?
Why A/B Testing?
You run A/B tests to improve a business metric, for example click-through rate (CTR), sales or views. In UX and design, the changes under test are typically new pages, color changes or other factors that influence the click, the purchase or whichever metric the new page is meant to improve.
Top 5 E-commerce metrics:
- CTR / Click-Through Rate. Clicks on a button or link divided by the number of times it was shown.
- CVR / Sales Conversion Rate. Successful sales divided by the number of visitors.
- CLV / Customer Lifetime Value. Total revenue generated by a customer.
- CAC / Customer Acquisition Cost. Cost of acquiring each new customer.
- CAR / Cart Abandonment Rate. Share of carts that are abandoned before the purchase is completed.
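The rate-style metrics in this list are simple ratios. A minimal sketch, assuming the common textbook definitions (the function names and the input numbers are illustrative, not taken from the repo):

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """CTR: clicks divided by the number of times the element was shown."""
    return clicks / impressions


def conversion_rate(purchases: int, visitors: int) -> float:
    """CVR: successful sales divided by visitors."""
    return purchases / visitors


def customer_acquisition_cost(marketing_spend: float, new_customers: int) -> float:
    """CAC: total acquisition spend divided by customers acquired."""
    return marketing_spend / new_customers


def cart_abandonment_rate(carts_created: int, purchases: int) -> float:
    """CAR: share of created carts that did not end in a purchase."""
    return 1 - purchases / carts_created


# Hypothetical numbers, for illustration only.
print(f"CTR: {click_through_rate(50, 1_000):.2%}")
print(f"CVR: {conversion_rate(13, 100):.2%}")
print(f"CAC: {customer_acquisition_cost(1_000.0, 20):.2f}")
print(f"CAR: {cart_abandonment_rate(200, 50):.2%}")
```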
One way to solve this problem is a classic A/B test; other approaches, such as multi-armed bandits, are also possible.
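To give an idea of the bandit alternative, here is a minimal Thompson sampling sketch for Bernoulli conversions. It is one of several bandit strategies and is not necessarily the one used in this repo; the simulated conversion rates, the function name and the seed are assumptions made for the example.

```python
import random


def thompson_sampling(true_rates, rounds=5_000, seed=42):
    """Simulate Thompson sampling and return how often each page was shown."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        # Draw one plausible conversion rate per page from its Beta posterior
        # (uniform Beta(1, 1) prior), then show the page with the highest draw.
        draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
        arm = draws.index(max(draws))
        pulls[arm] += 1
        # Simulated visitor: converts with the (normally unknown) true rate.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Simulated rates: current page 13%, new page 15% (values from the problem).
print(thompson_sampling([0.13, 0.15]))
```

Unlike a fixed 50/50 split, the bandit gradually shifts traffic toward the page that appears to convert better, at the cost of a less clean statistical comparison.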
If no data is available yet, an Experiment Design step is needed first, to estimate the number of samples and to choose the tools that split the population into two groups, called control and treatment. In this case, the samples were already collected.
- Descriptive Statistics.
  - This is the first step after getting the dataset: simple cleaning, such as removing duplicate rows and fixing errors in the audience segmentation, and a look at summary statistics where possible.
- Exploratory Data Analysis.
  - Analyze the data and the metric observed in the segmentation experiment; here, that means visualizing the obtained distributions and running some homogeneity checks.
- Experiment Design.
  - Validate the previously defined hypotheses, set the test parameters, re-sample if necessary, compute the explicit metrics, select and apply the statistical inference test, and translate the results into money!
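For the conversion problem, the inference step could be a two-proportion z-test. The sketch below implements it with the standard library only; the visitor and purchase counts are made up for illustration, and the one-sided alternative (new page converts better) is an assumption about how the hypothesis was framed.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """One-sided z-test of H1: rate_b > rate_a, using the pooled standard error."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical counts: control (page A) and treatment (page B).
z, p_value = two_proportion_ztest(conv_a=614, n_a=4_720, conv_b=708, n_b=4_720)
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```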
The steps are the same as in the proportion test (the samples were also collected beforehand): Descriptive Statistics, Exploratory Data Analysis and Experiment Design.
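When the metric is a mean (for example the mean sale price, 100 vs. 110) instead of a proportion, the test compares two sample means. A minimal sketch, assuming samples large enough for a normal approximation; with small samples a t-test would be the usual choice. The data is simulated, and the standard deviation of 30 is an arbitrary assumption.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ztest(sample_a, sample_b):
    """One-sided large-sample z-test of H1: mean_b > mean_a (unequal variances)."""
    std_err = sqrt(variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b))
    z = (mean(sample_b) - mean(sample_a)) / std_err
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Simulated sale prices: page A around 100, page B around 110.
rng = random.Random(0)
page_a = [rng.gauss(100, 30) for _ in range(500)]
page_b = [rng.gauss(110, 30) for _ in range(500)]

z, p_value = mean_difference_ztest(page_a, page_b)
print(f"mean A = {mean(page_a):.1f}, mean B = {mean(page_b):.1f}")
print(f"z = {z:.3f}, p-value = {p_value:.4g}")
```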
This problem is similar to the other two, but here we have multiple page titles from Google Analytics.
- Load Data.
  - Manually enter the data from GA and compute simple metrics such as conversion.
- Experiment Design.
  - Validate the previously defined hypotheses, set the test parameters, re-sample if necessary, compute the explicit metrics, select and apply a statistical inference test suited to multiple comparisons, and translate the results into money, if applicable!
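With more than two variants, every extra comparison raises the chance of a false positive, so the significance level is usually corrected. One simple option (not necessarily the one used in this repo) is pairwise two-proportion z-tests with a Bonferroni correction. The page titles and counts below are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


def pairwise_bonferroni(results, alpha=0.05):
    """Compare every pair of variants at alpha divided by the number of pairs."""
    pairs = list(combinations(results, 2))
    corrected_alpha = alpha / len(pairs)
    outcome = {}
    for a, b in pairs:
        p_value = two_sided_p_value(*results[a], *results[b])
        outcome[(a, b)] = (p_value, p_value < corrected_alpha)
    return outcome


# Hypothetical (conversions, visitors) per page title.
titles = {"title_a": (120, 1_000), "title_b": (125, 1_000), "title_c": (180, 1_000)}
for pair, (p_value, significant) in pairwise_bonferroni(titles).items():
    print(pair, f"p = {p_value:.4f}", "significant" if significant else "not significant")
```

Bonferroni is conservative; it is shown here because it is easy to verify by hand.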
The effect size measures how big the expected effect is, that is, the magnitude of the difference between control and treatment. The power of a statistical test is the probability that it produces a statistically significant result when such an effect really exists.
- Together with the significance level and the power, the effect size determines the size of the sample to be collected in each group before the control and treatment segmentation starts.
- Small differences require more data, larger differences require less data.
There are two distributions: the one already measured (status quo) and the one observed after collecting and preparing the new data. If the expected effect is small, the two distributions are close together and more data is needed, because their overlap leaves a lot of uncertainty; when the expected effect is very large, little data is needed, because the two distributions are well separated.
The sample size $n$ is a function of the $effect~size$, $\alpha$ and $power$. Once the investigator anticipates or calculates an effect size and fixes the alpha and power of the test, these three parameters give the minimum sample size needed for sound statistical inference. Cohen's book provides sample size tables by type of test; for the chi-square test, for example, there are tables (contingency test and goodness-of-fit test) for various values of alpha, effect size and power.
Some of these formulas and definitions are implemented in the Python library "statsmodels", but for the ANOVA test it gives "wrong" sample sizes; this needs to be checked in the future ;-;
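For two proportions, this relationship can be sketched with Cohen's effect size h and the normal approximation, n = ((z_{1-alpha/2} + z_{power}) / h)^2 per group. The code below uses only the standard library; alpha = 0.05, power = 0.80 and the two-sided test are assumed defaults, not values stated in this repo.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's effect size h for the difference between two proportions."""
    return 2 * asin(sqrt(p2)) - 2 * asin(sqrt(p1))


def sample_size_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Minimum n per group for a two-sided test (normal approximation)."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)
    z_power = normal.inv_cdf(power)
    return ceil(((z_alpha + z_power) / cohens_h(p1, p2)) ** 2)


# 13% -> 15% is the lift from the problem; 13% -> 20% is a made-up larger lift.
print("13% -> 15%:", sample_size_per_group(0.13, 0.15), "per group")
print("13% -> 20%:", sample_size_per_group(0.13, 0.20), "per group")
```

The second call needs far fewer samples than the first, which is the "small differences require more data" rule in numbers.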
"Statistical inference is the process of using a sample to infer the properties of a population. Statistical procedures use sample data to estimate the characteristics of the whole population from which the sample was drawn." ~ Jim
By using procedures that can make statistical inferences, you can estimate the properties and processes of a population. More specifically, sample statistics can estimate population parameters.