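import numpy as np
from scipy import stats

# Metric values per group (six observations each)
control = np.array([21.1, 20.5, 19.9, 22.0, 20.8, 21.4])
treatment = np.array([22.8, 23.0, 22.2, 24.1, 23.5, 22.9])

# Welch's t-test: equal_var=False does not assume equal group variances
statistic, p_value = stats.ttest_ind(treatment, control, equal_var=False)

# Practical effect: difference in group means (treatment minus control)
uplift = treatment.mean() - control.mean()
print({'uplift': round(float(uplift), 3), 'p_value': round(float(p_value), 6)})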
I use hypothesis testing to quantify whether an observed difference is more likely noise or signal, but I keep the business context attached. A tiny p-value without a practically meaningful effect size is not a win. The code should make its assumptions visible: the sample sizes, the equal-variance choice (here Welch's test via equal_var=False), and the metric definition itself.
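One way to keep the effect size next to the p-value is sketched below. This is a minimal illustration, not a definitive method: `summarize_uplift` is a hypothetical helper name, Cohen's d with a pooled standard deviation is an assumed choice of effect-size measure, and the 95% confidence level is an assumed default. The interval for the mean difference uses the Welch-Satterthwaite degrees of freedom, so it matches the unequal-variance test.

```python
import numpy as np
from scipy import stats

control = np.array([21.1, 20.5, 19.9, 22.0, 20.8, 21.4])
treatment = np.array([22.8, 23.0, 22.2, 24.1, 23.5, 22.9])


def summarize_uplift(treatment, control, alpha=0.05):
    """Hypothetical helper: report the test result with its assumptions attached."""
    n_t, n_c = len(treatment), len(control)
    var_t, var_c = treatment.var(ddof=1), control.var(ddof=1)  # sample variances
    uplift = treatment.mean() - control.mean()

    # Welch's t-test: no equal-variance assumption
    _, p_value = stats.ttest_ind(treatment, control, equal_var=False)

    # Cohen's d using the pooled standard deviation (assumed effect-size measure)
    pooled_sd = np.sqrt(((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2))
    cohens_d = uplift / pooled_sd

    # Confidence interval for the mean difference, Welch-Satterthwaite df
    se = np.sqrt(var_t / n_t + var_c / n_c)
    df = se**4 / ((var_t / n_t) ** 2 / (n_t - 1) + (var_c / n_c) ** 2 / (n_c - 1))
    margin = stats.t.ppf(1 - alpha / 2, df) * se

    return {
        'n_treatment': n_t,
        'n_control': n_c,
        'equal_var': False,
        'uplift': float(uplift),
        'ci_low': float(uplift - margin),
        'ci_high': float(uplift + margin),
        'cohens_d': float(cohens_d),
        'p_value': float(p_value),
    }


result = summarize_uplift(treatment, control)
print({k: round(v, 4) if isinstance(v, float) else v for k, v in result.items()})
```

Returning the sample sizes and the `equal_var` flag alongside the statistics means the assumptions travel with the result, so a reader can judge whether the uplift matters for the business, not just whether it is statistically detectable.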