experimentation

Hypothesis testing for product experiments in Python

I use hypothesis testing to quantify whether observed differences are likely noise or signal, but I keep the business context attached. A tiny p-value without practical effect size is not a win. The code should make assumptions visible: sample sizes,

A B testing analysis with confidence intervals and guardrails

Experiment analysis should not stop at a binary win or lose label. I calculate uplift, confidence intervals, and guardrail metrics like latency or refund rate before recommending rollout. The point of the analysis is decision quality, not statistical