# Hypothesis testing for leakage assessment in side channel analysis

This post has a slightly different format compared to previous posts, as the material was prepared for an three hour in-person tutorial at the amazing summer school on real-world crypto and privacy in Vodice, Croatia.

The tutorial aims to give participants a solid technical understanding of the mechanics of hypothesis testing in the context of side-channel analysis. In the first part, we focus on the fundamental concepts of hypothesis testing: we practise choosing the null-hypothesis H0 (1), mention the data collection step (2) and take a deep dive on testing which requires building the null-distribution estimating p-values (3).

To follow along, download the workbook and the slides and work on the exercises in the suggested order. I use three running examples to illustrate these concepts. The first, Lake_water.ipynb is a one-sample t-test where we want to estimate the salt concentration in a body of water. The second, PhD_vs_Faculty.ipynb is a two-sample t-test where we test if height has an influence on the chances of becoming a faculty member. Finally, the third example Magic_coins is helful for understanding p-values.

In the workbook you will find three exercises, pick and choose the one you find useful:

• The learning objective of exercise 1 is threefold: (1) we prove that there is natural variation when estimating a test statistic; (2) we determine the influence of the sample size samples on test statistic and (3) we demonstrate that the distribution of the mean test statistic is different compared to the distribution of the sample data;
• The learning objective of exercise 2 is to practice constructing null-distributions. In question 1 we explore what is a t-score and construct the null distribution for a one-sample t-test. In question 2 we construct a null-distribution for a two-sample t-test and in question 3 we construct a null-distribution for data that follows a binomial distribution.
• The learning objective of exercise 3 is to get an intimate understanding of how p-values are computed and what they represent and as a sanity check compare the p-values we estimated with the ones calculated by the python library.

In Part 2, we apply we apply hypothesis testing, in particular a two-sample t-test on a set of real traces for traces collected for two algorithms. Use the same workbook and these slides for guidance.