Slow extraction and improvements to Minerva analysis
GeorgePantelakis opened this issue
Bug Report
Problem description
In the analysis for the Minerva / bit-size attack, the confidence intervals for each bit size are calculated twice, which makes the analysis twice as slow for no reason. The nonce extraction also takes a lot of time, but since one extraction doesn't depend on another, it could be run in parallel on multiple CPU cores to speed up the process. Moreover, the creation of the individual k-size folders can probably also be parallelized in the analysis.
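Since each extraction is independent, the work is embarrassingly parallel; a minimal sketch of how it could be farmed out over cores with `multiprocessing.Pool` (the `extract_nonce` helper and its inputs are hypothetical placeholders, not the actual tlsfuzzer code):

```python
from multiprocessing import Pool

def extract_nonce(measurement):
    # hypothetical placeholder for the real per-signature nonce
    # extraction; the key point is that each call is independent
    sig_id, timing = measurement
    return (sig_id, timing * 2)

def extract_all(measurements, workers=4):
    # distribute the independent extractions over worker processes;
    # Pool.map preserves the input order in its results
    with Pool(processes=workers) as pool:
        return pool.map(extract_nonce, measurements)

if __name__ == "__main__":
    data = [(i, float(i)) for i in range(1000)]
    results = extract_all(data)
    print(len(results))
```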
Expected behavior
The extraction and analysis could be much faster.
Include errors or backtraces
...
[i] Wilcoxon signed-rank test done in 10.5s
[i] Calculating confidence intervals of central tendencies
Done: 100.00%, elapsed: 6m 58.66s, speed: 14.78 bootstraps/s, avg speed: 11.94 bootstraps/s, remaining: 0.00s, ETA: 12:59:44 08-01-2024
[i] Confidence intervals of central tendencies done in 4.2e+02s
Creating graphs for k size 519...
[i] Graphing confidence interval plots
Done: 100.00%, elapsed: 6m 52.65s, speed: 14.93 bootstraps/s, avg speed: 12.12 bootstraps/s, remaining: 0.00s, ETA: 13:06:39 08-01-2024
[i] Confidence interval plots done in 4.14e+02s
...
Additional context
The `measurements-invert.csv` should also be processed when `test-tls13-minerva.py` is executed (the `analysis.py` should still probably process one file at a time).
Other optimizations possible:
- Calculate how many samples will produce a CI of around 1 ns and analyze only that many.
- A more advanced method of avoiding division by zero in the analysis
- Ability to provide the measurements file name to the analysis
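For the first optimization, the CI width of a central tendency shrinks roughly as 1/sqrt(n), so the standard deviation from a pilot run is enough to estimate the sample count needed for a ~1 ns interval. A rough sketch using a normal-approximation formula (illustrative only, not the project's actual code):

```python
import math

def samples_for_ci_width(sigma_ns, target_width_ns=1.0, z=2.576):
    # full CI width is approximately 2 * z * sigma / sqrt(n);
    # solving for n gives n = (2 * z * sigma / width)^2
    # (z = 2.576 corresponds to a 99% normal-approximation interval)
    return math.ceil((2 * z * sigma_ns / target_width_ns) ** 2)

# e.g. with a 100 ns sample standard deviation, a ~1 ns wide CI
# needs on the order of a few hundred thousand samples
print(samples_for_ci_width(100.0))
```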
The other thing that we probably should do is create a `report.txt`, similar to what the regular analysis does, i.e. one that includes:
- average, max, and min p-values for the sign test and the Wilcoxon signed-rank test
- result of the Skillings-Mack test
- statistics for the 10-or-so biggest k-values: the p-values for the sign test and Wilcoxon signed-rank test, the 5% and 45% trimmed means and their confidence intervals
- layperson explanation of the result (side-channel possible, side-channel confirmed, or side-channel unlikely)
In general it should be fairly small (fitting in 80-100 columns by 15-20 rows).
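A minimal sketch of how such a compact report could be emitted (the `stats` keys, thresholds, and wording are all hypothetical, not the actual `analysis.py` output):

```python
def write_report(path, stats):
    # `stats` is a hypothetical dict of values already computed
    # by the analysis; only a compact summary goes into the file
    lines = [
        "Sign test p-value: avg {avg_sign:.3g}, "
        "min {min_sign:.3g}, max {max_sign:.3g}".format(**stats),
        "Wilcoxon signed-rank p-value: avg {avg_wilcoxon:.3g}, "
        "min {min_wilcoxon:.3g}, max {max_wilcoxon:.3g}".format(**stats),
        "Skillings-Mack test p-value: {skillings_mack:.3g}".format(**stats),
    ]
    # hypothetical thresholds for the layperson verdict
    if stats["skillings_mack"] < 1e-6:
        verdict = "side-channel confirmed"
    elif stats["skillings_mack"] < 0.05:
        verdict = "side-channel possible"
    else:
        verdict = "side-channel unlikely"
    lines.append("Verdict: " + verdict)
    with open(path, "w") as out:
        out.write("\n".join(lines) + "\n")
```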
All the bugs and new features mentioned in this issue have been implemented.
@GeorgePantelakis are we analysing `measurements-invert.csv` by default in the TLS script now?