Define a global entropy measurement for strings and literals

Question

maxfisher-g opened this issue a year ago

Entropy calculations currently use per-file character frequency counts to define the expected probability of each character. It would be better to measure character frequencies once, on a large dataset of source files, and then use that same set of frequency counts to analyse all packages, so that entropy values are comparable from one package to the next.
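A rough sketch of what this could look like, not the project's actual implementation: the function names `build_global_counts` and `cross_entropy`, the add-one smoothing, and the `alphabet_size` default are all assumptions made for illustration. The idea is to count characters once over a corpus, then score any string by its average bits per character under that shared distribution.

```python
import math
from collections import Counter


def build_global_counts(corpus):
    """Count character occurrences across every text in the corpus."""
    counts = Counter()
    for text in corpus:
        counts.update(text)
    return counts


def cross_entropy(s, counts, alphabet_size=256):
    """Average bits per character of s under the global frequency counts.

    Add-one smoothing (an assumption, not part of the original proposal)
    keeps characters that never appeared in the corpus from having zero
    probability. alphabet_size is a hypothetical bound on distinct characters.
    """
    if not s:
        return 0.0
    total = sum(counts.values()) + alphabet_size
    bits = 0.0
    for ch in s:
        p = (counts[ch] + 1) / total
        bits -= math.log2(p)
    return bits / len(s)


# Hypothetical toy corpus standing in for a large dataset of source files.
global_counts = build_global_counts(["aaaa", "ab"])
common = cross_entropy("a", global_counts, alphabet_size=2)
rare = cross_entropy("b", global_counts, alphabet_size=2)
```

Because every string is scored against the same table, a string made of characters that are rare in the corpus gets a higher value than one made of common characters, regardless of which file it came from.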

Max Fisher · Answer 1 · Tue Oct 24 2023 11:18:55 GMT+0800 (China Standard Time)

It will be easier to measure character frequencies once we have static analysis data in BigQuery.