kfogel / OneTime

An open source encryption program that uses the "one-time pad" method.

Home Page: http://red-bean.com/onetime

discussion: size-dependent fuzz?

aaannndddyyy opened this issue

Would it make sense to optionally make self._default_fuzz_source_length and self._default_fuzz_source_modulo depend on the length of the plaintext?
Then the fuzz would always be between 0 and a fixed percentage of the plaintext length, instead of always between 0 and 511 no matter how big the file gets.
It would be trivial code-wise, but would it be worth burning more pad?

Hi, @aaannndddyyy. A few thoughts:

  • The default_fuzz_source_length is the number of bytes to read from the pad to use for constructing an integer which is then the actual fuzz length. The only change I could imagine making with default_fuzz_source_length is bumping it from 2 to 4 -- i.e., from a 16 bit integer to a 32 bit integer. It's hard to imagine wanting more than 64k of fuzz, though! So really only the default_fuzz_source_modulo matters here.
  • We don't always know the length of the plaintext input in advance, because OneTime might be operating with stream-based input. This is another argument (as per my discussions in issue #17) for getting rid of the head fuzz and having only tail fuzz, because tail fuzz length can be adjusted as you describe above. I think there should be an upper limit, or some kind of curve that's different from just "fixed percentage" at least, since I'm not sure that in the case of a very large plaintext input it makes sense to also burn a correspondingly large amount of pad on fuzz.
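To make the first bullet concrete, here is a minimal sketch of how a fuzz length could be derived from pad bytes, assuming the defaults mentioned in this thread (2 source bytes, fuzz between 0 and 511). The function name `fuzz_length` and the big-endian byte order are assumptions made for illustration; this is not OneTime's actual code.

```python
def fuzz_length(pad_bytes, source_length=2, modulo=512):
    """Turn the first `source_length` pad bytes into a fuzz length.

    Hypothetical helper: the parameter names mirror the settings discussed
    in this thread, but the byte order (big-endian) is an assumption.
    """
    if len(pad_bytes) < source_length:
        raise ValueError("not enough pad bytes to derive a fuzz length")
    # Interpret the pad bytes as one unsigned integer (16 bits for 2 bytes).
    raw = int.from_bytes(pad_bytes[:source_length], "big")
    # The modulo cuts the raw value down to the range 0 .. modulo - 1.
    return raw % modulo
```

With 2 source bytes the raw integer tops out at 65535, so raising the modulo beyond 65536 would have no effect unless the source length is raised too.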

So basically, yes, I think this is a good idea, if done right, but I would like to finish the tweaks related to issue #17 first. Although it might not be clear from the discussion there, I'm considering taking @maqp's advice and just getting rid of one of the two fuzz sides, because (especially right now with a 0-512 limit on fuzz) there is no meaningful gain from disguising the position of the plaintext within the overall encrypted message. It's not solving a real problem, as long as the pad is really random OR the plaintext is not a known plaintext. Even if it were solving a problem, it would only make that problem at most 512 iterations harder to solve, which is not very much for a computer.

If I see it correctly, reading only 2 bytes to determine the fuzz size means 2^(2*8) = 65536 different numbers can be encoded. The modulo is for cutting it down if the number grows beyond a limit you set. 2^(2*8) bytes of fuzz is not much.
If you take an asymptotic curve instead of a fixed percentage, it effectively means the attacker can get pretty much the exact size if the file is very big, with only a tiny percentage of margin.
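The trade-off between the two approaches can be put into numbers with a rough sketch. Everything here is hypothetical: the 5 percent figure, the 65536-byte cap, and the function names are chosen only for illustration, and a hard cap stands in for whatever curve might actually be used.

```python
def fixed_percentage_bound(length, percent=5):
    """Upper bound on fuzz as a fixed percentage of the plaintext length."""
    return length * percent // 100


def capped_bound(length, percent=5, cap=65536):
    """Same percentage, but never more than `cap` bytes of fuzz."""
    return min(cap, fixed_percentage_bound(length, percent))


def relative_margin(bound, length):
    """Fraction of the plaintext length that the fuzz can hide at most."""
    return bound / length


# Compare pad cost (the bound) and size uncertainty (the margin) as files grow.
for length in (10_000, 1_000_000, 100_000_000):
    fixed = fixed_percentage_bound(length)
    capped = capped_bound(length)
    print(length, fixed, relative_margin(fixed, length),
          capped, relative_margin(capped, length))
```

Under these assumed numbers, the fixed percentage keeps the margin at 5 percent for a 100 MB input but can burn up to 5 MB of pad on fuzz, while the capped version burns at most 64 KiB and leaves the attacker a margin of well under 0.1 percent.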

Re "it's only making it at most 512 iterations harder to solve that problem, which is not very much for a computer.", see my comment on issue #17