vmonaco / kreep

Keystroke Recognition and Entropy Elimination Program


A question regarding the implementation of KREEP’s "Word Identification" stage

chenfeifei0927 opened this issue · comments

Dear Professor Monaco,

Hello Professor, I apologize for disturbing you during your busy schedule. I am an undergraduate student at Beihang University (BUAA) in China, majoring in Cyber Security.

While studying side-channel attacks on search engines with autocomplete, I recently read one of your related papers, “What Are You Searching For? A Remote Keylogging Attack on Search Engine Autocomplete” (USENIX '19), and noticed that the source code of KREEP is available on GitHub.
However, while trying to reproduce the attack from the available source code, I ran into a question about the “Word identification” stage. In the paper, a three-layer neural network is used to predict key probabilities: a bidirectional recurrent neural network (BiRNN) with GRU units, a 1D convolutional layer, and a dense layer with softmax activation. In kreep-master/kreep/keytiming.py, however, the bigram timing model is trained by maximum likelihood estimation, using the 136M keystroke dataset to derive the mean and standard deviation of a Gaussian distribution, and the probability of each candidate word is computed from the Gaussian PDF rather than from the neural network. There are significant differences between these two methods, and I believe the attack performance may be affected to some degree.
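
To make sure I am reading keytiming.py correctly, here is a minimal sketch of the Gaussian-based word scoring as I understand it. The function name, variable names, and numbers below are my own placeholders for illustration and are not taken from the repository:

```python
from scipy.stats import norm

# Hypothetical bigram timing statistics, learned by maximum likelihood from a
# large keystroke dataset: (mean, std) of the inter-key latency in ms for each
# bigram. These values are placeholders, not values from keytiming.py.
bigram_stats = {
    ('t', 'h'): (95.0, 30.0),
    ('h', 'e'): (110.0, 35.0),
    ('h', 'i'): (105.0, 32.0),
    ('i', 's'): (120.0, 40.0),
}

def word_log_likelihood(word, latencies, stats, fallback=(130.0, 50.0)):
    """Score a candidate word by the Gaussian log-density of the observed
    inter-key latencies under the word's bigram timing model."""
    score = 0.0
    for (a, b), dt in zip(zip(word, word[1:]), latencies):
        mu, sigma = stats.get((a, b), fallback)  # unseen bigram -> generic prior
        score += norm.logpdf(dt, loc=mu, scale=sigma)
    return score

# Example: two observed inter-keystroke latencies for a 3-letter query,
# scored against two candidate words.
observed = [100.0, 115.0]
for candidate in ['the', 'his']:
    print(candidate, word_log_likelihood(candidate, observed, bigram_stats))
```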

If possible, could you please send me the source code of the “Word identification” stage that uses the three-layer neural network? I promise it will be used only for research purposes.

I would appreciate any help you can offer. Looking forward to your reply!