packing-box / docker-packing-box

Docker image gathering packers and tools for making datasets of packed executables and training machine learning models for packing detection

Error in one feature computation causes other features to not be computed

Opened by AlexVanMechelen

Issue

Some executables yield a CFG with very few nodes. If the total length of the extracted instructions is smaller than the n-gram size n, ngram_hist returns an empty list. Features that index into its result, such as zeropad(128, default=0)(binary['cfg']['ngram_hist'](3, True, False)[1])[0], then raise IndexError: list index out of range. The exception does not only leave the ngram_hist-related features empty: all other features stay empty as well, including the non-CFG-based ones that compute successfully on their own.
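The failure mode and one possible mitigation can be sketched in a few lines of Python. This is not the project's actual code: ngram_hist, zeropad, FEATURES and compute_features below are hypothetical stand-ins written only to mirror the behaviour described in this issue, and the return shape of ngram_hist (an empty list, or a pair of n-grams and counts) is an assumption. The idea is to evaluate each feature in its own try/except so that one IndexError leaves the remaining features intact.

```python
from collections import Counter


def ngram_hist(instructions, n):
    """Hypothetical stand-in: returns [] when there are fewer than n
    instructions, otherwise a pair (ngrams, counts) by decreasing count."""
    if len(instructions) < n:
        return []
    grams = [tuple(instructions[i:i + n]) for i in range(len(instructions) - n + 1)]
    ranked = Counter(grams).most_common()
    return [g for g, _ in ranked], [c for _, c in ranked]


def zeropad(length, default=0):
    """Hypothetical stand-in: truncate or pad a sequence to a fixed length."""
    def pad(values):
        values = list(values)[:length]
        return values + [default] * (length - len(values))
    return pad


# Illustrative feature set: one feature unrelated to the CFG, one n-gram feature.
FEATURES = {
    "file_size": lambda s: len(s["raw"]),
    "ngram_top_count": lambda s: zeropad(128, default=0)(
        ngram_hist(s["instructions"], 3)[1])[0],
}


def compute_features(sample, features=FEATURES):
    """Evaluate every feature independently; a failing feature is set to None
    and its error is recorded instead of aborting the whole computation."""
    values, errors = {}, {}
    for name, func in features.items():
        try:
            values[name] = func(sample)
        except Exception as exc:
            values[name] = None
            errors[name] = f"{type(exc).__name__}: {exc}"
    return values, errors


# A sample with fewer instructions than n=3 triggers the IndexError,
# but the unrelated feature is still computed.
tiny = {"raw": b"MZ\x90\x00", "instructions": ["push", "ret"]}
values, errors = compute_features(tiny)
print(values)
print(errors)
```

Whether a failed feature should become None, a default value, or be dropped from the record is a design choice left open here; the sketch only shows that the error can be contained per feature.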

Samples

/mnt/share/dataset-packed-pe/not-packed/IEExec.exe
/mnt/share/dataset-packed-pe/not-packed/AddInProcess32.exe
/mnt/share/dataset-packed-pe/not-packed/MacMakeup.exe
/mnt/share/dataset-packed-pe/not-packed/Updater5.exe