CheckPointSW / Karta

Karta - source code assisted fast binary matching plugin for IDA

thumbs_up_ELF crashing on ARM binary

MrPeck opened this issue · comments

When I run thumbs_up_ELF on an ARM 32-bit binary, I get the following exception:

C:\Users\pedro.peck\Desktop\Karta\src\thumbs_up\thumbs_up_ELF.py: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Traceback (most recent call last):
File "C:\Program Files\IDA 7.2\python\ida_idaapi.py", line 572, in IDAPython_ExecScript
execfile(script, g)
File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up/thumbs_up_ELF.py", line 186, in <module>
main()
File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up/thumbs_up_ELF.py", line 178, in main
result = analysisStart(analyzer, code_segments, data_segments)
File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up/thumbs_up_ELF.py", line 43, in analysisStart
if not gatherIntel(analyzer, scs, sds):
File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up\analyzer_utils.py", line 20, in gatherIntel
if not analyzer.func_classifier.calibrateFunctionClassifier(scs):
File "C:/Users/pedro.peck/Desktop/Karta/src/thumbs_up\utils\function.py", line 217, in calibrateFunctionClassifier
clf.fit(X_train, Y_train)
File "C:\Python27\lib\site-packages\sklearn\ensemble\forest.py", line 250, in fit
X = check_array(X, accept_sparse="csc", dtype=DTYPE)
File "C:\Python27\lib\site-packages\sklearn\utils\validation.py", line 552, in check_array
"if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Let me know if anything is unclear! :)

Hi, sorry for the delay; I was out of the office.

From the exception, it looks like you have a single sample (a single function to train on). Could you add some debug prints and check whether this is indeed the case? How many functions are sent to the classifier for calibration?
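A debug print along these lines might answer that question. This is only a sketch: `summarize_training_set` is a hypothetical helper, not part of Karta, and it assumes `X_train` is a list of per-function feature lists and `Y_train` the matching list of labels, as the `clf.fit(X_train, Y_train)` call in the traceback suggests. Note that `array=[]` in the message hints that the list may have been empty rather than holding one sample.

```python
def summarize_training_set(X_train, Y_train):
    """Return a one-line summary of a training set (hypothetical debug helper)."""
    num_samples = len(X_train)
    # Feature count of the first sample, or 0 when there are no samples at all
    num_features = len(X_train[0]) if num_samples > 0 else 0
    return "samples=%d, labels=%d, features per sample=%d" % (
        num_samples, len(Y_train), num_features)


# An empty set, which is what `array=[]` in the exception hints at
print(summarize_training_set([], []))
# A tiny well-formed set: two samples with three features each
print(summarize_training_set([[1, 0, 1], [0, 1, 1]], [1, 0]))
```

Printing this summary just before the `clf.fit(...)` call would show how many functions actually reached the classifier.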

While I probably need to update the code to handle this case better, I can't believe that any meaningful training could be done on a set containing a single sample...

Added better error handling.
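The thread does not show what the new handling looks like. As a rough illustration only, one possible guard is to check the size of the training set before fitting. Everything here is an assumption: `fit_if_possible`, the `min_samples` threshold, and the stand-in classifier are hypothetical names, and the real fix in Karta may differ.

```python
def fit_if_possible(clf, X_train, Y_train, min_samples=2):
    """Fit clf only when the training set is usable; return True on success.

    Hypothetical guard, not Karta's actual code. min_samples is an
    arbitrary illustrative threshold.
    """
    if len(X_train) != len(Y_train):
        print("Mismatch: %d samples but %d labels" % (len(X_train), len(Y_train)))
        return False
    if len(X_train) < min_samples:
        print("Only %d training sample(s) found, need at least %d"
              % (len(X_train), min_samples))
        return False
    clf.fit(X_train, Y_train)
    return True


class RecordingClassifier(object):
    """Stand-in for a real classifier; only records whether fit() was called."""

    def __init__(self):
        self.fitted = False

    def fit(self, X, Y):
        self.fitted = True


# The empty case from the issue is rejected without calling fit()
empty_clf = RecordingClassifier()
print(fit_if_possible(empty_clf, [], []))
# A small but valid set goes through
ok_clf = RecordingClassifier()
print(fit_if_possible(ok_clf, [[1, 0], [0, 1]], [1, 0]))
```

Returning `False` instead of raising matches the `if not ...calibrateFunctionClassifier(scs):` pattern visible in the traceback, where the caller already tests the result.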

If the issue persists even with a large number of functions, please feel free to re-open the ticket.