CheckPointSW / Karta

Karta - source code assisted fast binary matching plugin for IDA

Running ThumbsUp on raw firmware binaries

jiska2342 opened this issue · comments

Hi,

I encountered a few issues when running the Thumbs Up script with the following configuration:

  • Up-to-date Ubuntu 19.10
  • Python 3.7.5
  • IDA Pro 7.4

The requirement sark==2.0 could not be installed, so I replaced it in the install script and just took the most recent one from GitHub, which was 7.8. This might already be the source of my subsequent errors ;)

/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py: unpack requires a buffer of 8 bytes
Traceback (most recent call last):
  File "/opt/idapro-7.4/python/3/ida_idaapi.py", line 593, in IDAPython_ExecScript
    exec(code, g)
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py", line 226, in <module>
    main()
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py", line 210, in main
    analyzer.linkFunctionClassifier()
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/analyzers/arm.py", line 48, in linkFunctionClassifier
    self.func_classifier = FunctionClassifier(self, function_feature_size, function_inner_offset, classifiers_start_offsets, classifiers_end_offsets, classifiers_mixed_offsets, classifier_type_offsets)
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/utils/function.py", line 68, in __init__
    numpy.random.seed(seed=struct.unpack("L", ida_nalt.retrieve_input_file_md5()[:4])[0])
struct.error: unpack requires a buffer of 8 bytes

Fixed this by replacing line 68 with numpy.random.seed(1337) and it worked.

Console output in IDA continues as follows:

[27/05/2020 08:14:08] - Thumbs Up Logger - INFO: Phase #4
[27/05/2020 08:14:08] - Thumbs Up Logger - INFO: Observe all code patterns from the improved analysis
[27/05/2020 08:14:08] - Thumbs Up Logger - INFO: There are 8913 scoped functions for code type 1
[27/05/2020 08:14:09] - Thumbs Up Logger - INFO: Calibration: Function Prologue Accuracy: 91.47%
[27/05/2020 08:14:11] - Thumbs Up Logger - INFO: Calibration: Function Epilogue Accuracy: 96.50%
[27/05/2020 08:14:12] - Thumbs Up Logger - INFO: Calibration: Function Prologue/Epilogue Accuracy: 97.00%
[27/05/2020 08:14:14] - Thumbs Up Logger - INFO: Testing: Function Prologue Accuracy: 91.72%
[27/05/2020 08:14:15] - Thumbs Up Logger - INFO: Testing: Function Epilogue Accuracy: 97.44%
[27/05/2020 08:14:16] - Thumbs Up Logger - INFO: Testing: Function Prologue/Epilogue Accuracy: 97.25%
[27/05/2020 08:14:22] - Thumbs Up Logger - INFO: Start marking functions, even without xrefs

Got the following error displayed in IDA:

/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py: 0
Traceback (most recent call last):
  File "/opt/idapro-7.4/python/3/ida_idaapi.py", line 593, in IDAPython_ExecScript
    exec(code, g)
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py", line 226, in <module>
    main()
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py", line 218, in main
    result = analysisStart(analyzer, code_segments, data_segments)
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py", line 122, in analysisStart
    functionScan(analyzer, scs)
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/analyzer_utils.py", line 172, in functionScan
    if analyzer.func_classifier.predictFunctionStart(line.start_ea, guess_code_type):
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/utils/function.py", line 368, in predictFunctionStart
    return self._start_classifiers[code_type].predict([sample])
KeyError: 0

IDA still continues automatic analysis afterward. Not sure if it worked or didn't. The results are definitely better than after just running a linear analysis on the ROM :)

Thanks for the detailed issue report. Will start working on it right away.
Just one question: Is this an ARM firmware file with the vast majority of functions being in THUMB mode?

I couldn't reproduce the error with struct.unpack("L") expecting 8 bytes instead of 4 bytes. I guess it comes from the type "long", which varies in size, but Python's documentation (in all versions) specifies this format as a fixed 4 bytes. Instead of just changing it to "I" (int), I'm trying to verify this and hopefully notify Python that they need to update their docs.

Could you elaborate on your exact setup and versions:

  • 64-bit / 32-bit
  • Does this size requirement in Python also persist outside of IDA Pro?

NVM, their documentation was just not clear enough. "L" indeed stands for "long", which in the default native mode is sized as "sizeof(long)" and hence varies in size.
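
A minimal sketch of the size difference described above, using only Python's struct module. The helper name seed_from_digest and the sample digest are made up for illustration, and this is not necessarily the change that was committed. Note that the "<" prefix also fixes the byte order to little-endian, which the original native-mode call did not do.

```python
import struct

# Native mode (no prefix): "L" follows the platform's C `unsigned long`,
# so its size depends on the platform and compiler.
native_long_size = struct.calcsize("L")

# Standard mode ("<", ">", "=" or "!" prefix): "L" and "I" are always 4 bytes.
fixed_long_size = struct.calcsize("<L")
fixed_int_size = struct.calcsize("<I")


def seed_from_digest(digest: bytes) -> int:
    """Hypothetical helper: build a 32-bit seed from the first 4 digest bytes."""
    # "<I" always consumes exactly 4 bytes, interpreted as little-endian.
    return struct.unpack("<I", digest[:4])[0]


# Made-up 16-byte digest, standing in for an input file's MD5.
example_digest = bytes.fromhex("0123456789abcdef0123456789abcdef")
seed = seed_from_digest(example_digest)

print(native_long_size, fixed_long_size, fixed_int_size, hex(seed))
```

The resulting value always fits in 32 bits, so it can be passed to a seeding function that expects an unsigned 32-bit integer.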

The firmware is ARM v7 little endian and mostly (only?) Thumb mode.

A few examples are available here: https://github.com/seemoo-lab/polypyus/tree/master/examples/history

I used IDA 64bit but with 32bit analysis.

This pull request fixed all the bugs listed in this issue, at least on my setup. If any of the bugs persist, please feel free to re-open this issue.

Thank you very much for this fast fix :)

The initial error is gone. But it still breaks on the Thumbs Up stage #4 with this message, on both ida and ida64:

Traceback (most recent call last):
  File "/opt/idapro-7.4/python/3/ida_idaapi.py", line 593, in IDAPython_ExecScript
    exec(code, g)
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py", line 231, in <module>
    main()
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py", line 223, in main
    result = analysisStart(analyzer, code_segments, data_segments)
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/thumbs_up_firmware.py", line 123, in analysisStart
    functionScan(analyzer, scs)
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/analyzer_utils.py", line 179, in functionScan
    if analyzer.func_classifier.predictFunctionStart(line.start_ea, guess_code_type):
  File "/media/sf_seemoo/software/Karta/src/thumbs_up/utils/function.py", line 366, in predictFunctionStart
    return self._start_classifiers[code_type].predict([sample])
KeyError: 0

Sorry for the late response, I saw the notification just now.

The code already supports predicting with only the single supported code type, which is meant to avoid this exception. The bug is that I accidentally checked the CPU's list of supported types instead of the list of active types. I'm now testing the patch to check that nothing breaks, and hopefully it will be committed very soon.
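
The kind of check described above can be sketched roughly as follows. This is an illustration under assumptions, not Karta's actual code: pick_code_type, cpu_code_types, and the lambda classifier are hypothetical, and the code types 0 and 1 are simply the values seen in the log and the traceback.

```python
def pick_code_type(guess, active_code_types):
    """Hypothetical helper: return a code type that has a trained classifier."""
    if guess in active_code_types:
        return guess
    if len(active_code_types) == 1:
        # Only one code type was trained, so use it regardless of the guess.
        return active_code_types[0]
    raise ValueError(f"no classifier for code type {guess}")


# Assumption: the CPU supports two code types, but only type 1 was trained.
cpu_code_types = [0, 1]
start_classifiers = {1: lambda sample: True}  # stand-in for a real classifier
active_code_types = list(start_classifiers)

guess = 0
# Checking `guess in cpu_code_types` passes here, and the lookup
# start_classifiers[guess] would then raise KeyError: 0.
code_type = pick_code_type(guess, active_code_types)
result = start_classifiers[code_type]("sample")

print(code_type, result)
```

Checking against the keys of the classifier table, rather than against what the CPU could support, keeps the lookup and the check in sync.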

It would be great if I could add your sample to my test suite. If this is indeed a file from https://github.com/seemoo-lab/polypyus/tree/master/examples/history, could you please share the *.idb / mapping instructions to IDA + list of code segments and data segments as printed out by Thumbs Up?

Yay, it's working now :D Two hours before the deadline, should still work. I'll send you the results, detailed setup, etc. later :)