androguard / androguard

Reverse engineering and pentesting for Android applications

ResParserError: res1 must be zero!

arcao opened this issue

I can't decode resources.arsc from any APK recently downloaded from the Google Play Console (universal APK or archived APK). I attached a sample archived APK, but the crash also occurs with a universal APK. An APK built directly by a Gradle task does not trigger it.

> androguard --debug arsc 240307133-archived.apk

2024-03-12 16:28:27.029 | INFO     | androguard.core.apk:_apk_analysis:312 - Starting analysis on AndroidManifest.xml
2024-03-12 16:28:27.124 | INFO     | androguard.core.apk:_apk_analysis:369 - APK file was successfully validated!
Traceback (most recent call last):
  File "C:\Python312\Lib\site-packages\androguard\core\apk\__init__.py", line 1543, in get_android_resources
    return self.arsc["resources.arsc"]
           ~~~~~~~~~^^^^^^^^^^^^^^^^^^
KeyError: 'resources.arsc'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Python312\Scripts\androguard.exe\__main__.py", line 7, in <module>
  File "C:\Python312\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\androguard\cli\cli.py", line 169, in arsc
    arscobj = a.get_android_resources()
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\androguard\core\apk\__init__.py", line 1549, in get_android_resources
    self.arsc["resources.arsc"] = ARSCParser(self.zip.read("resources.arsc"))
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\androguard\core\axml\__init__.py", line 1471, in __init__
    self.packages[package_name].append(ARSCResTypeSpec(self.buff, pc))
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\androguard\core\axml\__init__.py", line 2339, in __init__
    raise ResParserError("res1 must be zero!")
androguard.core.axml.ResParserError: res1 must be zero!

System Information

  • Androguard Version: 4.1.0
  • Python Version: 3.12.2
  • Operating System: Windows 10 Pro x64 22H2

Hi @arcao, this issue was raised by @Ch0pin in #1008, and that PR should resolve it. Make sure to install a version of androguard that includes the latest commits. Regarding MobSF, I can see that it is still using an old version of androguard.

Additionally, since this has come up again, was not an issue in the past, and also affects older androguard versions, I decided to take a closer look.

It seems that the ResTable_typeSpec struct, as defined on the main branch of the Android source, still states that both res0 and res1 must be zero. So this does not appear to come from an update in the Android source, although it is evident that several apps now ship resources in which res0 and res1 hold values other than zero.

ResTable_typeSpec contains the specification for a resource type, while ResTable_type represents actual instances of resources within that type. This issue appears in split APKs, and androguard cannot yet fully handle split APKs. Taken together, this means we need to revisit the current fix in the future, once we have more feedback, to make sure that all resource types and instances are accounted for properly.

When can we expect a new release with the changes in #1008?

Hi @ajinabraham, I will aim to release a patch version over the next few days. It is still under consideration whether the current fix might affect some other cases, but in theory it should work fine for all non-split/universal APKs.

FYI, for the Debian package I just turned those errors into warnings, and it seems to work fine. This is based on advice from @reox in one of the discussions in the issue tracker:

https://salsa.debian.org/python-team/packages/androguard/-/commit/89459f99a71561f2daf20199f37d5485205b0941

I switched to the upstream patch from #1008 and pushed 3.4.0~a1-12 to Debian.

How about applying the patch from #1008 on top of 3.4.0~a1 and calling that 3.4.0~a2? It could be a quick fix for those of us who want this fix sooner rather than later.

Here's the patch I used for the Debian package, which applies cleanly on 3.4.0~a1:

From 187b912784d77a36b4c36289e76b722127d272d1 Mon Sep 17 00:00:00 2001
From: Ch0pin <ch0pin@mackis.lan>
Date: Thu, 7 Mar 2024 17:21:38 +0000
Subject: [PATCH 1/1] added error handling for "res1" and "res0" must be zero
 errors which caused aborting the parsing

Forwarded: https://github.com/androguard/androguard/pull/1008
---
 androguard/core/bytecodes/axml/__init__.py | 35 ++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 15 deletions(-)

--- a/androguard/core/bytecodes/axml/__init__.py
+++ b/androguard/core/bytecodes/axml/__init__.py
@@ -2175,16 +2175,18 @@
         self.id = unpack('<B', buff.read(1))[0]
         self.res0 = unpack('<B', buff.read(1))[0]
         self.res1 = unpack('<H', buff.read(2))[0]
-        if self.res0 != 0:
-            raise ResParserError("res0 must be zero!")
-        if self.res1 != 0:
-            raise ResParserError("res1 must be zero!")
-        self.entryCount = unpack('<I', buff.read(4))[0]
-
-        self.typespec_entries = []
-        for i in range(0, self.entryCount):
-            self.typespec_entries.append(unpack('<I', buff.read(4))[0])
-
+        try:
+            if self.res0 != 0:
+                raise ResParserError("res0 must be zero!")
+            if self.res1 != 0:
+                raise ResParserError("res1 must be zero!")
+            self.entryCount = unpack('<I', buff.read(4))[0]
+
+            self.typespec_entries = []
+            for i in range(0, self.entryCount):
+                self.typespec_entries.append(unpack('<I', buff.read(4))[0])
+        except ResParserError as e:
+            log.warning(e)
 
 class ARSCResType:
     """
@@ -2663,11 +2665,14 @@
 
         self.size, = unpack("<H", buff.read(2))
         self.res0, = unpack("<B", buff.read(1))
-        if self.res0 != 0:
-            raise ResParserError("res0 must be always zero!")
-        self.data_type = unpack('<B', buff.read(1))[0]
-        # data is interpreted according to data_type
-        self.data = unpack('<I', buff.read(4))[0]
+        try:
+            if self.res0 != 0:
+                raise ResParserError("res0 must be always zero!")
+            self.data_type = unpack('<B', buff.read(1))[0]
+            # data is interpreted according to data_type
+            self.data = unpack('<I', buff.read(4))[0]
+        except ResParserError as e:
+            log.warning(e)
 
     def get_data_value(self):
         return self.parent.stringpool_main.getString(self.data)
-- 
2.39.2

From a quick analysis of the resources.arsc from the Airbnb app, I think the reserved res1 field in ResTable_typeSpec is being used to count the number of ResTable_type chunks that follow it. I am not sure why this happens or what its purpose is.
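One way to test that hypothesis on other files is to walk the chunks and compare res1 with the number of ResTable_type chunks that follow each ResTable_typeSpec. The sketch below is an assumption-laden illustration, not androguard code: count_types_per_spec and _chunk are hypothetical helpers, the input is a flat run of synthetic chunks (a real resources.arsc nests these inside a package chunk), and only the chunk type constants 0x0201 and 0x0202 and the 8-byte chunk header layout come from the Android resource table format.

```python
import struct

RES_TABLE_TYPE_TYPE = 0x0201       # ResTable_type chunk
RES_TABLE_TYPE_SPEC_TYPE = 0x0202  # ResTable_typeSpec chunk
CHUNK_HEADER = struct.Struct("<HHI")  # uint16 type, uint16 headerSize, uint32 size


def count_types_per_spec(data: bytes):
    """Return (type id, res1, following type chunks) for each typeSpec."""
    results = []
    offset = 0
    while offset + CHUNK_HEADER.size <= len(data):
        chunk_type, _header_size, size = CHUNK_HEADER.unpack_from(data, offset)
        # Stop on truncated or malformed chunks instead of looping forever.
        if size < CHUNK_HEADER.size + 4 or offset + size > len(data):
            break
        body = offset + CHUNK_HEADER.size
        if chunk_type == RES_TABLE_TYPE_SPEC_TYPE:
            type_id, _res0, res1 = struct.unpack_from("<BBH", data, body)
            results.append([type_id, res1, 0])
        elif chunk_type == RES_TABLE_TYPE_TYPE and results:
            # The first byte after the chunk header is the type id.
            if data[body] == results[-1][0]:
                results[-1][2] += 1
        offset += size
    return [tuple(r) for r in results]


def _chunk(chunk_type: int, payload: bytes) -> bytes:
    """Build a synthetic chunk; headerSize is not realistic here."""
    size = CHUNK_HEADER.size + len(payload)
    return CHUNK_HEADER.pack(chunk_type, CHUNK_HEADER.size, size) + payload


# Synthetic stream: spec 1 (res1 = 2) with two types, spec 2 (res1 = 1) with one.
demo = (
    _chunk(RES_TABLE_TYPE_SPEC_TYPE, struct.pack("<BBHI", 1, 0, 2, 0))
    + _chunk(RES_TABLE_TYPE_TYPE, struct.pack("<BBHI", 1, 0, 0, 0))
    + _chunk(RES_TABLE_TYPE_TYPE, struct.pack("<BBHI", 1, 0, 0, 0))
    + _chunk(RES_TABLE_TYPE_SPEC_TYPE, struct.pack("<BBHI", 2, 0, 1, 0))
    + _chunk(RES_TABLE_TYPE_TYPE, struct.pack("<BBHI", 2, 0, 0, 0))
)
for type_id, res1, seen in count_types_per_spec(demo):
    print(f"type {type_id}: res1={res1}, type chunks seen={seen}")
```

If res1 matched the observed count across many real files, that would support the reading above; a mismatch would suggest the field means something else.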

Now, regarding raising an error or simply logging it:

  • If an error is raised, parsing of the rest of the chunk stops and we skip to the next chunk, losing any information the skipped chunk contained.
  • If we simply log it and the reserved res0 or res1 fields do serve a purpose, that purpose is not part of the parser's existing logic, so the results will not be correct.

As far as I can tell from the main branch of the Android source code, res0 and res1 are still considered reserved. Looking at how other tools handle this, they either do not validate the values of res0 and res1 or simply skip these 3 bytes (as jadx does).

Based on the information above, I will keep the try-except blocks but, instead of raising, only log the condition as an error, so that the rest of the chunks are parsed properly.
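The trade-off between raising and logging can be illustrated with a small stand-alone parser. This is a rough sketch under stated assumptions, not the code that ships in androguard: TypeSpecSketch and its strict flag are hypothetical, and ResParserError here is a local stand-in for androguard's exception of the same name. The field layout follows the unpack calls in the patch quoted earlier in the thread.

```python
import io
import logging
import struct

log = logging.getLogger("typespec-sketch")


class ResParserError(Exception):
    """Local stand-in for androguard's ResParserError."""


class TypeSpecSketch:
    """Hypothetical reader for the body of a ResTable_typeSpec chunk."""

    def __init__(self, buff, strict=False):
        self.id, self.res0, self.res1 = struct.unpack("<BBH", buff.read(4))
        for name, value in (("res0", self.res0), ("res1", self.res1)):
            if value != 0:
                if strict:
                    # Strict mode aborts before the entries are read.
                    raise ResParserError(f"{name} must be zero!")
                # Lenient mode records the anomaly and keeps going.
                log.error("%s is %d, expected zero; continuing", name, value)
        (self.entry_count,) = struct.unpack("<I", buff.read(4))
        self.typespec_entries = list(
            struct.unpack(f"<{self.entry_count}I", buff.read(4 * self.entry_count))
        )


# Synthetic chunk body: id 3, res0 = 0, res1 = 4, two entry flags.
raw = struct.pack("<BBHI2I", 3, 0, 4, 2, 0x10, 0x20)

lenient = TypeSpecSketch(io.BytesIO(raw))
print(lenient.typespec_entries)

try:
    TypeSpecSketch(io.BytesIO(raw), strict=True)
except ResParserError as exc:
    print("strict mode stopped early:", exc)
```

Note that the check sits before the reads but does not wrap them, so in lenient mode the entry count and entries are always consumed and the buffer position stays aligned for the next chunk.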

I will release patch version 4.1.1 shortly.

Thanks, your update makes sense to me! Since this issue is related to AAB, I wonder if the bundletool source might give some insight into what the new uses of res0 and res1 are.

I believe we can close this for now and revisit it in the future if needed.
Indeed, @eighthave, it is a nice idea to check bundletool for hints on how the two reserved fields are now being used. I will put it in my backlog.