Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ndk-stack.py fails to symbolise on windows #1587

Closed
shrukul opened this issue Sep 22, 2021 · 15 comments
Closed

ndk-stack.py fails to symbolise on windows #1587

shrukul opened this issue Sep 22, 2021 · 15 comments
Assignees
Labels

Comments

@shrukul
Copy link

shrukul commented Sep 22, 2021

Description

I was learning about how to symbolise crash stacks on android (tombstone file). So, I induced a crash (nullptr-dereference) and it indeed produced a crash and generated a tombstone.txt file. When symbolising the crash stack via ndk-stack.py, I noticed that -

  1. On macOS (If apk is built on macOS and symbolised on macOS), I get a perfectly symbolised crash stack. (can see the source file and line number causing the crash)
  2. On windows (If apk is build on windows and symbolised on windows), the symbolise fails. (cannot see source file and line number)

I debugged further and found that the below snippet in ndk-stack.py file -

def get_zip_info_from_offset(zip_file, offset):
    """Get the ZipInfo object from a zip file.
    Returns: A ZipInfo object found at the 'offset' into the zip file.
             Returns None if no file can be found at the given 'offset'.
    """

    file_size = os.stat(zip_file.filename).st_size
    if offset >= file_size:
        return None

    infos = zip_file.infolist()
    if not infos or offset < infos[0].header_offset:

It looks like (correct me If I'm wrong) the above snippet expects infos to be sorted based on header_offset (this is true for apk built on macOS, but not on windows). To test out my hypothesis, I added a line to sort it -

def get_zip_info_from_offset(zip_file, offset):
    """Get the ZipInfo object from a zip file.
    Returns: A ZipInfo object found at the 'offset' into the zip file.
             Returns None if no file can be found at the given 'offset'.
    """

    file_size = os.stat(zip_file.filename).st_size
    if offset >= file_size:
        return None

    infos = zip_file.infolist()
    infos = sorted(infos, key=lambda k: k.header_offset)
    if not infos or offset < infos[0].header_offset:

and sure enough - the symbolise started working for me perfectly. Hence, thought I'd check with ndk team - do you see a need to introduce the above sort line? (I'm unsure why it's not sorted on windows. Is this due to apk not generated in proper format on windows for some reason?)

Environment Details

Not all of these will be relevant to every bug, but please provide as much
information as you can.

  • NDK Version: 22.1.7171670
  • Build system: CMake 3.10.2
  • Host OS: Windows 10.
  • ABI: x86_64 (any ABI for that matter)
  • NDK API level:
  • Device API level: 28

Let me know, if any more details are required!

@shrukul shrukul added the bug label Sep 22, 2021
@DanAlbert
Copy link
Member

r23 was released a little while ago, but I don't think ndk-stack has changed since r22. The tools it calls certainly did though, so it's probably worth checking if this is already fixed there.

@shrukul
Copy link
Author

shrukul commented Sep 22, 2021

Hi Dan. Thanks for the quick reply!

I just tried out ndkVersion "23.0.7599858", and the issue doesn't occur in that version.
Since the issue is not present in current LTS, closing the issue. Thanks!

@shrukul shrukul closed this as completed Sep 22, 2021
@enh-google
Copy link
Collaborator

that code in ndk-stack.py hasn't changed though (not since 2019-06), so if anything's changed it seems like it's the zip file on Windows? did you also rebuild, and now your apk is sorted? (if so, i'll see if i can find out why that changed, and whether there's a test to prevent that from becoming unsorted again. if not, we should probably take your change to use sorted() even if it is working right now :-) )

@enh-google enh-google self-assigned this Sep 22, 2021
@shrukul
Copy link
Author

shrukul commented Sep 23, 2021

Hi @enh-google / @DanAlbert -

I'm no-more able to reproduce this issue (tried again with ndk 22, but this time I'm able to symbolise the crash stack) without using sorted().

I haven't updated Android Studio since I last observed this issue, so not sure if a clean build may have changed anything.

@enh-google - I agree with your suggestion - If the zip file format "changes/can change", then it's probably the responsibility of ndk-stack.py script to be as resilient as possible to such a scenario. You/Dan are the right persons to take the decision though. ✌🏻

@enh-google
Copy link
Collaborator

not sure if a clean build may have changed anything

but you didn't come on the idea of using sorted() by accident, right? you worked that out from looking at your apks? and if you look at your current apks, they seem to already be sorted now?

@cferris1000
Copy link
Collaborator

To answer the original question, the code does expect that the info objects are "sorted", or, more specifically, that the entries are returned in the order in which each file exists in the archive. According to the python documentation for the infolist function:

Return a list containing a ZipInfo object for each member of the archive. The objects are in the same order as their entries in the actual ZIP file on disk if an existing archive was opened.

Given this, I don't know how you could have created a zip file that does not return this list in this order. Maybe there is some bug in python that doesn't obey the documentation. The way zips work, I can't even imagine how you could create a zip file that would not have the archive entries in order. If you can replicate the zip file that doesn't seem to follow the python documentation, it would be great to attach and see what might be going on.

@shrukul
Copy link
Author

shrukul commented Sep 24, 2021

@enh-google / @cferris1000 / @DanAlbert -

Okay, so here's the good and bad part -

  1. good part - I had duplicated the original apk file, so I have the apk file with me (and I can see that the header_offset is not sorted in the apk!)
  2. bad part - I do not have the symbol files nor the crash stack.

Anyway, the get_zip_info_from_offset() is used for figuring out which .so file resulted in the crash (not for actual symbolisation), hence I'm able to reproduce the issue effectively.

But I'm a bit concerned about sharing the apk file here (I work for a company, and the apk is basically a test app for an internal library. Plus, it's a debug-configuration-built app). So, I'll need to check with my team if it's safe to share with NDK team.

Alternatively (and I'm not sure if this is usually done) - Can we have a video conference where I can instead reproduce (and perform any steps you'd want me to?). If this isn't acceptable, I'll check with my team (on Monday) and get back.

@enh-google
Copy link
Collaborator

it's likely just the zipinfo output for the directory would be sufficient and not leak anything (unless the names of files in your zip would say too much)?

what created the zip file? was it custom tooling, or just Android Studio?

@shrukul
Copy link
Author

shrukul commented Oct 16, 2021

Hi @enh-google - Very sorry for the delayed response.

  1. Definitely, file names wouldn't be a problem. I haven't used zipinfo command before, so could you please tell the exact command to run? (I tried zipinfo -l base.apk, but I'm not sure if the output has what you require.)
  2. The apk was created by Android Studio. (we do not use a custom tooling. )

@cferris1000
Copy link
Collaborator

Using zipinfo -v should dump all of the relevant information. If you don't want to show all of this information, the three parts of information that would be useful are:

Central directory entry #40:
offset of local header from start of archive:   0
compressed size:                                11368 bytes

Dumping just that would tell me if something is weird about the apk. Specifically, if the offset is not always increasing, that would be odd and unexpected.

@shrukul
Copy link
Author

shrukul commented Oct 19, 2021

Hi @cferris1000 - Thanks for the command! PFA the output of zipinfo -v base.apk
out.txt

The C++ library is libgude_test.so. Please let know if you need any more details.

@cferris1000
Copy link
Collaborator

Okay, that definitely shows that the zip file is not in always increasing order. So, we'll definitely have to do the sort before the comparison. It probably worked for you because it depends on the ordering of the offsets. It looks like you got lucky sometimes so it make the problem appear and disappear.

Thanks for persevering through our requests for data, and we'll get a fiz in.

@DanAlbert DanAlbert reopened this Oct 19, 2021
@shrukul
Copy link
Author

shrukul commented Oct 20, 2021

Just a quick note -

  1. I faced the issue today, this time on macOS. (so, this is not windows-specific issue)
  2. If the header-offsets are not sorted, then ndk-stack.py may quit with exception. (Sorting fixes this..)

Below is the exception -


    File "/Users/labuser/Library/Android/sdk/ndk/22.1.7171670/prebuilt/darwin-x86_64/bin/ndk-stack.py", line 428, in <module>
      main(sys.argv[1:])
    File "/Users/labuser/Library/Android/sdk/ndk/22.1.7171670/prebuilt/darwin-x86_64/bin/ndk-stack.py", line 393, in main
      readelf_path, tmp_dir)
    File "/Users/labuser/Library/Android/sdk/ndk/22.1.7171670/prebuilt/darwin-x86_64/bin/ndk-stack.py", line 312, in get_elf_file
      tmp_dir.get_directory())
    File "/Users/labuser/Library/Android/sdk/ndk/22.1.7171670/prebuilt/darwin-x86_64/lib/python2.7/zipfile.py", line 1024, in extract
      return self._extract_member(member, path, pwd)
    File "/Users/labuser/Library/Android/sdk/ndk/22.1.7171670/prebuilt/darwin-x86_64/lib/python2.7/zipfile.py", line 1078, in _extract_member
      with self.open(member, pwd=pwd) as source, \
    File "/Users/labuser/Library/Android/sdk/ndk/22.1.7171670/prebuilt/darwin-x86_64/lib/python2.7/zipfile.py", line 1006, in open
      close_fileobj=should_close)
    File "/Users/labuser/Library/Android/sdk/ndk/22.1.7171670/prebuilt/darwin-x86_64/lib/python2.7/zipfile.py", line 526, in __init__
      self._decompressor = zlib.decompressobj(-15)
  AttributeError: 'NoneType' object has no attribute 'decompressobj'

@cferris1000
Copy link
Collaborator

I fixed this with a similar change to what was original proposed. The fix is:

https://android-review.googlesource.com/c/platform/ndk/+/1866134

I don't know what ndk version this will wind up in, but let us know if this doesn't work for you.

Thanks again for finding this.

@DanAlbert
Copy link
Member

DanAlbert commented Oct 27, 2021

Thanks for the fix @cferris1000 👍

Reopening just in case we need to cherry-pick, but will probably end up merging from master again. We'll get it into r24 either way (too late for beta 1, but will be in beta 2).

@DanAlbert DanAlbert reopened this Oct 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants