[BUG] Unwinding crashes on ARM32 but not elsewhere even when the exceptions are within a single library #1192

Closed
Marian-Kechlibar opened this issue Feb 18, 2020 · 41 comments

@Marian-Kechlibar

Marian-Kechlibar commented Feb 18, 2020

Description

I am not sure whether what I am reporting is an NDK bug or my own fault.

I have a large project with 8 pure C shared libraries and one C++ shared library which uses the functions defined in those C libraries.

The C++ library supports and uses exceptions. It throws special exceptions (called Leave in my code) and has a catch block for them. This setup works perfectly on ARM64 and x86 builds. (It also works on Windows and Linux, where the same code is cross-compiled.) It fails only in the ARM 32-bit build, where the application crashes.

I am linking the libraries with the gold linker, because with the experimentally supported lld linker I ran into the same problem as this user.

The problem is that the first C++ exception that occurs in my code will crash the application. The stack indicates that the unwind operation is going through libcares.so, one of the C libraries that probably (AFAIK) should not have any business unwinding C++ exception code. Not through libCryptoCult_Expert.so, which is the C++ library intended to catch its own exceptions. The exceptions do not cross the JNI boundary.

In other builds (ARM64, x86), libCryptoCult_Expert.so catches all the exceptions correctly and everything runs smoothly.

[Screenshot: unwind_exception_libcares]

I am building the 32-bit libraries with the flags suggested here in the NDK docs, so

-Wl,--exclude-libs,libgcc_real.a -Wl,--exclude-libs,libunwind.a

By manually inspecting the resulting *.so files of the C libraries using readelf I can see that they still contain the Unwind and other GNU symbols in ARM32, but not in ARM64 or x86 variants. It is about 50 extra symbols per *.so.

I can get rid of those symbols by specifying -Wl,--exclude-libs,libgcc.a additionally, but the problem does not change, only the caught stack is now much less clear

[Screenshot: unwind_exception_from_JIT]

All the libraries have STL c++_shared specified. I also tried to specify none for the C libraries, with no change. I also tried to include libc++_shared.so in the APK and preload it, with no luck. Identical behavior.

Unfortunately I cannot share the proprietary code base, so any insight from the NDK gurus would be very welcome.


Environment Details

  • NDK Version: 21.0.6113669
  • Build system: Microsoft Visual Studio 2019
  • Host OS: Windows
  • ABI: armeabi-v7a
  • NDK API level: 21
  • Device API level: 21

I was able to reproduce the same behavior with an old ndk-build Linux toolchain with NDK 13b and API level 14.

@DanAlbert
Member

If you're trying to throw exceptions across shared library boundaries you also need to be aware of https://android.googlesource.com/platform/ndk/+/master/docs/user/common_problems.md#rtti_exceptions-not-working-across-library-boundaries

@Marian-Kechlibar
Author

Marian-Kechlibar commented Feb 19, 2020

I am not trying to throw exceptions across libraries. Yes, I read your link before. I read everything I could lay my hands on before coming here and asking for help. I have done 20 hours of experiments.

I did my best to describe the situation, but perhaps my description was too long and unreadable. (I am not a native speaker.) So a shorter one:

  • Just one of the libraries is C++, all the throwing and catching code is within it, it does not attempt intentionally to throw exceptions out of itself.

  • All the other libraries are pure C and are not intended even to know what exceptions are, much less throw or handle them.

So why does ARM32 build introduce the Unwind symbols into those C libraries and why are they actually invoked instead of the correct, intended catch block within the C++ library?

I also created a similar issue last year. It turned out that my problem was not solved, it is still extant. I only thought it was solved because I started building ARM64 for my new experimental phone and stopped experimenting with ARM32. Yes, everything works on ARM64.

@Marian-Kechlibar Marian-Kechlibar changed the title [BUG] Unwinding crashes on a 32-bit build, but not on 64-bit build or x86 build. [BUG] Unwinding crashes on ARM32 but not elsewhere even when the exceptions are within a single library Feb 19, 2020
@DanAlbert
Member

Sorry, I'd misunderstood.

What exception types are you throwing? If they're standard exception types (or derived from them) and you're using the shared libc++, that might still have the same issues.

The reason that the 32-bit ARM libraries have the symbols exposed and the others don't is probably because libgcc.a is a linker script in that platform (as you've seen with libgcc_real.a).

I can get rid of those symbols by specifying -Wl,--exclude-libs,libgcc.a additionally, but the problem does not change, only the caught stack is now much less clear

To clarify, do any of the libraries in your APK expose any unwind symbols? I'm not sure if you mean that you were able to hide them for just your C++ library or for all of them.

@DanAlbert DanAlbert reopened this Feb 19, 2020
@Marian-Kechlibar
Author

TLeaveEx is just a simple class derived from std::exception with one integer as a member variable.

class TLeaveEx : public std::exception{
public:
	inline TLeaveEx(TInt aCode){
		iCode = aCode;
	};

public:
	TInt GetCode() const{
		return iCode;
	};

protected:
	TInt iCode;
};
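
For illustration, here is a minimal sketch of the setup described in this thread, where the throw and the catch both live inside the same C++ library. It assumes `TInt` is a plain `int` typedef; `RiskyOperation` and `RunGuarded` are hypothetical names that do not come from the original code base.

```cpp
#include <cassert>
#include <exception>

// Assumption: TInt is a plain integer typedef in the original code base.
typedef int TInt;

// Same shape as the class above: a std::exception carrying one integer code.
class TLeaveEx : public std::exception {
public:
    explicit TLeaveEx(TInt aCode) : iCode(aCode) {}
    TInt GetCode() const { return iCode; }

protected:
    TInt iCode;
};

// Hypothetical function that "leaves" with a non-zero error code.
void RiskyOperation(TInt aCode) {
    if (aCode != 0) {
        throw TLeaveEx(aCode);
    }
}

// Hypothetical trap: the throw and the catch are in the same library, so the
// exception never crosses a shared-library or JNI boundary.
TInt RunGuarded(TInt aCode) {
    try {
        RiskyOperation(aCode);
        return 0;
    } catch (const TLeaveEx& e) {  // catch by const reference to avoid slicing
        return e.GetCode();
    }
}
```

Calling `RunGuarded` with a non-zero code should hand that code back to the caller once the unwinder works correctly.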

I tried to use both c++_shared and c++_static, the behavior seems to be identical.

To clarify, do any of the libraries in your APK expose any unwind symbols? I'm not sure if you mean that you were able to hide them for just your C++ library or for all of them.

I set " -Wl,--exclude-libs,libgcc.a" for all the C libraries and that flag removed the unwind symbols from all the C libraries successfully. So, in all likelihood, they were still present in the C++ library.

I haven't tried to remove them from the C++ library yet, but I will try it now and describe the result.

@Marian-Kechlibar
Author

Marian-Kechlibar commented Feb 19, 2020

I could not remove all the Unwind symbols from the C++ library.

_Unwind_Resume, __aeabi_unwind_cpp_pr0, __aeabi_unwind_cpp_pr1, _Unwind_DeleteException, _Unwind_VRS_Get, _Unwind_VRS_Set, _Unwind_RaiseException, __gnu_unwind_frame, _Unwind_GetRegionStart, _Unwind_GetLanguageSpecif

are still present. As is the problem (crash).

@DanAlbert
Member

Sorry, somehow missed your last update.

When you say they're still present, present in what form? Use readelf -sW $LIBRARY | grep _Unwind_RaiseException. If things are LOCAL it's fine, but DEFAULT or UND are a problem.
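
The rule of thumb above can be sketched as a small classifier for individual lines of `readelf -sW` output. This is only an illustration: it assumes the usual GNU readelf column order (Num, Value, Size, Type, Bind, Vis, Ndx, Name), and `SymbolStatus` and `ClassifySymbolLine` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Verdicts following the rule of thumb: LOCAL is fine, DEFAULT or UND are not.
enum class SymbolStatus { Fine, Exported, Undefined, NotASymbolLine };

// Classifies one line of `readelf -sW` output. Assumes the column order
// Num, Value, Size, Type, Bind, Vis, Ndx, Name. Entries with an empty name
// are reported as NotASymbolLine by this sketch.
SymbolStatus ClassifySymbolLine(const std::string& line) {
    std::istringstream in(line);
    std::string num, value, size, type, bind, vis, ndx, name;
    if (!(in >> num >> value >> size >> type >> bind >> vis >> ndx >> name)) {
        return SymbolStatus::NotASymbolLine;
    }
    // Real entries start with an index such as "123:"; the header has "Num:".
    if (num.size() < 2 || num.back() != ':') {
        return SymbolStatus::NotASymbolLine;
    }
    for (std::size_t i = 0; i + 1 < num.size(); ++i) {
        if (num[i] < '0' || num[i] > '9') {
            return SymbolStatus::NotASymbolLine;
        }
    }
    if (ndx == "UND") {
        return SymbolStatus::Undefined;  // resolved from another library at load time
    }
    if (bind != "LOCAL" && vis == "DEFAULT") {
        return SymbolStatus::Exported;  // GLOBAL or WEAK with default visibility
    }
    return SymbolStatus::Fine;  // LOCAL binding or non-default visibility
}
```

Feeding every line that mentions an `_Unwind_` symbol through such a check would flag the libraries that export or import the unwinder.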

@Marian-Kechlibar
Author

Sorry, somehow missed your last update.

When you say they're still present, present in what form? Use readelf -sW $LIBRARY | grep _Unwind_RaiseException. If things are LOCAL it's fine, but DEFAULT or UND are a problem.

GLOBAL DEFAULT in all of them. That is probably the core of the problem.

@DanAlbert
Member

Yeah, that'll be the issue. If you can provide the link flags used to build it (the actual command line that was run, with -v is best) we might be able to see what went wrong.

Closing since this doesn't appear to be a bug, but we're still here to help if we can :)

@Marian-Kechlibar
Author

Yeah, that'll be the issue. If you can provide the link flags used to build it (the actual command line that was run, with -v is best) we might be able to see what went wrong.

Closing since this doesn't appear to be a bug, but we're still here to help if we can :)

Thank you! I am attaching linker input for a single library. It is just one line, fairly long.

"C:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/../../../../arm-linux-androideabi/bin\ld" "--sysroot=C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm" -z noexecstack -EL --warn-shared-textrel -z now -z relro -X --hash-style=both --enable-new-dtags --eh-frame-hdr -m armelf_linux_eabi -shared -o "ARM\Release\libcares.so" "C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/../lib\crtbegin_so.o" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\llvm\prebuilt\windows-x86_64\lib64\clang\9.0.8\lib\linux\arm" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/armv7-a" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/../../../../arm-linux-androideabi/lib/../lib/armv7-a" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/../lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/arm-linux-androideabi/../../lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/../../../../arm-linux-androideabi/lib/armv7-a" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib" -soname=libcares.so "-rpath-link=C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm\usr\lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm\usr\lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\lib\gcc\arm-linux-androideabi\4.9.x\armv7-a" --no-undefined -z relro -z now -z noexecstack --no-wchar-size-warning "ARM\Release\ares_android.o" "ARM\Release\ares_cancel.o" "ARM\Release\ares_create_query.o" 
"ARM\Release\ares_data.o" "ARM\Release\ares_destroy.o" "ARM\Release\ares_expand_name.o" "ARM\Release\ares_expand_string.o" "ARM\Release\ares_fds.o" "ARM\Release\ares_free_hostent.o" "ARM\Release\ares_free_string.o" "ARM\Release\ares_getenv.o" "ARM\Release\ares_gethostbyaddr.o" "ARM\Release\ares_gethostbyname.o" "ARM\Release\ares_getnameinfo.o" "ARM\Release\ares_getopt.o" "ARM\Release\ares_getsock.o" "ARM\Release\ares_init.o" "ARM\Release\ares_library_init.o" "ARM\Release\ares_llist.o" "ARM\Release\ares_mkquery.o" "ARM\Release\ares_nowarn.o" "ARM\Release\ares_options.o" "ARM\Release\ares_parse_aaaa_reply.o" "ARM\Release\ares_parse_a_reply.o" "ARM\Release\ares_parse_mx_reply.o" "ARM\Release\ares_parse_naptr_reply.o" "ARM\Release\ares_parse_ns_reply.o" "ARM\Release\ares_parse_ptr_reply.o" "ARM\Release\ares_parse_soa_reply.o" "ARM\Release\ares_parse_srv_reply.o" "ARM\Release\ares_parse_txt_reply.o" "ARM\Release\ares_platform.o" "ARM\Release\ares_process.o" "ARM\Release\ares_query.o" "ARM\Release\ares_search.o" "ARM\Release\ares_send.o" "ARM\Release\ares_strcasecmp.o" "ARM\Release\ares_strdup.o" "ARM\Release\ares_strerror.o" "ARM\Release\ares_strsplit.o" "ARM\Release\ares_timeout.o" "ARM\Release\ares_version.o" "ARM\Release\ares_writev.o" "ARM\Release\ares__close_sockets.o" "ARM\Release\ares__get_hostent.o" "ARM\Release\ares__read_line.o" "ARM\Release\ares__timeval.o" "ARM\Release\bitncmp.o" "ARM\Release\inet_net_pton.o" "ARM\Release\inet_ntop.o" "ARM\Release\windows_port.o" -lgcc -ldl -lc -lgcc -ldl "C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/../lib\crtend_so.o"

@DanAlbert
Member

That link command is missing the --exclude-libs args that are needed to prevent those symbols from being re-exported.
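
As a rough sketch of how a build script could catch this, the helper below reports which archives are not passed to `--exclude-libs` in a list of linker arguments. The function name `MissingExcludeLibs` is hypothetical, and the three archive names in the usage note are the ones mentioned in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the archives from `wanted` that the arguments do not pass to
// --exclude-libs. Handles "--exclude-libs X", "--exclude-libs=X" and the
// compiler-driver form "-Wl,--exclude-libs,X". Comma-separated archive lists
// in a single argument are not handled by this sketch.
std::vector<std::string> MissingExcludeLibs(
        const std::vector<std::string>& args,
        const std::vector<std::string>& wanted) {
    std::vector<std::string> missing;
    for (const std::string& lib : wanted) {
        bool found = false;
        for (std::size_t i = 0; i < args.size() && !found; ++i) {
            if (args[i] == "--exclude-libs") {
                // Two-argument form: the archive name is the next argument.
                found = i + 1 < args.size() && args[i + 1] == lib;
            } else {
                found = args[i] == "--exclude-libs=" + lib ||
                        args[i] == "-Wl,--exclude-libs," + lib;
            }
        }
        if (!found) {
            missing.push_back(lib);
        }
    }
    return missing;
}
```

Called with the arguments of the libcares link command above and the list `libgcc.a`, `libgcc_real.a`, `libunwind.a`, it would return all three names, since that command contains no `--exclude-libs` argument at all.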

@Marian-Kechlibar
Author

That link command is missing the --exclude-libs args that are needed to prevent those symbols from being re-exported.

Should I edit the NDK somehow in order to add the --exclude-libs argument automatically?

@Marian-Kechlibar
Author

BTW --exclude-libs with which arguments?

@DanAlbert
Member

Should I edit the NDK somehow in order to add the --exclude-libs argument automatically?

It's a bug in the build system you're using. Should probably raise the bug with MS (assuming VS is actually responsible for the build here). Show them https://android.googlesource.com/platform/ndk/+/master/docs/BuildSystemMaintainers.md

You could probably add the arguments manually in your build scripts, but I don't know anything about how the Visual Studio build works.

BTW --exclude-libs with which arguments?

See https://android.googlesource.com/platform/ndk/+/master/docs/BuildSystemMaintainers.md#unwinding

@Marian-Kechlibar
Author

OK, thank you, last questions for today. Should I add the parameters into every single library, including the one that is actually C++ and uses exceptions internally? Will those exceptions still be caught?

I will do that tomorrow and report the results.

@DanAlbert
Member

DanAlbert commented Apr 13, 2020

Yeah, all libraries is the best solution. Note that libunwind.a (currently) is only used on 32-bit Arm.

Another thing you could do that's slightly more involved but is generally a good idea: use a version script to forcibly restrict the API surface of your libraries. This has the same effect as the above but uses a whitelist instead of a blacklist and applies it to your whole library as opposed to just blacklisted dependencies. For example, if your library exposes jniFoo and jniBar, you can use the following version script (libapp.map.txt, for this example):

APP_1 { # This is just an arbitrary name
  global:
    jniFoo;
    jniBar;
  local:
    *;
};

And add -Wl,--version-script,libapp.map.txt to your link flags. jniFoo and jniBar will be the only symbols exported from your library this way. You need to make sure you add any new functions you want to expose to the map file, but IMO that's a feature.
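
One detail worth keeping in mind when the library is C++: the version script matches symbol names as they appear in the symbol table, so the plain entries `jniFoo;` and `jniBar;` only match functions with unmangled names. A minimal sketch, with hypothetical signatures and a hypothetical internal helper `AddOne`:

```cpp
#include <cassert>

namespace {
// Internal helper: the anonymous namespace gives it internal linkage, so it
// is never a candidate for export, whatever the version script says.
int AddOne(int x) { return x + 1; }
}  // namespace

// extern "C" keeps the symbol names unmangled, so they match the plain
// `jniFoo;` and `jniBar;` entries in libapp.map.txt.
// The signatures are hypothetical placeholders.
extern "C" int jniFoo(int x) { return AddOne(x); }
extern "C" int jniBar(int x) { return AddOne(AddOne(x)); }
```

Any other non-static function in the library would fall under the `local: *;` rule and stay hidden.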

We should really get something about this into a best practices doc... #1235

@Marian-Kechlibar
Author

So, I added --exclude-libs to all the projects and I am now reasonably sure that the Unwind symbols are no longer exposed in any of the libraries. I also added the --version-script argument to most of the C libraries.

(A side question: what must be exposed from the C++ library? Should, for example, all the methods that are bound to a JNI, be exposed?)

Now, my app still crashes during the first exception, but a little differently. I am attaching a screenshot. This looks very similar to this bug:

#289

BUT there are no undefined unwind symbols in the C++ library (unlike the situation that you address in your comment here).

I am attaching readelf output on all the libraries, too.
[Screenshot: crash-read-encoded-pointer_2]
[Attachment: armeabi-v7a-symbols.zip]

@enh-google
Collaborator

(A side question: what must be exposed from the C++ library? Should, for example, all the methods that are bound to a JNI, be exposed?)

if you use JNI_OnLoad and RegisterNatives, you only need to expose JNI_OnLoad. otherwise you need to expose any function that's looked up using dlsym(3); basically everything that starts Java_. see https://developer.android.com/training/articles/perf-jni#native-libraries for details.

@Marian-Kechlibar
Author

Just a small comment.

With an old SDK/NDK duo, my 32-bit application compiles and runs fine, but only with use of gnustl_static, which is now obsolete.

@DanAlbert
Member

Back then both sides of the problem used the same unwinder so the problem was asymptomatic. Some part of your application is using the old unwinder while others are using the new unwinder, which is what makes the problem appear.

@Marian-Kechlibar
Author

I would love to diagnose the problem, but I am not sure where to begin. All the C libraries seem to be stripped to the bone now. So it must be in the only library that is C++. Is it possible that some of the system libraries linked to it use a different unwinder?

This is my link command for the C++ library. I wonder if the -L paths that contain /lib/gcc may be the problem.

"C:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/../../../../arm-linux-androideabi/bin\ld" "--sysroot=C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm" -z noexecstack -EL --warn-shared-textrel -z now -z relro -X --hash-style=both --enable-new-dtags --eh-frame-hdr -m armelf_linux_eabi -shared -o "ARM\Debug\libCryptoCult_Expert.so" "C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/../lib\crtbegin_so.o" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\llvm\prebuilt\windows-x86_64\lib64\clang\9.0.8\lib\linux\arm" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/armv7-a" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/../../../../arm-linux-androideabi/lib/../lib/armv7-a" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/../lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/arm-linux-androideabi/../../lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/../../../../arm-linux-androideabi/lib/armv7-a" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib" -soname=libCryptoCult_Expert.so "-rpath-link=C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm\usr\lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\sources\cxx-stl\llvm-libc++\libs\armv7-a" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm\usr\lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\lib\gcc\arm-linux-androideabi\4.9.x\armv7-a" 
"-LC:\Microsoft\AndroidNDK64\android-ndk-r21\sources\cxx-stl\llvm-libc++\libs\armeabi-v7a" --no-undefined -z relro -z now -z noexecstack --no-wchar-size-warning --exclude-libs libgcc.a --exclude-libs libgcc_real.a --exclude-libs libunwind.a "ARM\Debug\Circletech_Network_Client_Unity_1.o" "ARM\Debug\Circletech_Network_Defs_Unity_1.o" "ARM\Debug\Circletech_Network_Server_Unity_1.o" "ARM\Debug\Circletech_Network_Server_Unity_2.o" "ARM\Debug\CryptoCult_Engine_Unity_1.o" "ARM\Debug\CryptoCult_Engine_Unity_10.o" "ARM\Debug\CryptoCult_Engine_Unity_2.o" "ARM\Debug\CryptoCult_Engine_Unity_3.o" "ARM\Debug\CryptoCult_Engine_Unity_4.o" "ARM\Debug\CryptoCult_ShineLet_Unity_1.o" "ARM\Debug\CryptoCult_ShineLet_Unity_2.o" "ARM\Debug\CryptoCult_ShineLet_Unity_3.o" "ARM\Debug\CryptoCult_ShineLet_Unity_4.o" "ARM\Debug\CryptoCult_ShineLet_Unity_5.o" "ARM\Debug\AdvancedStorage_Unity_1.o" "ARM\Debug\AdvancedStorage_Unity_2.o" "ARM\Debug\AdvancedStorage_Unity_3.o" "ARM\Debug\AdvancedStorage_Unity_4.o" "ARM\Debug\Algorithms_Unity_1.o" "ARM\Debug\Algorithms_Unity_2.o" "ARM\Debug\Algorithms_Unity_3.o" "ARM\Debug\Algorithms_Unity_4.o" "ARM\Debug\Audio_Unity_1.o" "ARM\Debug\Basic_Structures_Unity_1.o" "ARM\Debug\Basic_Structures_Unity_2.o" "ARM\Debug\Basic_Structures_Unity_3.o" "ARM\Debug\BigInt.o" "ARM\Debug\Circletech_Platform_Unity_1.o" "ARM\Debug\Circletech_Platform_Unity_2.o" "ARM\Debug\Data_Objects_Unity_1.o" "ARM\Debug\Data_Objects_Unity_2.o" "ARM\Debug\Data_Objects_Unity_3.o" "ARM\Debug\Data_Objects_Unity_4.o" "ARM\Debug\Data_Objects_Unity_5.o" "ARM\Debug\Data_Objects_Unity_6.o" "ARM\Debug\Data_Objects_Unity_7.o" "ARM\Debug\Data_Objects_Unity_8.o" "ARM\Debug\Debug_Unity_1.o" "ARM\Debug\Fortuna_Unity_1.o" "ARM\Debug\InstantMessaging_Unity_1.o" "ARM\Debug\InstantMessaging_Unity_2.o" "ARM\Debug\MailClient_Unity_1.o" "ARM\Debug\MailClient_Unity_2.o" "ARM\Debug\Math_Utils_Unity_1.o" "ARM\Debug\MIME_Unity_1.o" "ARM\Debug\MIME_Unity_2.o" "ARM\Debug\PGP_Unity_1.o" "ARM\Debug\PGP_Unity_2.o" 
"ARM\Debug\PGP_Unity_3.o" "ARM\Debug\PGP_Unity_4.o" "ARM\Debug\ShineCalendarModule_Unity_1.o" "ARM\Debug\ShineCalendarModule_Unity_2.o" "ARM\Debug\ShineCalendarModule_Unity_3.o" "ARM\Debug\ShineCalendarModule_Unity_4.o" "ARM\Debug\ShineEmailModule_Unity_1.o" "ARM\Debug\ShineEmailModule_Unity_2.o" "ARM\Debug\ShineEmailModule_Unity_3.o" "ARM\Debug\ShineEmailModule_Unity_4.o" "ARM\Debug\ShineLet_Core_Unity_1.o" "ARM\Debug\ShineLet_Data_Unity_1.o" "ARM\Debug\ShineLet_Data_Unity_2.o" "ARM\Debug\ShineLet_Data_Unity_3.o" "ARM\Debug\ShineLet_Data_Unity_4.o" "ARM\Debug\ShineLet_EventEngine_Unity_1.o" "ARM\Debug\ShineLet_EventEngine_Unity_2.o" "ARM\Debug\ShineLet_EventEngine_Unity_3.o" "ARM\Debug\ShineLet_EventEngine_Unity_4.o" "ARM\Debug\ShineLet_Graphics_Unity_1.o" "ARM\Debug\ShineLet_Graphics_Unity_2.o" "ARM\Debug\ShineLet_Mutable_Unity_1.o" "ARM\Debug\ShineLet_Mutable_Unity_2.o" "ARM\Debug\ShineLet_Resource_Unity_1.o" "ARM\Debug\ShineLet_Resource_Unity_2.o" "ARM\Debug\ShineLet_Resource_Unity_3.o" "ARM\Debug\ShineLet_Resource_Unity_4.o" "ARM\Debug\ShineLet_Unity_1.o" "ARM\Debug\ShineLet_Unity_2.o" "ARM\Debug\ShineLet_Unity_3.o" "ARM\Debug\ShineLet_Unity_4.o" "ARM\Debug\ShineLet_Unity_5.o" "ARM\Debug\ShineLet_Unity_6.o" "ARM\Debug\SIP_Unity_1.o" "ARM\Debug\SIP_Unity_2.o" "ARM\Debug\Speex_Unity_1.o" "ARM\Debug\System_Utils_Unity_1.o" "ARM\Debug\DLL_TLS_Unity_1.o" "ARM\Debug\Trezor_Unity_1.o" "ARM\Debug\CTAECMobileWrapper.o" "ARM\Debug\CTAECWrapper.o" "ARM\Debug\XML_Unity_1.o" "ARM\Debug\XML_Unity_2.o" "ARM\Debug\XML_Unity_3.o" "ARM\Debug\SVGParser_Unity_1.o" "ARM\Debug\SVGParser_Unity_2.o" "ARM\Debug\SVGParser_Unity_3.o" "ARM\Debug\SVGParser_Unity_4.o" "ARM\Debug\SVGRenderer_Unity_1.o" "ARM\Debug\SVGRenderer_Unity_2.o" "ARM\Debug\SVGRenderer_Unity_3.o" "ARM\Debug\SVGRenderer_Unity_4.o" "ARM\Debug\SVG_Toolkit_Unity_1.o" "ARM\Debug\SVG_Toolkit_Unity_2.o" "ARM\Debug\SVG_Toolkit_Unity_3.o" "ARM\Debug\SVG_Toolkit_Unity_4.o" 
"C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libcares.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libiconv.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_FreeType_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_Opus_Codec_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_Speex_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_TLS_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_WebRTC_AEC_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\\..\..\..\DLL_BigInt_0\android\prebuilt\ARM\libgmp.so" -lEGL -lGLESv1_CM -lz -lm -lc++_static -lc++abi -llog -landroid -lgcc -ldl -lc -lgcc -ldl "C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/../lib\crtend_so.o"

@Marian-Kechlibar
Author

Or is it the -lgcc -ldl -lc -lgcc -ldl part? I am not even sure where this comes from. Perhaps Visual Studio itself?

@DanAlbert
Member

Using libgcc is correct. Exporting libgcc is a bug. That command line doesn't have that problem (I see the --exclude-libs flags), so check these with readelf:

"C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libcares.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libiconv.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_FreeType_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_Opus_Codec_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_Speex_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_TLS_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a\libDLL_WebRTC_AEC_0.so" "C:\Users\notebook\Documents\Projekty\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\\..\..\..\DLL_BigInt_0\android\prebuilt\ARM\libgmp.so"

@DanAlbert
Member

Or is it the -lgcc -ldl -lc -lgcc -ldl part? I am not even sure where this comes from. Perhaps Visual Studio itself?

The compiler does that implicitly. This part is correct. libgcc is still required on Android; it just needs to not be re-exported from the libraries that link it.

@Marian-Kechlibar
Author

Marian-Kechlibar commented Apr 17, 2020

I checked all the libraries again with readelf.
To my slight puzzlement, the symbols
__gnu_Unwind_RaiseException
___Unwind_RaiseException
are present in some of the C libraries, but always as LOCAL HIDDEN.
(I am a bit confused about it, why are they present in pure C libraries at all?)

In the only C++ library, I could not find either ___Unwind_RaiseException or __gnu_Unwind_RaiseException. It is simply not there. Not even as LOCAL HIDDEN. Just not there. Could this be the problem?

@DanAlbert
Member

(I am a bit confused about it, why are they present in pure C libraries at all?)

Not sure. objdump could tell you where those are called from.

Not even as LOCAL HIDDEN. Just not there. Could this be the problem?

I doubt it. Unwind symbols not showing up in readelf at all is pretty normal in my experience. Not showing up at all guarantees that it isn't exported, so that's fine.

@Marian-Kechlibar
Author

Could you possibly tell me how to use objdump to determine where something is called from?
I looked up the manpage, but this is a very powerful utility with a lot of functionality and I cannot identify which option tells me where functions are called from.

@DanAlbert
Member

objdump is available in the NDK (same place as readelf). You'll want the one that matches the architecture you're using, so arm-linux-androideabi-objdump -D <whatever library>. That'll print the disassembly of the library. Most of the contents will be uninteresting, but if you search for either of those symbols you'll find a few references. The first few are probably just relocations, but later in the file you'll find the actual calls to them, and somewhere above the call will be the name of the function that's calling it. That should give you a clue as to why it's being used.
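
The manual search described above could be automated along these lines. The sketch assumes the disassembly has been saved as text and follows the usual GNU objdump layout, where a function starts with an unindented `address <name>:` label and a reference shows up as `<symbol>` or `<symbol+0x...>` on an indented instruction line. `FindReferencingFunctions` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Scans `objdump -D` text and returns the functions whose bodies mention
// `symbol`, in order of appearance.
std::vector<std::string> FindReferencingFunctions(const std::string& disassembly,
                                                  const std::string& symbol) {
    std::vector<std::string> callers;
    std::string current;  // name taken from the most recent "<name>:" label
    const std::string needle = "<" + symbol;
    std::istringstream in(disassembly);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t open = line.find('<');
        const bool isLabel = !line.empty() && line[0] != ' ' && line[0] != '\t' &&
                             open != std::string::npos && line.size() >= 2 &&
                             line.compare(line.size() - 2, 2, ">:") == 0;
        if (isLabel) {
            // Text between '<' and the trailing ">:" is the function name.
            current = line.substr(open + 1, line.size() - open - 3);
            continue;
        }
        const std::size_t hit = line.find(needle);
        if (hit == std::string::npos) {
            continue;
        }
        // Accept "<symbol>" and "<symbol+0x...>", but not longer names that
        // merely start with the same prefix.
        const std::size_t end = hit + needle.size();
        const bool exact = end < line.size() && (line[end] == '>' || line[end] == '+');
        if (exact && !current.empty() && current != symbol &&
            (callers.empty() || callers.back() != current)) {
            callers.push_back(current);
        }
    }
    return callers;
}
```

Running it over the disassembly of one of the C libraries with `__gnu_Unwind_RaiseException` as the symbol would list the functions that reference it, which is the clue mentioned above.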

Thinking about it a bit, this could be the problem. If it's being treated as a C only library it may not be getting the correct unwinder... @rprichard has been looking into these quite a bit lately so it's possible that he also has some ideas.

@Marian-Kechlibar
Author

Thank you! I will get back onto the topic on Monday, when I can access a stronger build machine again.

@rprichard
Collaborator

(I am a bit confused about it, why are they present in pure C libraries at all?)

Not sure. objdump could tell you where those are called from.

I think the NDK builds everything with -funwind-tables by default, which is useful for crash dumps. On arm32, an object file with unwinding tables also has an R_ARM_NONE relocation to a personality routine, which then pulls in the unwinder.

Future versions of Clang should disable this relocation for Android, so the unwinder will be omitted. https://reviews.llvm.org/D70027. It looks like this Clang change isn't in NDK r21.

@Marian-Kechlibar
Author

This is a neverending quagmire... Getting exceptions to run seems to be a very awful task.

There is some progress, though. All the C libraries have their unwind symbols hidden. Whatever goes wrong now happens within the single C++ library that uses exceptions.

I looked at my linker command line in order to determine whether everything is OK. Especially if the desired sequence of objects is kept:

crtbegin
object files
static libraries
libgcc
shared libraries
crtend

And I found that it was not. The sequence I got was

crtbegin
object files
all the libraries specified in my MSVC project in the sequence they were specified
libgcc, libdl, libc, libgcc, libdl (indeed twice)
crtend

Given that MSVC does not give me any means to distinguish between static and shared libraries, I decided to add libgcc, libdl, and libc to my project manually and arrange the dependency list as follows:

EGL;GLESv1_CM;z;m;c++_static;c++abi;log;android;dl;c;gcc;DLL_TLS_0;DLL_Speex_0;DLL_FreeType_0;DLL_Opus_Codec_0;DLL_WebRTC_AEC_0;cares;iconv;gmp

(bold libraries are static, italics libraries are shared)

In this way, the command line of the linker started to look like this:

"C:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/../../../../arm-linux-androideabi/bin\ld"
"--sysroot=C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm" -z noexecstack -EL --warn-shared-textrel -z now -z relro -X --hash-style=both --enable-new-dtags --eh-frame-hdr -m armelf_linux_eabi -shared -o
"ARM\Debug\libCryptoCult_Expert.so" "C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/../lib\crtbegin_so.o"
"-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\llvm\prebuilt\windows-x86_64\lib64\clang\9.0.8\lib\linux\arm"
"-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/armv7-a"
"-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/../../../../arm-linux-androideabi/lib/../lib/armv7-a"
"-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/../lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/arm-linux-androideabi/../../lib"
"-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/../../../../arm-linux-androideabi/lib/armv7-a"
"-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib" -soname=libCryptoCult_Expert.so "-rpath-link=C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm\usr\lib" "-LC:\Users\Marian
Kechlibar\Projects\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\..\ARM\Debug\Package\libs\armeabi-v7a" "-LC:\Users\Marian
Kechlibar\Projects\CryptoCult\_Android\CryptoCult\CryptoCult_Android_MSVC\\..\..\..\DLL_BigInt_0\android\prebuilt\ARM" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\sources\cxx-stl\llvm-libc++\libs\armeabi-v7a"
"-LC:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm\usr\lib" "-LC:\Microsoft\AndroidNDK64\android-ndk-r21\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\lib\gcc\arm-linux-androideabi\4.9.x\armv7-a"
"-LC:\Microsoft\AndroidNDK64\android-ndk-r21\sources\cxx-stl\llvm-libc++\libs\armeabi-v7a" --no-undefined -z relro -z now -z noexecstack --no-wchar-size-warning --exclude-libs libgcc.a --exclude-libs libgcc_real.a --exclude-libs libunwind.a
--exclude-libs libc++_static.a --exclude-libs libc++abi.a --exclude-libs libandroid_support.a (all the object files to be linked together) -lEGL -lGLESv1_CM -lz -lm -lc++_static
-lc++abi -llog -landroid -ldl -lc -lgcc -lDLL_TLS_0 -lDLL_Speex_0 -lDLL_FreeType_0 -lDLL_Opus_Codec_0 -lDLL_WebRTC_AEC_0 -lcares -liconv -lgmp -lgcc -ldl -lc -lgcc -ldl
"C:\Microsoft\AndroidNDK64\android-ndk-r21\platforms\android-21\arch-arm/usr/lib/../lib\crtend_so.o"

I could not find the script that adds the bold portion (-lgcc -ldl -lc -lgcc -ldl) to the command line. I have a suspicion it is somewhere within the NDK, but I could not find where.

The application still crashes and the stack trace looks like this.
crash-read-encoded-pointer_3

@Marian-Kechlibar
Author

So, yet another update.
I added unwind as a dependency just before c++abi and the application stopped crashing. I will observe its behavior for a while, but that might be it.
Interestingly enough, it does not seem to mind the order of static and shared libraries, but it does seem to mind the unwind library.
And ARM64 does not need it, only ARM32.

@Marian-Kechlibar
Author

I wonder if this should be somehow performed automatically by the NDK itself, or at least mentioned in the Android Wiki.

The behavior where the application will happily compile and link without libunwind, but crash later, is very baffling for an average programmer monkey like me. If the docs said explicitly "you must add libunwind to the linker dependencies for your 32-bit application or else", I would have done it ages ago.

@DanAlbert
Member

This is a neverending quagmire... Getting exceptions to run seems to be a very awful task.

I've been avoiding mentioning this because I suspect you have some pretty strong reasons for using Visual Studio and switching isn't practical (it never is), but ndk-build doesn't have these problems. Depending on what exactly is going wrong, our CMake toolchain might not either (CMake doesn't make it possible for us to set the link order such that your project can't be affected by broken dependencies).

I could not find the script that adds the bold portion (-lgcc -ldl -lc -lgcc -ldl) to the command line.

The compiler does it automatically.

And ARM64 does not need it, only ARM32.

Yes, only arm32 currently uses the new unwinder.

I wonder if this should be somehow performed automatically by the NDK itself, or at least mentioned in the Android Wiki.

The behavior where the application will happily compile and link without libunwind, but crash later, is very baffling for an average programmer monkey like me. If the docs said explicitly "you must add libunwind to the linker dependencies for your 32-bit application or else", I would have done it ages ago.

Most users never encounter this because the build system is responsible for it. You're just in the unfortunate situation of maintaining the build system you use because it seems to be broken :( https://android.googlesource.com/platform/ndk/+/master/docs/BuildSystemMaintainers.md#Unwinding covers the instructions for build system maintainers and does mention all of this.

The really odd thing though is that libgcc.a is a linker script that is supposed to make this a non-issue because it will handle it implicitly:

$ cat android-ndk-r22-canary/toolchains/llvm/prebuilt/linux-x86_64/lib/gcc/arm-linux-androideabi/4.9.x/libgcc.a
INPUT(-lunwind -lcompiler_rt-extras -lgcc_real -ldl)

The fact that explicitly linking libunwind ahead of your other libraries fixes the crash indicates that one of those other libraries is likely the issue, though I was fairly certain we'd disproven that.

@Marian-Kechlibar
Author

It is true that there are good reasons for me to use MSVC: all my colleagues are MSVC users, and they debug pretty much everything they can on the MSVC / Windows combo. The app can be compiled for multiple OSes, including Windows. Most of the bugs can be reproduced and fixed on the Windows port too, but some are Android-specific...

Microsoft supports NDK 16 and the developer community basically begs them to support NDK 21 too, but they do not seem to be in much of a hurry. Which is a shame, because MSVC is actually a pretty great IDE.

I was able to strongarm MSVC, including the msbuild system, into accepting NDK 21, on my own, by modifying some MS scripts. Pretty much everything could be solved easily. Only the unwind problem stymied me.

The fact that explicitly linking libunwind ahead of your other libraries fixes the crash indicates that one of those other libraries is likely the issue, though I was fairly certain we'd disproven that.

I am fairly certain too. There are just two types of libraries I use in my project: shared C libraries that I compile myself (every single one of them uses -Wl,--version-script now) and static libraries from the NDK.

I will try to produce a minimal example tomorrow.

@DanAlbert
Member

Yep, that's more or less the answer I expected. There's a reason we keep documentation for maintaining third-party build systems :)

I will try to produce a minimal example tomorrow.

That'd be awesome if you can.

@Marian-Kechlibar
Author

I will try to produce a minimal example tomorrow.

That'd be awesome if you can.

So yeah, I have a minimal test harness which shows the crashing behavior exactly as I experienced it. I am attaching a ZIP with the solution.

In order to get this up and running, you need Microsoft Visual Studio 2019 Community with the Android development packages installed. You also need to set the path to NDK 21 in MSVC; this is done in Tools / Options / Cross Platform / C++.

The solution TestHarness.sln can be built out of the box and you can try debugging it on any attached Android device.

The solution consists of two shared libraries - libgmp.so, a pure C library, and libTestNativeLibrary.so which contains the exception throw/catch code. Build events take care of copying the correct .so files to the correct directories.

Upon success, the application should display a simple message

"Leave Exception caught successfully."

on the screen of the phone. I am attaching a screenshot showing the message. ARM64 works for me.
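
For readers who cannot open the attached solution, the kind of code involved looks roughly like the sketch below. It is not the code from the ZIP: the names Leave, do_work and run_leave_test are hypothetical, and it only assumes what is described above, namely an exception that is thrown and caught inside the same library and a success message.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the project's "Leave" exception type.
struct Leave : std::runtime_error {
    explicit Leave(int c) : std::runtime_error("Leave"), code(c) {}
    int code;
};

// Throws from a nested call so the unwinder has a frame to cross.
static void do_work(bool fail) {
    if (fail) throw Leave(-1);
}

// Throws and catches inside the same library; nothing escapes to the caller,
// so in an app nothing would cross the JNI boundary.
std::string run_leave_test(bool fail) {
    try {
        do_work(fail);
        return "No exception thrown.";
    } catch (const Leave&) {
        return "Leave Exception caught successfully.";
    }
}
```

Calling run_leave_test(true) exercises the throw and catch path. Whether the throw crashes during unwinding depends on how the library was linked, not on this source.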

The 32-bit ARM build will crash reliably in the same way that my app did unless the unwind library is explicitly specified as a dependency. I am attaching another screenshot. Unfortunately, MSVC is in Czech there, but you will find the correct text field pretty easily. With unwind specified, the 32-bit application runs correctly.

unwind-contained

leave-exception-caught_Test_Harness

AndroidTestHarness-exception.zip

@DanAlbert
Member

Thanks. It's probably going to be a while before I'm able to look, given the setup that I'll have to sort out to look into a Visual Studio issue. You have a workaround though, so I think I'm mainly just satisfying my own curiosity at this point :)

@Markus87

@DanAlbert Sorry for commenting on an old issue, but I just experienced the same issue.
Just to make sure: you are not "interested" in these kinds of issues? (I understand you can't change foreign build systems.)

I use NDK r21d.

My case was as follows:
I introduced this problem to my app after changing from qmake to CMake. (I have a CMake build that builds the same code for Windows and Linux as well, so ndk-build is not an option; ndk-build and Qt also don't mix, I think.)
As I understand it, the libraries passed to the linker in CMake are only controllable to a degree.
Within the range that I can control, I added -lunwind to fix my issue.

I would have preferred to fix the problem in the third party libraries that export the _Unwind* symbols.
I was able to track down the static libraries that showed the entries with readelf.

  • boost - which has its own build system, so I could understand it is doing things wrong for (unsupported) android
  • bsoncxx - which is built with cmake, I assumed for cmake the issue should not come up if there are no other libraries it depends on that are built wrong

Did I understand correctly that -Wl,--exclude-libs ... for the build of the third party libraries alone should fix the issue as well?
Because it does not seem to work for me. 🤔

@DanAlbert
Member

Did I understand correctly that -Wl,--exclude-libs ... for the build of the third party libraries alone should fix the issue as well?
Because it does not seem to work for me. 🤔

Not enough information then. Since you're not using VS I doubt it's the same bug. File a new one if you can provide the information we'd need to investigate (as requested by the bug template), but honestly the most frequent explanation for these types of problems is that not all libraries got that flag. readelf is your friend for diagnosing that.
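
As a rough illustration of the readelf check: dump a library's dynamic symbol table (for example with readelf -W --dyn-syms on the .so) and look for _Unwind* symbols that are defined and not local, since those are the ones the library exports. The sketch below filters such output in C++. The function name exported_unwind_symbols is hypothetical, and it assumes the usual eight-column GNU readelf layout (Num, Value, Size, Type, Bind, Vis, Ndx, Name).

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the names of defined, non-local symbols starting with "_Unwind"
// found in readelf symbol-table text (assumed GNU readelf column layout).
std::vector<std::string> exported_unwind_symbols(const std::string& readelf_output) {
    std::vector<std::string> hits;
    std::istringstream lines(readelf_output);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string num, value, size, type, bind, vis, ndx, name;
        if (!(fields >> num >> value >> size >> type >> bind >> vis >> ndx >> name))
            continue;  // blank line, table title, or a row without a name
        if (num.back() != ':')
            continue;  // not a symbol row
        const bool non_local = (bind == "GLOBAL" || bind == "WEAK");
        const bool visible = (vis == "DEFAULT" || vis == "PROTECTED");
        const bool defined = (ndx != "UND");
        if (non_local && visible && defined && name.rfind("_Unwind", 0) == 0)
            hits.push_back(name);
    }
    return hits;
}
```

An undefined (UND) _Unwind_* entry only means the library imports the unwinder from elsewhere. A defined, non-local entry means the library exports its own copy, which is the situation the --exclude-libs flags are meant to prevent.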

@Markus87

@DanAlbert Thanks for the quick response. It may not be THE same bug, but it is the same circle of problems, and the root cause always seems to be the same. You don't use ndk-build, so you are on your own: try to follow https://android.googlesource.com/platform/ndk/+/master/docs/BuildSystemMaintainers.md#Unwinding or, if you are out of ideas, link -lunwind at the end.
When I have time I will try again to understand what I am doing wrong; if I find anything that could be considered a bug, I will open an issue.

@DanAlbert
Member

It may not be THE same bug, but it is the same circle of problems

If it's not the same bug, tracking two in a single report is just a great way to make sure one of the reports gets missed by triage.

You don't use ndk-build, so you are on your own

This isn't at all what we've said. We support ndk-build, our CMake toolchain file (and with r23, non-toolchain file CMake), and the "just a toolchain" approach. For the latter case you definitely need to follow the docs.

We're happy to help with other systems, but we're not the ones that maintain them so there's nothing we can do but offer advice.

link -lunwind at the end

This is what causes the problem. It must be linked as specified by the doc if you want binaries that are not able to infect other libraries with incompatible symbols.
