blast patch causes binaries to null-pointer-exception/core-dump #53165
Comments
There have been some past issues with packages that phone home causing our CI providers to block all our builds for "suspicious" activity. I'm not sure whether that was the case here, but it could be, which is why I would be hesitant to just remove the patch. From the original PR for the patch, it seems the request may have been blocked by the CI provider, causing a timeout during the build:
I think your suggestion to disable it in another way is a good option.
Thank you for following up! I opened #53173, where I did explore another mechanism to disable this. I was optimistic that reintroducing the I do see that the build passed on that PR, which seems promising. Do you happen to be familiar with the mutexes/signaling in
Hi!
I noticed that one of the blast binaries, `blastn`, throws a null-pointer-exception:

After a lot of debugging, I chased it down to this patch: https://github.com/bioconda/bioconda-recipes/blob/8e7728ea4d3ce0bf6c27fec18d0f0fd8f6be332d/recipes/blast/phonehome.patch
The root cause is fairly complex. Unfortunately, I can't link to the blast code because it's not on GitHub. I was using the latest version, 2.16.0.
In `blast_node.cpp`:

Diving further, the concurrency controls in `ncbi_usage_report.cpp` cause the object manager to stay alive and non-null while all threads unregister their data loaders. Only after all threads have unregistered does the usage report code clean itself up.

However, after applying this patch, the object manager gets cleaned up after the first call to `s_UnregisterDataLoader`. When the other threads call it, they hit the null-pointer-exception.

There is clearly tight coordination between the blast-node code and the usage-report code. I don't understand the concurrency controls in the `ncbi_usage_report` code well enough to know how to safely turn off `NCBI_USAGE_REPORT_SUPPORTED` in a multi-threaded environment. There are plenty of guards, mutexes and signalling that must help the usage reporting coordinate with destruction. You may notice in `ncbi_usage_report.hpp`:

So clearly ncbi expects `NCBI_USAGE_REPORT_SUPPORTED` to be defined in a multi-threaded environment, and they sure do depend on it.

I don't think it's right to file a bug against them because their code does work as-is.
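To make the failure mode above concrete, here is a minimal sketch of the two teardown orders. It is not the NCBI code: `Manager`, `Teardown` and `CountLateCallers` are hypothetical names, and the "late caller" case is only counted here, whereas in the real binary it would be the null dereference.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a shared singleton; NOT the NCBI object manager.
struct Manager {
    int revoked = 0;
    void RevokeLoader() { ++revoked; }
};

// When the shared manager gets destroyed relative to the unregistering threads.
enum class Teardown { kAfterFirstCaller, kAfterLastCaller };

// Runs `threads` workers that each unregister once and returns how many of
// them found the manager already destroyed. The null case is counted instead
// of dereferenced so that the sketch itself stays well-defined.
int CountLateCallers(int threads, Teardown policy) {
    assert(threads > 0);
    std::unique_ptr<Manager> manager = std::make_unique<Manager>();
    std::mutex mutex;
    int remaining = threads;  // callers that have not unregistered yet
    int late = 0;             // callers that arrived after teardown

    auto unregister_loader = [&] {
        std::lock_guard<std::mutex> lock(mutex);
        if (!manager) {
            ++late;  // the real code would crash here
            return;
        }
        manager->RevokeLoader();
        --remaining;
        // Eager teardown destroys the manager under the other callers;
        // deferred teardown waits until the last caller is done.
        if (policy == Teardown::kAfterFirstCaller || remaining == 0) {
            manager.reset();
        }
    };

    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back(unregister_loader);
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return late;
}
```

With four threads, `CountLateCallers(4, Teardown::kAfterFirstCaller)` reports three late callers, while `Teardown::kAfterLastCaller` reports none. The second policy corresponds to the behaviour I see without the patch, where the manager outlives every unregister call.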
Perhaps `NCBI_USAGE_REPORT_SUPPORTED` is not the best way to disable reporting. Looking around their codebase, there seem to be other ways, such as env vars.

So, firstly, could we revert this patch, as it causes `blastn` not to function correctly?

Alternatively, perhaps we can find a different way to patch the usage reporter to disable it. E.g. we could probably just override:
Thanks!