--bootstrap_impl=script breaks pkg_tar, bazel-lib tar and py_binary with py_package + py_wheel #2489
Comments
I think this is somewhat WAI. A plain py_binary can't really be given to py_package as something to process, and a plain py_binary isn't meant to be redistributable. The first-order problem is that zip doesn't support symlinks. There are some extensions that allow it to store them, though. However, that might be tricky because most of the files it is given are going to be symlinks to something else, so we'd need some way to say "dereference these symlinks, but not those symlinks". Maybe by reading the symlink and seeing if it's non-absolute? IDK. If you really want to package the whole binary, then you're probably better off packaging the zip file version of the binary. That has special code in its startup to handle the case of coming from a zip file that couldn't store symlinks. Why do you want to pass a py_binary to py_package?
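To make the "read the symlink and see if it's non-absolute" idea concrete, here is a minimal Python sketch. It is not taken from rules_python; the helper name `split_symlinks` and the rule "relative target means keep as a link" are assumptions for illustration only.

```python
import os


def split_symlinks(root):
    """Return (relative_links, absolute_links) for symlinks found under root.

    Assumption for this sketch: a relative target marks a link that should be
    kept as a link, an absolute target marks one that should be dereferenced.
    """
    relative, absolute = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories show up in dirnames, all others in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)  # raw link text, not resolved
            rel = os.path.relpath(path, root)
            (absolute if os.path.isabs(target) else relative).append(rel)
    return sorted(relative), sorted(absolute)
```

Calling `split_symlinks("bazel-bin/my_binary.runfiles")` (a hypothetical path) would list which links a packaging tool might preserve and which it might follow. Whether this heuristic is reliable for a real runfiles tree is exactly the open question raised above.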
Is this also affecting the |
Yeah, it's the same problem. The problem, as I see it, is that the venv is created both in the runfiles directory and in the directory containing the runfiles directory (is there a name for this?). Only the former is required; the latter isn't functional (broken symlink) and never gets invoked anyway, since the stage-1 bootstrap runs the former. I didn't see an easy way to fix this. Usually I would've tried to use runfiles symlinks if I wanted to create a symlink only inside the runfiles directory, but this doesn't let you create arbitrary relative symlinks. Personally I see this as a deficiency of the related |
I don't know. It is affecting pkg_tar. It should be related to bazelbuild/rules_pkg#115.
This is a very old use case for us, where we build wheels for Apache Beam applications for Dataflow jobs.
I personally think the Do I understand correctly that the As for That said, I think the current |
@aignas just tested with
|
I think |
Is there any downside to containerizing the zip file, then? Or can we maintain a function similar to |
I think it would help if you explained your use-case a bit more thoroughly. This sounds like a deficiency in the workflow or the way that Apache Beam is consuming runtime code for execution. Building a "fat wheel" (like a "fat jar" in Java) isn't the remit of py_package and py_wheel rules. The closest thing to a "fat jar" is the Python zip support via output groups. But it's unclear to me if this is supported by the target runtime (Dataflow?) that you're using. |
Sorry @groodt, I was referring to building Docker images and not wheels, since --bootstrap_impl=script is broken for both pkg_tar and We have migrated most of our Dataflow jobs to Docker images; just a couple still use wheels, so that is less of a problem.
Probably? Ultimately, I'd like to remove all the zip stuff from py_binary itself:
Presumably, if one can create e.g. a zip from a py_binary, then using a different format, e.g. tar, would be fairly simple. To clarify, though -- the output is a zipapp-based thing. That is not quite the same as just putting all of a py_binary into a tar file (or equiv). The former means deriving a slightly different "runnable thing" from the py_binary. The latter means simply putting all the py_binary files into a tar file (essentially If you want the latter, then you're probably better off using e.g. rules_pkg, since that has various facilities to tar up an arbitrary binary and its runfiles. |
I came to this line of thinking because Unfortunately I cannot use |
I think that's just incidental. The zip file doesn't contain the venv
Ah, hm, yeah. Because the input is the bazel-bin symlink forest, so you can't distinguish a "real" symlink from a "convenient" symlink? Actually, maybe File.is_symlink would allow solving that. Basically, that can tell us which files are supposed to be symlinks. So then it's just a matter of telling the tool to not dereference paths where File.is_symlink is true. Looking at the CLI of tar, this looks rather tedious, but possible. I think what you'd have to do is one invocation with --dereference (pass it all files for which is_symlink=False), then a second invocation with --no-dereference (pass it all files for which is_symlink=True). |
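The two-invocation idea above can be sketched with Python's standard `tarfile` module instead of the tar CLI. This is only an illustration under assumptions: `build_tar` and `demo` are hypothetical helpers, the `keep_as_symlink` set stands in for whatever File.is_symlink would report inside a Bazel rule, and the directory layout in `demo` is made up.

```python
import os
import tarfile
import tempfile


def build_tar(tar_path, root, keep_as_symlink):
    """Archive files under root; paths in keep_as_symlink stay symlinks."""
    deref, keep = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            (keep if rel in keep_as_symlink else deref).append((full, rel))
    # Pass 1: follow symlinks, comparable to `tar --dereference`.
    with tarfile.open(tar_path, "w", dereference=True) as tar:
        for full, rel in sorted(deref):
            tar.add(full, arcname=rel, recursive=False)
    # Pass 2: append links as links, comparable to `tar --no-dereference`.
    with tarfile.open(tar_path, "a", dereference=False) as tar:
        for full, rel in sorted(keep):
            tar.add(full, arcname=rel, recursive=False)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "runfiles")
        os.makedirs(os.path.join(root, "venv", "bin"))
        real = os.path.join(tmp, "main_real.py")
        with open(real, "w") as f:
            f.write("print('hi')\n")
        # "Convenient" symlink, as in a symlink forest: should be dereferenced.
        os.symlink(real, os.path.join(root, "main.py"))
        # "Real" symlink that must survive packaging (hypothetical layout).
        os.symlink("../../main.py", os.path.join(root, "venv", "bin", "python3"))
        out = os.path.join(tmp, "out.tar")
        build_tar(out, root, {os.path.join("venv", "bin", "python3")})
        with tarfile.open(out) as tar:
            return {m.name: (m.issym(), m.linkname) for m in tar.getmembers()}
```

In `demo()`, `main.py` ends up in the archive as a regular file while `venv/bin/python3` is stored as a relative symlink. Append mode only works for uncompressed tar files, so a compressed output would need a different approach, such as a single pass that builds each entry explicitly.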
🐞 bug report
Affected Rule
py_package + py_wheel
Is this a regression?
Yes, related commit #2409
Description
Using the resulting py_binary with py_package + py_wheel fails to find the interpreter symlink. This also breaks pkg_tar from rules_pkg.
🔬 Minimal Reproduction
🔥 Exception or Error