artifacts or just a container layer with special annotations? #6
Comments
Hey Colin, thanks for looking into it. There is no particular reason other than someone from the internal bootc team recommending OCI artifact tooling as a good fit for the job. Then, when I wrote the downloading tool, which acts like "rsync", I realized I needed the uncompressed digest, and I had no idea this metadata was actually available. Having this as part of the OCI metadata (not blobs) is handy: there is no need to download layers to figure out that no update is actually needed. Another reason why we went for a custom tool was the lack of a standard tool in the podman universe to download arbitrary files from OCI container images. The only way I could figure out was pulling the image, starting a temporary container, copying files from there, and finally shutting the container down. That felt clunky. I am happy to rewrite the specs to use just |
There's There's also |
Right, that may have been me (sorry!) - but looking at what resulted in the spec, ISTM that standardizing just metadata on top of an OCI image may make more sense. BTW, one thing I'm not sure is standardized at all but maybe should be is the concept of something like a "single layer container" - I am not aware of a good use case for supporting multiple layers here, and doing so makes the unpacking logic more complex. If it was required that there was at most a single tar layer (with no whiteouts) that seems like a good idea. |
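For illustration, the "at most a single tar layer, no whiteouts" rule could be checked roughly like this. This is only a sketch: the function names are made up for this example, and the only fact it relies on is that the OCI image spec marks whiteouts with a `.wh.` basename prefix.

```python
import io
import tarfile
from pathlib import PurePosixPath

# Whiteout marker from the OCI image spec; ".wh..wh..opq" (opaque
# directory) starts with the same prefix, so one check covers both.
WHITEOUT_PREFIX = ".wh."


def find_whiteouts(layer_tar: tarfile.TarFile) -> list[str]:
    """Return member paths whose basename starts with the whiteout prefix."""
    return [
        member.name
        for member in layer_tar.getmembers()
        if PurePosixPath(member.name).name.startswith(WHITEOUT_PREFIX)
    ]


def is_single_plain_layer(manifest: dict, layer_tar: tarfile.TarFile) -> bool:
    """True when the manifest lists exactly one layer and it has no whiteouts.

    `manifest` is the parsed image manifest JSON; `layer_tar` is that one
    layer, already opened (hypothetical calling convention for this sketch).
    """
    return len(manifest.get("layers", [])) == 1 and not find_whiteouts(layer_tar)
```

A build pipeline could run a check like this before pushing, so clients never need to implement layer merging.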
@cgwalters thanks so much for the feedback!
Up until recently people were sort of 'abusing' the OCI container image specs (also because of registry limitations..) to upload and store artifacts in the registry in an arbitrary way. Non-OCI conformant artifacts were having
Whether to keep multiple layers or a single one, I would not have a preference. When working on the netboot specs proposal we were using the ORAS tool, which offers 2 ways of storing multiple artifacts: a) separately as multiple layers, b) a directory tarred into a single layer https://oras.land/docs/how_to_guides/pushing_and_pulling#pushing-artifacts-with-multiple-files
Yes, I agree. We were sort of going back and forth between having a special annotation for the arch or rather incorporating that info in the image tag, e.g. https://quay.io/repository/fedora/fedora?tab=tags
To my knowledge podman5 did recently add the ability to build and push OCI artifacts. For the delivery pipeline it probably would not make any difference whether to use ORAS or podman. |
So I created this and it looks pretty good:
It creates the following structure:
Unfortunately, there is no tool that would help me extract the payload. Podman does have an export feature, but it only works on containers, not images; skopeo can copy, but there is no export/extract feature available. The only option is to run the container, but there is no executable to run, and it makes no sense to put libc/bash or whatever in there just to run "sleep". I could write a simple utility until we get something in podman or skopeo; in the meantime:
It appears that symlinks are dereferenced. I wonder if it would work if I put them in the same layer as the link targets. There is the
Anyway, this looks good, I mean it does the job. What do you think? |
I'd recommend adding
Seems sane, that said it'd be good to verify the inputs against the treeinfo and perhaps the build process could do that and just reuse those sha256 checksums instead of recomputing them. (Not a big deal of course, just noting)
Agreed. So...I think next steps here would be to:
|
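As a rough sketch of the "verify the inputs against the treeinfo" idea: this assumes the `.treeinfo` file has a `[checksums]` section with entries of the form `path = sha256:<hex>`, which is how productmd-style treeinfo files are usually laid out. The function names are invented for this example.

```python
import configparser
import hashlib
from pathlib import Path


def treeinfo_checksums(treeinfo_text: str) -> dict[str, tuple[str, str]]:
    """Map relative path -> (algorithm, hex digest) from [checksums]."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep paths case-sensitive
    parser.read_string(treeinfo_text)
    result = {}
    if parser.has_section("checksums"):
        for path, value in parser.items("checksums"):
            algo, _, digest = value.partition(":")
            result[path] = (algo, digest)
    return result


def verify_tree(root: Path, treeinfo_text: str) -> list[str]:
    """Return relative paths that are missing or whose hash differs."""
    mismatched = []
    for rel, (algo, expected) in treeinfo_checksums(treeinfo_text).items():
        target = root / rel
        if not target.is_file():
            mismatched.append(rel)
            continue
        digest = hashlib.new(algo)
        with open(target, "rb") as handle:
            # Hash in 1 MiB chunks so large images do not sit in memory.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected:
            mismatched.append(rel)
    return mismatched
```

An empty list from `verify_tree` means every file listed in `[checksums]` matched; files not listed there are not checked at all.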
Good call, will do. Podman was somehow able to cache the commits by itself, which was surprising to me, but this cannot hurt.
Hmmm, recalculating is quick, as podman will cache the downloaded files and actually skip most of the steps. However, I think it makes sense to download the treeinfo and include it as one of the artifacts for record purposes. Good idea, I think.
Right, any ideas about the client that would download and extract the files? Shall we file an RFE with skopeo or podman for a new "extract image" feature? Or just a shell/python script doing |
Added on @ipanova's request:
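One possible shape for such a script, as a sketch only: it assumes the image was first copied into a local OCI image layout directory (for example with `skopeo copy` to an `oci:` destination), that `index.json` points straight at a single image manifest rather than a nested index, and that layers are plain or gzip-compressed tar. `extract_oci_layout` and `blob_path` are names invented here.

```python
import json
import shutil
import tarfile
from pathlib import Path


def blob_path(layout: Path, digest: str) -> Path:
    """Resolve 'sha256:<hex>' to blobs/sha256/<hex> inside the layout."""
    algo, _, hexdigest = digest.partition(":")
    return layout / "blobs" / algo / hexdigest


def extract_oci_layout(layout: Path, dest: Path) -> list[str]:
    """Unpack regular files from every layer of the first manifest."""
    index = json.loads((layout / "index.json").read_text())
    manifest_digest = index["manifests"][0]["digest"]
    manifest = json.loads(blob_path(layout, manifest_digest).read_bytes())

    dest_root = dest.resolve()
    extracted = []
    for layer in manifest["layers"]:
        # "r:*" lets tarfile auto-detect gzip; zstd layers are not covered.
        with tarfile.open(blob_path(layout, layer["digest"]), mode="r:*") as tar:
            for member in tar:
                if not member.isfile():
                    continue  # directories, symlinks, devices are skipped
                target = (dest_root / member.name).resolve()
                if dest_root not in target.parents:
                    raise ValueError(f"unsafe path in layer: {member.name}")
                target.parent.mkdir(parents=True, exist_ok=True)
                with tar.extractfile(member) as src, open(target, "wb") as out:
                    shutil.copyfileobj(src, out)
                extracted.append(member.name)
    return extracted
```

Whiteouts and symlinks are deliberately ignored here, which is only reasonable if the image is restricted to a single plain layer as discussed earlier in the thread.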
And:
|
@cgwalters So... is your preference to just go with the standard container image manifest format and put the kickstart files there? In that case we should change at least the config.mediaType (per specs). It is doable, it just feels slightly wrong, especially if OCI artifacts were created exactly for such cases. |
@cgwalters Don't get me wrong, I want to get this done and shipped, so if the majority will decide that using container image manifest format is the easiest thing to do, I will not stand much in the way and yield. Just please consider my arguments towards OCI artifacts usage. |
On Wed, Jun 12, 2024, at 12:11 PM, Ina Panova wrote:
> @cgwalters <https://github.com/cgwalters> Don't get me wrong, I want to
> get this done and shipped, so if the majority will decide that using
> container image manifest format is the easiest thing to do, I will not
> stand much in the way and yield. Just please consider my arguments
> towards OCI artifacts usage.
To be clear, I am broadly OK with the spec as is, and no one needs my specific approval to move forward with this. We are just having a discussion 😀
My core feeling is OCI artifacts make sense when:
- There is logically nothing to execute (at least directly)
- When the content is architecture independent (WASM, helm charts, AI models etc)
This use case matches just one of those two.
Note that the argument about extraction applies either way; with a custom OCI artifact type you need a custom build *and* a custom extractor right? Custom extractor especially for multi arch handling.
BTW one thing we’ve done in the past with somewhat similar cases (embedding RPMs in a container) is include a simple web server as the entry point. That adds another avenue for extraction or even direct serving, and addresses the “you can run it” problem.
But again…while I personally lean just making it a container, if you both feel otherwise I think that’s reasonable and we can move forward with the spec mostly as is.
|
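To make the multi-arch point concrete, here is a minimal sketch of what a client would do with a standard OCI image index: pick the manifest descriptor whose `platform` matches the machine. The `platform.architecture` and `platform.os` fields and the `amd64`/`arm64` values come from the OCI image spec; the helper names and the small mapping table are assumptions for this example.

```python
# uname-style machine names mapped to the GOARCH-style values that
# OCI image indexes use in platform.architecture.
UNAME_TO_OCI_ARCH = {
    "x86_64": "amd64",
    "aarch64": "arm64",
}


def oci_arch(machine: str) -> str:
    """Translate e.g. 'aarch64' to 'arm64'; unknown names pass through."""
    return UNAME_TO_OCI_ARCH.get(machine, machine)


def select_manifest(index: dict, architecture: str, os_name: str = "linux") -> dict:
    """Return the first manifest descriptor matching architecture and OS."""
    for descriptor in index.get("manifests", []):
        platform = descriptor.get("platform", {})
        if (
            platform.get("architecture") == architecture
            and platform.get("os") == os_name
        ):
            return descriptor
    raise LookupError(f"no manifest for {os_name}/{architecture}")
```

With a plain container image this selection is what existing tools already perform on pull; with a custom artifact type and an arch annotation, the client has to carry equivalent logic itself.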
So I was able to finalize my POC, see the gist for both container files for aa64 and x64 and let me know. Overall, I like the idea of using Questions or observations:
If you ask me, I lean towards reworking the spec from the ground up. I know we spent quite some time figuring it out and I wish we could have done this earlier, but we all know it was Summit blocking us from meeting up and discussing this properly. I don’t mind rewriting it and prototyping a new client based entirely on |
So we agreed with @ipanova to pursue the But before I start, I just wanted to get @cgwalters' opinion on podman v5 artifacts which can be created via |
Creating artifacts with v5 podman is really quick and convenient:
With a working directory as a volume, files are fetched from DNF repo and KS repo:
Then they can be added:
Podman adds a "files" section, which is a bit weird: I do not see this in any spec, and it appears to be dropped once the image is pushed into a registry:
Once pushed, it looks as expected:
Signing should be supported out of the box by podman (GPG). The only open question is compression. From what I saw in the documentation, podman is supposed to compress blobs "on the fly" transparently, but I am not sure how this is supposed to work and cannot even confirm that any data is ever compressed. |
I am done with my prototype using podman/skopeo: https://github.com/theforeman/nboci-files I want to discuss this on the containerization gathering. Once Ina is back, we can start proceeding with the plan for gitlab. |
To close the loop, we will use podman5 to build and push an image index with image manifests containing netboot artifacts. Thanks all for the feedback and discussion. |
At a quick skim: IMO this spec is overall sane, and I'd be fine to ship and support tooling using it.
I wasn't involved in its drafting, but just reading it I find myself wondering: Is it really worth defining a custom OCI artifact type for this versus a spec that is basically:
etc?
The thing is stuff like
org.pulpproject.netboot.os.arch
would require special handling by clients, but...when you start having architecture-specific binaries I think one needs to consider just reusing standard OCI containers, but with special labels.
Looking at it too, we have other special annotations like
org.pulpproject.netboot.src.digest
that wouldn't be necessary because OCI containers already have both compressed and uncompressed digests of the tarball (which includes file size metadata obviously etc.)
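As a sketch of how a client could lean on those built-in digests: the image config JSON carries `rootfs.diff_ids`, the digests of the uncompressed layer tars, so an rsync-like client can decide whether anything changed from the config alone. The function names are hypothetical, and how the config JSON is fetched is left out.

```python
import hashlib


def diff_id_of(uncompressed_tar: bytes) -> str:
    """DiffID of a layer: digest over the uncompressed tar bytes."""
    return "sha256:" + hashlib.sha256(uncompressed_tar).hexdigest()


def needs_update(image_config: dict, local_diff_ids: list[str]) -> bool:
    """Compare the config's rootfs.diff_ids with what is already on disk.

    `local_diff_ids` is whatever the client recorded after its last
    successful download (an assumption of this sketch).
    """
    remote = image_config.get("rootfs", {}).get("diff_ids", [])
    return remote != local_diff_ids
```

Only when `needs_update` returns true does the client have to pull layer blobs, which covers what the custom source-digest annotation was meant to provide.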