RFC: consider read only sysctl errors as non fatal #825
Comments
Did this PR try to fix this issue? #825 |
What PR? You link to this issue. |
No, that PR only works if you already have the right sysctl values. This issue is about the case where the right sysctl is not set, and just ignoring the failure if we cannot set it. But that most likely means routing is non-functional, so I am not sure this is a good idea. |
Thanks for the info. The right sysctls are the ones mentioned in the following issue, right? #362
|
I ran into this today while setting up a rootless, unprivileged podman deployment inside a k8s cluster. Here is the PodSpec for reference:

containers:
- image: quay.io/podman/stable:v4.9.0
  name: podman
  command:
  - podman
  - system
  - service
  - --log-level
  - debug
  - --transient-store
  - --time
  - "0"
  - tcp://localhost:2375
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
  resources:
    limits:
      squat.ai/fuse: 1
      squat.ai/tun: 1

I would have expected it to set only the hard-required sysctls and to skip any that already have the correct value or are optional, but it seems to try to set them unconditionally, which causes problems for this setup. I had tried to set the following sysctls using the PodSpec's
Checking from within the running pods, the sysctls have the values as set on the PodSpec and match the value netavark would write (which it still did even though it's already correct). |
We already first read the value and then only set it if it does not have the correct value. |
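The read-then-write behaviour described in this comment can be sketched roughly as follows. This is only an illustration and not netavark's actual code: the function name `apply_sysctl_if_needed` is hypothetical, and the path is passed in as a parameter so the sketch can be tried against an ordinary file instead of `/proc/sys`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: write `want` to the sysctl file at `path`
/// only if the current value differs.
/// Returns Ok(true) when a write happened and Ok(false) when it was skipped.
fn apply_sysctl_if_needed(path: &Path, want: &str) -> io::Result<bool> {
    // Reading works even when the file system is mounted read-only.
    let current = fs::read_to_string(path)?;
    if current.trim() == want.trim() {
        // Value already matches, so no write is attempted at all.
        return Ok(false);
    }
    // Only this branch can fail with a read-only error.
    fs::write(path, want)?;
    Ok(true)
}
```

With this shape, a read-only `/proc/sys` only becomes a problem when at least one value actually differs from what is wanted.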
Are there any sysctls missing from the table above, or set to incorrect values, that it doesn't log about then? If it doesn't write when the sysctls already have the expected values, I would not expect it to fail because it cannot write (assuming it really writes nothing in that case) 🤔 |
I am stuck with the same issue. Any idea how to resolve this? Is there maybe another place where it writes to /proc ? |
I recently took another shot at this with podman 5, but sadly nothing has changed on my end. There is no documentation on which sysctl values are expected to be set or will be attempted, or on what capabilities or filesystem access are needed. The only information[1][2][3][4] I've found so far suggests there's no need for it to be privileged, no need for (NET_ADMIN) capabilities, no need to set sysctls if they are already set correctly[5][6], etc. Running podman without any networking suggests this might actually be true, but the moment networking is involved, it all falls apart. That means one or more of the following: it can't yet work rootless or without special privileges/capabilities when networking is involved (doubtful, since everyone involved presents it as working); assumptions are being made about the underlying systems/runtimes (e.g. device access, minimum set of capabilities, ...); or the documentation is incorrect or missing information (quite likely). It also turns out we can request a
[1]: https://www.redhat.com/sysadmin/podman-inside-kubernetes
netavark/src/network/core_utils.rs Line 258 in febe31a
EDIT: the above was run on Kubernetes |
Hi, I'm facing some version of @Omar007's issue as well. With However, as soon as I attempted to expose one of the container ports on localhost, the container failed to come up as netavark tries to set My questions:
If you think it's meaningful to do so I can make another reproduction attempt with a more recent podman version (the only reason why I tested with 4.9.3 was because it's part of another image in our setup). Here's some context on what we're even trying to achieve here (X/Y problems and all that): we have a bunch of code using |
@mvalvekensCET how are you using |
@Omar007 Unfortunately it's difficult to try this on new GKE, since it does not have this feature gate enabled, so |
When running inside unprivileged containers, /proc is normally mounted read-only.
Now if a user tries to run netavark, it will fail hard if we cannot set all the sysctls. Most of them are needed for routing or to disable some IPv6 options, but general communication may still be possible.
We should consider not treating read-only errors as fatal and just logging them as warnings. The biggest problem is likely the ip_forward sysctl: without it, no external communication would be possible. However, this could already be set by the outer container manager, in which case I would expect it to mostly work fine.
see containers/podman#19991
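
As a rough illustration of the proposal above (not netavark's actual code), a failed sysctl write could be classified by its OS error code, so that only a read-only file system is downgraded to a warning. All names here (`SysctlFailure`, `classify_write_error`, `handle_write_result`) are hypothetical. The sketch assumes Linux, where EROFS ("Read-only file system") is errno 30.

```rust
use std::io;

/// Hypothetical outcome of a failed sysctl write under the proposed policy.
#[derive(Debug, PartialEq)]
enum SysctlFailure {
    /// Read-only /proc/sys: log a warning and continue.
    Warn,
    /// Anything else: keep treating it as a hard error.
    Fatal,
}

/// EROFS ("Read-only file system") on Linux.
const EROFS: i32 = 30;

/// Decide how a write error should be handled.
fn classify_write_error(err: &io::Error) -> SysctlFailure {
    match err.raw_os_error() {
        Some(EROFS) => SysctlFailure::Warn,
        _ => SysctlFailure::Fatal,
    }
}

/// Turn a read-only failure into a logged warning; pass other errors through.
fn handle_write_result(name: &str, res: io::Result<()>) -> io::Result<()> {
    match res {
        Ok(()) => Ok(()),
        Err(e) if classify_write_error(&e) == SysctlFailure::Warn => {
            eprintln!("warning: could not set sysctl {}: {}", name, e);
            Ok(())
        }
        Err(e) => Err(e),
    }
}
```

Keeping every other error fatal is a deliberate choice in this sketch: a permission error or a missing sysctl file would still stop setup, and only the read-only case is allowed to continue with a warning.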