-
Notifications
You must be signed in to change notification settings - Fork 222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TEP-0146] Parameters
in Script
#1077
[TEP-0146] Parameters
in Script
#1077
Conversation
/kind tep |
/assign |
/assign @Yongxuanzhang @JeromeJu |
@JeromeJu: GitHub didn't allow me to assign the following users: Yongxuanzhang. Note that only tektoncd members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Parameters
in Script
Parameters
in Script
Parameters
in Script
Parameters
in Script
Parameters
in Script
Parameters
in Script
/assign |
Hi @aaron-prindle, thanks for the PR! We have a TEP template to leverage in case you don't know 😁 |
@Yongxuanzhang - thank you for this! Initially I was going off the format of the TEP this TEP is building off of [TEP-0099]: Parameters in Scripts which had a different format. I've updated the TEP here to be in line w/ this template now, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think the problem statement is clear.
teps/0146-parameters-in-script.md
Outdated
1. **(To be discussed) Provide an escape hatch in v1**. If a cluster | ||
operator or admin decides that they want to continue allowing | ||
`Parameter` variables to be directly injectable to `script` fields of | ||
`Steps` there should be some way for them to configure that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we consider to implement this as a new feature gated by feature flag? Cluster operators can choose enable this feature to disallow insecure tasks. They can choose to continue use param injection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great point, I will update the doc to explain how the ability to toggle off the current default behaviour via a feature-gate could be helpful here and allow a goal/requirement here of making Tekton safe by default be more feasible for users. Will update this thread when I have completed adding this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we could disallow this by default in v1
- it would be a backward incompatible change.
But we could provide a flag to let operator disallow params in scripts if they want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the doc to explain that feature-gates will be added:
- upon initial implementation so that cluster operators can OPT-IN to disabling direct param injection for
script
- [optional/TBD] post-implementation so users can opt-in to default behavior (if we deem this necessary)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, agree the default should be backwards compatible and allow the current behavior. As you mention it makes sense to allow the cluster operator to disable the default behavior via a feature flag though. I’ve updated the doc in a few places (including the highlighted section) to explain how feature-gates could be used to support a potential option for restricting the default v1 (but NOT by default). Users choosing this option would also give us some insights/feedback related to the experience for if/when we remove the default behavior in 1.X/v2 (or after the ~9 mo deprecation policy depending on what we decide)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the problem statement @aaron-prindle . The examples and the diff between before
and after
and resolved
LGTM 🙏 Just some minor comments.
### Discussion | ||
This approach mutates the specification, but it is something we're already doing | ||
with implicit `Parameters` per [TEP-0023][tep-0023]. This approach removes the insecure | ||
default behaviour but in doing so unfortunately loses backwards-compatibility requiring users |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aaron-prindle for putting putting up the discussion. I noticed there are already discussions at #1077 (comment) on using feature gate to turn off the default behaviour. But I am afraid the existing feature has now been regarded as stable, and we might have to comply with the API compatibility policy for its deprecation cycle to keep both options available to users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also noticed in
- "During the inital creation of this feature, the user experience will not change as the original UX/functionality will be preserved."
- "At a later time or in an updated
apiVersion
(v2*),"
These all SGTM. Wondering if we would like to call out explicitly that we'd wait till the end of the 9-month deprecation period?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if we would like to call out explicitly that we'd wait till the end of the 9-month deprecation period?
@JeromeJu where would this one apply ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the text here to explicitly reference the 9 mo deprecation policy for any v1.X changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted here and in some other comments, the doc was initially unclear around the backwards compatibility guarantees - it has now been updated. The proposed solution will be backwards compatible for the 9 mo deprectation window and will have an opt-in feature flag for users to modify this if they desire (assuming folks here agree to that during the implementation).
teps/0146-parameters-in-script.md
Outdated
1. **Automatable**: A script or program should be able to convert `Tasks` | ||
to the new approach. This might be impossible for all cases (e.g. if an | ||
unrecognized shebang has been used in the script) but a best-effort | ||
implementation should be provided. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have such tooling indeed, maybe something that can detect issues and optionally attempt to fix them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, as part of the implementation we would add some automation that did either the following:
- Either as a warning in controller logs OR when the insecure defaults are eventually removed (via opt-in feature flag or after decided upon depreciation timing in v1.X, v2, etc.) - tekton will suggests running a new to-be-implemented CLI command such as
tkn fix
that could update input YAML files, converting them to using the new secure “.env” syntax forscript
and relaying any issues/manual changes necessary as needed - Controller implementation that is able to convert all/subset of cases the automatically “under the hood”. Ideally this would work for all cases but perhaps for some 80% (eg: bash only, etc.) cases it works and for others it returns a validation error
- name: foo | ||
image: myimage | ||
script: | | ||
echo $(params.bar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, maybe we could have an example with non-bash (e.g. python) too?
I just want to make sure that whatever solution we design, works across the different kinds of scripts we support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good point, just added a security vulnerability example for Python here as well detailing how to do a similar exploit as the bash example with the current implementation. Additionally now the preferred solution (Explicity Project
Parametersas Environment Variables in
Steps (Language-Agnostic)
) shows an example of how non-bash implementations would work.
teps/0146-parameters-in-script.md
Outdated
1. **Easy to use**: Whatever replaces the direct use of `$(params.X)` in | ||
`script` should be unsurprising and memorable to ease friction in user | ||
migration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on ease of use and possibly I would add verbosity.
The current proposed approach with environment variables works fine but is very verbose. To get some input from a pipeline into a script there are many levels of indirection: pipeline.param -> task.param -> step.env.variable -> env variable in script, which is both frustrating to write and maintain as well as using a lot of space in etcd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree that currently the solution here is verbose in that there are multiple levels of indirection as you have outlined. The current proposal has been designed based on the consensus and positive feedback received from the previous proposal (now closed TEP-099). Additionally the currently suggested solution (*.env) similar to those employed in tasks that successfully addressed this issue. I added an additional bullet under Requirements
titled Minimize verbosity as much as possible
which mentions the issue you raised above.
Below in your comments you have mentioned an alternative implementation options could be that we set the env vars in the pod directly which would seemingly be less verbose. Once this is merged I will compare different implementation paths (including setting the env vars in pods directly) and we can weigh the tradeoffs there. I have currently not expressly added that implementation detail in the doc (plan to post merge as "proposed")
teps/0146-parameters-in-script.md
Outdated
env: | ||
- name: PARAM_URL | ||
value: $(params.url) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative to mutating the step specification could be to inject the env in the Pod
.
We could do that by default for all params by default.
There are some gotchas:
- the translation from param name to env variable name risks being brittle
- what happens in case of conflicts with user-defined environment variables or how do we prevent such conflicts
- how do we handle array params? We added support for indexing into array params but how do we render the list into an environment variable that is usable by different kinds of script environments? Shall we use a JSON blob? That's not too bash-friendly but very python-friendly
- how do we handle object params? Do we add one variable for each field? Or a single variable with a JSON blob?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last 2 questions of @afrittoli are also valid independently of "mutating the step spec" or not, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to the gotchas outlined, I will address each one inline:
- the translation from param name to env variable name risks being brittle
- This is a good point, to clarify could you elaborate on the specific brittleness you mention? My interpretation would be that because param names are somewhat unconstrained multiple param names might map to the same env var name in translation if not careful. I have not added this into the document but to reduce brittleness in the param-name- > env-var name translation we could add a param name hash suffix (similar to k8s pods) to avoid collisions eg: TEKTON_PARAM_FOO_X5GHA. This would make directly knowing the name for the param a bit more difficult for users but Tekton is dynamically generating these anyway so it should be ok if it is somewhat opaque.
- what happens in case of conflicts with user-defined environment variables or how do we prevent such conflicts
- Good point, this case important to call out explicitly. I’ve added this into the document now under
Design Details
in a bulleted section titled - “Considered Edge Cases and Gotchas”. The gist is that in case of an overlap with user-defined environment variables, Tekton will throw a validation error stating that the param should be renamed. NOTE: collision should be very unlikely given the TEKTON_PARAM prefix and we can also optionally add a sha suffix to the env var (based on the param name) to further unique-ify the env var name to reduce collision.
- Good point, this case important to call out explicitly. I’ve added this into the document now under
- how do we handle array params? We added support for indexing into array params but how do we render the list into an environment variable that is usable by different kinds of script environments? Shall we use a JSON blob? That's not too bash-friendly but very python-friendly
- Currently the plan would be that the values in the env var map 1:1 with how they are currently used as that UX is working for users atm. In the case of an array param, it would be the same values based on the syntax, for an array param + index (params.arr[0]) we would get the string value and for the entire array we would provide the list as a comma separated array (params.arr -> first,second,third…) The TEP suggestion is built around the
script
field which only supports string type variables which currently has comma separated serialization for the array and actually has a validation error against using an object directly (vs an object field). I have added some examples under the “Suggested Solution”.
- Currently the plan would be that the values in the env var map 1:1 with how they are currently used as that UX is working for users atm. In the case of an array param, it would be the same values based on the syntax, for an array param + index (params.arr[0]) we would get the string value and for the entire array we would provide the list as a comma separated array (params.arr -> first,second,third…) The TEP suggestion is built around the
- how do we handle object params? Do we add one variable for each field? Or a single variable with a JSON blob?
- There would be a variable for each field. I have added some examples under the “Suggested Solution”. Currently in my experimentation around this it seems that you get a validation error if you try to reference a param object in a script directly (vs a param object field which works) - as such there doesn't seem to be a precedent for supporting the obj param directly as they can't be used IIUC in
script
currently
- There would be a variable for each field. I have added some examples under the “Suggested Solution”. Currently in my experimentation around this it seems that you get a validation error if you try to reference a param object in a script directly (vs a param object field which works) - as such there doesn't seem to be a precedent for supporting the obj param directly as they can't be used IIUC in
teps/0146-parameters-in-script.md
Outdated
Write `Parameters` in files - `/tekton/parameters/<param-name>` - mounted into `Steps` | ||
in the same way that `Results` are supplied. The file path for the param | ||
would be made available via a variable (possibly reusing `$(params.<name>)`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Files would hold JSON blobs similar to what we do for results I guess?
Producing JSON with bash is not too bad, parsing it is a bit harder, but maybe we could provide some helper functions in a bash resource file mounted in a local folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to providing some helper scripts to be used for arrays (and possibly objects) as needed. I’ve added a note under the alternative there related to adding UX helper functions
teps/0146-parameters-in-script.md
Outdated
IFS='' read -d -r PARAM_URL3 /tekton/params/url3 | ||
git clone ${PARAM_URL3} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would this work for array/object params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not investigated as deeply for the .path
alternative as I have for the .env
option but I believe this implementation would require helper functions as you suggest above to get this correct. Serializing arrays/objects into JSON syntax and then creating helper functions to parse the necessary piece of information instead when using [*]
or [0]
syntax (or the obj array directly) would be required. I did not update the document here but if folks prefer this alternative or want more detail here now I can add more information. In the case that would be too cumbersome, we could try to map files 1:1 to used values and have a file for each field/array-index but this should only be used as a final option as it doesn't scale well.
spec.steps[0].script: $(params.url) cannot be injected into a script, consider moving | ||
into an environment variable on the Step. | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we did fail validation, we would then basically not support them at all, which would be a backwards incompatible change. We could do that in v2
, for v1
we might do that as an opt-in feature or a warning system - even though validation in itself does not have any facility for deprecations or warnings that I'm aware of, so it's unclear how the warning could be delivered to the user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, currently the idea for rolling this out would be to keep backwards compatibility with an opt-in feature-gate for cluster operators to remove this in v1 (if we agree this makes sense). For the warnings, it wasn’t clear to me the de-facto way to do this in Tekton. As stated, we could output some “WARN: Tekton is deprecating <...>”, see docs link here for more information on deprecation timelines how to migrate <doc-link AND/OR link to TBD bespoke tkn
command for migration>” on creation of a Task but not sure how visible this might be to most end users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this proposal.
I left a couple of comments.
NIT: I have the impression that there was a proposal with the .env
syntax initially documented, and then removed in favour of a list of alternatives, as some part of the document seem to assume a specific design and others don't.
In any case, I'm happy to have this merged as a proposal, but I'd like to see more coverage of array/object params as well as non-bash environments (e.g. python).
Thank you for your feedback @afrittoli & @vdemeester! I’ve updated the document addressing all of the feedback posted (see individual comments). Wanted to address the points in you final comment below:
|
Thanks for the reviews here! @afrittoli - does this look good now to be merged as “proposed”? If not, what are the open questions to resolve to get this to implementable? I've added additional information and examples around array params, obj params, and non-bash environments. Happy to answer any additional q's or iterate on the doc as needed. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the suggested solution here ? Usage of $(params.url.env)
or having ${PARAMS_URL}
working by default ? I am not sure I understand the before/after/resolved.
- Are we mutating the
Task
definition (to add the environment variables), or are we just setting them at Pod creation time ? - What happens if
params.url
andparams.URL
exists ? (I think we error out on those already, but I am not 100% sure 😅 ).
Hi @vdemeester, responding to your questions below:
|
Sounds good, thanks @aaron-prindle |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: afrittoli, JeromeJu, vdemeester The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Thanks @vdemeester! Lmk if there is anything I need to do on my end to get this merged as |
We need someone who has the permission to lgtm |
This tep is very similar like https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions#using-an-intermediate-environment-variable ? Is it mentioned in the proposal as a reference? |
This PR adds the problem statement for the TEP and identifies possible solutions. The proposal will be added in a subsequent PR after discussions of alternatives. Using `Parameter` variables directly in `script` blocks in `Tasks` is a footgun in two ways: - **Security**: It is easy for a `Task` _author_ to accidentally introduce a vector for code injection and, by contrast, difficult for a `Task` _user_ to verify that such an injection can't or hasn't taken place. - **Reliability**: It is easy for a `Task` _user_ to accidentally pass in a `Parameter` with a character that would make the `Script` invalid and fail the `Task`, making the `Task` extremely fragile. To solve the above problems, this TEP aims to: - Introduce a safe and reliable way to access `Parameter` variables from `Scripts`, and update the documentation and *Tekton Catalog* with the new approach. - Disallow use of `Parameter` variables directly in `script` blocks of `Steps` in *Tekton Pipelines V1 API*. References: * Issues: * tektoncd/pipeline#3226 * tektoncd/triggers#675 * tektoncd/plumbing#971 * [Catalog Guidance to Avoid Using `Parameters` in `Script` Blocks](https://github.com/tektoncd/catalog/blob/main/recommendations.md#dont-use-interpolation-in-scripts-or-string-arguments) * Tekton Enhancement Proposals: * [TEP-0017: Shell-Escaped Parameters](#208) * [TEP-0023: Implicit Parameters](https://github.com/tektoncd/community/blob/main/teps/0023-implicit-mapping.md) * [TEP-0099: Parameters in Script](#596) Co-authored-by: Jerop Kipruto <[email protected]> Co-authored-by: Scott Seaward <[email protected]>
@Yongxuanzhang - just added this as a reference, thanks! |
/lgtm |
This PR adds the problem statement for the TEP and identifies possible solutions. The proposal will be added in a subsequent PR after discussions of alternatives.
Using
Parameter
variables directly inscript
blocks inTasks
is a footgun in two ways:Task
author to accidentally introduce a vector for code injection and, by contrast, difficult for aTask
user to verify that such an injection can't or hasn't taken place.Task
user to accidentally pass in aParameter
with a character that would make theScript
invalid and fail theTask
, making theTask
extremely fragile.To solve the above problems, this TEP aims to:
Parameter
variables fromScripts
, and update the documentation and Tekton Catalog with the new approach.Parameter
variables directly inscript
blocks ofSteps
in Tekton Pipelines V1 API.References:
script:
pipeline#3226 * Escape incoming data triggers#675 * UseEnvironment Variables
instead ofParameters
inScript
plumbing#971Parameters
inScript
Blocks