Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[extension/sigv4authextension] Initial implementation #8263

Merged
merged 14 commits into from
Mar 16, 2022

Conversation

erichsueh3
Copy link
Contributor

Description:
This is the initial implementation of the Sigv4 Authenticator Extension. See #7533 for more information.

Link to tracking Issue:
#7533

Testing:
Unit tests have been added.

Documentation:
Included in the PR is a README and a design document.

@erichsueh3 erichsueh3 requested a review from a team March 5, 2022 00:02
@anuraaga
Copy link
Contributor

anuraaga commented Mar 7, 2022

@erichsueh3 Can you squash the commits?

extension/sigv4authextension/README.md Outdated Show resolved Hide resolved
extension/sigv4authextension/README.md Outdated Show resolved Hide resolved
extension/sigv4authextension/config_test.go Show resolved Hide resolved
extension/sigv4authextension/extension_test.go Outdated Show resolved Hide resolved
expectedErr error
}{
{
"bad_credentials",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC the test case has a bad role name so it would be better to explicitly call out bad_role. But it is probably best to avoid making requests to AWS from unit tests - I'm not sure if this actually does (the role is completely misformatted so maybe handled by the SDK) so can you check? If it does, we should probably just skip having the role test.

Either way, make sure there is a test with a valid config but missing_credentials (no role, no env vars)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like having a role means that a request to aws is in fact made, as it returns an error operation error STS: AssumeRole, failed to sign request: failed to retrieve credentials: no EC2 IMDS role found, operation error ec2imds: GetMetadata, http response error StatusCode: 403, request to EC2 IMDS failed; I think that we should remove the role test. As for a test with a valid config but missing_credentials with no role/environment variables, I've added in that test and it works as expected. However, it also seems to be making a request to aws (no EC2 IMDS role found, operation error ec2imds: GetMetadata, http response error StatusCode: 403, request to EC2 IMDS failed), so not sure if this is the right way to go about this test (this also applies to the missing creds test in TestGetCredsFromConfig()).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, how much value is there in having a test with a valid config but with missing credentials, when trying to test for errors from loading from the config? I think we could remove this whole TestLoadConfigError(), as Validate() will only error as a result of getCredsProviderFromConfig() now after making region/service optional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a test when creating the extension then it's not as important in the config, we'd want it in one or the other I think (ideally without the IMDS request if we can try my other suggestion #8263 (comment)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I think for now we can just keep this test in here then?

provider := stscreds.NewAssumeRoleProvider(stsSvc, cfg.RoleArn, func(o *stscreds.AssumeRoleOptions) {
o.RoleSessionName = "otel-" + strconv.FormatInt(time.Now().Unix(), 10)
})
creds, err := provider.Retrieve(context.Background())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do want to call Retrieve once in this function to validate credentials eagerly, but what we want to return is provider or awscfg.Credentials, not the result of Retrieve. The result of retrieve is a never-again-refreshed credential, we want to call Retrieve every time we sign.

You'll also want to call NewCredentialsCache for this sts codepath

https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/#specify-credentials-programmatically

All credential providers passed to or returned by LoadDefaultConfig are wrapped in a CredentialsCache automatically. This enables caching and concurrency safe credential access. If you explicitly configure a provider on aws.Config directly you must explicitly wrap the provider with this type using NewCredentialsCache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the change to getCredsProviderFromConfig(), returning an *aws.CredentialsProvider instead; we still call Retrieve once for validation, and we also wrap it in NewCredentialsCache when we explicity configure a provider when we have a RoleARN

extension/sigv4authextension/extension.go Outdated Show resolved Hide resolved
extension/sigv4authextension/extension.go Outdated Show resolved Hide resolved

func TestGetCredsFromConfig(t *testing.T) {
awsCreds := fetchMockCredentials()
t.Setenv("AWS_ACCESS_KEY_ID", awsCreds.AccessKeyID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we shouldn't need env vars here since can set creds on cfg

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also let's add tests for missing creds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added back in environment variables into the test for a success case, since according to this guide, environment variables have the highest precedence for credentials (besides CLI options). Without it, we would look at the ~/.aws files, and then at container and instance profile credentials. If we don't set environment variables here, then it will eventually error when it reaches the end and tries to use instance profile credentials. I've added in a test for missing creds, and it errors here at the last step of precedence for credentials. It also makes a request to aws, which we want to avoid, so not sure if having this test is ok or not, or if there's another way to go about this test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this - I guess it is hitting IMDS to try to fetch a cred for an EC2 machine. Since that IP will almost always be unresolvable it's a bit better than hitting the STS endpoint which is always available so maybe we can live with it. Could you try if disabling IMDS helps? Can follow up in another PR though

AWS_EC2_METADATA_DISABLED
https://github.com/aws/aws-sdk-go-v2/blob/c214cb61990441aa165e216a3f7e845c50d21939/config/load_options.go#L169

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried disabling IMDS by adding awsconfig.WithEC2IMDSClientEnableState(imds.ClientDisabled), as a LoadOption to the LoadDefaultConfig() method call, but it doesn't change anything and we still get the same error of no EC2 IMDS role found, operation error ec2imds: GetMetadata, http response error StatusCode: 403, request to EC2 IMDS failed; I think its better to follow up with this in another PR as well

extension/sigv4authextension/factory_test.go Outdated Show resolved Hide resolved
exporter/awsprometheusremotewriteexporter/README.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
exporter/awsprometheusremotewriteexporter/README.md Outdated Show resolved Hide resolved
exporter/awsprometheusremotewriteexporter/factory.go Outdated Show resolved Hide resolved
extension/sigv4authextension/signingroundtripper.go Outdated Show resolved Hide resolved
extension/sigv4authextension/signingroundtripper.go Outdated Show resolved Hide resolved
extension/sigv4authextension/signingroundtripper.go Outdated Show resolved Hide resolved
extension/sigv4authextension/signingroundtripper.go Outdated Show resolved Hide resolved
extension/sigv4authextension/signingroundtripper_test.go Outdated Show resolved Hide resolved
@erichsueh3
Copy link
Contributor Author

Hi @jpkrohling, could you take a look at this PR as well? Would like some more eyes on it if possible

@jpkrohling
Copy link
Member

Sure, I'll add this to my queue.

extension/sigv4authextension/config_test.go Show resolved Hide resolved
Next, we introduce `extension.go`, which contains the bulk of the implementation of the Sigv4 authenticator. This includes implementing the `ClientAuthenticator` interface and its necessary methods.


`Sigv4Auth` is a struct that implements the `ClientAuthenticator` interface. It must implement two methods, `RoundTripper()` and `PerRPCCredentials()`. Additionally, it must also implement a `Start()` and a `Shutdown()` method. For our Sigv4 authenticator, both of these methods won’t do anything, so we instead embed `componenthelper.StartFunc` and `componenthelper.ShutdownFunc` to get that default behavior.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are very close to merging open-telemetry/opentelemetry-collector#4837, which would make this a bit simpler. Keep an eye on it :-)

Copy link
Contributor Author

@erichsueh3 erichsueh3 Mar 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jpkrohling, would this just mean replacing componenthelper.StartFunc with component.StartFunc (and same for Stop)? Want to make sure I'm not misunderstanding anything before I make changes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you would just need to call the "NewClientAuthenticator(WithClientRoundTripper(yourfunc))" and you'd get a complete implementation without having to worry about the start/stop/per-RPC implementations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, understood; this is something that we'd need to wait until the next release to make the changes as well correct?

extension/sigv4authextension/design.md Outdated Show resolved Hide resolved
extension/sigv4authextension/design.md Outdated Show resolved Hide resolved
extension/sigv4authextension/design.md Outdated Show resolved Hide resolved
@@ -80,6 +80,7 @@ module-sets:
- github.com/open-telemetry/opentelemetry-collector-contrib/extension/observer/k8sobserver
- github.com/open-telemetry/opentelemetry-collector-contrib/extension/oidcauthextension
- github.com/open-telemetry/opentelemetry-collector-contrib/extension/pprofextension
- github.com/open-telemetry/opentelemetry-collector-contrib/extension/sigv4authextension
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to add this component to the -releases repo when ready.

* [List of AWS regions](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html)
* `service`: **Optional**. The AWS service for AWS Sigv4
* Note that an attempt will be made to obtain a valid service from the endpoint of the service you are exporting to
* `role_session_name`: **Optional**. The name of a role session. If not provided, one will be constructed with a semi-random identifier.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this up to be right after role_arn

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also wondering if it should be

assume_role:
  arn:
  session_name:

since session_name doesn't make sense without arn

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, pairing the ARN and session name like that sounds good to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the changes and now include an AssumeRole struct with an ARN and a SessionName


identifier := cfg.RoleSessionName
if identifier == "" {
b := make([]byte, 5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we found the default SDK behavior let's be consistent with that, or remove the random logic, but we should avoid this divergent logic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the random logic and use of the otel prefix, so now it will just default to the SDK behavior using nanosecond time.

Copy link
Member

@Aneurysm9 Aneurysm9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor note about service/region detection warning. Fix quickly if you can, but I think we can handle it separately if needed as well.

extension/sigv4authextension/signingroundtripper.go Outdated Show resolved Hide resolved
@Aneurysm9 Aneurysm9 added the ready to merge Code review completed; ready to merge by maintainers label Mar 15, 2022
@jpkrohling jpkrohling changed the title New Component: Sigv4 Authenticator Extension [extension/sigv4authextension] Initial implementation Mar 16, 2022
@jpkrohling jpkrohling merged commit 88d0e28 into open-telemetry:main Mar 16, 2022
@erichsueh3 erichsueh3 deleted the sigv4auth branch March 16, 2022 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready to merge Code review completed; ready to merge by maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants