Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(cli): changeset-based cdk diff not showing changes caused by SSM parameters, but appear in CloudFormation UI changeset info #29731

Open
Labels
bug This issue is a bug. cli Issues related to the CDK CLI effort/large Large work item – several weeks of effort p1 package/tools Related to AWS CDK Tools or CLI

Comments

@blimmer
Copy link
Contributor

blimmer commented Apr 4, 2024

Describe the bug

Currently if you use a Systems Manager StringParameter as a value to another resource, the cdk diff does not identify a diff when the underlying parameter changes, even when using the new behavior that produces a CloudFormation changeset (vs. template-only diffs). The new behavior to produce a real changeset is designed to identify these types of changes.

If I manually create a changeset in the console, it does identify the change, so this feels like an issue specific to the CDK changeset diffing behavior.

Expected Behavior

I expect to be notified that my stack will change, just like I am if I upload the template to the CloudFormation UI. This bug is especially concerning if the change could trigger an unexpected resource replacement (which makes me think this should be a P1 issue).

Current Behavior

cdk diff says There were no differences when, really, CloudFormation generates a changeset that shows a diff.

Reproduction Steps

Consider the following straightforward stack:

import * as cdk from 'aws-cdk-lib';
import { Queue } from 'aws-cdk-lib/aws-sqs';
import { StringParameter } from 'aws-cdk-lib/aws-ssm';
import { Construct } from 'constructs';

export class CdkBugReportStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const queueNameFromParameter = StringParameter.valueForStringParameter(this, '/cdk/test/queue-name-param');
    new Queue(this, "Queue", {
      queueName: queueNameFromParameter
    })
  }
}

It imports an SSM StringParameter and uses the resolved value to name the queue.

  1. Create an SSM StringParameter with an initial value. For this repro, I'll call it blimmer-test-1.

Screenshot_2024-04-04_at_14_31_24

  1. Deploy the example stack above (cdk deploy CdkBugReportStack). You'll see your queue is cre
    ated with the name of your StringParameter from step 1.

Screenshot_2024-04-04_at_14_32_49

  1. Now, edit the StringParameter from step 1. I updated the value from blimmer-test-1 to blimmer-test-2.

Screenshot_2024-04-04_at_14_33_04

  1. Run cdk diff CdkBugReportStack. Make sure you're using the latest CDK (at time of writing v2.135.0) and that you're generating a changeset for the diff (e.g., not passing --no-change-set).

    > npx cdk diff CdkBugReportStack
    Stack CdkBugReportStack
    Hold on while we create a read-only change set to get a diff with accurate replacement information (use --no-change-set to use a less accurate but faster template-only diff)
    There were no differences
    
    ✨  Number of stacks with differences: 0

    As you can see, no diffs are identified (even though the underlying parameter did change).

  2. Generate the CloudFormation stack for use in the console in the next steps:

    > npx cdk synth -j CdkBugReportStack > stack.json
    
  3. Visit the CloudFormation service page in the AWS Console. Select the stack, CdkBugReportStack.

  4. Choose "Stack Actions" -> "Create change set for current stack".

Screenshot_2024-04-04_at_14_49_59

  1. Replace the template with the one you generated in step 5.

Screenshot_2024-04-04_at_14_51_24

  1. Keep clicking "next" and, finally, submit, without changing any other values.
  2. Now, once the changeset is rendered, you'll see that it identifies that the SSM parameter has changed and will result in a replacement of the queue. This is what I expect aws-cdk to report, too.

Screenshot_2024-04-04_at_14_56_05

  1. Now you can also run npx cdk deploy CdkBugReportStack and you'll see that the queue is replaced, even though the diff said no changes were detected.
> npx cdk deploy

✨  Synthesis time: 1.84s

CdkBugReportStack:  start: Building 3514dbc73317309e7175dd9ffc618c6a7b4f814c0383edf38308e14ddaf77d81:current_account-current_region
CdkBugReportStack:  success: Built 3514dbc73317309e7175dd9ffc618c6a7b4f814c0383edf38308e14ddaf77d81:current_account-current_region
CdkBugReportStack:  start: Publishing 3514dbc73317309e7175dd9ffc618c6a7b4f814c0383edf38308e14ddaf77d81:current_account-current_region
CdkBugReportStack:  success: Published 3514dbc73317309e7175dd9ffc618c6a7b4f814c0383edf38308e14ddaf77d81:current_account-current_region
CdkBugReportStack: deploying... [1/1]
CdkBugReportStack: creating CloudFormation changeset...

 ✅  CdkBugReportStack

✨  Deployment time: 43.33s

Stack ARN:
arn:aws:cloudformation:us-west-2:REDACTED:stack/CdkBugReportStack/6b883590-f2c2-11ee-afe3-025d26e8baa1

✨  Total time: 45.17s

Possible Solution

It feels like if CloudFormation can identify this change, CDK should also be able to identify the change when running the more accurate changeset-based diff.

Additional Information/Context

I was confused why my services were sometimes redeploying when no diffs were shown via cdk diff. It turned out that my problem was with obtainDefaultFluentBitECRImage, which obtains the fluent bit image via an SSM parameter (docs). When the underlying parameter changed, it caused my task definitions and services to be updated.

Linking this up with #7366 and #23288, which are related to the specific issue I ran into here.

CDK CLI Version

2.135.0 (build d46c474)

Framework Version

No response

Node.js Version

20 LTS

OS

MacOS

Language

TypeScript

Language Version

No response

Other information

No response

@blimmer blimmer added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Apr 4, 2024
@github-actions github-actions bot added the package/tools Related to AWS CDK Tools or CLI label Apr 4, 2024
@blimmer
Copy link
Contributor Author

blimmer commented Apr 4, 2024

cc @comcalvi who has been working on the CFN changeset diffs lately

@pahud
Copy link
Contributor

pahud commented Apr 8, 2024

Yes you are right. Looks like cdk diff would not detect the parameter value change but the CFN console would. Making it a p1 and we'll discuss with the team.

@pahud pahud added p1 cli Issues related to the CDK CLI effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Apr 8, 2024
@bergjaak
Copy link
Contributor

Using the example shared by @blimmer, the root cause appears to be that the DifferenceCollection.logicalIds (https://github.com/aws/aws-cdk/blob/main/packages/@aws-cdk/cloudformation-diff/lib/diff/types.ts#L406) doesn't contain all the changed resources. Before hitting this line (406) of code in cdk diff, we filter for replacements -- and we do find the SqsQueue replacement https://github.com/aws/aws-cdk/blob/main/packages/@aws-cdk/cloudformation-diff/lib/diff-template.ts#L233. So we should make the TemplateDiff.resources contain the changed resource.

Here are the debugger variables at line 233:
Screenshot 2024-04-22 at 2 04 58 PM

@bergjaak bergjaak self-assigned this May 6, 2024
mergify bot pushed a commit that referenced this issue May 10, 2024
### Issue # (if applicable)

Closes #29731

### Reason for this change

* Diffs that are only present in the change set are ignored

### Description of changes

* Include changes to properties that are only present in the changeset

### Description of how you validated changes

* Here is an image of what the change looks like, with the fix on the right and the old behavior on the left. I did this with a CDK app that is the same as given in the example from the customer who opened the issue (29731), but the app also includes a new queue (which is in the left and right).

<img width="851" alt="29731Fix" src="https://github.com/aws/aws-cdk/assets/108292982/2c6e9464-5c36-4697-88fd-2929cdbcf8cc">


* manual, unit, and integration tested. Ran all integration tests that mention `cdk diff`:

```
[INTEG TEST::cdk diff] Starting (pid 34550)...
[INTEG TEST::cdk diff] Done (13725 ms).
[INTEG TEST::cdk diff --fail on multiple stacks exits with error if any of the stacks contains a diff] Starting (pid 34550)...
[INTEG TEST::cdk diff --fail on multiple stacks exits with error if any of the stacks contains a diff] Done (80833 ms).
[INTEG TEST::cdk diff --fail with multiple stack exits with if any of the stacks contains a diff] Starting (pid 34550)...
[INTEG TEST::cdk diff --fail with multiple stack exits with if any of the stacks contains a diff] Done (81796 ms).
[INTEG TEST::cdk diff --security-only successfully outputs sso-permission-set-without-managed-policy information] Starting (pid 34550)...
[INTEG TEST::cdk diff --security-only successfully outputs sso-permission-set-without-managed-policy information] Done (7394 ms).
[INTEG TEST::cdk diff --security-only successfully outputs sso-permission-set-with-managed-policy information] Starting (pid 34550)...
[INTEG TEST::cdk diff --security-only successfully outputs sso-permission-set-with-managed-policy information] Done (7043 ms).
[INTEG TEST::cdk diff --security-only successfully outputs sso-assignment information] Starting (pid 34550)...
[INTEG TEST::cdk diff --security-only successfully outputs sso-assignment information] Done (6930 ms).
[INTEG TEST::cdk diff --security-only successfully outputs sso-access-control information] Starting (pid 34550)...
[INTEG TEST::cdk diff --security-only successfully outputs sso-access-control information] Done (6958 ms).
[INTEG TEST::cdk diff --security-only --fail exits when security diff for sso access control config] Starting (pid 34550)...
[INTEG TEST::cdk diff --security-only --fail exits when security diff for sso access control config] Done (6945 ms).
[INTEG TEST::cdk diff --security-only --fail exits when security diff for sso-perm-set-without-managed-policy] Starting (pid 34550)...
[INTEG TEST::cdk diff --security-only --fail exits when security diff for sso-perm-set-without-managed-policy] Done (7042 ms).
[INTEG TEST::cdk diff --security-only --fail exits when security diff for sso-perm-set-with-managed-policy] Starting (pid 34550)...
[INTEG TEST::cdk diff --security-only --fail exits when security diff for sso-perm-set-with-managed-policy] Done (7131 ms).
[INTEG TEST::cdk diff --security-only --fail exits when security diff for sso-assignment] Starting (pid 34550)...
[INTEG TEST::cdk diff --security-only --fail exits when security diff for sso-assignment] Done (7225 ms).
[INTEG TEST::cdk diff --security-only --fail exits when security changes are present] Starting (pid 34550)...
[INTEG TEST::cdk diff --security-only --fail exits when security changes are present] Done (7124 ms).
[INTEG TEST::cdk diff --quiet does not print 'There were no differences' message for stacks which have no differences] Starting (pid 34550)...
[INTEG TEST::cdk diff --quiet does not print 'There were no differences' message for stacks which have no differences] Done (75387 ms).
[INTEG TEST::cdk diff picks up changes that are only present in changeset] Starting (pid 34550)...
[INTEG TEST::cdk diff picks up changes that are only present in changeset] Done (98624 ms).
 PASS  tests/cli-integ-tests/cli.integtest.js (414.667 s)
  ● Console

    console.log
      Using regions: us-east-1

      at log (../../lib/with-aws.ts:36:11)


Test Suites: 2 skipped, 1 passed, 1 of 3 total
Tests:       90 skipped, 14 passed, 104 total
Snapshots:   0 total
Time:        414.86 s
Ran all test suites with tests matching "cdk diff".
```

### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@mergify mergify bot closed this as completed in #30093 May 10, 2024
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@mbonig
Copy link
Contributor

mbonig commented Oct 4, 2024

Just here to 👍🏻 this issue as it is also a problem with EC2 instances where AMIs are often driven by SSM parameters (like new WindowsImage(WindowsVersion.WINDOWS_SERVER_2019_ENGLISH_FULL_BASE)) which causes a replacement of the EC2 instance despite cdk diff showing no changes.

@kaizencc
Copy link
Contributor

kaizencc commented Oct 9, 2024

#30093 fixed this but was reverted by #30243

@timothy-cloudopsguy
Copy link

timothy-cloudopsguy commented Dec 4, 2024

This might be a separate issue, or all part of the same...

If you do something like this

      #
      # Import our SSL Certificate Arn
      #
      ssl_cert_arn = _ssm.StringParameter.from_string_parameter_name(
          self,
          id="ssl_cert_arn_ssm",
          string_parameter_name=f"/{self.app_name}/ssl/cert_arn"
      )

And then change the string_parameter_name and re-deploy ... i.e. you're pulling the ssl cert from a different ssm parameter...

The cdk finds no differences to the stack and doesn't deploy. The ONLY way to change the string_parameter_name and have cdk diff/deploy pick it up is if you ALSO change the id of the resource in your stack ... Which is very less than ideal.

        #
        # Import our SSL Certificate Arn
        #
        ssl_cert_arn = _ssm.StringParameter.from_string_parameter_name(
            self,
            id="something_different_to_kick_cdk_diff",
            string_parameter_name=f"/{self.core_stack_name}/ssl/cert_arn"
        )

i've noticed this behavior as far back as version 2.84, but i'm running on latest 2.171.1 and it's still an issue this morning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment