Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(aws-ecs): add a warning when the default value for minimumHealthyPercent (50%) is used for a service #31705

Closed
1 of 2 tasks
MarrickLip opened this issue Oct 9, 2024 · 3 comments · Fixed by #31738
Closed
1 of 2 tasks
Labels
@aws-cdk/aws-ecs Related to Amazon Elastic Container effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p3

Comments

@MarrickLip
Copy link
Contributor

MarrickLip commented Oct 9, 2024

Describe the feature

Add a warning message when minimumHealthyPercent isn't set for an ECS service, resulting in the implicit use of CDK's default value of 50%. This warning wouldn't appear if minimumHealthyPercent is explicitly set to 50%.

Use Case

CDK overrides the default value of 100% used by CloudFormation's AWS::ECS::Service and the CreateService API. This allows the number of running tasks to drop by up to 50% during deployments and Fargate maintenance etc.

This is an unsafe default for services which must support a consistent load e.g. handle web traffic via an ALB.

Proposed Solution

Implementation would be similar to the warning when creating an ASG with an explicit desired capacity.

Wording would be along the lines of:

minimumHealthyPercent has not been configured so the default value of 50% is used. The number of running tasks will decrease below the desired count during deployments etc. See [link to this issue]

Other Information

#14277 proposed changing the default value to 50%. This is a breaking change (e.g. it can cause stuck deployments) so I believe a warning is more practical. I also considered implementing this as a cdk-nag rule, but I believe a warning in "mainstream CDK" is warranted as this is a quirk of CDK itself.

I wasn't able to find any context for why CDK chose to override CloudFormation's default value. The ECS Developer Guide says 50% should be used to speed up deployments for services which are "idle for some time and don't have a high utilization rate." (source), but 100% is recommended for stateless workloads:

For example, when deploying a stateless application as an Amazon ECS service, such as a web or API server, customers should deploy multiple task replicas and set the minimumHealthyPercent to 100% (source)


I'm happy to open a PR but just wanted to get thoughts on this before putting pen to paper :)

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

Latest

Environment details (OS name and version, etc.)

N/A

@MarrickLip MarrickLip added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Oct 9, 2024
@github-actions github-actions bot added the @aws-cdk/aws-ecs Related to Amazon Elastic Container label Oct 9, 2024
@MarrickLip MarrickLip changed the title (aws-ecs): add a warning when using default value (50%) for minimumHealthyPercent in an ECS service (aws-ecs): add a warning when the default value for minimumHealthyPercent (50%) is used for a service Oct 9, 2024
@khushail khushail added investigating This issue is being investigated and/or work is in progress to resolve the issue. p2 and removed needs-triage This issue or PR still needs to be triaged. labels Oct 9, 2024
@khushail khushail self-assigned this Oct 9, 2024
@khushail
Copy link
Contributor

khushail commented Oct 9, 2024

Hi @MarrickLip , thanks for reaching out.

I see that in the cloduformation docs, this minimumHealthyPercent is having different values for different scenarios -

The default value for a replica service for minimumHealthyPercent is 100%. The default minimumHealthyPercent value for a service using the DAEMON service schedule is 0% for the AWS CLI, the AWS SDKs, and the APIs and 50% for the AWS Management Console.

Yes, the CDK chose to keep the minimumHealthyPercent to 50 %

https://github.com/aws/aws-cdk/blob/17c81fe8362a0b5dcb59f059ade298a904b2aba3/packages/%40aws-cdk/aws-ecs/lib/base/base-service.ts#L129C1-L136C39

when the AWS suggested it as 100%.

It would be great to display a warning message,so appreciate you volunteering for that.

I would be marking this issue as P3 which means its a good to have feature and would be open for contribution from community as well as team.
Thanks.

@khushail khushail added effort/small Small work item – less than a day of effort effort/medium Medium work item – several days of effort p3 and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. effort/small Small work item – less than a day of effort p2 labels Oct 9, 2024
@khushail khushail removed their assignment Oct 9, 2024
@davidjmemmett
Copy link

Contrary to this, when initially setting up a new ECS stack, minHealthyPercent should be zero for some containers, and then increased later when dependencies (DBs which may be controlled outside of CDK etc) are healthy.

I've just watched a video from this year's re:Invent whereby even the internal EC2 teams have rolling dependencies when bootstrapping a new AWS region, which means that services should be allowed to fail during a deployment until other things have reached a steady state.

@mergify mergify bot closed this as completed in #31738 Jan 7, 2025
mergify bot pushed a commit that referenced this issue Jan 7, 2025
…yPercent (#31738)

### Issue

Closes #31705

### Reason for this change

CDK overrides the default value of 100% used by CloudFormation's `AWS::ECS::Service` and the `CreateService` API. This allows the number of running tasks to drop by up to 50% during deployments and [Fargate maintenance](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-maintenance.html) etc.

This is an unsafe default for services which must support a consistent load e.g. handle web traffic via an ALB.

### Description of changes

A warning appears when the default value is implicitly used. CloudFormation's default is overridden in three different places so multiple warnings are added:
* `BaseService` emits `@aws-cdk/aws-ecs:minHealthyPercent` when `minHealthyPercent` is `undefined`
* `Ec2Service` emits `@aws-cdk/aws-ecs:minHealthyPercentDaemon` when `daemon` is `true` and `minHealthyPercent` is `undefined` 
* `ExternalService` emits `@aws-cdk/aws-ecs:minHealthyPercentExternal` when `minHealthyPercent` is `undefined`

At most one warning appears due to the way the CloudFormation's default is overidden.

`README.md` has been updated for `aws_ecs` and `aws_ecs_patterns` to ensure these examples don't trigger this warning.

### Description of how you validated changes
Unit tests have been added to ensure the warnings appear are there when they should be and not when they shouldn't.

### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Copy link

github-actions bot commented Jan 7, 2025

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 7, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
@aws-cdk/aws-ecs Related to Amazon Elastic Container effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p3
Projects
None yet
3 participants