(logs): Policy Limit Reached for LogGroup ResourcePolicies #20313

Closed
badaldavda8 opened this issue May 12, 2022 · 15 comments · Fixed by #28495
Labels
@aws-cdk/aws-ecs Related to Amazon Elastic Container @aws-cdk/aws-iam Related to AWS Identity and Access Management @aws-cdk/aws-logs Related to Amazon CloudWatch Logs bug This issue is a bug. effort/medium Medium work item – several days of effort p1

Comments

@badaldavda8

badaldavda8 commented May 12, 2022

Describe the bug

After the resolution of #17544, a new issue appears if you create multiple ECS tasks that refer to multiple log groups.

Resource handler returned message: "Resource limit exceeded. (Service: CloudWatchLogs, Status Code: 400, Request ID: 25bec134-657e-43c3-ae85-810a0ce56fa0)" (RequestToken: 948dab8b-fac6-2903-695d-f9d825eaea90, HandlerErrorCode: ServiceLimitExceeded)
This is caused by the default quota for resource policies:

Resource policies: Up to 10 CloudWatch Logs resource policies per Region per account. This quota can't be changed.
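
Because the quota is counted per Region and per account, one rough way to see how close a deployment gets is to count the `AWS::Logs::ResourcePolicy` resources in the synthesized templates. The following is a minimal sketch, assuming the templates are already parsed from JSON (for example from the files `cdk synth` writes under `cdk.out`). `TemplateLike`, `listLogsResourcePolicies`, and `exceedsQuota` are hypothetical helper names, not CDK APIs, and the number of policies that already exist in the Region is assumed to be supplied by the caller.

```typescript
// Hypothetical helpers (not part of aws-cdk-lib): count AWS::Logs::ResourcePolicy
// resources in already-parsed CloudFormation templates.

// Only the fields this sketch reads from a template.
interface TemplateLike {
  Resources?: Record<string, { Type: string }>;
}

// Quota quoted in this issue: 10 resource policies per Region per account.
const RESOURCE_POLICY_QUOTA = 10;

// Returns the logical IDs of all CloudWatch Logs resource policies in one template.
function listLogsResourcePolicies(template: TemplateLike): string[] {
  return Object.entries(template.Resources ?? {})
    .filter(([, resource]) => resource.Type === "AWS::Logs::ResourcePolicy")
    .map(([logicalId]) => logicalId);
}

// True when the policies declared across the given templates, plus any that
// already exist in the Region (supplied by the caller), exceed the quota.
function exceedsQuota(templates: TemplateLike[], existingPolicies = 0): boolean {
  const declared = templates.reduce(
    (count, template) => count + listLogsResourcePolicies(template).length,
    0,
  );
  return existingPolicies + declared > RESOURCE_POLICY_QUOTA;
}
```

This only sees policies declared in the templates it is given, so it is a lower bound: policies created by other stacks or by hand in the same Region count against the same quota.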

Expected Behavior

No error after the 10th ECS task/service.

Current Behavior

Each ECS task creates a new log group, and each log group gets its own resource policy, which eventually exhausts this limit.

Reproduction Steps

Create 10 log groups for ECS tasks and you will start to hit this error.

Possible Solution

Avoid creating resource policies for CloudWatch Logs until this issue is resolved. I understand this defeats the purpose of least privilege, but the current behavior causes issues.

Additional Information/Context

Workaround

Create the log group separately if it is currently created within the task definition, and add the following to the code for now.

logGroup.node.tryRemoveChild('Policy')

CDK CLI Version

2.24.0

Framework Version

No response

Node.js Version

v17.9.0

OS

macOS

Language

TypeScript

Language Version

No response

Other information

No response

@badaldavda8 badaldavda8 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels May 12, 2022
@github-actions github-actions bot added the @aws-cdk/aws-ecs Related to Amazon Elastic Container label May 12, 2022
@badaldavda8 badaldavda8 changed the title (module name): (short issue description) aws-cdk-lib.aws_logs module: Policy Limit Reached for LogGroup ResourcePolicies May 12, 2022
@github-actions github-actions bot added the aws-cdk-lib Related to the aws-cdk-lib package label May 12, 2022
@madeline-k madeline-k changed the title aws-cdk-lib.aws_logs module: Policy Limit Reached for LogGroup ResourcePolicies (logs): Policy Limit Reached for LogGroup ResourcePolicies May 16, 2022
@madeline-k madeline-k added p1 @aws-cdk/aws-iam Related to AWS Identity and Access Management effort/medium Medium work item – several days of effort @aws-cdk/aws-logs Related to Amazon CloudWatch Logs and removed aws-cdk-lib Related to the aws-cdk-lib package needs-triage This issue or PR still needs to be triaged. labels May 16, 2022
@madeline-k madeline-k removed their assignment May 16, 2022
@badaldavda8
Author

Hi, any update on this?

@badaldavda8
Author

Any update on this?

@brignolij

Hi,
Any update on this? I'm facing the same issue.

@tt-fozail

Same problem. Please take a look at this, @comcalvi.

@aaa-khoa

I'm facing the same issue. Any update on this? I've tried using L1 constructs, and a resource policy is still created whenever a log group is attached to a pipeline project. Is there a workaround for this?

@xmatusmic

Same issue!

@jaredcnance

jaredcnance commented Mar 27, 2023

LogGroups do not support ResourcePolicies being attached directly. The only use case for ResourcePolicies today is to enable vended data (see this). I don't think CDK should be creating these policies at all unless the customer is trying to enable vended data, which is not the case for ECS because ECS uses a role in the customer's account to write data.

@kevinswarner

I believe this issue is causing problems for our attempts to deploy Fargate ECS tasks. I have not seen a resolution yet and am wondering if anyone has advice. We are simply defining our Fargate tasks like this...

taskDefinition.addContainer("SampleContainer", {
  image: aws_ecs.RepositoryImage.fromEcrRepository(repository, commitHash),
  essential: true,
  logging: aws_ecs.LogDriver.awsLogs({
    streamPrefix: "sample-container-prefix",
  }),
});

We have several (probably more than 10 or 15) tasks spread across different Fargate instances. We are now getting the following CDK error when attempting to deploy...

indexing-gc-dev-error-worker | 1/7 | 4:48:36 PM | CREATE_FAILED |
AWS::Logs::ResourcePolicy | ErrorWorkerTaskDefinition/ErrorWorkerContainer/
LogGroup/Policy/ResourcePolicy (ErrorWorkerTaskDefinitionErrorWorkerContainerLog
GroupPolicyResourcePolicy402AEAF5) Resource handler returned message: "Resource 
limit exceeded. (Service: CloudWatchLogs, Status Code: 400, Request ID: 
333b2ee5-001e-4a15-9ea55a7b726a4cde)" (RequestToken: 
faeb374a-d7a0-1850-e977-6cc73706e858, HandlerErrorCode: ServiceLimitExceeded)

If I remove the "logging" property from the config object, we are able to deploy. But, I was under the impression that adding the logging is necessary to get logs to be delivered to CloudWatch.

I appreciate any help with this.

@xmatusmic

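Try this. What we do, and it works, is:

from aws_cdk import RemovalPolicy, aws_logs as logs

# Inside a Stack's __init__ (self is the Stack):
log_group = logs.LogGroup(
    self,
    "log-group-name",
    log_group_name="/ecs/log-group-name",
    removal_policy=RemovalPolicy.DESTROY,
)

# Drop the auto-generated resource policy child, if one exists.
# Call this after the log group has been attached to the container.
log_group.node.try_remove_child("Policy")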

@kevinswarner

@xmatusmic Thanks for the suggestion. I tried what I think you suggested, but received the same error. Here is my full code example for the stack...

import {
  App,
  Stack,
  StackProps,
  RemovalPolicy,
  aws_ecr,
  aws_iam,
  aws_ecs,
  aws_ec2,
  aws_elasticache,
  aws_logs,
} from "aws-cdk-lib";

interface ParseWorkerStackProps extends StackProps {
  environment: string;
  commitHash: string;
  isGCEnvironment: boolean;
  vpc: aws_ec2.IVpc;
  executionRole: aws_iam.IRole;
  taskRole: aws_iam.IRole;
  repository: aws_ecr.IRepository;
  redis: aws_elasticache.CfnReplicationGroup;
  cluster: aws_ecs.ICluster;
}

class ParseWorkerStack extends Stack {
  constructor(
    scope: App,
    id: string,
    {
      commitHash,
      isGCEnvironment,
      vpc,
      executionRole,
      taskRole,
      repository,
      redis,
      cluster,
      ...props
    }: ParseWorkerStackProps
  ) {
    super(scope, id, props);

    const executionRoleReference = aws_iam.Role.fromRoleArn(
      this,
      "ExecutionRole",
      executionRole.roleArn
    );

    const taskRoleReference = aws_iam.Role.fromRoleArn(
      this,
      "TaskRole",
      taskRole.roleArn
    );

    const logGroup = new aws_logs.LogGroup(this, "ParseWorkerLogGroup", {
      removalPolicy: RemovalPolicy.DESTROY,
    });
    logGroup.node.tryRemoveChild("Policy");

    const taskDefinition = new aws_ecs.FargateTaskDefinition(
      this,
      "ParseWorkerTaskDefinition",
      {
        executionRole: executionRoleReference,
        taskRole: taskRoleReference,
        memoryLimitMiB: 16384,
        cpu: 8192,
      }
    );

    taskDefinition.addContainer("ParseWorkerContainer", {
      image: aws_ecs.RepositoryImage.fromEcrRepository(repository, commitHash),
      essential: true,
      logging: new aws_ecs.AwsLogDriver({
        logGroup,
        streamPrefix: "indexing-parse-worker-",
      }),
    });

    const securityGroup = new aws_ec2.SecurityGroup(this, "SecurityGroup", {
      vpc,
      allowAllOutbound: true,
    });

    new aws_ecs.FargateService(this, "ParseWorkerFargateService", {
      cluster,
      taskDefinition,
      desiredCount: isGCEnvironment ? 1 : 3,
      maxHealthyPercent: 200,
      minHealthyPercent: 100,
      securityGroups: [securityGroup],
      vpcSubnets: {
        subnetGroupName: "application_layer",
      },
      circuitBreaker: {
        rollback: !isGCEnvironment,
      },
    });
  }
}

export default ParseWorkerStack;

This seems like a fairly significant limitation, but I may be misunderstanding things. We have plans to stand up many Fargate ECS tasks and services, but it seems that this problem will not allow us to do this IF we want logging to CloudWatch. IF I remove the "logging" option on the addContainer, the service and tasks deploy fine, but we do not get any logging at all in CloudWatch.

@xmatusmic

logGroup.node.tryRemoveChild("Policy");
It will work if you move this line to the bottom of your code.

@kevinswarner

Thanks @xmatusmic That seems to work. I think the reason this works (but I am guessing a little here) is that the "addContainer" call with the logging property is adding the policy, so you have to place the "tryRemoveChild" call after the "addContainer" call. Would be great for someone from the CDK team to provide even more insight into this. But, I at least can now deploy multiple Fargate tasks / services.

@xmatusmic

Good :) Yeah, it's exactly what you said: it has to be done after you add the log configuration to the container.
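
The ordering can be illustrated with a toy model. This is a sketch only: `ToyLogGroup` is a hypothetical stand-in and not the aws-cdk-lib `LogGroup`, and it assumes (as guessed above) that the "Policy" child does not exist until attaching the log group to a container triggers a grant. Under that assumption, removing the child before the grant is a no-op, and the policy comes back afterwards.

```typescript
// Toy model only: ToyLogGroup is a hypothetical stand-in, NOT the aws-cdk-lib LogGroup.
class ToyLogGroup {
  private readonly children = new Set<string>();

  // Assumption: attaching the log group to a container triggers a grant,
  // and the grant creates the "Policy" child the first time it is needed.
  grantWrite(): void {
    this.children.add("Policy");
  }

  // Mimics the workaround call from the thread: removes the child if present
  // and reports whether anything was removed.
  tryRemoveChild(name: string): boolean {
    return this.children.delete(name);
  }

  hasPolicy(): boolean {
    return this.children.has("Policy");
  }
}

// Order used in the failing stack: remove first, then attach.
function removeThenAttach(): boolean {
  const logGroup = new ToyLogGroup();
  logGroup.tryRemoveChild("Policy"); // no-op: nothing to remove yet
  logGroup.grantWrite(); // the policy child appears afterwards
  return logGroup.hasPolicy();
}

// Order that worked: attach first, then remove.
function attachThenRemove(): boolean {
  const logGroup = new ToyLogGroup();
  logGroup.grantWrite();
  logGroup.tryRemoveChild("Policy");
  return logGroup.hasPolicy();
}
```

In this model `removeThenAttach()` still ends with a policy and `attachThenRemove()` does not, which matches what was observed in the stack above.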

@mergify mergify bot closed this as completed in #28495 Jan 10, 2024
@mergify mergify bot closed this as completed in 5f96d13 Jan 10, 2024

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.


GavinZZ pushed a commit that referenced this issue Jan 11, 2024
This PR avoids creating an unnecessary `ResourcePolicy` in CloudWatch Logs.

The related issue reports an error when using the awslogs driver on ECS.
This error is caused by the creation of a ResourcePolicy in CloudWatch Logs that reaches the maximum number of ResourcePolicies.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html

In some cases, this ResourcePolicy will be created and in other cases it will not be created.
Currently, `Grant.addToPrincipalOrResource` is used to grant permissions to ExecutionRole and Log Group in the ECS taskDef.
https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-ecs/lib/log-drivers/aws-log-driver.ts#L138
https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-logs/lib/log-group.ts#L194
https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-iam/lib/grant.ts#L122

`Grant.addToPrincipalOrResource` first grants permissions to the Grantee (ExecutionRole) and, when certain conditions are not met, also creates a resource-based policy for cross-account access.
Whether that happens depends on the `principalAccount` of the ExecutionRole, the account ID in `env.account`, and whether either of them is a Token. In this scenario, cross-account access is not necessary.
https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-iam/lib/grant.ts#L141
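
The condition described above can be sketched as a small decision function. This is a simplified model with hypothetical names (`Account`, `compareAccounts`, `fallsBackToResourcePolicy`), not the aws-cdk-lib source. It assumes that two unresolved accounts are treated as the same account, and that an unknown principal account, or a comparison where only one side is resolved, does not count as same-account.

```typescript
// Simplified model of the fallback condition; hypothetical names, not the aws-cdk-lib source.

// An account ID is either a concrete string or an unresolved deploy-time token.
type Account =
  | { kind: "concrete"; id: string }
  | { kind: "token"; name: string };

type Comparison = "SAME" | "DIFFERENT" | "ONE_UNRESOLVED" | "BOTH_UNRESOLVED";

function compareAccounts(a: Account, b: Account): Comparison {
  if (a.kind === "concrete" && b.kind === "concrete") {
    return a.id === b.id ? "SAME" : "DIFFERENT";
  }
  if (a.kind === "token" && b.kind === "token") {
    return a.name === b.name ? "SAME" : "BOTH_UNRESOLVED";
  }
  return "ONE_UNRESOLVED";
}

// True when, in this model, the grant falls through to a resource policy.
function fallsBackToResourcePolicy(
  principalGrantSucceeded: boolean,
  principalAccount: Account | undefined,
  resourceAccount: Account,
): boolean {
  // Assumption: an unknown principal account cannot be shown to be same-account.
  const comparison =
    principalAccount === undefined
      ? undefined
      : compareAccounts(principalAccount, resourceAccount);
  // Assumption: two unresolved accounts are treated as the same account.
  const sameAccount = comparison === "SAME" || comparison === "BOTH_UNRESOLVED";
  return !(principalGrantSucceeded && sameAccount);
}
```

In this model, a role whose account is unknown, or a role with a concrete account in a stack whose `env.account` is a token, ends up with a resource policy even though both may live in the same account.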

Also, when the `LogGroup.grantWrite` call was added to `aws-log-driver.ts`, a ResourcePolicy for logs could not yet be created from CloudFormation, so permissions were granted only to the ExecutionRole.
#1291
![Screenshot 2023-12-27 1 08 20](https://github.com/aws/aws-cdk/assets/58683719/5a17a041-d560-45fa-bac6-cdc3894b18bc)
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/ReleaseHistory.html

Therefore, the resource-based policy should not be necessary when using the awslogs driver.

This PR changes the awslogs driver to grant permissions only to the ExecutionRole.
With this fix, ResourcePolicy will no longer be created when using the awslogs driver.
I don't consider this a breaking change, as it changes the content of the generated template, but does not change the behavior of forwarding logs to CloudWatch Logs.
However, if this is a breaking change, I think it is necessary to use the feature flag.

fixes #22307, fixes #20313

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*