
Sqs batch failure error stack #762

Merged
merged 3 commits into from
Jan 4, 2022

Conversation

shivang007

What does this implement/fix? Explain your changes.

It passes the error stack along with the message.

Does this close any currently open issues?

No

Any other comments?

Currently, if a worker is shared by multiple methods, it is very difficult (sometimes impossible without running it locally) to pinpoint the exact location where an error occurs. Hence this adds the original stack of the error.
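For illustration, a minimal sketch (not the middleware's actual code) of the problem being fixed: when an error is rethrown with only its message, the original stack is discarded, so the trace points at the rethrow site rather than the worker method that failed.

```javascript
// Minimal sketch: rethrowing with only the message discards the original
// stack, so the new trace no longer mentions the failing worker function.
const workerMethod = () => { throw new Error('boom') }

let rethrown
try {
  workerMethod()
} catch (err) {
  rethrown = new Error(err.message) // err.stack is dropped here
}

// The original frame is gone from the new stack:
console.log(rethrown.stack.includes('workerMethod')) // → false
```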

Where has this been tested?

Node.js Versions: 14.17.6
Middy Versions: 2.5.3
AWS SDK Versions: ^2.1014.0

@shivang007
Author

[Screenshot: log output showing the truncated error stack]

The error stack is normally shown like the logs above (error @middy/sqs-partial-batch-failure/index.js:91:1), which makes it impossible to trace in a large-scale application.

Hence I added the actual error stack along with the message.

It has been tested on a batch and produces output like:

[Screenshot: log output including the original error stack]

@willfarrell
Member

willfarrell commented Dec 19, 2021

What if, instead of attaching the stack traces to the message, we attach them to the error, similar to what we do in @middy/core? What do you think of this approach?

const sqsPartialBatchFailureMiddlewareAfter = async (request) => {
  ...
  const rejectedReasons = getRejectedReasons(response)

  // If all messages were processed successfully, continue and let the
  // messages be deleted by Lambda's native functionality
  if (!rejectedReasons.length) return

  ...

  const errorMessage = getErrorMessage(rejectedReasons)
  const error = new Error(errorMessage)
  error.originalErrors = rejectedReasons
  throw error
}

const getRejectedReasons = (response) => {
  return response
    .filter((r) => r.status === 'rejected')
    .map((r) => r.reason)
}

const getErrorMessage = (rejectedReasons) => {
  return rejectedReasons.map(error => error.message).join('\n')
}

@shivang007
Author

@willfarrell I tried out your code changes. It does add the originalErrors array to the error.
The problem is that, by default, only the error message and stack get logged (hiding originalErrors)!
[Screenshot: default log output without originalErrors]
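The issue can be reproduced without Lambda (a minimal sketch): a custom property attached to an Error is not part of its stack string, so any sink that logs only `error.stack` never surfaces it.

```javascript
// Custom properties on an Error are not included in its stack string,
// so a logger that prints only `error.stack` hides them.
const error = new Error('Test')
error.originalErrors = [new Error('worker A failed')]

// The nested error's message is absent from the outer stack:
console.log(error.stack.includes('worker A failed')) // → false
```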

So instead of this, how about we modify the .stack property of the new Error, as in the first part of the accepted answer here:
https://stackoverflow.com/questions/42754270/re-throwing-exception-in-nodejs-and-not-losing-stack-trace
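A sketch of that technique (the helper name is hypothetical; the idea is simply to append the original stack onto the new error's `.stack`):

```javascript
// Hypothetical helper: rethrow with a combined stack so both the rethrow
// site and the original throw site appear in a single trace.
const rethrowWithStack = (message, originalError) => {
  const error = new Error(message)
  error.stack += '\nCaused by: ' + originalError.stack
  return error
}

const original = new Error('record failed in worker')
const combined = rethrowWithStack('Batch failed', original)
// combined.stack now ends with the original error's trace
```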

@willfarrell
Member

willfarrell commented Dec 22, 2021

Are you using @middy/error-logger? That should print out all nested errors. It might have issues if the nesting is too deep.

@shivang007
Author

Thanks for that!
We were not using @middy/error-logger, but after adding it, I can see the originalErrors getting logged.

I have modified the PR according to your suggestions, and after testing it I receive the following logs (when failing a batch):

Error: Test
Test
    at after (/var/task/webpack:/javascript/@middy/sqs-partial-batch-failure/index.js:93:19)
    at c (/var/task/webpack:/javascript/@middy/core/index.js:120:17)
    at h (/var/task/webpack:/javascript/@middy/core/index.js:88:7) {
  originalErrors: [
    Error: Test
        at calculateGDD (/var/task/webpack:/workers/actualgdd/handler.js:18:11)
        at map (/var/task/webpack:/workers/actualgdd/handler.js:194:11)
        at Array.map (<anonymous>)
        at /var/task/webpack:/workers/actualgdd/handler.js:192:40
        at h (/var/task/webpack:/javascript/@middy/core/index.js:86:32),
    Error: Test
        at calculateGDD (/var/task/webpack:/workers/actualgdd/handler.js:18:11)
        at map (/var/task/webpack:/workers/actualgdd/handler.js:194:11)
        at Array.map (<anonymous>)
        at /var/task/webpack:/workers/actualgdd/handler.js:192:40
        at h (/var/task/webpack:/javascript/@middy/core/index.js:86:32)
  ]
}

@shivang007
Author

@willfarrell, on a similar note,
can we extend the logging in the middleware?
That is, we want our own logger to log the stacks after attaching the metadata we need,
so we can index it in Elasticsearch and visualise it in Kibana.

@willfarrell
Member

There is an error-logging middleware that lets you transform the errors and ship them wherever you like. You have to write that code yourself. Open a new issue if you'd like to discuss further.
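As a sketch of the "transform and ship" idea (a hypothetical helper, not @middy/error-logger's actual API): flatten the error and its attached originalErrors into a plain JSON-serialisable document that a custom logger could index in Elasticsearch.

```javascript
// Hypothetical transform: turn an error (plus nested originalErrors) into
// a flat document suitable for a log shipper; field names are illustrative.
const toLogDocument = (error, metadata = {}) => ({
  ...metadata,
  message: error.message,
  stack: error.stack,
  originalErrors: (error.originalErrors || []).map((e) => ({
    message: e.message,
    stack: e.stack
  }))
})

const batchError = new Error('Batch failed')
batchError.originalErrors = [new Error('record 2 failed')]
const doc = toLogDocument(batchError, { service: 'actualgdd' })
```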

@shivang007
Author

@willfarrell
Sure!
Also, will this PR for sqs-partial-batch-failure be merged and released on npm anytime soon?

@willfarrell
Member

I'm holding off on merging this for a bit. It relates to another issue around handling errors from getInternal and the error-handling improvements planned for v3 (which I'm working on right now). I want to make sure all instances of arrayed errors are handled consistently, so it might be a week or two before I can put it in a release. Apologies for the delay.

@willfarrell willfarrell mentioned this pull request Dec 28, 2021
@shivang007
Author

No problem!
Sounds like a good plan 👍

@willfarrell willfarrell merged commit b9b8ee1 into middyjs:main Jan 4, 2022
@willfarrell
Member

@shivang007 Please take a look at #770, new AWS feature removes the need for throwing an error, so will need a logger instead. Would like your feedback on how errors should be handled for reporting purposes.

@shivang007
Author

@willfarrell
I saw the thread and the updates made by Serverless (using the changes published by AWS). But since they successfully process a batch after catching any errors, a batch will always show as a success, making it impossible for an external monitoring tool (e.g. Lumigo) to detect a failure within the batch. In our opinion, marking a partially failed batch as successful is not the right approach, so we will likely stick with this version and avoid that functionality until it is handled properly.
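For context, a sketch of the AWS feature under discussion (ReportBatchItemFailures; this is illustrative code, not middy's): the handler returns normally and lists only the failed message IDs, which is why the invocation itself reports as a success.

```javascript
// Sketch: build the partial-batch response shape from Promise.allSettled
// results, pairing each settled result with its SQS record.
const buildPartialBatchResponse = (records, settled) => ({
  batchItemFailures: settled
    .map((result, i) =>
      result.status === 'rejected' ? { itemIdentifier: records[i].messageId } : null
    )
    .filter(Boolean)
})

const records = [{ messageId: 'a' }, { messageId: 'b' }]
const settled = [
  { status: 'fulfilled', value: 'ok' },
  { status: 'rejected', reason: new Error('fail') }
]
const response = buildPartialBatchResponse(records, settled)
// → { batchItemFailures: [{ itemIdentifier: 'b' }] }
```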

@willfarrell
Member

Thanks for sharing. I was thinking about that as well. I can see where both ways are valuable depending on the use case:

a. An error happened while processing a message (push to DLQ).
b. The message is still being processed by an external service and should be retried later.

The v3 approach is much better for (b), while the v2 approach is much better for (a).

Looking at other use cases, like API Gateway + Lambda, it is expected that the error is handled and a "success" is returned with a statusCode indicating an error. I imagine monitoring tools handle that use case and display it properly. I would expect that, given time and a little nudging, monitoring tools will handle this use case as well.

The v2 middlewares will work with v3.

@shivang007
Author

Of course, eventually they will adapt!

I also saw your v3 code at https://github.com/middyjs/middy/blob/release/3.x/packages/sqs-partial-batch-failure/index.js
Looks good to me. It accepts a logger function, giving devs the freedom to handle their logs any way they want.

@shivang007
Author

@willfarrell is this released on npm?

@willfarrell
Member

I hope to have an alpha out for the end of the week.
