feat: use exponential retry strategy #271

Merged
merged 2 commits into from
Aug 17, 2021

@vlastahajek vlastahajek commented Aug 11, 2021

Proposed Changes

This PR aligns the retry strategy implementation with all official InfluxDB client libraries, so that the delay before the next retry is a random value in the interval between retryInterval * exponentialBase^(attempts) and retryInterval * exponentialBase^(attempts+1).
The defaults were changed:

  • retryInterval=5_000
  • exponentialBase=2 (it is now also configurable)
  • maxRetryDelay=125_000
  • maxRetries=5

By default, retry delays (in milliseconds) are randomly distributed within the ranges [5_000-10_000, 10_000-20_000, 20_000-40_000, 40_000-80_000, 80_000-125_000].
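
For illustration, the delay calculation described above could be sketched as follows. This is a minimal sketch, not the code from this PR: the `retryParams` struct, `delayRange`, and `randomDelay` are hypothetical names, attempts are assumed to be counted from 0, both bounds are assumed to be inclusive, and both are assumed to be capped at maxRetryDelay.

```go
package main

import (
	"fmt"
	"math/rand"
)

// retryParams holds the options from the description, in milliseconds.
// The struct and its field names are illustrative assumptions.
type retryParams struct {
	retryInterval   uint
	exponentialBase uint
	maxRetryDelay   uint
}

// delayRange returns the bounds for the given attempt (assumed 0-based):
// retryInterval * exponentialBase^attempt and
// retryInterval * exponentialBase^(attempt+1), both capped at maxRetryDelay.
func delayRange(p retryParams, attempt uint) (lo, hi uint) {
	lo = p.retryInterval
	for i := uint(0); i < attempt; i++ {
		lo *= p.exponentialBase
	}
	hi = lo * p.exponentialBase
	if lo > p.maxRetryDelay {
		lo = p.maxRetryDelay
	}
	if hi > p.maxRetryDelay {
		hi = p.maxRetryDelay
	}
	return lo, hi
}

// randomDelay picks a uniformly distributed delay in [lo, hi].
func randomDelay(p retryParams, attempt uint) uint {
	lo, hi := delayRange(p, attempt)
	if hi == lo {
		return lo
	}
	return lo + uint(rand.Intn(int(hi-lo)+1))
}

func main() {
	p := retryParams{retryInterval: 5_000, exponentialBase: 2, maxRetryDelay: 125_000}
	for attempt := uint(0); attempt < 5; attempt++ {
		lo, hi := delayRange(p, attempt)
		fmt.Printf("attempt %d: range %d-%d ms, sampled %d ms\n",
			attempt, lo, hi, randomDelay(p, attempt))
	}
}
```

With the defaults listed above, `delayRange` yields the five ranges from the description, with the last one cut off at maxRetryDelay.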

A MaxRetryTime option was also added. When the overall time spent retrying exceeds maxRetryTime (180_000 ms by default), the write is no longer retried and fails.

Checklist

  • CHANGELOG.md updated
  • Rebased/mergeable
  • A test has been added if appropriate
  • Tests pass
  • Commit messages are in semantic format

@vlastahajek vlastahajek requested a review from sranka August 11, 2021 20:11
maxRetryInterval uint
// The maximum total retry timeout in millisecond, default 180,000.
Contributor

I didn't find a test herein that proves that it is a maximum retry time since the first retry attempt ... is it really so?

Contributor Author

This line is intended to test that.

Contributor

This is IMHO testing maxRetryInterval, I was looking for a test (or proof) that maxRetryTime works.

Contributor Author

Oh, sorry, this test should test that

Contributor

@sranka sranka Aug 13, 2021

The test is really difficult to understand, as it is not obvious that a second srv.HandleWrite is required to return an error from the previously inserted expired batch ... a comment might have helped me to understand it. Anyway, it is not clear from the code that the second WriteBatch is actually tried immediately; IMHO it is not. Higher knowledge of the execution context is required even for the test. A test that matches the user's expectation is what I was looking for: one that simply checks that an error is signaled after maxRetryTime is reached and that the expired retried item is removed from the retry queue.

Contributor Author

@vlastahajek vlastahajek Aug 13, 2021

Yes, for someone unfamiliar with how the write service works, this seems complicated. I added more comments to improve readability.
One thing that is not obvious is that retries are not scheduled to be sent automatically. Retries are triggered by new writes.

Contributor

@sranka sranka Aug 16, 2021

I was also confused by the fact that WriteBatch can fail even without trying to write the data on input. It was this way before, so it is not in the scope of this PR. Thank you for your explanation.

@vlastahajek vlastahajek requested a review from sranka August 13, 2021 15:00
@vlastahajek vlastahajek merged commit 8b26ae2 into influxdata:master Aug 17, 2021
@vlastahajek vlastahajek deleted the feat/rnd_exp_retry branch August 17, 2021 15:49