What should be the action for build timeouts? #11072
Comments
@ulisesh / @AlitzelMendez / @missymessa - are we able to auto retry the build for timeouts?
I'd imagine that should work, provided we know the timeout error message to look for. Ali can correct me if I'm wrong.
I might be wrong but it looks like the pipeline had a timeout of 90 minutes for the job that timed out. The timeout should be increased.
A build which times out, except in very unusual circumstances, is pretty likely to time out on attempts 2..N. Seems like something we'd explicitly not auto-retry given the alternative workaround (increase the timeout).
That is not what I am seeing. There are a lot of intermittent "slow machine" problems. They affect macOS the most (see #10794), but they show up on other OSes too (windows x86 in this case). A build that times out is very likely to pass on rerun. What would you recommend as a margin for slow machines? If we have a build that typically finishes in 1 hour, what should the timeout be set to in the YAML to account for intermittently slow machines?
Increase timeouts for runtime-dev-innerloop legs to compensate for intermittently slow build machines. Fixes dotnet/arcade#11072
I'd say at least double, i.e. 2 hours for a 1 hour typical case. I am keenly aware of the "slow macOS" problem but it's hopelessly entangled in the "Helix runs take as long as they take because there's only one pool of machines" problem; perhaps you can point to some of these and we can discuss more specifically?
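For illustration, the "at least double" suggestion could look roughly like the fragment below in an Azure Pipelines job definition. This is only a sketch: `timeoutInMinutes` is the standard job-level setting, but the job name, pool, and build command are hypothetical placeholders and are not taken from the dotnet/runtime pipeline.

```yaml
# Sketch only: job name, pool, and build step are hypothetical placeholders.
jobs:
- job: Build_windows_x86_release      # hypothetical job name
  # Typical run takes about 60 minutes; doubling it leaves headroom
  # for intermittently slow build machines.
  timeoutInMinutes: 120
  pool:
    vmImage: windows-latest           # assumption: a Microsoft-hosted agent
  steps:
  - script: build.cmd -c Release      # placeholder build command
    displayName: Build
```

With this in place, a job that normally finishes in an hour is only cancelled once it has run for twice that long.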
Sounds good. It is what I have done in dotnet/runtime#76453
I am going to open a "Known Build Error" issue in dotnet/runtime that will accumulate the timeouts, and we can then see what to do about them.
Build
https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=34958
Build leg reported
Build windows x86 release Runtime_Debug
Pull Request
dotnet/runtime#76386
Action required for the engineering services team
To triage this issue (First Responder / @dotnet/dnceng):
If this is an issue that is causing build breaks across multiple builds and would get benefit from being listed on the build analysis check, follow the next steps:
Additional information about the issue reported
Build timeouts are a relatively common reason for red PRs. I understand that there are a number of factors outside our control that can lead to build timeouts. Still, we need clarity on what one should do about them. They typically go away with a manual retry.
Should we have a "Known Build Error" issue that auto-retries? Or should we have a "Known Build Error" issue that does not auto-retry and just keeps track of how often we are seeing these timeouts?