Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incomplete read in the DockerMultiplexedInputStream #693

Merged
merged 4 commits into from
Nov 23, 2018

Conversation

a-ba
Copy link
Contributor

@a-ba a-ba commented Oct 23, 2018

Fix #628 (which should be reopened) and https://issues.jenkins-ci.org/browse/JENKINS-53669

readInternal() assumed that the 8-byte header of a docker stream will always be fully read in one call. This is just misleaded.

The assumption would be okay if the network always had a high bandwidth and low jitter, but real-world cases we may random delays (because of packet losses, buffering, ...) that causes random read errors, eg:

FATAL: java.io.IOException: Expected 8 bytes application/vnd.docker.raw-stream header, got 2

Note: the PR also fixes a missing exit case when reaching EOF in the stderr channel

readInternal() assumed that the 8-byte header of a docker stream
will always be fully read in one call. This is just misleaded.

This assumption would be okay if the network always had a high
bandwidth and low jitter, but real-world cases we may have jitter
(because of packet losses, buffering, ...) that causes random read
errors, eg:
FATAL: java.io.IOException: Expected 8 bytes application/vnd.docker.raw-stream header, got 2
@a-ba a-ba force-pushed the fix-input-multiplexer branch from 7ce20c1 to 78c0337 Compare October 23, 2018 14:17
Copy link
Member

@pjdarton pjdarton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial changes look good; it looks like a perfect fit to solving #628

The handling of stderr, however, leaves a lot to be desired (mainly due to the original code). I agree that the EOF handling is an improvement, but I think that now we've seen it, it would be wrong to leave it at that and we really ought to "fix it properly".

Ideally, we should also have some unit tests for this code too (if we'd had some to start with, #628 probably would've been found by the tests...)

@pjdarton
Copy link
Member

@ndeloof It looks like you were the original author of this code... Can you confirm that I've understood it correctly?

  • I believe that what it's doing is reading a stream that's of the format <1-byte-type><3 bytes ignored><4-byte-length> that then can repeat.
  • I believe that it's intended to return the "payload" only where the 1-byte-type is 1 (stdout).
  • I believe that the stream can also contain payload where the 1-byte-type is 2 (stderr) and the intent was to record that "not expected" data in a way that allows the Jenkins admin to see it.

So, the existing code's handling of the stderr case (type==2)...

  • should really log to a Logger instead of Jenkins' stderr,
  • should handle an EOF by outputting the incomplete stderr message.

...and would you also agree that an effective unit-test could be to take some example input data paired with expected output, and then check that the code handled it correctly both when it was provided by the multiplexed stream "all at once" and when it's only available one byte at a time.
Or can you think of an easier way of unit-testing it?

@a-ba
Copy link
Contributor Author

a-ba commented Oct 23, 2018

@ndeloof It looks like you were the original author of this code... Can you confirm that I've understood it correctly?

* I believe that what it's doing is reading a stream that's of the format <1-byte-type><3 bytes ignored><4-byte-length> that then can repeat.

* I believe that it's intended to return the "payload" only where the 1-byte-type is 1 (stdout).

* I believe that the stream can also contain payload where the 1-byte-type is 2 (stderr) and the intent was to record that "not expected" data in a way that allows the Jenkins admin to see it.

this is the docker 'stream' format, it is described in the API at:
https://docs.docker.com/engine/api/v1.32/#operation/ContainerAttach

@a-ba a-ba force-pushed the fix-input-multiplexer branch from 78c0337 to 6e6f38d Compare October 24, 2018 09:06
@jakub-bochenski
Copy link

I've been using 1.1.6-rc867.6e6f38dd6134 for 4 days now and didn't get a single "Expected 8 bytes ..." error

lucamilanesio added a commit to GerritCodeReview/gerrit-ci-scripts that referenced this pull request Nov 9, 2018
After the upgrade to the docker-plugin 1.1.5 a random error
of incomplete input stream started to appear, documented at
jenkinsci/docker-plugin#693

Fix the specific communication issue waiting to upgrade to
the next forthcoming 1.1.6

Change-Id: I166550a336b5b954a8ac1a3b7ba14dd33d7a1265
@a-ba a-ba force-pushed the fix-input-multiplexer branch 2 times, most recently from 4dd6c34 to b1e4b95 Compare November 14, 2018 14:27
@a-ba
Copy link
Contributor Author

a-ba commented Nov 14, 2018

I applied the suggested changes + wrote tests for this class

@a-ba a-ba force-pushed the fix-input-multiplexer branch from b1e4b95 to ae5728d Compare November 14, 2018 14:56
@pjdarton pjdarton merged commit 3c1c4e6 into jenkinsci:master Nov 23, 2018
@pjdarton
Copy link
Member

@a-ba Thanks for persevering with this; it's much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Docker Slaves die randomly during build.
4 participants