core(fetcher): fetch over protocol #12199
Conversation
lighthouse-core/gather/driver.js
```js
 * @param {{timeout: number}=} options
 * @return {Promise<string>}
 */
async readIOStream(handle, options = {timeout: 5000}) {
```
I was concerned about spending too much time making an unknown number of calls to `IO.read`, so I added this timeout.
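For reference, a minimal sketch of the guard being described, assuming a `driver` object whose `sendCommand` issues DevTools protocol commands; the names and error text here are illustrative rather than the PR's exact code:

```js
/**
 * Read a DevTools protocol IO stream to completion, bailing out when the
 * repeated IO.read calls collectively exceed `timeout` ms.
 * (Base64 chunk handling is omitted here; see the decoding thread below.)
 * @param {{sendCommand: function(string, Object=): Promise<*>}} driver
 * @param {string} handle
 * @param {{timeout: number}=} options
 * @return {Promise<string>}
 */
async function readIOStream(driver, handle, options = {timeout: 5000}) {
  const startTime = Date.now();
  let data = '';
  let eof = false;
  while (!eof) {
    if (Date.now() - startTime > options.timeout) {
      throw new Error('Waiting for the end of the IO stream exceeded the allotted time.');
    }
    const response = await driver.sendCommand('IO.read', {handle});
    data += response.data;
    eof = response.eof;
  }
  await driver.sendCommand('IO.close', {handle});
  return data;
}
```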
lighthouse-core/gather/driver.js
```js
if (ioResponse.base64Encoded) {
  data = Buffer.from(data, 'base64').toString('utf-8');
```
This seems like it would break if there is more than one chunk of base64 data: each chunk gets deserialized and saved to the same `data` variable.

I think if one IO response is encoded, all will be. You could just store the data string as it comes over the protocol, and then at the end of the function check the encoded flag and do a `Buffer.from`.

EDIT: actually, base64 isn't a streamable format (you can't concat the encoded data and decode it), so you've got to decode each chunk as you go. So just do `Buffer.from` on each `ioResponse` before adding to `data`.
So the data isn't encoded in base64 before splitting into chunks. Each chunk is encoded in base64 individually after splitting.
Am I understanding this correctly?
I have no idea; what does the Chromium source suggest? Whatever you find out, it'd be good to document it in the protocol definition file.
Looks like the chunks are encoded after splitting.
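Given that, the read loop would decode each response individually before appending, roughly like the following sketch (assumed to run inside an async function with `driver` and `handle` in scope):

```js
let data = '';
let eof = false;
while (!eof) {
  const ioResponse = await driver.sendCommand('IO.read', {handle});
  // Each chunk is base64-encoded individually after splitting, so decode
  // per chunk; concatenated base64 chunks can't be decoded in one pass.
  data += ioResponse.base64Encoded ?
    Buffer.from(ioResponse.data, 'base64').toString('utf-8') :
    ioResponse.data;
  eof = ioResponse.eof;
}
```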
lighthouse-core/gather/driver.js
```js
/**
 * @param {string} handle
 * @param {{timeout: number}=} options
 * @return {Promise<string>}
```
`string` is fine for now and for any use case I can envision in the future. If we wanted to fetch image data, this function would need to return a `Buffer`, and we'd need an option in `fetchResourceOverProtocol` to return data as a `Buffer` or a string (or just return a `Buffer` and have the caller deal with it).
lighthouse-core/gather/fetcher.js
```js
 * @param {{timeout: number}=} options timeout is in ms
 * @return {Promise<string>}
 */
async _fetchResourceOverProtocol(url, options = {timeout: 500}) {
```
this timeout doesn't apply to the actual network request, though, which seems like more of a worry than streaming the response from the backend?
We could set the timeout on fetching the resource, then set the remaining time as the timeout for reading the IO stream.
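A sketch of that budget-splitting idea; `frameId` and the `Network.loadNetworkResource` option values here are illustrative assumptions, not necessarily the PR's code:

```js
async function fetchResourceOverProtocol(driver, frameId, url, options = {timeout: 500}) {
  const startTime = Date.now();
  // Spend part of the budget asking the backend to fetch the resource...
  const response = await driver.sendCommand('Network.loadNetworkResource', {
    frameId,
    url,
    options: {disableCache: true, includeCredentials: true},
  });
  // ...then hand whatever time remains to the IO stream read.
  const remainingTimeout = options.timeout - (Date.now() - startTime);
  return driver.readIOStream(response.resource.stream, {timeout: remainingTimeout});
}
```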
@brendankenny What do you think of the new timeout structure?
```diff
@@ -28,7 +28,7 @@ class SourceMaps extends Gatherer {
    */
   async fetchSourceMap(driver, sourceMapUrl) {
     /** @type {string} */
-    const sourceMapJson = await driver.fetcher.fetchResource(sourceMapUrl, {timeout: 1500});
+    const sourceMapJson = await driver.fetcher.fetchResource(sourceMapUrl, {timeout: 10000});
```
what's up with this timeout bump?
I was hitting the 1500 timeout pretty consistently on preactjs.com
> I was hitting the 1500 timeout pretty consistently on preactjs.com

10s might be too long to block on a single gatherer, though :/

Was this before or after you added `_fetchResourceOverProtocol`, though? Because if it was after and with the current timeout approach, that would mean just `readIOStream` is often taking longer than 1.5s?
It was after. The iframe method completed within 1.5s.
Never mind, I just tested again and it's timing out with the iframe as well.
hmm, I think something else is going on with preactjs.

For the iframe method, I can now repro the file download behavior from #12064 using the CLI for the file `bundle.dd34e.esm.js.map`. That looks like it never resolves the `requestInterceptionPromise`, and so it always times out no matter how long I set the timeout.

For the `Network.loadNetworkResource` approach it never times out for me. Instrumenting the function and running it several times, I get a mean time of 176ms for the `Network.loadNetworkResource` call and 14ms for `driver.readIOStream()`, way below the timeout.
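For reference, one way to get measurements like these is a small timing wrapper around each call; `timed` is a hypothetical helper, not part of the PR:

```js
/** Await a promise-returning thunk and log how long it took. */
async function timed(label, thunk) {
  const start = process.hrtime.bigint();
  const result = await thunk();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(1)}ms`);
  return result;
}

// Usage: const content = await timed('readIOStream', () => driver.readIOStream(handle));
```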
Yeah this appears to be an issue with preactjs.com. I'm successfully fetching source maps on other sites. Leaving this timeout as 1500 should be good.
It's not all bad though... the timing issue appears to be fixed in Canary. I have run the new fetcher 50+ times over the past 2 days and never experienced a timeout with Canary. I posted a couple of bisects in the Chromium issue but unfortunately haven't identified the commit responsible for fixing this. @connorjclark suggested this CL, but it wasn't in any of my bisects.
This was the only recent CL I found touching the relevant stream files in Chromium.
Possible candidate CL:
I did 20 runs each of Chrome 90 and Chrome 92 (of the smoke test
This is certainly the right CL; the regression mimics previous issues we had with startup and tasks: https://chromium-review.googlesource.com/c/chromium/src/+/1882029
If we double-check that
LGTM!
Shame we have to wait for Chrome 92 :/
with Chrome stable but using
so I think we should be good. Note to future selves: don't just increase the timeout :P
Yeah, a timeout of 1 causes the smoke test to fail.
Adds `fetchResourceOverProtocol`, which uses `Network.loadNetworkResource` instead of injecting an iframe. `SourceMaps` uses the new function, but keeps the old fetcher as a fallback.

Part of #12070
Closes #12064
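In rough shape, the fetch-with-fallback behavior described above might look like the sketch below; the helper names and the try/catch fallback condition are assumptions (the PR may instead gate on Chrome version support for `Network.loadNetworkResource`):

```js
async function fetchResource(driver, frameId, url, options = {timeout: 500}) {
  try {
    // Preferred path: have the browser fetch the resource directly over
    // the protocol, with no iframe injection.
    return await fetchResourceOverProtocol(driver, frameId, url, options);
  } catch (err) {
    // Fall back to the pre-existing iframe-based fetcher, e.g. when
    // Network.loadNetworkResource isn't supported by this Chrome.
    return fetchResourceIframe(driver, url, options); // hypothetical fallback
  }
}
```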