Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(runner): asset manager helper #13519

Merged
merged 15 commits into from
Jan 7, 2022
Merged

core(runner): asset manager helper #13519

merged 15 commits into from
Jan 7, 2022

Conversation

adamraine
Copy link
Member

This PR kills two birds with one stone:

  1. Accomplishes step 1 of Gather/Audit Mode in User Flows #13364. In a future PR, we can run the new helper function directly in navigation/timespan/snapshot/legacy runners.
  2. Eliminates the url parameter for Runner.run. The URL should be passed into Runner.run as part of the gatherFn closure. This makes it easier to define a navigation runner that does not know the requested URL ahead of time core(fr): user triggered navigations #13496.

Issues
#13364
#11313

@adamraine adamraine requested a review from a team as a code owner December 20, 2021 20:40
@adamraine adamraine requested review from connorjclark and removed request for a team December 20, 2021 20:40
Comment on lines -72 to -74
if (runOpts.url && !URL.equalWithExcludedFragments(runOpts.url, requestedUrl)) {
throw new Error('Cannot run audit mode on different URL');
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this check should be the only functional change to legacy mode. I didn't think it was needed because we can proceed with the requestedUrl from the artifacts. I'd be open to adding this back as a non-fatal warning or something in a different location.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check makes more sense from the context of the CLI. Currently you can do -G and then -A separately but you must use the same URL, otherwise it errors because you probably made a mistake. Perhaps instead, we should have -A mode only use what is saved in the gathering step–url, settings, etc. and throw if a url is also provided

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer just logging a warning, since we could just ignore the passed in URL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm fine with just a warning. -G/A is pretty advanced usage so we don't really need these runtime failures to protect us, IMO

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A report warning, then, not logger. This will otherwise be ignored and is sure to mess me up eventually so a loud warning is necessary :)

Copy link
Member Author

@adamraine adamraine Dec 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer switching to a fatal error in the CLI than that. We would need to send the warning message all the way down to Runner.run where run warnings are collected. Removing URL checks from Runner.run is part of the reason I created this PR.

The warning is quiet, but Lighthouse is still performing the more correct and expected operation IMO.

lighthouse-core/gather/driver/navigation.js Outdated Show resolved Hide resolved
Comment on lines -72 to -74
if (runOpts.url && !URL.equalWithExcludedFragments(runOpts.url, requestedUrl)) {
throw new Error('Cannot run audit mode on different URL');
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check makes more sense from the context of the CLI. Currently you can do -G and then -A separately but you must use the same URL, otherwise it errors because you probably made a mistake. Perhaps instead, we should have -A mode only use what is saved in the gathering step–url, settings, etc. and throw if a url is also provided

@adamraine adamraine changed the title core: asset manager helper core(runner): asset manager helper Dec 21, 2021
Copy link
Member

@paulirish paulirish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eliminates the url parameter for Runner.run. The URL should be passed into Runner.run as part of the gatherFn closure.

I didnt totally understand the motivation here.. but the user-triggered nav usecase is definitely different. still don't totally understand it... cuz there, we don't even have a requestedUrl at all.. but... that's why we have this new 'requestor' thing in 13496. makes sense.

lighthouse-core/runner.js Outdated Show resolved Hide resolved
lighthouse-core/runner.js Outdated Show resolved Hide resolved
Comment on lines -72 to -74
if (runOpts.url && !URL.equalWithExcludedFragments(runOpts.url, requestedUrl)) {
throw new Error('Cannot run audit mode on different URL');
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm fine with just a warning. -G/A is pretty advanced usage so we don't really need these runtime failures to protect us, IMO

@brendankenny
Copy link
Member

Accomplishes step 1 of Gather/Audit Mode in User Flows #13364. In a future PR, we can run the new helper function directly in navigation/timespan/snapshot/legacy runners.

Moving the URL tracking to the artifacts make sense, but can you explain what this enables? All those modes already call into Runner.run() so already support gatherMode/auditMode.

The extracted code is also not independent of the other code in run(). e.g. if gatherAndManageAssets is called from another file, that file will also need the catch block for localizing LHErrors:

} catch (err) {
// i18n LighthouseError strings.
if (err.friendlyMessage) {
err.friendlyMessage = format.getFormatted(err.friendlyMessage, settings.locale);
}
await Sentry.captureException(err, {level: 'fatal'});
throw err;
}

Part of what #12393 ran into in making an audit-runner is that there still needs to be a top-level orchestrator of some sort. We're always going to have UX niceties or issues with our conceptual model that will need to be handled somewhere and that prevent a pure functional artifacts -> audits -> LHR pipeline. Whether or not changing the URL for -A should be a runWarning, piping down runWarnings when a mode might not make a runWarnings array is a good example of the difficulty of an approach of "every mode just call these four functions in a row". Localizing error messages and reporting exceptions are other good examples. These things can all live somewhere, but eventually there's so many implicit ties between the four functions they might as well be called from a single place.

@adamraine
Copy link
Member Author

Moving the URL tracking to the artifacts make sense, but can you explain what this enables?

  1. Fewer string -> Function | string cases to handle in core(fr): user triggered navigations #13496
  2. When gather phase and audit phase are separated, we won't need to backfill the url param for the audit phase.

The extracted code is also not independent of the other code in run(). e.g. if gatherAndManageAssets is called from another file, that file will also need the catch block for localizing LHErrors

This PR is reorganizing for future work in #13364, handling the dependencies is a part of that future work. We can duplicate the catch block and put it inside gatherAndManageAssets to address your example.

Part of what #12393 ran into in making an audit-runner is that there still needs to be a top-level orchestrator of some sort.

The goal of #13364 is not to get rid of the top-level orchestrator. The goal is to give the orchestrator some flexibility in when auditing and gathering happen so we can run gathering phases of all steps before running any auditing phases.

@brendankenny
Copy link
Member

The goal of #13364 is not to get rid of the top-level orchestrator. The goal is to give the orchestrator some flexibility in when auditing and gathering happen so we can run gathering phases of all steps before running any auditing phases.

I think I see where you're coming from now. IIUC:

  • move from run(gatherFn, runOpts) to run(artifacts, runOpts), so artifacts need to be gathered or loaded from disk before runner.run is called, and
  • classic mode and navigation-runner need to share this functionality

Hence the new function.

(the quoted statement above was a little confusing because a user flow won't be able to use gatherAndManageArtifacts, but I finally got it that it will eventually be using run(artifacts, runOpts))

FWIW, this PR may have benefited from being split up with one stone per bird :) Both URL normalization and artifact acquisition are fairly touchy pipelines, and gatherAndManageArtifacts would be easier to review (e.g. for how defensive it needs to be) on its own (and ideally in context of how it will actually be called, though that's trickier).

Overall the change sounds good to me. I'm very excited about moving to a more explicit audit runner (#12393 🥲). I am a little concerned about the extra layers of test mocks (and intricacies of those mocks), but it does seem like a decent bit will be able to be removed as #13364 progresses? e.g. snapshot and timespan runners will be able ditch their mocked runners and only mock gathering, runner won't have to care about mocks at all anymore, etc

lighthouse-core/runner.js Outdated Show resolved Hide resolved
lighthouse-core/runner.js Outdated Show resolved Hide resolved
reset,
};

jest.mock('../../../runner.js', () => runnerModule);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these mock commands are tricky when it comes to es modules, b/c jest hoists them to the top to make them work. https://jestjs.io/docs/manual-mocks#using-with-es-module-imports

can keep it as-is since still commonjs, or just continue to do the jest.mock at the toplevel but instead like

jest.mock('....', createRunnerModuleMock())
// not even sure if this expression with a function call can be hoisted by jest 👀 

no preference atm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this way since it follows the format of mockDriverSubmodules.

@adamraine
Copy link
Member Author

e.g. snapshot and timespan runners will be able ditch their mocked runners and only mock gathering, runner won't have to care about mocks at all anymore, etc

I don't know if #13364 will decrease the number of mocked modules, we might just change what's being mocked. Haven't investigated that much so we can cross that bridge later.

@@ -10,6 +10,7 @@
*/

const {Util} = require('../util-commonjs.js');
const LHError = require('../lib/lh-error.js');
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So turns out this creates a circular dependency that messes with our DT bundle. url-shim.js might not be the best place for this @brendankenny.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runner-helpers.js maybe?

@brendankenny
Copy link
Member

is it the lighthouse-core/lib/i18n/i18n.js -> root.js -> lighthouse-core/lib/url-shim.js -> lighthouse-core/lib/lh-error.js -> lighthouse-core/lib/i18n/i18n.js message? Why is root.js depending on url-shim.js? Something seems messed up. root.js should be independent of the rest of Lighthouse or it's a real issue depending on it in so many places

@brendankenny
Copy link
Member

looks like from here?

rollupPlugins.alias({
entries: {
'debug': require.resolve('debug/src/browser.js'),
'lighthouse-logger': require.resolve('../lighthouse-logger/index.js'),
'url': require.resolve('../lighthouse-core/lib/url-shim.js'),
},
}),

substituting our class derived from the global URL as the the 'url' node module doesn't seem necessary for anything (we always do an explicit url-shim import to use the extra functions) and I don't think it would work in root.js anyways (there's no fileURLToPath on the url-shim export, so would throw here if any devtools code called the readJson function after converting to esmodules)

@adamraine
Copy link
Member Author

adamraine commented Jan 6, 2022

Removing that alias produces this in our DT tests:

Screen Shot 2022-01-06 at 5 11 14 PM

I guess the alias is for pubads then?

@brendankenny
Copy link
Member

lol: #5293 (review)

@brendankenny
Copy link
Member

#13545 to fix url shimming for good

futurama-once-and-for-all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants