
CLS: Initial Layout Shift #12314

Closed
benschwarz opened this issue Mar 31, 2021 · 5 comments
@benschwarz
Contributor

benschwarz commented Mar 31, 2021

I was hoping to open a discussion around CLS, Lighthouse and RUM measurement.

From a Chrome/CrUX/Core Web Vitals perspective

CLS is collected while a user interacts with the page. Scrolling, SPA navigations, and custom accordion elements (I guess?) can all cause layout shifts. Chrome collects CLS (alongside other metrics) and delivers it to Google for use in search ranking evaluation; the data is also surfaced publicly via CrUX.
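
For reference, here is a simplified sketch of how a field/RUM collector accumulates layout shifts via the Layout Instability API (real collectors such as the web-vitals library add more handling, e.g. session windowing and reporting on visibility change):

```js
// Minimal field-style CLS accumulation (simplified sketch).
let cls = 0;

new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries()) {
    // Shifts within 500ms of user input are excluded from CLS by definition.
    if (!entry.hadRecentInput) {
      cls += entry.value;
    }
  }
}).observe({type: 'layout-shift', buffered: true});
```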

From a Lighthouse perspective

CLS is measured using the viewable area of the browser (the initial viewport); any elements that are rendered and then shifted count towards CLS.
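
For comparison, a minimal sketch of reading the lab value from a programmatic Lighthouse run (assuming the lighthouse and chrome-launcher Node packages and the 'cumulative-layout-shift' audit id; only shifts observed during the synthetic page load contribute):

```js
// Hypothetical sketch: reading the lab CLS value from a programmatic
// Lighthouse run (CommonJS usage as of LH 7/8).
const lighthouse = require('lighthouse');
const chromeLauncher = require('chrome-launcher');

async function labCls(url) {
  const chrome = await chromeLauncher.launch({chromeFlags: ['--headless']});
  const {lhr} = await lighthouse(url, {
    port: chrome.port,
    onlyCategories: ['performance'],
  });
  await chrome.kill();
  // Only layout shifts observed during the synthetic page load contribute.
  return lhr.audits['cumulative-layout-shift'].numericValue;
}
```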


By definition, CLS is Cumulative, so when there isn't a user at the wheel, the reported metrics will be very different from the Core Web Vitals reported by CrUX. That isn't something that can be directly fixed, but it's a discrepancy that isn't clearly explained in Google's documentation, AFAIK.

Some questions:

  • Should CLS in Lighthouse be renamed (my suggestion: Initial Layout Shift) to more accurately describe what it measures?
  • Would it make sense for Lighthouse to scroll the page during the test, so that Layout Shifts are collected for the whole page? (Giving people an idea of WHICH pages cause the most Layout Shifts.) See the sketch after this list.
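
A rough sketch of that scrolling idea (this is not how Lighthouse works today; the Puppeteer usage and names like scrollAndMeasure and window.__cls are purely illustrative):

```js
// Scroll the page to the bottom in viewport-sized steps so post-load layout
// shifts are triggered and picked up by an injected layout-shift observer.
const puppeteer = require('puppeteer');

async function scrollAndMeasure(url) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Start observing before navigation so early shifts are not missed.
  await page.evaluateOnNewDocument(() => {
    window.__cls = 0;
    new PerformanceObserver((list) => {
      for (const entry of list.getEntries()) {
        if (!entry.hadRecentInput) window.__cls += entry.value;
      }
    }).observe({type: 'layout-shift', buffered: true});
  });

  await page.goto(url, {waitUntil: 'networkidle0'});

  // Scroll the full document height, pausing briefly at each step.
  await page.evaluate(async () => {
    const step = window.innerHeight;
    for (let y = 0; y < document.body.scrollHeight; y += step) {
      window.scrollTo(0, y);
      await new Promise((resolve) => setTimeout(resolve, 200));
    }
  });

  const cls = await page.evaluate(() => window.__cls);
  await browser.close();
  return cls;
}
```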
@adamraine
Member

I think this is related to #11313. Auditing user flows should allow us to identify more layout shifts after user interaction.

@addyosmani
Member

addyosmani commented Apr 1, 2021

Thanks, Ben! We have thought about this problem in a few different ways over the last few quarters:

  • Evaluated annotation of lab-based metrics to better distinguish them from the field (e.g. "Lab CLS", "CLS during page-load", "CLS (until load)", etc.)
  • Evaluated giving lab metrics a completely different name to minimize confusion with their field counterparts (closer to your current proposal). We determined the value of doing this didn't outweigh the confusion it might introduce by appearing too different from the overall metric name.
  • Evaluated displaying the lab value for a metric as a point on the field distribution (to help convey it's not a complete picture).

What we are settling on (to date) is the idea of a design system for Core Web Vitals where all of Google's tools could consistently show iconography to denote whether a metric is a lab or field variant. We would share this system publicly so that other tools could also reuse the same UX if they wished.

A tooltip would display a summary of the nuances, such as whether the metric was only measuring a certain window of time (i.e. ideally achieving the same goal as calling the metric "initial layout shift"). We are still iterating on mocks for this idea but will share something here once we are further along.

Longer term (later this year, 2021), support for user flows will allow us to reflect a more accurate picture of the cumulative nature of shifts (such as the impact clicks, scrolls, and other interactions can have) as we move beyond cold page-load measurement.

@thefoxis

thefoxis commented Apr 7, 2021

👋 @adamraine and @addyosmani!

Thanks for the context! I'd like to throw in my two cents, which come both from talking to numerous teams (and the community at large) and from the perspective of someone building a speed monitoring tool.

  • Evaluated annotation of lab-based metrics to better distinguish them from the field (e.g. "Lab CLS", "CLS during page-load", "CLS (until load)", etc.)
  • Evaluated giving lab metrics a completely different name to minimize confusion with their field counterparts (closer to your current proposal). We determined the value of doing this didn't outweigh the confusion it might introduce by appearing too different from the overall metric name.

I think the first option could potentially work, although it's worth recalling the example of the difference between First Input Delay and Total Blocking Time. Of course, TBT isn't a replacement for FID, but it's a recommended metric to keep an eye on when monitoring Core Web Vitals in a lab setting versus the field. I haven't seen any confusion from the community about this distinction. This makes me doubt whether having two metric names, for example Cumulative Layout Shift and Initial Layout Shift, would be detrimental, as they are very descriptive.

In a way, we can assume that "more metrics = more problems", but that's not what I've observed so far. Teams and the community struggle with a lack of clarity surrounding metrics and their collection, not with the number of metrics or their naming. A perfect example is the large number of teams working to meet Core Web Vitals goals who don't understand or know the following:

  • the Performance Score doesn't affect ranking (directly)
  • how CWV are collected by Google and how they can mimic that setting themselves (see the sketch after this list)
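
As a sketch of that second point, field-style collection can be mimicked with the web-vitals library (getCLS is the v1/v2 API name; the '/analytics' endpoint is a placeholder for your own RUM backend):

```js
// Minimal RUM-style CLS monitoring with the web-vitals library.
import {getCLS} from 'web-vitals';

getCLS((metric) => {
  // Reported as the page is backgrounded/unloaded, which is close to how
  // CrUX collects the value from real Chrome users.
  navigator.sendBeacon('/analytics', JSON.stringify({
    name: metric.name,
    value: metric.value,
  }));
});
```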

Many teams (often new to the world of performance) are now working to meet the May 2021 CWV goals, and they are struggling with the lack of transparency and clear messaging. If someone is working on improving CLS and monitoring it solely in lab settings via Lighthouse, they might currently be getting a very misleading result. While user flows might help with this issue, they aren't currently present in Lighthouse and seem to be still a while off. So, currently, when using synthetic testing, those teams might not meet the thresholds because CLS is reported very differently between lab and field (and field is what matters most from the SEO/ranking perspective, since it comes from CrUX).

Lack of trust, doubts about reliability, and the knowledge barrier are known problems hindering performance efforts, or the adoption of investment in speed altogether. Observing different results between lab and field CLS because the collection is so different (but under the same name) contributes to that distrust and to questioning reliability. There needs to be a firm, clear distinction that leaves no doubt for both speed experts and newcomers.

What we are settling on (to date) is the idea of a design system for Core Web Vitals where all of Google's tools could consistently show iconography to denote whether a metric is a lab or field variant. We would share this system publicly so that other tools could also reuse the same UX if they wished.

While this might work well for Google's tooling, I'm unsure about its impact on the wider range of performance platforms. As someone who makes decisions on the product and design language of one of the monitoring tools, I doubt we'd adopt iconography as a way of making the distinction. I don't feel it's enough to ensure the difference is clearly communicated. I also think that if that's the only differentiator, it's not only very subtle, it would also require all tools to use this visual language so there's no confusion for teams and the community. And while a lot of speed platforms follow Google's leadership in perf (features and metrics trickle down), not everything will be incorporated the same way. I think it's critical for Google to acknowledge the power differential here and how these webperf decisions affect the entire space.

Overall, I'm a fan of either a two-name approach or a "CLS (Lab)" / "CLS (Field)" approach, as they feel the most clear and approachable to a webdev community with varying degrees of performance knowledge.

Hope that's helpful.

@benschwarz
Contributor Author

Thanks for sharing these notes @addyosmani, helpful context to take in.

I've some additional thoughts to share:

What we are settling on (to date) is the idea of a design system for Core Web Vitals where all of Google's tools could consistently show iconography to denote whether a metric is a lab or field variant.

Iconography may make the situation more vague. If people see consistent iconography alongside CLS metrics, they may believe that the metrics are recorded using the same methods, and therefore should be the same.

As @thefoxis already alluded to, iconography is unlikely to be shown in all downstream tools. All of the marketing surrounding CLS to date has been that it can be measured in both lab and field. I don't believe that iconography or tooltips can make enough of an impact there.

With that in mind, I firmly believe that the best option for the short term (considering the SEO + perf changes landing in May) would be to very clearly (and swiftly!) update the LH/PSI/lab metric names so that they're distinguished from the field.

As @addyosmani mentioned, longer term, user flows will enable people to test a more detailed user experience, so a slight name change for the awkward in-between period, between now and flows, would likely be a positive improvement.

If it helps, I think "Initial Layout Shift", "Page-load CLS", "CLS during page-load" or "CLS (Lab)" are all pretty decent options.

I feel that this issue is of particular concern due to the timing of the SEO + perf changes landing in May. I've observed many teams that believe their sites will be ranked by PSI (rather than CrUX-derived data), or that don't understand that a performance score will vary when tested from different locations or environments.

CLS is complex (cc @bazzadp) and there's currently an unrecognised gulf between lab and RUM that will cause a lot of confusion, fear, and anxiety.

Additional clarity around CLS calculation will bring more confidence and certainty to teams who are currently DEEP into their perf deadline fixes.

@paulirish mentioned this issue Apr 9, 2021
@paulirish
Member

paulirish commented Jun 15, 2021

It's been a bit since this issue was filed, especially in CLS-years. I think everything here is now settled. 🎉

We have the big change to add windowing to CLS. That should mean our field and lab CLS values are more similar (though they're still not measured identically, as the observation window is different).
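
For anyone catching up, a rough sketch of the session-window idea (using the publicly described thresholds of a 5s maximum window and a 1s maximum gap; simplified relative to the actual implementations):

```js
// Group layout shifts into session windows capped at 5 seconds, with gaps
// under 1 second, and report the largest window value as CLS.
let clsValue = 0;
let sessionValue = 0;
let sessionEntries = [];

new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (entry.hadRecentInput) continue;

    const first = sessionEntries[0];
    const last = sessionEntries[sessionEntries.length - 1];

    // Start a new window if the gap is >= 1s or the window spans >= 5s.
    if (sessionEntries.length > 0 &&
        (entry.startTime - last.startTime >= 1000 ||
         entry.startTime - first.startTime >= 5000)) {
      sessionValue = 0;
      sessionEntries = [];
    }

    sessionEntries.push(entry);
    sessionValue += entry.value;
    clsValue = Math.max(clsValue, sessionValue);
  }
}).observe({type: 'layout-shift', buffered: true});
```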

Lighthouse's old CLS (non-windowed, during load) is now called totalLayoutShift (as of LH 8). WebPageTest also uses this name.

The old field CLS (non-windowed, over the entire page lifecycle) is called "uncapped CLS".

more: #12350 (comment)
more: https://web.dev/cls-web-tooling/
