Improve engine startup/shutdown benchmarks #85885
Tested locally, it works as expected.
Output on Linux with an optimized editor build and an Intel Core i9-13900K, using `--benchmark --quit` CLI arguments for the project manager:
BENCHMARK:
[Startup]
- Core: 0.004916 sec.
- Initialize Early Settings: 0.000073 sec.
- Servers: 0.160395 sec.
- Setup Window and Boot: 0.019250 sec.
- Translations and Remaps: 0.000041 sec.
- Text Server: 0.000024 sec.
- Scene: 0.033830 sec.
- Platforms: 0.000026 sec.
- Finalize Setup: 0.003609 sec.
- Setup: 0.223019 sec.
- Project Manager: 0.302017 sec.
- Total: 0.527676 sec.
[Core]
- Register Types: 0.003185 sec.
- Register Extensions: 0.000046 sec.
- Register Singletons: 0.000048 sec.
[Servers]
- Register Extensions: 0.004254 sec.
- Modules and Extensions: 0.000387 sec.
- Input: 0.004932 sec.
- Display: 0.140001 sec.
- Tablet Driver: 0.000026 sec.
- Rendering: 0.007867 sec.
- Audio: 0.002166 sec.
- XR: 0.000003 sec.
[Scene]
- Register Types: 0.019183 sec.
- Register Singletons: 0.000002 sec.
- Modules and Extensions: 0.014607 sec.
[Editor]
- Register Types: 0.000754 sec.
- Modules and Extensions: 0.000017 sec.
[EditorTheme]
- Generate Icons (All): 0.039615 sec.
- Generate Icons (Only Thumbs): 0.000278 sec.
- Register Fonts: 0.088244 sec.
- Create Editor Theme: 0.211319 sec.
- Create Custom Theme: 0.000001 sec.
BENCHMARK:
[Startup]
- Core: 0.004916 sec.
- Initialize Early Settings: 0.000073 sec.
- Servers: 0.160395 sec.
- Setup Window and Boot: 0.019250 sec.
- Translations and Remaps: 0.000041 sec.
- Text Server: 0.000024 sec.
- Scene: 0.033830 sec.
- Platforms: 0.000026 sec.
- Finalize Setup: 0.003609 sec.
- Setup: 0.223019 sec.
- Project Manager: 0.302017 sec.
- Total: 0.527676 sec.
[Core]
- Register Types: 0.003185 sec.
- Register Extensions: 0.000046 sec.
- Register Singletons: 0.000048 sec.
- Unregister Extensions: 0.000026 sec.
- Unregister Types: 0.007054 sec.
[Servers]
- Register Extensions: 0.004254 sec.
- Modules and Extensions: 0.000387 sec.
- Input: 0.004932 sec.
- Display: 0.140001 sec.
- Tablet Driver: 0.000026 sec.
- Rendering: 0.007867 sec.
- Audio: 0.002166 sec.
- XR: 0.000003 sec.
- Unregister Extensions: 0.000025 sec.
[Scene]
- Register Types: 0.019183 sec.
- Register Singletons: 0.000002 sec.
- Modules and Extensions: 0.014607 sec.
- Unregister Types: 0.000024 sec.
[Editor]
- Register Types: 0.000754 sec.
- Modules and Extensions: 0.000017 sec.
- Unregister Types: 0.000003 sec.
[EditorTheme]
- Generate Icons (All): 0.039615 sec.
- Generate Icons (Only Thumbs): 0.000278 sec.
- Register Fonts: 0.088244 sec.
- Create Editor Theme: 0.211319 sec.
- Create Custom Theme: 0.000001 sec.
[Shutdown]
- Total: 0.996463 sec.
Testing on Linux, Mageia 9 x86_64, unoptimized dev build. There's an error when running the project manager, both with and without enabling benchmarking:
Then opening a project in the editor, two errors are printed:
The second error doesn't seem to be present with running the editor with
Output for the project manager:
Output for the editor:
It seems a bit redundant to print the full output on both startup and shutdown, when the only difference on shutdown is the addition of a few lines. But I get why it's done like this and I don't really have a better suggestion.
I agree, I would flush the data when dumping, but I guess this would conflict with some potential usage of the tool.
4e8fed0 to 39ba956
Fixed the reported issues with invalid benchmark keys. For the doc generation I added a counter to track every attempt. When opening the editor on a new commit, it runs twice and the second time takes over 2 seconds (it's generating the cache, most likely). Consecutive startups only have one run:
I also changed the times displayed from seconds to milliseconds, as suggested by @Calinou. The data stored in JSON is still in seconds, in case that matters. And here's the project manager:
@YuriSizov I've adapted the Android section in 949c654
39ba956 to bcf6f31
@m4gr3d Thanks, Freddy, incorporated your changeset into the first commit!
- Add contexts to give a better sense of benchmarked areas.
- Add missing benchmarks and adjust some begin/end points.
- Clean up names.
- Improve Android's internal benchmarks in a similar manner.
Co-authored-by: Fredia Huya-Kouadio <[email protected]>
bcf6f31 to d7cca81
Tested, looks good to me!
Test results for opening and closing the editor:
I'm a bit puzzled by the Also I find it weird that the documentation generation is reported as taking so long. That might need to be investigated, from what I remember when Pedro added doc caching we were in the low hundreds of ms.
@akien-mga This is mostly an issue with how disjointed editor initialization is. I think we can fix it in a future PR. Basically, registration of editor types and modules happens at a different point in time entirely. Node is created in As for the timing of
Thanks for testing and reviews!
So I was working on the project manager and wanted to learn why it takes a whopping 3 seconds to load, with most of the time spent outside of PM code, in other `main.cpp` activities instead. As I did before, I tried to use our existing benchmarking system, which is supposed to track the startup and shutdown processes and report on the time taken.
I had already discovered previously that it is not a very useful system. Many things are not tracked correctly, with begin or end points of measurement being misplaced; many other things are not tracked at all. Those which are tracked are reported in a disorganized manner and are hard to make use of. For example, most of the PM startup time is taken by "servers", and that's pretty much all that I can tell from our current benchmark:
Measured on battery with a build of 2f73a05
So I set out to try and improve all of that. First of all, I added grouping to the measurements. This makes the data structure a bit more complex, which I guess will increase the impact of using the benchmark on the benchmarked code. But what are you going to do, the observer is a part of the observed process.
I went with a Pair key for intermediate records instead of using nested HashMaps, because I think it would be slightly faster. But I didn't... benchmark it. I also replaced the Dictionary with a HashMap. I think it should be more efficient too, with no Variant marshalling required. We only needed a Dictionary when creating the JSON file, but we can fabricate one when we do the dump, which is one part of the process where it's okay to be slow.
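To make the data layout described above more concrete, here is a minimal sketch using only the C++ standard library. It is not the engine's code: `BenchmarkStore` and its methods are hypothetical names, and an ordered `std::map` keyed by `std::pair` stands in for the engine's HashMap with a Pair key. Records live in one flat map keyed by (context, name), and the nested structure needed for output is only fabricated at dump time.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the benchmark store; not the engine's actual API.
class BenchmarkStore {
public:
    using Key = std::pair<std::string, std::string>; // (context, name)
    using Clock = std::chrono::steady_clock;

    // Remember when a measurement started.
    void begin_measure(const std::string &context, const std::string &name) {
        start_times[Key(context, name)] = Clock::now();
    }

    // Close a measurement; unmatched calls are ignored.
    void end_measure(const std::string &context, const std::string &name) {
        auto it = start_times.find(Key(context, name));
        if (it == start_times.end()) {
            return;
        }
        const std::chrono::duration<double> elapsed = Clock::now() - it->second;
        start_times.erase(it);
        record(context, name, elapsed.count());
    }

    // Store an already measured duration, in seconds.
    void record(const std::string &context, const std::string &name, double seconds) {
        assert(seconds >= 0.0);
        results[Key(context, name)] = seconds;
    }

    // Build the nested context -> name -> seconds structure only when dumping,
    // which is the one step where being slow is acceptable.
    std::map<std::string, std::map<std::string, double>> dump() const {
        std::map<std::string, std::map<std::string, double>> nested;
        for (const auto &entry : results) {
            nested[entry.first.first][entry.first.second] = entry.second;
        }
        return nested;
    }

private:
    std::map<Key, Clock::time_point> start_times; // flat, single lookup per call
    std::map<Key, double> results;
};
```

With this shape, every begin/end call costs a single lookup on a composite key, and grouping by context is deferred to `dump()`. Whether that is actually faster than nested maps is, as noted above, not something that was measured.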
I then went through benchmarked areas and adjusted/added measure points to make sure they track sensible data. There shouldn't be anything controversial about these changes here.
However, there is a second commit where I go through the `main.cpp` code and reorganize it a bit, fixing benchmarks in the process. There should be no functional difference, but changes to main are always sensitive, so I can drop it if needed. To summarize, I shifted some little things around, which should not affect the initialization order but helps keep things grouped and organized. I then added subscopes throughout, or adjusted existing subscopes, to make the code more segmented and a bit easier to follow. Each scope and some other areas received their own benchmark measurements.
And the end result of everything is this:
And here's an example of before and after for the editor:
We support overriding OS benchmark methods in platform OS implementations, and Android so far is the only one that does. Naturally, platforms may not have support for contexts. I don't know if Android has it, so to be the least disruptive I can, I just `vformat` both strings and pass the result on to the platform wrapper. Maybe it can be done better, cc @m4gr3d
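As a rough illustration of that fallback, the sketch below flattens a (context, name) pair into one label before handing it to a platform wrapper that only accepts a single string. Everything here is an assumption made for illustration: `PlatformBenchmark`, `flatten_label`, and the `[context] name` layout are hypothetical, plain string concatenation stands in for `vformat`, and the actual format string used in the PR is not shown in this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical platform-side wrapper that has no notion of contexts.
struct PlatformBenchmark {
    std::vector<std::string> started;
    void begin_measure(const std::string &label) { started.push_back(label); }
};

// Hypothetical label layout; the real format used by the engine may differ.
std::string flatten_label(const std::string &context, const std::string &name) {
    assert(!name.empty());
    if (context.empty()) {
        return name;
    }
    return "[" + context + "] " + name;
}

// Context-aware entry point that forwards a single flattened label.
void begin_measure(PlatformBenchmark &platform, const std::string &context, const std::string &name) {
    platform.begin_measure(flatten_label(context, name));
}
```

Flattening keeps the platform interface unchanged, at the cost of the platform side being unable to group measurements by context without parsing the label again.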