[BUG] Refine AutoTuner configuration recommendations #1397

Open
kuhushukla opened this issue Oct 29, 2024 · 3 comments
Assignees
Labels
bug Something isn't working core_tools Scope the core module (scala)

Comments

@kuhushukla
Collaborator

Describe the bug
This bug tracks the work needed to improve the AutoTuner output that we ask customers to try out for their first GPU run. This issue also relates to #1334 and #1067.

@amahussein
Collaborator

Is there a way we can clean up old issues that overlap with this umbrella issue?

@parthosa
Collaborator

Many of these recommendations need to differ depending on whether AutoTuner is run on CPU event logs (i.e. via the Qual Tool) or on GPU event logs (i.e. via the Profiling Tool).

I plan to use a class-based approach: create QualAutoTuner and ProfilingAutoTuner, both extending the base AutoTuner, so that each can override values as required. E.g. QualAutoTuner can override BATCH_SIZE_BYTES to 1 GB, compared to the 2 GB recommended by ProfilingAutoTuner.
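
A minimal sketch of that class-based layout, for illustration only. The actual AutoTuner lives in the Scala core module; this Python version just shows the override pattern. The three class names and BATCH_SIZE_BYTES come from the comment above, while the `recommendations` method, the `batch_size_bytes` key, the base-class default, and the reading of "GB" as 1024^3 bytes are all assumptions.

```python
# Illustrative sketch only: not the project's real (Scala) implementation.
# Class names follow the comment above; the method name, the dictionary key,
# and the base-class default are hypothetical.

GIB = 1024 ** 3  # assumption: "1 GB" / "2 GB" are read as binary gigabytes


class AutoTuner:
    """Base tuner holding values shared by both tools."""

    # Hypothetical default; subclasses override it as required.
    BATCH_SIZE_BYTES = 2 * GIB

    def recommendations(self) -> dict:
        # type(self) resolves to the subclass, so an override is picked up
        # without the base class knowing which tool is running.
        return {"batch_size_bytes": type(self).BATCH_SIZE_BYTES}


class ProfilingAutoTuner(AutoTuner):
    """AutoTuner run on GPU event logs (Profiling Tool): recommends 2 GB."""

    BATCH_SIZE_BYTES = 2 * GIB


class QualAutoTuner(AutoTuner):
    """AutoTuner run on CPU event logs (Qual Tool): overrides to 1 GB."""

    BATCH_SIZE_BYTES = 1 * GIB


if __name__ == "__main__":
    for tuner in (QualAutoTuner(), ProfilingAutoTuner()):
        print(type(tuner).__name__, tuner.recommendations())
```

Keeping the shared logic in the base class and expressing each tool-specific difference as a single overridden constant would keep the CPU-log and GPU-log recommendations side by side and easy to compare.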

cc: @amahussein

@parthosa parthosa added the core_tools Scope the core module (scala) label Dec 12, 2024
@amahussein
Collaborator

> Many of these recommendations need to differ depending on whether AutoTuner is run on CPU event logs (i.e. via the Qual Tool) or on GPU event logs (i.e. via the Profiling Tool).
>
> I plan to use a class-based approach: create QualAutoTuner and ProfilingAutoTuner, both extending the base AutoTuner, so that each can override values as required. E.g. QualAutoTuner can override BATCH_SIZE_BYTES to 1 GB, compared to the 2 GB recommended by ProfilingAutoTuner.
>
> cc: @amahussein

Yes, the plan was to do that when I added the QualAutoTuner, but I did not have the bandwidth to complete it.
Keep me posted on the plan, because I remember there was already a pass at accomplishing that.

5 participants