Describe the bug
Currently there are a couple of issues around the reproducibility of the FLAML best loss when the best estimator found is an LGBMEstimator.
These issues occur as follows:
1. The best loss returned by FLAML is not reproducible using the underlying LGBMClassifier or LGBMRegressor model (i.e. automl.model.model). This seems to be caused by n_estimators always being set to 1, regardless of what value it should be. Please note that n_estimators always seems to be set correctly on the FLAMLised LGBMEstimator - this issue exists exclusively on the underlying models (see the sketch after this list).
2. With certain configurations, the FLAML best loss can't be reproduced even when using the FLAMLised LGBMEstimator. This seems to only be the case when a time budget is set, so the issue is likely caused by the callbacks, similar to the issue found with CatBoostEstimators here.
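To make issue 1 concrete, here is a minimal sketch of the mismatch (the dataset, split, metric, seed and budget below are arbitrary placeholders of my own, not the exact failing configuration - the real repro is the unit test mentioned under "Steps to reproduce"):

```python
from flaml import AutoML
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

automl = AutoML()
automl.fit(
    X_train,
    y_train,
    task="classification",
    metric="log_loss",
    estimator_list=["lgbm"],
    time_budget=10,  # a time budget is set, which is also the trigger for issue 2
    seed=42,
)

# The FLAMLised LGBMEstimator carries the tuned value...
print(automl.best_config["n_estimators"])
# ...but the underlying LGBMClassifier (automl.model.model) reports n_estimators=1.
print(automl.model.model.get_params()["n_estimators"])
```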
Steps to reproduce
To showcase these issues in a reproducible manner:
1. Can be replicated by uncommenting and then running `coverage run -m pytest -k test_reproducibility_of_underlying_regression_models[lgbm]` (or the classification equivalent).
2. A unit test will be added for this, which will initially fail until the issue is fixed.
I'll open a PR for this shortly - hope that's okay.
Model Used
LGBMEstimators
Expected Behavior
Training and testing the FLAMLised LGBMEstimator model, or the underlying model it wraps (LGBMClassifier/LGBMRegressor), on the same folds should reproduce the best loss reported by FLAML.
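As a rough illustration of the check this implies (continuing the sketch above; X_val/y_val here stand in for whichever validation fold FLAML actually used internally, which the unit test recovers properly - this is only meant to show the shape of the comparison):

```python
from sklearn.base import clone
from flaml.ml import sklearn_metric_loss_score

# Refit a copy of the underlying LGBMClassifier with the params FLAML reports for it.
underlying = clone(automl.model.model)
underlying.fit(X_train, y_train)
underlying_loss = sklearn_metric_loss_score("log_loss", underlying.predict_proba(X_val), y_val)

# Refit the FLAMLised LGBMEstimator itself on the same fold.
flamlised = automl.model
flamlised.fit(X_train, y_train)
flamlised_loss = sklearn_metric_loss_score("log_loss", flamlised.predict_proba(X_val), y_val)

# Expected: both retrained losses match FLAML's reported best loss.
# Currently the underlying model's loss is off because of n_estimators=1 (issue 1),
# and the FLAMLised estimator's loss can be off when a time budget was set (issue 2).
print(automl.best_loss, underlying_loss, flamlised_loss)
```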
Screenshots and logs
No response
Additional Information
No response