
[Bug]: LGBM Results Not Reproducible #1368

Closed · dannycg1996 opened this issue Oct 24, 2024 · 0 comments · Fixed by #1369
Labels: bug (Something isn't working)

Describe the bug

Currently there are a couple of issues around the reproducibility of the FLAML best loss when the best estimator found is an LGBMEstimator.
These issues occur as follows:

  1. The best loss returned by FLAML is not reproducible using the underlying LGBMClassifier or LGBMRegressor model (i.e. `automl.model.model`). This seems to be caused by `n_estimators` always being set to 1 on the underlying model, regardless of what value it should be. Please note that `n_estimators` always seems to be set correctly on the FLAMLised LGBMEstimator - this issue exists exclusively on the underlying models. A sketch illustrating this follows the list.
  2. With certain configurations, the FLAML best loss can't be reproduced even when using the FLAMLised LGBMEstimator. This only seems to be the case when a time budget is set, so the issue is likely caused by the callbacks, similar to the issue found with CatBoostEstimators here.
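
A minimal sketch of issue 1, assuming a small classification dataset; the dataset, time budget, and metric are illustrative and not taken from the failing test:

```python
# Hypothetical sketch of issue 1 (not a test from the repo): after a short
# AutoML run, compare n_estimators on the FLAML side against the value on
# the underlying LGBMClassifier.
from flaml import AutoML
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)

automl = AutoML()
automl.fit(
    X, y,
    task="classification",
    estimator_list=["lgbm"],
    metric="log_loss",
    time_budget=10,
)

print(automl.best_config["n_estimators"])  # tuned value, set correctly on the wrapper
print(automl.model.model.n_estimators)     # underlying model: reportedly always 1
```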

Steps to reproduce

To demonstrate these issues in a reproducible manner:

  1. Issue 1 can be replicated by uncommenting and then running `coverage run -m pytest -k test_reproducibility_of_underlying_regression_models[lgbm]` (or the classification equivalent).
  2. For issue 2, a unit test will be added, which will initially fail until the issue is fixed.

I'll open a PR for this shortly - hope that's okay.

Model Used

LGBMEstimators

Expected Behavior

Taking the FLAMLised LGBMEstimator model, or the underlying model it wraps (LGBMClassifier/LGBMRegressor), and training and testing it on the same folds should return the same best loss as FLAML reports. A sketch of this check follows.
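
A hedged sketch of that check, continuing from the snippet above; `X_train`/`y_train`/`X_val`/`y_val` are hypothetical names standing in for the folds FLAML actually used:

```python
# Refit a clone of the underlying model on the same data and compare losses.
from sklearn.base import clone
from sklearn.metrics import log_loss

refit = clone(automl.model.model)  # carries over the underlying model's params
refit.fit(X_train, y_train)
reproduced = log_loss(y_val, refit.predict_proba(X_val))

# Expected: reproduced matches automl.best_loss. Per issue 1 it currently
# differs, because the clone inherits n_estimators=1 instead of the tuned value.
print(reproduced, automl.best_loss)
```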

Screenshots and logs

No response

Additional Information

No response

dannycg1996 added the bug (Something isn't working) label on Oct 24, 2024