
feat: Added model type GBM (LightGBM tree learner), as an alternative to ECD #2027

Merged: 101 commits, Jun 29, 2022

Conversation

@jppgks (Contributor) commented May 12, 2022

This PR introduces an additional model type, GBM (tree learner), as an alternative to ECD.

Limitations:

  • supports only a single output feature of type binary, number, or category

In scope:

  • adding model_type to the Ludwig config (see the config sketch after this list)
  • adding type to the trainer section of the Ludwig config
  • introducing a BaseModel class, implemented by both ECD and GBM
  • moving trainers into a separate folder
  • implementing GBM training functionality in LightGBMTrainer and LightGBMRayTrainer
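
A minimal sketch of the new config surface described above, written as a Python dict passed to LudwigModel. The "gbm" and "lightgbm_trainer" string values are assumptions based on this PR's description, not the final schema.

from ludwig.api import LudwigModel

config = {
    "model_type": "gbm",  # new top-level field; "ecd" as the default is an assumption
    "input_features": [{"name": "x", "type": "number"}],
    "output_features": [{"name": "y", "type": "binary"}],  # single output feature only
    "trainer": {"type": "lightgbm_trainer"},  # new `type` field in the trainer section (assumed value)
}

model = LudwigModel(config)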

Out of scope:

  • support for non-trainable feature encoders (e.g. pretrained text embeddings); deferred to a follow-up PR

Todo:

  • update/add tests reflecting new model_type functionality
  • replace occurrences of ECD with AbstractModel
  • add tests for new GBM functionality
  • finish GBM implementation

Help needed:

  • Is there an alternative to having a separate trainer registry per backend (local, ray)? OK for now.
  • How to correctly update the schema for model_type and trainer type? (Thanks, @ksbrar!)

@github-actions bot commented May 12, 2022

Unit Test Results

6 files ±0    6 suites ±0    2h 28m 51s ⏱️ +10m 54s
2,901 tests +21    2,855 ✔️ +22    46 💤 ±0    0 ❌ −1
8,703 runs +63    8,561 ✔️ +64    142 💤 ±0    0 ❌ −1

Results for commit ca7db50. ± Comparison against base commit f654e82.

♻️ This comment has been updated with latest results.

@justinxzhao (Contributor) left a comment:
Hey @jppgks, here's a first round of comments. Mostly minor nits with a question about how to organize AbstractModel.

Resolved review thread on tests/integration_tests/test_lightgbm.py (outdated)
@@ -69,6 +71,8 @@
{name: base_type.preprocessing_defaults() for name, base_type in base_type_registry.items()}
)

default_model_type = MODEL_ECD
@justinxzhao (Contributor):
Why do we need both default_model_type and register_ray_trainer(default=True)?

@jppgks (Contributor, author) replied:
@justinxzhao default_model_type here refers to the default for the model_type field in the config. register_trainer and register_ray_trainer are used to register trainers that support certain model types, and to mark which trainer to use by default for a given model type. A sketch of the pattern follows.
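
A minimal sketch of the trainer-registry pattern described in the reply above. The names mirror ludwig/trainers/registry.py in spirit, but the exact signatures here are assumptions.

trainer_registry = {}  # model_type -> {trainer_name: trainer_cls}
default_trainers = {}  # model_type -> name of the default trainer

def register_trainer(name, model_types, default=False):
    def wrap(cls):
        for model_type in model_types:
            trainer_registry.setdefault(model_type, {})[name] = cls
            if default:
                default_trainers[model_type] = name
        return cls
    return wrap

@register_trainer("lightgbm_trainer", ["gbm"], default=True)
class LightGBMTrainer:
    pass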

Resolved review threads: ludwig/trainers/registry.py (×2), ludwig/trainers/base.py, ludwig/models/abstractmodel.py (×2, outdated)

Returns:
A dictionary of output {feature name}::{tensor_name} -> output tensor.
"""
Contributor:

Should we raise a NotImplementedError if this method isn't implemented?

@jppgks (Contributor, author) replied May 26, 2022:

Because the model base class uses the ABCMeta metaclass, the current behavior when forward() is not implemented is as follows:

>>> from ludwig.models.base import BaseModel
>>> class Test(BaseModel):
...   pass
... 
>>> Test()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Test with abstract methods forward

logger = logging.getLogger(__name__)


class AbstractModel(LudwigModule, metaclass=ABCMeta):
Contributor:

For an abstract class, there's a lot of default implementation.

Perhaps overkill (curious to hear others' thoughts), but a more decoupled design would be: define a truly implementation-free abstract class, AbstractModel; define a base class, BaseModel, that provides default implementations for several methods; and then derive GBM and ECD from BaseModel. A sketch of this hierarchy follows.
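
A minimal sketch of the decoupled hierarchy suggested above. The class and method names are illustrative assumptions, not the code in this PR.

from abc import ABCMeta, abstractmethod

class AbstractModel(metaclass=ABCMeta):
    """Implementation-free interface."""

    @abstractmethod
    def forward(self, inputs):
        ...

class BaseModel(AbstractModel):
    """Default implementations shared by concrete models."""

    def reset_metrics(self):
        pass  # shared no-op default

class ECD(BaseModel):
    def forward(self, inputs):
        return inputs  # stand-in for the encoder-combiner-decoder pass

class GBM(BaseModel):
    def forward(self, inputs):
        return inputs  # stand-in for gradient-boosted tree inference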

Resolved review thread on ludwig/models/gbm.py (outdated)
@jppgks (Contributor, author) commented May 24, 2022

Minimal reproducible example for the hanging process.

Run the script below with --lgbm_ludwig to reproduce the hanging behavior, and without --lgbm_ludwig to call LightGBM directly.

Resolved in 7c4793b; the issue was that CheckpointManager.close() was never called. A sketch of the failure mode follows.
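
A minimal sketch of the failure mode described above: a CheckpointManager whose close() is never called can leave a non-daemon background thread alive, which keeps the process from exiting. The stub class here stands in for Ludwig's; the try/finally shape is an assumption about the fix, not the exact diff in 7c4793b.

import threading

class CheckpointManager:
    def __init__(self):
        self._stop = threading.Event()
        # Non-daemon worker thread: as long as it runs, the process cannot exit.
        self._worker = threading.Thread(target=self._stop.wait)
        self._worker.start()

    def close(self):
        self._stop.set()
        self._worker.join()

manager = CheckpointManager()
try:
    pass  # ... training loop ...
finally:
    manager.close()  # without this call, the process hangs at exit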

# Reset the metrics at the start of the next epoch
self.model.reset_metrics()

self.callback(lambda c: c.on_epoch_start(self, progress_tracker, save_path))
Contributor:

How do tree models behave in steps-based vs. epochs-based training?

@jppgks (Contributor, author) replied:

Tree models have no concept of epochs or steps. There's a separate trainer config for tree-based models; you can control the number of boosting rounds through it (see the sketch below).
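
A minimal sketch of the tree-trainer section mentioned in the reply above. num_boost_round is LightGBM's native name for the number of boosting rounds; whether Ludwig's trainer schema exposes it under this exact key, and the "lightgbm_trainer" type value, are assumptions.

trainer_config = {
    "type": "lightgbm_trainer",  # assumed registry name for the GBM trainer
    "num_boost_round": 100,      # boosting rounds take the place of epochs/steps
}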

@ShreyaR (Contributor) left a comment:

Amazing work! Added a few small comments but feel free to merge otherwise.

Resolved review thread on ludwig/data/utils.py (outdated)
@@ -27,7 +27,7 @@

@register_encoder("passthrough", [CATEGORY, NUMBER, VECTOR], default=True)
class PassthroughEncoder(Encoder):
-    def __init__(self, input_size, **kwargs):
+    def __init__(self, input_size=1, **kwargs):
@ShreyaR (Contributor):
Why do we set the default input size to 1 here? Is it possible that this causes issues for a combiner downstream if the expected input size doesn't match the actual input size?

@jppgks (Contributor, author) replied:

@ShreyaR see the comment from @w4nderlust: #2027 (comment)

Not sure if I can sensibly answer your second question. Maybe @w4nderlust has some input here?
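
A minimal sketch of the concern raised above: if an encoder reports a default input_size of 1 but the actual feature width differs, a downstream module sized from the reported value fails at runtime. The names here are illustrative, not Ludwig's classes.

import torch

reported_size = 1           # default taken on faith by the downstream combiner
actual = torch.randn(8, 4)  # batch of 8 with a true feature width of 4

linear = torch.nn.Linear(reported_size, 16)  # sized from the reported value
try:
    linear(actual)
except RuntimeError as e:
    print(e)  # shape mismatch between the 4-wide input and the 1-wide layer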

from ludwig.decoders.base import Decoder
from ludwig.decoders.registry import register_decoder
from ludwig.utils.torch_utils import Dense, get_activation

logger = logging.getLogger(__name__)


@register_decoder("passthrough", [BINARY, CATEGORY, NUMBER, SET, VECTOR, SEQUENCE, TEXT])
class PassthroughDecoder(Decoder):
    def __init__(self, input_size: int = 1, num_classes: int = None, **kwargs):
Contributor:

See comment below for default input size.

Resolved review thread on ludwig/models/base.py
@justinxzhao (Contributor) left a comment:

This is a big PR and LGTM with some small nits.

It would be good to get this merged soon so that we don't have to worry about hairy merge conflicts.

assert torch.allclose(of1_w, of2_w)

# Test saving and loading the model explicitly
with tempfile.TemporaryDirectory() as tmpdir:
Contributor:

tmpdir is a built-in pytest fixture

The same comment ("tmpdir is a built-in pytest fixture.") was left on four more tempfile.TemporaryDirectory() usages in the tests (number, category, binary, and text output features).
@jppgks (Contributor, author) replied:
I was actually using the fixture before, but ran into issues, so I switched to this style in accordance with test_ray.py. Both styles are sketched below.
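
A minimal sketch contrasting the two styles discussed above. pytest injects tmpdir as a test argument and cleans it up automatically, while tempfile manages the directory explicitly.

import tempfile

def test_with_fixture(tmpdir):  # pytest provides and cleans up tmpdir
    tmpdir.join("model.txt").write_text("weights", encoding="utf-8")

def test_with_tempfile():  # the style used in this PR, matching test_ray.py
    with tempfile.TemporaryDirectory() as tmpdir:
        with open(f"{tmpdir}/model.txt", "w") as f:
            f.write("weights")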

@connor-mccorm (Contributor) left a comment:
Very awesome job. Just had one small comment on the schema naming but besides that looks awesome and am very impressed with all the work!

Resolved review thread on ludwig/schema/trainer.py (outdated)
@jppgks changed the title from "Integrating GBM models" to "feat: add Tree learner" on Jun 28, 2022
Resolved review thread on ludwig/schema/trainer.py (outdated)
@tgaddair changed the title from "feat: add Tree learner" to "feat: Added model type GBM (LightGBM tree learner), as an alternative to ECD" on Jun 28, 2022
Labels: feature (New feature or request)
8 participants