feat: Added model type GBM (LightGBM tree learner), as an alternative to ECD #2027
Conversation
Adds model_type to config and instantiates model object accordingly
for more information, see https://pre-commit.ci
was incorrectly inferring regression when converting lgbm Booster directly
Making use of existing Ludwig infra as much as possible
Hey @jppgks, here's a first round of comments. Mostly minor nits, with a question about how to organize `AbstractModel`.
@@ -69,6 +71,8 @@
    {name: base_type.preprocessing_defaults() for name, base_type in base_type_registry.items()}
)

default_model_type = MODEL_ECD
Why do we need both `default_model_type` and `register_ray_trainer(default=True)`?
@justinxzhao `default_model_type` here refers to the default for the `model_type` field in the config. `register_trainer` and `register_ray_trainer` are used to register trainers that support certain models, as well as which trainer to use by default for a certain model type.
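To illustrate the distinction, here is a minimal, self-contained sketch (the registry internals, the `MODEL_GBM` constant, and the decorator signature are assumptions for the example, not the exact Ludwig API):

```python
# Hypothetical sketch of the two mechanisms; names and signatures are illustrative.
MODEL_ECD, MODEL_GBM = "ecd", "gbm"

# 1) default_model_type: the fallback value for the `model_type` field in the config.
default_model_type = MODEL_ECD

# 2) register_ray_trainer: associates a trainer class with a model type;
#    default=True marks the trainer to use when the config doesn't name one.
trainer_registry = {}

def register_ray_trainer(name, model_type, default=False):
    def wrap(cls):
        trainer_registry.setdefault(model_type, {})[name] = cls
        if default:
            trainer_registry[model_type]["default"] = cls
        return cls
    return wrap

@register_ray_trainer("lightgbm_trainer", MODEL_GBM, default=True)
class LightGBMRayTrainer:
    ...
```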
ludwig/models/abstractmodel.py (Outdated)

        Returns:
            A dictionary of output {feature name}::{tensor_name} -> output tensor.
        """
Should we raise a `NotImplementedError` if this method isn't implemented?
By using the ABCMeta metaclass, the current behavior is as follows if `forward()` is not implemented:
>>> from ludwig.models.base import BaseModel
>>> class Test(BaseModel):
... pass
...
>>> Test()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Test with abstract methods forward
ludwig/models/abstractmodel.py (Outdated)

logger = logging.getLogger(__name__)


class AbstractModel(LudwigModule, metaclass=ABCMeta):
For an abstract class, there's a lot of default implementation. Perhaps overkill (and curious to hear others' thoughts), but a potentially more decoupled design would be to define a truly implementation-free abstract class `AbstractModel`, define a base class `BaseModel` that provides default implementations for several methods, and then have GBM and ECD build on `BaseModel`.
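A minimal sketch of that split (class and method names are illustrative, not the actual Ludwig classes):

```python
from abc import ABCMeta, abstractmethod

class AbstractModel(metaclass=ABCMeta):
    """Pure interface: declares the contract, no default behavior."""

    @abstractmethod
    def forward(self, inputs):
        ...

class BaseModel(AbstractModel):
    """Shared default implementations that concrete models can reuse."""

    def reset_metrics(self):
        # common bookkeeping shared by all concrete models
        pass

class ECD(BaseModel):
    def forward(self, inputs):
        return inputs  # placeholder

class GBM(BaseModel):
    def forward(self, inputs):
        return inputs  # placeholder
```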
Error resolved in 7c4793b; the issue was that `CheckpointManager.close()` was never called.
        # Reset the metrics at the start of the next epoch
        self.model.reset_metrics()

        self.callback(lambda c: c.on_epoch_start(self, progress_tracker, save_path))
How do tree models behave in steps-based vs. epochs-based training?
Tree models have no concept of epochs/steps. There's a separate trainer config for tree-based models; you can control the number of boosting rounds through that.
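For example, a sketch of what such a config could look like (field names such as `num_boost_round` and the trainer `type` value are assumptions for illustration, not necessarily the exact schema):

```python
# Hypothetical config; exact field names in the tree trainer schema may differ.
config = {
    "model_type": "gbm",
    "input_features": [{"name": "x1", "type": "number"}],
    "output_features": [{"name": "y", "type": "binary"}],
    "trainer": {
        "type": "lightgbm_trainer",  # tree-specific trainer, separate from the ECD trainer
        "num_boost_round": 100,      # boosting rounds replace the epochs/steps notion
    },
}
```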
Amazing work! Added a few small comments but feel free to merge otherwise.
@@ -27,7 +27,7 @@

@register_encoder("passthrough", [CATEGORY, NUMBER, VECTOR], default=True)
class PassthroughEncoder(Encoder):
-    def __init__(self, input_size, **kwargs):
+    def __init__(self, input_size=1, **kwargs):
Why do we set the default input size to 1 here? Is it possible that this causes issues for a combiner downstream if the expected input size doesn't match the actual input size?
@ShreyaR see the comment from @w4nderlust: #2027 (comment)
Not sure if I can sensibly answer your second question. Maybe @w4nderlust has some input here?
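To make the concern concrete, here is a toy sketch (simplified classes, not the real Ludwig encoder/combiner code):

```python
import torch

class PassthroughEncoder(torch.nn.Module):
    def __init__(self, input_size=1, **kwargs):
        super().__init__()
        self.input_size = input_size

    def forward(self, x):
        return x  # output shape is whatever the input actually is

    @property
    def output_shape(self):
        return torch.Size([self.input_size])  # reports the default if no size was passed

# If the caller forgets to pass the real input size, a downstream component that
# trusts encoder.output_shape would expect size 1 while the tensor may be larger.
enc = PassthroughEncoder()
actual = torch.randn(8, 4)
print(enc.output_shape, "vs actual", actual.shape[1:])  # torch.Size([1]) vs torch.Size([4])
```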
from ludwig.decoders.base import Decoder
from ludwig.decoders.registry import register_decoder
from ludwig.utils.torch_utils import Dense, get_activation

logger = logging.getLogger(__name__)


@register_decoder("passthrough", [BINARY, CATEGORY, NUMBER, SET, VECTOR, SEQUENCE, TEXT])
class PassthroughDecoder(Decoder):
    def __init__(self, input_size: int = 1, num_classes: int = None, **kwargs):
See comment below for default input size.
This is a big PR and LGTM with some small nits.
It would be good to get this merged soon so that we don't have to worry about hairy merge conflicts.
    assert torch.allclose(of1_w, of2_w)

    # Test saving and loading the model explicitly
    with tempfile.TemporaryDirectory() as tmpdir:
tmpdir is a built-in pytest fixture.
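For reference, a minimal sketch of the fixture-based style this refers to (the test name and body are illustrative):

```python
# pytest injects tmpdir (a py.path.local) automatically; no tempfile context manager needed.
def test_save_and_load(tmpdir):
    save_path = str(tmpdir.join("model"))
    # ... train, save to save_path, reload, and compare weights ...
```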
    output_feature = number_feature()
    output_features = [output_feature]

    with tempfile.TemporaryDirectory() as tmpdir:
tmpdir is a built-in pytest fixture.
    output_feature = category_feature(vocab_size=vocab_size)
    output_features = [output_feature]

    with tempfile.TemporaryDirectory() as tmpdir:
tmpdir is a built-in pytest fixture.
    output_feature = binary_feature()
    output_features = [output_feature]

    with tempfile.TemporaryDirectory() as tmpdir:
tmpdir is a built-in pytest fixture.
    input_features = [number_feature(), category_feature(reduce_output="sum")]
    output_features = [text_feature()]

    with tempfile.TemporaryDirectory() as tmpdir:
tmpdir is a built-in pytest fixture.
I was actually using the fixture before, but ran into issues, so I switched to this style in accordance with test_ray.py.
Very awesome job. Just had one small comment on the schema naming but besides that looks awesome and am very impressed with all the work!
This PR introduces additional model type GBM (tree learner), as an alternative to ECD.

Limitations:

In scope:
- `model_type` to the Ludwig config
- `type` to the trainer section of the Ludwig config
- `BaseModel` class, implemented by both ECD and GBM
- `LightGBMTrainer` and `LightGBMRayTrainer`

Out of scope:

Todo:
- `model_type` functionality
- `AbstractModel`

Help needed:
- Alternative to having separate trainer registry per backend (local, ray): OK for now
- `model_type` and trainer `type` (Thanks @ksbrar!)