
[BUG] 'com.microsoft.azure.synapse.ml.lightgbm' has no attribute 'LightGBMClassificationModel' #1701

Open
2 of 19 tasks
sibyl1956 opened this issue Oct 31, 2022 · 7 comments
sibyl1956 commented Oct 31, 2022

SynapseML version

0.10.1

System information

Language version: Python 3.8.10, Scala 2.12
Spark Version: Apache Spark 3.2.1
Spark Platform: Databricks

Describe the problem

When trying to load a pipeline model for LightGBM, I encountered this error message:
'com.microsoft.azure.synapse.ml.lightgbm' has no attribute 'LightGBMClassificationModel'

But I had run from synapse.ml.lightgbm import LightGBMClassificationModel before trying to load the pipeline model.

Code to reproduce issue

from pyspark.ml.pipeline import PipelineModel
from synapse.ml.lightgbm import LightGBMClassificationModel, LightGBMClassifier
clf = PipelineModel.load(model_savepath)

Other info / logs

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<command-2087039020756525> in <module>
      1 # Load model
      2 from pyspark.ml.pipeline import PipelineModel
----> 3 clf = PipelineModel.load(model_savepath)

/databricks/spark/python/pyspark/ml/util.py in load(cls, path)
    461     def load(cls, path):
    462         """Reads an ML instance from the input path, a shortcut of `read().load(path)`."""
--> 463         return cls.read().load(path)
    464 
    465 

/databricks/spark/python/pyspark/ml/pipeline.py in load(self, path)
    258             return JavaMLReader(self.cls).load(path)
    259         else:
--> 260             uid, stages = PipelineSharedReadWrite.load(metadata, self.sc, path)
    261             return PipelineModel(stages=stages)._resetUid(uid)
    262 

/databricks/spark/python/pyspark/ml/pipeline.py in load(metadata, sc, path)
    394             stagePath = \
    395                 PipelineSharedReadWrite.getStagePath(stageUid, index, len(stageUids), stagesDir)
--> 396             stage = DefaultParamsReader.loadParamsInstance(stagePath, sc)
    397             stages.append(stage)
    398         return (metadata['uid'], stages)

/databricks/spark/python/pyspark/ml/util.py in loadParamsInstance(path, sc)
    719         else:
    720             pythonClassName = metadata['class'].replace("org.apache.spark", "pyspark")
--> 721         py_type = DefaultParamsReader.__get_class(pythonClassName)
    722         instance = py_type.load(path)
    723         return instance

/databricks/spark/python/pyspark/ml/util.py in __get_class(clazz)
    630         m = __import__(module)
    631         for comp in parts[1:]:
--> 632             m = getattr(m, comp)
    633         return m
    634 

AttributeError: module 'com.microsoft.azure.synapse.ml.lightgbm' has no attribute 'LightGBMClassificationModel'
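For context, the failing lookup in the traceback comes from PySpark's class-name mapping when it reads each stage back. A minimal sketch (condensed from the `pyspark.ml.util.DefaultParamsReader` logic shown in the traceback above; this is a simplification, not the actual source):

```python
# Sketch of how a saved stage's stored class name is resolved when
# PipelineModel.load walks the pipeline metadata (per the traceback above).

def resolve_python_class_name(java_class: str) -> str:
    # PySpark only rewrites its own package prefix; any other prefix
    # (e.g. SynapseML's "com.microsoft.azure.synapse.ml.*") is kept
    # verbatim and later fed to __import__/getattr, which is where the
    # AttributeError above is raised.
    return java_class.replace("org.apache.spark", "pyspark")

# A built-in stage maps cleanly to an importable Python class path:
print(resolve_python_class_name("org.apache.spark.ml.PipelineModel"))
# → pyspark.ml.PipelineModel

# A SynapseML stage keeps its JVM package name, so the reader tries to
# import that name as a Python module path:
print(resolve_python_class_name(
    "com.microsoft.azure.synapse.ml.lightgbm.LightGBMClassificationModel"))
# → com.microsoft.azure.synapse.ml.lightgbm.LightGBMClassificationModel
```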

What component(s) does this bug affect?

  • area/cognitive: Cognitive project
  • area/core: Core project
  • area/deep-learning: DeepLearning project
  • area/lightgbm: Lightgbm project
  • area/opencv: Opencv project
  • area/vw: VW project
  • area/website: Website
  • area/build: Project build system
  • area/notebooks: Samples under notebooks folder
  • area/docker: Docker usage
  • area/models: models related issue

What language(s) does this bug affect?

  • language/scala: Scala source code
  • language/python: Pyspark APIs
  • language/r: R APIs
  • language/csharp: .NET APIs
  • language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • integrations/synapse: Azure Synapse integrations
  • integrations/azureml: Azure ML integrations
  • integrations/databricks: Databricks integrations
@sibyl1956 sibyl1956 added the bug label Oct 31, 2022
@github-actions

Hey @sibyl1956 👋!
Thank you so much for reporting the issue/feature request 🚨.
Someone from SynapseML Team will be looking to triage this issue soon.
We appreciate your patience.

Contributor

ppruthi commented Nov 7, 2022

@svotaw -- could you take a look at this issue ? Thanks !

@svotaw svotaw self-assigned this Nov 13, 2022
Collaborator

svotaw commented Nov 13, 2022

Can you give more context here? How did you save the model? What was the code to create the original Pipeline?

@svotaw svotaw removed the triage label Nov 16, 2022

anor4k commented Apr 28, 2023

I'm having the same issue.
Here's the code I used to train and save the model:

from synapse.ml.lightgbm import LightGBMRegressor
from synapse.ml.train import TrainRegressor, TrainedRegressorModel
from pyspark.ml.pipeline import PipelineModel

model = TrainRegressor(
    model=LightGBMRegressor(**model_params),
    inputCols=features,
    labelCol=target
)

trained_model = model.fit(df_train)
trained_model.getModel().save('trained_model_pipeline')

loaded_model = PipelineModel.load('trained_model_pipeline')

Running that last line gives me the same error as the OP. Running on SynapseML 0.11.1, PySpark 3.2.3.

I can save the TrainedRegressorModel and use TrainedRegressorModel.load to load the model correctly, but PipelineModel.load seems like a more general way to load models, and I would prefer using that.

@tbrandonstevenson

Here is an anecdotal experience, for whatever it is worth:

I had the same problem and was able to get the pipeline to load by flattening the pipeline stages. The load was erroring when the first stage in my pipeline was itself a pipeline of feature transformations. When I removed this nested pipeline structure, I was able to load the saved pipeline.
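A hypothetical helper for that flattening step might look like the following (the function name and the `stages` attribute check are illustrative assumptions, not SynapseML API; in PySpark a fitted `PipelineModel` does expose its sub-stages via `.stages`):

```python
# Hypothetical helper (names are mine, not part of SynapseML) that
# recursively flattens nested pipeline-model stages into one flat list,
# per the workaround described above.

def flatten_stages(stages):
    flat = []
    for stage in stages:
        # A nested PipelineModel exposes its sub-stages via a `stages`
        # attribute; plain transformers and models do not.
        if hasattr(stage, "stages"):
            flat.extend(flatten_stages(stage.stages))
        else:
            flat.append(stage)
    return flat
```

The flat list could then be rewrapped before saving, e.g. `PipelineModel(stages=flatten_stages(model.stages))`.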

@grzegorz-karas

For a pyspark.ml.Pipeline in which all stages are Java stages (estimators and transformers from the Spark MLlib library), the model can be saved and loaded without problems.

WORKS:

pipe = Pipeline(
    stages=[
        SomePysparkMLibTransformer, # an instance of JavaMLWritable
        LightGBMClassifier(**model_params),
    ]
)

The error occurred when one of the transformers was a custom stage and not a Java stage.

DOESN'T WORK:

pipe = Pipeline(
    stages=[
        SomeCustomTransformer, # NOT an instance of JavaMLWritable
        LightGBMClassifier(**model_params),
    ]
)

In this case the PipelineModel.write method returns a non-Java writer. The classes synapse.ml.lightgbm.LightGBMClassifier and synapse.ml.lightgbm.LightGBMRegressor inherit the correct Java reader (pyspark.ml.util.JavaMLReadable) and writer (pyspark.ml.util.JavaMLWritable). The problem is the superclass synapse.ml.core.schema.Utils.ComplexParamsMixin, which inherits only from pyspark.ml.util.MLReadable.

I could bypass the problem by wrapping the estimator in a pyspark.ml.Pipeline. In this situation the write method of the last stage returns a JavaMLWriter rather than a PipelineModelWriter.

pipe = Pipeline(
    stages=[
        SomeCustomTransformer, # NOT an instance of JavaMLWritable
        Pipeline(
            stages=[
                LightGBMClassifier(**model_params),
            ]
        )
    ]
)
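The writer-selection behaviour described above can be sketched as follows (a simplified model condensed from the description in this thread, not the actual pyspark.ml.pipeline source):

```python
# Simplified model of why a single non-Java stage changes how the whole
# pipeline is serialized. The class names here are stand-ins.

class MLWritable: ...
class JavaMLWritable(MLWritable): ...

class JavaStage(JavaMLWritable): ...    # e.g. an MLlib transformer
class CustomStage(MLWritable): ...      # a pure-Python custom transformer

def choose_writer(stages):
    # The Java writer is used only when *every* stage is JavaMLWritable;
    # otherwise each stage is written (and later read back) individually,
    # which is the per-stage code path that triggers the AttributeError
    # reported in this issue.
    if all(isinstance(s, JavaMLWritable) for s in stages):
        return "JavaMLWriter"
    return "PipelineSharedReadWrite"

print(choose_writer([JavaStage(), JavaStage()]))    # → JavaMLWriter
print(choose_writer([CustomStage(), JavaStage()]))  # → PipelineSharedReadWrite
```

Wrapping the Java stages in an inner Pipeline, as in the snippet above, keeps the Java-only group together so the inner pipeline serializes via its Java writer.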


dsmith111 commented Sep 5, 2024

Is this bug still being considered? Implementing

pipeline = Pipeline(
    stages=[
        custom_transformer,
        PipelineModel(stages=[lgbm_model]),
        custom_transformer
        ]
    )

seems like it should only be a temporary workaround.
