fix: DML run get timeout if big dataset has more feature columns (Workaround Synapse Spark optimizer issue) #1903

Merged
merged 8 commits into from
Apr 4, 2023


dylanw-oss
Contributor

What changes are proposed in this pull request?

When applying DML to the WExp project, whose data has more than 1M records and more than 5 categorical features, the DML run timed out even on a large cluster.
We found that the Synapse version of the Spark optimizer cannot handle such a complex query plan. Splitting the DML pipeline into smaller pipelines and caching the result of each one fixes the timeout.
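
The split-and-cache idea described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not Spark or SynapseML code, and it does not reproduce the actual change in `DoubleMLEstimator.scala`. `LazyFrame`, `add_column`, and `run_in_stages` are hypothetical names invented for the illustration. The assumption is that a lazily evaluated frame accumulates one plan step per transformation, and that materializing the result between stages lets the next stage start from an empty plan. In Spark, the roughly analogous step would be calling `cache()` on the intermediate DataFrame and then running an action to materialize it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]
Op = Callable[[List[Row]], List[Row]]


@dataclass
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (hypothetical, not Spark)."""
    source: List[Row]
    plan: List[Op] = field(default_factory=list)

    def transform(self, op: Op) -> "LazyFrame":
        # Record the step only; nothing is computed yet.
        return LazyFrame(self.source, self.plan + [op])

    def plan_depth(self) -> int:
        # Number of pending steps an optimizer would have to reason about.
        return len(self.plan)

    def collect(self) -> List[Row]:
        # Evaluate every pending step in order.
        rows = self.source
        for op in self.plan:
            rows = op(rows)
        return rows

    def cache(self) -> "LazyFrame":
        # Materialize now, so downstream stages start from an empty plan.
        return LazyFrame(self.collect(), [])


def add_column(name: str, fn: Callable[[Row], float]) -> Op:
    """Build a step that appends one derived column to every row."""
    def op(rows: List[Row]) -> List[Row]:
        return [{**r, name: fn(r)} for r in rows]
    return op


def run_in_stages(frame: LazyFrame, stages: List[List[Op]]) -> Tuple[LazyFrame, int]:
    """Apply each stage, caching between stages.

    Returns the final frame and the largest plan depth seen at any point.
    """
    max_depth = 0
    for stage in stages:
        for op in stage:
            frame = frame.transform(op)
        max_depth = max(max_depth, frame.plan_depth())
        frame = frame.cache()
    return frame, max_depth


data = [{"x": 1.0}, {"x": 2.0}]
stages = [
    [add_column("a", lambda r: r["x"] + 1), add_column("b", lambda r: r["a"] * 2)],
    [add_column("c", lambda r: r["b"] - r["x"])],
]

# Monolithic run: all steps pile up in a single plan.
mono = LazyFrame(data)
for stage in stages:
    for op in stage:
        mono = mono.transform(op)

# Staged run: the plan is reset after each stage is cached.
staged, max_depth = run_in_stages(LazyFrame(data), stages)
```

In this toy model the monolithic plan grows with the total number of steps, while the staged run never holds more steps than its largest stage, and both produce the same rows. The trade-off is that each cached stage is computed and held eagerly.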

How is this patch tested?

Tested with internal project data.

@github-actions

Hey @dylanw-oss 👋!
Thank you so much for contributing to our repository 🙌.
Someone from the SynapseML team will be reviewing this pull request soon.

We use semantic commit messages to streamline the release process.
Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix.
This helps us to create release messages and credit you for your hard work!

Examples of commit messages with semantic prefixes:

  • fix: Fix LightGBM crashes with empty partitions
  • feat: Make HTTP on Spark back-offs configurable
  • docs: Update Spark Serving usage
  • build: Add codecov support
  • perf: improve LightGBM memory usage
  • refactor: make python code generation rely on classes
  • style: Remove nulls from CNTKModel
  • test: Add test coverage for CNTKModel

To test your commit locally, please follow our guide on building from source.
Check out the developer guide for additional guidance on testing your change.

@dylanw-oss
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@codecov-commenter

codecov-commenter commented Apr 1, 2023

Codecov Report

Merging #1903 (6d68c41) into master (0f02626) will increase coverage by 0.05%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1903      +/-   ##
==========================================
+ Coverage   86.77%   86.83%   +0.05%     
==========================================
  Files         301      301              
  Lines       15587    15596       +9     
  Branches      803      815      +12     
==========================================
+ Hits        13526    13543      +17     
+ Misses       2061     2053       -8     
Impacted Files Coverage Δ
.../azure/synapse/ml/causal/ResidualTransformer.scala 91.89% <ø> (ø)
...rosoft/azure/synapse/ml/train/TrainRegressor.scala 92.59% <ø> (ø)
...ft/azure/synapse/ml/causal/DoubleMLEstimator.scala 89.89% <100.00%> (+0.76%) ⬆️
...osoft/azure/synapse/ml/train/TrainClassifier.scala 84.78% <100.00%> (+0.22%) ⬆️

... and 3 files with indirect coverage changes


mhamilton723 previously approved these changes Apr 1, 2023
@dylanw-oss
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@dylanw-oss dylanw-oss marked this pull request as draft April 3, 2023 18:42
@dylanw-oss dylanw-oss marked this pull request as ready for review April 3, 2023 21:20
@dylanw-oss
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@dylanw-oss dylanw-oss requested a review from memoryz April 3, 2023 21:22
@dylanw-oss
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@dylanw-oss dylanw-oss requested a review from mhamilton723 April 4, 2023 07:00
@dylanw-oss dylanw-oss enabled auto-merge (squash) April 4, 2023 07:01
@mhamilton723 mhamilton723 disabled auto-merge April 4, 2023 12:48
@mhamilton723 mhamilton723 merged commit 13afff6 into microsoft:master Apr 4, 2023
4 participants