Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Input/Output Feature Schema Refactor #2147

Merged
merged 31 commits into from
Jul 6, 2022
Merged

Conversation

connor-mccorm
Copy link
Contributor

Adds Marshmallow schema support for all input and output features and their corresponding preprocessing, encoders, and decoders.

@github-actions
Copy link

github-actions bot commented Jun 14, 2022

Unit Test Results

       6 files  ±  0         6 suites  ±0   2h 24m 3s ⏱️ + 2m 7s
2 919 tests +  4  2 873 ✔️ +  4    46 💤 ±0  0 ±0 
8 757 runs  +12  8 615 ✔️ +12  142 💤 ±0  0 ±0 

Results for commit cd43485. ± Comparison against base commit dfdc98c.

♻️ This comment has been updated with latest results.

@connor-mccorm connor-mccorm force-pushed the schema_input_features branch 7 times, most recently from 6354470 to a937f5e Compare June 18, 2022 01:25
@connor-mccorm
Copy link
Contributor Author

Most tests are passing, there is one case where I need to add a regex pattern optional argument for the string input type to handle custom fill value restrictions, but once I get that done, we should be good to go.

@connor-mccorm connor-mccorm force-pushed the schema_input_features branch 7 times, most recently from 273f126 to 402f19f Compare June 22, 2022 02:22
@connor-mccorm connor-mccorm marked this pull request as ready for review June 22, 2022 03:25
Copy link
Contributor

@martindavis martindavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @connor-mccorm , overall excellent job making the Schema docs very clean and clear in Ludwig! I've got some questions, but I don't see anything blocking this from my perspective.

ludwig/schema/preprocessing.py Outdated Show resolved Hide resolved
ludwig/schema/preprocessing.py Outdated Show resolved Hide resolved
ludwig/schema/preprocessing.py Outdated Show resolved Hide resolved
ludwig/schema/preprocessing.py Show resolved Hide resolved
ludwig/schema/preprocessing.py Show resolved Hide resolved
ludwig/schema/features/text_feature.py Show resolved Hide resolved
@connor-mccorm connor-mccorm force-pushed the schema_input_features branch from f9d0480 to 864b8d0 Compare June 22, 2022 23:51
ludwig/schema/utils.py Outdated Show resolved Hide resolved
@connor-mccorm connor-mccorm force-pushed the schema_input_features branch 2 times, most recently from c7a405d to b3b64f8 Compare June 23, 2022 17:09
Copy link
Collaborator

@tgaddair tgaddair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plan to wire this into the schema validation logic in a follow-up?

ludwig/schema/utils.py Outdated Show resolved Hide resolved
@connor-mccorm connor-mccorm force-pushed the schema_input_features branch from 13f4531 to d023a10 Compare June 27, 2022 17:47
@connor-mccorm
Copy link
Contributor Author

So I believe the schema validation logic will be using all this code now when this lands through getting each input feature marshmallow class via a get_schema_cls() method in each input feature type. The main caveat here is that the defaults for input feature preprocessing are still being set with the preprocessing_defaults() method of each input feature type class since it looked like there we're a handful of dependencies that would need refactoring. So I plan on fixing that in a followup PR asap. Does that sound ok to you @tgaddair?

@tgaddair
Copy link
Collaborator

Sounds good @connor-mccorm !

@connor-mccorm connor-mccorm force-pushed the schema_input_features branch from d42ac68 to d578e13 Compare June 28, 2022 20:50
@connor-mccorm connor-mccorm force-pushed the schema_input_features branch from d636ebe to 4192523 Compare July 6, 2022 21:04
@connor-mccorm connor-mccorm force-pushed the schema_input_features branch from cf71350 to 5affc40 Compare July 6, 2022 21:47
@connor-mccorm connor-mccorm merged commit 6909ae1 into master Jul 6, 2022
@connor-mccorm connor-mccorm deleted the schema_input_features branch July 6, 2022 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants