Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node: id hash_value UnshareSubtrees Deep+ShallowClone ModifyValues #1128

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

graehl
Copy link

@graehl graehl commented Aug 30, 2022

At the cost of exposing some existing detail, enable efficient
unsharing of graph structure (DeepClone), and efficient ShallowClone.

Modify[Key]Values allows in-place modification (or deletion) of
node_data in a Map. (the Key is not modified but provided so the
modification of the value may depend on the key)

UnshareSubtrees for in-place DeepClone copying of
shared subtrees.

id and hash_value allow creating efficient input Node -> output Node
transformations where the result of a previously transformed subtree
may be reused (allowing linear time creation of output trees with
shared subtrees instead of exponential by

Rationale: although YAML::Node faithfully represents the input
document and allows efficient traversal, using it as a large-scale
configuration / AST means that it's easy to unintentionally create
quadratic or worse algorithms modifying the tree.

At the cost of exposing some existing detail, enable efficient
unsharing of graph structure (DeepClone), and efficient ShallowClone.

Modify[Key]Values allows in-place modification (or deletion) of
node_data in a Map. (the Key is not modified but provided so the
modification of the value may depend on the key)

UnshareSubtrees for in-place DeepClone copying of
shared subtrees.

id and hash_value allow creating efficient input Node -> output Node
transformations where the result of a previously transformed subtree
may be reused (allowing linear time creation of output trees with
shared subtrees instead of exponential by

Rationale: although YAML::Node faithfully represents the input
document and allows efficient traversal, using it as a large-scale
configuration / AST means that it's easy to unintentionally create
quadratic or worse algorithms modifying the tree.
@graehl
Copy link
Author

graehl commented Aug 30, 2022

Several of these changes can be made separate pull requests if desired. Tests passed locally and we have our fork is in production successfully. If there's only some subset that you'd like to maintain, we're prepared to keep the fork going, but ideally this work may be useful to others. Also happy to add docs.

@jbeder
Copy link
Owner

jbeder commented Sep 20, 2022

This is very interesting, thanks for the PR.

A couple high level-comments that we should work out before going into detail.

  1. It's worth being explicit about the problem. Could you give an example (of the bad behavior) in a few lines of code? Or maybe create a github gist with several examples?

  2. I'd prefer a factory pattern rather than constructors-with-tags. E.g. Node ShallowClone(Node) kinda thing.

  3. The API exposes too many internal details. The Node class in particular; I'd rather not add any unnecessary public methods. Can you rework it so that's the case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants