Feature: tensorize the interpolation codes (more general) #251
Conversation
Just dumping here what we want to ultimately do. (The shapes can be more jagged, but for now we can assume they're as nice as we want them to be, e.g. MBJ is very regular.) Writing this in a jagged way reminded me again of awkward-array; maybe this is a good/interesting test case for a-a itself (if not, please ignore :-p @jpivarski).
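The jagged-shapes idea above can be sketched with plain numpy (the histogram values and shapes here are invented for illustration): pad each channel's histogram to a common bin count so the collection becomes a regular tensor that a backend can broadcast over.

```python
import numpy as np

# hypothetical jagged per-channel histograms; channels can have
# different numbers of bins
histograms = [[1.0, 2.0, 3.0], [4.0, 5.0]]

# pad each histogram to the maximum bin count so the set becomes a
# regular (n_channels, max_bins) array
max_bins = max(len(h) for h in histograms)
padded = np.array([h + [0.0] * (max_bins - len(h)) for h in histograms])
```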
This is now Issue #256.
To get in now a request/whinge that will come up later: can we please rename these? To be slightly more helpful than just complaining, an example of names (not saying they should be used) would be …
Note that the …
My latest stab at it; these should be equivalent. Tested with a notebook that generates random histograms with variable shapes (padded with …).
This is without ellipses, which gives …
For the record, here is a notebook on how we arrived at this highly vectorized code.
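As a rough illustration of what such a vectorized interpolation looks like (a sketch only, not the PR's actual code; the shapes and values below are invented), the multiplicative interpcode1-style case can be written with numpy broadcasting over a stacked (down, nominal, up) tensor:

```python
import numpy as np

# invented example: one systematic, two bins;
# axis 1 holds the (down, nominal, up) variations
histo = np.array([[[8.0, 18.0],
                   [10.0, 20.0],
                   [12.0, 22.0]]])
alphas = np.array([0.5])

down, nom, up = histo[:, 0], histo[:, 1], histo[:, 2]
# pick the up or down ratio per systematic, then exponentiate by |alpha|
ratios = np.where(alphas[:, None] >= 0, up / nom, down / nom)
result = nom * ratios ** np.abs(alphas[:, None])
```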
Yup, that was clear from the presence of a … My point is that the name "kitchensink" conveys that "this does everything", but that conveys no useful information about what they do. If the idea is that the … To be clear, this isn't pressing. This is just something I have unnecessarily strong and verbose opinions on (that I'd like to think are loosely held enough to be swayed). :P
Yeah, I'll rename these functions more clearly in a bit.
From taking a quick scan over the notebook, it is going to be super helpful. 👍 Super great! :)
Currently, the tests that fail on pytorch/tensorflow are due to issues with single float precision versus double float precision versus what python does. The interpcode1 is fine (non-linear, …)
Wow, this is both a massive and impressive PR! Super job to both of you guys! 👍 I don't think I have any changes to request, but what do we do with regard to that^?
self.at_plus_one[channel['name']][sample['name']]
            ]
        ]
    ]), tensorlib.astensor([tensorlib.tolist(pars)])
Why do we call astensor(tolist(..)) here? This should be a no-op (if it's not, it's an issue we should fix).
This gets factored out soon enough. It's a bandage for this inconsistency:

```python
pars = [1.0]
np.array([pars])  == [[1.0]]  # numpy and tensorflow keep the extra nesting
tf.Tensor([pars]) == [[1.0]]
torch([pars])     == [1.0]    # pytorch flattens it
```
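For a backend with consistent wrapping behavior the round-trip really is a no-op, which is easy to check with numpy standing in for the backend (a minimal sketch of the point being made above):

```python
import numpy as np

# numpy analogue of the astensor(tolist(...)) round-trip discussed above;
# for a consistently wrapping backend this changes nothing
pars = np.asarray([[1.0]])
roundtrip = np.asarray(np.asarray(pars).tolist())
```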
self.at_plus_one[channel['name']][sample['name']]
            ]
        ]
    ]), tensorlib.astensor([tensorlib.tolist(pars)])
ditto
This gets factored out soon enough. It's a bandage for this inconsistency:

```python
pars = [1.0]
np.array([pars])  == [[1.0]]  # numpy and tensorflow keep the extra nesting
tf.Tensor([pars]) == [[1.0]]
torch([pars])     == [1.0]    # pytorch flattens it
```
There are a couple of ways we can go. The main issue we're seeing is that when we calculate the results of the slow interpolation (in vanilla python) and compare against the backend, we're comparing python's double/arbitrary float precision against the single precision of the pytorch/tensorflow backends.
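The precision gap described above is easy to reproduce in numpy alone (a minimal sketch; the specific numbers are arbitrary): the same computation in float32 and float64 agrees only to roughly single-precision tolerance.

```python
import numpy as np

# same non-linear computation in double vs single precision
a64 = np.float64(1.1) ** np.float64(2.5)
a32 = np.float32(1.1) ** np.float32(2.5)

# the two agree to ~1e-7 relative (single precision),
# but not to double-precision tolerances
```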
Once we get this merged and #269 merged, we can define CombinedInterpolators. I'm attaching a notebook that adheres to the signature defined in #269 and uses the interpolators in this PR. It would be very good though if we could get to the bottom of the discrepancy first. I'd be fine if we at least know a setting in which the two are very close together (as @kratsg suggested, e.g. by forcing single precision in numpy) so that we are confident that we understand where it is coming from. This is the timing: a 12-15x speed-up, and the interpolation is now the fastest part. The slowness is in the extraction of the results into the python structures, which are not as deeply nested and can be optimized later (tested on MBJ).
@matthewfeickert @lukasheinrich The last commit here 4e23366 represents my perspective on this:
The goal of the slow interpolation code (effectively python-only loops) is to make sure that the fast interpolation code (tensorized form) returns the same results. Therefore, the slow interpolation code should be reduced to single-float precision in situations where we only have single-float precision. This will make the tests pass. We can file an issue later for a feature to configure the numpy backend as single-float precision, which simply means we can pass in …
Given the description here I'm in agreement. So LGTM.
I see @kratsg. So do I understand correctly that we did not need to loosen any tolerances, and that forcing the slow implementation to use single precision is sufficient to make the tests pass?
Yes!
Description
The overarching theme for this PR is to get the interpolation codes to be tensorizable. This is one of the small pieces needed for #231 and we'll be doing this through a series of smaller PRs to break this up a bit more.
Related: #231, #246, #219.
A notebook is added describing the process of tensorizing these functions to aid with future understanding.
Additionally, I've added a (first draft of a) neat decorator in utils.py that tensorizes the inputs to the interpolator functions if they're not tensorized.

- add tile() to backends (done by Fix concatenate and add missing functionality to backends: #262)
- add einsum() to backends (done by Fix concatenate and add missing functionality to backends: #262)

Checklist Before Requesting Approver
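A hedged sketch of what such an input-tensorizing decorator could look like (assumed design, not the actual utils.py code; the names tensorize_inputs and scale are invented, and numpy stands in for the backend's astensor):

```python
import functools
import numpy as np

def tensorize_inputs(func):
    """Coerce plain-list arguments to arrays before calling the
    wrapped interpolator (hypothetical sketch of the decorator)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        coerced = [np.asarray(a) if isinstance(a, list) else a
                   for a in args]
        return func(*coerced, **kwargs)
    return wrapper

@tensorize_inputs
def scale(values, factor):
    # relies on array broadcasting, so plain lists must be coerced first
    return values * factor

result = scale([1.0, 2.0], 3.0)
```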