
Enforce consistent tensor shape for scalars #413

Merged · 17 commits · Mar 5, 2019

Conversation

@matthewfeickert (Member) commented Mar 2, 2019

Description

In mxnet >= v1.4.0, the shape of an NDArray constructed from a scalar is empty. As a result, this should be caught and the tensor forced to have a consistent shape by wrapping the scalar in a list. At this point, though, it is probably worth enforcing a default shape on tensors made from scalars, as the different tensor libraries handle them differently and we should present a consistent form.
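The wrapping itself is a one-line guard. A minimal sketch of the idea, using NumPy as a stand-in backend (the function name and signature here are illustrative, not pyhf's actual API):

```python
import numpy as np

def astensor(tensor_in, dtype=np.float64):
    """Illustrative sketch: coerce input to a tensor of non-empty shape.

    Scalars are wrapped in a list so that astensor(1.) and astensor([1.])
    both produce a tensor of shape (1,).
    """
    if np.ndim(tensor_in) == 0:  # catches Python scalars and 0-d arrays
        tensor_in = [tensor_in]
    return np.asarray(tensor_in, dtype=dtype)
```

With this guard, `astensor(1.)` and `astensor([1.])` both return an array of shape `(1,)`.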

Checklist Before Requesting Reviewer

  • Tests are passing
  • "WIP" removed from the title of the pull request
  • Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

  • Summarize commit messages into a comprehensive review of the PR
  • Ensure non-empty tensor shape for consistency
    - Do this by asking for forgiveness: attempt to access the shape information and, if that fails, modify the tensor object so that it has the proper shape
  • Remove the generic_len function from the backends, as a consistent tensor shape makes it unnecessary
  • Add error logging and raise a KeyError if an unsupported dtype is passed to the tensorlib's astensor method
  • Add tests for the tensor shape of scalars
  • Add tests for bad tensor dtypes
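The first and third bullets can be sketched together. Below is a hedged illustration of the "ask for forgiveness" shape access and the dtype lookup raising KeyError; the names and the dtype map are assumptions for illustration, not the exact pyhf implementation:

```python
import logging
import numpy as np

log = logging.getLogger(__name__)

# assumed mapping of string dtype names to backend dtypes
dtypemap = {"float": np.float64, "int": np.int64, "bool": np.bool_}

def astensor(tensor_in, dtype="float"):
    try:
        dtype = dtypemap[dtype]
    except KeyError:
        # log the error, then re-raise for the caller
        log.error("Invalid dtype: dtype=%s", dtype)
        raise
    tensor = np.asarray(tensor_in, dtype=dtype)
    try:
        tensor.shape[0]  # ask forgiveness: empty shape raises IndexError
    except IndexError:
        tensor = np.asarray([tensor_in], dtype=dtype)
    return tensor
```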

In mxnet >= v1.4.0 the shape of an NDArray when a scalar has been passed
to it is empty. As a result, this should be caught and forced to have a
consistent shape by wrapping the scalar in a list.
@matthewfeickert self-assigned this Mar 2, 2019
@coveralls commented Mar 2, 2019

Coverage Status

Coverage increased (+0.02%) to 97.538% when pulling 23d8e13 on fix/update-mxnet-return-values into 4fa45dd on master.

@kratsg (Contributor) commented Mar 2, 2019

We're missing coverage tests where we force the shape checks for the exceptions.

@matthewfeickert changed the title from "Ensure tensors in MXNet backend have consistent shape even with scalar" to "WIP: Ensure tensors in MXNet backend have consistent shape even with scalar" Mar 2, 2019
@matthewfeickert (Member, Author)

Actually, looking at this in more depth now, I think that the MXNet backend's simple_broadcast needs to be rewritten instead, with something like the generic_len of the TensorFlow and PyTorch backends.

@matthewfeickert (Member, Author) commented Mar 2, 2019

@kratsg @lukasheinrich Looking at this again, is there a specific reason that for our backends the astensor method doesn't automatically convert input scalars to lists, turning

pyhf.tensorlib.astensor(1.)

into the equivalent of

pyhf.tensorlib.astensor([1.])

such that they both return

that_libs_version_of_tensor([1.])

? I can't remember if there is a reason that we don't force this, beyond trying to keep parity with the underlying libraries themselves as

>>> np.array(1.)
array(1.)
>>> np.array([1.])
array([1.])

Another reason I am tempted to do this is that, given

We're missing coverage tests where we force the shape checks for the exceptions.

to do this properly we would want astensor to behave the same across all backends. MXNet now assumes that all objects with operations performed on them have nonzero shape, which means either the MXNet backend will need to apply additional checks everywhere, or astensor for all backends will turn scalar tensors into tensors of shape (1,).

This is perhaps more efficient(?)
The TensorFlow backend's astensor method already ensures that a tensor
has at least shape (1,)

Example:
>>> import pyhf
>>> import tensorflow as tf
>>> pyhf.set_backend(pyhf.tensor.tensorflow_backend(session=tf.Session()))
>>> pyhf.tensorlib.astensor(10)
<tf.Tensor 'Cast:0' shape=(1,) dtype=float32>
@lukasheinrich (Contributor)
Hi @matthewfeickert -- I think you are right. The point of the tensorlib is to act as a shim that unifies the interfaces across the different backends (as, annoyingly, they have small differences). In that sense, we should expect the same shapes for a given input shape (as well as decide on specific casts) and implement them accordingly.

@matthewfeickert changed the title from "WIP: Ensure tensors in MXNet backend have consistent shape even with scalar" to "WIP: Enforce consistent tensor shape for scalars" Mar 4, 2019
Instead of trying to catch an IndexError, just first check if the input
is a scalar and, if so, wrap it in a list.
Additionally, add tests for these types of bad tensor types.
@matthewfeickert changed the title from "WIP: Enforce consistent tensor shape for scalars" to "Enforce consistent tensor shape for scalars" Mar 4, 2019
(Resolved review thread on pyhf/tensor/mxnet_backend.py, now outdated)
@kratsg (Contributor) commented Mar 4, 2019

I think we shouldn't cast scalars to lists. There's nothing in our code that is returning scalars -- so it's a user problem if scalars are passed in...

@matthewfeickert (Member, Author) commented Mar 4, 2019

I think we shouldn't cast scalars to lists. There's nothing in our code that is returning scalars

So do you just want the offending tests to get changed then? The CI currently breaks without this because we don't enforce a consistent return structure.

@kratsg (Contributor) commented Mar 4, 2019

So do you just want the offending tests to get changed then? The CI currently breaks without this because we don't enforce a consistent return structure.

Maybe. I wonder what @lukasheinrich thinks. The problem is that I think for a given fit, we call astensor something like 10k times?

@matthewfeickert (Member, Author)

The problem is that I think for a given fit, we call astensor something like 10k times

Yeah, this is a very valid point, though we already have this in the TensorFlow backend

https://github.com/diana-hep/pyhf/blob/4fa45dda8d2314138549ac6b67ebdc9d9a506742/pyhf/tensor/tensorflow_backend.py#L110-L118

so do we know how much it gets slowed down?
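One way to answer the overhead question is a quick micro-benchmark of the scalar guard; this is a rough sketch (the guarded variant and call count are illustrative, not a measurement of pyhf itself):

```python
import timeit

import numpy as np

def astensor_plain(x):
    return np.asarray(x, dtype=np.float64)

def astensor_guarded(x):
    # extra scalar check that would run on every call
    if np.ndim(x) == 0:
        x = [x]
    return np.asarray(x, dtype=np.float64)

n = 10_000  # the order of astensor calls per fit mentioned above
t_plain = timeit.timeit(lambda: astensor_plain([1.0, 2.0]), number=n)
t_guarded = timeit.timeit(lambda: astensor_guarded([1.0, 2.0]), number=n)
print(f"plain: {t_plain:.4f}s  guarded: {t_guarded:.4f}s over {n} calls")
```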

Instead of doing any type checking, try to access different data members
and, if they don't exist, make the necessary corrections to the object
so that it is a tensor of non-empty shape.
If we have people trying to pass strings as tensors then there are
bigger problems afoot
This is more appropriate for the use case.

From the pytest docs:
Using pytest.raises is likely to be better for cases where you are
testing exceptions your own code is deliberately raising, whereas using
@pytest.mark.xfail with a check function is probably better for
something like documenting unfixed bugs (where the test describes what
"should" happen) or bugs in dependencies.

c.f.
https://docs.pytest.org/en/latest/assert.html#assertions-about-expected-exceptions
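A sketch of the pytest.raises pattern described above, using a stand-in dtype lookup (not the actual pyhf test suite or backend API):

```python
import pytest

def astensor_dtype(dtype):
    # stand-in for a backend dtype lookup that deliberately raises KeyError
    dtypemap = {"float": float, "int": int, "bool": bool}
    return dtypemap[dtype]  # raises KeyError for unsupported dtypes

def test_astensor_bad_dtype():
    # pytest.raises fits here: the KeyError is raised deliberately by our code
    with pytest.raises(KeyError):
        astensor_dtype("complex")
```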
@matthewfeickert requested a review from kratsg March 5, 2019 18:13
@matthewfeickert merged commit 1f6e3b0 into master Mar 5, 2019
@matthewfeickert deleted the fix/update-mxnet-return-values branch March 5, 2019 20:55
matthewfeickert added a commit that referenced this pull request Jul 25, 2020
* Allow for arbitrary shape tensors, including 0-dimensional ("empty") shape tensors (floats)
   - Reverts PR #413
* Enforce optimizers return objective function as a scalar to harmonize return type
* Revert tests that enforce minimum shape
* Add tests for return structure shape of pyhf.infer.mle.fit
* Add docstring examples for astensor
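After this revert, scalar inputs keep a 0-dimensional ("empty") shape rather than being promoted to shape (1,). In NumPy terms (used here only to illustrate the shape semantics):

```python
import numpy as np

scalar = np.asarray(1.0)    # 0-dimensional ("empty" shape) tensor
vector = np.asarray([1.0])  # 1-dimensional tensor

print(scalar.shape)  # ()
print(vector.shape)  # (1,)
```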