
refactor: Make hypotest return CLs as 0-d tensor #944

Merged: 19 commits merged into master on Aug 14, 2020
Conversation

matthewfeickert (Member) commented Jul 14, 2020

Description

Resolves #714

This results in the following behavior for the CLs values:

>>> import pyhf
>>> model = pyhf.simplemodels.hepdata_like(signal_data=[12.0, 11.0], bkg_data=[50.0, 52.0], bkg_uncerts=[3.0, 7.0])
>>> data = [51, 48] + model.config.auxdata
>>> test_mu = 1.0
>>> CLs_obs, CLs_exp = pyhf.infer.hypotest(test_mu, data, model, qtilde=True, return_expected=True)
>>> CLs_obs
array(0.05251554)
>>> CLs_exp
array(0.06445521)

The previous behavior was:

>>> import pyhf
>>> model = pyhf.simplemodels.hepdata_like(signal_data=[12.0, 11.0], bkg_data=[50.0, 52.0], bkg_uncerts=[3.0, 7.0])
>>> data = [51, 48] + model.config.auxdata
>>> test_mu = 1.0
>>> CLs_obs, CLs_exp = pyhf.infer.hypotest(test_mu, data, model, qtilde=True, return_expected=True)
>>> CLs_obs
array([0.05251554])
>>> CLs_exp
array([0.06445521])
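The practical difference between the two return shapes can be seen with plain NumPy (an illustration using hard-coded stand-in values, not pyhf code):

```python
import numpy as np

# Old return shape: a length-1 array; getting the scalar required indexing.
old_style = np.array([0.05251554])
print(old_style.shape)  # (1,)
print(old_style[0])     # 0.05251554

# New return shape: a 0-d array, which converts directly to a Python float.
new_style = np.array(0.05251554)
print(new_style.shape)  # ()
print(float(new_style)) # 0.05251554
```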

Checklist Before Requesting Reviewer

  • Tests are passing
  • "WIP" removed from the title of the pull request
  • Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

  • Summarize commit messages into a comprehensive review of the PR
* Have hypotest return the CLs as a 0-d tensor
   - Important for fully differential likelihoods
* Update the docs to reflect changes
* Update tests to use return type of 0-d tensor

@matthewfeickert matthewfeickert added docs Documentation related API Changes the public API labels Jul 14, 2020
@matthewfeickert matthewfeickert self-assigned this Jul 14, 2020
matthewfeickert (Member Author)

Still need to correct PyTorch's behavior of returning a Tensor.

@matthewfeickert matthewfeickert changed the title api: Change CLs to be scalar refactor: Change CLs to be scalar Jul 14, 2020
lukasheinrich (Contributor)

This will likely require a major version bump.

kratsg (Contributor) commented Jul 14, 2020

> this will likely require a major version bump

You want this to be v1.0.0? Or v0.5.0?

lukasheinrich (Contributor)

Meh, as long as we're still 0.X we can do minor bumps, I guess.

lukasheinrich (Contributor) commented Jul 17, 2020

As part of this, we probably need to fix this as well:

import pyhf
import numpy as np
import jax
import torch
import tensorflow as tf

print("numpy")
print(np.asarray(0.1))
pyhf.set_backend("numpy")
print(pyhf.tensorlib.astensor(0.1))

print("\njax")
print(jax.numpy.asarray(0.1))
pyhf.set_backend("jax")
print(pyhf.tensorlib.astensor(0.1))

print("\ntorch")
print(torch.tensor(0.1))
pyhf.set_backend("pytorch")
print(pyhf.tensorlib.astensor(0.1))

print("\ntensorflow")
print(tf.constant(0.1))
pyhf.set_backend("tensorflow")
print(pyhf.tensorlib.astensor(0.1))
Output:

numpy
0.1
[0.1]

jax
0.1
[0.1]

torch
tensor(0.1000)
tensor([0.1000])

tensorflow
tf.Tensor(0.1, shape=(), dtype=float32)
tf.Tensor([0.1], shape=(1,), dtype=float32)

matthewfeickert (Member Author) commented Jul 17, 2020

> As part of this we probably need to fix this as well

Ah, I guess so. I think we did this in the past to intentionally enforce uniformity across all backends, but they all behave the same now, so forcing the shape isn't helping.

import pyhf
import numpy as np
import torch

print("numpy")
example = np.asarray(0.1)
print(f"example {example} is a {type(example)} with shape {example.shape}")
pyhf.set_backend("numpy")
example = pyhf.tensorlib.astensor(0.1)
print(f"example {example} is a {type(example)} with shape {example.shape}")

print("\ntorch")
example = torch.tensor(0.1)
print(f"example {example} is a {type(example)} with shape {example.shape}")
pyhf.set_backend("pytorch")
example = pyhf.tensorlib.astensor(0.1)
print(f"example {example} is a {type(example)} with shape {example.shape}")
Output:

numpy
example 0.1 is a <class 'numpy.ndarray'> with shape ()
example [0.1] is a <class 'numpy.ndarray'> with shape (1,)

torch
example 0.10000000149011612 is a <class 'torch.Tensor'> with shape torch.Size([])
example tensor([0.1000]) is a <class 'torch.Tensor'> with shape torch.Size([1])

So we should probably just fix this first in a separate PR.
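A minimal sketch of the kind of fix being discussed, assuming the backends' `astensor` wraps the library's array constructor (hypothetical helper names and a NumPy-only sketch, not pyhf's actual implementation):

```python
import numpy as np

def astensor_old(data):
    # Old behavior: force at least one dimension, so scalars become shape (1,).
    return np.atleast_1d(np.asarray(data, dtype=np.float64))

def astensor_new(data):
    # Proposed behavior: preserve the input's dimensionality, so a scalar
    # stays a 0-d array, matching np.asarray / torch.tensor / tf.constant.
    return np.asarray(data, dtype=np.float64)

print(astensor_old(0.1).shape)    # (1,)
print(astensor_new(0.1).shape)    # ()
print(astensor_new([0.1]).shape)  # (1,) -- list input is unchanged
```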

matthewfeickert (Member Author)

@lukasheinrich @kratsg Related to Issue #974 and my last question in Issue #714: do we want this to return a 0-d tensor, or do we want a float?

@matthewfeickert matthewfeickert changed the title refactor: Change CLs to be scalar refactor: Make hypotest return CLs as 0-d tensor Aug 3, 2020
codecov bot commented Aug 3, 2020

Codecov Report

Merging #944 into master will decrease coverage by 0.00%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #944      +/-   ##
==========================================
- Coverage   96.70%   96.70%   -0.01%     
==========================================
  Files          59       59              
  Lines        3338     3337       -1     
  Branches      467      468       +1     
==========================================
- Hits         3228     3227       -1     
  Misses         69       69              
  Partials       41       41              
Flag Coverage Δ
#unittests 96.70% <100.00%> (-0.01%) ⬇️


Impacted Files Coverage Δ
src/pyhf/cli/infer.py 100.00% <ø> (ø)
src/pyhf/infer/__init__.py 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6e5b7bb...5cd8324.

matthewfeickert (Member Author) commented Aug 3, 2020

The current state of the PR results in the following:

import pyhf
model = pyhf.simplemodels.hepdata_like(
    signal_data=[12.0, 11.0], bkg_data=[50.0, 52.0], bkg_uncerts=[3.0, 7.0]
)
data = [51, 48] + model.config.auxdata
test_mu = 1.0
for backend in ["numpy", "jax", "tensorflow", "pytorch"]:
    pyhf.set_backend(backend)
    CLs_obs, CLs_exp = pyhf.infer.hypotest(
        test_mu, data, model, qtilde=True, return_expected=True
    )
    print(
        f"Observed: {CLs_obs} is of type {type(CLs_obs)}, Expected: {CLs_exp} is of type {type(CLs_exp)}"
    )

giving

Observed: 0.052515541856109765 is of type <class 'numpy.ndarray'>, Expected: 0.06445521290832805 is of type <class 'numpy.ndarray'>
Observed: 0.05251554103025729 is of type <class 'jax.interpreters.xla.DeviceArray'>, Expected: 0.06445520998614704 is of type <class 'jax.interpreters.xla.DeviceArray'>
Observed: 0.05251520499587059 is of type <class 'tensorflow.python.framework.ops.EagerTensor'>, Expected: 0.06445307284593582 is of type <class 'tensorflow.python.framework.ops.EagerTensor'>
Observed: 0.052520569413900375 is of type <class 'torch.Tensor'>, Expected: 0.06446529179811478 is of type <class 'torch.Tensor'>

This change in return type should probably result in a release, even though the public API tests still pass.
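A short NumPy illustration (stand-in values, not pyhf code) of why this counts as a breaking change for callers that indexed the old shape-(1,) result:

```python
import numpy as np

cls_obs = np.array(0.05251554)  # stand-in for the new 0-d hypotest result

# Code written against the old shape (1,) would have done cls_obs[0],
# which raises an IndexError on a 0-d array:
try:
    cls_obs[0]
except IndexError as err:
    print(f"IndexError: {err}")

# Backend-agnostic alternatives that work for the 0-d result:
print(float(cls_obs))  # 0.05251554
print(cls_obs.item())  # 0.05251554
```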

matthewfeickert (Member Author)

@phinate @WolfgangWaltenberger can you give feedback on whether this will be problematic?

phinate (Contributor) commented Aug 3, 2020

@matthewfeickert it’s gonna break neos temporarily, but it’s the smallest fix, so I say just go for it. I’m in the middle of a refactor, so I can add this as part of it.
(Well, I guess we’re actually using pyhf@diffable_json, so it won’t break until that gets rebased)

WolfgangWaltenberger

> @phinate @WolfgangWaltenberger can you give feedback on if this will be problematic?

If by "this" you mean the change of return types, then no, that's not a problem.

matthewfeickert (Member Author)

Thanks for the prompt feedback @phinate and @WolfgangWaltenberger. We just wanted to confirm that having the CLs values be 0-d tensors (shape ()) instead of shape (1,) tensors wouldn't be an issue, and that a 0-d tensor is still okay as compared to going all the way to a float.

matthewfeickert (Member Author) commented Aug 6, 2020

@alexander-held Will this affect cabinetry?

alexander-held (Member)

Hypothesis tests are not yet interfaced, so no impact on cabinetry.

@kratsg kratsg merged commit 6e6f7a1 into master Aug 14, 2020
@kratsg kratsg deleted the api/make-CLs-scalar branch August 14, 2020 11:25
Labels: API (Changes the public API), docs (Documentation related), tests (pytest)
Successfully merging this pull request may close: "API: change CLs to be scalar"
6 participants