
feat: fixed_vals #846

Closed
wants to merge 34 commits into from

Conversation

cgottard
Contributor

@cgottard cgottard commented Apr 27, 2020

fixed_vals argument in infer functions

I propagated to the infer functions the capability of fixing selected parameters to constant values. The optimizer already implemented this functionality, so it was only a matter of interfacing with it.
I tested the changes by running both CLs calculations and MLE fits. The CI succeeds.

Note: before these changes it was already possible to perform an MLE fit with a list of fixed parameters passed via **kwargs. The same was not true for hypotest. I found that adding an explicit function argument in both cases was appropriate for such a common task.
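The mechanics of holding parameters fixed can be sketched in plain Python. This is an illustrative sketch only, not the pyhf implementation: `apply_fixed_vals` is a hypothetical helper, and `fixed_vals` follows the list-of-(index, value)-tuples format proposed in this PR.

```python
def apply_fixed_vals(init_pars, fixed_vals):
    """Pin selected parameters to constants and report which indices
    the optimizer is still free to vary.

    Illustrative sketch only -- not the actual pyhf code.
    fixed_vals: list of (parameter_index, constant_value) tuples.
    """
    pars = list(init_pars)
    fixed_idx = set()
    for idx, value in fixed_vals:
        pars[idx] = value
        fixed_idx.add(idx)
    free_idx = [i for i in range(len(pars)) if i not in fixed_idx]
    return pars, free_idx


# Fix the first of three parameters to 0.5; the other two stay free.
pars, free_idx = apply_fixed_vals([1.0, 1.0, 1.0], [(0, 0.5)])
# pars == [0.5, 1.0, 1.0]; free_idx == [1, 2]
```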

@kratsg, we discussed the use case for these changes via e-mail. Could you review this MR?

Checklist Before Requesting Reviewer

  • Tests are passing
  • "WIP" removed from the title of the pull request
  • Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

  • Summarize commit messages into a comprehensive review of the PR

@cgottard cgottard changed the title Fixed vals feat:fixed_vals Apr 27, 2020
@cgottard cgottard changed the title feat:fixed_vals feat: fixed_vals Apr 27, 2020
@matthewfeickert matthewfeickert added the feat/enhancement New feature or request label Apr 27, 2020
@matthewfeickert matthewfeickert added the API Changes the public API label Apr 27, 2020
@lukasheinrich
Contributor

Hi @cgottard, thanks a lot for this PR. It's something we have wanted to add for some time. I wonder whether we could streamline the API further. As it is, we do a lot of passing around of these items:

  • init values
  • bounds values
  • fixed_values

and perhaps we should instead be moving around pdfconfig objects:

sb_config = model.make_config()
sb_config.set_poival(1.0)

b_config = model.make_config()
b_config.set_poival(0.0)

pyhf.infer.hypotest(sb_config, data, pdf)

what do you think?

@@ -17,6 +24,7 @@ def hypotest(
pdf (~pyhf.pdf.Model): The HistFactory statistical model
init_pars (Array or Tensor): The initial parameter values to be used for minimization
par_bounds (Array or Tensor): The parameter value bounds to be used for minimization
fixed_vals (list of tuples): Parameters to be held constant and their value
Contributor

need to be careful, since this will be confusing given the existing poi_test argument.

Contributor Author

It can be confusing because fixed_poi_fit is not strictly necessary anymore. Anyway, in fixed_poi_fit the mu is fixed via fixed_vals, and if additional parameters are set constant they are appended to the list; see https://github.com/cgottard/pyhf/blob/fixed_vals/src/pyhf/infer/mle.py#L56
So there should be no problem nor ambiguity.
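The merge described above can be sketched in a few lines of plain Python. `combine_fixed_vals` is a hypothetical helper for illustration, not the actual pyhf code: the POI fix comes first, and any additional user-supplied (index, value) pairs are appended, so the two never conflict.

```python
def combine_fixed_vals(poi_index, poi_value, extra_fixed_vals=None):
    """Sketch of how a fixed-POI fit can merge the POI constraint with
    extra user-fixed parameters. Hypothetical helper, not pyhf itself."""
    fixed_vals = [(poi_index, poi_value)]  # mu is always fixed first
    if extra_fixed_vals:
        fixed_vals.extend(extra_fixed_vals)  # then any extra constants
    return fixed_vals


# POI at index 0 fixed to 1.0, plus one nuisance parameter fixed to 0.3
combined = combine_fixed_vals(0, 1.0, [(2, 0.3)])
# combined == [(0, 1.0), (2, 0.3)]
```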

@kratsg
Contributor

kratsg commented Apr 27, 2020

I think we should have this in for 0.6.0 because I don't think trying to keep the same API is a good idea.

@lukasheinrich
Contributor

@cgottard would you be up for shepherding a larger change to the API that enables this feature?

@cgottard
Contributor Author

Dear all,

thanks for the feedback. I was addressing @matthewfeickert's comment about the test, but locally pytest is not running as it should, so I only see the errors from the CI.

Anyway, I am ok with changing the API as @lukasheinrich suggested. I agree that it makes everything clearer and more transparent. In the meantime I think we can keep this MR open to discuss the progress, then we'll see if we want to close it and create a new one using a new feature branch name.

@kratsg
Contributor

kratsg commented Apr 27, 2020

Anyway, I am ok with changing the API as @lukasheinrich suggested. I agree that it makes everything clearer and more transparent. In the meantime I think we can keep this MR open to discuss the progress, then we'll see if we want to close it and create a new one using a new feature branch name.

I think this PR should go in close to how it is, but with tests and coverage, as part of 0.5.0 -> 0.6.0 (we don't want very large PRs), and you've pointed out this is a relatively quick change. The API remains backwards compatible here. Then we change the API in another PR. Very large PRs in general scare me.

@lukasheinrich
Contributor

lukasheinrich commented Apr 27, 2020 via email

@cgottard
Contributor Author

Ok, I'll ping you when this is ready. I implemented a new background uncertainty in test_backend_consistency, which is then fixed and shifted.
The syntax is correct, but I see AssertionErrors that I need to investigate. I'll fix my local pytest first, so as to avoid all these useless commits.

@cgottard
Contributor Author

cgottard commented May 8, 2020

Dear all,

I cloned the repo from master and ran test_backend_consistency.py locally. As it is, everything succeeds.
Then I changed the statistical model to match what I have in this MR.

Model
source = generate_source_static(n_bins)
bkg_unc_5pc_up = [x + 0.05 * x for x in source['bindata']['bkg']]
bkg_unc_5pc_dn = [x - 0.05 * x for x in source['bindata']['bkg']]
signal_sample = {
    'name': 'signal',
    'data': source['bindata']['sig'],
    'modifiers': [{'name': 'mu', 'type': 'normfactor', 'data': None}],
}

background_sample = {
    'name': 'background',
    'data': source['bindata']['bkg'],
    'modifiers': [
        {
            'name': 'uncorr_bkguncrt',
            'type': 'shapesys',
            'data': source['bindata']['bkgerr'],
        },
        {
            'name': 'norm_bkgunc',
            'type': 'histosys',
            'data': {'hi_data': bkg_unc_5pc_up, 'lo_data': bkg_unc_5pc_dn},
        },
    ],
}
and pytorch returns a result slightly above tolerance.

FAILED test_backend_consistency.py::test_hypotest_q_mu[normal-500_bins] - assert False
FAILED test_backend_consistency.py::test_hypotest_q_mu[inverted-500_bins] - assert False
because
(array([0.00000000e+00, 1.59306155e-03, 1.06477109e-02, 2.60919799e-06]) < 0.01).all()
(array([0.00000000e+00, 1.59260676e-03, 1.06472520e-02, 2.14421824e-06]) < 0.01).all()

Then, I updated my feature branch to master and ran the test with the fixed values:

FAILED tests/test_backend_consistency.py::test_hypotest_q_mu[none-normal-500_bins] - assert False
FAILED tests/test_backend_consistency.py::test_hypotest_q_mu[none-inverted-500_bins] - assert False
because

(array([0.00000000e+00, 1.59306155e-03, 1.06477109e-02, 2.60919799e-06]) < 0.01).all()
(array([0.00000000e+00, 1.59260676e-03, 1.06472520e-02, 2.14421824e-06]) < 0.01).all()

Discrepancies among tensors are gone, possibly because I am using a conda env correctly configured for CUDA and TF. And yes, I am running from my branch, as I can verify from the stdout and from the "[none-" in the test names.

Given the small excess over the 0.01 tolerance, shall we increase the tolerance to 1.5%?

EDIT: actually the tensors are also slightly above the 5 per mille tolerance for those two cases:
0.008959253804614598 < 0.005

EDIT 2: different results are obtained for the same backend if the fit is run on CPU or GPU

GPU - TEST PASSED
Fit with fixed pars [] BINS 500 QMU 3.936610076294528 ORDER True BACKEND <pyhf.tensor.jax_backend.jax_backend object at 0x7fbfda648dd0>
CHECK: Numpy difference [0.00000000e+00 1.59260676e-03 1.06472520e-02 2.14421824e-06]

CI - TEST FAILED
same test from the CI:
(array([0.00000000e+00, 4.86420689e-02, 1.06472520e-02, 7.35593908e-09]) < 0.015).all()

Can we open a ticket for this? The backend consistency has little to do with this MR, which simply propagates a function argument.
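The tolerance check in question can be reproduced with plain Python (the real test compares tensors; the numbers below are the per-backend differences quoted above for the 500-bin pytorch case, and the 0.01/0.015 thresholds are the current and proposed tolerances):

```python
# Per-backend differences quoted above for the 500-bin case
diff = [0.00000000e+00, 1.59306155e-03, 1.06477109e-02, 2.60919799e-06]

# The consistency test asserts that every element is below tolerance.
passes_1pc = all(d < 0.01 for d in diff)     # third element ~1.06e-02 fails
passes_1p5pc = all(d < 0.015 for d in diff)  # the proposed 1.5% would pass
```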

@lgtm-com

lgtm-com bot commented Jun 16, 2020

This pull request introduces 1 alert when merging f610990 into 94b87a8 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@lgtm-com

lgtm-com bot commented Jun 26, 2020

This pull request introduces 1 alert when merging 41d6b7f into cb4d37b - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@codecov

codecov bot commented Jul 2, 2020

Codecov Report

Merging #846 into master will decrease coverage by 0.35%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #846      +/-   ##
==========================================
- Coverage   96.64%   96.28%   -0.36%     
==========================================
  Files          59       56       -3     
  Lines        3279     3180      -99     
  Branches      454      438      -16     
==========================================
- Hits         3169     3062     -107     
- Misses         69       75       +6     
- Partials       41       43       +2     
Flag Coverage Δ
#unittests 96.28% <0.00%> (-0.36%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
src/pyhf/modifiers/normfactor.py 89.18% <0.00%> (-10.82%) ⬇️
src/pyhf/infer/calculators.py 97.95% <0.00%> (-2.05%) ⬇️
src/pyhf/tensor/jax_backend.py 94.16% <0.00%> (-1.56%) ⬇️
src/pyhf/__init__.py 97.87% <0.00%> (-0.72%) ⬇️
src/pyhf/tensor/pytorch_backend.py 98.00% <0.00%> (-0.04%) ⬇️
src/pyhf/pdf.py 95.83% <0.00%> (-0.02%) ⬇️
src/pyhf/cli/cli.py 100.00% <0.00%> (ø)
src/pyhf/cli/infer.py 100.00% <0.00%> (ø)
src/pyhf/infer/mle.py 100.00% <0.00%> (ø)
src/pyhf/cli/__init__.py 100.00% <0.00%> (ø)
... and 18 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 27f35e9...ccc9596. Read the comment docs.

@matthewfeickert
Member

matthewfeickert commented Aug 18, 2020

@cgottard Sorry to have left you hanging here on this. If you have time can you rebase this so that we can try to get this in for v0.5.2? If you don't one of the core devs can do it.

Edit: @kratsg mentioned that he already talked with you, so he'll take care of this PR and we'll shepherd it in. Thank you in advance for your contribution!

@kratsg
Contributor

kratsg commented Sep 4, 2020

Closing in favor of #1051. Thanks a lot @cgottard for the initial push to get this working. I cleaned it up a lot in the new PR and rebased it all.

Will be migrating the tests shortly.

@kratsg kratsg closed this Sep 4, 2020
@kratsg kratsg added the wontfix This will not be worked on label Sep 4, 2020
Labels
  • API (Changes the public API)
  • feat/enhancement (New feature or request)
  • wontfix (This will not be worked on)