
precomputes all modifications at once #269

Merged (24 commits) into master on Sep 20, 2018

Conversation

lukasheinrich (Contributor)

Description

This PR refactors the pdf to compute all modifications needed in one go. This should be possible independently of #231 and #251, but it helps the overall refactoring by collecting all necessary computations in a single function that can then be optimized.

Checklist Before Requesting Approval

  • Tests are passing
  • "WIP" removed from the title of the pull request

@coveralls

coveralls commented Sep 18, 2018

Coverage Status

Coverage decreased (-0.2%) to 98.214% when pulling 1c76044 on refactor/precompute_all_modifications into e03a39a on master.

@lukasheinrich (Contributor, Author)

lukasheinrich commented Sep 18, 2018

this is passing tests. the decrease in coverage is because this refactoring exposed that the

pdf.expected_sample(...)

method is not tested separately (it was only traversed as part of the other tests). so it should

  1. either be removed, or
  2. be tested explicitly as part of the public API (https://github.com/diana-hep/pyhf/pull/270)

maybe you guys can already have a look. I added some comments inline

@lukasheinrich (Contributor, Author)

to summarize: this PR refactors the expected_actualdata method such that all modifications are computed in one go

  • instead of computing the modification within expected_sample, we pre-compute it into a structure all_modifications (the expensive part)

  • the internal _expected_sample really only implements how these modifications and the nominal data get amalgamated into a single sample histogram

  • the all_modifications computation is performed per modifier type by _mtype_results, which will be useful in follow-up PRs where for some types we can replace the naive loops in _mtype_results with a fully vectorized calculation
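To make the structure concrete, here is a minimal runnable sketch of what the precomputed all_modifications mapping and the per-sample combination could look like (the channel/sample names and values are invented for illustration, and the exact layout is an assumption based on the description above):

```python
import numpy as np

# Hypothetical layout (an assumption, not the PR's exact code):
# channel name -> sample name ->
# (list of multiplicative factors, list of additive deltas),
# each a per-bin array aligned with the sample's nominal histogram.
all_modifications = {
    'signal_region': {
        'signal': (
            [np.array([1.1, 0.9])],    # factors, e.g. a normalization modifier
            [np.array([0.5, -0.5])],   # deltas, e.g. an additive shift
        ),
    },
}

nominal = np.array([10.0, 20.0])
factors, deltas = all_modifications['signal_region']['signal']

# combine: sum all deltas with the nominal, then apply all factors
expected = (nominal + np.sum(deltas, axis=0)) * np.prod(factors, axis=0)
```

With these toy numbers the combined sample histogram is [11.55, 17.55].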

pyhf/pdf.py Outdated
factors += basefactor

return tensorlib.product(tensorlib.stack(tensorlib.simple_broadcast(*factors)), axis=0)

lukasheinrich (Contributor, Author)

this just implements how the modifications get combined with the nominal data. Up to now we have 2 types of modifications: factors and shifts (deltas).

first we sum all deltas with the nominal and then apply all factors on that sum

Contributor

I don't follow your explanation here. "factor" is a very vague term. Are factors nominal*delta and deltas nominal + delta? In both cases, we still call them delta...

@lukasheinrich Sep 19, 2018

what this does is compute

summands = [delta1, delta2, ..., nominal]
basefactor = np.sum(summands, axis=0)   # elementwise sum across the list
factors = [f1, f2, f3, ..., basefactor]
np.prod(factors, axis=0)                # elementwise product across the list
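As a runnable numpy approximation of the pseudocode above (tensorlib.simple_broadcast / stack / product are replaced here by np.broadcast_arrays / np.stack / np.prod, which is an assumption about their semantics; the concrete values are invented):

```python
import numpy as np

delta1 = np.array([0.5, -0.5])
delta2 = np.array([0.1, 0.1])
nominal = np.array([10.0, 20.0])
f1 = np.array([1.1, 0.9])
f2 = np.array([2.0, 2.0])

# sum all deltas with the nominal ...
summands = [delta1, delta2, nominal]
basefactor = np.sum(summands, axis=0)   # elementwise, not a scalar sum

# ... then multiply all factors onto that sum
factors = [f1, f2, basefactor]
result = np.prod(np.stack(np.broadcast_arrays(*factors)), axis=0)
```

For these inputs the result is [23.32, 35.28], i.e. f1 * f2 * (delta1 + delta2 + nominal) per bin.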

but in a follow-up PR this entire function will be removed and expected_actualdata will look like this:

    all_modifications = self._all_modifications(pars)
    # start from identity fields over the full nominal-rate cube
    delta_field  = np.zeros(self.cube.shape)
    factor_field = np.ones(self.cube.shape)
    for cname, smods in all_modifications.items():
        for sname, (factors, deltas) in smods.items():
            # slice of the cube belonging to this channel/sample
            ind = self.hm[cname][sname]['index']
            for f in factors:
                factor_field[ind] = factor_field[ind] * f
            for d in deltas:
                delta_field[ind]  = delta_field[ind] + d
    # sum all deltas with the nominal, then apply all factors
    combined = factor_field * (delta_field + self.cube)
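A self-contained numpy sketch of that field-based approach, with self.cube, self.hm, and the modification values replaced by toy stand-ins (all names and numbers here are invented for illustration):

```python
import numpy as np

# toy "cube" of nominal rates: 2 samples x 2 bins
cube = np.array([[10.0, 20.0],
                 [30.0, 40.0]])

# index map: channel -> sample -> row index into the cube
hm = {'channel1': {'signal': {'index': 0}, 'background': {'index': 1}}}

# precomputed modifications: channel -> sample -> (factors, deltas)
all_modifications = {
    'channel1': {
        'signal': ([np.array([1.1, 0.9])], []),
        'background': ([], [np.array([0.5, -0.5])]),
    },
}

delta_field = np.zeros(cube.shape)    # additive field, identity 0
factor_field = np.ones(cube.shape)    # multiplicative field, identity 1
for cname, smods in all_modifications.items():
    for sname, (factors, deltas) in smods.items():
        ind = hm[cname][sname]['index']
        for f in factors:
            factor_field[ind] = factor_field[ind] * f
        for d in deltas:
            delta_field[ind] = delta_field[ind] + d

combined = factor_field * (delta_field + cube)
```

The signal row only gets scaled, the background row only gets shifted, and untouched bins pass through unchanged because the fields start from the identities 1 and 0.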

Contributor

Hrmm, the way the code is currently written is verrrrry opaque to me. It looks like you sum up a bunch of summands, and then multiply those summand results together. Especially because you do factors += basefactor. Can you do sum and products separately, and return the combination of them rather than doing internal bookkeeping?
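One way to realize that suggestion (a hypothetical restructuring, not code from this PR): keep the additive and multiplicative reductions in separate helpers and combine their results explicitly at the call site:

```python
import numpy as np

def sum_deltas(nominal, deltas):
    """Elementwise sum of the nominal rates and all additive shifts."""
    if not deltas:
        return nominal
    return nominal + np.sum(np.stack(deltas), axis=0)

def apply_factors(rates, factors):
    """Elementwise product of all multiplicative factors with the rates."""
    if not factors:
        return rates
    return rates * np.prod(np.stack(factors), axis=0)

nominal = np.array([10.0, 20.0])
shifted = sum_deltas(nominal, [np.array([0.5, -0.5])])
combined = apply_factors(shifted, [np.array([1.1, 0.9])])
```

This keeps the sum and the product as separate, individually testable steps instead of threading the intermediate basefactor through the factors list.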

@lukasheinrich (Contributor, Author)

rebased

@lukasheinrich changed the title from "[WIP] precomputes all modifications at once" to "precomputes all modifications at once" on Sep 19, 2018
@lukasheinrich (Contributor, Author)

this is ready for review I think

@lukasheinrich (Contributor, Author)

the next PR on top of this is then something along the lines of #273

@kratsg (Contributor) left a comment

It would be great to have all of these comments in the code itself.

@matthewfeickert (Member)

Reviewing this quickly, I think I agree with everything that @kratsg put down. So if those requests get resolved then I think this looks great. I'll review this again in more depth, but thanks very much for taking the time to put in review comments in advance @lukasheinrich, they were quite nice.

@lukasheinrich (Contributor, Author)

@kratsg PTAL

@lukasheinrich force-pushed the refactor/precompute_all_modifications branch from 920459f to d7aa4d5 on September 20, 2018 at 07:29
@lukasheinrich (Contributor, Author)

rebased

@lukasheinrich force-pushed the refactor/precompute_all_modifications branch from d7aa4d5 to 695555c on September 20, 2018 at 17:23
@kratsg kratsg merged commit 5436620 into master Sep 20, 2018
@kratsg kratsg deleted the refactor/precompute_all_modifications branch September 20, 2018 18:16