
precomputes all modifications at once #269

Merged (24 commits) into master on Sep 20, 2018

Conversation

lukasheinrich (Contributor)

Description

This PR refactors the pdf to compute all modifications needed in one go. This should be possible independently of #231 and #251, but it helps the overall refactoring by collecting all necessary computations in a single function that can then be optimized.

Checklist Before Requesting Approval

  • Tests are passing
  • "WIP" removed from the title of the pull request

@coveralls

coveralls commented Sep 18, 2018

Coverage Status

Coverage decreased (-0.2%) to 98.214% when pulling 1c76044 on refactor/precompute_all_modifications into e03a39a on master.

@lukasheinrich (Contributor, Author)

lukasheinrich commented Sep 18, 2018

this is passing tests. the decrease in coverage is because this refactoring exposed that the

pdf.expected_sample(...)

method is not tested separately (it was only traversed as part of the other tests). so it should

  1. either be removed, or
  2. be tested explicitly as part of the public API (https://github.com/diana-hep/pyhf/pull/270)

maybe you guys can already have a look. I added some comments inline

@lukasheinrich (Contributor, Author)

to summarize: this PR refactors the expected_actualdata method such that all modifications are computed in one go

  • instead of computing the modification within expected_sample, we pre-compute it into a structure all_modifications (the expensive part)

  • the internal _expected_sample really only implements how these modifications and the nominal data get amalgamated into a single sample histogram

  • the all_modifications computation is performed per modifier type by _mtype_results, which will be useful in follow-up PRs where for some types we can replace the naive loops in _mtype_results with a fully vectorized calculation
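To make the structure concrete, here is a minimal runnable sketch of what the precomputed all_modifications mapping and the per-sample combination could look like (the channel/sample names and values are invented for illustration, and the exact layout is an assumption based on the description above):

```python
import numpy as np

# Hypothetical layout (an assumption, not the PR's exact code):
# channel name -> sample name ->
# (list of multiplicative factors, list of additive deltas),
# each a per-bin array aligned with the sample's nominal histogram.
all_modifications = {
    'signal_region': {
        'signal': (
            [np.array([1.1, 0.9])],    # factors, e.g. a normalization modifier
            [np.array([0.5, -0.5])],   # deltas, e.g. an additive shift
        ),
    },
}

nominal = np.array([10.0, 20.0])
factors, deltas = all_modifications['signal_region']['signal']

# combine: sum all deltas with the nominal, then apply all factors
expected = (nominal + np.sum(deltas, axis=0)) * np.prod(factors, axis=0)
```

With these toy numbers the combined sample histogram is [11.55, 17.55].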

pyhf/pdf.py Outdated
factors += basefactor

return tensorlib.product(tensorlib.stack(tensorlib.simple_broadcast(*factors)), axis=0)

lukasheinrich (Contributor, Author)

this just implements how the modifications get combined with the nominal data. Up to now we have 2 types of modifications: factors and shifts (deltas).

first we sum all deltas with the nominal and then apply all factors on that sum

Contributor

I don't follow your explanation here. "factor" is a very vague term. Are factors nominal*delta and deltas nominal + delta? In both cases, we still call them delta...

@lukasheinrich Sep 19, 2018

what this does is compute

summands = [delta1, delta2, ..., nominal]
basefactor = np.sum(summands, axis=0)   # elementwise sum across the list
factors = [f1, f2, f3, ..., basefactor]
np.prod(factors, axis=0)                # elementwise product across the list
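As a runnable numpy approximation of the pseudocode above (tensorlib.simple_broadcast / stack / product are replaced here by np.broadcast_arrays / np.stack / np.prod, which is an assumption about their semantics; the concrete values are invented):

```python
import numpy as np

delta1 = np.array([0.5, -0.5])
delta2 = np.array([0.1, 0.1])
nominal = np.array([10.0, 20.0])
f1 = np.array([1.1, 0.9])
f2 = np.array([2.0, 2.0])

# sum all deltas with the nominal ...
summands = [delta1, delta2, nominal]
basefactor = np.sum(summands, axis=0)   # elementwise, not a scalar sum

# ... then multiply all factors onto that sum
factors = [f1, f2, basefactor]
result = np.prod(np.stack(np.broadcast_arrays(*factors)), axis=0)
```

For these inputs the result is [23.32, 35.28], i.e. f1 * f2 * (delta1 + delta2 + nominal) per bin.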

but in a follow-up PR this entire function will be removed and expected_actualdata will look like this:

    all_modifications = self._all_modifications(pars)
    # start from identity fields over the full nominal-rate cube
    delta_field  = np.zeros(self.cube.shape)
    factor_field = np.ones(self.cube.shape)
    for cname, smods in all_modifications.items():
        for sname, (factors, deltas) in smods.items():
            # slice of the cube belonging to this channel/sample
            ind = self.hm[cname][sname]['index']
            for f in factors:
                factor_field[ind] = factor_field[ind] * f
            for d in deltas:
                delta_field[ind]  = delta_field[ind] + d
    # sum all deltas with the nominal, then apply all factors
    combined = factor_field * (delta_field + self.cube)
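A self-contained numpy sketch of that field-based approach, with self.cube, self.hm, and the modification values replaced by toy stand-ins (all names and numbers here are invented for illustration):

```python
import numpy as np

# toy "cube" of nominal rates: 2 samples x 2 bins
cube = np.array([[10.0, 20.0],
                 [30.0, 40.0]])

# index map: channel -> sample -> row index into the cube
hm = {'channel1': {'signal': {'index': 0}, 'background': {'index': 1}}}

# precomputed modifications: channel -> sample -> (factors, deltas)
all_modifications = {
    'channel1': {
        'signal': ([np.array([1.1, 0.9])], []),
        'background': ([], [np.array([0.5, -0.5])]),
    },
}

delta_field = np.zeros(cube.shape)    # additive field, identity 0
factor_field = np.ones(cube.shape)    # multiplicative field, identity 1
for cname, smods in all_modifications.items():
    for sname, (factors, deltas) in smods.items():
        ind = hm[cname][sname]['index']
        for f in factors:
            factor_field[ind] = factor_field[ind] * f
        for d in deltas:
            delta_field[ind] = delta_field[ind] + d

combined = factor_field * (delta_field + cube)
```

The signal row only gets scaled, the background row only gets shifted, and untouched bins pass through unchanged because the fields start from the identities 1 and 0.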

Contributor

Hrmm, the way the code is currently written is verrrrry opaque to me. It looks like you sum up a bunch of summands, and then multiply those summand results together. Especially because you do factors += basefactor. Can you do sum and products separately, and return the combination of them rather than doing internal bookkeeping?
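One way to realize that suggestion (a hypothetical restructuring, not code from this PR): keep the additive and multiplicative reductions in separate helpers and combine their results explicitly at the call site:

```python
import numpy as np

def sum_deltas(nominal, deltas):
    """Elementwise sum of the nominal rates and all additive shifts."""
    if not deltas:
        return nominal
    return nominal + np.sum(np.stack(deltas), axis=0)

def apply_factors(rates, factors):
    """Elementwise product of all multiplicative factors with the rates."""
    if not factors:
        return rates
    return rates * np.prod(np.stack(factors), axis=0)

nominal = np.array([10.0, 20.0])
shifted = sum_deltas(nominal, [np.array([0.5, -0.5])])
combined = apply_factors(shifted, [np.array([1.1, 0.9])])
```

This keeps the sum and the product as separate, individually testable steps instead of threading the intermediate basefactor through the factors list.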

@lukasheinrich (Contributor, Author)

rebased

@lukasheinrich changed the title from "[WIP] precomputes all modifications at once" to "precomputes all modifications at once" on Sep 19, 2018
@lukasheinrich (Contributor, Author)

this is ready for review I think

@lukasheinrich (Contributor, Author)

the next PR on top of this is then something along the lines of #273

@kratsg (Contributor) left a comment

It would be great to have all of these comments in the code itself.

@matthewfeickert (Member)

Reviewing this quickly, I think I agree with everything that @kratsg put down. So if those requests get resolved then I think this looks great. I'll review this again in more depth, but thanks very much for taking the time to put in review comments in advance @lukasheinrich, they were quite nice.

@lukasheinrich (Contributor, Author)

@kratsg PTAL

@lukasheinrich force-pushed the refactor/precompute_all_modifications branch from 920459f to d7aa4d5 on September 20, 2018 at 07:29
@lukasheinrich (Contributor, Author)

rebased

@lukasheinrich force-pushed the refactor/precompute_all_modifications branch from d7aa4d5 to 695555c on September 20, 2018 at 17:23
@kratsg kratsg merged commit 5436620 into master Sep 20, 2018
@kratsg kratsg deleted the refactor/precompute_all_modifications branch September 20, 2018 18:16