Application of POIs using formulas #850
apologies for the slow response. the way to implement this is to add a new
I think in the constrained language of numexpr https://github.com/pydata/numexpr, and with the constraint that no new parameters are introduced, this would be quite doable. cc @alexander-held, who has also raised this issue.
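To make the "constrained language, no new parameters" idea concrete, here is a minimal sketch. It is not numexpr and not pyhf code: it uses Python's standard `ast` module as a stand-in, and the names `evaluate_formula`, `_BINOPS`, and `_FUNCS` are hypothetical. It only shows that a formula can be limited to basic arithmetic plus `sqrt`, and rejected when it refers to a parameter the model does not already have.

```python
import ast
import math
import operator

# Hypothetical whitelist: basic arithmetic and a single function.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_FUNCS = {"sqrt": math.sqrt}


def evaluate_formula(expr, params):
    """Evaluate `expr` using only names that already exist in `params`."""

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            # Reject formulas that would introduce a new parameter.
            if node.id not in params:
                raise KeyError(f"formula introduces unknown parameter {node.id!r}")
            return params[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -visit(node.operand)
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCS
            and len(node.args) == 1
            and not node.keywords
        ):
            return _FUNCS[node.func.id](visit(node.args[0]))
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    return visit(ast.parse(expr, mode="eval"))


# Illustrative parameter set with a single existing parameter "mu".
params = {"mu": 4.0}
factor = evaluate_formula("mu - sqrt(mu)", params)
```

A formula such as `"mu * nu"` raises `KeyError` here because `nu` is not an existing parameter. A real implementation would presumably evaluate on tensors rather than floats.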
Hi @lukasheinrich, yes, that would be great for more complex likelihood models. I will give it a try this weekend and, if I manage to implement your idea, will come back here to report. Thanks again!
I would suggest you orient yourself along the lines of the normfactor code https://scikit-hep.org/pyhf/_modules/pyhf/modifiers/normfactor.html#normfactor the output shape of a modifier is
here's the meaning of the output shape
So if you need to compute the effect of 3 formulas, have 10 samples in total (which might or might not be affected by the formula(s)), want to evaluate the formulas for 5 parameter settings, and the total number of bins is 20, you would have an output tensor of
Hi, I will likely take a stab at it since I need this for my analysis. (unless someone is already working on it?) |
Hi, just to add an example of what other libraries do: I imagine that in practical applications, a very limited number of operators will be enough for most fits: addition, subtraction, multiplication, division. Taking a power is convenient, but I guess non-integer powers are rarely needed, so multiplication is fine. Square roots are convenient too, but could be worked around by redefining normfactor->sqrt(normfactor) and then taking the power instead.
Ok, I am coming back here because I am starting a new project that could use this, and this time I really want to give pyhf a try. I would say that powers of the normfactor would already solve 99% of the cases I have in mind. The main application I have in mind is cases with interference (like theta^2 * S + theta * I + B) [think EFT, for instance]. Maybe we could just have an option in normfactor that sets the power with which the normfactor acts on the sample. Would that be a simple way to implement it?
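For illustration, the proposal above can be sketched in plain Python. Everything here is an assumption: the `"power"` key is a hypothetical option, not something pyhf's normfactor supports, and the template numbers are made up. Each sample is scaled by theta raised to its own power, which reproduces theta^2 * S + theta * I + B bin by bin.

```python
# Made-up templates; "power" is a hypothetical per-sample option.
samples = [
    {"name": "S", "data": [2.0, 4.0], "power": 2},
    {"name": "I", "data": [-1.0, 1.0], "power": 1},
    {"name": "B", "data": [10.0, 20.0], "power": 0},
]


def expected_yields(theta, samples):
    """Sum theta**power * template over all samples, bin by bin."""
    n_bins = len(samples[0]["data"])
    total = [0.0] * n_bins
    for sample in samples:
        scale = theta ** sample["power"]
        for i, value in enumerate(sample["data"]):
            total[i] += scale * value
    return total


yields = expected_yields(3.0, samples)
```

With theta = 3.0 the first bin is 9 * 2.0 + 3 * (-1.0) + 10.0 = 25.0. Setting theta = 0.0 leaves only the background, since the power-0 sample is always scaled by 1.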
To bring this issue up again, I see that @lukasheinrich added some "custom modifier" feature but I can't locate any documentation on how to use this to define my own modifier. Is this something users can currently interface with? We'd like to use something like this in ATLAS di-Higgs analyses, where our POI can be a coupling which has rather complicated interference effects and can't be described by any existing modifier. To give a concrete example, we want to implement something like yield = B + f(k)*S1 + g(k)*S2 + h(k)*S3 where k is our POI, f, g, and h are functions we define, and S1, S2, S3 are different "basis" signal distributions. Is something like this already supported in some way, or is this part of the open items remaining on this issue? |
Hi @lukasheinrich, @alexander-held, I wanted to bring this up again as it's starting to become a limiting factor in our analysis, and it would be really nice to avoid having to convert everything back to RooFit. Is there any documentation/testing/etc. that still needs to be done here? I also have a student who could potentially contribute to this (development, testing, etc.) if that's the limiting factor. Please let us know; it would be really great to get this last feature that would enable a pure pyhf fit structure for our analysis! Best wishes,
Yes, we're working on the same analysis :) |
Hi @lukasheinrich - The ideal scenario would be if we could implement f(k) etc. ourselves in python, but in practice we're dealing with polynomials here at the moment. That is, f(k) = a + bk + ck^2 + ... There's also a k-dependent overall normalization (which gets looked up from the theory calculation). In principle that can be handled with a separate There are no cross-bin relations that come in, the only dependence is on the POI. Looking further ahead, we'd also love to use this sort of functionality with multiple POIs. I know general support for that is still in the works, but the dream would be to have something like yield = B + f(k1, k2, k3)*S1 + g(k1, k2, k3)*S2 + ... |
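As a rough sketch of the polynomial case described above (plain Python, not pyhf code): each basis sample gets a multiplicative factor that depends only on the POI k, with no cross-bin dependence. The helper names `polyval` and `total_yield`, the coefficients, and the templates are all made up for illustration.

```python
def polyval(coeffs, k):
    """Evaluate a + b*k + c*k^2 + ... with coeffs = [a, b, c, ...] (Horner's rule)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * k + c
    return result


def total_yield(k, background, basis):
    """Return B + sum_j f_j(k) * S_j per bin; `basis` holds (coeffs, template) pairs."""
    total = list(background)
    for coeffs, template in basis:
        factor = polyval(coeffs, k)  # depends on the POI only, not on the bin
        for i, value in enumerate(template):
            total[i] += factor * value
    return total


# Made-up inputs: two bins, two basis signal templates.
background = [5.0, 5.0]
basis = [
    ([1.0, 0.0, 1.0], [1.0, 2.0]),  # f(k) = 1 + k^2 acting on S1
    ([0.0, 2.0], [3.0, 0.0]),       # g(k) = 2k acting on S2
]
yields = total_yield(2.0, background, basis)
```

At k = 2.0 this gives f = 5 and g = 4, so the first bin is 5 + 5 * 1 + 4 * 3 = 22. Extending to several POIs would mean replacing `polyval` with a function of (k1, k2, k3).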
I think if the general structure is a multiplicative modifier per-sample that only depends on the parameters
this would not cover yet e.g. custom functions that depend on the bin location or other bins e.g. |
As long as k1 + k2*3.5 + k3*k2*7.0 without any problem (in the existing In case it may be helpful, here's an (ATLAS-internal) documentation page describing the use of this in |
Thanks @alexander-held, so I guess that's compatible with my previous reply? Can you confirm?
@lukasheinrich I think so, yes. The |
(not sure why github is not auto-linking this thread here: #1627) |
It's pretty annoying, but GitHub won't show visual cross-references between Issues and Discussions. No idea why. 😞
Hi all, we're preparing our 0.7.0 release, and with it we are making some progress on this front. Check out the code here: https://gist.github.com/lukasheinrich/a7f23b71cf048727ca9012f5f6ea940a. @alexander-held has been testing this, and an initial look makes it seem like this would cover quite a bit of ground. It'd be interesting to get your opinion, @balunas @rafaellopesdesa.
This is migrated into a draft PR #1991. See |
Hi, I'm a master's student and for my thesis I'm working on the HWW off-shell analysis. In my case, I'd need modifiers that are functions of the POI, something like mu - sqrt(mu). Can this be done via the above implementation?
Yes this works, an example of that is also discussed in this issue: scikit-hep/cabinetry#382. |
Hi,
It seems that, right now, the modifiers are limited to addition or multiplication, with the POIs as normfactors defined by the simple modification POI*pdf. That's OK for most models, but some models require more complex likelihoods. The case that comes to my mind (need?) is EFT analyses, which have an interference term and a squared term and would therefore need something like sqrt(POI)*inter + POI*square.
I don't know exactly what the best way to implement that is. I think a simple way would be if one could define multiple POIs and define the value of a modifier as a function of other modifiers. Even simple functions like polynomials would already encompass most of the cases while keeping all operations very fast and efficient.
Let me know if there is anything I can do to help.
Thanks for considering.