Warning when using Gamma distribution #609

Open
kirimaru-jp opened this issue May 2, 2022 · 4 comments

kirimaru-jp commented May 2, 2022

I'd like to fit GLMMs using MixedModels.jl, assuming that the response variable follows a continuous distribution. To test this, I followed an example here, using the newest version, 4.6.2:

using RCall, MixedModels, GLM

lexdec = rcopy(R"languageR::lexdec");

lexdec.rt_raw = exp.(lexdec.RT);

f = @formula(rt_raw ~ 1 + Class*NativeLanguage + (1|Subject) + (1|Word));

lexdec_glmm_gamma = fit!(GeneralizedLinearMixedModel(f, lexdec, Gamma(), IdentityLink()), fast=true)

It showed a warning:

┌ Warning: Results for families with a dispersion parameter are not reliable.
│ It is best to avoid trying to fit such models in MixedModels until
│ the authors gain a better understanding of those cases

I checked the code, and as I understand it, the warning is shown whenever the distribution is not one of Bernoulli, Binomial, or Poisson.

Is this an issue specific to MixedModels.jl, or a theoretical problem with GLMMs?
Does lme4 in R have the same trouble?
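
The condition described above can be illustrated with a minimal sketch. This is not the MixedModels.jl source: `has_fixed_dispersion` is a hypothetical helper name, and the only assumption is that Distributions.jl is installed. It simply classifies a family by whether it is one of the three that do not trigger the warning.

```julia
using Distributions

# Hypothetical helper (not part of MixedModels.jl): true for the families
# whose dispersion parameter is fixed at 1, i.e. the ones that fit without a warning.
has_fixed_dispersion(d::Distribution) = d isa Union{Bernoulli,Binomial,Poisson}

for d in (Bernoulli(), Binomial(), Poisson(), Gamma(), InverseGaussian(), Normal())
    status = has_fixed_dispersion(d) ? "no warning" : "warning"
    println(nameof(typeof(d)), " => ", status)
end
```

Under this sketch, `Gamma()`, `InverseGaussian()`, and `Normal()` all fall on the "warning" side, which matches the message about families with a dispersion parameter.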

kirimaru-jp changed the title from "Warning when using" to "Warning when using Gamma distribution" on May 2, 2022

palday (Member) commented May 2, 2022

There is a theoretical difficulty which makes it hard to implement in MixedModels.jl. lme4 can handle the gamma family, but note a few things:

  1. There are numerical and theoretical reasons why I've been skeptical of the Lo and Andrews paper that seems to have increased the popularity of gamma models, at least in some areas. You can see this skepticism if you look through the list archives of R-SIG-mixed-models.
  2. lme4 and MixedModels.jl share contributors, including a particular core contributor, so it's not ignorance on the developers' part. There are some structural differences that make it difficult to port the lme4 solution directly to MixedModels.jl (some of these differences are actually things that make Julia better, in our opinion, but they mean that there's not a direct correspondence between all the details in lme4 and MixedModels.jl).
  3. The problem is with the dispersion parameter, not with the gamma distribution in particular. Gaussian with a non-identity link, gamma, and inverse Gaussian all have this dispersion parameter. There is a bit of relevant explanation as part of a larger thread on the Julia discourse.
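
For readers unfamiliar with the term, the dispersion parameter in point 3 can be summarized with the standard GLM variance relation. This is textbook background rather than anything specific to MixedModels.jl; the symbols $\phi$ (dispersion), $V(\mu)$ (variance function), $\alpha$ (gamma shape), and $\lambda$ (inverse Gaussian shape) are notation introduced here for illustration.

```latex
\operatorname{Var}(Y_i) = \phi \, V(\mu_i), \qquad
\begin{array}{lll}
\text{Bernoulli:}        & V(\mu) = \mu(1-\mu), & \phi = 1 \\
\text{Poisson:}          & V(\mu) = \mu,        & \phi = 1 \\
\text{Gaussian:}         & V(\mu) = 1,          & \phi = \sigma^2 \\
\text{gamma:}            & V(\mu) = \mu^2,      & \phi = 1/\alpha \\
\text{inverse Gaussian:} & V(\mu) = \mu^3,      & \phi = 1/\lambda
\end{array}
```

In the first two families $\phi$ is known in advance, while in the last three it is an extra unknown that has to be estimated along with the other parameters.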

kirimaru-jp (Author) commented May 3, 2022

Thank you for the response!

I have a big dataset, hence Julia should be a good choice.

I guess lme4pureR is equivalent to lme4 when nAGQ = 1?
Since lme4pureR seems easier to port to Julia than lme4, I would like to give it a try:
https://github.com/lme4/lme4pureR/blob/master/R/pirls.R

Is there any specific issue of lme4pureR that I should consider when porting to Julia?

palday (Member) commented May 4, 2022

@kirimaru-jp The fact that it's GPL'd and so any derivative work must be GPL'd, but MixedModels.jl is MIT-licensed, may pose a problem. It looks like @stevencarlislewalker is the only contributor to that particular file, so he may be willing to dual-license that one file to include the MIT license.

Both lme4 and MixedModels.jl have an nAGQ argument; the biggest difference is that lme4 re-uses nAGQ=0 to do a slightly different optimization, while MixedModels.jl has fast=true for that case.

I'm not sure that PIRLS is where the biggest problem is for the dispersion parameter. My suspicion is that there is an incorrect computation in the deviance (note that in the current formulation, the dispersion parameter doesn't play a role in the computation of the GLMM deviance, which seems ... wrong). Besides the licensing difficulties, one of the challenges in porting things from lme4 to MixedModels.jl is that the deviance computation is done very differently, e.g. in R aic is used with a family argument to compute the deviance, but in Julia it's exactly the other way around (because AIC is ultimately defined in terms of -2 log likelihood, which is the deviance, up to an additive constant).
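
To make the concern about the dispersion parameter concrete, here is a sketch for an ordinary gamma GLM, ignoring the random effects and the integration over them that a GLMM adds. Following the usage above, "deviance" means $-2\ell$; $D(y;\mu)$ is the classical GLM deviance, $\alpha = 1/\phi$ is the gamma shape, and $k$ is the number of estimated parameters. These symbols are notation introduced here, not taken from either package.

```latex
\begin{aligned}
\mathrm{AIC} &= -2\,\ell(\hat\theta) + 2k, \\
D(y;\mu) &= 2 \sum_i \left[ \frac{y_i - \mu_i}{\mu_i} - \log\frac{y_i}{\mu_i} \right], \\
-2\,\ell(\mu,\phi) &= \frac{D(y;\mu)}{\phi}
  + 2 \sum_i \left[ \log\Gamma(\alpha) - \alpha\log\alpha + \alpha + \log y_i \right],
  \qquad \alpha = \frac{1}{\phi}.
\end{aligned}
```

Even in this simplified setting, $\phi$ both scales $D(y;\mu)$ and enters the additive term, so an objective in which the dispersion plays no role cannot equal $-2\ell$ for the gamma family.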

We are of course thrilled about contributions, but I wanted to be straightforward about some of the difficulties. 😄

stevencarlislewalker commented

It has unfortunately been too long since I've worked on mixed effects models and lme4.

> The fact that it's GPL'd and so any derivative work must be GPL'd

I don't think that I even specified the license -- does this imply GPL? Anyways I'm quite happy to use MIT. Let me know if this moves forward and the license change is actually required.

> in the current formulation, the dispersion parameter doesn't play a role in the computation of the GLMM deviance, which seems ... wrong

I agree that it is wrong. Generally speaking the pls function got further along than pirls. The pls function was helpful when writing the JSS paper -- see bottom of page 19. We had plans of publishing a GLMM version of that paper that would similarly use pirls, but it never happened because I needed to move on and I suspect that pirls in lme4pureR was just never finished for this reason.
