Not understanding tolerance with Cholesky decomposition #136

Closed

pdeffebach opened this issue Apr 29, 2021 · 3 comments
@pdeffebach

In GLM.jl, we are trying to diagnose a "false positive" rank deficiency in the Cholesky decomposition for OLS here.

I'm trying to understand why an lm call in Julia fails while a feols call in fixest succeeds. I'm comparing with fixest because R's lm uses a QR decomposition by default, whereas fixest appears to use Cholesky, like GLM.jl.

Consider the following MWE with fixest, using cpp_cholesky as defined here.

library(tidyverse)
library(fixest)
cpp_cholesky = utils::getFromNamespace("cpp_cholesky", "fixest")
df = read_csv("https://github.com/JuliaStats/GLM.jl/files/6384056/Car-Training.csv")
m = feols(Price ~ Year + Mileage, df)
X = model.matrix(m)
XtX = t(X) %*% X 
cpp_cholesky(XtX, tol=80.0) # full rank
cpp_cholesky(XtX, tol=81.0) # not full rank

Any idea why tol=81 fails? It's odd that the tolerance at which cpp_cholesky concludes XtX is rank-deficient is so high, given that the default tolerance in cpp_cholesky is 1e-10.

I would like to compare this algorithm for computing the Cholesky decomposition with the LAPACK one that Julia is calling, but I don't think I'm interpreting the tolerance correctly.
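For reference, base R's pivoted Cholesky (LAPACK's dpstrf) also reports a numerical rank for the same XtX; I'm not sure it's the exact routine Julia ends up calling, but it gives a pivoted baseline to compare against:

# Pivoted Cholesky from base R (LAPACK dpstrf); reports a numerical rank.
# Rough comparison point only, not necessarily the routine GLM.jl/Julia calls.
R_piv = chol(XtX, pivot = TRUE)
attr(R_piv, "rank")   # numerical rank as determined with pivoting
attr(R_piv, "pivot")  # column ordering chosen by the pivoting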

Any insight is helpful. Thank you.

@lrberge
Owner

lrberge commented Apr 30, 2021

Hi, the algorithm I implemented is a bit peculiar. For speed, I don't apply any pivoting: given a threshold (here 1e-10), variables are removed on the fly as the algorithm goes. This means that the order of the variables matters:

# Let's reorder the columns
X1 = X
X2 = X[, c(2, 1, 3)]
X3 = X[, 3:1]

# => leads to different thresholds
cpp_cholesky(crossprod(X1))$min_norm
#> [1] 80.42991
cpp_cholesky(crossprod(X2))$min_norm
#> [1] 1.991382e-05
cpp_cholesky(crossprod(X3))$min_norm
#> [1] 1.366587e-05

The difference can be large, as shown in your example. My algorithm is really non-standard, although it works rather well.
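To make the behaviour concrete, here is a rough pure-R sketch of the idea of a non-pivoted Cholesky with on-the-fly column dropping. The real implementation is in C++ and differs in its details, so take the chol_drop name and the exact tolerance test below as illustrative only:

# Rough sketch of a non-pivoted Cholesky that drops a column as soon as its
# remaining diagonal mass falls below `tol` (illustration only, not fixest's code).
chol_drop = function(A, tol = 1e-10) {
  n = ncol(A)
  L = matrix(0, n, n)
  dropped = logical(n)
  for (j in seq_len(n)) {
    prev = seq_len(j - 1)
    # remaining diagonal mass of column j after projecting out earlier kept columns
    d = A[j, j] - sum(L[j, prev]^2)
    if (d < tol) {
      dropped[j] = TRUE   # flag column j as collinear and skip it
      next                # no pivoting: the column order is never revisited
    }
    L[j, j] = sqrt(d)
    if (j < n) {
      for (i in (j + 1):n) {
        L[i, j] = (A[i, j] - sum(L[i, prev] * L[j, prev])) / L[j, j]
      }
    }
  }
  list(L = L, dropped = which(dropped))
}

# With the OP's XtX: a tolerance above the smallest remaining diagonal
# (~80.43 under this column order) flags that column as collinear.
chol_drop(XtX, tol = 81)$dropped

Because there is no pivoting, the remaining diagonal value computed for each column depends on which columns came before it, which is why reordering the columns changes min_norm in the example above.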

From my viewpoint, it's up to the user to ensure that their model is not ill-conditioned. Removing variables on the fly is just a convenience, and I don't consider it good practice. By construction, different algorithms will pick different variables, since there is no "truth" in that regard.

Hope it clarifies a bit how it works!

@lrberge
Owner

lrberge commented May 3, 2021

OK, I'm closing. Let me know if you need more information.

@lrberge lrberge closed this as completed May 3, 2021
@pdeffebach
Author

Thanks for the response, and apologies for the slow reply.

It's good to know this wasn't a bug! I will do more to understand this algorithm after finals and see how it compares to LAPACK's.

From my viewpoint, it's up to the user to ensure that their model is not ill-conditioned.

Yes, this is the tension. Hopefully we can find a solution in GLM.jl that maintains flexibility while also being numerically accurate.

Thank you!
