Not understanding tolerance with Cholesky decomposition #136
Hi, the algorithm I implemented is a bit peculiar. For speed reasons, I don't apply any pivoting: given a threshold (here 1e-10), variables are removed on the fly as the algorithm goes. This means that the order of the variables matters:

# Let's reorder the columns
X1 = X
X2 = X[, c(2, 1, 3)]
X3 = X[, 3:1]
# => leads to different thresholds
cpp_cholesky(crossprod(X1))$min_norm
#> [1] 80.42991
cpp_cholesky(crossprod(X2))$min_norm
#> [1] 1.991382e-05
cpp_cholesky(crossprod(X3))$min_norm
#> [1] 1.366587e-05

The difference can be large, as shown in your example. But my algorithm is really non-standard, although it works rather well. From my viewpoint, it is the user's responsibility to ensure that the model is not ill-conditioned. Removing variables on the fly is just a convenience; I don't really consider it good practice. By construction, different algorithms will pick different variables, since there is no "truth" in that regard. Hope this clarifies a bit how it works!
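
To illustrate the order-dependence described above, here is a minimal R sketch of an unpivoted Cholesky factorization that drops columns on the fly. The name chol_drop and the exact dropping rule (comparing the remaining diagonal term to tol) are assumptions for illustration only; fixest's actual cpp_cholesky is implemented in C++ and its min_norm criterion may differ in the details.

chol_drop = function(XtX, tol = 1e-10) {
  n = ncol(XtX)
  L = matrix(0, n, n)
  dropped = logical(n)
  for (j in 1:n) {
    # Diagonal term left after subtracting the contribution of columns kept so far;
    # dropped columns contribute nothing because their column of L stays at zero.
    d = XtX[j, j] - sum(L[j, seq_len(j - 1)]^2)
    if (d < tol) {
      dropped[j] = TRUE  # treated as collinear with the columns kept before it
      next
    }
    L[j, j] = sqrt(d)
    if (j < n) for (i in (j + 1):n) {
      L[i, j] = (XtX[i, j] - sum(L[i, seq_len(j - 1)] * L[j, seq_len(j - 1)])) / L[j, j]
    }
  }
  list(L = L, dropped = which(dropped))
}

Because there is no pivoting, reordering the columns of XtX changes which diagonal term falls below tol first, which is why X1, X2 and X3 above can lead to different variables being removed.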
OK, I'm closing. Let me know if you need more information.
Thanks for the response, and apologies for the slow reply. It's good to know this wasn't a bug! I will do more to understand this algorithm after finals and see how it compares to LAPACK's.
Yes, this is the tension. Hopefully we can find a solution in GLM.jl that maintains flexibility while also being numerically accurate. Thank you!
In GLM in Julia, we are trying to diagnose a "false positive" rank deficiency in the Cholesky decomposition for OLS here.

I'm trying to understand why a lm call in Julia fails while a feols call in fixest succeeds. I'm comparing with fixest because R's lm uses QR decomposition by default, but it seems fixest uses Cholesky, like GLM.jl.

Consider the following MWE with fixest, with cpp_cholesky defined here.

Any idea why tol=81 fails? It's odd that the tolerance that leads cpp_cholesky to conclude XtX is rank-deficient is so high, given that the default tolerance in cpp_cholesky is 1e-10.

I would like to explore this algorithm for computing the Cholesky decomposition and compare it to the LAPACK one that Julia is calling, but I don't think I'm getting the tolerance right.

Any insight is helpful. Thank you.
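
For reference, here is a small sketch of one way to compare rank detection between a pivoted Cholesky (which, as I understand it, R's chol(..., pivot = TRUE) performs via LAPACK's dpstrf) and a QR factorization. The matrix X below is made up for illustration and is not the MWE above.

set.seed(1)
X = cbind(1, rnorm(100), rnorm(100))
X = cbind(X, X[, 2] + X[, 3])  # fourth column is an exact linear combination
XtX = crossprod(X)

# Pivoted Cholesky: warns about rank deficiency and stores the detected rank
R = suppressWarnings(chol(XtX, pivot = TRUE))
attr(R, "rank")

# QR on X itself, for comparison
qr(X)$rank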