PM2-B #30

Open
alexanderquispe opened this issue Apr 13, 2021 · 9 comments
Comments

@alexanderquispe
Owner

  1. Inference on Predictive and Causal Effects in High Dimensional Linear Regression Models
alexanderquispe added a commit that referenced this issue Apr 13, 2021
@alexanderquispe
Owner Author

  • pm2_notebook_jannis
    In this Jupyter notebook (JN) we use the double Lasso regression. Again we face the problem of finding equivalent "alphas" (penalty levels) across implementations.
    I used alpha = 0.00077 (chosen manually) as the best proxy to get results similar to those in the R notebook (RN).
    The coefficients are similar, but the confidence intervals are not equal.

@alexanderquispe
Owner Author

alexanderquispe commented Apr 13, 2021

  • python-notebook-experiment-on-orthogonal-learning
I found that the simulation loop may have an error.
image

I think that these lines:

    if sum(SX_IDs) == 0 :
        Naive[ 0 ] = sm.OLS( Y , sm.add_constant(D) ).fit().summary2().tables[1].round(3).iloc[ 1, 0 ]

should be replaced by

    if sum(SX_IDs) == 0 :
        Naive[ i ] = sm.OLS( Y , sm.add_constant(D) ).fit().summary2().tables[1].round(3).iloc[ 1, 0 ]

Otherwise, every iteration that takes this branch writes to position 0, so the result of that regression is not saved for iteration i.
Also, when I compared the mean of the Naive matrix you got with the one from the R notebook, they were completely different. After this change, the results seem to converge.
@anzonyqr, let me know if this makes sense to you.
Thanks!
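
To illustrate the indexing issue, here is a minimal sketch in plain Python. It is not the notebook code: the data-generating process, the helper names `ols_slope` and `run_simulation`, and the `fixed_index` flag are assumptions made for illustration, and the `if sum(SX_IDs) == 0` branch is left out so that every replication stores a result. It only shows how writing to `naive[0]` leaves the other slots empty and distorts the mean.

```python
import random


def ols_slope(y, d):
    """Slope from regressing y on a constant and d (simple OLS)."""
    n = len(y)
    mean_y = sum(y) / n
    mean_d = sum(d) / n
    cov = sum((di - mean_d) * (yi - mean_y) for di, yi in zip(d, y))
    var = sum((di - mean_d) ** 2 for di in d)
    return cov / var


def run_simulation(n_reps, n_obs, fixed_index, seed=0):
    """Store one slope per replication.

    fixed_index=True reproduces the bug (always writes to slot 0);
    fixed_index=False writes replication i to slot i.
    """
    rng = random.Random(seed)
    naive = [0.0] * n_reps
    for i in range(n_reps):
        # Hypothetical data: y = 1.0 * d + noise
        d = [rng.gauss(0, 1) for _ in range(n_obs)]
        y = [1.0 * di + rng.gauss(0, 1) for di in d]
        slot = 0 if fixed_index else i
        naive[slot] = ols_slope(y, d)
    return naive


buggy = run_simulation(50, 100, fixed_index=True)
fixed = run_simulation(50, 100, fixed_index=False)

# With the bug, only slot 0 is ever filled, so the mean is pulled toward 0.
print(sum(buggy) / len(buggy))
# With the fix, every slot holds an estimate and the mean is close to the true slope.
print(sum(fixed) / len(fixed))
```

In this sketch the buggy run keeps only the last estimate, in slot 0, which is consistent with the observation that the mean of the Naive results differed from the R notebook until the index was changed.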

@alexanderquispe
Owner Author

This is what I found in the R code :
image
This is what we have in the Python code:
image

I will divide the error term by 4, and also change the loop index as I did in the code above.

alexanderquispe added a commit that referenced this issue Apr 13, 2021
Revision of all codes
@anzonyquispe
Collaborator

anzonyquispe commented Apr 13, 2021

You are right. I made a mistake. The index 0 should be changed to i; otherwise Naive will not save the results of each iteration.

@anzonyquispe
Collaborator

You are right. I have already made the corrections. Now the Python script matches the R code.

@SandraMartinezGutierrez
Collaborator

[JULIA SCRIPT]
When running an OLS regression, there is a problem with the intercepts due to the Cholesky factorization.
image

Found a solution here:

  1. default intercept behavior JuliaStats/StatsModels.jl#31
  2. Incorrect linear regression results JuliaStats/GLM.jl#426

Final output:
image

Notice that, to iterate over the columns of a specific data frame, I used:

term(:y) ~ sum(term.(names(data[!, Not(["y", "intercept"])])))

@SandraMartinezGutierrez
Collaborator

[JULIA SCRIPT]

In this section of the Python script, I was looking for a Julia DataFrame function equivalent to .set_index() in Python.
image

I found that NamedArray can give a result similar to .set_index() in Python, as shown below:

image

However, NamedArray only returns a matrix, and I could not find an equivalent function for a Julia DataFrame. For this reason, I recommend adding the "index information" as a column in the DataFrame. It is then possible to group the data by that column for quick lookup.
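
The "index as a column, then group by it" idea can be sketched in plain Python, with a list of dictionaries standing in for the data frame. The column names `id` and `x`, the sample rows, and the helper `group_by` are hypothetical and chosen only for illustration; this is not the notebook code and not a Julia API.

```python
from collections import defaultdict

# Hypothetical rows: the "index information" is stored as an ordinary column named "id".
rows = [
    {"id": "a", "x": 1.0},
    {"id": "b", "x": 2.0},
    {"id": "a", "x": 3.0},
]


def group_by(records, key):
    """Group records by the value of `key` so each group can be looked up directly."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


lookup = group_by(rows, "id")

# Quick lookup of all rows whose "id" column equals "a".
print(lookup["a"])
```

The grouping is built once, after which each lookup by index value is a dictionary access rather than a scan over all rows.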

Documentation about NamedArrays can be found here:

  1. https://github.com/davidavdav/NamedArrays.jl

Final output:
image

@SandraMartinezGutierrez
Collaborator

[JULIA SCRIPT]

In the Python script, converting this table to HTML only required adding "to_html()".

image

In Julia, however, I found an interesting way to convert tables to HTML with PrettyTables.jl.

For more information:

  1. https://ronisbr.github.io/PrettyTables.jl/stable/
  2. https://github.com/ronisbr/PrettyTables.jl

Final output:
image

SandraMartinezGutierrez added a commit that referenced this issue Mar 7, 2022
Problems when running Lasso about to be solved
SandraMartinezGutierrez added a commit that referenced this issue Mar 8, 2022
@SandraMartinezGutierrez
Collaborator

[JULIA SCRIPT]

Julia_pm2_notebook_jannis
In this notebook, when using the double Lasso regression, we again face the problem of finding equivalent "alphas".
I used alpha = 0..8 (manually chosen) as the best proxy to get similar results as in the RN
Coefficients are similar but CI are not equal.

R Notebook:
image

Julia Notebook:
image

SandraMartinezGutierrez added a commit that referenced this issue Mar 8, 2022
SandraMartinezGutierrez added a commit that referenced this issue Mar 9, 2022