Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy does not overallocate nested lists of data.tables #4205

Closed
d-sci opened this issue Jan 29, 2020 · 0 comments · Fixed by #4206
Closed

copy does not overallocate nested lists of data.tables #4205

d-sci opened this issue Jan 29, 2020 · 0 comments · Fixed by #4206
Milestone

Comments

@d-sci
Copy link
Contributor

d-sci commented Jan 29, 2020

Similar to #1476 , if I copy a list of lists (of lists ...) of data.tables I will get the notorious "Invalid .internal.selfref detected" warning when later modifying the data.table elements by reference. While this has since been fixed for lists one layer deep, for people who use data.table as their main form of data representation, it is not so rare to have nested lists of data.tables, and an occasional need to take a deep copy.

library(data.table)
x = list(a = data.table("A"), b = list(bb = data.table("B")))
y = copy(x)

# All good one layer deep
y$a[, new := "new"]
y$a
#>    V1 new
#> 1:  A new

# Warning on second layer
record <- y$b$bb[, new := "new"]
#> Warning in `[.data.table`(y$b$bb, , `:=`(new, "new")): Invalid .internal.selfref
#> detected and fixed by taking a (shallow) copy of the data.table so that :=
#> can add this new column by reference. At an earlier point, this data.table
#> has been copied by R (or was created manually using structure() or similar).
#> Avoid names<- and attr<- which in R currently (and oddly) may copy the whole
#> data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?
#> setattr. If this message doesn't help, please report your use case to the
#> data.table issue tracker so the root cause can be fixed or this message
#> improved.
# And although it did something ...
record[]
#>    V1 new
#> 1:  B new
# ... it didn't persist
y$b$bb
#>    V1
#> 1:  B

sessionInfo()
#> R version 3.5.2 (2018-12-20)
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS Mojave 10.14.6
#> 
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] data.table_1.12.8
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_3.5.2  magrittr_1.5    tools_3.5.2     htmltools_0.4.0
#>  [5] yaml_2.2.0      Rcpp_1.0.3      stringi_1.4.3   rmarkdown_1.16 
#>  [9] highr_0.8       knitr_1.26      stringr_1.4.0   xfun_0.11      
#> [13] digest_0.6.22   rlang_0.4.2     evaluate_0.14

Proposed solution to follow.

d-sci pushed a commit that referenced this issue Jan 29, 2020
@mattdowle mattdowle added this to the 1.12.9 milestone Feb 17, 2020
@jangorecki jangorecki modified the milestones: 1.12.11, 1.12.9 May 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants