unique returns a mutable DT alias when number of rows is <= 1 #5932

Closed
dshemetov opened this issue Feb 9, 2024 · 3 comments · Fixed by #5960

@dshemetov
Contributor

# Minimal reproducible example; please be sure to set verbose=TRUE where possible!

# Alias
library(data.table)
DT <- data.table(a = 1, b = 1)
DT2 = unique(DT)
address(DT)
#> [1] "0x55b61aaf4b40"
address(DT2)
#> [1] "0x55b61aaf4b40"

# No alias
library(data.table)
DT <- data.table(a = c(1, 2), b = c(3, 4))
DT2 = unique(DT)
address(DT)
#> [1] "0x55573fddfba0"
address(DT2)
#> [1] "0x555741ec0f80"

This behavior seems to be pretty clearly indicated here. Is this intended? In the meantime, we're adding logic around unique in our package so that callers never receive an alias of their input.
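A minimal sketch of the kind of guard described above, assuming the goal is simply "never hand back the same object that was passed in". The name `safe_unique` is hypothetical and not part of data.table; only `address()` and `copy()` are data.table functions. It also shows why the alias matters: a by-reference update to the result would otherwise leak into the input.

```r
library(data.table)

# Hypothetical helper (not part of data.table): return a result that
# does not share its address with the input.
safe_unique <- function(x, ...) {
  res <- unique(x, ...)
  # If unique() handed back the input itself, take an explicit copy.
  if (identical(address(res), address(x))) res <- copy(res)
  res
}

DT <- data.table(a = 1, b = 1)
DT2 <- safe_unique(DT)

# Add a column to DT2 by reference; DT should be left untouched.
DT2[, z := 2]
names(DT)
#> [1] "a" "b"
```

The extra `copy()` only runs when the addresses match, so inputs with more than one row pay no additional cost.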

# Output of sessionInfo()

r$> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] data.table_1.14.8 lubridate_1.9.3   forcats_1.0.0     stringr_1.5.0    
 [5] dplyr_1.1.4       purrr_1.0.2       readr_2.1.4       tidyr_1.3.0      
 [9] tibble_3.2.1      tidyverse_2.0.0   ggplot2_3.4.3    

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5      rspm_0.4.0       cli_3.6.1        rlang_1.1.1     
 [5] stringi_1.7.12   generics_0.1.3   glue_1.6.2       colorspace_2.1-0
 [9] hms_1.1.3        scales_1.2.1     fansi_1.0.4      grid_4.3.1      
[13] munsell_0.5.0    tzdb_0.4.0       lifecycle_1.0.3  compiler_4.3.1  
[17] timechange_0.2.0 pkgconfig_2.0.3  R6_2.5.1         tidyselect_1.2.0
[21] utf8_1.2.3       pillar_1.9.0     magrittr_2.0.3   tools_4.3.1     
[25] withr_2.5.0      gtable_0.3.4    
@dshemetov
Contributor Author

Found this conversation that seemed in agreement about fixing it for the maxgrpn case, filed and fixed here, but the nrow(x) <= 1 case wasn't touched in that fix.
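The shape of the remaining case can be sketched with two toy functions. Both `toy_unique_aliasing` and `toy_unique_copying` are hypothetical stand-ins, not data.table's source; they only mirror an early-return shortcut for inputs with at most one row, with and without a copy.

```r
library(data.table)

# Hypothetical stand-ins; not data.table's actual implementation.
toy_unique_aliasing <- function(x) {
  if (nrow(x) <= 1L) return(x)        # shortcut returns the input itself
  x[!duplicated(x)]
}

toy_unique_copying <- function(x) {
  if (nrow(x) <= 1L) return(copy(x))  # shortcut returns a fresh copy
  x[!duplicated(x)]
}

DT <- data.table(a = 1, b = 1)

# Same address: the result is an alias of DT.
identical(address(toy_unique_aliasing(DT)), address(DT))

# Different address: the result can be modified by reference safely.
identical(address(toy_unique_copying(DT)), address(DT))
```

Returning `copy(x)` in the shortcut costs little, since the input has at most one row.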

@MichaelChirico
Member

LGTM & it looks like you've already found what to fix. Do you want to file a PR?

@jangorecki jangorecki added this to the 1.16.0 milestone Feb 18, 2024
@dshemetov
Contributor Author

Sure, can do that sometime soon.

dshemetov added a commit to dshemetov/data.table that referenced this issue Feb 24, 2024
MichaelChirico added a commit that referenced this issue Feb 26, 2024
* fix: mutable dt alias from unique.data.table #5932
Thanks to @brookslogan for the original bug report.

* doc+test: unique mutable alias #5960

* NEWS links issue, not PR

Co-authored-by: Benjamin Schwendinger <[email protected]>

* restore whitespace diff

* improve NEWS wording

* explanatory comment

Co-authored-by: Benjamin Schwendinger <[email protected]>

* apply suggested changes

* move NEWS item

* integrate test with earlier similar test

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Benjamin Schwendinger <[email protected]>