merge.data.table ignores incomparables without warnings. #2587

Closed
GBsuperman opened this issue Jan 24, 2018 · 4 comments · Fixed by #5233
Labels: bug, joins, Low

Comments

GBsuperman commented Jan 24, 2018

Example of using `incomparables` with a data.frame (NA keys are not matched, as documented):

x <- data.frame(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)
y <- data.frame(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)
merge(x, y, by = "k2", incomparables = NA) # 2 rows
#   k2 k1.x data.x k1.y data.y
# 1  4    4      4    4      4
# 2  5    5      5    5      5

The same example with a data.table (`incomparables` is silently ignored, so NA keys are matched against each other):

library(data.table)
x <- data.table(k1 = c(NA,NA,3,4,5), k2 = c(1,NA,NA,4,5), data = 1:5)
y <- data.table(k1 = c(NA,2,NA,4,5), k2 = c(NA,NA,3,4,5), data = 1:5)
merge(x, y, by = "k2", incomparables = NA) # 6 rows
#    k2 k1.x data.x k1.y data.y
# 1: NA   NA      2   NA      1
# 2: NA   NA      2    2      2
# 3: NA    3      3   NA      1
# 4: NA    3      3    2      2
# 5:  4    4      4    4      4
# 6:  5    5      5    5      5

Output of sessionInfo()

R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS  10.13.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bindrcpp_0.2          tidyr_0.7.2           dplyr_0.7.4           ggplot2_2.2.1        
 [5] gtools_3.5.0          knitcitations_1.0.8   knitr_1.18            GBsuperman_0.0.0.9000
 [9] data.table_1.10.4-3   roxygen2_6.0.1        devtools_1.13.4       S4Vectors_0.12.2     
[13] BiocGenerics_0.20.0  

loaded via a namespace (and not attached):
 [1] nlme_3.1-131              bitops_1.0-6              lubridate_1.7.1           dimRed_0.1.0             
 [5] RColorBrewer_1.1-2        httr_1.3.1                rprojroot_1.3-2           grpreg_3.1-2             
 [9] ggbiplot_0.55             tools_3.3.3               backports_1.1.2           utf8_1.1.3               
[13] R6_2.2.2                  rpart_4.1-11              lazyeval_0.2.1            colorspace_1.3-2         
[17] nnet_7.3-12               withr_2.1.1               tidyselect_0.2.3          gridExtra_2.3            
[21] mnormt_1.5-5              curl_3.1                  git2r_0.21.0              cli_1.0.0                
[25] rvest_0.3.2               glmnet_2.0-13             xml2_1.1.1                desc_1.1.1               
[29] scales_0.5.0              sfsmisc_1.1-1             DEoptimR_1.0-8            psych_1.7.8              
[33] robustbase_0.92-8         commonmark_1.4            stringr_1.2.0             digest_0.6.13            
[37] foreign_0.8-69            pkgconfig_2.0.1           bibtex_0.4.2              highr_0.6                
[41] rlang_0.1.6               ddalpha_1.3.1             rstudioapi_0.7            bindr_0.1                
[45] jsonlite_1.5              ModelMetrics_1.1.0        RCurl_1.95-4.10           magrittr_1.5             
[49] leaps_3.0                 Matrix_1.2-12             Rcpp_0.12.14              munsell_0.4.3            
[53] RefManageR_0.14.20        stringi_1.1.6             pROC_1.10.0               ggpmisc_0.2.16           
[57] MASS_7.3-48               plyr_1.8.4                recipes_0.1.1             grid_3.3.3               
[61] crayon_1.3.4              lattice_0.20-35           splines_3.3.3             HiClimR_1.2.3            
[65] MetaboAnalystR_0.0.0.9000 pillar_1.0.1              webchem_0.3.0             reshape2_1.4.3           
[69] codetools_0.2-15          CVST_0.2-1                glue_1.2.0                evaluate_0.10.1          
[73] foreach_1.4.4             gtable_0.2.0              purrr_0.2.4               kernlab_0.9-25           
[77] assertthat_0.2.0          DRR_0.0.3                 gower_0.1.2               prodlim_1.6.1            
[81] broom_0.4.3               class_7.3-14              survival_2.41-3           timeDate_3042.101        
[85] RcppRoll_0.2.2            tibble_1.4.1              iterators_1.0.9           memoise_1.1.0            
[89] lava_1.5.1                bestglm_0.36              caret_6.0-78              ipred_0.9-6  

MarkusBonsch commented Jan 25, 2018

Good point. There are two options: implement `incomparables`, or warn. I will see whether implementing it is feasible on short notice; otherwise, I will add the warning. Either way, there will definitely be a warning if any argument is given that ends up in the ignored `...`.
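To illustrate the "warn" option, here is a minimal sketch in base R of a function that warns when arguments land in an otherwise ignored `...`. The name `merge_warn_unused` is hypothetical and this is not data.table's actual implementation; it only shows the general mechanism on plain data.frames.

```r
# Hypothetical wrapper (not part of data.table): warns about any argument
# that is swallowed by `...` instead of silently dropping it.
merge_warn_unused <- function(x, y, by, ...) {
  extra <- list(...)
  if (length(extra) > 0L) {
    nm <- names(extra)
    if (is.null(nm)) nm <- rep("", length(extra))
    nm[nm == ""] <- "<unnamed>"
    warning("Ignoring unused arguments: ", paste(nm, collapse = ", "),
            call. = FALSE)
  }
  # Only the arguments this sketch understands are forwarded.
  merge(x, y, by = by)
}

a <- data.frame(id = 1:3, u = letters[1:3])
b <- data.frame(id = 2:4, v = LETTERS[2:4])

# `incomparables` is not a formal argument here, so it triggers the warning.
res <- merge_warn_unused(a, b, by = "id", incomparables = NA)
```

With a check like this, a call such as the data.table example above would at least signal that `incomparables` had no effect.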

@MarkusBonsch
Contributor

I have looked into the issue. It can be solved with reasonable effort, but while I was writing the corresponding unit tests, I stumbled upon a much bigger problem: #2592. The latter needs to be solved first, definitely. I have created a master issue to collect all open inconsistencies between merge.data.frame and merge.data.table: #2593.

jaapwalhout commented Jan 19, 2019

mattdowle added this to the 1.12.4 milestone May 17, 2019
jangorecki modified the milestones: 1.12.4, 1.13.0 Sep 17, 2019
mattdowle modified the milestones: 1.12.7, 1.12.9 Dec 8, 2019
jangorecki added the joins label Apr 6, 2020
mattdowle modified the milestones: 1.13.1, 1.13.3 Oct 17, 2020
@ben-schwen
Member

I guess we have 3 options here:

  1. Add a warning to merge when unused arguments are supplied
  2. Add an incomparables argument only to the merge wrapper, for compatibility with base
  3. Add an incomparables argument to both [ and merge
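As a rough illustration of option 2, the sketch below emulates `incomparables` in a wrapper around `merge`. The helper name `merge_incomparables` is hypothetical, and the sketch assumes a single key column and an inner join only; it is not the fix that was eventually merged.

```r
library(data.table)

# Hypothetical helper (not part of data.table): emulates base merge()'s
# `incomparables` for ONE key column and an INNER join only.
merge_incomparables <- function(x, y, by, incomparables = NULL) {
  stopifnot(length(by) == 1L)
  if (!is.null(incomparables)) {
    # A row whose key is an incomparable value can never match, so for an
    # inner join it can simply be dropped before merging.
    keep_rows_x <- !(x[[by]] %in% incomparables)
    keep_rows_y <- !(y[[by]] %in% incomparables)
    x <- x[keep_rows_x]
    y <- y[keep_rows_y]
  }
  merge(x, y, by = by)
}

x <- data.table(k1 = c(NA, NA, 3, 4, 5), k2 = c(1, NA, NA, 4, 5), data = 1:5)
y <- data.table(k1 = c(NA, 2, NA, 4, 5), k2 = c(NA, NA, 3, 4, 5), data = 1:5)

# Only the rows with k2 = 4 and k2 = 5 remain, as in the data.frame example.
res <- merge_incomparables(x, y, by = "k2", incomparables = NA)
```

Outer joins (`all = TRUE` and friends) would need the dropped rows added back as unmatched rows, which this sketch deliberately leaves out.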

8 participants