block duplicates in assignment #4634
I agree we should prevent duplicate names. This PR seems to mainly detect that a [...]
Fundamentally, it seems like if we more-or-less lock people out of their data.table in [...]
library(data.table)
DT <- data.table(x=1:2, y=3:4, x=5:6, x=7:8, y=9:10, z=11:12)
DT[, x] ## copied from one of the AppVeyor for test 1229.28
## Observed: Unable to disambiguate reference to duplicated columns in x: [x]
I also thought this PR would address users trying to assign duplicate names, but I do not believe it does. This would still work (although it may be out of scope for the FR / PR):
data.table(x = 1)[ , .(y = 1, y = 2)]
## y y
## <num> <num>
##1: 1 2
Another idea: if we are mainly erroring when users select a name that is already duplicated, it would be nice to have a [...]
## VIOLATES TESTED BEHAVIOR WRT DUPLICATES, SEE TESTS 1290 **
if (anyDuplicated(used_from_x <- dupintersect(names_x, av))) {
  dupcols = unique(used_from_x[duplicated(used_from_x)])
  stop("Unable to disambiguate reference to duplicated columns in x: ", brackify(dupcols))
Maybe:
stop(gettextf("Unable to disambiguate reference to duplicated columns in %s: %s", deparse(substitute(x)), brackify(dupcols), domain = ...))
This will allow the original name of the data.table to print instead of x.
This is great in simple cases, but in e.g. data.table(x = 1)[ , c('y', 'y') := .(1, 2)] and other dynamic cases, or when a function the user calls is the one that creates the error, it may not be as helpful. There might be an argument for building something like guess_name(x) that checks for the "non-standard" cases and returns x, otherwise a user-friendly name, but that also comes with some maintenance burden...
That is a good point. Here is a simple alternative that should be maintainable while still allowing for user-friendly messages.
fx = function(x) {
  print(if (is.name(x_sub <- substitute(x))) deparse(x_sub) else "x")
}
fx(iris)
#> [1] "iris"
fx(iris[])
#> [1] "x"
Edit: I do not see this as a separate function. I would just drop it into the gettextf(...) bit.
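As a rough illustration of how the is.name() check above might be dropped into the error message, here is a minimal base-R sketch. The helper name check_dup_cols and its cols argument are hypothetical and are not part of data.table; paste() stands in for the internal brackify(), and a data.frame with check.names = FALSE stands in for a data.table with duplicated column names.

```r
# Hypothetical helper, NOT data.table's implementation: error on duplicated
# columns among those referenced, naming the table after the caller's symbol
# when the argument is a plain name.
check_dup_cols = function(x, cols) {
  x_sub = substitute(x)
  # A bare symbol such as `df` is safe to show; anything else falls back to "x".
  x_name = if (is.name(x_sub)) deparse(x_sub) else "x"
  # Keep only the referenced columns, preserving repeats.
  used = names(x)[names(x) %in% cols]
  dupcols = unique(used[duplicated(used)])
  if (length(dupcols)) {
    stop(gettextf("Unable to disambiguate reference to duplicated columns in %s: [%s]",
                  x_name, paste(dupcols, collapse = ", ")))
  }
  invisible(TRUE)
}

df = data.frame(x = 1, y = 2, x = 3, check.names = FALSE)
msg_named = tryCatch(check_dup_cols(df, "x"), error = function(e) conditionMessage(e))
msg_expr = tryCatch(check_dup_cols(df[], "x"), error = function(e) conditionMessage(e))
print(msg_named)
#> [1] "Unable to disambiguate reference to duplicated columns in df: [x]"
print(msg_expr)
#> [1] "Unable to disambiguate reference to duplicated columns in x: [x]"
```

Referencing only the non-duplicated column, e.g. check_dup_cols(df, "y"), raises no error in this sketch, which mirrors the idea of erroring only when a duplicated name is actually selected.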
To be clear, I would not block duplicates in all cases, e.g. [...]
Closing as discussed in #3077
Closes #3077
on hold until we decide the right behavior (see issue comments)