Can't delete a column during combined by call? #1873

Closed
MichaelChirico opened this issue Oct 9, 2016 · 2 comments · Fixed by #6080

MichaelChirico commented Oct 9, 2016

This comes out of this attempted (Japanese) SO answer of mine.

In essence, I wanted to do two things within a single [] call: add some new columns, and delete a column.

MRE:

DT = data.table(id = c(1, 1, 2, 2), a = 1:4, b = 5:8)
DT[ , c("c", "a") := .(a + 1, NULL), by = id]

Gives error:

Error in `[.data.table`(DT, , `:=`(c("c", "a"), .(a + 1, NULL)), by = id) :
  Type of RHS ('NULL') must match LHS ('integer'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)

However, the same assignment works without by:

DT[ , c("c", "a") := .(a + 1, NULL)][]
#    id b c
#1:  1 5 2
#2:  1 6 3
#3:  2 7 4
#4:  2 8 5

I can't see why we shouldn't be able to do this with by, and in fact it works if deleting is all we're trying to do:

DT[ , a := NULL, by = id][]
#    id b
#1:  1 5
#2:  1 6
#3:  2 7
#4:  2 8

So something goes wrong when we try to delete a column and create other columns in the same by operation. Two further observations: the operation also succeeds if we delete two columns at once, DT[ , c("a", "b") := NULL, by = id]; and it still fails if the created column doesn't depend on the column being deleted, DT[ , c("c", "a") := .(b + 1, NULL), by = id].

MichaelChirico commented

Deleting a column by group sounds like a strange operation to me. Since operations are by reference, there's "no" cost to just splitting this into two [ queries.
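
A minimal sketch of that two-query split, reusing the DT from the MRE above. Nothing here is new API; it only reorders the calls already shown in this thread, with the grouped step creating c and an ungrouped step deleting a:

```r
library(data.table)

DT = data.table(id = c(1, 1, 2, 2), a = 1:4, b = 5:8)

# Query 1: create the new column by group while `a` still exists.
DT[ , c := a + 1, by = id]

# Query 2: delete `a` in a separate, ungrouped call.
# Both steps modify DT by reference, so no copy of the table is made.
DT[ , a := NULL]

# The remaining columns are id, b and c.
DT[]
```

Because := updates in place, the second query only drops the column; it does not redo any of the grouped work.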

To close this issue, we might try and improve the error message.

MichaelChirico commented

Looks like the message has already improved in the interim:

DT[ , c("c", "a") := .(a + 1, NULL), by = id]
Error in `[.data.table`(DT, , `:=`(c("c", "a"), .(a + 1, NULL)), by = id) : 
  RHS of := is NULL during grouped assignment, but it's not possible to delete parts of a column.

This happened in #3310 but I don't see a specific regression test here. Let's add one to close this issue.
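
As a rough sketch of what such a regression check could look like outside the package's own test harness, the snippet below captures the error with base R's tryCatch. It assumes a data.table version that includes the message change from #3310; the variable name msg is only illustrative, and the exact wording of the message may differ between versions.

```r
library(data.table)

DT = data.table(id = c(1, 1, 2, 2), a = 1:4, b = 5:8)

# Attempt the combined create-and-delete by group and capture the error text.
# Assumption: the call errors, as reported in this thread.
msg = tryCatch(
  {
    DT[ , c("c", "a") := .(a + 1, NULL), by = id]
    NA_character_  # only reached if no error was raised
  },
  error = function(e) conditionMessage(e)
)

# Show the captured message (NA would mean the call did not fail).
print(msg)
```

Inside the package's test suite the same check would be written with its own test helper instead of tryCatch.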
