List column support in data.table #4290
library(tidyr)
library(data.table)
df <- tibble(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1)
df %>% nest(data = c(y, z))
#> # A tibble: 3 x 2
#> x data
#> <dbl> <list>
#> 1 1 <tibble [3 × 2]>
#> 2 2 <tibble [2 × 2]>
#> 3 3 <tibble [1 × 2]>
df %>% chop(c(y, z))
#> # A tibble: 3 x 3
#> x y z
#> <dbl> <list> <list>
#> 1 1 <int [3]> <int [3]>
#> 2 2 <int [2]> <int [2]>
#> 3 3 <int [1]> <int [1]>
dt <- as.data.table(df)
dt[, .(data = list(.SD)), keyby = x]
#> x data
#> <num> <list>
#> 1: 1 <data.table[3x2]>
#> 2: 2 <data.table[2x2]>
#> 3: 3 <data.table[1x2]>
dt[, lapply(.SD, list), keyby = x, .SDcols = c('y', 'z')]
#> x y z
#> <num> <list> <list>
#> 1: 1 1,2,3 6,5,4
#> 2: 2 4,5 3,2
#> 3: 3 6 1
Created on 2020-03-09 by the reprex package (v0.3.0)
You might also check out @TysonStanley's rstudio::conf 2020 talk on list columns in data.table: https://resources.rstudio.com/rstudio-conf-2020/list-columns-in-data-table-tyson-s-barrett
Thank you for the prompt feedback. I tried some of these, but ran into some trouble.
For that, you'll have to wait for this: In turn I've been waiting for the work on For now, you'll have to know your input types --
Thanks, I know his work; there is an article too: https://osf.io/f6pxw/download
A verbose but working solution would be:
library(data.table)
dt <- data.table(
x = 11:13,
y = list(
NULL,
data.table(a = 1, b = 2),
data.table(a = 1:3, b = 3:1)
)
)
dt[, ID := seq_len(.N)]
rbindlist(dt$y, idcol = "ID")[dt[, .(x, ID)], on = "ID", nomatch = 0L][, ID := NULL][]
#> a b x
#> <num> <num> <int>
#> 1: 1 2 12
#> 2: 1 3 13
#> 3: 2 2 13
#> 4: 3 1 13
rbindlist(dt$y, idcol = "ID")[dt[, .(x, ID)], on = "ID"][, ID := NULL][]
#> a b x
#> <num> <num> <int>
#> 1: NA NA 11
#> 2: 1 2 12
#> 3: 1 3 13
#> 4: 2 2 13
#> 5: 3 1 13
Created on 2020-03-09 by the reprex package (v0.3.0)
Still, there's no uniform way to handle it. Some examples:
I am considering what could be in each cell of the table: a data.table, a vector, or a list. How could we handle all of them in a consistent way (which tidyr already seems to tackle)?
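One possible way to treat all three cell types alike is to coerce every cell to a data.table first and then bind. The following is only a sketch: `unnest_dt` is a hypothetical helper name, not a data.table function, and it assumes that rows whose cell is NULL should be dropped and that vector or list cells should become a single column named after the list column.

```r
library(data.table)

# Hypothetical helper (not part of data.table): unnest the list column `col`,
# accepting NULL, atomic vectors, plain lists and data.tables as cells.
unnest_dt <- function(dt, col) {
  pieces <- lapply(unname(dt[[col]]), function(cell) {
    if (is.null(cell)) return(NULL)                      # dropped by rbindlist
    if (is.data.frame(cell)) return(as.data.table(cell))
    # vectors and plain lists become a one-column table named after `col`
    setnames(data.table(unlist(cell)), col)
  })
  # `.id` records which original row each piece came from
  long <- rbindlist(pieces, idcol = ".id", fill = TRUE)
  idx <- long[[".id"]]
  set(long, j = ".id", value = NULL)
  # repeat the remaining columns once per unnested row
  others <- dt[, setdiff(names(dt), col), with = FALSE]
  data.table(others[idx], long)
}

# cells that are NULL or data.tables
dt1 <- data.table(
  x = 11:13,
  y = list(NULL, data.table(a = 1, b = 2), data.table(a = 1:3, b = 3:1))
)
res1 <- unnest_dt(dt1, "y")

# cells that are plain vectors
dt2 <- data.table(x = c(2, 3, 1), y = list(1:3, 4:5, 7:9))
res2 <- unnest_dt(dt2, "y")
```

Because the row index comes from `idcol`, no temporary ID column has to be added to the input and no join is needed. Keeping the NULL rows (as NA) would need an extra step that this sketch leaves out.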
Just add a
library(data.table)
dt <- data.table(
x = c(2,3,1),
y = list(
1:3,
4:5,
7:9
)
)
# tidyr
tidyr::unnest(dt,y)
#> # A tibble: 8 x 2
#> x y
#> <dbl> <int>
#> 1 2 1
#> 2 2 2
#> 3 2 3
#> 4 3 4
#> 5 3 5
#> 6 1 7
#> 7 1 8
#> 8 1 9
# data.table
dt[, ID := seq_len(.N)]
rbindlist(lapply(dt$y, as.data.table), idcol = "ID")[dt[, .(x, ID)], on = "ID", nomatch = 0L][, ID := NULL][]
#> V1 x
#> 1: 1 2
#> 2: 2 2
#> 3: 3 2
#> 4: 4 3
#> 5: 5 3
#> 6: 7 1
#> 7: 8 1
#> 8: 9 1
rbindlist(lapply(dt$y, as.data.table), idcol = "ID")[dt[, .(x, ID)], on = "ID"][, ID := NULL][]
#> V1 x
#> 1: 1 2
#> 2: 2 2
#> 3: 3 2
#> 4: 4 3
#> 5: 5 3
#> 6: 7 1
#> 7: 8 1
#> 8: 9 1
Created on 2020-03-09 by the reprex package (v0.3.0)
Nice, turn everything into data.table and then use
Any hints for the
I need an example. I don't know the differences between
Example:
While there is no difference in this example, I don't know how to turn it back in data.table.
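For going back from the nested or chopped form, one idiom that seems to work is to undo the list wrapping group by group. This is a sketch on the same toy data as the first example in this thread, not an official counterpart of tidyr::unnest or tidyr::unchop, and it assumes every group has the same column names and compatible types.

```r
library(data.table)

dt <- data.table(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1)

# chop-like: one row per x, with y and z stored as list columns
chopped <- dt[, lapply(.SD, list), keyby = x, .SDcols = c("y", "z")]

# unchop-like: unlist each list cell within its group
unchopped <- chopped[, lapply(.SD, unlist), by = x, .SDcols = c("y", "z")]

# nest-like: one data.table per value of x
nested <- dt[, .(data = list(.SD)), keyby = x]

# unnest-like: return the stored data.table for each group
unnested <- nested[, data[[1L]], by = x]
```

Both `unchopped` and `unnested` should contain the same x, y and z values as the original `dt`.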
Recently, I have been trying to get rid of the tidyverse in every aspect and see whether data.table could do the same things more efficiently. I know that data.table now supports operations like nest and unnest in tidyr. However, is there an example showing the data.table way to run all the examples in tidyr::nest and tidyr::chop? Any hints? Thanks.