setcolorder gains before= and after= #4691

Merged 10 commits on Aug 26, 2021
2 changes: 2 additions & 0 deletions NEWS.md
@@ -137,6 +137,8 @@

24. `DT[, head(.SD,n), by=grp]` and `tail` are now optimized when `n>1`, [#5060](https://github.com/Rdatatable/data.table/issues/5060) [#523](https://github.com/Rdatatable/data.table/issues/523#issuecomment-162934391). `n==1` was already optimized. Thanks to Jan Gorecki and Michael Young for requesting, and Benjamin Schwendinger for the PR.

25. `setcolorder()` gains `before=` and `after=`, [#4358](https://github.com/Rdatatable/data.table/issues/4358). Thanks to Matthias Gomolka for the request, and both Benjamin Schwendinger and Xianghui Dong for implementing.

## BUG FIXES

1. `by=.EACHI` when `i` is keyed but `on=` different columns than `i`'s key could create an invalidly keyed result, [#4603](https://github.com/Rdatatable/data.table/issues/4603) [#4911](https://github.com/Rdatatable/data.table/issues/4911). Thanks to @myoung3 and @adamaltmejd for reporting, and @ColeMiller1 for the PR. An invalid key is where a `data.table` is marked as sorted by the key columns but the data is not sorted by those columns, leading to incorrect results from subsequent queries.
14 changes: 10 additions & 4 deletions R/data.table.R
@@ -2658,15 +2658,21 @@ setnames = function(x,old,new,skip_absent=FALSE) {
invisible(x)
}

setcolorder = function(x, neworder=key(x))
setcolorder = function(x, neworder=key(x), before=NULL, after=NULL) # before/after #4358
{
if (is.character(neworder) && anyDuplicated(names(x)))
stopf("x has some duplicated column name(s): %s. Please remove or rename the duplicate(s) and try again.", brackify(unique(names(x)[duplicated(names(x))])))
# if (!is.data.table(x)) stopf("x is not a data.table")
if (!is.null(before) && !is.null(after))
stopf("Provide either before= or after= but not both")
if (length(before)>1 || length(after)>1)
stopf("before=/after= accept a single column name or number, not more than one")
neworder = colnamesInt(x, neworder, check_dups=FALSE) # dups are now checked inside Csetcolorder below
if (length(before))
neworder = c(setdiff(seq_len(colnamesInt(x, before) - 1L), neworder), neworder)
if (length(after))
neworder = c(setdiff(seq_len(colnamesInt(x, after)), neworder), neworder)
if (length(neworder) != length(x)) {
#if shorter than length(x), pad by the missing
# elements (checks below will catch other mistakes)
# pad by the missing elements (checks inside Csetcolorder catch other mistakes)
neworder = c(neworder, setdiff(seq_along(x), neworder))
}
.Call(Csetcolorder, x, neworder)
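
The index arithmetic in this hunk can be sketched in base R without data.table. `reorder_idx` below is a hypothetical helper written only for illustration: it takes plain integer positions, whereas the real function resolves names through `colnamesInt()` and passes the result to `Csetcolorder`, which performs further checks not reproduced here.

```r
# Hypothetical helper (not part of data.table): mirrors how the diff builds
# the full column permutation from neworder plus before=/after=.
reorder_idx = function(n, neworder, before = NULL, after = NULL) {
  if (!is.null(before) && !is.null(after))
    stop("Provide either before= or after= but not both")
  if (length(before) > 1L || length(after) > 1L)
    stop("before=/after= accept a single column number, not more than one")
  # Columns strictly left of `before` (and not being moved) stay in front.
  if (length(before))
    neworder = c(setdiff(seq_len(before - 1L), neworder), neworder)
  # Columns up to and including `after` (and not being moved) stay in front.
  if (length(after))
    neworder = c(setdiff(seq_len(after), neworder), neworder)
  # Pad with whatever columns have not been mentioned yet.
  if (length(neworder) != n)
    neworder = c(neworder, setdiff(seq_len(n), neworder))
  neworder
}

cols = c("a", "b", "c", "d", "e")
cols[reorder_idx(5L, 4L, before = 2L)]  # "a" "d" "b" "c" "e"
cols[reorder_idx(5L, 1L, after = 3L)]   # "b" "c" "a" "d" "e"
```

With neither `before` nor `after` supplied, the helper falls back to the pre-existing behaviour of moving `neworder` to the front and padding with the remaining columns.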
18 changes: 13 additions & 5 deletions inst/tests/tests.Rraw
@@ -1529,14 +1529,22 @@ DT = data.table(a=1:2,b=3:4,c=5:6)
test(495.1, setcolorder(DT,c(2,1,3)), data.table(b=3:4,a=1:2,c=5:6))
test(495.2, setcolorder(DT,c(2,1,3)), data.table(a=1:2,b=3:4,c=5:6))
test(496, setcolorder(DT,c("c","a","b")), data.table(c=5:6,a=1:2,b=3:4))
test(497, setcolorder(DT,c("d","a","b")), error="specify non existing column*.*d")
test(497.01, setcolorder(DT,c("d","a","b")), error="specify non existing column*.*d")
DT = data.table(a = 1:3, b = 2:4, c = 3:5)
test(498.1, names(setcolorder(DT, "b")), c("b", "a", "c"))
test(498.2, names(setcolorder(DT, c(2, 3))), c("a", "c", "b"))
test(498.3, setcolorder(DT, 1:4), error = "specify non existing column*.*4")
test(497.02, names(setcolorder(DT, "b")), c("b", "a", "c"))
test(497.03, names(setcolorder(DT, c(2, 3))), c("a", "c", "b"))
test(497.04, setcolorder(DT, 1:4), error = "specify non existing column*.*4")
# Test where neworder=NULL, thus ordered by key and index columns
DT = data.table(a = 1:3, b = 2:4, c = 3:5, d = 4:6, key="b")
test(498.4, names(setcolorder(DT)), c("b", "a", "c", "d"))
test(497.05, names(setcolorder(DT)), c("b", "a", "c", "d"))
# new arguments before= and after=, #4358
DT = data.table(a=1, b=2, c=3)
test(498.01, setcolorder(DT, "a", after="c"), data.table(b=2, c=3, a=1))
test(498.02, setcolorder(DT, "a", before="b"), data.table(a=1, b=2, c=3))
test(498.03, setcolorder(DT, 1, after=3), data.table(b=2, c=3, a=1))
test(498.04, setcolorder(DT, 3, before=1), data.table(a=1, b=2, c=3))
test(498.05, setcolorder(DT, 1, before=1, after=1), error="Provide either before= or after= but not both")
test(498.06, setcolorder(DT, 1, before=1:2), error="before=/after= accept a single column name or number, not more than one")
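
The behaviour these tests exercise can also be tried interactively. The snippet below is a minimal sketch that assumes the installed data.table already includes this change; the four-column table is made up for illustration and does not come from the test suite.

```r
library(data.table)

DT = data.table(a = 1, b = 2, c = 3, d = 4)

# Move "a" so that it sits after "c"; "d" keeps its trailing position.
setcolorder(DT, "a", after = "c")
names(DT)  # "b" "c" "a" "d"

# Several columns can be moved at once; they keep the order given in neworder.
setcolorder(DT, c("d", "b"), before = "a")
names(DT)  # "c" "d" "b" "a"

# Supplying both before= and after= is rejected before any reordering happens.
res = tryCatch(setcolorder(DT, "a", before = "b", after = "c"),
               error = function(e) e)
inherits(res, "error")  # TRUE
names(DT)               # still "c" "d" "b" "a"
```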

# test first group listens to nomatch when j uses join inherited scope.
x <- data.table(x=c(1,3,8),x1=10:12, key="x")
3 changes: 2 additions & 1 deletion man/setcolorder.Rd
@@ -9,11 +9,12 @@
}

\usage{
setcolorder(x, neworder=key(x))
setcolorder(x, neworder=key(x), before=NULL, after=NULL)
}
\arguments{
\item{x}{ A \code{data.table}. }
\item{neworder}{ Character vector of the new column name ordering. May also be column numbers. If \code{length(neworder) < length(x)}, the specified columns are moved in order to the "front" of \code{x}. By default, \code{setcolorder} without a specified \code{neworder} moves the key columns in order to the "front" of \code{x}. }
\item{before, after}{ If one of them (not both) is supplied as a single column name or number, the columns in \code{neworder} are placed directly before or after that column. Supplying both, or more than one column for either, is an error. }
}
\details{
To reorder \code{data.table} columns, the idiomatic way is to use \code{setcolorder(x, neworder)} instead of \code{x <- x[, neworder, with=FALSE]}. The latter makes an entire copy of the \code{data.table}, which may be unnecessary in most situations. \code{setcolorder} also accepts column numbers instead of names for the \code{neworder} argument, although we recommend using names as good programming practice.
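
The difference between reordering by reference and reordering by copy can be observed with `address()`, which data.table exports. This is a small illustrative sketch with made-up data, not an excerpt from the package documentation.

```r
library(data.table)

DT = data.table(a = 1:3, b = 4:6, c = 7:9)
addr = address(DT)

# setcolorder() reorders in place: same object, new column order.
setcolorder(DT, c("c", "a"))
names(DT)                      # "c" "a" "b"
identical(address(DT), addr)   # TRUE

# Subsetting columns builds a new data.table instead.
DT2 = DT[, c("b", "c", "a"), with = FALSE]
names(DT2)                     # "b" "c" "a"
identical(address(DT2), addr)  # FALSE
```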