Allow a single column to be used as rownames in as.matrix #2702
Changes from 21 commits
0389de1
b1590ae
1323be0
ddaeb6a
0ab7e4f
8477788
ac52d9a
b9eab65
c0cca0d
11da144
5b4bca7
c5ae94b
f07b813
3d4681c
d9a4a54
de48d84
2810538
8348172
12cace8
895554e
a887594
cde74a2
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1881,17 +1881,59 @@ chmatch2 <- function(x, table, nomatch=NA_integer_) { | |
# x | ||
#} | ||
|
||
|
||
as.matrix.data.table <- function(x,...) | ||
{ | ||
dm <- dim(x) | ||
cn <- names(x) | ||
as.matrix.data.table <- function(x, rownames, ...) { | ||
rn <- NULL | ||
rnc <- NULL | ||
if (!missing(rownames)) { # Convert rownames to a column index if possible | ||
if (length(rownames) == nrow(x)) { | ||
# rownames argument is a vector of row names, no column in x to drop. | ||
rn <- rownames | ||
rnc <- NULL | ||
} else if (!is.null(rownames) && length(rownames) != 1L) { # vector(0) will throw an error, but NULL will pass through | ||
stop(sprintf("rownames must be a single column in x or a vector of row names of length nrow(x)=%d", nrow(x))) | ||
} else if (!(is.null(rownames) || is.logical(rownames) || is.character(rownames) || is.numeric(rownames))) { | ||
# E.g. because rownames is some sort of object that can't be converted to a column index | ||
stop("rownames must be TRUE, a column index, a column name in x, or a vector of row names") | ||
} else if (!is.null(rownames) && !is.na(rownames) && !identical(rownames, FALSE)) { # Handles cases where rownames is a column name, or key(x) from TRUE | ||
if (identical(rownames, TRUE)) { | ||
if (haskey(x)) { | ||
rownames <- key(x) | ||
if (length(rownames) > 1L) { | ||
warning(sprintf("rownames is TRUE but multiple keys [%s] found for x; defaulting to first column x[,1]", | ||
paste(rownames, collapse = ','))) | ||
rownames <- 1L | ||
} | ||
} else { | ||
rownames <- 1L | ||
} | ||
} | ||
if (is.character(rownames)) { | ||
rnc <- chmatch(rownames, names(x)) | ||
if (is.na(rnc)) stop(rownames, " is not a column of x") | ||
} else { # rownames is an index already | ||
if (rownames < 1L || rownames > ncol(x)) | ||
stop(sprintf("rownames is %d which is outside the column number range [1,ncol=%d]", rownames, ncol(x))) | ||
rnc <- rownames | ||
} | ||
} | ||
} | ||
# If the rownames argument has been used, and is a single column, | ||
# extract that column's index (rnc) and drop it from x | ||
if (!is.null(rnc)) { | ||
rn <- x[[rnc]] | ||
dm <- dim(x) - c(0, 1) | ||
cn <- names(x)[-rnc] | ||
X <- x[, .SD, .SDcols = cn] | ||
Thanks, yes I discovered
Interesting, I see. That is a shame about the no visible binding note. |
||
} else { | ||
dm <- dim(x) | ||
cn <- names(x) | ||
X <- x | ||
} | ||
if (any(dm == 0L)) | ||
return(array(NA, dim = dm, dimnames = list(NULL, cn))) | ||
return(array(NA, dim = dm, dimnames = list(rn, cn))) | ||
p <- dm[2L] | ||
n <- dm[1L] | ||
collabs <- as.list(cn) | ||
X <- x | ||
class(X) <- NULL | ||
non.numeric <- non.atomic <- FALSE | ||
all.logical <- TRUE | ||
|
@@ -1936,7 +1978,7 @@ as.matrix.data.table <- function(x,...) | |
} | ||
X <- unlist(X, recursive = FALSE, use.names = FALSE) | ||
dim(X) <- c(n, length(X)/n) | ||
dimnames(X) <- list(NULL, unlist(collabs, use.names = FALSE)) | ||
dimnames(X) <- list(rn, unlist(collabs, use.names = FALSE)) | ||
X | ||
} | ||
|
||
|
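The branch order in `as.matrix.data.table` above is hard to follow in diff form. Here is a minimal base R sketch of how a `rownames` argument might be mapped to a column index. `resolve_rownames_column`, `col_names` and `key_cols` are hypothetical names introduced for illustration and are not part of data.table. The sketch leaves out the branch that accepts a vector of `nrow(x)` row names, and it does not emit the multi-column-key warning.

```r
# Hypothetical helper (not data.table API): maps a `rownames` argument to a
# column index, roughly following the branch order in the diff above.
resolve_rownames_column <- function(rownames, col_names, key_cols = NULL) {
  # NULL and FALSE mean "no row names column".
  if (is.null(rownames) || identical(rownames, FALSE)) return(NULL)
  if (length(rownames) != 1L) stop("rownames must identify a single column")
  if (is.na(rownames)) return(NULL)
  if (identical(rownames, TRUE)) {
    # A single-column key wins; otherwise fall back to the first column.
    rownames <- if (length(key_cols) == 1L) key_cols else 1L
  }
  if (is.character(rownames)) {
    idx <- match(rownames, col_names)
    if (is.na(idx)) stop(rownames, " is not a column of x")
    return(idx)
  }
  # Anything left is treated as a column index and range-checked.
  if (rownames < 1L || rownames > length(col_names))
    stop("rownames is outside the column number range")
  as.integer(rownames)
}

cols <- c("A", "X", "Y")
resolve_rownames_column(TRUE, cols)                  # 1: no key, first column
resolve_rownames_column(TRUE, cols, key_cols = "X")  # 2: single-column key
resolve_rownames_column("Y", cols)                   # 3: column name
```

The returned index plays the role of `rnc` in the diff: the column to read row names from and then drop before the matrix is built.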
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
\name{as.matrix} | ||
\alias{as.matrix} | ||
\alias{as.matrix.data.table} | ||
\title{Convert a data.table to a matrix} | ||
\description{ | ||
Converts a \code{data.table} into a \code{matrix}, optionally using one | ||
of the columns in the \code{data.table} as the \code{matrix} \code{rownames}. | ||
} | ||
\usage{ | ||
\method{as.matrix}{data.table}(x, rownames, ...)} | ||
|
||
\arguments{ | ||
\item{x}{a \code{data.table}} | ||
\item{rownames}{optional, a single column name or column index to use as | ||
the \code{rownames} in the returned \code{matrix}. If \code{TRUE} the | ||
\code{\link{key}} of the \code{data.table} will be used if it is a | ||
single column, otherwise the first column in the \code{data.table} will | ||
be used. Alternatively, a vector of length \code{nrow(x)} to assign as the | ||
row names of the returned \code{matrix}.} | ||
\item{\dots}{additional arguments to be passed to or from methods.} | ||
} | ||
|
||
\details{ | ||
\code{\link{as.matrix}} is a generic function in base R. It dispatches to | ||
\code{as.matrix.data.table} if its \code{x} argument is a \code{data.table}. | ||
|
||
The method for \code{data.table}s will return a character matrix if there | ||
are only atomic columns and any non-(numeric/logical/complex) column, | ||
applying \code{\link{as.vector}} to factors and \code{\link{format}} to other | ||
non-character columns. Otherwise, the usual coercion hierarchy (logical < | ||
integer < double < complex) will be used, e.g., all-logical data frames | ||
will be coerced to a logical matrix, mixed logical-integer will give an | ||
integer matrix, etc. | ||
|
||
An additional argument \code{rownames} is provided for \code{as.matrix.data.table} | ||
to facilitate conversions to matrices where the \code{\link{rownames}} are stored | ||
in a single column of \code{x}, e.g. the first column after using | ||
\code{\link{dcast.data.table}}. | ||
} | ||
|
||
\value{ | ||
A new \code{matrix} containing the contents of \code{x}. | ||
} | ||
|
||
\seealso{ | ||
\code{\link{data.table}}, \code{\link{as.matrix}}, \code{\link{data.matrix}}, | ||
\code{\link{array}} | ||
} | ||
|
||
\examples{ | ||
(dt1 <- data.table(A = letters[1:10], X = 1:10, Y = 11:20)) | ||
as.matrix(dt1) # character matrix | ||
as.matrix(dt1, rownames = "A") | ||
as.matrix(dt1, rownames = 1) | ||
as.matrix(dt1, rownames = TRUE) | ||
|
||
(dt1 <- data.table(A = letters[1:10], X = 1:10, Y = 11:20)) | ||
setkey(dt1, A) | ||
as.matrix(dt1, rownames = TRUE) | ||
} | ||
|
||
\keyword{ array } | ||
|
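The coercion rules in the `\details` section explain why a separate `rownames` argument is useful. The sketch below uses a base R `data.frame` as a stand-in, on the assumption that the `data.table` method follows the same hierarchy as the documentation text describes. Leaving a label column in forces a character matrix. Dropping it and attaching it as row names keeps the numeric type.

```r
# Base R data.frame used as a stand-in for a data.table (assumption).
df <- data.frame(A = c("a", "b", "c"), X = 1:3, Y = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)

# With the label column included, every cell is coerced to character.
m_all <- as.matrix(df)
typeof(m_all)   # "character"

# Dropping the label column first lets logical < integer coercion apply.
m_num <- as.matrix(df[, c("X", "Y")])
typeof(m_num)   # "integer"

# Attach the dropped column as the row names of the matrix.
rownames(m_num) <- df[["A"]]
m_num
```

This is the manual two-step version of what `as.matrix(dt1, rownames = "A")` in the examples is meant to do in a single call.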
not sure we have a style guide on this, but I note that the corresponding CRAN cheat for `[.data.table` symbols are defined in the package environment rather than the function body: https://github.com/Rdatatable/data.table/blob/master/R/data.table.R#L11

they are defined there because they are exported. `rn` won't be used by user.

Yeap - `rn` is internal here, it will contain the vector of rownames to put in the matrix (after all the processing in `if (!missing(rownames)) {}`). `rnc` will contain the index of the column in `x` to be dropped.
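For readers unfamiliar with the "CRAN cheat" mentioned in this thread, here is a small base R sketch, assuming the note in question is the usual `R CMD check` NOTE "no visible binding for global variable". The function `top_scores` and the column `score` are made up for illustration. A symbol that is only resolved through non-standard evaluation looks like an undefined global to static analysis, and a dummy `NULL` binding in the function body is one common way to quiet it.

```r
# `score` is resolved inside `df` by subset()'s non-standard evaluation,
# but static code analysis cannot know that and may flag it as an
# undefined global variable.
top_scores <- function(df) {
  score <- NULL  # dummy local binding; the column shadows it at evaluation time
  subset(df, score > 5)
}

top_scores(data.frame(id = 1:3, score = c(1, 7, 9)))
```

The alternative raised in the first comment is to create the dummy binding once at package level instead of in each function body. `utils::globalVariables()` is another package-level option in base R.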