From 12839fcb02bc1350a859e58f7aafce552e1f676a Mon Sep 17 00:00:00 2001
From: Sarah Wright
Date: Tue, 20 Aug 2024 13:26:49 -0600
Subject: [PATCH] Fix documentation that implies pattern argument can be empty (#563)

Fixes #508

Co-authored by: @RobLBaker
Co-authored-by: Hadley Wickham
---
 R/count.R          | 14 ++++++++++++++
 R/detect.R         |  4 +---
 R/extract.R        |  2 +-
 R/locate.R         |  2 +-
 R/replace.R        |  2 ++
 R/split.R          |  1 -
 man/str_count.Rd   |  3 +--
 man/str_detect.Rd  |  4 +---
 man/str_extract.Rd |  3 +--
 man/str_locate.Rd  |  3 +--
 man/str_remove.Rd  |  4 +---
 man/str_replace.Rd |  4 +++-
 man/str_split.Rd   |  3 +--
 man/str_subset.Rd  |  4 +---
 man/str_view.Rd    |  4 +---
 man/str_which.Rd   |  4 +---
 16 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/R/count.R b/R/count.R
index 8ec0dd38..10e93840 100644
--- a/R/count.R
+++ b/R/count.R
@@ -4,6 +4,20 @@
 #' of `string.`
 #'
 #' @inheritParams str_detect
+#' @param pattern Pattern to look for.
+#'
+#' The default interpretation is a regular expression, as described in
+#' `vignette("regular-expressions")`. Use [regex()] for finer control of the
+#' matching behaviour.
+#'
+#' Match a fixed string (i.e. by comparing only bytes), using
+#' [fixed()]. This is fast, but approximate. Generally,
+#' for matching human text, you'll want [coll()] which
+#' respects character matching rules for the specified locale.
+#'
+#' Match character, word, line and sentence boundaries with
+#' [boundary()]. The empty string, `""``, is equivalent to
+#' `boundary("character")`.
 #' @return An integer vector the same length as `string`/`pattern`.
 #' @seealso [stringi::stri_count()] which this function wraps.
 #'
diff --git a/R/detect.R b/R/detect.R
index 65e3763e..e5a89159 100644
--- a/R/detect.R
+++ b/R/detect.R
@@ -17,9 +17,7 @@
 #' for matching human text, you'll want [coll()] which
 #' respects character matching rules for the specified locale.
 #'
-#' Match character, word, line and sentence boundaries with
-#' [boundary()]. An empty pattern, "", is equivalent to
-#' `boundary("character")`.
+#' You can not match boundaries, including `""`, with this function.
 #'
 #' @param negate If `TRUE`, inverts the resulting boolean vector.
 #' @return A logical vector the same length as `string`/`pattern`.
diff --git a/R/extract.R b/R/extract.R
index 1e7b2fef..99c1862a 100644
--- a/R/extract.R
+++ b/R/extract.R
@@ -3,7 +3,7 @@
 #' `str_extract()` extracts the first complete match from each string,
 #' `str_extract_all()`extracts all matches from each string.
 #'
-#' @inheritParams str_detect
+#' @inheritParams str_count
 #' @param group If supplied, instead of returning the complete match, will
 #' return the matched text from the specified capturing group.
 #' @seealso [str_match()] to extract matched groups;
diff --git a/R/locate.R b/R/locate.R
index c7363db4..2c1f4bab 100644
--- a/R/locate.R
+++ b/R/locate.R
@@ -7,7 +7,7 @@
 #' Because the `start` and `end` values are inclusive, zero-length matches
 #' (e.g. `$`, `^`, `\\b`) will have an `end` that is smaller than `start`.
 #'
-#' @inheritParams str_detect
+#' @inheritParams str_count
 #' @returns
 #' * `str_locate()` returns an integer matrix with two columns and
 #' one row for each element of `string`. The first column, `start`,
diff --git a/R/replace.R b/R/replace.R
index 9e12feea..8d7d8485 100644
--- a/R/replace.R
+++ b/R/replace.R
@@ -18,6 +18,8 @@
 #' [fixed()]. This is fast, but approximate. Generally,
 #' for matching human text, you'll want [coll()] which
 #' respects character matching rules for the specified locale.
+#'
+#' You can not match boundaries, including `""`, with this function.
 #' @param replacement The replacement value, usually a single string,
 #' but it can be the a vector the same length as `string` or `pattern`.
 #' References of the form `\1`, `\2`, etc will be replaced with
diff --git a/R/split.R b/R/split.R
index 0fe98a4b..52f6edf3 100644
--- a/R/split.R
+++ b/R/split.R
@@ -16,7 +16,6 @@
 #' * `str_split_fixed()` splits each string in a character vector into a
 #' fixed number of pieces, returning a character matrix.
 #'
-#' @inheritParams str_detect
 #' @inheritParams str_extract
 #' @param n Maximum number of pieces to return. Default (Inf) uses all
 #' possible split positions.
diff --git a/man/str_count.Rd b/man/str_count.Rd
index bfb51012..06c6d01e 100644
--- a/man/str_count.Rd
+++ b/man/str_count.Rd
@@ -22,8 +22,7 @@ for matching human text, you'll want \code{\link[=coll]{coll()}} which
 respects character matching rules for the specified locale.
 
 Match character, word, line and sentence boundaries with
-\code{\link[=boundary]{boundary()}}. An empty pattern, "", is equivalent to
-\code{boundary("character")}.}
+\code{\link[=boundary]{boundary()}}. The empty string, \verb{""``, is equivalent to }boundary("character")`.}
 }
 \value{
 An integer vector the same length as \code{string}/\code{pattern}.
diff --git a/man/str_detect.Rd b/man/str_detect.Rd
index 965af1a1..27c7a2f3 100644
--- a/man/str_detect.Rd
+++ b/man/str_detect.Rd
@@ -21,9 +21,7 @@ Match a fixed string (i.e. by comparing only bytes), using
 for matching human text, you'll want \code{\link[=coll]{coll()}} which
 respects character matching rules for the specified locale.
 
-Match character, word, line and sentence boundaries with
-\code{\link[=boundary]{boundary()}}. An empty pattern, "", is equivalent to
-\code{boundary("character")}.}
+You can not match boundaries, including \code{""}, with this function.}
 
 \item{negate}{If \code{TRUE}, inverts the resulting boolean vector.}
 }
diff --git a/man/str_extract.Rd b/man/str_extract.Rd
index 44a9d0ef..1c48856c 100644
--- a/man/str_extract.Rd
+++ b/man/str_extract.Rd
@@ -25,8 +25,7 @@ for matching human text, you'll want \code{\link[=coll]{coll()}} which
 respects character matching rules for the specified locale.
 
 Match character, word, line and sentence boundaries with
-\code{\link[=boundary]{boundary()}}. An empty pattern, "", is equivalent to
-\code{boundary("character")}.}
+\code{\link[=boundary]{boundary()}}. The empty string, \verb{""``, is equivalent to }boundary("character")`.}
 
 \item{group}{If supplied, instead of returning the complete match, will
 return the matched text from the specified capturing group.}
diff --git a/man/str_locate.Rd b/man/str_locate.Rd
index 5ad212a0..862be894 100644
--- a/man/str_locate.Rd
+++ b/man/str_locate.Rd
@@ -25,8 +25,7 @@ for matching human text, you'll want \code{\link[=coll]{coll()}} which
 respects character matching rules for the specified locale.
 
 Match character, word, line and sentence boundaries with
-\code{\link[=boundary]{boundary()}}. An empty pattern, "", is equivalent to
-\code{boundary("character")}.}
+\code{\link[=boundary]{boundary()}}. The empty string, \verb{""``, is equivalent to }boundary("character")`.}
 }
 \value{
 \itemize{
diff --git a/man/str_remove.Rd b/man/str_remove.Rd
index da2a5f95..a5eceb25 100644
--- a/man/str_remove.Rd
+++ b/man/str_remove.Rd
@@ -24,9 +24,7 @@ Match a fixed string (i.e. by comparing only bytes), using
 for matching human text, you'll want \code{\link[=coll]{coll()}} which
 respects character matching rules for the specified locale.
 
-Match character, word, line and sentence boundaries with
-\code{\link[=boundary]{boundary()}}. An empty pattern, "", is equivalent to
-\code{boundary("character")}.}
+You can not match boundaries, including \code{""}, with this function.}
 }
 \value{
 A character vector the same length as \code{string}/\code{pattern}.
diff --git a/man/str_replace.Rd b/man/str_replace.Rd
index 3a9b077b..079dba57 100644
--- a/man/str_replace.Rd
+++ b/man/str_replace.Rd
@@ -26,7 +26,9 @@ in each element of \code{string}.
 Match a fixed string (i.e. by comparing only bytes), using
 \code{\link[=fixed]{fixed()}}. This is fast, but approximate. Generally,
 for matching human text, you'll want \code{\link[=coll]{coll()}} which
-respects character matching rules for the specified locale.}
+respects character matching rules for the specified locale.
+
+You can not match boundaries, including \code{""}, with this function.}
 
 \item{replacement}{The replacement value, usually a single string,
 but it can be the a vector the same length as \code{string} or \code{pattern}.
diff --git a/man/str_split.Rd b/man/str_split.Rd
index 16afd3fe..09ed8777 100644
--- a/man/str_split.Rd
+++ b/man/str_split.Rd
@@ -31,8 +31,7 @@ for matching human text, you'll want \code{\link[=coll]{coll()}} which
 respects character matching rules for the specified locale.
 
 Match character, word, line and sentence boundaries with
-\code{\link[=boundary]{boundary()}}. An empty pattern, "", is equivalent to
-\code{boundary("character")}.}
+\code{\link[=boundary]{boundary()}}. The empty string, \verb{""``, is equivalent to }boundary("character")`.}
 
 \item{n}{Maximum number of pieces to return. Default (Inf) uses all
 possible split positions.
diff --git a/man/str_subset.Rd b/man/str_subset.Rd
index 61a259b8..a720e625 100644
--- a/man/str_subset.Rd
+++ b/man/str_subset.Rd
@@ -21,9 +21,7 @@ Match a fixed string (i.e. by comparing only bytes), using
 for matching human text, you'll want \code{\link[=coll]{coll()}} which
 respects character matching rules for the specified locale.
 
-Match character, word, line and sentence boundaries with
-\code{\link[=boundary]{boundary()}}. An empty pattern, "", is equivalent to
-\code{boundary("character")}.}
+You can not match boundaries, including \code{""}, with this function.}
 
 \item{negate}{If \code{TRUE}, inverts the resulting boolean vector.}
 }
diff --git a/man/str_view.Rd b/man/str_view.Rd
index bccc7127..bf0e1767 100644
--- a/man/str_view.Rd
+++ b/man/str_view.Rd
@@ -28,9 +28,7 @@ Match a fixed string (i.e. by comparing only bytes), using
 for matching human text, you'll want \code{\link[=coll]{coll()}} which
 respects character matching rules for the specified locale.
 
-Match character, word, line and sentence boundaries with
-\code{\link[=boundary]{boundary()}}. An empty pattern, "", is equivalent to
-\code{boundary("character")}.}
+You can not match boundaries, including \code{""}, with this function.}
 
 \item{match}{If \code{pattern} is supplied, which elements should be shown?
 \itemize{
diff --git a/man/str_which.Rd b/man/str_which.Rd
index 51250c17..bdb6f296 100644
--- a/man/str_which.Rd
+++ b/man/str_which.Rd
@@ -21,9 +21,7 @@ Match a fixed string (i.e. by comparing only bytes), using
 for matching human text, you'll want \code{\link[=coll]{coll()}} which
 respects character matching rules for the specified locale.
 
-Match character, word, line and sentence boundaries with
-\code{\link[=boundary]{boundary()}}. An empty pattern, "", is equivalent to
-\code{boundary("character")}.}
+You can not match boundaries, including \code{""}, with this function.}
 
 \item{negate}{If \code{TRUE}, inverts the resulting boolean vector.}
 }