test.data.table() creates DT in .GlobalEnv #5514

Closed
mattdowle opened this issue Nov 7, 2022 · 7 comments · Fixed by #5515


@mattdowle
Member

It shouldn't be touching .GlobalEnv

> require(data.table)  # 1.14.4 and before
> DT
Error: object 'DT' not found   # correct error
> test.data.table()
All 10039 tests (last 2163) in tests/tests.Rraw.bz2 completed ok in 36.8s elapsed (48.6s cpu)
> DT
   a
1: 1

Tests are already run in their own environment to isolate them from .GlobalEnv, and that works well. But test 2036 uses source(), which needs local=TRUE added.
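The effect of `local=TRUE` can be illustrated with a minimal sketch. The script contents, the variable name `demo_DT`, and the helper `run_isolated()` are hypothetical stand-ins; the file actually sourced by test 2036 is not shown in this thread.

```R
# Hypothetical script that assigns a variable at top level, standing in for
# the file that test 2036 sources (the real file is not shown in this thread).
script <- tempfile(fileext = ".R")
writeLines("demo_DT <- data.frame(a = 1)", script)

run_isolated <- function(path) {
  # local = TRUE evaluates the script in this function's frame,
  # so demo_DT becomes a local variable rather than a global one.
  source(path, local = TRUE)
  exists("demo_DT", envir = environment(), inherits = FALSE)
}

created_locally  <- run_isolated(script)
leaked_to_global <- exists("demo_DT", envir = globalenv(), inherits = FALSE)
unlink(script)
```

With the default `local=FALSE`, `source()` evaluates in the global environment, which is how a variable assigned by the sourced file can end up in the user's workspace.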

@mattdowle mattdowle added this to the 1.14.5 milestone Nov 7, 2022
mattdowle added a commit that referenced this issue Nov 8, 2022
….GlobalEnv, closes #5514; remove DTfun fix no longer needed.
@MichaelChirico
Member

It would be great if we could lockEnvironment(.GlobalEnv), but for some strange reason R apparently doesn't offer any way to unlockEnvironment() afterwards. There are apparently some ways to hack together an unlockEnvironment() in C, see e.g. here, but I'd rather not do that.

@mattdowle
Member Author

mattdowle commented Nov 9, 2022

Good idea. Agreed, it's strange that unlockEnvironment() doesn't exist.
Then how about save.image() before and after, followed by a binary compare? Saving the image could be a concern in a user's environment if they had large or sensitive data in .GlobalEnv and ran test.data.table(), so it could be done in CRAN_Release.cmd and/or GLCI instead.
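A rough sketch of the before/after idea follows. It is not the code added to CRAN_Release.cmd: instead of calling save.image() and comparing files, it compares in-memory snapshots of .GlobalEnv. The helper name `check_global_writes()` and the variable `leaked_demo` are hypothetical, and excluding `.Random.seed` and `.Last` is an assumption based on the later comments in this thread.

```R
# Hypothetical helper: report bindings created, removed, or changed in
# .GlobalEnv while `code` is evaluated. This compares in-memory snapshots
# rather than save.image() files.
check_global_writes <- function(code, exclude = c(".Random.seed", ".Last")) {
  snapshot <- function() {
    nms <- setdiff(ls(globalenv(), all.names = TRUE), exclude)
    mget(nms, envir = globalenv())
  }
  before <- snapshot()
  force(code)            # evaluate the code under test
  after <- snapshot()
  common  <- intersect(names(before), names(after))
  changed <- vapply(common,
                    function(n) !identical(before[[n]], after[[n]]),
                    logical(1))
  list(created = setdiff(names(after), names(before)),
       removed = setdiff(names(before), names(after)),
       changed = common[changed])
}

# Simulate a test that leaks a variable into .GlobalEnv, then clean up.
result <- check_global_writes(assign("leaked_demo", 1L, envir = globalenv()))
rm("leaked_demo", envir = globalenv())
```

Because the snapshots hold copies of every global object, this sketch shares the concern raised above about large or sensitive data, so it is better suited to a release script or CI job than to a user's session.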

mattdowle added a commit that referenced this issue Nov 9, 2022
@mattdowle
Member Author

Using 1.14.4 I checked that those commands added to CRAN_Release would have found this DT being written, and that there are no others. No others in dev as of now either.

@MichaelChirico
Member

Agree it's probably best to handle in GLCI (with an environment variable) or in CRAN_release, because it's fine to lockEnvironment(.GlobalEnv) as long as the session exits after running the test.

@mattdowle
Member Author

mattdowle commented Nov 9, 2022

Good point. In that case it would need to be lockEnvironment(.GlobalEnv, bindings=TRUE) otherwise it appears from reading ?lockEnvironment that existing variables could still be changed with the default bindings=FALSE.

I wonder if lockEnvironment(.GlobalEnv, bindings=TRUE) would prevent .Last and .Random.seed from being created/changed. Those aren't assigned by us but by base R. Those were picked up by the diff method so I excluded them in e956716.
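The difference between the two `bindings` settings can be checked on throwaway environments, which avoids locking .GlobalEnv itself. This is only a sketch; the names `e1`, `e2`, `r_new` and `r_mod` are illustrative.

```R
# bindings = FALSE (the default): no new bindings can be added,
# but existing variables can still be modified.
e1 <- new.env()
assign("x", 1, envir = e1)
lockEnvironment(e1)
assign("x", 2, envir = e1)   # succeeds: existing binding is not locked
r_new <- tryCatch({ assign("y", 1, envir = e1); "created" },
                  error = function(e) "blocked")

# bindings = TRUE: existing variables are locked as well.
e2 <- new.env()
assign("x", 1, envir = e2)
lockEnvironment(e2, bindings = TRUE)
r_mod <- tryCatch({ assign("x", 2, envir = e2); "modified" },
                  error = function(e) "blocked")
```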

@mattdowle
Member Author

Maybe lockEnvironment(.GlobalEnv, bindings=TRUE); unlockBinding(".Last", .GlobalEnv); unlockBinding(".Random.seed", .GlobalEnv) would work; i.e. lock all bindings other than .Last and .Random.seed. They could be created first by dummy calls to the R functions that create them, or setting them to NULL might be enough to ensure the bindings exist before locking the environment, after which new bindings can't be created.
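The lock-then-unlock idea can be tried on a scratch environment first. The sketch below takes the "set to NULL" route to make sure the binding exists before locking; it uses a new environment `e` rather than .GlobalEnv because a locked environment cannot be unlocked again, and the variable names are illustrative.

```R
e <- new.env()
assign(".Random.seed", NULL, envir = e)  # ensure the binding exists up front
assign("x", 1, envir = e)

lockEnvironment(e, bindings = TRUE)      # lock the environment and all bindings
unlockBinding(".Random.seed", e)         # then release just this one binding

assign(".Random.seed", 1:3, envir = e)   # allowed: binding was unlocked
x_result <- tryCatch({ assign("x", 2, envir = e); "modified" },
                     error = function(err) "blocked")
new_result <- tryCatch({ assign("y", 1, envir = e); "created" },
                       error = function(err) "blocked")
```

Whether base R copes with `.Random.seed` or `.Last` being pre-set to NULL in .GlobalEnv is not tested here.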

@MichaelChirico
Member

> otherwise it appears... existing variables could still be changed...

Oh, good catch, yes, I assumed bindings=TRUE was the default.

> They could be created first by dummy calls to the R functions that create them, or setting them to NULL

Either way, it should be easy enough to play with. The latter looks cleaner, but the risk is that some R code assumes they're non-NULL or have length()>0.

@mattdowle mattdowle modified the milestones: 1.14.7, 1.14.6 Nov 16, 2022
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Jan 25, 2024
# data.table [v1.14.10](https://github.com/Rdatatable/data.table/milestone/20)

## NOTES

1. Maintainer of the package for CRAN releases is from now on Tyson
  Barrett (@TysonStanley),
  [#5710](Rdatatable/data.table#5710).

2. Updated internal code for breaking change of `is.atomic(NULL)` in
  R-devel,
  [#5691](Rdatatable/data.table#5691). Thanks to
  Martin Maechler for the patch.

3. Fix multiple tests concerning coercion to missing complex numbers,
  [#5695](Rdatatable/data.table#5695) and
  [#5748](Rdatatable/data.table#5748). Thanks
  to @MichaelChirico and @ben-schwen for the patches.

4. Fix multiple format warnings (e.g., -Wformat)
  [#5712](Rdatatable/data.table#5712),
  [#5781](Rdatatable/data.table#5781),
  [#5880](Rdatatable/data.table#5800),
  [#5786](Rdatatable/data.table#5786). Thanks to
  @MichaelChirico and @jangorecki for the patches.



# data.table [v1.14.8](https://github.com/Rdatatable/data.table/milestone/28?closed=1)  (17 Feb 2023)

## NOTES

1. Test 1613.605 now passes changes to `as.data.frame()` in R-devel,
  [#5597](Rdatatable/data.table#5597). Thanks to
  Avraham Adler for reporting.

2. An out of bounds read when combining non-equi join with `by=.EACHI`
  has been found and fixed thanks to clang ASAN,
  [#5598](Rdatatable/data.table#5598). There
  was no bug or consequence because the read was followed (now preceded)
  by a bounds test.

3. `.rbind.data.table` (note the leading `.`) is no longer exported
  when `data.table` is installed in R>=4.0.0 (Apr 2020),
  [#5600](Rdatatable/data.table#5600). It was
  never documented which R-devel now detects and warns about. It is only
  needed by `data.table` internals to support R<4.0.0; see note 1 in
  v1.12.6 (Oct 2019) below in this file for more details.


# data.table [v1.14.6](https://github.com/Rdatatable/data.table/milestone/27?closed=1)  (16 Nov 2022)

## BUG FIXES

1. `fread()` could leak memory,
  [#3292](Rdatatable/data.table#3292). Thanks
  to @patrickhowerter for reporting, and Jim Hester for the fix. The fix
  requires R 3.4.0 or later. Loading `data.table` in earlier versions
  now highlights this issue on startup, asks users to upgrade R, and
  warns that we intend to upgrade `data.table`'s dependency from 8 year
  old R 3.1.0 (April 2014) to 5 year old R 3.4.0 (April 2017).

## NOTES

1. Test 1962.098 has been modified to pass latest changes to `POSIXt`
  in R-devel.

2. `test.data.table()` no longer creates `DT` in `.GlobalEnv`, a CRAN
  policy violation,
  [#5514](Rdatatable/data.table#5514). No
  other writes occurred to `.GlobalEnv` and release procedures have been
  improved to prevent this happening again.

3. The memory usage of the test suite has been halved,
  [#5507](Rdatatable/data.table#5507).


# data.table [v1.14.4](https://github.com/Rdatatable/data.table/milestone/26?closed=1)  (17 Oct 2022)

## NOTES

1. gcc 12.1 (May 2022) now detects and warns about an always-false
  condition (`-Waddress`) in `fread` which caused a small efficiency
  saving never to be invoked,
  [#5476](Rdatatable/data.table#5476). Thanks to
  CRAN for testing latest versions of compilers.

2. `update.dev.pkg()` has been renamed `update_dev_pkg()` to get out
  of the way of the `stats::update` generic function,
  [#5421](Rdatatable/data.table#5421). This is a
  utility function which upgrades the version of `data.table` to the
  latest commit in development which has passed all tests. As such we
  don't expect any backwards compatibility concerns. Its manual page was
  causing an intermittent hang/crash from `R CMD check` on Windows-only
  on CRAN which we hope will be worked around by changing its name.

3. Internal C code now passes `-Wstrict-prototypes` to satisfy the
  warnings now displayed on CRAN,
  [#5477](Rdatatable/data.table#5477).

4. `write.csv` in R-devel no longer responds to
  `getOption("digits.secs")` for `POSIXct`,
  [#5478](Rdatatable/data.table#5478). This
  caused our tests of `fwrite(, dateTimeAs="write.csv")` to fail on
  CRAN's daily checks using latest daily R-devel. While R-devel
  discussion continues, and currently it seems like the change is
  intended with further changes possible, this `data.table` release
  massages our tests to pass on latest R-devel. The idea is to try to
  get out of the way of R-devel changes in this regard until the new
  behavior of `write.csv` is released and confirmed. Package updates are
  not accepted on CRAN if they do not pass the latest daily version of
  R-devel, even if R-devel changes after the package update is
  submitted. If the change to `write.csv()` stands, then a future
  release of `data.table` will be needed to make `fwrite(,
  dateTimeAs="write.csv")` match `write.csv()` output again in that
  future version of R onwards. If you use an older version of
  `data.table` than said future one in the said future version of R,
  then `fwrite(, dateTimeAs="write.csv")` may not match `write.csv()` if
  you are using `getOption("digits.secs")` too. However, you can always
  check that your installation of `data.table` works in your version of
  R on your platform by simply running `test.data.table()`
  yourself. Doing so would detect such a situation for you: test 1741
  would fail in this case. `test.data.table()` runs the entire suite of
  tests and is always available to you locally. This way you do not need
  to rely on our statements about which combinations of versions of R
  and `data.table` on which platforms we have tested and support; just
  run `test.data.table()` yourself. Having said that, because test 1741
  has been relaxed in this release in order to be accepted on CRAN to
  pass latest R-devel, this won't be true for this particular release in
  regard to this particular test.

    ```R
    $ R --vanilla
    R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
    > DF = data.frame(A=as.POSIXct("2022-10-01 01:23:45.012"))
    > options(digits.secs=0)
    > write.csv(DF)
    "","A"
    "1",2022-10-01 01:23:45
    > options(digits.secs=3)
    > write.csv(DF)
    "","A"
    "1",2022-10-01 01:23:45.012

    $ Rdevel --vanilla
    R Under development (unstable) (2022-10-06 r83040) -- "Unsuffered Consequences"
    > DF = data.frame(A=as.POSIXct("2022-10-01 01:23:45.012"))
    > options(digits.secs=0)
    > write.csv(DF)
    "","A"
    "1",2022-10-01 01:23:45.012
    ```

5. Many thanks to Kurt Hornik for investigating potential impact of a
  possible future change to `base::intersect()` on empty input,
  providing a patch so that `data.table` won't break if the change is
  made to R, and giving us plenty of notice,
  [#5183](Rdatatable/data.table#5183).

6. `datatable.[dll|so]` has changed name to `data_table.[dll|so]`,
  [#4442](Rdatatable/data.table#4442). Thanks to
  Jan Gorecki for the PR. We had previously removed the `.` since `.` is
  not allowed by the following paragraph in the Writing-R-Extensions
  manual. Replacing `.` with `_` instead now seems more consistent with
  the last sentence.

    > ... the basename of the DLL needs to be both a valid file name
      and valid as part of a C entry point (e.g. it cannot contain
      ‘.’): for portable code it is best to confine DLL names to be
      ASCII alphanumeric plus underscore. If entry point R_init_lib is
      not found it is also looked for with ‘.’ replaced by ‘_’.


# data.table [v1.14.2](https://github.com/Rdatatable/data.table/milestone/24?closed=1)  (27 Sep 2021)

## NOTES

1. clang 13.0.0 (Sep 2021) requires the system header `omp.h` to be
  included before R's headers,
  [#5122](Rdatatable/data.table#5122). Many
  thanks to Prof Ripley for testing and providing a patch file.