Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop direct use of NAMED and LEVELS #6420

Merged
merged 4 commits into from
Aug 28, 2024
Merged

Drop direct use of NAMED and LEVELS #6420

merged 4 commits into from
Aug 28, 2024

Conversation

aitap
Copy link
Contributor

@aitap aitap commented Aug 28, 2024

These are the remaining easy pickings for #6180.

Costs:

  • With verbose = TRUE, assign now only prints whether REFCNT() is >= 1 and >= 2, not its value directly.
  • Rf_charIsASCII is only an "eapi" function, not a bona-fide "api" function, and it is currently only in R-devel. But it doesn't raise NOTEs.

NAMED/REFCNT version information gathered here (shameless plug).

Since data.table now depends on R >= 3.3, the backports are no longer
needed. Moreover, MAYBE_SHARED is currently a function, while
MAYBE_REFERENCED expands to !NO_REFERENCES (which is a function).

In debugging output, show MAYBE_REFERENCED (NAMED > 0) instead of NAMED.
@MichaelChirico
Copy link
Member

thanks for the great resource!

would you mind

  1. splitting this into two PRs? id rather wait for Rf_isASCII to make a release before depending on it
  2. adding a NOTE to NEWS & citing yourself?

re:LEVELS, is there a performance penalty for the new approach?

Copy link

github-actions bot commented Aug 28, 2024

Comparison Plot

Generated via commit d626df2

Download link for the artifact containing the test results: ↓ atime-results.zip

Time taken to finish the standard R installation steps: 11 minutes and 21 seconds

Time taken to run atime::atime_pkg on the tests: 5 minutes and 45 seconds

@aitap
Copy link
Contributor Author

aitap commented Aug 28, 2024

re:LEVELS, is there a performance penalty for the new approach?

Currently, IS_ASCII(x) expands to LEVELS(x) & 64, so a function call and a bitmask test. With marked encodings, a call to getCharCE() and an equality test should be similar in performance. Rf_charIsASCII performs the same bitmask test inside a function call (where LEVELS expands to access internal SEXPREC fields). I think it should all be similar. If you have any pointers, I can try to write a performance test.

Will split and update the PR soon-ish. Thanks for the comments!

aitap added 2 commits August 28, 2024 18:26
getCharCE appeared in R-2.7, making it possible to check for strings
_marked_ as UTF-8 or Latin-1.

There is no marking as ASCII, so fixing IS_ASCII will have to wait for R
>= 4.5.
NEWS.md Outdated Show resolved Hide resolved
@MichaelChirico MichaelChirico merged commit fc9b328 into master Aug 28, 2024
8 checks passed
@MichaelChirico MichaelChirico deleted the nonapi_b_gone branch August 28, 2024 16:42
@tdhock
Copy link
Member

tdhock commented Aug 29, 2024

nice NEWS item!
This is a non-trivial contribution that merits adding Ivan to Authors@R in DESCRIPTION as role="ctb"
Ivan can you please do that in a new PR?

@aitap aitap mentioned this pull request Aug 29, 2024
MichaelChirico added a commit that referenced this pull request Aug 30, 2024
* Drop direct use of NAMED

Since data.table now depends on R >= 3.3, the backports are no longer
needed. Moreover, MAYBE_SHARED is currently a function, while
MAYBE_REFERENCED expands to !NO_REFERENCES (which is a function).

In debugging output, show MAYBE_REFERENCED (NAMED > 0) instead of NAMED.

* Almost drop direct use of LEVELS

getCharCE appeared in R-2.7, making it possible to check for strings
_marked_ as UTF-8 or Latin-1.

There is no marking as ASCII, so fixing IS_ASCII will have to wait for R
>= 4.5.

* NEWS entry for NAMED, LEVELS reduction

* fine-tune NEWS

---------

Co-authored-by: Michael Chirico <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants