fwrite gzip on OpenBSD error #5048

Closed
philippechataignon opened this issue Jun 18, 2021 · 3 comments · Fixed by #5049

@philippechataignon
Contributor

On OpenBSD 6.9 (released May 1, 2021), writing a gzipped CSV fails:

> library(data.table)
> test.data.table()
getDTthreads(verbose=TRUE):
This installation of data.table has not been compiled with OpenMP support.
  omp_get_num_procs()            1
  R_DATATABLE_NUM_PROCS_PERCENT  unset (default 50)
  R_DATATABLE_NUM_THREADS        unset
  R_DATATABLE_THROTTLE           unset (default 1024)
  omp_get_thread_limit()         1
  omp_get_max_threads()          1
  OMP_THREAD_LIMIT               unset
  OMP_NUM_THREADS                unset
  RestoreAfterFork               true
  data.table is using 1 threads with throttle==1024. See ?setDTthreads.
test.data.table() running: /usr/local/lib/R/library/data.table/tests/tests.Rraw.bz2 

**** Suggested package bit64 is not installed. Tests using it will be skipped.
**** Suggested package xts is not installed. Tests using it will be skipped.
**** Suggested package nanotime is not installed. Tests using it will be skipped.
**** Suggested package R.utils is not installed. Tests using it will be skipped.
**** Suggested package yaml is not installed. Tests using it will be skipped.

Test 1658.53 produced 1 errors but expected 0
Expected: 
Observed: Compress gzip error: -9
Test 1760 not run because this session either has no OpenMP or has been limited to one thread (e.g. under UBSAN and ASAN)

Test 1658.53 produced 1 errors but expected 0
Expected: 
Observed: Compress gzip error: -9
Fri Jun 18 15:21:25 2021  endian==little, sizeof(long double)==16, longdouble.digits==64, sizeof(pointer)==8, TZ==unset, Sys.timezone()=='Europe/Paris', Sys.getlocale()=='C', l10n_info()=='MBCS=FALSE; UTF-8=FALSE; Latin-1=FALSE', getDTthreads()=='This installation of data.table has not been compiled with OpenMP support.; omp_get_num_procs()==1; R_DATATABLE_NUM_PROCS_PERCENT==unset (default 50); R_DATATABLE_NUM_THREADS==unset; R_DATATABLE_THROTTLE==unset (default 1024); omp_get_thread_limit()==1; omp_get_max_threads()==1; OMP_THREAD_LIMIT==unset; OMP_NUM_THREADS==unset; RestoreAfterFork==true; data.table is using 1 threads with throttle==1024. See ?setDTthreads.', zlibVersion()==1.2.3 ZLIB_VERSION==1.2.3
Error in test.data.table() : 
  1 error out of 9060. Search tests/tests.Rraw.bz2 for test number 1658.53.

Reproducible example:

> library(data.table)
> sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-unknown-openbsd6.9 (64-bit)
Running under: OpenBSD 6.9 GENERIC.MP#3 amd64

Matrix products: default
BLAS:   /usr/local/lib/R/lib/libRblas.so.37.0
LAPACK: /usr/local/lib/R/lib/libRlapack.so.37.0

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.14.0

loaded via a namespace (and not attached):
[1] compiler_4.0.5
> a = data.table(i=1:100)
> fwrite(a, "test.csv.gz")
Error in fwrite(a, "test.csv.gz") : Compress gzip error: -9
Execution halted

philippechataignon commented Jun 18, 2021

Error code -9 is internal to fwrite and is returned when deflate does not return Z_STREAM_END.

The zlib documentation says: "In order to complete in one call, avail_out must be at least the value returned by deflateBound (see below). Then deflate is guaranteed to return Z_STREAM_END."
We size the output buffer with deflateBound, which is why we check for Z_STREAM_END as the deflate return value.

But OpenBSD ships an old zlib library as part of the base system (not as a package): version 1.2.3 from 2005 (the current version is 1.2.11, released in 2017).
In version 1.2.3, deflateBound is incorrect for the gzip format (this seems to be fixed in 1.2.3.1: "- Take into account wrapper variations in deflateBound()") and returns a value that is too small: 20 instead of 22 in the reproducible example above.
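To illustrate why the wrapper matters for an output bound, here is a minimal sketch using Python's standard `zlib` module rather than data.table's C code. It deflates the same bytes as a raw stream, with the zlib wrapper, and with the gzip wrapper, then compares the framing each wrapper adds. The payload and the helper name `framed_size` are illustrative assumptions, not taken from fwrite. The gzip wrapper adds a 10-byte header and an 8-byte trailer, while the zlib wrapper adds a 2-byte header and a 4-byte checksum, so a bound that does not account for the gzip framing can fall a few bytes short.

```python
import zlib


def framed_size(data: bytes, wbits: int) -> int:
    """Size of `data` deflated in one shot; `wbits` selects the wrapper.

    -15 = raw deflate, 15 = zlib wrapper, 31 = gzip wrapper.
    """
    comp = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return len(comp.compress(data) + comp.flush())


# Illustrative stand-in for a small CSV: a header line plus 1..100.
payload = b"i\n" + b"".join(b"%d\n" % i for i in range(1, 101))

raw_size = framed_size(payload, -15)
zlib_overhead = framed_size(payload, 15) - raw_size   # 2-byte header + 4-byte Adler-32
gzip_overhead = framed_size(payload, 31) - raw_size   # 10-byte header + 8-byte trailer

print("zlib wrapper overhead:", zlib_overhead)
print("gzip wrapper overhead:", gzip_overhead)
```

The 12-byte difference between the two wrappers is a property of the formats themselves; how far a particular zlib release's deflateBound is off depends on that release.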

philippechataignon added a commit that referenced this issue Jun 18, 2021
Currently, the buffer size for writing the header uses only headerLen. With this
commit, the buffer size for writing the header uses headerLen only when that size
is larger than the size given by the buffer size parameter (buffMB).

Resolves #5048
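One way to read the sizing rule in this commit message is as a simple maximum. The sketch below is hypothetical and is not the code from the commit: the function name `header_buffer_size` is invented for illustration, and it assumes buffMB is expressed in MiB.

```python
def header_buffer_size(header_len: int, buff_mb: int) -> int:
    """Hypothetical sizing rule: use header_len only when it exceeds the
    size implied by buff_mb (assumed here to be in MiB)."""
    buff_size = buff_mb * 1024 * 1024
    return max(header_len, buff_size)


# A short header such as "i\n" follows the buffMB-derived size.
small = header_buffer_size(header_len=2, buff_mb=8)

# A header longer than the buffMB-derived size still gets enough room.
large = header_buffer_size(header_len=9 * 1024 * 1024, buff_mb=8)
```

Under this reading, a small header no longer gets a buffer sized only from its own length, so a bound that is a few bytes short for the gzip wrapper would no longer matter in that case.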
mattdowle added this to the 1.14.1 milestone Jun 21, 2021

mattdowle commented Jun 21, 2021

Many thanks! As you wrote that the PR is an attempt, leaving this issue open but marked 1.14.1 to serve as a reminder. If it is confirmed fixed before release then we can update the news item and close this issue at that point?


philippechataignon commented Jun 23, 2021

Confirmed fixed.

jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023