Do not suggest fread fill=TRUE if already used #2727
Comments
If |
Rprofmem.out.zip |
One up this, which can be reproduced using this simple toy example:
The issue seems to be incomplete sampling of the possible number of columns. I don't necessarily think |
I have the same issue actually. Is there a workaround for now? |
Find out the max number of fields, create a 'fake' line with that number of fields at the beginning of the file, read the file, scrap that line. |
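The workaround described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `fread_padded` is a hypothetical helper name (not part of data.table), the file is assumed to be comma-separated with no header row, and fields are assumed not to contain embedded newlines. The padding line is written to a temporary copy so the original file is left untouched.

```r
library(data.table)

# Hypothetical helper, not part of data.table: pad a temporary copy of the
# file with one line that has the maximum number of fields, read it with
# fread, then drop the padding row again.
fread_padded <- function(path, sep = ",") {
  # Count fields on every line; NA can appear for lines inside multi-line
  # quoted fields, hence na.rm = TRUE.
  n_fields <- max(
    count.fields(path, sep = sep, quote = "\"", comment.char = ""),
    na.rm = TRUE
  )

  # A line of (n_fields - 1) separators has n_fields empty fields, so it
  # fixes the column count without forcing any column type.
  pad_line <- strrep(sep, n_fields - 1L)

  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  writeLines(pad_line, tmp)
  file.append(tmp, path)  # append the original contents after the padding line

  dt <- fread(tmp, sep = sep, header = FALSE, fill = TRUE)
  dt[-1L]  # scrap the padding row
}

# Toy usage: a ragged file whose widest row is not the first one.
toy <- tempfile(fileext = ".csv")
writeLines(c("1,2", "3,4,5", "6"), toy)
result <- fread_padded(toy)
print(result)
```

Note that this copies the whole file once, which may matter for multi-gigabyte inputs like the one mentioned later in the thread.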
I have the same issue, but my CSV is 3.5 GB. Thank you in advance. |
Having the same issue. Is there any workaround that doesn't involve patching the start of the file contents? |
I have the same issue. |
I'm still getting this same issue (May 2023). It looks like there is a pull request to fix this, but it hasn't been implemented. If any of your rows are longer than the longest row in whatever sample fread takes (maybe the first 100 rows?), it gives the warning below, even when fill = T is already used, and stops reading at that line. fread also throws an error if you try to pass an integer as the column-count guess. Warning message: Any idea of when this fix will get implemented? I'm using data.table v1.14.8. Here is the verbose readout, if that is helpful:
This installation of data.table has not been compiled with OpenMP support. |
I wouldn't say it is closed. The behavior observed in the initially reported issue and on current master is far from ideal. Now it suggests using fill=10, but then you get the error again, this time suggesting fill=11. |
True, it's raising the suggestion until |
Yes, usually people will be happy to have their files loaded, not necessarily in the fastest possible way. Then maybe |
Sounds good in theory, but unfortunately it allocates 2^31 columns of size 8 byte and kills the R process 😄 |
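To put the figures from the comment above in perspective, here is the arithmetic as a small sketch. It takes the numbers at face value (2^31 columns, 8 bytes per column slot); whether that matches what fread actually allocates internally is an assumption, not something verified here.

```r
# Figures taken from the comment above; treated as assumptions.
n_cols <- 2^31          # number of columns allocated
bytes_per_slot <- 8     # bytes per column slot

# Total size in GiB (1 GiB = 2^30 bytes): 2^31 * 8 = 2^34 bytes.
total_gib <- n_cols * bytes_per_slot / 2^30
print(total_gib)  # 16
```

That is 16 GiB before a single row of data is stored, which would explain the R process being killed on a typical machine.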
Current fread behavior.
Before printing this warning we should check if fill=TRUE was used.