> fread("~/Downloads/test.txt")
Error in fread("~/Downloads/test.txt") :
Expecting 7 cols but row 0 contains only 6 cols (sep='|'). Consider fill=true. <<"aa aa aa aa aa aa aa aa !"|1|aa.aa.aa|"aa aa aa aa aa aa aa aa ! 1 aaûaa 1 aa aa1 aa : aa aa aa'aa aa aa aaé aa aa aa aa ! aa aa aa aa !"|aa aa aa aa aa aa aa aa ! ,|>>
This is a valid error message (except that the reported row number is incorrect): the line shown does have 6 fields instead of the expected 7, and adding the option fill=TRUE reads the file correctly. There is no segfault.
On a Windows machine, however, I can confirm that running fread("test.txt") produces a segfault, whereas fread("test.txt", fill=T) reads the data correctly. Here's the log leading up to the segfault:
> fread("Downloads/test.txt", verbose=T)
Input contains no \n. Taking this to be a filename to open
[1] Check arguments
Using 8 threads (omp_get_max_threads()=8, nth=8)
NAstrings = [<<NA>>]
None of the NAstrings look like numbers.
show progress = 1
[2] Opening the file
Opening file Downloads/test.txt
File opened, size = 95.7KB (zd bytes).
Memory mapping ... ok
[3] Detect and skip BOM
[4] Detect end-of-line character(s)
Detected eol as \r\n (CRLF) in that order, the Windows standard.
[6] Skipping initial rows if needed
Positioned on line 1 starting: <<aa|aa aa|aa|aa|aa|aa aa|aa aa>>
[7] Detect separator, quoting rule, and ncolumns
Detecting sep ...
sep=',' with 1 lines of 3 fields using quote rule 2
sep='|' with 100 lines of 7 fields using quote rule 0
Detected 7 columns on line 1. This line is either column names or first data row. Line starts as: <<aa|aa aa|aa|aa|aa|aa aa|aa aa>>
Quote rule picked = 0
[8] Determine column names
All the fields on line 1 are character fields. Treating as the column names.
[9] Detect column types
Number of sampling jump points = 11 because (97943 bytes from row 1 to eof) / (2 * 4859 jump0size) == 10
Type codes (jump 000) : 6266622 Quote rule 0
Bumping quote rule from 0 to 1 due to field 1 on line 7 of sampling jump 1 starting <<"aa : aaAcaa aa aa aa aa aa "aa aa aa" ?"|1|aa-aa.aa|"-"|||>>
Bumping quote rule from 1 to 2 due to field 1 on line 7 of sampling jump 1 starting <<"aa : aaAcaa aa aa aa aa aa "aa aa aa" ?"|1|aa-aa.aa|"-"|||>>
Type codes (jump 001) : 6266622 Quote rule 2
Type codes (jump 010) : 6266622 Quote rule 2
=====
Sampled 1039 rows (handled \n inside quoted fields) at 11 jump points
Bytes from first data row on line 2 to the end of last row: 97943
Line length: mean=61.00 sd=19.55 min=22 max=146
Estimated number of rows: 97943 / 61.00 = 1606
Initial alloc = 3212 rows (1606 + 100%) using bytes/max(mean-2*sd,min) clamped between [1.1*estn, 2.0*estn]
=====
[10] Apply user overrides on column types
After 0 type and 0 drop user overrides : 6266622
[11] Allocate memory for the datatable
Allocating 7 column slots (7 - 0 dropped) with 3212 rows
[12] Read the data
I get a segfault when using fread on this file: test.txt
I use the latest dev version; the problem does not occur with the CRAN version.
The file can be fixed by adding a separator at the end of line 116.
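A line that is short of a field, like line 116 here, can be located without fread by counting separators per line. The following is a minimal sketch in base R, not the method used in this report: the sample file contents are hypothetical stand-ins for test.txt, and it assumes no field contains an embedded newline, since quote handling is switched off.

```r
# Hypothetical sample standing in for test.txt: a 3-field header and
# one data row (line 3) that has only 2 fields.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a|b|c", "1|x|p", "2|y", "3|z|q"), tmp)

# Count the '|'-separated fields on every physical line.
# quote = "" disables quote handling, so stray quotes inside a field
# cannot hide a separator or swallow a line break.
n_fields <- count.fields(tmp, sep = "|", quote = "", blank.lines.skip = FALSE)

# Line numbers whose field count differs from the header's.
bad_lines <- which(n_fields != n_fields[1])
print(bad_lines)
```

On the real file, replacing tmp with the path to test.txt should list the lines that need a trailing separator or fill=TRUE.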
Not sure whether it is related to an existing issue or not.