Problem with unvisible "\r" symbols #58
Comments
"\r" shall be treated like any other character. I expect your problem to be related to the way quotes are written or escaped. Try to isolate the CSV line responsible for your problem and send it to us along with the options you're using. |
I fixed it by writing a transformer that replaces \n\r with \n.
|
OK, now I understand your problem. Did you try setting the option "rowDelimiter" to "\r\n"? |
I just tried this. ( ^ ) Does not fix the problem. I'm not getting many errors, so perhaps this is unique to the start of the file or something. |
if i could get a test or a sample script to replicate the issue, i'll be happy to fix it. |
@wdavidw @ekussberg: in addition, it would be nice to have some kind of support for dealing with mixed line endings. I know it shouldn't have to be the responsibility of the CSV parser, but it's a little frustrating when a 3rd-party data source randomly starts intermixing line endings in a file and parsing blows up. Pretty much any utility I've written lately has to support a |
I am not against the idea. CSV input is such a mess sometimes, and we often don't control the source. If you wish to work on it, I'll review the patch. |
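As a starting point for the mixed-line-ending idea, here is a minimal sketch that only diagnoses the problem before parsing. The helper names `countLineEndings` and `hasMixedLineEndings` are hypothetical and are not part of csv-parse. Note the assumption that quoting is ignored: a newline inside a quoted cell is counted like any other, so a file with multiline cells may be reported as mixed even if its record endings are consistent.

```javascript
// Hypothetical helpers, not part of csv-parse.
// Counts each kind of line ending in a string.
function countLineEndings(text) {
  const counts = { '\r\n': 0, '\n': 0, '\r': 0 };
  // "\r\n" is listed first so a CRLF pair is counted once, not as "\r" plus "\n".
  for (const match of text.match(/\r\n|\n|\r/g) || []) {
    counts[match] += 1;
  }
  return counts;
}

// True when more than one kind of line ending occurs in the text.
function hasMixedLineEndings(text) {
  const counts = countLineEndings(text);
  return Object.values(counts).filter((n) => n > 0).length > 1;
}
```

A caller could use this to decide whether to normalize the input or to report a clearer error than the parser would.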
Hey, I just hit this same situation with Google Sheets. The header row of my document contains multiline cells. Multiline cells appear to use Linux newlines ( So the first newline csv-parse sees is a Linux newline and it assumes that for the rest of the file. I was actually getting This seems like a Google Drive bug and I will follow up with that. But in the meantime this library should be able to handle this more easily. Perhaps knowing that \r shouldn't appear in data, or can exist at ends of rows. Here is a minimal repro:
|
I don't see your problem.
|
You specified an explicit row delimiter. The super cryptic error message In my case I accept dynamic CSV URLs so I can't easily assume all files If the auto-row-delimiter logic actually looked for the first record-terminator instead of just the first newline, it would work perfectly with this file.
|
I'm also having this same problem with my CSV file input - I'm really surprised the newline parameter can't accept an argument for "\r\n". |
The same example with automatic row delimiter detection works as well:
parse '"qwer\nasdf",dfh\r\n1,2\r\n4,6\r\n7,3\r\n1,"8,120"\r\n3,2', (err, data) ->
  console.log err, data
Could you provide me with an example as simple as the one above that reproduces the error? @danopia, what do you mean by "first record terminator" in "If the auto-row-delimiter logic actually looked for the first record-terminator instead of just the first newline, it would work perfectly with this file."? |
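One possible reading of "first record terminator" is the first line ending that appears outside a quoted field. The sketch below illustrates that reading only; `detectRecordDelimiter` is a hypothetical name, it is not how csv-parse is known to detect delimiters, and it assumes `"` is the quote character with `""` as the escape.

```javascript
// Hypothetical helper, not part of csv-parse.
// Returns the first line ending found outside double quotes, or null if none.
function detectRecordDelimiter(text) {
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === '"') {
      // An escaped quote ("") toggles twice, leaving the state unchanged.
      inQuotes = !inQuotes;
    } else if (!inQuotes) {
      if (ch === '\r') return text[i + 1] === '\n' ? '\r\n' : '\r';
      if (ch === '\n') return '\n';
    }
  }
  return null;
}
```

With the sample string from the comment above, the `\n` inside the quoted first cell is skipped and the function returns `'\r\n'`.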
Stumbled here after getting I'm converting a Google Sheets doc to CSV (I'm guessing same as @danopia?), and this is what they spit out:
For a simple doc that has 2 columns & 2 rows: Apparently, it thinks there's a 3rd column that's empty, but more importantly the |
What do you mean by "effed"? Is there an additional linefeed after the
parse 'address,name,"\n"\r\nsf,test,', rowDelimiter: '\r\n', (err, data) ->
  console.log err, data |
@ivanakimov Looking at your sample, I discovered a bug. I'm not completely sure it is the problem we're experiencing in this issue. Please test the latest release and re-open an issue if this isn't the case. |
@wdavidw whatever you did, that fixed it for me in The reason I didn't want to experiment with |
I've tried using version 1.1.1, AND 1.1.11, but the problem still persists for me.
This script results in the same error:
|
Came up with a workaround that strips out carriage returns. Works well in my use-case.
|
@jonahbron, an error is expected in your case; having different row delimiters can't be supported. |
The stream transformer by @jonahbron is great. Minor note from applications I find that
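Update: The Node stream may break the CSV string between a
UniformLineEndings.prototype._transform = function (chunk, enc, cb) {
  var str = chunk.toString();
  // Drop a leading "\n" when the previous chunk ended with "\r" (a split "\r\n" pair).
  if (this.__prior_ending_char == '\r' && str[0] == '\n') str = str.substring(1);
  // Remember the last raw character before it is rewritten below.
  this.__prior_ending_char = str[str.length - 1];
  // Convert every "\r\n", bare "\n", and bare "\r" to "\n".
  str = str.replace(/\r\n|\n|\r/g, '\n');
  this.push(str);
  cb();
}; |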
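For anyone who wants to try the `_transform` snippet above, here is a self-contained sketch of the same idea as a complete Node `Transform`. It is a rework under stated assumptions, not the original author's full code: the `createNormalizer` helper is hypothetical, the input is assumed to be UTF-8 (a `StringDecoder` guards against multi-byte characters split across chunks), and newlines inside quoted cells are rewritten too.

```javascript
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: returns a stateful function that rewrites "\r\n" and
// bare "\r" to "\n", remembering whether the previous chunk ended in "\r".
function createNormalizer() {
  let endedWithCR = false;
  return function normalize(str) {
    if (str.length === 0) return str; // empty chunk: keep the current state
    // Drop the "\n" that completes a "\r\n" pair split across two chunks.
    if (endedWithCR && str[0] === '\n') str = str.slice(1);
    endedWithCR = str.endsWith('\r');
    return str.replace(/\r\n|\r/g, '\n');
  };
}

class UniformLineEndings extends Transform {
  constructor(options) {
    super(options);
    this._decoder = new StringDecoder('utf8'); // assumption: UTF-8 input
    this._normalize = createNormalizer();
  }

  _transform(chunk, encoding, callback) {
    // StringDecoder holds back incomplete multi-byte characters until the next chunk.
    callback(null, this._normalize(this._decoder.write(chunk)));
  }

  _flush(callback) {
    // Emit whatever the decoder was still holding when the input ended.
    callback(null, this._normalize(this._decoder.end()));
  }
}
```

The stream would sit between the source and the parser, for example `source.pipe(new UniformLineEndings()).pipe(parser)`, where `source` and `parser` stand for whatever readable stream and CSV parser stream are in use.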
I am parsing a simple multiline CSV file that I exported from Google Spreadsheet, and I get the following error:
Does anyone have an idea how to fix it?