
More fix cursor for duplicate header #1

Merged (3 commits) on Aug 22, 2023

Conversation

stuart-marshall
Owner

Papaparse's parse method modifies the input string when headers are modified and/or duplicated, which changes the overall length of the input.
This change in length needs to be accounted for when adjusting lastCursor in pushRow, so that the outer parseChunk function can correctly work out how much of the chunk is left over and must be kept for prefixing onto the next chunk.
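
To illustrate the idea, here is a minimal, self-contained sketch. It is not the code in papaparse.js: `renameDuplicateHeaders` and `toOriginalCursor` are hypothetical helpers, and the `_1` suffix scheme is an assumption made for the example. It only shows why a cursor measured in the rewritten input has to be shifted by the length difference before it is used to slice the original chunk.

```javascript
// Hypothetical helper (not part of Papaparse): rename duplicate headers in the
// first line and report how much longer the rewritten text is than the input.
function renameDuplicateHeaders(input, delimiter = ",", newline = "\n") {
  const end = input.indexOf(newline);
  const headerLine = end === -1 ? input : input.slice(0, end);
  const rest = end === -1 ? "" : input.slice(end);
  const seen = new Map();
  const renamed = headerLine.split(delimiter).map((name) => {
    const count = seen.get(name) || 0;
    seen.set(name, count + 1);
    // Assumed naming scheme for the sketch: second "a" becomes "a_1".
    return count === 0 ? name : `${name}_${count}`;
  });
  const text = renamed.join(delimiter) + rest;
  return { text, lengthDelta: text.length - input.length };
}

// Hypothetical helper: map a cursor in the rewritten text back to the original
// chunk. Only valid for cursors that lie past the rewritten header line.
function toOriginalCursor(cursorInRewritten, lengthDelta) {
  return cursorInRewritten - lengthDelta;
}

const chunk = "a,a,b\n1,2,3\n4,5"; // the last row is incomplete
const { text, lengthDelta } = renameDuplicateHeaders(chunk);

// The parser works on the rewritten text, so its cursor (the end of the last
// complete row) is an offset into that text, not into the original chunk.
const cursorInRewritten = text.lastIndexOf("\n") + 1;

// Slicing the original chunk with the unadjusted cursor drops characters.
const wrongLeftover = chunk.slice(cursorInRewritten);
// Adjusting by the length delta keeps the whole partial row for the next chunk.
const leftover = chunk.slice(toOriginalCursor(cursorInRewritten, lengthDelta));
```

In this example the rewritten header is two characters longer than the original, so the unadjusted cursor overshoots by two and the leftover loses the start of the partial row.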

There was a previous attempt to fix this issue, but it only worked for the fast parse path. This fix works for both the fast and non-fast parse paths.

The added test correctly fails without the fix.

I've also included a bug fix for inputLen, which is needed to correctly handle a file that ends with a quote and has modified or duplicate headers. I'll add a test to prove this works too.

@gfoltz

gfoltz commented Aug 21, 2023

Also, why not put this branch/PR on the papaparse repo itself? Is this just for early feedback?

@stuart-marshall
Owner Author

Yes, I wanted internal discussion before submitting to the main repo.

@stuart-marshall stuart-marshall merged commit b24698c into master Aug 22, 2023