Preserve newline behaviour #59

Closed
robations opened this issue Aug 26, 2015 · 4 comments

@robations

For me, the preserve newline behaviour isn't quite working as I expected (tested with the docx extractor).

I have text like this in a docx file:

2 downlighters; door to hall.

Hall
Double glazed window to front;

With preserveLineBreaks I get this output:

2 downlighters; door to hall. Hall
Double glazed window to front;

After logging some intermediate output to the console, I can see the newlines are there as expected after extraction, but then they get stripped out.

Taking a look at how preserveLineBreaks is implemented, I see it's a big, hairy regex, so I'm not sure what it is doing at first glance. From my naive point of view it would be nicer to get the raw text output; if I need to filter further, I can make up my own mind about how. Alternatively, if there were a 'clean' function as a configuration option, I could use it to override the default behaviour.
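
To make the request concrete, here is a minimal sketch of the kind of 'clean' step described above: it collapses runs of spaces and tabs but leaves line breaks (including blank lines) alone. The function name `collapseKeepingNewlines` is hypothetical, and this is not the library's actual regex or a real configuration option, just an illustration of the expected behaviour on the sample text.

```javascript
// Hypothetical helper for illustration only; NOT the library's implementation.
function collapseKeepingNewlines(text) {
  return text
    .replace(/\r\n?/g, "\n")     // normalise CRLF / CR line endings to LF
    .replace(/[^\S\n]+/g, " ")   // collapse runs of whitespace other than newlines
    .replace(/ *\n */g, "\n")    // drop spaces hugging a line break
    .replace(/\n{3,}/g, "\n\n")  // allow at most one blank line in a row
    .trim();                     // strip leading/trailing whitespace
}

// Sample modelled on the docx text from this issue (extra spaces added).
const raw =
  "2 downlighters;  door to hall.\n\nHall\nDouble glazed window to front;\n";

const cleaned = collapseKeepingNewlines(raw);
console.log(JSON.stringify(cleaned));
```

With a step like this, "Hall" stays on its own line instead of being joined to the preceding sentence, which is the behaviour expected from preserveLineBreaks.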

@robations
Author

I looked at the pull requests and I think this (open) pull request might be addressing the same issue:

#58

@dbashford
Owner

I'll get that merged when I get a second. Hopefully tomorrow.

@dbashford
Owner

It's merged and published as 1.1.1.

@robations
Author

(☞゚ヮ゚)☞
