Allow setting raw input delimiter #965

Open
phiresky opened this issue Sep 28, 2015 · 9 comments

Comments

@phiresky

As far as I can tell, this is not currently possible?

Main use case:
find | jq -R works

but because filenames can contain newlines, that is not safe. I'd like to use find -print0 instead, but jq does not allow setting \0 as the input delimiter (or setting the delimiter at all).

It can be circumvented with

find -print0|jq --slurp --raw-input 'split("\u0000")[]'

but that disables streaming of the input, because --slurp reads everything before the filter runs.

Usage in other programs (for \0):

  • xargs -0 or xargs --null
  • sed -z or sed --null-data
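
To make the request concrete, here is a minimal Python sketch of the behaviour being asked for: read NUL-terminated records incrementally and emit each one as a JSON string, without slurping the whole input. It is not jq's implementation; the function names iter_nul_records and to_json_lines, the chunk size, and the choice to replace invalid UTF-8 are all assumptions made for illustration.

```python
import io
import json
from typing import BinaryIO, Iterator


def iter_nul_records(stream: BinaryIO, chunk_size: int = 65536) -> Iterator[bytes]:
    """Yield NUL-terminated records from a binary stream without slurping it."""
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last NUL is complete; keep the tail for later.
        *complete, pending = pending.split(b"\0")
        yield from complete
    if pending:
        # A final record without a trailing NUL is still emitted.
        yield pending


def to_json_lines(stream: BinaryIO) -> Iterator[str]:
    """Encode each raw record as one JSON string per output line."""
    for record in iter_nul_records(stream):
        # Assumption: invalid UTF-8 is replaced rather than rejected.
        yield json.dumps(record.decode("utf-8", errors="replace"))


# Simulated `find -print0` output, including a file name that contains a newline.
sample = io.BytesIO(b"./a.txt\0./odd\nname.txt\0")
lines = list(to_json_lines(sample))
print("\n".join(lines))
```

Each record is available as soon as its terminating NUL has been read, which is the property the --slurp workaround gives up.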
@nicowilliams
Contributor

I agree.

@wtlangford
Contributor

I have a concern. If we enable setting the delimiter, do we automatically convert newline characters to \n in that mode? Otherwise we end up with invalid JSON strings. I feel this may be an example of an input that should be processed with sed or something before feeding into jq.

@phiresky
Author

Yes, probably, like slurp

I feel this may be an example of an input that should be processed with sed or something before feeding into jq.

But how would that work? Without stopping streaming?

@nicowilliams
Contributor

@wtlangford Strings can contain newlines. Newlines in strings have to be escaped in encoded JSON texts, but here we're not dealing with JSON texts, as the input is raw, and the output of the "parser" is a jv string to feed to the jq VM.
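
The distinction between a string value and its encoded JSON text can be illustrated with a short Python sketch. This uses Python's json module as a stand-in and says nothing about jq's internals; the variable names are illustrative only.

```python
import json

# One raw input record; the newline is simply part of the string value.
raw_record = "first line\nsecond line"

# Escaping only happens when the value is serialized as JSON text.
encoded = json.dumps(raw_record)
print(encoded)

# Decoding the JSON text gives back the original value, newline included.
decoded = json.loads(encoded)
```

The encoded text contains the two characters backslash and n rather than a literal line feed, so a raw record with embedded newlines never has to produce invalid JSON.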

@wtlangford
Contributor

Fair enough. I'm convinced.

@nkgm

nkgm commented Oct 29, 2018

Having the same problem processing zsh history files, which use newlines between records, but may contain escaped newlines within records. I got sed to insert NULs to disambiguate records and then I bumped into this issue. Eventually had to do this backwards, getting sed to replace escaped newlines with NULs and keep newline as record separator in order to keep jq happy. The workaround was easy enough, but it would be really nice if jq would support NUL delimiter as per @phiresky's original comment.
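
The "backwards" workaround described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the commenter's actual sed script: it treats a backslash immediately followed by a line feed as an escaped newline, and the function name split_history is hypothetical. The last step stands in for whatever the downstream filter does with the placeholder.

```python
from typing import List


def split_history(text: str) -> List[str]:
    """Split newline-separated records that may contain escaped newlines."""
    # Step 1: hide escaped newlines (backslash + LF, an assumption) behind NUL.
    protected = text.replace("\\\n", "\0")
    # Step 2: newline is now an unambiguous record separator.
    records = [r for r in protected.split("\n") if r]
    # Step 3: restore the embedded newlines inside each record.
    return [r.replace("\0", "\n") for r in records]


sample = "echo one\necho two \\\ncontinued\n"
records = split_history(sample)
print(records)
```

With a NUL input delimiter the roles could stay the natural way round: NUL between records, and newlines left untouched inside them.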

@nicowilliams
Contributor

We should add a -0 at least, and preferably also a -F CHAR or some appropriately-named long option.

@pabs3
Contributor

pabs3 commented Sep 21, 2021

The -0 option got added already. Personally I think that is enough and this issue can be closed now.

BTW, as pointed out in #1271, JSON strings can contain both LF ("\n") and NUL ("\u0000"), so -0 is not sufficient to prevent recipients from getting the wrong number of result strings (the same is true of -r, of course).
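
A small Python sketch of that ambiguity, assuming NUL-delimited output simply writes each raw string followed by a NUL byte (a model for illustration, not a description of jq's code):

```python
import json

# Two result strings; the first one legitimately contains a NUL character.
results = [json.loads(r'"a\u0000b"'), "c"]

# Model of NUL-delimited raw output: each string followed by one NUL.
stream = "".join(s + "\0" for s in results)

# The recipient splits on NUL and drops the empty piece after the last one.
received = stream.split("\0")[:-1]

print(len(results), len(received))
```

The sender produced two strings but the recipient sees three, which is the same failure mode that -r has with embedded newlines.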

@Freed-Wu

Comes from wader/fq#1019

I would also like jq to be usable as an alternative to perl/sed/awk. fq has imported --raw-output0 to set the output separator. A -0 or --raw-input0 option for input would be a good addition.


6 participants