-
Notifications
You must be signed in to change notification settings - Fork 228
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #284 from wader/tofromencodings
interp: Add to/from<encoding> for some common serialzations, encoding…
- Loading branch information
Showing
49 changed files
with
3,340 additions
and
337 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -424,6 +424,251 @@ you currently have to do `fq -d raw 'mp3({force: true})' file`. | |
- `paste` read string from stdin until ^D. Useful for pasting text. | ||
- Ex: `paste | frompem | asn1_ber | repl` read from stdin then decode and start a new sub-REPL with result. | ||
|
||
### Encodings, serializations and hashes | ||
|
||
In an addition to binary formats fq also support reading to and from encodings and serialization formats. | ||
|
||
At the moment fq does not have any dedicated argument for serialization formats but raw string input `-R` slurp `-s` and raw string output `-r` can make things easier. The combination `-Rs` will read all inputs into one string (same as jq). | ||
|
||
Note that `from*` functions output jq values and `to*` takes jq values as input so in some cases not all information will properly preserved, for example for XML element and attribute order might change and text and comment nodes might move and will be merged. | ||
|
||
Some example usages: | ||
|
||
```sh | ||
# read yml and do some query | ||
$ fq -Rs 'fromyaml | ...' file.yml | ||
|
||
# convert YAML to JSON | ||
# note -r for raw string output, without a JSON string with JSON would outputted | ||
$ fq -Rsr 'fromyaml | tojson({indent:2})' file.yml | ||
|
||
# add token to URL | ||
$ echo -n "https://host.org" | fq -Rsr 'fromurl | .user.username="token" | tourl' | ||
https://[email protected] | ||
|
||
$ curl -s https://www.discogs.com/ | fq -Rsr 'fromhtml | .. | select(."-id" == "hot-releases")? | .div[].a."-aria-label"' | ||
Arcade Fire - We | ||
Bell Biv Devoe - Poison | ||
Jazz Sabbath - Vol. 2 | ||
Jonathan Richman And The Modern Lovers* - Modern Lovers 88 | ||
Kylie* - Infinite Disco | ||
Vince Guaraldi Trio - Baseball Theme | ||
|
||
# shows how serialization functions can be used on any string, how to transform values and output som other format | ||
# read decode zip file and start an interactive REPL | ||
$ fq -i . <(curl -sL https://github.com/stefangabos/world_countries/archive/master.zip) | ||
# select from interesting xml file | ||
zip> .local_files[] | select(.file_name == "world_countries-master/data/countries/en/world.xml").uncompressed | repl | ||
# convert xml into jq values | ||
> .local_files[95].uncompressed string> fromxml | repl | ||
# sort countries by and select the first one | ||
>> object> .countries.country | sort_by(."-name") | first | | ||
# see what current input is | ||
>>> object> . | ||
{ | ||
"-alpha2": "af", | ||
"-alpha3": "afg", | ||
"-id": "4", | ||
"-name": "Afghanistan" | ||
} | ||
# remove "-" prefix from keys and convert to YAML and print it | ||
>>> object> with_entries(.key |= .[1:]) | toyaml | print | ||
alpha2: af | ||
alpha3: afg | ||
id: "4" | ||
name: Afghanistan | ||
# exit all REPLs back to shell | ||
>>> object> ^D | ||
>> object> ^D | ||
> .local_files[95].uncompressed string> ^D | ||
zip> ^D | ||
``` | ||
|
||
- `fromxml`/`fromxml($opts)` Parse XML into jq values.<br> | ||
`{seq: true}` preserve element ordering if more than one sibling.<br> | ||
`{array: true}` use nested arrays to represent elements.<br> | ||
- `fromhtml`/`fromhtml($opts)` Parse HTML into jq values.<br> | ||
Same as `fromxml` but less strict and follows html5 parsing rules. Will always have a `html` root with `head` and `body` elements.<br> | ||
`{array: true}` use nested arrays to represent elements.<br> | ||
`{seq: true}` preserve element ordering if more than one sibling.<br> | ||
- `toxml`/`toxml($opts})` Serialize jq value into XML.<br> | ||
`{indent: number}` indent child elements.<br> | ||
Assumes object representation if input is an object, and nested arrays if input is an array.<br> | ||
Will automatically add a root `doc` element if jq value has more then one root element.<br> | ||
If a `#seq` is found on at least one element all siblings will be sort by sequence number. Attributes are always sorted.<br> | ||
|
||
XML elements can be represented as jq values in two ways, as objects (inspired by [mxj](https://github.com/clbanning/mxj) and [xml.com's Converting Between XML and JSON | ||
](https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html)) or nested arrays. Both representations are lossy and might lose ordering of elements, text nodes and comments. In object representation `fromxml`, `fromhtml` and `toxml` support `{seq:true}` option to parse/serialize `{"#seq"=<number>}` attributes to preserve element sibling ordering. | ||
|
||
The object version is denser and convenient to query, the nested arrays version is probably easier to use when generating XML. | ||
|
||
Let's assume `$xml` is this XML document as a string: | ||
```xml | ||
<doc> | ||
<child attr="1"></child> | ||
<child attr="2">text</child> | ||
<other>text</other> | ||
</doc> | ||
``` | ||
|
||
With object representation an element is represented as: | ||
- Attributes as dash prefixed `-<key>` keys. | ||
- Text nodes as `#text`. | ||
- Comment nodes as `#comment` keys. | ||
- For explicit sibling ordering `#seq` keys with a number, can be negative, assumed zero if missing. | ||
- Child element with only text as `<name>` key with text as value. | ||
- Child element with more than just text as `<name>` key with value an object. | ||
- Multiple child element sibling with same name as `name` key with value as array with strings and objects. | ||
```jq | ||
> $xml | fromxml | ||
{ | ||
"doc": { | ||
"child": [ | ||
{ | ||
"-attr": "1" | ||
}, | ||
{ | ||
"#text": "text", | ||
"-attr": "2" | ||
} | ||
], | ||
"other": "text" | ||
} | ||
} | ||
``` | ||
|
||
With nested array representation, an array with these values `["<name>", {attributes...}, [children...]]` | ||
- Index zero is element name. | ||
- Optional first object attributes (including `#text` and `#comment` keys). | ||
- Optional first array are child elements. | ||
# | ||
```jq | ||
> $xml | fromxml({array: true}) | ||
[ | ||
"doc", | ||
[ | ||
[ | ||
"child", | ||
{ | ||
"attr": "1" | ||
} | ||
], | ||
[ | ||
"child", | ||
{ | ||
"#text": "text", | ||
"attr": "2" | ||
} | ||
], | ||
[ | ||
"other", | ||
{ | ||
"#text": "text" | ||
} | ||
] | ||
] | ||
] | ||
``` | ||
Parse and include `#seq` attributes if needed: | ||
```jq | ||
> $xml | fromxml({seq:true}) | ||
{ | ||
"doc": { | ||
"child": [ | ||
{ | ||
"#seq": 0, | ||
"-attr": "1" | ||
}, | ||
{ | ||
"#seq": 1, | ||
"#text": "text", | ||
"-attr": "2" | ||
} | ||
], | ||
"other": { | ||
"#seq": 2, | ||
"#text": "text" | ||
} | ||
} | ||
} | ||
```` | ||
Select values in `<doc>`, remove `<child>`, add a `<new>` element, serialize to xml with 2 space indent and print the string | ||
```jq | ||
> $xml | fromxml.doc | del(.child) | .new = "abc" | {root: .} | toxml({indent: 2}) | println | ||
<root> | ||
<new>abc</new> | ||
<other>text</other> | ||
</root> | ||
``` | ||
- `fromjson` Parse JSON into jq values. | ||
- `tojson`/`tojson($opt)` Serialize jq value into JSON.<br> | ||
`{indent: number}` indent array/object values.<br> | ||
- `fromjq` Parse jq-flavoured JSON into jq values. | ||
- `tojq`/`tojq($opt)` Serialize jq value into jq-flavoured JSON<br> | ||
`{indent: number}` indent array/object values.<br> | ||
jq-flavoured JSON has optional key quotes, `#` comments and can have trailing comma in arrays. | ||
- `fromyaml` Parse YAML into jq values. | ||
- `toyaml` Serialize jq value into YAML. | ||
- `fromtoml` Parse TOML into jq values. | ||
- `totoml` Serialize jq value into TOML. | ||
- `fromcsv`/`fromcvs($opts)` Parse CSV into jq values.<br> | ||
`{comma: string}` field separator, default ",".<br> | ||
`{comment: string}` comment line character, default "#".<br> | ||
- `tocsv`/`tocsv($opts)` Serialize jq value into CSV.<br> | ||
`{comma: string}` field separator, default ",".<br> | ||
- `fromxmlentities` Decode XML entities. | ||
- `toxmlentities` Encode XML entities. | ||
- `fromurlpath` Decode URL path component. | ||
- `tourlpath` Encode URL path component. | ||
- `fromurlencode` Decode URL query encoding. | ||
- `tourlencode` Encode URL to query encoding. | ||
- `fromurlquery` Decode URL query into object. For duplicates keys value will be an array. | ||
- `tourlquery` Encode objet into query string. | ||
- `fromurl` Decode URL into object. | ||
```jq | ||
> "schema://user:pass@host/path?key=value#fragement" | fromurl | ||
{ | ||
"fragment": "fragement", | ||
"host": "host", | ||
"path": "/path", | ||
"query": { | ||
"key": "value" | ||
}, | ||
"rawquery": "key=value", | ||
"scheme": "schema", | ||
"user": { | ||
"password": "pass", | ||
"username": "user" | ||
} | ||
} | ||
``` | ||
- `tourl` Encode object into URL string. | ||
- `fromhex` Decode hexstring to binary. | ||
- `tohex` Encode binay into hexstring. | ||
- `frombase64`/`frombase64($opts)` Decode base64 encodings into binary.<br> | ||
`{encoding:string}` encoding variant: `std` (default), `url`, `rawstd` or `rawurl` | ||
- `tobase64`/`tobase64($opts)` Encode binary into base64 encodings.<br> | ||
`{encoding:string}` encoding variant: `std` (default), `url`, `rawstd` or `rawurl` | ||
- `tomd4` Hash binary using md4. | ||
- `tomd5` Hash binary using md5. | ||
- `tosha1` Hash binary using sha1. | ||
- `tosha256` Hash binary using sha256. | ||
- `tosha512` Hash binary using sha512. | ||
- `tosha3_224` Hash binary using sha3 224. | ||
- `tosha3_256` Hash binary using sha3 256. | ||
- `tosha3_384` Hash binary using sha3 384. | ||
- `tosha3_512` Hash binary using sha3 512. | ||
- `toiso8859_1` Decode binary as ISO8859-1 into string. | ||
- `fromiso8859_1` Encode string as ISO8859-1 into binary. | ||
- `toutf8` Encode string as UTF8 into binary. | ||
- `fromutf8` Decode binary as UTF8 into string. | ||
- `toutf16` Encode string as UTF16 into binary. | ||
- `fromutf16` Decode binary as UTF16 into string. | ||
- `toutf16le` Encode string as UTF16 little-endian into binary. | ||
- `fromutf16le` Decode binary as UTF16 little-endian into string. | ||
- `toutf16be` Encode string as UTF16 big-endian into binary. | ||
- `fromutf16be` Decode binary as UTF16 big-endian into string. | ||
|
||
## Color and unicode output | ||
|
||
fq by default tries to use colors if possible, this can be disabled with `-M`. You can also | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.