yq write strips completely blank lines from the output #515

Open
scanfield opened this issue Aug 13, 2020 · 33 comments

@scanfield

Is your feature request related to a problem? Please describe.

foo:
  bar: 1

  baz: 2

when run through yq w - foo.baz 3

produces

foo:
  bar: 1
  baz: 3

Describe the solution you'd like
Keep my extra blank line (it's better for readability / produces less of a diff)
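
A quick way to confirm the behaviour described above is to compare blank-line counts before and after an edit. This is only a sketch: `count_blank_lines` and `blank_lines_lost` are hypothetical helper names, not part of yq.

```shell
# Hypothetical helpers (not part of yq).

# Print the number of blank (empty or whitespace-only) lines in a file.
count_blank_lines() {
  awk '/^[[:space:]]*$/ { n++ } END { print n + 0 }' "$1"
}

# Print how many blank lines were lost between an original file ($1)
# and a rewritten copy ($2). Runs in a subshell so variables stay local.
blank_lines_lost() (
  before=$(count_blank_lines "$1") || exit 1
  after=$(count_blank_lines "$2") || exit 1
  echo $((before - after))
)

# Example usage (assumes the v3-style write command from the report):
#   yq w original.yaml foo.baz 3 > rewritten.yaml
#   blank_lines_lost original.yaml rewritten.yaml
```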

@warder

warder commented Sep 24, 2020

Same story. Sorry, but this issue looks more like a bug than an enhancement.
When you process a YAML file with yq, it corrupts the whole file.

@AceHack

AceHack commented Oct 10, 2020

Any update on this? It would be a really nice feature.

@sathiyams

Any update on this? It's getting difficult when it comes to readability.

@mikefarah
Owner

This is a side effect of the underlying YAML parser. An issue was raised there (go-yaml/yaml#627), and the owner said:

..the content when re-encoded will not have its original textual representation preserved. An effort is made to render the data pleasantly, and to preserve comments near the data they describe, though.

@arcesino

I've been dealing with this issue for a couple of days while updating very large YAML files, and found a workaround using the diff & patch commands that restores the stripped blank lines in most cases. Suppose you have the following YAML file:

doc:
  version: 1.0.0
  name: numbers & letters

numbers:
  - 1

letters:
  - a

We call this file a.yaml. Now let's update the version using yq and store the result in a new file, a-updated.yaml:

yq e '.doc.version = "1.0.1"' a.yaml > a-updated.yaml

As expected, the command above stripped all blank lines, so a-updated.yaml looks like:

doc:
  version: 1.0.1
  name: numbers & letters
numbers:
  - 1
letters:
  - a

At this point, the first step to get the blank lines back is to create a diff file that ignores blank-line changes:

diff -U0 -w -b --ignore-blank-lines a.yaml a-updated.yaml > a.diff

a.diff looks like this:

--- a.yaml	2021-04-30 15:28:38.000000000 -0500
+++ a-updated.yaml	2021-04-30 15:18:53.000000000 -0500
@@ -2 +2 @@
-  version: 1.0.0
+  version: 1.0.1

Then the final step is to patch the original file with the diff:

patch a.yaml < a.diff

after that, the original file looks like:

doc:
  version: 1.0.1
  name: numbers & letters

numbers:
  - 1

letters:
  - a

The problem arises when the updated line is right before a blank line. For example, let's add an element to one of the arrays:

yq e '.numbers += 2' a.yaml > a-updated.yaml

the updated file is now:

doc:
  version: 1.0.1
  name: numbers & letters
numbers:
  - 1
  - 2
letters:
  - a

if we generate the diff file as before we'll get the following:

--- a.yaml	2021-04-30 15:30:22.000000000 -0500
+++ a-updated.yaml	2021-04-30 15:35:26.000000000 -0500
@@ -7 +6 @@
-
+  - 2

and patching the original file with the diff above results in:

doc:
  version: 1.0.1
  name: numbers & letters

numbers:
  - 1
  - 2
letters:
  - a

Notice how the blank line after the new element in the numbers array remains stripped, while the others are back. This is because diff treats the blank-line deletion and the addition of the new array element as part of the same hunk, so --ignore-blank-lines does not skip it.

This is not ideal by any means, but in my case it has helped a lot, since my files are big and have lots of blank lines. I'm sharing this in case someone else finds it useful too.
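
The diff-and-patch steps in this comment can be wrapped in a single function. The following is a minimal sketch, not a definitive implementation: the name `yq_keep_blanks` is made up for illustration, and it assumes yq v4, GNU diff, and patch are on the PATH. It has the same limitation as the manual steps: a blank line right next to a changed line can still be lost.

```shell
# Hypothetical wrapper around the diff & patch workaround.
# Usage: yq_keep_blanks EXPRESSION FILE   (edits FILE in place)
yq_keep_blanks() (
  expr=$1
  file=$2
  edited=$(mktemp) || exit 1
  patchfile=$(mktemp) || exit 1
  trap 'rm -f "$edited" "$patchfile"' EXIT

  # 1. Let yq apply the change to a temporary copy (blank lines are stripped here).
  yq e "$expr" "$file" > "$edited" || exit 1

  # 2. Build a zero-context diff that ignores whitespace and blank-line changes.
  #    -B is the short form of --ignore-blank-lines in GNU diff.
  #    diff exits with 1 when the files differ, so only a status above 1 is an error.
  diff -U0 -w -b -B "$file" "$edited" > "$patchfile"
  [ $? -le 1 ] || exit 1

  # 3. Nothing left to apply if every difference was a blank line.
  [ -s "$patchfile" ] || exit 0

  # 4. Apply the remaining hunks to the original file.
  patch "$file" < "$patchfile" > /dev/null
)

# Example usage:
#   yq_keep_blanks '.doc.version = "1.0.1"' a.yaml
```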

@lirlia

lirlia commented Jan 25, 2022

Thanks! I turned @arcesino's approach into this one-liner.

filename=xxx
version=xxx

patch "$filename" <<< "$(diff -U0 -w -b --ignore-blank-lines "$filename" <(yq eval ".my.version = \"$version\"" "$filename"))"

@vladimir259

vladimir259 commented Feb 25, 2022

Thanks for the idea with diff & patch @arcesino .

In my case, the removal of blank lines caused by the diff was unfortunately unacceptable, so I had to dig further.

And I found a solution.

The approach is as follows: I remove the blank lines from the original YAML and create a diff between that and my altered YAML.
The patch is then applied to the original, and no new blank-line changes are introduced.

Here is an example:

The starting point is my original YAML, where the value of the key "secrets.TEST" should be updated:

---
config:

  # mysql
  DATABASE_PROTOCOL: "mysql"
  # instance fqdn
  DATABASE_HOST: "mysql"

secrets:
  # db password
  DATABASE_PASSWORD: "password"

  # example
  TEST: "foo"

# other values
#[...]

Step 1: updating the value & creating a copy

yq '.secrets.TEST = "NewValue"' sample.yaml > sample.yaml.new

Step 2: removing blanks from the original

yq '.' sample.yaml > sample.yaml.noblanks

Step 3: creating a patch

diff -B sample.yaml.noblanks sample.yaml.new > patch.file

The patch then contains only the value diffs:

$> cat patch.file
11c11
<   TEST: "foo"
---
>   TEST: "NewValue"

Step 4: apply the patch to the original

patch sample.yaml patch.file

Utils used:

  • yq 4.20.2
  • patch 2.7.6
  • diff 3.7

OS: debian 11
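
The four steps above can be sketched as one function. This is an illustration under assumptions, not the author's script: `yq_patch_value` is a hypothetical name, yq v4, diff, and patch are assumed to be installed, and temporary files replace the `.new` and `.noblanks` copies. Because the diff is computed from blank-free files, patch has to find the changed line by its content rather than by line number.

```shell
# Hypothetical wrapper for the "diff two stripped copies" workaround.
# Usage: yq_patch_value EXPRESSION FILE   (edits FILE in place)
yq_patch_value() (
  expr=$1
  file=$2
  noblanks=$(mktemp) || exit 1
  edited=$(mktemp) || exit 1
  patchfile=$(mktemp) || exit 1
  trap 'rm -f "$noblanks" "$edited" "$patchfile"' EXIT

  # Step 1: apply the change to a copy.
  yq "$expr" "$file" > "$edited" || exit 1

  # Step 2: round-trip the original unchanged, which strips its blank lines.
  yq '.' "$file" > "$noblanks" || exit 1

  # Step 3: diff the two stripped copies; exit status 1 just means "differences found".
  diff -B "$noblanks" "$edited" > "$patchfile"
  [ $? -le 1 ] || exit 1
  [ -s "$patchfile" ] || exit 0

  # Step 4: apply the value-only patch to the untouched original.
  patch "$file" < "$patchfile" > /dev/null
)

# Example usage:
#   yq_patch_value '.secrets.TEST = "NewValue"' sample.yaml
```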

@clementnuss

Good idea! I turned that into fish and bash functions in this Gist:

#fish
function yqblank;
  yq eval "$argv[1]" "$argv[2]" | diff -B "$argv[2]" - | patch "$argv[2]" -o -
end

#bash
yqblank() {
  yq eval $1 $2 | diff -B $2 - | patch $2 -o -
}

This makes it possible to use yq without changing (most of) the blank lines. Usage is as follows:

yqblank '.' file_name.yml

@raQai

raQai commented Apr 29, 2022

@clementnuss I think patch $2 -o - does not work and -o should be removed there.

#bash
yqblank() {
  yq eval $1 $2 | diff -B $2 - | patch $2 -
}

@ryenus
Contributor

ryenus commented Apr 30, 2022

@clementnuss I think patch $2 -o - does not work and -o should be removed there.

@raQai, thank you! Just that the arguments have to be quoted properly, also eval/e can be omitted since yq 4.18.1:

#bash
yqblank() {
  yq "$1" "$2" | diff -B "$2" - | patch "$2" -
}

@raQai

raQai commented Apr 30, 2022

Oh yeah, I forgot about the quote part 😅
I was in a hurry, so thanks for adding this 👍

edit:
I would also like to add that this still sometimes merges multi-line descriptions and arrays into one line, and it is not able to handle comments properly.

source:
  fruits: [
    Apple,
    Banana,
    Calamansi,
  ]
becomes:
  fruits: [Apple, Banana, Calamansi,]
source:
  fruits: [
    Apple,     # comment 1
    Banana,    # comment 2
    Calamansi, # comment 3
  ]
becomes:
  fruits: [
    Apple, # comment 1
    Banana, # comment 2
    Calamansi, # comment 3
  ]

(I did not verify this on my current machine but that was roughly the result)

edit2:
@arcesino we also ran into the same thing you did with the .info.version update.

Long story short: We still use yq but only to get the line of the .info.version using the line operator and update it using sed.

Something along those lines should work:

$ sed -i "$(yq '.info.version | line' "$file")s/$old_val/$new_val/" "$file"

This also returns the correct line if the value of .info.version is wrapped onto the next line:

info:
  version: 1.x.x # line 2
info:
  version:
    1.x.x # line 3
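
The same idea can be written so that the old value is matched literally instead of as a sed regular expression, which matters for values such as version numbers that contain dots. This is a sketch only: `replace_on_node_line` is a hypothetical helper, and it assumes yq v4 with the `line` operator. Note that awk interprets backslash escapes in values passed with `-v`, so values containing backslashes would need extra care.

```shell
# Hypothetical helper: replace the first literal occurrence of OLD with NEW,
# but only on the line that yq reports for the node at EXPR.
# Usage: replace_on_node_line FILE EXPR OLD NEW
replace_on_node_line() (
  file=$1 expr=$2 old=$3 new=$4

  # Ask yq for the line number of the node.
  line=$(yq "$expr | line" "$file") || exit 1

  # Refuse anything that is not a positive line number.
  case $line in
    '' | 0 | *[!0-9]*) exit 1 ;;
  esac

  tmp=$(mktemp) || exit 1
  awk -v n="$line" -v old="$old" -v new="$new" '
    NR == n {
      i = index($0, old)   # literal match, no regex escaping needed
      if (i > 0) $0 = substr($0, 1, i - 1) new substr($0, i + length(old))
    }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  rc=$?
  rm -f "$tmp"
  exit "$rc"
)

# Example usage:
#   replace_on_node_line values.yaml '.info.version' "$old_val" "$new_val"
```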

@msdobrescu

I'm hit by this too. No fix, only workarounds?

@andry81

andry81 commented Jul 29, 2022

Approach is following: i remove blanks from the original yaml and create a diff between that and my altered yaml.
The patch then is applied to the original and no new spaces are introduced.

Unfortunately, this only works for changes to already existing values. If you try to add lines to the YAML, the patch ends up offset by the missing blank lines.

I've already tested that and it does not work as expected for additions:
https://github.com/andry81-devops/gh-workflow/blob/ee5d2d5b6bf59299e39baa16bb85357cf34a8561/bash/github/init-yq-workflow.sh
https://github.com/andry81-devops/gh-workflow/blob/9b9d01a9b60a65d6c3c29f5b4b200409fc6a0aed/bash/cache/accum-content.sh

Search for: yq_edit, yq_diff, yq_patch

So only diffing against the edited YAML, rather than against the unblanked YAML, looks reliable, as @arcesino showed.

@andry81

andry81 commented Jul 29, 2022

@arcesino

I've been dealing with this issue for a couple of days when updating very large YAML files and found a workaround using diff & patch commands that restores the stripped blank lines in most of the cases. Suppose you have the following YAML file:

This one has one disadvantage: it removes comments. And there is no way to retain comments completely correctly outside of the yq utility, because the comment format depends on the YAML syntax.

@andry81

andry81 commented Aug 8, 2022

I have a new implementation of the bash scripts that works better than all of the above.

Implementation: https://github.com/andry81-devops/gh-workflow/blob/master/bash/github/init-yq-workflow.sh
Example of usage: https://github.com/andry81-devops/gh-workflow/blob/master/bash/cache/accum-content.sh

# Usage example:
#
>yq_edit '<prefix-name>' 'edit' "<input-yaml>" "$TEMP_DIR/<output-yaml-edited>" \
  <list-of-yq-eval-strings> && \
  yq_diff "$TEMP_DIR/<output-yaml-edited>" "<input-yaml>" "$TEMP_DIR/<output-diff-edited>" && \
  yq_restore_edited_uniform_diff "$TEMP_DIR/<output-diff-edited>" "$TEMP_DIR/<output-diff-edited-restored>" && \
  yq_patch "$TEMP_DIR/<output-yaml-edited>" "$TEMP_DIR/<output-diff-edited-restored>" "$TEMP_DIR/<output-yaml-edited-restored>" "<output-yaml>"
#
# , where:
#
#   <input-yaml>  - input yaml file path
#   <output-yaml> - output yaml file path
#
#   <output-yaml-edited>          - output file name of edited yaml
#   <output-diff-edited>          - output file name of difference file generated from edited yaml
#   <output-diff-edited-restored> - output file name of restored difference file generated from original difference file
#   <output-yaml-edited-restored> - output file name of restored yaml file stored as intermediate temporary file

Example with test.yml:

# This file is automatically generated
#

content-index:

  timestamp: 1970-01-01T00:00:00Z

  entries:

    - dirs:

        - dir: dir-1/dir-2

          files:

            - file: file-1.dat
              md5-hash:
              timestamp: 1970-01-01T00:00:00Z

            - file: file-2.dat
              md5-hash:
              timestamp:

            - file: file-3.dat
              md5-hash:
              timestamp:

        - dir: dir-1/dir-2/dir-3

          files:

            - file: file-1.dat
              md5-hash:
              timestamp:

            - file: file-2.dat
              md5-hash:
              timestamp:

export GH_WORKFLOW_ROOT='<path-to-gh-workflow-root>' # https://github.com/andry81-devops/gh-workflow

source "$GH_WORKFLOW_ROOT/bash/github/init-yq-workflow.sh"

[[ -d "./temp" ]] || mkdir "./temp"

export TEMP_DIR="./temp"

yq_edit 'content-index' 'edit' "test.yml" "$TEMP_DIR/test-edited.yml" \
  ".\"content-index\".timestamp=\"2022-01-01T00:00:00Z\"" && \
  yq_diff "$TEMP_DIR/test-edited.yml" "test.yml" "$TEMP_DIR/test-edited.diff" && \
  yq_restore_edited_uniform_diff "$TEMP_DIR/test-edited.diff" "$TEMP_DIR/test-edited-restored.diff" && \
  yq_patch "$TEMP_DIR/test-edited.yml" "$TEMP_DIR/test-edited-restored.diff" "$TEMP_DIR/test.yml" "test-patched.yml" || exit $?

PROs:

  • Can restore blank lines together with standalone comment lines: # ...
  • Can restore end-of-line comments: key: value # ...
  • Can detect line removals, changes, and additions together.

CONs:

  • Because it relies on guessing logic, it may leave artifacts or make invalid corrections.
  • Does not restore end-of-line comments on lines where the YAML data has changed.

@alexklibisz

alexklibisz commented Feb 16, 2023

Here is another possible workaround. We basically pre-format the file once with no content changes. Then make the content change. Then compare the pre-formatted and the content-changed versions to get a patch. Then apply the patch to the original file. I've only tried it for simple cases like patching the version in a helm values file. It seems to work well, and also seems to preserve comments.

$ yq --version
yq version 4.9.8
$ # The original file
$ cat values.yaml
# The app name
name: "some-app"

image:
  # The image tag
  tag: "1.2.0"

# Some other comments...
# ...
$ # Don't change anything; just let yq do its default formatting
$ yq eval --exit-status '.' values.yaml | tee out1.yaml
# The app name
name: "some-app"
image:
  # The image tag
  tag: "1.2.0"

# Some other comments...
# ...
$ # Now make the actual change
$ yq eval --exit-status '.image.tag = "1.3.0"' values.yaml | tee out2.yaml
# The app name
name: "some-app"
image:
  # The image tag
  tag: "1.3.0"

# Some other comments...
# ...
$ # Diff the two stripped files to get a minimal diff with no special flags.
$ diff out1.yaml out2.yaml | tee out.patch
5c5
<   tag: "1.2.0"
---
>   tag: "1.3.0"
$ # Apply the patch to the original file, which was unchanged so far.
$ patch values.yaml < out.patch
patching file values.yaml
$ # Inspect the final file. 
$ # Note the version was changed and everything else remained the same.
$ cat values.yaml
# The app name
name: "some-app"

image:
  # The image tag
  tag: "1.3.0"

# Some other comments...
# ...

@andry81

andry81 commented Feb 17, 2023

Here is another possible workaround. We basically just pre-strip the newlines and then re-compute the patch by comparing two stripped versions.

It has the same issues with removed comments and blank lines.

@alexklibisz

Here is another possible workaround. We basically just pre-strip the newlines and then re-compute the patch by comparing two stripped versions.

It has the same issues with comments remove.

I think it works fine with comments. I updated my original post to include comments. LMK if you still see some issue. Maybe I'm overlooking something subtle.

@andry81

andry81 commented Feb 18, 2023

I think it works fine with comments. I updated my original post to include comments. LMK if you still see some issue. Maybe I'm overlooking something subtle.

The diff reports positions in the already edited file:

3c3 means a change on the 3rd line, when the line that actually changed is the 6th:

1: # The app name
2: name: "some-app"
3: 
4: image:
5:   # The image tag
6:   tag: "1.2.0"

It is easier to see with a unified diff:

> diff -u out1.yaml out2.yaml | tee out-uniform.patch
--- out1.yaml
+++ out2.yaml
@@ -1,3 +1,3 @@
 name: some-app
 image:
-  tag: "1.2.0"
+  tag: "1.3.0"

To reproduce the problem:

values.yaml

# The app name
name: "some-app"

image1:
  # The image1 tag
  tag: "1.2.0"
image2:
  # The image2 tag
  tag: "1.2.0"

> yq -y '.image2.tag = "1.3.0"' values.yaml | tee out2.yaml
name: some-app
image1:
  tag: "1.2.0"
image2:
  tag: "1.3.0"

> patch values.yaml -i out.patch

out.patch

5c5
<   tag: "1.2.0"
---
>   tag: "1.3.0"

values.yaml

# The app name
name: "some-app"

image1:
  # The image1 tag
  tag: "1.3.0"
image2:
  # The image2 tag
  tag: "1.2.0"

This also shows why a non-unified (normal-format) diff, even with no extra options, is less reliable for patching.
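
One way to guard against the mis-patch shown above is to refuse a normal-format patch whenever a line it wants to replace occurs more than once in the target file. The sketch below is only an illustration of that check; `patch_targets_unique` is a hypothetical name, and the check is conservative (it can reject patches that would have applied correctly).

```shell
# Hypothetical safety check for normal-format diffs ("5c5" style).
# Succeeds only if every replaced/removed line occurs exactly once in TARGET.
# Usage: patch_targets_unique PATCHFILE TARGET
patch_targets_unique() (
  patchfile=$1
  target=$2
  # Lines starting with "< " are the old lines that patch will look for.
  sed -n 's/^< //p' "$patchfile" | while IFS= read -r line; do
    [ -n "$line" ] || continue                 # skip removed blank lines
    count=$(grep -Fxc -e "$line" "$target")    # exact, literal, whole-line match
    [ "$count" -eq 1 ] || exit 1               # ambiguous or missing: give up
  done
)

# Example usage:
#   patch_targets_unique out.patch values.yaml && patch values.yaml < out.patch
```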

@DavidAttar

Will there be a fix for this issue in the future?

@anthonyalayo

It sounds like there's no workaround?

@chrisgrieser

chrisgrieser commented May 1, 2023

prettier is the only YAML formatter I have tried that preserves blank lines correctly.

Considering I switched to rome, it feels a bit annoying, though, to have prettier installed just for its ability to format YAML files :/

@alexklibisz

It sounds like there's no workaround?

There are several workarounds mentioned throughout the thread. Look for 👍

@bewuethr
Contributor

bewuethr commented Jun 2, 2023

Micro-improvement to the workaround that leaves blank lines alone: I have some YAML files with comments preceded by two spaces, like the SemVer comments left by dependabot when you reference an action by its full commit hash, like

uses: rymndhng/release-on-push-action@aebba2bbce07a9474bf95e8710e5ee8a9e922fe2  # v0.25.0

These two spaces also get squashed to just one when you use yq to modify something else.

To prevent this, diff has a -w option that ignores all whitespace, resulting in

yq "$1" "$2" | diff -Bw "$2" - | patch "$2" -

@alita1991

Hello @bewuethr, I have thoroughly tested the workaround you provided, and it demonstrates excellent functionality, effectively addressing the initial issue. However, I have observed that it does not preserve the newline character that exists after the line modification.

11,12c10
<   tag: ""
< 
---
>   tag: "1.0.0"

@bewuethr
Contributor

Hello @bewuethr, I have thoroughly tested the workaround you provided, and it demonstrates excellent functionality, effectively addressing the initial issue. However, I have observed that it does not preserve the newline character that exists after the line modification.

11,12c10
<   tag: ""
< 
---
>   tag: "1.0.0"

That's right, a blank line after a modified line gets removed! I haven't found a better workaround other than moving lines to modify away from a blank line, I'm afraid.

@fulldecent

There is an alternate underlying yaml library that claims to encode whitespace. This is a competitor to the library currently used in yq.

https://github.com/pantoniou/libfyaml

@fulldecent

This is dumb, but I'm just going to say it. If you are only using whitespace to separate sections, and your sections each start with a comment like # some comment, then you can insert the whitespace back in with:

awk '/^---$/{flag=!flag; print; next} flag && /^#/{print ""} {print}'
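
A variation on the same idea, offered only as a sketch: the hypothetical `restore_section_blanks` function below does not depend on `---` markers, skips the first line of the file, and does not put a blank line between consecutive comment lines or after an existing blank line. It assumes, as above, that every section starts with a comment in column 0.

```shell
# Hypothetical helper: print a blank line before each top-level comment
# that follows a non-comment, non-blank line.
# Usage: restore_section_blanks FILE   (or pipe YAML into it)
restore_section_blanks() {
  awk '
    /^#/ && NR > 1 && prev !~ /^#/ && prev != "" { print "" }
    { print; prev = $0 }
  ' "$@"
}

# Example usage:
#   yq '.image.tag = "1.3.0"' values.yaml | restore_section_blanks > values.new.yaml
```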

praveenkumar added a commit to praveenkumar/crc that referenced this issue May 23, 2024
Looks like when yq is run on any yaml file it strip the blank like from
the output so as part of parse mikefarah/yq#515

we use to parse this file to update the go version using the script in
next commit so this commit just to make sure the formatting is right
when it parsed with `yq`

It is generated using `yq --inplace
.github/workflows/windows-artifacts.yml`
praveenkumar added a commit to crc-org/crc that referenced this issue May 23, 2024
Looks like when yq is run on any yaml file it strip the blank like from
the output so as part of parse mikefarah/yq#515

we use to parse this file to update the go version using the script in
next commit so this commit just to make sure the formatting is right
when it parsed with `yq`

It is generated using `yq --inplace
.github/workflows/windows-artifacts.yml`
@rick4096

Thanks for the idea with diff & patch @arcesino.

In my case, the removal of blank lines caused by the diff was unfortunately unacceptable, so I had to dig further.

And I found a solution.

The approach is as follows: I remove the blank lines from the original YAML and create a diff between that and my altered YAML. The patch is then applied to the original, and no new blank-line changes are introduced.

This approach does not work if the diff ONLY appends new lines, e.g., if doing something like:

	yq "
	  .my.prop.array += {\"Id\": \"$ID\", \"Spec\": \"$SPEC\"}
	" ./helm/values.yaml > $UPDATED_YAML

The reason is that the line numbers will be completely wrong, since the diff was computed from two files without the original blank lines. The patch applies, but at the wrong location in the original, which is very wrong. You could try to fix this approach by getting patch to ignore blank lines in the context of the hunk, but that is apparently not possible: the --ignore-whitespace option does not tell patch to ignore blank lines in the context; it applies to the changed lines themselves. The --binary option does not work either; it only helps with CRLF on Windows.

Given this limitation, @arcesino's approach was the only one that actually worked correctly.
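
Since the failure described here involves diffs that append lines, a cautious script could detect append commands in the normal-format diff and stop before patching. A minimal sketch, with `normal_diff_has_appends` as a hypothetical helper name:

```shell
# Hypothetical check: does a normal-format diff contain append commands?
# Append commands look like "5a6" or "5a6,8"; changes look like "11c11".
# Usage: normal_diff_has_appends PATCHFILE
normal_diff_has_appends() {
  grep -Eq '^[0-9]+a[0-9]+(,[0-9]+)?$' "$1"
}

# Example usage:
#   if normal_diff_has_appends patch.file; then
#     echo "patch appends lines; positions may be wrong, not applying" >&2
#   else
#     patch sample.yaml patch.file
#   fi
```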

@Zigler

Zigler commented Jul 26, 2024

Just a suggestion: as a pre-processing step, one could insert tags that record the whitespace or empty lines, using a tag parser to track the ordering of those tags and where they occur in the hierarchy. Then, as a post-processing step, re-insert the whitespace in the right locations after processing the original file. This may work better than a diff, since the blank lines are identified before the output is created, and it may handle the cases where the diff method falls short.
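
A very rough sketch of this idea, using a marker comment in place of each blank line. Everything here is an assumption for illustration: the marker text and the `mark_blanks` / `unmark_blanks` names are made up, and the approach relies on yq carrying the marker comments through to its output (possibly re-indented). It would also break block scalars that contain blank lines, because the marker is written at column 0.

```shell
# Hypothetical marker that stands in for a blank line while yq runs.
BLANK_MARK='#__yq_blank_line__'

# Pre-process: turn every blank line into a marker comment.
mark_blanks() {
  sed "s/^[[:space:]]*\$/$BLANK_MARK/" "$@"
}

# Post-process: turn marker comments (at any indentation) back into blank lines.
unmark_blanks() {
  sed "s/^[[:space:]]*$BLANK_MARK[[:space:]]*\$//" "$@"
}

# Example usage (assumes yq v4 reading from stdin):
#   mark_blanks values.yaml | yq '.image.tag = "1.3.0"' | unmark_blanks > values.new.yaml
```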

@YaoJusheng

Hi, I have tried all the suggested methods, but nothing seems to work correctly.

Blank lines, comments, anchor and alias references in yaml files are all destroyed.
Any suggestions for other possible solutions?

Will this be supported in the future? I saw that go-yaml#627 was closed.

@andry81

andry81 commented Aug 7, 2024

Hi, I have tried all the suggested methods, but nothing seems to work correctly.

Can you give an example of what does not work with this method: #515 (comment)

@YaoJusheng

Hi, I have tried all the suggested methods, but nothing seems to work correctly.

Can you give an example what does not work with this method: #515 (comment)

Sorry, that was my problem: the yq version I was using was 3.x.

After updating to the latest 4.x version, yq itself already retains the format, except that blank lines are deleted, so it needs to be combined with diff and patch.
