Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix misread of source if path is already valid #7720

Closed

Conversation

simonrainerson
Copy link
Contributor

Motivation

When receiving a stream of NARs through the ssh-ng protocol, an already existing path would cause the NAR archive to not be read in the stream, resulting in trying to parse the NAR as a ValidPathInfo. This results in the error message:

error: not an absolute path: 'nix-archive-1'

Context

Usually this problem is avoided by running QueryValidPaths before AddMultipleToStore, but can arise when two parallel nix processes gets the same response from QueryValidPaths. This makes the problem more prominent when running builds in parallel.
Fixes #6253

Checklist for maintainers

Maintainers: tick if completed or explain if not relevant

  • agreed on idea
  • agreed on implementation strategy
  • tests, as appropriate
    • functional tests - tests/**.sh
    • unit tests - src/*/tests
    • integration tests - tests/nixos/*
  • documentation in the manual
  • code and comments are self-explanatory
  • commit message explains why the change was made
  • new feature or bug fix: updated release notes

@edolstra
Copy link
Member

edolstra commented Feb 1, 2023

Can you give more information on how to reproduce this? In the (current) daemon protocol, NARs are supposed to be wrapped in a FramedSource / FramedSink, which is intended specifically to allow incomplete NAR reads/writes.

@simonrainerson
Copy link
Contributor Author

Can you give more information on how to reproduce this? In the (current) daemon protocol, NARs are supposed to be wrapped in a FramedSource / FramedSink, which is intended specifically to allow incomplete NAR reads/writes.

We could reproduce it consistently by running two nix copy with ssh-ng at the same time against a nix store in a local VM that is missing some derivation.

One succeeds and the other one gets the error: not an absolute path: 'nix-archive-1' error.

The full command was NIX_SSHOPTS="-p 2221 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no" nix copy --to ssh-ng://root@localhost /nix/store/some.drv --derivation -L -vv with the .drv file we wanted to copy.

@abbec
Copy link

abbec commented Feb 1, 2023

Can you give more information on how to reproduce this? In the (current) daemon protocol, NARs are supposed to be wrapped in a FramedSource / FramedSink, which is intended specifically to allow incomplete NAR reads/writes.

It is not really an incomplete NAR parsing as much as a completely absent one. Would a FramedSource be able to handle that? I mean the next read from the source would be to parse the path info since the path already existing is not really an error, right?

@thufschmitt
Copy link
Member

I'm actually not surprised at all (in hindsight, I have to admit I totally overlooked this): This bug had already been reported in the general case (#6730) and fixed (although I apparently forgot to close the issue) while I rewrote the implementation of nix copy for everything but ssh-ng stores (#6612). But it's logical then that ssh-ng stores still suffer from that bug

Copy link
Member

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's very sensible 👍

Apart from the small code comment, do you have an easy self-contained reproducer for this? It would be nice to add as a test case

src/libstore/local-store.cc Outdated Show resolved Hide resolved
@edolstra
Copy link
Member

edolstra commented Feb 2, 2023

I think the real bug here is that wopAddMultipleToStore uses a single FramedSource for all paths. Instead it should use a separate FramedSource for each path. But that does require a protocol change.

When receiving a stream of NARs through the ssh-ng protocol, an already
existing path would cause the NAR archive to not be read in the stream,
resulting in trying to parse the NAR as a ValidPathInfo. This results in
the error message:
    error: not an absolute path: 'nix-archive-1'

Fixes NixOS#6253

Usually this problem is avoided by running QueryValidPaths before
AddMultipleToStore, but can arise when two parallel nix processes gets
the same response from QueryValidPaths. This makes the problem more
prominent when running builds in parallel.
@simonrainerson simonrainerson force-pushed the fix-add-to-store-existing branch from 3e8062c to ff81d84 Compare February 14, 2023 12:35
@simonrainerson simonrainerson requested review from thufschmitt and edolstra and removed request for fricklerhandwerk, thufschmitt and edolstra February 14, 2023 12:36
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-44/25546/1

@simonrainerson simonrainerson requested review from thufschmitt and removed request for edolstra March 14, 2023 16:06
@thufschmitt
Copy link
Member

@simonrainerson Do you think you can turn #7720 (comment) into a proper test? The PR is good for me otherwise, but I'd really like this to be a bit more properly tested if that can be done

@simonrainerson
Copy link
Contributor Author

@simonrainerson Do you think you can turn #7720 (comment) into a proper test? The PR is good for me otherwise, but I'd really like this to be a bit more properly tested if that can be done

I'm not very familiar with the code so If you have any pointer to how to set up a LocalStore in a test that would be helpful. Then I think it would pretty straight forward to test for.

@thufschmitt
Copy link
Member

I'm not very familiar with the code so If you have any pointer to how to set up a LocalStore in a test that would be helpful. Then I think it would pretty straight forward to test for.

You can use --store ssh-ng://localhost?remote-store=$TEST_ROOT/remote-store (or --to if you're using nix copy) to get a store located under $TEST_ROOT/remote-store and that you'll connect to using a shim of the ssh-ng protocol. I think that would be enough, but let me know if you need anything more.

@thufschmitt
Copy link
Member

@simonrainerson I've rebased your branch and added a test on https://github.com/tweag/nix/fix-add-to-store-existing. Do you want to incorporate that into this PR? Or I can open a new one if you don't want to bother with it

@simonrainerson
Copy link
Contributor Author

@simonrainerson I've rebased your branch and added a test on https://github.com/tweag/nix/fix-add-to-store-existing. Do you want to incorporate that into this PR? Or I can open a new one if you don't want to bother with it

I get a 404 from that link, feel free to open a pr if you want. I'm back at my computer at the end of the week (or start of next) and can update this PR then. Thanks either way!

@thufschmitt
Copy link
Member

My bad, the correct link is https://github.com/tweag/nix/tree/fix-add-to-store-existing.

I've opened #8805 as a rebased-version of this one. Should auto-merge once the CI is done running. Thanks!

@thufschmitt thufschmitt closed this Aug 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

error: not an absolute path: 'nix-archive-1' with --store builds
6 participants