
Check whether renames worked in atomic hdfs pipes #2119

Merged: 1 commit merged into spotify:master on May 22, 2017

Conversation

@daveFNbuck (Contributor)

Description

Using the snakebite client, renaming a file onto an existing file fails without raising an exception. Instead, the client returns a list of moves and whether each one succeeded. This means that with the snakebite client, atomic writes to existing files fail silently, whereas the hadoopcli client fails with an error.

There are two ways to fix this. First, we can simply remove the target file if it already exists. Second, we can read the snakebite result and raise an error when a move fails. This PR does both.

For atomic directory writes, this also adds a check for whether the directory was renamed to the target or merely moved inside an existing target directory.
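
A minimal sketch of both fixes, assuming a snakebite-style client whose rename() yields one result dict per move with a boolean 'result' key (the helper name and exception type are hypothetical, not the PR's actual code):

```python
import os.path

def finish_atomic_write(client, tmp_path, dest):
    """Hedged sketch of the two-part fix described above (not the exact diff)."""
    # Fix 1: remove an existing target so the rename cannot be rejected
    # (file case) or land *inside* the old directory (directory case).
    if client.exists(dest):
        client.remove(dest, recursive=True)

    # Fix 2: snakebite reports per-move results instead of raising, so
    # inspect each result dict and raise on the first failure.
    for move in client.rename([tmp_path], dest):
        if not move.get('result', False):
            raise RuntimeError('rename %s -> %s failed: %r' % (tmp_path, dest, move))

    # Directory check: if dest still existed as a directory at rename time,
    # the data may have been nested inside it instead of replacing it.
    nested = os.path.join(dest, os.path.basename(tmp_path))
    if client.exists(nested):
        raise RuntimeError('%s was moved into %s instead of replacing it' % (tmp_path, dest))
```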

Motivation and Context

As described in #2118, I was having an issue where jobs that seemed to be running correctly were not resulting in updated files. This resolves #2118.

Have you tested this? If so, how?

I added minicluster unit tests. I have also manually tested the part of this change that removes an existing target in the non-directory pipe; the demonstration code from #2118 works as expected with that in place. I added the other fixes after writing the unit tests.
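
For illustration, the kind of assertion such a test can make (a hypothetical sketch using luigi's HdfsTarget, not the PR's actual test code):

```python
import luigi.contrib.hdfs as hdfs

def test_atomic_write_replaces_existing_target():
    target = hdfs.HdfsTarget('/tmp/atomic_write_test')  # hypothetical path

    # First write creates the target.
    with target.open('w') as f:
        f.write('old contents\n')

    # A second atomic write must replace the target or raise; it must
    # never silently leave the old contents in place.
    with target.open('w') as f:
        f.write('new contents\n')

    with target.open('r') as f:
        assert f.read() == 'new contents\n'
```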

@mention-bot

@daveFNbuck, thanks for your PR! By analyzing the history of the files in this pull request, we identified @Tarrasch, @freider and @lallea to be potential reviewers.

@Tarrasch (Contributor) left a comment

Rubberstamping as green. Didn't look through this in detail, but you seem to have good test coverage.

@daveFNbuck (Contributor, Author)

Fixed the merge conflict with the other PR. Thanks for reviewing so quickly. You may want to just check that you think my solution is reasonable. It adds a decent amount of overhead to these operations. I think it's ok because hdfs writes tend to be large and data integrity should always take priority over pipeline speed.

@Tarrasch merged commit a2f63c6 into spotify:master on May 22, 2017
@Tarrasch (Contributor)

Honestly, I never used luigi pipes to write to files after I quit Spotify. I reviewed a little bit more; looks good. :)

@daveFNbuck deleted the fix_hdfs_rename branch on June 2, 2017 at 00:03
@Milage commented Jun 12, 2017

I was having the same problem and am trying out the latest version with this fix. It doesn't seem to be working, but I think that is because it doesn't play well with snakebite, which I'm using as my hdfs client.

It throws a FileNotFoundException rather than an HDFSCliError when trying to remove a file that doesn't exist.

  File "/x/s/hadoop/luigi/trueex/SMARTSLoad.py", line 220, in run
    f.close()
  File "/x/s/hadoop-2.7.3/pythonlibs/luigi/contrib/hdfs/format.py", line 51, in close
    remove(self.path)
  File "/x/s/hadoop-2.7.3/pythonlibs/luigi/contrib/hdfs/clients.py", line 62, in result
    return getattr(get_autoconfig_client(), method_name)(*args, **kwargs)
  File "/x/s/hadoop-2.7.3/pythonlibs/luigi/contrib/hdfs/snakebite_client.py", line 140, in remove
    return list(self.get_bite().delete(self.list_path(path), recurse=recursive))
  File "build/bdist.linux-x86_64/egg/snakebite/client.py", line 508, in delete
    for item in self._find_items(paths, processor, include_toplevel=True):
  File "build/bdist.linux-x86_64/egg/snakebite/client.py", line 1216, in _find_items
    raise FileNotFoundException("`%s': No such file or directory" % path)

Indeed, switching the hdfs client back to the default works.

Any suggestions as to the best way to fix?
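
For illustration, one direction a fix could take (a hypothetical sketch, not something proposed or merged in this thread): guard the removal in the close() path so a missing target is tolerated regardless of which exception type the active client raises.

```python
def remove_if_exists(client, path, recursive=True):
    # Only remove when the path exists, sidestepping the mismatch between
    # hadoopcli's HDFSCliError and snakebite's FileNotFoundException for
    # missing paths. (Hypothetical helper; note there is still a window in
    # which the file could disappear between the two calls.)
    if client.exists(path):
        client.remove(path, recursive=recursive)
```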

Closes: HdfsAtomicWritePipe silently fails when target file exists (#2118)