Lately, due to some pipeline issues, I've ended up re-running a lot of jobs while their output already exists. They seem to run fine, but the output doesn't change or even get an updated modification time. I tracked this down to the atomic write pipe silently failing when the move does nothing. I use the snakebite client with hdfs fallback, and I believe this is happening in the snakebite client.
Demonstration code:
import luigi.contrib.hdfs

target = luigi.contrib.hdfs.HdfsTarget('/tmp/test.txt')

# First write: the target does not exist yet, so this succeeds.
with target.open('w') as fobj:
    fobj.write('test1')

try:
    # Second write: the target already exists, and the atomic move silently does nothing.
    with target.open('w') as fobj:
        fobj.write('test2')
finally:
    with target.open() as fobj:
        print '\ncontents:', fobj.read()
    target.remove()
I would expect to see test2 printed. At the very least, if it prints test1, I'd expect an error message, since that means the second write didn't work. Instead I see test1 and no error message. So it looks like I successfully wrote test2 to HDFS when I didn't.
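For now I work around it by removing any existing output before rewriting, so the atomic move has nothing to collide with. This is a minimal sketch of that workaround using only the HdfsTarget methods shown above (the exists/remove guard is my own addition, not a fix for the underlying bug):

import luigi.contrib.hdfs

target = luigi.contrib.hdfs.HdfsTarget('/tmp/test.txt')

# Workaround sketch (assumption, not the real fix): clear any stale output
# first so the atomic rename cannot silently no-op against an existing file.
if target.exists():
    target.remove()

with target.open('w') as fobj:
    fobj.write('test2')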