PrestoSql is failing to rename /tmp folder when PerformanceCacheEnabled is set to true #342

Closed
afilipchik opened this issue Mar 27, 2020 · 2 comments


afilipchik commented Mar 27, 2020

Chasing a weird bug. Presto has an operation: create table as (select ...). During this operation it creates a tmp folder on GCS and writes intermediate results there. After all files are written, it moves them to the destination using a rename operation.

2020-03-26T20:29:45.941-0700	INFO	gcsfs-batch-helper-8	com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl	Precondition not met while deleting 'gs:/bucket/mp/user/6c5e9a11-49c5-4558-bbab-525f9e9a5f1d/' at generation 0. Attempt 1. Retrying:
{"code":412,"errors":[{"domain":"global","location":"If-Match","locationType":"header","message":"Precondition Failed","reason":"conditionNotMet"}],"message":"Precondition Failed"}
2020-03-26T20:29:46.008-0700	INFO	gcsfs-batch-helper-8	com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl	Precondition not met while deleting 'gs://bucket/tmp/user_name/6c5e9a11-49c5-4558-bbab-525f9e9a5f1d/' at generation 0. Attempt 2. Retrying:
{"code":412,"errors":[{"domain":"global","location":"If-Match","locationType":"header","message":"Precondition Failed","reason":"conditionNotMet"}],"message":"Precondition Failed"}

The reason is that with setPerformanceCacheEnabled(true) the connector caches the response of fs.getFileInfo('/tmp/folder') even when /tmp/folder doesn't exist. So Presto makes this check, sees that the folder doesn't exist, creates it, then tries to delete it and fails because generationId=0 is passed into the delete operation (the 0 comes from the cached lookup, which did not find the file).

In general, if we check whether the folder exists before creating it, and then execute the write and rename before the cache entry expires, the rename will fail. I have a very simple test case.
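
For readers who want to see the sequence in isolation, here is a minimal in-memory sketch of it in Python. It does not use the GCS connector or any real API: FakeObjectStore, CachingClient, FakeClock and run_scenario are hypothetical names, and the model assumes that the create path leaves the cached "not found" entry in place, which is the behaviour described above. Generation 0 stands for "object does not exist", as in the log.

```python
class PreconditionFailed(Exception):
    """Stands in for the HTTP 412 response shown in the log."""


class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class FakeObjectStore:
    """Toy store: each object name maps to a generation number."""

    def __init__(self):
        self._generations = {}
        self._next_generation = 1

    def get_generation(self, name):
        # 0 means "object does not exist".
        return self._generations.get(name, 0)

    def create(self, name):
        self._generations[name] = self._next_generation
        self._next_generation += 1

    def delete(self, name, if_generation_match):
        # The delete succeeds only if the caller passes the live generation.
        if self._generations.get(name, 0) != if_generation_match:
            raise PreconditionFailed(name)
        self._generations.pop(name, None)


class CachingClient:
    """Hypothetical client that caches lookups, including misses, for ttl seconds."""

    def __init__(self, store, ttl, clock):
        self.store = store
        self.ttl = ttl
        self.clock = clock
        self._cache = {}  # name -> (generation, time cached)

    def get_generation(self, name):
        entry = self._cache.get(name)
        if entry is not None and self.clock() - entry[1] < self.ttl:
            return entry[0]
        generation = self.store.get_generation(name)
        self._cache[name] = (generation, self.clock())
        return generation

    def create(self, name):
        # Assumption being modelled: the cached miss is not invalidated here.
        self.store.create(name)

    def delete(self, name):
        self.store.delete(name, if_generation_match=self.get_generation(name))


def run_scenario(wait_seconds, ttl=5.0):
    """Check existence, create, wait, then delete (the last step of a rename)."""
    clock = FakeClock()
    client = CachingClient(FakeObjectStore(), ttl=ttl, clock=clock)
    name = "tmp/folder/"
    client.get_generation(name)  # existence check caches generation 0
    client.create(name)
    clock.advance(wait_seconds)
    try:
        client.delete(name)
        return "deleted"
    except PreconditionFailed:
        return "precondition failed"
```

In this model, run_scenario(1.0) hits the stale entry and the delete is rejected, while run_scenario(10.0) waits past the TTL, re-reads the live generation and succeeds.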

@afilipchik afilipchik changed the title PrestoSql is failing to rename /tmp folder PrestoSql is failing to rename /tmp folder when PerformanceCacheEnabled is set to true Mar 27, 2020
@HunterEl

I am seeing the same issue! Would love to see a fix for this. LMK if there is anything I can do to help out!

medb (Contributor) commented Mar 27, 2020

We recently fixed this issue with caching non-existent objects in the performance cache.

It was released in GCS connector 2.1.0.

Could you check whether the Presto query still fails with the latest GCS connector version, if you are not using it already?
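
As an illustration of the general idea only, and not necessarily how the connector implements it, a lookup cache can avoid this class of problem by storing hits and never storing misses. The sketch below is plain Python; PositiveOnlyCache and the dictionary-backed store are hypothetical, generation 0 again means "not found", and expiry is left out to keep it short.

```python
class PositiveOnlyCache:
    """Hypothetical lookup cache that stores hits but never stores misses."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable: name -> generation (0 if absent)
        self._entries = {}

    def get_generation(self, name):
        if name in self._entries:
            return self._entries[name]
        generation = self._lookup(name)
        if generation != 0:
            # Only objects that exist are remembered.
            self._entries[name] = generation
        return generation


backend = {}  # toy store: name -> generation
cache = PositiveOnlyCache(lambda name: backend.get(name, 0))

before = cache.get_generation("tmp/folder/")  # miss, not cached
backend["tmp/folder/"] = 7                    # object is created afterwards
after = cache.get_generation("tmp/folder/")   # fresh lookup sees the new object
```

With this shape, the existence check made before the folder is created cannot supply a stale generation 0 to a later delete.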
