You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have a table with a large number of partitions in S3. I use partition projection with dynamic id partitioning to query the dataset using Athena, so there are no logical partitions created for the table in the Glue Data Catalog. The first two of the three partition values in the projection are 'injected'. Running a S3F2 deletion job against this table runs successfully, but always returns a count of 0 deletes. I'm assuming that this is because the data catalog is returning an empty list of partitions each time. I want to confirm whether partition projection is supported or not.
The text was updated successfully, but these errors were encountered:
Hi, thanks for opening an issue. Partition Projection is a relatively new feature for Athena, and I don't think we've done any extensive testing on that. What I know, is that partitions are fetched during query generation, and if none are returned a query should be made to fetch all results and Athena should hopefully sort out the mechanism to project them as it would do with any other query similarly for how it would do it with an unpartitioned query - so it should work.
I can do a quick test on my side, in the meanwhile, can you provide some details on your side:
Which version are you using of the S3 Find & Forget solution?
Is this dataset parquet?
You mentioned three partitions are injected - can you provide a sample of that? If you could share a sample of your glue table metadata (so, how you configured the partitions) and their values, I can try to see if I can reproduce on my side with the same schema
When you inspect your job, you should see in the UI/API the number of queries executed. Can you confirm you see a query for that particular dataset? To dig into it, you can check the Job Events using UI/API and search for a QuerySucceeded event
You mention that you see 0 deletions - have you checked with a manual query whether that is a false negative? If you run a query with athena for instance, such as select * from table where customer_id in ('value', 'value1') do you actually see any rows? I guess what I'm asking here is, have you checked your table's dynamic partitioning configuration is correctly setup?
I have a table with a large number of partitions in S3. I use partition projection with dynamic id partitioning to query the dataset using Athena, so there are no logical partitions created for the table in the Glue Data Catalog. The first two of the three partition values in the projection are 'injected'. Running a S3F2 deletion job against this table runs successfully, but always returns a count of 0 deletes. I'm assuming that this is because the data catalog is returning an empty list of partitions each time. I want to confirm whether partition projection is supported or not.
The text was updated successfully, but these errors were encountered: