HDDS-11784. Allow aborting FSO multipart uploads with missing parent directories #7700

Open
wants to merge 7 commits into
base: master
Changes from 5 commits
@@ -152,6 +152,7 @@ public enum ResultCodes {
NO_SUCH_MULTIPART_UPLOAD_ERROR,

MISMATCH_MULTIPART_LIST,
MISSING_MULTIPART_KEY_INFO_ERROR,

MISSING_UPLOAD_PARTS,

@@ -112,10 +112,8 @@
import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_OM_SNAPSHOT_DB_MAX_OPEN_FILES_DEFAULT;
import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_SNAPSHOT_CHECKPOINT_DIR_CREATION_POLL_TIMEOUT;
import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_SNAPSHOT_CHECKPOINT_DIR_CREATION_POLL_TIMEOUT_DEFAULT;
import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.BUCKET_NOT_FOUND;
import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.FILE_NOT_FOUND;
import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.VOLUME_NOT_FOUND;
import static org.apache.hadoop.ozone.OzoneConsts.OM_SNAPSHOT_CHECKPOINT_DIR;
import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.*;
import static org.apache.hadoop.ozone.om.service.SnapshotDeletingService.isBlockLocationInfoSame;
import static org.apache.hadoop.ozone.om.snapshot.SnapshotUtils.checkSnapshotDirExist;

@@ -899,11 +897,25 @@ public String getMultipartKeyFSO(String volume, String bucket, String key, Strin
final long volumeId = getVolumeId(volume);
final long bucketId = getBucketId(volume,
bucket);
long parentId =
OMFileRequest.getParentID(volumeId, bucketId, key, this);

String fileName = OzoneFSUtils.getFileName(key);
long parentId;
try {
parentId = OMFileRequest.getParentID(volumeId, bucketId, key, this);
Contributor

I am still going over the code in more detail, but why not make the approach used in the exception handler (reading parentId from the multipart key info) the default way to calculate parentId?

Contributor

This was my suggestion, since some code might depend on the behavior of the getParentID implementation (i.e. that the parent directory exists).

I'm fine with using the updated approach as long as no regressions are found.

Contributor Author

@sokui, Jan 20, 2025

Yeah, I just followed @ivandika3's suggestion, which is safer. I also tested using only the exception-handling approach: in that case the OMKeyCreateRequestWithFSO#getDBMultipartOpenKey method throws an exception when the MPU is aborted (it breaks the testAbortUploadSuccessWithParts test). This method is used in multiple places. Of course, we could update all of them, but I am not sure that is safe, and I am trying to limit the scope of this PR. Please let me know your thoughts.

} catch (final Exception e) {
// Parent directories may be missing, in which case an exception is thrown.
// see https://issues.apache.org/jira/browse/HDDS-11784
LOG.warn("Got exception when finding parent id for {}/{}/{}. Use another way to get it",
Contributor

This might be a common occurrence in a concurrent system and might not warrant a WARN log.

Contributor Author

@sokui, Jan 20, 2025

Per @ivandika3's suggestion, we probably need to update the directory deletion logic for incomplete MPUs so that the missing-parent-directory case cannot happen (or happens only with very low probability): #7566 (comment)

At a high level, I feel that if this case happens frequently, something is wrong (at either the code level or the design level). How can a key exist when its parent directories have already been deleted? That's why I think the WARN log is useful here. Please let me know your thoughts.

volumeId, bucketId, key, e);
final String nonFSOMultipartKey =
getMultipartKey(volume, bucket, key, uploadId);
final OmMultipartKeyInfo multipartKeyInfo =
getMultipartInfoTable().get(nonFSOMultipartKey);
if (multipartKeyInfo == null) {
throw new OMException(MISSING_MULTIPART_KEY_INFO_ERROR);
}
ivandika3 marked this conversation as resolved.
parentId = multipartKeyInfo.getParentID();
}

final String fileName = OzoneFSUtils.getFileName(key);
return getMultipartKey(volumeId, bucketId, parentId,
fileName, uploadId);
}
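
For readers following the hunk above, the lookup-then-fallback flow in getMultipartKeyFSO can be sketched with in-memory maps. This is a minimal illustration, not the Ozone implementation: the class ParentIdFallbackSketch, its maps, and all of its method names are hypothetical, and it assumes the parent ID is recorded in the multipart info entry when the upload is initiated.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical in-memory stand-in for the OM tables; not the Ozone API. */
public class ParentIdFallbackSketch {
  private static final long ROOT_ID = 0L;

  // Directory path -> object ID (assumed stand-in for the directory table).
  private final Map<String, Long> directoryTable = new HashMap<>();
  // "key#uploadId" -> parent ID (assumed stand-in for the multipart info table).
  private final Map<String, Long> multipartInfoTable = new HashMap<>();

  public void addDirectory(String path, long objectId) {
    directoryTable.put(path, objectId);
  }

  public void removeDirectory(String path) {
    directoryTable.remove(path);
  }

  /** Records the parent ID at initiate time, while the directory still exists. */
  public void initiateUpload(String key, String uploadId) {
    multipartInfoTable.put(key + "#" + uploadId, lookupParentId(key));
  }

  /** Primary path: resolve the parent from the directory table, or throw. */
  public long lookupParentId(String key) {
    int slash = key.lastIndexOf('/');
    if (slash < 0) {
      return ROOT_ID; // key sits directly under the bucket root
    }
    Long id = directoryTable.get(key.substring(0, slash));
    if (id == null) {
      throw new NoSuchElementException("Missing parent directory for " + key);
    }
    return id;
  }

  /** Tries the primary path first; falls back to the recorded parent ID. */
  public long resolveParentId(String key, String uploadId) {
    try {
      return lookupParentId(key);
    } catch (NoSuchElementException e) {
      Long recorded = multipartInfoTable.get(key + "#" + uploadId);
      if (recorded == null) {
        throw new IllegalStateException(
            "No multipart info for " + key + " / " + uploadId, e);
      }
      return recorded;
    }
  }

  public static void main(String[] args) {
    ParentIdFallbackSketch om = new ParentIdFallbackSketch();
    om.addDirectory("a/b", 42L);
    om.initiateUpload("a/b/file", "upload-1");
    om.removeDirectory("a/b"); // parent disappears before the abort
    System.out.println(om.resolveParentId("a/b/file", "upload-1")); // prints 42
  }
}
```

Keeping the directory lookup as the primary path means callers that rely on it failing for a missing parent see no change in behavior; only the abort-style caller that goes through resolveParentId takes the fallback.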
@@ -37,6 +37,7 @@
import static org.apache.hadoop.ozone.om.OmMetadataManagerImpl.FILE_TABLE;
import static org.apache.hadoop.ozone.om.OmMetadataManagerImpl.MULTIPARTINFO_TABLE;
import static org.apache.hadoop.ozone.om.OmMetadataManagerImpl.OPEN_FILE_TABLE;
import static org.apache.hadoop.ozone.om.OmMetadataManagerImpl.DIRECTORY_TABLE;

/**
* Response for Multipart Upload Complete request.
@@ -47,7 +48,7 @@
* 3) Delete unused parts.
*/
@CleanupTableInfo(cleanupTables = {OPEN_FILE_TABLE, FILE_TABLE, DELETED_TABLE,
MULTIPARTINFO_TABLE})
MULTIPARTINFO_TABLE, DIRECTORY_TABLE})
public class S3MultipartUploadCompleteResponseWithFSO
extends S3MultipartUploadCompleteResponse {
