Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HdfsDataSegmentPusher: Properly include scheme, host in output path if necessary. #3577

Merged
merged 1 commit into from
Oct 17, 2016

Conversation

gianm
Copy link
Contributor

@gianm gianm commented Oct 15, 2016

Fixes #3576.

@gianm gianm added the Bug label Oct 15, 2016
@gianm gianm added this to the 0.9.2 milestone Oct 15, 2016
@gianm gianm closed this Oct 15, 2016
@gianm gianm reopened this Oct 15, 2016
@gianm gianm closed this Oct 15, 2016
@gianm gianm reopened this Oct 15, 2016
@pjain1
Copy link
Member

pjain1 commented Oct 16, 2016

👍

@@ -150,7 +151,7 @@ private DataSegment createDescriptorFile(DataSegment segment, Path outDir, final

private ImmutableMap<String, Object> makeLoadSpec(Path outFile)
{
return ImmutableMap.<String, Object>of("type", "hdfs", "path", outFile.toString());
return ImmutableMap.<String, Object>of("type", "hdfs", "path", outFile.toUri().toString());
Copy link
Member

@pjain1 pjain1 Oct 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this not needed before ? It wasoutFile.toString() before #3494 and used to work.
FYI - I see this note in the toString() method of Path class -

// we can't use uri.toString(), which escapes everything, because we want
// illegal characters unescaped in the string, for glob processing, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It helps the new unit test to pass; without it, file:///x/y/z gets translated to file:/x/y/z.

I think that comment on Path actually indicates that adding toUri is a good thing. We're adding a path to a single file and don't want glob processing or any other special processing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@fjy
Copy link
Contributor

fjy commented Oct 17, 2016

👍

@fjy fjy merged commit 0ce33bc into apache:master Oct 17, 2016
gianm added a commit to gianm/druid that referenced this pull request Oct 17, 2016
fjy pushed a commit that referenced this pull request Oct 17, 2016
@himanshug
Copy link
Contributor

@gianm will this always put "hdfs://host:port" in the load spec even if same was not specified in the "storageDirectory"
i'm asking because we sometimes move segments from one hdfs cluster to another while keeping the absolute path of all the segment files same. Earlier, since "host:port" wasn't present in the db so no db updates were required to do this kind of movement.
wondering if that is still the case after this PR?

@gianm gianm deleted the fix-3576 branch October 17, 2016 19:33
@gianm
Copy link
Contributor Author

gianm commented Oct 18, 2016

@himanshug if the storageDirectory has no scheme (just /path/to/blah), it will stay that way in the loadSpec. See the UTs, one of them is for a storageDirectory with no scheme.

@himanshug
Copy link
Contributor

@gianm that sounds good, thanks.

fundead pushed a commit to fundead/druid that referenced this pull request Dec 7, 2016
seoeun25 pushed a commit to seoeun25/incubator-druid that referenced this pull request Feb 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants