feat: Implement services webhdfs #1263

Merged: 35 commits, Feb 7, 2023

Conversation

ClSlaid
Contributor

@ClSlaid ClSlaid commented Jan 31, 2023

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

This PR implements WebHDFS service support for OpenDAL.

@Xuanwo
Member

Xuanwo commented Feb 1, 2023

Hi, I'm working on #1260 now, which may introduce lots of breaking changes and conflicts. Could you wait for some time and rebase onto my work after #1260 gets merged? I expect to finish all the work this week.

1. some documentation work
2. fix listing
3. try to troubleshoot listing and writing, unsuccessful

Signed-off-by: 蔡略 <[email protected]>
@wcy-fdu
Contributor

wcy-fdu commented Feb 3, 2023

I am curious: does webhdfs not depend on Hadoop and a Java environment?

@ClSlaid
Contributor Author

ClSlaid commented Feb 3, 2023

I am curious: does webhdfs not depend on Hadoop and a Java environment?

The service is implemented purely in Rust and communicates with Hadoop via its RESTful API.

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
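To illustrate what "communicates via RESTful APIs" means in practice, here is a minimal sketch of how a WebHDFS request URL is put together, following the `http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=<OP>` shape from the Hadoop documentation linked above. The `Op` enum and the `webhdfs_url` helper are hypothetical names made up for this example and are not taken from this PR; the sketch also skips percent-encoding of the path, which a real client would need.

```rust
/// A small subset of WebHDFS operations, enough for this sketch.
/// Hypothetical type, not part of OpenDAL.
#[derive(Debug, Clone, Copy)]
enum Op {
    GetFileStatus,
    ListStatus,
    Open,
}

impl Op {
    /// The value passed in the `op` query parameter.
    fn as_str(self) -> &'static str {
        match self {
            Op::GetFileStatus => "GETFILESTATUS",
            Op::ListStatus => "LISTSTATUS",
            Op::Open => "OPEN",
        }
    }
}

/// Build a URL of the form `<endpoint>/webhdfs/v1/<path>?op=<OP>`.
/// Assumption: `path` needs no percent-encoding.
fn webhdfs_url(endpoint: &str, path: &str, op: Op) -> String {
    // Avoid doubled slashes where the endpoint and path are joined.
    let endpoint = endpoint.trim_end_matches('/');
    let path = path.trim_start_matches('/');
    format!("{}/webhdfs/v1/{}?op={}", endpoint, path, op.as_str())
}

fn main() {
    let endpoint = "http://127.0.0.1:9870";
    for op in [Op::GetFileStatus, Op::ListStatus, Op::Open] {
        println!("{}", webhdfs_url(endpoint, "/tmp/hello.txt", op));
    }
}
```

Since every operation is a plain HTTP request like this, no Hadoop client libraries or JVM are needed on the client side.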

1. add webhdfs test in ci, running simultaneously with hdfs test
2. add CSRF feature for webhdfs
3. finish documentation for webhdfs

Signed-off-by: 蔡略 <[email protected]>
@ClSlaid
Contributor Author

ClSlaid commented Feb 3, 2023

Most of the work on this service is complete; waiting on #1260 and beyondstorage/setup-hdfs#138

1. default webhdfs port is 9870, controlled by dfs.namenode.http-address

Signed-off-by: 蔡略 <[email protected]>
1. set endpoint, hope this works

Signed-off-by: 蔡略 <[email protected]>
@Xuanwo
Member

Xuanwo commented Feb 3, 2023

#1260 is about to be merged; welcome to play with our new framework!

@Xuanwo
Member

Xuanwo commented Feb 6, 2023

Hi, any updates on this PR?

@ClSlaid
Contributor Author

ClSlaid commented Feb 6, 2023

Just resolved conflicts with the main branch.
Now applying the review feedback.

1. don't run the root check when building the Backend; run it only in
   Backend::stat instead
2. move message structs to `message.rs`
3. apply some advice from review

Signed-off-by: 蔡略 <[email protected]>
@Xuanwo
Member

Xuanwo commented Feb 6, 2023

Wow~

@Xuanwo Xuanwo marked this pull request as ready for review February 6, 2023 11:28
@Xuanwo Xuanwo changed the title from "feat: Services/webhdfs" to "feat: Implement services webhdfs" Feb 6, 2023
@Xuanwo
Member

Xuanwo commented Feb 6, 2023

[2023-02-06T11:39:09Z ERROR opendal::services] service=webhdfs operation=write path=043a4379-8ae9-4d85-b26f-ca47341ddd44 size=3135326 -> failed: Unexpected (permanent) at write => building request
    
    Context:
        service: webhdfs
        path: 043a4379-8ae9-4d85-b26f-ca47341ddd44
    
    Source: invalid format
    
thread 'services_webhdfs_write::test_fuzz_offset_reader' panicked at 'write must succeed: Unexpected (permanent) at write => building request

Context:
    service: webhdfs
    path: 043a4379-8ae9-4d85-b26f-ca47341ddd44

hdfs action is not stable so we will able the test case pass. Please check them by hand.

@ClSlaid
Contributor Author

ClSlaid commented Feb 6, 2023

hdfs action is not stable so we will able the test case pass. Please check them by hand.

Strangely, the environment variable WEBHDFS_NAMENODE_ADDR does not seem to take effect in CI, so building the backend fails with endpoint=http:. I'll take a look.
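For reference, a minimal sketch of how the `endpoint=http:` symptom can arise and be guarded against: if the address variable is unset or empty and is blindly prefixed with a scheme, the result is not a usable endpoint. The `endpoint_from` helper and the fallback to `127.0.0.1:9870` (the default WebHDFS port mentioned earlier in this thread) are assumptions for illustration, not the code in this PR.

```rust
use std::env;

/// Fallback used when no address is configured.
/// Assumption for this sketch: a local NameNode on the default port 9870.
const DEFAULT_NAMENODE_ADDR: &str = "127.0.0.1:9870";

/// Turn an optional `host:port` value into an endpoint URL.
/// Hypothetical helper, not part of OpenDAL.
fn endpoint_from(addr: Option<&str>) -> String {
    // Treat unset, empty, and whitespace-only values the same way,
    // so a bare "http://" with no host is never produced.
    let addr = addr
        .map(str::trim)
        .filter(|a| !a.is_empty())
        .unwrap_or(DEFAULT_NAMENODE_ADDR);

    // Keep an explicit scheme if the caller already supplied one.
    if addr.starts_with("http://") || addr.starts_with("https://") {
        addr.to_string()
    } else {
        format!("http://{addr}")
    }
}

fn main() {
    // `.ok()` maps "variable not set" to `None`.
    let raw = env::var("WEBHDFS_NAMENODE_ADDR").ok();
    let endpoint = endpoint_from(raw.as_deref());
    println!("endpoint={endpoint}");
}
```

Whether falling back silently is preferable to failing loudly with a clear error is a design choice; in CI, an explicit error would probably make this kind of misconfiguration easier to spot.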

Xuanwo and others added 10 commits February 6, 2023 21:54
Signed-off-by: Xuanwo <[email protected]>
1. fix incorrect documentation in webhdfs
2. remove unnecessary language environments in webhdfs CI

Signed-off-by: 蔡略 <[email protected]>
1. it does not work at all, implement kerberos later

Signed-off-by: 蔡略 <[email protected]>
Signed-off-by: 蔡略 <[email protected]>
@Xuanwo
Member

Xuanwo commented Feb 7, 2023

This PR is good enough to get merged! I will merge it after all CI checks pass.

@Xuanwo Xuanwo merged commit fb1189e into apache:main Feb 7, 2023
@Xuanwo Xuanwo mentioned this pull request Feb 8, 2023