Skip to content
This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

Decouple hdfs storage from global storage, support config multiple folders for hdfs #1922

Merged
merged 5 commits into from
Dec 21, 2018

Conversation

mzmssg
Copy link
Member

@mzmssg mzmssg commented Dec 20, 2018

No description provided.

@mzmssg mzmssg requested review from YanjieGao and ydye December 20, 2018 13:51
@coveralls
Copy link

Coverage Status

Coverage remained the same at 51.71% when pulling 0ab9b43 on zimiao/decouple_hdfs_storage into 829a018 on master.

2 similar comments
@coveralls
Copy link

Coverage Status

Coverage remained the same at 51.71% when pulling 0ab9b43 on zimiao/decouple_hdfs_storage into 829a018 on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 51.71% when pulling 0ab9b43 on zimiao/decouple_hdfs_storage into 829a018 on master.

@coveralls
Copy link

coveralls commented Dec 20, 2018

Coverage Status

Coverage increased (+0.01%) to 51.722% when pulling 60315d2 on zimiao/decouple_hdfs_storage into 829a018 on master.


After parsing, object model looks like:
```yaml
storage_path: /path/to/folder1,/path/to/folder2,etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

,etc. could remove which maybe misleading user. if the format is the same with hdfs-site.xml, could refer that config intro format

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update some comments

#hadoop-data-node:
# # storage path for hdfs, support comma-delimited list of directories, eg. /path/to/folder1,/path/to/folder2 ...
# # if left empty, will use cluster.common.data-path/hdfs/data
# storage_path:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the default value?

why not make it contains config default value and also give user flexibility to config multi path?
storage_path: /datastorage

Copy link
Member Author

@mzmssg mzmssg Dec 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YanjieGao
Default value is cluster_config[common][data-path]/hdfs/data, which might not be /datastorage.
The logic is: if admin give a specific hdfs storage, then use it, if not, use global storage.
Ideally, if we allow introduce other services' config in yaml, then here we could set $cluser_config.common.data-path/hdfs/data, which would be more clear

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For user only view this config file can't find cluster.common.data-path position. It will be confused for user. Because not all user know to query github code base to find cluster_config[common][storage] cluster object model config.

It is better to tell user clear the default path is where.

We should assume user maybe only get the context of current config file.

Copy link
Member Author

@mzmssg mzmssg Dec 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I don't think user should see here, the default value is for advanced user or dev. In our design, user should overwrite this value in services-configuration.yaml, which contains the necessary context.

If we introduce some hard-code path here(even only in comments), it will couple this file with other service.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, misunderstanding this as end user yaml file & my intent is not hard code (intend is to tell user could find this config default value is this file's data-path config).

In my understand default value is for not advanced user and advanced user will know customized it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not big problem. Could continue

@mzmssg mzmssg changed the title Decouple hdfs storage, support config multiple folders for hdfs Decouple hdfs storage from global storage, support config multiple folders for hdfs Dec 21, 2018
@@ -0,0 +1,44 @@
## Hadoop data node section parser

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem is the same with below, if this doc is just for config. Later could tell user config info at service-config doc or other part

@@ -0,0 +1,44 @@
## Hadoop data node section parser

- [Default Configuration](#D_Config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lack
1 first deployed user how to do: step by step. [step's could refer some deployment doc]
2 upgrading user how to do: step by step. [step's could refer some deployment doc]

Copy link
Member Author

@mzmssg mzmssg Dec 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YanjieGao
I think in our design, this doc shoud follow some standard items, only for per service configuration introduction. Not for maintance or other purpose.
i.e.
https://github.com/Microsoft/pai/blob/master/src/rest-server/config/rest-server.md
https://github.com/Microsoft/pai/blob/master/src/grafana/config/grafana.md

Of course, I will update hdfs docs after this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@mzmssg mzmssg merged commit 1064ab6 into master Dec 21, 2018
@xudifsd xudifsd deleted the zimiao/decouple_hdfs_storage branch May 24, 2019 02:14
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants