Decouple hdfs storage from global storage, support config multiple folders for hdfs #1922
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
## Hadoop data node section parser | ||
|
||
Please link to this file from the data path config in the doc below.
The problem is the same as the one below, if this doc is just for config. Later we could tell users the config info in the service-config doc or elsewhere. |
||
- [Default Configuration](#D_Config) | ||
lack
@YanjieGao Of course, I will update the hdfs docs after this PR.
ok |
||
- [How to Configure](#HT_Config) | ||
- [Generated Configuration](#G_Config) | ||
- [Data Table](#T_Config) | ||
|
||
#### Default configuration <a name="D_Config"></a> | ||
|
||
[hadoop-data-node default configuration](hadoop-data-node.yaml) | ||
|
||
#### How to configure hadoop-data-node section in service-configuration.yaml <a name="HT_Config"></a> | ||
|
||
All configuration in this section is optional. If you want to customize these values, you can configure them in service-configuration.yaml. | ||
|
||
- `storage_path` The hdfs storage folders, given as a comma-delimited list of directories. | ||
If it isn't specified, `cluster.common.data-path/hdfs/data` is used. | ||
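The fallback described above could be sketched roughly as follows. This is a minimal illustration, not the parser from this PR: the helper name `resolve_storage_path` and the dictionary layout of both arguments are assumptions made for the example.

```python
def resolve_storage_path(service_conf, cluster_conf):
    """Return the hdfs storage path string, falling back to the global data path.

    Hypothetical helper: the function name and the dict layout are
    assumptions for illustration, not the actual parser code.
    """
    storage_path = service_conf.get("storage_path")
    if storage_path:
        # The admin configured one or more folders explicitly; use them as-is.
        return storage_path
    # Documented fallback: <cluster.common.data-path>/hdfs/data
    data_path = cluster_conf["common"]["data-path"].rstrip("/")
    return data_path + "/hdfs/data"
```

With an empty `hadoop-data-node` section the global data path decides the result; as soon as `storage_path` is set, the global path is ignored.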
|
||
|
||
|
||
#### Generated Configuration <a name="G_Config"></a> | ||
|
||
After parsing, the object model holds a comma-delimited string in which every substring is a directory: | ||
```yaml | ||
storage_path: /path/to/folder1,/path/to/folder2,... | ||
``` | ||
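A consumer of the generated value would typically split the string back into individual directories. The snippet below is one way to do that; `split_storage_path` is a hypothetical helper, and tolerating stray whitespace or a trailing comma is an assumption rather than documented behavior.

```python
def split_storage_path(storage_path):
    """Split a comma-delimited storage_path string into a list of directories.

    Hypothetical helper for illustration only.
    """
    # Strip whitespace around each entry and drop empty ones (e.g. a trailing comma).
    return [part.strip() for part in storage_path.split(",") if part.strip()]
```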
|
||
|
||
#### Table <a name="T_Config"></a> | ||
|
||
<table> | ||
<tr> | ||
<td>Data in Configuration File</td> | ||
<td>Data in Cluster Object Model</td> | ||
<td>Data in Jinja2 Template</td> | ||
<td>Data type</td> | ||
</tr> | ||
<tr> | ||
<td>hadoop-data-node.storage_path</td> | ||
<td>com["hadoop-data-node"]["storage_path"]</td> | ||
<td>cluster_cfg["hadoop-data-node"]["storage_path"]</td> | ||
<td>Str</td> | ||
</tr> | ||
</table> |
What's the default value?
Why not have it contain the default config value while still giving users the flexibility to configure multiple paths?
storage_path: /datastorage
@YanjieGao
The default value is `cluster_config[common][data-path]/hdfs/data`, which might not be `/datastorage`. The logic is: if the admin gives a specific hdfs storage, use it; if not, use the global storage.
Ideally, if we allowed referencing other services' config in `yaml`, we could set `$cluster_config.common.data-path/hdfs/data` here, which would be clearer.
A user who only views this config file can't find where cluster.common.data-path is defined, which will be confusing, because not every user knows to search the GitHub code base for the cluster_config[common][storage] cluster object model config.
It is better to tell the user clearly where the default path is.
We should assume the user may only have the context of the current config file.
Actually, I don't think users should look here; the default value is for advanced users or devs. In our design, users should overwrite this value in `services-configuration.yaml`, which contains the necessary context. If we introduce a hard-coded path here (even only in comments), it will couple this file with other services.
I see. I misunderstood this as the end user's yaml file, and my intent was not to hard-code anything (the intent was to tell users that the default value can be found in this file's data-path config).
In my understanding, the default value is for non-advanced users; advanced users will know how to customize it.
Not a big problem. Could continue.