zclllyybb committed Dec 18, 2024
1 parent a0d5805 commit 5c83b2d
Showing 18 changed files with 105 additions and 82 deletions.
19 changes: 8 additions & 11 deletions docs/table-design/data-partitioning/auto-partitioning.md
Original file line number Diff line number Diff line change
Expand Up @@ -245,35 +245,32 @@ It can be concluded that the partitions created by Auto Partition share the same
## Conjunct with Dynamic Partition
Doris supports both Auto and Dynamic Partition. In this case, both functions are in effect:
1. Auto Partition will automatically create partitions on demand during data import;
2. Dynamic Partition will automatically create, recycle and dump partitions.
The two syntaxes do not conflict; just set the corresponding clauses and properties together.
Doris supports both Auto and Dynamic Partition. In this case, the functions of both work together on the table:
### Best Practice
1. Auto Partition will automatically create partitions on demand during data import;
2. Dynamic Partition will reclaim and dump partitions as it normally does.
In scenarios where you need to limit the partition lifecycle, you can **disable partition creation by Dynamic Partition and leave it entirely to Auto Partition**, while Dynamic Partition's reclamation of expired partitions manages the partition lifecycle:
The `dynamic_partition.create_method` property must be set to `AUTO`. With this setting, the `dynamic_partition.end` property is ignored and partition creation is taken over by Auto Partition.
```sql
create table auto_dynamic(
k0 datetime(6) NOT NULL
)
auto partition by range (date_trunc(k0, 'year'))
(
)
()
DISTRIBUTED BY HASH(`k0`) BUCKETS 2
properties(
"dynamic_partition.enable" = "true",
"dynamic_partition.prefix" = "p",
"dynamic_partition.start" = "-50",
"dynamic_partition.end" = "0", --- Dynamic Partition does not create partitions
"dynamic_partition.end" = "123", --- Will be ignored
"dynamic_partition.time_unit" = "year",
"dynamic_partition.create_method" = "AUTO",
"replication_num" = "1"
);
```
This way we have both the flexibility of Auto Partition and consistency in partition names.
This adds partition lifecycle management on top of Auto Partition while keeping partition names consistent.
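As a rough way to observe this behavior, the following sketch loads one row into the `auto_dynamic` table from the example above and then lists its partitions. The inserted value is an arbitrary illustration, and the expectation stated in the comments follows from the description above rather than from verified output.

```sql
-- Illustrative check against the auto_dynamic table defined above.
-- Loading a row should make Auto Partition create the enclosing yearly
-- partition on demand (the timestamp is an arbitrary example value).
INSERT INTO auto_dynamic VALUES ('2024-12-18 00:00:00');

-- List the partitions. With create_method = AUTO, Dynamic Partition is
-- expected not to have pre-created any future partitions.
SHOW PARTITIONS FROM auto_dynamic;
```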
## Partition Management
Expand Down
3 changes: 2 additions & 1 deletion docs/table-design/data-partitioning/basic-concepts.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -228,7 +228,8 @@ PROPERTIES
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "month", --- Both must have the same granularity
"dynamic_partition.start" = "-2", --- Dynamic Partition automatically cleans up partitions that are more than two months old
"dynamic_partition.end" = "0", --- Dynamic Partition does not create future partitions; creation is left entirely to Auto Partition.
"dynamic_partition.create_method" = "AUTO", --- Required. Partition creation is handled entirely by Auto Partition
"dynamic_partition.end" = "0", --- Will be ignored.
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "8"
);
Expand Down
4 changes: 4 additions & 0 deletions docs/table-design/data-partitioning/dynamic-partitioning.md
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,10 @@ The rules of dynamic partition are prefixed with `dynamic_partition.`:

Whether to enable the dynamic partition feature. Can be specified as `TRUE` or `FALSE`. If not specified, the default is `TRUE`. If it is `FALSE`, Doris ignores the table's dynamic partitioning rules.

- `dynamic_partition.create_method`

Whether partitions are created by Dynamic Partition or by Auto Partition. The default is `SCHEDULE`, meaning Dynamic Partition creates them. When Auto Partition is enabled on the same table, you must explicitly set this property to `AUTO`; Dynamic Partition then stops creating partitions for the table, and Auto Partition takes over partition creation.

- `dynamic_partition.time_unit` (required parameter)

The unit for dynamic partition scheduling. Can be specified as `HOUR`, `DAY`, `WEEK`, `MONTH`, or `YEAR`, meaning partitions are created or deleted by hour, day, week, month, or year, respectively.
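For contrast with the `AUTO` setting, here is a minimal sketch of a table that uses Dynamic Partition alone. The table name `dyn_only` and all property values are illustrative assumptions, and writing `SCHEDULE` explicitly is assumed to be equivalent to omitting the property, since it is described above as the default.

```sql
-- Hypothetical table using Dynamic Partition only (no AUTO PARTITION clause).
CREATE TABLE dyn_only (
    k1 DATE NOT NULL
)
PARTITION BY RANGE(k1) ()
DISTRIBUTED BY HASH(k1) BUCKETS 2
PROPERTIES (
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "DAY",
    "dynamic_partition.start" = "-7",   -- reclaim partitions older than 7 days
    "dynamic_partition.end" = "3",      -- pre-create partitions for the next 3 days
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.create_method" = "SCHEDULE",  -- assumed same as the default
    "replication_num" = "1"
);
```

With `SCHEDULE`, the `end` value matters because Dynamic Partition itself creates the future partitions; with `AUTO`, the description above says `end` is ignored.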
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -245,35 +245,32 @@ mysql> show partitions from `DAILY_TRADE_VALUE`;
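## Conjunct with Dynamic Partition
Doris supports using Auto Partition and Dynamic Partition together. In this case, both features take effect:
1. Auto Partition automatically creates partitions on demand during data import;
2. Dynamic Partition automatically creates, reclaims, and dumps partitions.
The two syntaxes do not conflict; just set the corresponding clauses and properties together.
Doris supports using Auto Partition and Dynamic Partition together. In this case, both features act on the table together:
### Best Practice
1. Auto Partition automatically creates partitions on demand during data import;
2. Dynamic Partition reclaims and dumps partitions as it normally does.
In scenarios where you need to limit the partition lifecycle, you can **disable partition creation by Dynamic Partition and leave it entirely to Auto Partition**, while Dynamic Partition's reclamation of expired partitions manages the partition lifecycle:
The `dynamic_partition.create_method` property must be set to `AUTO`. With this setting, the `dynamic_partition.end` property is ignored and partition creation is handled entirely by Auto Partition.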
```sql
create table auto_dynamic(
k0 datetime(6) NOT NULL
)
auto partition by range (date_trunc(k0, 'year'))
(
)
()
DISTRIBUTED BY HASH(`k0`) BUCKETS 2
properties(
"dynamic_partition.enable" = "true",
"dynamic_partition.prefix" = "p",
"dynamic_partition.start" = "-50",
"dynamic_partition.end" = "0", --- Dynamic Partition does not create partitions
"dynamic_partition.end" = "123", --- Will be ignored
"dynamic_partition.time_unit" = "year",
"dynamic_partition.create_method" = "AUTO",
"replication_num" = "1"
);
```
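This way we have both the flexibility of Auto Partition and consistency in partition names.
This adds partition lifecycle management on top of Auto Partition while keeping partition names consistent.
## Partition Management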
Expand Down
Original file line number Diff line number Diff line change
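Expand Up @@ -229,7 +229,8 @@ PROPERTIES
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "month", --- Both must have the same granularity
"dynamic_partition.start" = "-2", --- Dynamic Partition automatically cleans up historical partitions older than two months
"dynamic_partition.end" = "0", --- Dynamic Partition does not create future partitions; creation is left entirely to Auto Partition
"dynamic_partition.create_method" = "AUTO", --- Required. Partition creation is handled entirely by Auto Partition
"dynamic_partition.end" = "123", --- Will be ignored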
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "8"
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,10 @@ under the License.

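Whether to enable the dynamic partition feature. Can be specified as `TRUE` or `FALSE`. If not specified, the default is `TRUE`. If it is `FALSE`, Doris ignores the table's dynamic partitioning rules.

- `dynamic_partition.create_method`

Whether partitions are created by Dynamic Partition or by Auto Partition. The default is `SCHEDULE`, meaning Dynamic Partition creates them. When Auto Partition is enabled on the same table, you must explicitly set this property to `AUTO`; Dynamic Partition then stops creating partitions for the table, and Auto Partition takes over partition creation.

- `dynamic_partition.time_unit` **(required parameter)**

The unit for dynamic partition scheduling. Can be specified as `HOUR`, `DAY`, `WEEK`, `MONTH`, or `YEAR`, meaning partitions are created or deleted by hour, day, week, month, or year, respectively.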
Expand Down Expand Up @@ -354,7 +358,7 @@ Doris FE 中有固定的 dynamic partition 控制线程,持续以特定时间

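Therefore, after automatic maintenance by the system, an automatically partitioned table ends up in the following state:
1. Before the `START` time, the table contains **no** partitions except those in the ranges specified by `reserved_history_periods`;
2. After the `END` time, all **manually created** partitions are retained.
2. After the `END` time, all **existing** partitions are retained.
3. Apart from partitions that were manually dropped or accidentally lost, the table contains all partitions within a **specific range**:
- If `create_history_partition` is `true`,
- If `history_partition_num` is defined, the **specific range** is `[max(START, current time - history_partition_num * time_unit), END]`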
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -245,38 +245,35 @@ mysql> show partitions from `DAILY_TRADE_VALUE`;
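## Conjunct with Dynamic Partition
Since 2.1.7, Doris supports using Auto Partition and Dynamic Partition together. In this case, both features take effect:
1. Auto Partition automatically creates partitions on demand during data import;
2. Dynamic Partition automatically creates, reclaims, and dumps partitions.
The two syntaxes do not conflict; just set the corresponding clauses and properties together.
Since 2.1.7, Doris supports using Auto Partition and Dynamic Partition together. Since 2.1.8, both features act on the table together:
## Best Practice
1. Auto Partition automatically creates partitions on demand during data import;
2. Dynamic Partition reclaims and dumps partitions as it normally does.
In scenarios where you need to limit the partition lifecycle, you can **disable partition creation by Dynamic Partition and leave it entirely to Auto Partition**, while Dynamic Partition's reclamation of expired partitions manages the partition lifecycle:
The `dynamic_partition.create_method` property must be set to `AUTO`. With this setting, the `dynamic_partition.end` property is ignored and partition creation is handled entirely by Auto Partition.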
```sql
create table auto_dynamic(
k0 datetime(6) NOT NULL
)
auto partition by range (date_trunc(k0, 'year'))
(
)
()
DISTRIBUTED BY HASH(`k0`) BUCKETS 2
properties(
"dynamic_partition.enable" = "true",
"dynamic_partition.prefix" = "p",
"dynamic_partition.start" = "-50",
"dynamic_partition.end" = "0", --- Dynamic Partition does not create partitions
"dynamic_partition.end" = "123", --- Will be ignored
"dynamic_partition.time_unit" = "year",
"dynamic_partition.create_method" = "AUTO",
"replication_num" = "1"
);
```
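This way we have both the flexibility of Auto Partition and consistency in partition names.
This adds partition lifecycle management on top of Auto Partition while keeping partition names consistent.
:::note
In some versions earlier than 2.1.7, this feature was not prohibited, but using it is not recommended.
In some versions earlier than 2.1.7, this feature was not prohibited, but using it is not recommended. In 2.1.7, the `end` property takes effect normally and there is no `create_method` property. From 2.1.8 onward, the behavior changes for `end` and `create_method` take effect.
:::
## Partition Management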
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -203,7 +203,7 @@ PROPERTIES
<p>

:::tip
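This feature is supported since 2.1.7.
This feature is supported since 2.1.7. The behavior changes for `end` and `create_method` described below take effect from 2.1.8.
:::

Auto Partition and Dynamic Partition each have their own advantages. Combining the two gives flexible on-demand partition creation together with automatic reclamation: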
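Expand Down Expand Up @@ -233,7 +233,8 @@ PROPERTIES
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "month", --- Both must have the same granularity
"dynamic_partition.start" = "-2", --- Dynamic Partition automatically cleans up historical partitions older than two months
"dynamic_partition.end" = "0", --- Dynamic Partition does not create future partitions; creation is left entirely to Auto Partition
"dynamic_partition.create_method" = "AUTO", --- Required. Partition creation is handled entirely by Auto Partition
"dynamic_partition.end" = "123", --- Will be ignored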
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "8"
);
Expand Down Expand Up @@ -358,7 +359,7 @@ ALTER TABLE example_range_tbl ADD PARTITION p201704 VALUES LESS THAN("2020-05-0

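## Partition Retrieval

The `partitions` table function (supported since 2.1.5) and the `information_schema.partitions` system table (supported since 2.1.7) record the cluster's partition information. When managing partitions automatically, you can extract partition information from these tables:
The `partitions` table function and the `information_schema.partitions` system table record the cluster's partition information. When managing partitions automatically, you can extract partition information from these tables: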

```sql
--- Find the partition that a given value belongs to in an Auto Partition table
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,10 @@ under the License.

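Whether to enable the dynamic partition feature. Can be specified as `TRUE` or `FALSE`. If not specified, the default is `TRUE`. If it is `FALSE`, Doris ignores the table's dynamic partitioning rules.

- `dynamic_partition.create_method` (supported since 2.1.8)

Whether partitions are created by Dynamic Partition or by Auto Partition. The default is `SCHEDULE`, meaning Dynamic Partition creates them. When Auto Partition is enabled on the same table, you must explicitly set this property to `AUTO`; Dynamic Partition then stops creating partitions for the table, and Auto Partition takes over partition creation.

- `dynamic_partition.time_unit` **(required parameter)**

The unit for dynamic partition scheduling. Can be specified as `HOUR`, `DAY`, `WEEK`, `MONTH`, or `YEAR`, meaning partitions are created or deleted by hour, day, week, month, or year, respectively.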
Expand Down Expand Up @@ -218,7 +222,7 @@ under the License.
p20210523
```

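### Example
## Example

1. The partition column k1 of table tbl1 is of type DATE. Create a dynamic partition rule that partitions by day, keeps only the partitions of the last 7 days, and pre-creates partitions for the next 3 days.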

Expand Down Expand Up @@ -354,7 +358,7 @@ Doris FE 中有固定的 dynamic partition 控制线程,持续以特定时间

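Therefore, after automatic maintenance by the system, an automatically partitioned table ends up in the following state:
1. Before the `START` time, the table contains **no** partitions except those in the ranges specified by `reserved_history_periods`;
2. After the `END` time, all **manually created** partitions are retained.
2. After the `END` time, all **existing** partitions are retained.
3. Apart from partitions that were manually dropped or accidentally lost, the table contains all partitions within a **specific range**:
- If `create_history_partition` is `true`,
- If `history_partition_num` is defined, the **specific range** is `[max(START, current time - history_partition_num * time_unit), END]`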
Expand Down Expand Up @@ -386,7 +390,7 @@ p20200521: ["2020-05-21", "2020-05-22")

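If the partition granularity is then changed to MONTH, the system tries to create a partition with range `["2020-05-01", "2020-06-01")`, but this range conflicts with existing partitions, so it cannot be created. The partition with range `["2020-06-01", "2020-07-01")` can be created normally. Therefore, the partitions for the period from 2020-05-22 to 2020-05-30 must be filled in manually.

## View Dynamic Partition Table Scheduling Status
### View Dynamic Partition Table Scheduling Status

The following command shows the scheduling status of all dynamic partition tables in the current database: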
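
The command itself falls outside this hunk. As a sketch, Doris provides the `SHOW DYNAMIC PARTITION TABLES` statement for this purpose; the database name `example_db` below is a hypothetical placeholder.

```sql
-- Scheduling status of all dynamic partition tables in the current database.
SHOW DYNAMIC PARTITION TABLES;

-- The same, for a specific database (example_db is a placeholder name).
SHOW DYNAMIC PARTITION TABLES FROM example_db;
```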

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -245,35 +245,36 @@ mysql> show partitions from `DAILY_TRADE_VALUE`;
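## Conjunct with Dynamic Partition
Since 3.0.3, Doris supports using Auto Partition and Dynamic Partition together. In this case, both features take effect:
1. Auto Partition automatically creates partitions on demand during data import;
2. Dynamic Partition automatically creates, reclaims, and dumps partitions.
The two syntaxes do not conflict; just set the corresponding clauses and properties together.
Since 3.0.3, Doris supports using Auto Partition and Dynamic Partition together. Since 3.0.4, both features act on the table together:
### Best Practice
1. Auto Partition automatically creates partitions on demand during data import;
2. Dynamic Partition reclaims and dumps partitions as it normally does.
In scenarios where you need to limit the partition lifecycle, you can **disable partition creation by Dynamic Partition and leave it entirely to Auto Partition**, while Dynamic Partition's reclamation of expired partitions manages the partition lifecycle:
The `dynamic_partition.create_method` property must be set to `AUTO`. With this setting, the `dynamic_partition.end` property is ignored and partition creation is handled entirely by Auto Partition.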
```sql
create table auto_dynamic(
k0 datetime(6) NOT NULL
)
auto partition by range (date_trunc(k0, 'year'))
(
)
()
DISTRIBUTED BY HASH(`k0`) BUCKETS 2
properties(
"dynamic_partition.enable" = "true",
"dynamic_partition.prefix" = "p",
"dynamic_partition.start" = "-50",
"dynamic_partition.end" = "0", --- Dynamic Partition does not create partitions
"dynamic_partition.end" = "123", --- Will be ignored
"dynamic_partition.time_unit" = "year",
"dynamic_partition.create_method" = "AUTO",
"replication_num" = "1"
);
```
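This way we have both the flexibility of Auto Partition and consistency in partition names.
This adds partition lifecycle management on top of Auto Partition while keeping partition names consistent.
:::note
In some versions earlier than 3.0.3, this feature was not prohibited, but using it is not recommended. In 3.0.3, the `end` property takes effect normally and there is no `create_method` property. From 3.0.4 onward, the behavior changes for `end` and `create_method` take effect.
:::
## Partition Management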
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -202,6 +202,10 @@ PROPERTIES
<TabItem value="自动分区+动态分区" label="自动分区+动态分区">
<p>

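:::tip
This feature is supported since 3.0.3. The behavior changes for `end` and `create_method` described below take effect from 3.0.4.
:::

Auto Partition and Dynamic Partition each have their own advantages. Combining the two gives flexible on-demand partition creation together with automatic reclamation: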

```sql
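Expand Down Expand Up @@ -229,7 +233,8 @@ PROPERTIES
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "month", --- Both must have the same granularity
"dynamic_partition.start" = "-2", --- Dynamic Partition automatically cleans up historical partitions older than two months
"dynamic_partition.end" = "0", --- Dynamic Partition does not create future partitions; creation is left entirely to Auto Partition
"dynamic_partition.create_method" = "AUTO", --- Required. Partition creation is handled entirely by Auto Partition
"dynamic_partition.end" = "123", --- Will be ignored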
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "8"
);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,10 @@ under the License.

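Whether to enable the dynamic partition feature. Can be specified as `TRUE` or `FALSE`. If not specified, the default is `TRUE`. If it is `FALSE`, Doris ignores the table's dynamic partitioning rules.

- `dynamic_partition.create_method` (supported since 3.0.4)

Whether partitions are created by Dynamic Partition or by Auto Partition. The default is `SCHEDULE`, meaning Dynamic Partition creates them. When Auto Partition is enabled on the same table, you must explicitly set this property to `AUTO`; Dynamic Partition then stops creating partitions for the table, and Auto Partition takes over partition creation.

- `dynamic_partition.time_unit` **(required parameter)**

The unit for dynamic partition scheduling. Can be specified as `HOUR`, `DAY`, `WEEK`, `MONTH`, or `YEAR`, meaning partitions are created or deleted by hour, day, week, month, or year, respectively.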
Expand All @@ -97,7 +101,6 @@ under the License.

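The starting offset of the dynamic partition, a negative number. Depending on the `time_unit` property and taking the current day (week/month) as the baseline, partitions whose range lies before this offset are deleted. If not specified, the default is `-2147483648`, meaning historical partitions are not deleted. Whether missing historical partitions between this offset and the current time are created depends on `dynamic_partition.create_history_partition`.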


:::caution
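Note that if the user sets `history_partition_num` (> 0), the starting partition for dynamic partition creation is `max(start, -history_partition_num)`, while deletion of historical partitions still retains partitions back to `start`, where `start < 0`.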
:::
Expand Down Expand Up @@ -355,7 +358,7 @@ Doris FE 中有固定的 dynamic partition 控制线程,持续以特定时间

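Therefore, after automatic maintenance by the system, an automatically partitioned table ends up in the following state:
1. Before the `START` time, the table contains **no** partitions except those in the ranges specified by `reserved_history_periods`;
2. After the `END` time, all **manually created** partitions are retained.
2. After the `END` time, all **existing** partitions are retained.
3. Apart from partitions that were manually dropped or accidentally lost, the table contains all partitions within a **specific range**:
- If `create_history_partition` is `true`,
- If `history_partition_num` is defined, the **specific range** is `[max(START, current time - history_partition_num * time_unit), END]`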
Expand Down Expand Up @@ -387,7 +390,7 @@ p20200521: ["2020-05-21", "2020-05-22")

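If the partition granularity is then changed to MONTH, the system tries to create a partition with range `["2020-05-01", "2020-06-01")`, but this range conflicts with existing partitions, so it cannot be created. The partition with range `["2020-06-01", "2020-07-01")` can be created normally. Therefore, the partitions for the period from 2020-05-22 to 2020-05-30 must be filled in manually.

## View Dynamic Partition Table Scheduling Status
### View Dynamic Partition Table Scheduling Status

The following command shows the scheduling status of all dynamic partition tables in the current database: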

Expand Down