diff --git a/docs/developer_tsdb_migration_guidelines.md b/docs/developer_tsdb_migration_guidelines.md
index 1017a156227..edd5a947e46 100644
--- a/docs/developer_tsdb_migration_guidelines.md
+++ b/docs/developer_tsdb_migration_guidelines.md
@@ -19,21 +19,27 @@ Integration is one of the biggest sources of input data to elasticsearch. Enabli
# Steps for migrating an existing package
-1. **Datastream having type `logs` can be excluded from TSDB migration.**
+1. **Datastreams of type `logs` are excluded from TSDB migration.**
+2. **Set the `kibana.version` condition to `^8.8.0` in the `manifest.yml` file of the package.**
+ ```
+ conditions:
+ kibana.version: "^8.8.0"
+ ```
2. **Add the changes to the manifest.yml file of the datastream as below to enable the timeseries index mode**
```
elasticsearch:
index_mode: "time_series"
```
- If your datastream has more number of dimension fields, you can modify this limit by modifying index.mapping.dimension_fields.limit value as below
+ If your datastream needs more dimension fields than the default allows, raise the limit by setting `index.mapping.dimension_fields.limit` as shown below. The default [limit](https://github.com/elastic/elasticsearch/blob/6417a4f80f32ace48b8ad682ad46b19b57e49d60/server/src/main/java/org/elasticsearch/index/mapper/MapperService.java#L114) is 21.
```
elasticsearch:
index_mode: "time_series"
index_template:
settings:
- # Defaults to 16
+ # Defaults to 21
index.mapping.dimension_fields.limit: 32
```
+
3. **Identifying the dimensions in the datastream.**
Read about dimension fields [here](https://www.elastic.co/guide/en/elasticsearch/reference/current/tsds.html#time-series-dimension). It is important that dimensions or a set of dimensions that are part of a datastream uniquely identify a timeseries. Dimensions are used to form _tsid which then is used for routing and index sorting. Read about the ways to add field a dimension [here](https://github.com/elastic/integrations/blob/main/docs/generic_guidelines.md#specify-dimensions])
@@ -46,40 +52,51 @@ Integration is one of the biggest sources of input data to elasticsearch. Enabli
From the context of integrations that are related to products that are deployed on-premise, there exist certain fields that are part of every package and they are potential candidates of becoming dimension fields
- * host.ip
- * service.address
- * agent.id
+ * `host.name`
+ * `service.address`
+ * `agent.id`
+ * `container.id`
+
+ For products that are capable of running both on-premise and in a public cloud environment (by being deployed on public cloud virtual machines), it is recommended to annotate the ECS fields listed below as dimension fields.
+ * `host.name`
+ * `service.address`
+ * `container.id`
+ * `cloud.account.id`
+ * `cloud.provider`
+ * `cloud.region`
+ * `cloud.availability_zone`
+ * `agent.id`
+ * `cloud.instance.id`
+
+ For products operating as managed services within cloud providers like AWS, Azure, and GCP, it is advised to label the fields listed below as dimension fields.
+ * `cloud.account.id`
+ * `cloud.region`
+ * `cloud.availability_zone`
+ * `cloud.provider`
+ * `agent.id`
- When metrics are collected from a resource running in the cloud or in a container, certain fields are potential candidates of becoming dimension fields
-
- * host.ip
- * service.address
- * agent.id
- * cloud.project.id
- * cloud.instance.id
- * cloud.provider
- * container.id
-
- *Warning: Choosing an insufficient number of dimension fields may lead to data loss*
-
- *Hint: Fields having type [keyword](https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html#keyword-field-type) in your datastream are very good candidates of becoming dimension fields*
-
4. **Annotating the integration specific fields as dimension**
`files.yml` file has the field mappings specific to a datastream of an integration. This step is needed when the dimension fields in ECS is not sufficient enough to create a unique [_tsid](https://www.elastic.co/guide/en/elasticsearch/reference/current/tsds.html#tsid) value for the documents stored in elasticsearch. Annotate the field with `dimension: true` to tag the field as dimension field.
+ Add an inline comment before the dimension annotation that explains why the field was chosen as a dimension field.
+
```
- name: wait_class
type: keyword
- description: Every wait event belongs to a class of wait events.
+ # Multiple events are generated based on the values of wait_class. Hence, it is a dimension
dimension: true
+ description: Every wait event belongs to a class of wait events.
```
*Notes:*
- * *There exists a limit on how many dimension fields can have. By default this value is 16. Out of this, 8 are reserved for ecs fields.*
+ * *There is a limit on how many dimension fields a datastream can have. By default this value is [21](https://github.com/elastic/elasticsearch/blob/6417a4f80f32ace48b8ad682ad46b19b57e49d60/server/src/main/java/org/elasticsearch/index/mapper/MapperService.java#L114).*
* *Dimension keys have a hard limit of 512b. Documents are rejected if this limit is reached.*
- * *Dimension values have a hard limit of 1024b. Documents are rejected if this limit is reached*
+ * *Dimension values have a hard limit of 1024b. Documents are rejected if this limit is reached.*
+
+ **Warning:** Choosing an insufficient number of dimension fields may lead to data loss.
+
+ **Hint:** Fields of type [keyword](https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html#keyword-field-type) in your datastream are very good candidates for becoming dimension fields.
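
The warning above can be checked offline before a migration. The following is a minimal sketch, not part of the integrations tooling: it assumes sample documents are available as flat Python dictionaries with dotted keys, and the helper name `find_dimension_collisions` and the sample values are hypothetical. It reports every combination of timestamp and dimension values that is shared by more than one document, which suggests the chosen dimensions do not uniquely identify a timeseries.

```python
from collections import Counter


def find_dimension_collisions(docs, dimension_fields, timestamp_field="@timestamp"):
    """Return {(timestamp, dim values...): count} for keys seen more than once.

    Hypothetical helper: docs are assumed to be flat dicts with dotted keys.
    """
    keys = Counter(
        (doc.get(timestamp_field),) + tuple(doc.get(field) for field in dimension_fields)
        for doc in docs
    )
    return {key: count for key, count in keys.items() if count > 1}


# Illustrative sample documents (values are made up).
docs = [
    {"@timestamp": "2023-05-01T00:00:00Z", "host.name": "db1",
     "wait_class": "User I/O", "total_waits": 10},
    {"@timestamp": "2023-05-01T00:00:00Z", "host.name": "db1",
     "wait_class": "Commit", "total_waits": 3},
]

# With only host.name as a dimension, both documents share the same key.
print(find_dimension_collisions(docs, ["host.name"]))
# Adding wait_class as a dimension makes every key unique.
print(find_dimension_collisions(docs, ["host.name", "wait_class"]))
```

A non-empty result for a candidate set of dimensions is a signal to annotate an additional field, as `wait_class` is annotated in the example above.
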
5. **Annotating Metric Types values for all applicable fields**
@@ -104,7 +121,7 @@ Integration is one of the biggest sources of input data to elasticsearch. Enabli
- After migration, verify if the dashboard is rendering the data properly. If certain visualisation do not work, consider migrating to [Lens](https://www.elastic.co/guide/en/kibana/current/lens.html)
- Certain aggregation functions are not supported when a field is having a metric_type ‘counter’. Example avg(). Replace such aggregation functions with a supported aggregation type such as max().
+ Certain aggregation functions, for example `avg()`, are not supported on a field whose metric_type is `counter`. Replace such aggregation functions with a supported aggregation type such as `max()` or `min()`.
- It is recommended to compare the number of documents within a certain time frame before enabling the TSDB and after enabling TSDB index mode. If the count differs, please check if there exists a field that is not annotated as dimension field.
@@ -124,10 +141,6 @@ A field that holds millions of unique values may not be an ideal candidate for b
**Identification of Write Index**: When mappings are modified for a datastream, index rollover happens and a new index is created under the datastream. Even if there exists a new index, the data continues to go to the old index until the timestamp matches `index.time_series.start_time` of the newly created index.
-**Automatic Rollover**: Automatic datastream rollover does not happen when fields are tagged and untagged as dimensional fields. Also, automatic datastream rollover does not happen when the value of index.mapping.dimension_fields.limit is modified.
-
-When a package upgrade with the above mentiond change is applied, the changes are made only on the index template. This means, the user need to wait until `index.time_series.end_time` of the current write index before seeing the change, following a package upgrade.
-
An enhancement [request](https://github.com/elastic/kibana/issues/150549) for Kibana is created to indicate the write index. Until then, refer to the index.time_series.start_time of indices and compare with the current time to identify the write index.
*Hint: In the Index Management UI, against a specific index, if the docs count column values regularly increase for an Index, it can be considered as the write index*
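
The comparison described above can be sketched as follows. This is an illustration only, under stated assumptions: the `index.time_series.start_time` and `index.time_series.end_time` values have already been read from the index settings and are supplied as ISO 8601 strings, and the index names, timestamps, and the helper `find_write_index` are hypothetical. The sketch returns the index whose time range contains the given time.

```python
from datetime import datetime, timezone


def parse_ts(value):
    # Accept ISO 8601 strings with a trailing "Z" (UTC).
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def find_write_index(index_ranges, now=None):
    """Return the index whose [start_time, end_time) range contains `now`.

    index_ranges maps index name -> (start_time, end_time) as ISO strings.
    Returns None if no range contains `now`.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    for name, (start, end) in index_ranges.items():
        if parse_ts(start) <= now < parse_ts(end):
            return name
    return None


# Hypothetical index names and time ranges, for illustration only.
index_ranges = {
    "example-index-000001": ("2023-05-01T00:00:00Z", "2023-05-01T02:00:00Z"),
    "example-index-000002": ("2023-05-01T02:00:00Z", "2023-05-01T04:00:00Z"),
}

# Before the new index's start_time, the old index still receives the data.
print(find_write_index(index_ranges, now=datetime(2023, 5, 1, 1, 30, tzinfo=timezone.utc)))
# Once the time reaches the new index's start_time, the new index takes over.
print(find_write_index(index_ranges, now=datetime(2023, 5, 1, 2, 0, tzinfo=timezone.utc)))
```
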
@@ -142,6 +155,10 @@ Reference : https://github.com/elastic/elasticsearch/issues/93539
- Currently, there are several limits around the number of dimensions.
Reference : https://github.com/elastic/elasticsearch/issues/93564
+- Other known issues: https://github.com/elastic/integrations/issues/5233. Refer to the sections named New Issues Identified and TSDB Issues reported earlier.
+
# Reference to existing package already migrated
-Oracle integration TSDB enablement: [PR Link](https://github.com/elastic/integrations/pull/5307)
+- [Oracle integration](https://github.com/elastic/integrations/tree/main/packages/oracle)
+- [Redis integration](https://github.com/elastic/integrations/tree/main/packages/redis)
+- [AWS Redshift integration](https://github.com/elastic/integrations/tree/main/packages/aws/data_stream/redshift)