-
I was going to propose that multiple devices should be stored with different deviceIds in IoTDB. I also think that one device will most probably send one measurement using a fixed measure, so ideally this should be implemented as a tag assigned to a measure. While IoTDB can generally handle an unbounded number of time series and an unbounded number of measurements for each of these, my gut feeling tells me that option 1 would be the better way to approach this.
-
First of all, thank you for getting this started. One concept you did not mention is the "Device Template": "IoTDB supports the device template function, enabling different entities of the same type to share metadata, reduce the memory usage of metadata, and simplify the management of numerous entities and measurements." +1 for using tags. This is IMHO the right path.
-
Actually, we're working on a relational model, but it may take several months to finish. For the current tree model of IoTDB, I suggest that you put all your tag dimensions' values into the path, like
Then, if you want to group the data for all dimensions, you can use
The only question for this
-
I'm currently working on supporting Apache IoTDB as an additional option to store our time series data in StreamPipes.
My experience with IoTDB is limited to reading the documentation, performing some small evaluations, and the implementation effort I've spent so far.
Thus, the following is based on my current understanding of IoTDB's concept and may therefore be erroneous or incomplete.
What is already possible
As of now, we have very rudimentary support for Apache IoTDB as time series storage in our development branch.
Using IoTDB as time series storage is possible by setting a set of environment variables.
As a result, all time series data persisted in StreamPipes is stored in IoTDB.
Under the hood, this works as follows:
Let's assume we have created a data stream in StreamPipes with the help of the machine data simulator and persist the data in StreamPipes into a measure called flowrate.
StreamPipes now has a so-called DataLakeMeasure with the name flowrate and the following schema:
timestamp: Long
sensorId: String
mass_flow: Float
volume_flow: Float
temperature: Float
density: Float
sensor_fault_flags: Boolean
Every event of our data stream is written into aligned time series of one device.
The device name in our scenario equals the name of the data lake measure (flowrate) and the following path is used within IoTDB: root.streampipes.flowrate.
This is what the data looks like:
In addition, the count mechanism that gives an overview of how many events exist per data lake measure is implemented as well:
Using the following query:
Well, so far so good. Getting here was straightforward, and IoTDB is easy to work with. It's a really cool piece of software!
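The mapping described above (one data lake measure, one aligned device under root.streampipes) can be sketched in a few lines of Python. This is only an illustration of the idea, not the StreamPipes implementation: the helper names `device_path`, `split_event`, and `count_query` are hypothetical, and the count statement is an assumption about what such a query could look like in IoTDB's SQL dialect.

```python
from __future__ import annotations

from typing import Any

# Path prefix as mentioned in the post.
STORAGE_PREFIX = "root.streampipes"


def device_path(measure_name: str) -> str:
    """Return the IoTDB device path for a data lake measure (hypothetical helper)."""
    return f"{STORAGE_PREFIX}.{measure_name}"


def split_event(event: dict[str, Any]) -> tuple[int, dict[str, Any]]:
    """Split an event into its timestamp and the values written as aligned measurements."""
    values = {name: value for name, value in event.items() if name != "timestamp"}
    return event["timestamp"], values


def count_query(measure_name: str) -> str:
    """Build a count statement for one measure (assumed query shape, not the original one)."""
    return f"SELECT COUNT(*) FROM {device_path(measure_name)}"
```

Since all fields of an event share one timestamp, writing them as aligned series of a single device keeps the one-to-one relationship between a data lake measure and its storage location.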
Where problems arise
The approach outlined above is straightforward and works well with StreamPipes.
However, there are two aspects we haven't considered yet that make things more challenging, and it is currently unclear how to support them:
The latter is not directly relevant, but if anyone has a viable suggestion, I'd really appreciate it.
So let's take a closer look at dimension properties.
In StreamPipes, dimension properties refer to event properties (event fields) that represent dimensions.
In the example given above, sensorId would typically be modeled as a dimension property.
As such, sensorId has a discrete value space containing values like flowrate01 and flowrate02.
Dimension properties allow users, e.g., to group data along the provided dimensions in the data explorer. See an example configuration from the StreamPipes UI in the screenshot below.
In general, there can be more than one dimension property, such as locationId and sensorId, and users can potentially change an existing adapter and remove or add a dimension.
If our flowrate data stream is persisted as above, there is, to the best of my knowledge, no way to group data based on sensorId, e.g., to calculate the average temperature per sensorId. Or is there anything I'm missing?
One possible solution to this is to split the data of our data stream into multiple devices in IoTDB and work with tags.
The idea is now to have a device per value of a dimension property, in this case flowrate.flowrate01 and flowrate.flowrate02.
This allows us to tag the corresponding time series, e.g., flowrate.flowrate01.temperature, with a corresponding tag: sensorId=flowrate01.
This gives us the desired capability of grouping values along dimensions:
Please excuse the different tag name and values in this screenshot, but I think you get the point.
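To make the intended result concrete, here is a minimal in-memory Python sketch of the device-per-dimension-value idea. It does not talk to IoTDB; it only shows how an event could be mapped to a device path and a tag set, and what "average temperature per sensorId" is expected to return. The function names are hypothetical, and the path layout is an assumption based on the examples above.

```python
from __future__ import annotations

from collections import defaultdict
from statistics import mean
from typing import Any


def device_for(measure: str, event: dict[str, Any], dimension: str) -> str:
    """Device path with one device per dimension value (assumed layout)."""
    return f"root.streampipes.{measure}.{event[dimension]}"


def tags_for(event: dict[str, Any], dimensions: list[str]) -> dict[str, str]:
    """Tags that would be attached to the time series of that device."""
    return {dimension: str(event[dimension]) for dimension in dimensions}


def average_by_tag(events: list[dict[str, Any]], dimension: str, field: str) -> dict[str, float]:
    """In-memory equivalent of grouping an average along one tag."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for event in events:
        buckets[str(event[dimension])].append(event[field])
    return {tag_value: mean(values) for tag_value, values in buckets.items()}
```

For two events of flowrate01 with temperatures 20.0 and 22.0 and one event of flowrate02 with 30.0, `average_by_tag(events, "sensorId", "temperature")` yields one average per sensor, which is the grouping the data explorer needs.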
As an alternative, we could also imagine not using tags and relying on the GROUP BY LEVEL statement instead.
While this allows us to calculate aggregations based on dimension values, it also has some downsides.
First of all, it would break with the current relationship of a data lake measure having one measurement in the time series storage.
Modeling the data as described here could result in a huge number of time series, since there must be one for each combination of dimension property values.
In addition, it would make our queries more complex since we are not able to directly query all data for a data lake measure or to count all records for one data lake measure without further computations.
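Both the level-based grouping and its cost can be illustrated with a small Python sketch. It assumes that the dimension value sits at a fixed level of the path (root being level 0, so flowrate01 is at level 3 in root.streampipes.flowrate.flowrate01.temperature) and that only series of a single measurement are passed in. The function names are hypothetical, and the series count is an upper bound that assumes every combination of dimension values actually occurs.

```python
from __future__ import annotations

from collections import defaultdict
from math import prod
from statistics import mean


def average_by_level(series: dict[str, list[float]], level: int) -> dict[str, float]:
    """Merge all series that share the same path node at `level` and average them."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for path, values in series.items():
        node = path.split(".")[level]  # level 0 is "root"
        grouped[node].extend(values)
    return {node: mean(values) for node, values in grouped.items()}


def max_series_count(dimension_cardinalities: dict[str, int], measurement_count: int) -> int:
    """Upper bound: one device per combination of dimension values, times its measurements."""
    return prod(dimension_cardinalities.values()) * measurement_count
```

With 2 sensor IDs, 3 location IDs, and the 5 non-dimension fields of the flowrate schema, `max_series_count({"sensorId": 2, "locationId": 3}, 5)` gives 30 time series for a single data lake measure, and the number multiplies with every additional dimension.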
What are your thoughts about these considerations?
Do I use the concepts in the right way?
Are there any alternatives with which we could achieve the same results?
What if an adapter is modified? E.g., a dimension field is removed?
What should be possible in the end
Beyond the scenario above, we require the following functionalities to be performed via queries against IoTDB. They may raise further considerations or issues and should ideally be taken into account when finding a solution:
Select *)