This issue was moved to a discussion.

Data Lake measurements endpoint returns no data series when no data between startDate and endDate #1329

Closed
bossenti opened this issue Feb 20, 2023 · 7 comments
Labels: backend, bug, wontfix

Comments

@bossenti
Contributor

Apache StreamPipes version

dev (current development state)

Affected StreamPipes components

Backend

What happened?

The endpoint for data lake measurements allows filtering the data records when querying a specific measurement.
To do so, you define a time window for which you want to get data by specifying startDate and endDate.
If you choose both parameters so that no records in the data lake match your criterion, the API returns no data series in the response, as if the whole measurement did not exist.

How to reproduce?

Query the API endpoint /api/v4/datalake/measurements/<identifier> and extend the query by ?startDate=valueStart&endDate=valueEnd.
Choose valueStart and valueEnd in such a way that there are no entries for the targeted measurement in the corresponding time frame.
As a result, the API returns a response with no data series, suggesting that the whole measurement does not exist.
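
The reproduction steps can be sketched as a small URL builder. This is only an illustration: the host `http://localhost:8030`, the measurement name `my-measurement`, and the numeric values for startDate and endDate are assumptions that do not come from the report. Only the path and the two parameter names are taken from the description above.

```python
from urllib.parse import quote, urlencode


def build_measurement_url(base_url: str, identifier: str,
                          start_date: int, end_date: int) -> str:
    """Build the data lake query URL for one measurement and time window."""
    # Percent-encode the identifier so it is safe as a single path segment.
    path = f"/api/v4/datalake/measurements/{quote(identifier, safe='')}"
    query = urlencode({"startDate": start_date, "endDate": end_date})
    return f"{base_url.rstrip('/')}{path}?{query}"


# Hypothetical values: choose a window that contains no records.
url = build_measurement_url("http://localhost:8030", "my-measurement", 10, 20)
print(url)
```

Sending a GET request to the resulting URL (with whatever authentication the installation requires) should reproduce the response without any data series.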

Expected behavior

Return the metadata of the data series but with an empty array for rows, so that it clearly indicates that the actual measurement exists but there is no data available for the given query.
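
The difference between the current and the expected response can be illustrated with a small classifier. This is a sketch under assumptions: the field names `allDataSeries` and `headers` are hypothetical placeholders for the response structure, and only `rows` is named in this report.

```python
def classify_response(response: dict) -> str:
    """Tell apart 'no series at all' from 'series present but empty'."""
    # 'allDataSeries' is an assumed field name, used here for illustration.
    series = response.get("allDataSeries") or []
    if not series:
        return "no data series"
    if all(not s.get("rows") for s in series):
        return "measurement exists, no rows in window"
    return "has data"


# Current behavior as described: the response carries no data series.
current = {"allDataSeries": []}
# Expected behavior: metadata is present, rows is an empty array.
expected = {"allDataSeries": [{"headers": ["time", "s1", "s2"], "rows": []}]}

print(classify_response(current))
print(classify_response(expected))
```

With the expected shape, a client can distinguish a missing measurement from an empty time window without issuing a second request.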

Additional technical information

No response

Are you willing to submit a PR?

None

bossenti added the bug and backend labels Feb 20, 2023
bossenti added this to the 1.0.0 milestone Feb 20, 2023
@SteveYurongSu
Member

I'd like to take a look :)

Please assign it to me @bossenti

@SteveYurongSu
Member

After some experiments over the past few days, I found out why the API returns a response suggesting that the whole measurement does not exist when there is no data between startDate and endDate:

The InfluxDB client always returns a response with no metadata when querying a timeline with no data between startDate and endDate (even if the timeline exists).

So the expected behavior seems difficult to achieve. I do have some rough ideas for a solution, but all of them require at least one additional InfluxDB query call in DataExplorerQueryV4#executeQuery, which may cause additional problems...

I'm not sure what to do next now. @bossenti What do you think? 😂😂
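
One possible shape of the "additional query" idea mentioned above is sketched below. It is not the StreamPipes implementation and not the approach the thread settled on. The `run_query` callable and the returned dictionary layout are hypothetical; the stub stands in for a real InfluxDB client. `SHOW FIELD KEYS` is an InfluxQL statement that lists a measurement's fields (tags are not included), and the time filter is simplified.

```python
from typing import Callable, Dict, List

# Hypothetical client interface: takes an InfluxQL string, returns row dicts.
QueryFn = Callable[[str], List[Dict]]


def query_with_metadata_fallback(run_query: QueryFn, measurement: str,
                                 start: int, end: int) -> Dict:
    """Query a time window; if it is empty, fetch field names separately."""
    data_query = (f'SELECT * FROM "{measurement}" '
                  f'WHERE time > {start} AND time < {end}')
    rows = run_query(data_query)
    if rows:
        headers = list(rows[0].keys())
        return {"headers": headers, "rows": [list(r.values()) for r in rows]}

    # Extra round trip: this is the added cost discussed in the thread.
    field_keys = run_query(f'SHOW FIELD KEYS FROM "{measurement}"')
    if not field_keys:
        return {}  # The measurement itself does not exist.
    headers = ["time"] + [f["fieldKey"] for f in field_keys]
    return {"headers": headers, "rows": []}


def fake_influx(query: str) -> List[Dict]:
    """Stub: the measurement has fields s1 and s2 but no rows in the window."""
    if query.startswith("SHOW FIELD KEYS"):
        return [{"fieldKey": "s1", "fieldType": "float"},
                {"fieldKey": "s2", "fieldType": "float"}]
    return []


result = query_with_metadata_fallback(fake_influx, "d", 10, 20)
print(result)
```

The trade-off is visible in the sketch: every empty window costs a second query, which is the kind of overhead and added complexity raised as a concern above.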

@bossenti
Contributor Author

Thanks for investigating @SteveYurongSu!

Hm, if that's the standard behavior of the InfluxDB client, this might be fine as it is 🤔

What would be the expected behavior for you?

@SteveYurongSu
Member

To check whether InfluxDB's behavior is reasonable, I looked at how IoTDB (v0.13.4) behaves when querying an existing timeline with no data between startDate and endDate, and found that it behaves quite differently.

The IoTDB server returned the metadata of the timeline with an empty array for rows in the result dataset. This clearly indicates that the timelines exist but that no data is available for the given query.

IoTDB's behavior is exactly what we expected. 🤔

Steps to check the behavior of IoTDB:

IoTDB> insert into root.sg.d(time, s1,s2) values (0, 1, 2)
Msg: The statement is executed successfully.
IoTDB> insert into root.sg.d(time, s1,s2) values (1, 1, 2)
Msg: The statement is executed successfully.
IoTDB> insert into root.sg.d(time, s1,s2) values (2, 1, 2)
Msg: The statement is executed successfully.
IoTDB> insert into root.sg.d(time, s1,s2) values (30, 1, 2)
Msg: The statement is executed successfully.
IoTDB> insert into root.sg.d(time, s1,s2) values (40, 1, 2)
Msg: The statement is executed successfully.
IoTDB> insert into root.sg.d(time, s1,s2) values (50, 1, 2)
Msg: The statement is executed successfully.
IoTDB> insert into root.sg.d(time, s1,s2) values (60, 1, 2)
Msg: The statement is executed successfully.
IoTDB> show timeseries
+------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+
|  timeseries|alias|storage group|dataType|encoding|compression|tags|attributes|deadband|deadband parameters|
+------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+
|root.sg.d.s1| null|      root.sg|   FLOAT| GORILLA|     SNAPPY|null|      null|    null|               null|
|root.sg.d.s2| null|      root.sg|   FLOAT| GORILLA|     SNAPPY|null|      null|    null|               null|
+------------+-----+-------------+--------+--------+-----------+----+----------+--------+-------------------+
Total line number = 2
It costs 0.042s
IoTDB> select ** from root
+-----------------------------+------------+------------+
|                         Time|root.sg.d.s1|root.sg.d.s2|
+-----------------------------+------------+------------+
|1970-01-01T08:00:00.000+08:00|         1.0|         2.0|
|1970-01-01T08:00:00.001+08:00|         1.0|         2.0|
|1970-01-01T08:00:00.002+08:00|         1.0|         2.0|
|1970-01-01T08:00:00.030+08:00|         1.0|         2.0|
|1970-01-01T08:00:00.040+08:00|         1.0|         2.0|
|1970-01-01T08:00:00.050+08:00|         1.0|         2.0|
|1970-01-01T08:00:00.060+08:00|         1.0|         2.0|
+-----------------------------+------------+------------+
Total line number = 7
It costs 0.073s
IoTDB> select s1, s2 from root.sg.d where time > 10 and time < 20
+----+------------+------------+
|Time|root.sg.d.s1|root.sg.d.s2|
+----+------------+------------+
+----+------------+------------+
Empty set.
It costs 0.103s
IoTDB> select s1, s2, s3 from root.sg.d where time > 10 and time < 20
+----+------------+------------+
|Time|root.sg.d.s1|root.sg.d.s2|
+----+------------+------------+
+----+------------+------------+
Empty set.
It costs 0.012s
IoTDB> select s1, s2, s3 from root.sg.d where time > 10
+-----------------------------+------------+------------+
|                         Time|root.sg.d.s1|root.sg.d.s2|
+-----------------------------+------------+------------+
|1970-01-01T08:00:00.030+08:00|         1.0|         2.0|
|1970-01-01T08:00:00.040+08:00|         1.0|         2.0|
|1970-01-01T08:00:00.050+08:00|         1.0|         2.0|
|1970-01-01T08:00:00.060+08:00|         1.0|         2.0|
+-----------------------------+------------+------------+
Total line number = 4
It costs 0.012s

On the other hand, I think it may be OK to keep InfluxDB's standard behavior as long as it does not affect the implementation of StreamPipes. What do you think? @bossenti

@bossenti
Contributor Author

bossenti commented Mar 1, 2023

Okay, good to know 🤔
Frankly, I would refrain from building something custom here to get the expected behavior, since we might replace InfluxDB in the future...
Let's start a discussion about this on the mailing list soon 🙂
So if you are fine with it @SteveYurongSu, I would close this as won't fix.

@tenthe
Contributor

tenthe commented Mar 1, 2023

+1 for starting a discussion about the pros and cons of InfluxDB and comparing it with IoTDB

@SteveYurongSu
Member

> So if you are fine with it @SteveYurongSu, I would close this as won't fix.

I'm fine with InfluxDB's behaviour. Let's close it as won't fix. 🙂 @bossenti

> +1 for starting a discussion about the pros and cons of InfluxDB and comparing it with IoTDB

+1 from my side too :) @tenthe

bossenti added the wontfix label Mar 1, 2023
bossenti closed this as completed Mar 1, 2023
apache locked and limited conversation to collaborators Mar 6, 2023
bossenti converted this issue into discussion #1394 Mar 6, 2023
bossenti modified the milestones: 1.0.0, 0.92.0 May 4, 2023
