Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clickhouse connector incorrectly transforms Timestamp constants to DateTime literals #20958

Open
Dnnd opened this issue Sep 25, 2023 · 2 comments
Labels

Comments

@Dnnd
Copy link

Dnnd commented Sep 25, 2023

Presto-clickhouse connector transforms a TIMESTAMP WITH TIME ZONE constant like from_iso8601_timestamp('2023-09-20T00:00:00+09:00') to TIMESTAMP '2023-09-19 15:00:00.000' literal in native Clickhouse SQL. TIMESTAMP in Clickhouse is an alias to the DateTime type, which doesn't offer sub-second precision. This leads to a parsing error in Clickhouse.

Your Environment

  • Presto version used: 0.283
  • Data source and connector used: presto-clickhouse
  • Deployment (Cloud or On-prem): on-perm
  • Clickhouse server version: 23.6.2

A Clickhouse table with the following schema (native Clickhouse DDL syntax):

CREATE TABLE default.t
(
    `ts` DateTime,
    `value` Float32
)
ENGINE = MergeTree
ORDER BY ts

Presto-clickhouse connector is configured like this:

connector.name=clickhouse
clickhouse.connection-url=jdbc:clickhouse://localhost:8123
clickhouse.connection-user=default

Expected behavior

I expect the following query to complete without errors (Presto SQL):

select count(*) 
FROM clickhouse.default.t 
where ts >= from_iso8601_timestamp('2023-09-20T00:00:00+09:00');

Current Behavior

Query returns an error:

Query 20230925_102132_00013_842zf failed: ClickHouse exception, code: 1002, host: localhost, port: 8123; Code: 6. DB::Exception: Cannot parse string '2023-09-19 15:00:00.000' as DateTime: syntax error at position 19 (parsed just '2023-09-19 15:00:00'): While processing SELECT count() FROM default.t WHERE ts >= toDateTime('2023-09-19 15:00:00.000'). (CANNOT_PARSE_TEXT) (version 23.6.2.18 (official build))
Stacktrace
2023-09-25T19:21:33.041+0900    ERROR   SplitRunner-15-131      com.facebook.presto.execution.executor.TaskExecutor     Error processing Split 20230925_102132_00013_842zf.1.0.0.0-0 com.facebook.presto.plugin.clickhouse.ClickHouseSplit@140367c (start = 3.9098605869769E7, wall = 21 ms, cpu = 0 ms, wait = 0 ms, calls = 1): JDBC_ERROR: ClickHouse exception, code: 1002, host: localhost, port: 8123; Code: 6. DB::Exception: Cannot parse string '2023-09-19 15:00:00.000' as DateTime: syntax error at position 19 (parsed just '2023-09-19 15:00:00'): While processing SELECT count() FROM default.t WHERE ts >= toDateTime('2023-09-19 15:00:00.000'). (CANNOT_PARSE_TEXT) (version 23.6.2.18 (official build))

2023-09-25T19:21:33.059+0900    ERROR   remote-task-callback-20 com.facebook.presto.execution.StageExecutionStateMachine        Stage execution 20230925_102132_00013_842zf.1.0 failed
com.facebook.presto.spi.PrestoException: ClickHouse exception, code: 1002, host: localhost, port: 8123; Code: 6. DB::Exception: Cannot parse string '2023-09-19 15:00:00.000' as DateTime: syntax error at position 19 (parsed just '2023-09-19 15:00:00'): While processing SELECT count() FROM default.t WHERE ts >= toDateTime('2023-09-19 15:00:00.000'). (CANNOT_PARSE_TEXT) (version 23.6.2.18 (official build))

        at com.facebook.presto.plugin.clickhouse.ClickHouseRecordCursor.handleSqlException(ClickHouseRecordCursor.java:236)
        at com.facebook.presto.plugin.clickhouse.ClickHouseRecordCursor.<init>(ClickHouseRecordCursor.java:95)
        at com.facebook.presto.plugin.clickhouse.ClickHouseRecordSet.cursor(ClickHouseRecordSet.java:59)
        at com.facebook.presto.spi.RecordPageSource.<init>(RecordPageSource.java:40)
        at com.facebook.presto.split.RecordPageSourceProvider.createPageSource(RecordPageSourceProvider.java:48)
        at com.facebook.presto.spi.connector.ConnectorPageSourceProvider.createPageSource(ConnectorPageSourceProvider.java:52)
        at com.facebook.presto.split.PageSourceManager.createPageSource(PageSourceManager.java:80)
        at com.facebook.presto.operator.TableScanOperator.getOutput(TableScanOperator.java:263)
        at com.facebook.presto.operator.Driver.processInternal(Driver.java:428)
        at com.facebook.presto.operator.Driver.lambda$processFor$9(Driver.java:311)
        at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:732)
        at com.facebook.presto.operator.Driver.processFor(Driver.java:304)
        at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1079)
        at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:165)
        at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:603)
        at com.facebook.presto.$gen.Presto_0_283_1fa586a____20230925_100436_1.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: ru.yandex.clickhouse.except.ClickHouseUnknownException: ClickHouse exception, code: 1002, host: localhost, port: 8123; Code: 6. DB::Exception: Cannot parse string '2023-09-19 15:00:00.000' as DateTime: syntax error at position 19 (parsed just '2023-09-19 15:00:00'): While processing SELECT count() FROM default.t WHERE ts >= toDateTime('2023-09-19 15:00:00.000'). (CANNOT_PARSE_TEXT) (version 23.6.2.18 (official build))

        at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.getException(ClickHouseExceptionSpecifier.java:91)
        at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:55)
        at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:28)
        at ru.yandex.clickhouse.ClickHouseStatementImpl.checkForErrorAndThrow(ClickHouseStatementImpl.java:875)
        at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:616)
        at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:117)
        at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:100)
        at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:95)
        at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:90)
        at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.executeQuery(ClickHousePreparedStatementImpl.java:110)
        at com.facebook.presto.plugin.clickhouse.ClickHouseRecordCursor.<init>(ClickHouseRecordCursor.java:92)
        ... 17 more
Caused by: java.lang.Throwable: Code: 6. DB::Exception: Cannot parse string '2023-09-19 15:00:00.000' as DateTime: syntax error at position 19 (parsed just '2023-09-19 15:00:00'): While processing SELECT count() FROM default.t WHERE ts >= toDateTime('2023-09-19 15:00:00.000'). (CANNOT_PARSE_TEXT) (version 23.6.2.18 (official build))

        at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:53)
        ... 26 more

Possible Solution

I believe that this happens because of the incorrect handling of TimestampWithTimeZoneType /Timestamp constants in ClickHousePushdownUtils.java:

if (type instanceof TimestampType) {
return getTimestampLiteralAsString(session, (long) node.getValue());
}
if (type instanceof TimestampWithTimeZoneType) {
return getTimestampLiteralAsString(session, new SqlTimestampWithTimeZone((long) node.getValue()).getMillisUtc());
}

getTimestampLiteralAsString produces literals like TIMESTAMP '2023-09-25 00:00:00.000':

private static String getTimestampLiteralAsString(ConnectorSession session, long millisUtc)
{
SqlTimestamp sqlTimestamp = new SqlTimestamp(millisUtc, MILLISECONDS);
return "TIMESTAMP '" + sqlTimestamp.toString() + "'";
}

ClickHouse transforms TIMESTAMP '2023-09-25 00:00:00.000' to toDateTime('2023-09-25 00:00:00.000'), which triggers a parsing error.

Switching from type casting (TIMESTAMP '...') to parseDateTimeBestEffort function works for me so far:

   private static String getTimestampLiteralAsString(ConnectorSession session, long millisUtc)
    {
        SqlTimestamp sqlTimestamp = new SqlTimestamp(millisUtc, MILLISECONDS);
        return "parseDateTimeBestEffort('" + sqlTimestamp.toString() + "')";
    }
@tdcmeehan
Copy link
Contributor

It looks like toDateTime doesn't support milliseconds in ClickHouse. It might make sense to truncate the milliseconds value when the underlying type is DateTime (as ClickHouse doesn't have support for milliseconds in DateTime 1), and use toDateTime64 when the underlying type is DateTime642.

CC: @ellison840611

Footnotes

  1. https://clickhouse.com/docs/en/sql-reference/data-types/datetime

  2. https://clickhouse.com/docs/en/sql-reference/data-types/datetime64

@Apidcloud
Copy link
Contributor

Apidcloud commented Jan 8, 2025

I also just noticed prestodb doesn't show a column of type DateTime64(3) from clickhouse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: 🆕 Unprioritized
Development

No branches or pull requests

3 participants