Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Clickhouse] add DateTime64 type support #24344

Merged

Conversation

Apidcloud
Copy link
Contributor

@Apidcloud Apidcloud commented Jan 9, 2025

Description

This adds support to the missing type DateTime64 from ClickHouse.

Motivation and Context

I noticed DateTime64 is not supported. As mentioned in #20958, DateTime64 supports milliseconds, whereas DateTime doesn't.

Impact

We are currently blocked from using prestodb because every entry we have in our tables has a column createdAt with the type DateTime64. Milliseconds make sense when we are dealing with user/system events.

Test Plan

Tests were added showing the creation of a table with a DateTime64 column (mapped from timestamp in prestodb), insertion of timestamps into a DateTime64 column with 0 or multiple millisecond digits, and finally the selection of them.

Release Notes

Clickhouse Connector Changes
* Add ``DateTime64`` type support. :pr:`24344`

@Apidcloud Apidcloud requested a review from a team as a code owner January 9, 2025 17:12
@Apidcloud Apidcloud requested a review from presto-oss January 9, 2025 17:12
Copy link

linux-foundation-easycla bot commented Jan 9, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: Apidcloud / name: Luís Fernandes (7a3feea)

Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to add some tests to verify the mapping works correctly.

Based on your PR description, it sounds like you may have been trying to create tables in clickhouse through presto? If so, you won't be able to use the same column type names that clickhouse uses. You'll need to use the mapped type. i.e. DateTime64 should map to timestamp. See TestClickhouseDistributedQueries for example tests.

@Apidcloud
Copy link
Contributor Author

Apidcloud commented Jan 9, 2025 via email

@steveburnett
Copy link
Contributor

Thanks for the release note! Minor rephrasing suggestion.

== RELEASE NOTES ==

Clickhouse Connector Changes
* Add ``DateTime64`` type support. :pr:`24344`

@Apidcloud
Copy link
Contributor Author

Changed the release notes. Today im still trying to add the tests. And while i was able to get the create table and insert to kind of work, the data that appears on clickhouse side is different than when I do it through the UI on :8080.

@Apidcloud
Copy link
Contributor Author

Somehow in the tests, this timestamp string (insert into t1 VALUES (timestamp '2025-01-08 12:35:59.079')) becomes 1736292959079 (Tuesday, 7 January 2025 23:35:59.079), but when I do it on the UI at :8080 the same timestamp string becomes 1736336159079 (Wednesday, 8 January 2025 11:35:59.079). How come?

@Apidcloud
Copy link
Contributor Author

Ah, it seems the timezone in tests is totally off. How can i change that? I'm based in Vienna, whereas tests are running in Bahia Mexico lol
ui-timezone
test-timezone

@ZacBlanco
Copy link
Contributor

ZacBlanco commented Jan 10, 2025

You don't need to change the time zone in the tests - it should be left as is. I would recommend using the presto-cli for local testing rather than the UI. You can set your session timezone by specifying a JVM argument to the CLI. If you want your testing environment to match the tests you can set the JVM parameter -Duser.timezone=America/Bahia_Banderas

@Apidcloud
Copy link
Contributor Author

Apidcloud commented Jan 10, 2025

I'm trying to convert the hour I want in UTC to the test timezone and use that in the insert statement. I assumed it would transform it to the UTC counterpart, but it went back almost 24h. What am I missing?

String tableT = "t1";

        // Get the current time zone
        String testTimeZone = TimeZone.getDefault().getID();

        ZonedDateTime originalTimestamp = ZonedDateTime.parse("2025-01-08T12:34:56.179Z", DateTimeFormatter.ISO_ZONED_DATE_TIME);
        ZonedDateTime adjustedTimestamp = originalTimestamp.withZoneSameInstant(ZoneId.of(testTimeZone));

        // adjustedTimestampString is correct: 2025-01-08 06:34:56.179"
        String adjustedTimestampString = adjustedTimestamp.format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS"));

        assertUpdate("CREATE TABLE " + tableT + " (ts timestamp not null)");
        assertUpdate("INSERT INTO " + tableT + " (ts) VALUES (timestamp '" + adjustedTimestampString + "')", 1);
        // After insert it is wrong. 2025-01-07 17:34:56.179 
        // In docker, the timezone is UTC within clickhouse and the entry shows 2025-01-07 17:34:56.179

@Apidcloud
Copy link
Contributor Author

Apidcloud commented Jan 10, 2025

After further checking, the test presto client session is defined in TestingSession.DEFAULT_TIME_ZONE_KEY as Pacific/Apia (UTC+13). But like this the select query is wrong, am I'm also wondering why. Specifically, why is it 6 hours more than the Pacific/Apia?
Screenshot 2025-01-10 at 22 25 47
Screenshot 2025-01-10 at 22 25 10

@Test
    public void testInsertAndSelectFromDateTimeTables() {
        String tableT = "t1";

        // this would get America/Bahia if we were to use it
        String testTimeZone = TimeZone.getDefault().getID();

        ZonedDateTime originalTimestamp = ZonedDateTime.parse("2025-01-08T12:34:56.179Z", DateTimeFormatter.ISO_ZONED_DATE_TIME);
        // presto client test session is set to Pacific/Apia
        ZonedDateTime adjustedTimestamp = originalTimestamp.withZoneSameInstant(ZoneId.of(TestingSession.DEFAULT_TIME_ZONE_KEY.getId()));

        // Pacific/Apia becomes 2025-01-09 01:34:56.179 (correct)
        // America/Bahia becomes 2025-01-08 06:34:56.179" (correct)
        String adjustedTimestampString = adjustedTimestamp.format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS"));

        assertUpdate("CREATE TABLE " + tableT + " (ts timestamp not null)");
        assertUpdate("INSERT INTO " + tableT + " (ts) VALUES (timestamp '" + adjustedTimestampString + "')", 1);
        // wrong for America/Bahia. 2025-01-07 17:34:56.179. After insert it's wrong. in docker, the timezone is UTC within clickhouse and the entry shows 2025-01-07 17:34:56.179
        // But correct when using Pacific/Apia! 2025-01-08 12:34:56.179
        
        assertQuery(
                "SELECT * FROM " + tableT + " LIMIT 100",
                "VALUES (timestamp  '2025-01-08T12:34:56.179')");
        // select is correct when using America/Bahia. Actual is exactly 2025-01-08T12:34:56.179
        // But wrong when using Pacific/Apia, even though the clickhouse server is now correct with 2025-01-08 12:34:56.179
        // Actual comes as 2025-01-09T07:34:56.179
    }

Comments in the source code that might hint something:
image

@Apidcloud Apidcloud force-pushed the fix/add-clickhouse-datetime64-support branch from af93218 to 3421111 Compare January 12, 2025 14:22
@Apidcloud Apidcloud requested a review from ZacBlanco January 12, 2025 15:23
@Apidcloud
Copy link
Contributor Author

Finally seems to be working. Can you tell me the process of creating the tar.gz prestodb file available for download, so that we can start using this version tomorrow?

@Apidcloud Apidcloud changed the title [Clickhouse] add datetime64 support [Clickhouse] add DateTime64 type support Jan 12, 2025
Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great improvement to the connector. Mostly just style nits. Mainly, I am concerned about the performance. I think it would be great to implement this without any string parsing. Have you attempted any other solutions?

Also, our clickhouse-jdbc client dependency is quite out of date. We might be able to upgrade it to the latest com.clickhouse:clickhouse-jdbc:0.7.2 which may provide more options for timestamps, though I haven't looked further into it.

@Apidcloud Apidcloud requested a review from ZacBlanco January 13, 2025 09:26
@Apidcloud
Copy link
Contributor Author

Apidcloud commented Jan 13, 2025

Thanks for the prompt review. Already pushed the changes.

I do agree the clickhouse jdbc is really outdated, and I tried clickhouse jdbc 0.3.2 at some point. But the server wouldn't start because of this:
Screenshot 2025-01-13 at 11 55 57
I can also remember some issues creating the tpch tables in TestclickHouseDistributedQueries but can't reproduce it again.

Either way, I would try to update clickhouse jdbc to the new org com.clickhouse 0.7.2 in a subsequent MR, as it would most likely have other problems.

Edit:
Also added a table to the docs

Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the doc! Nit of formatting in the header, the table looks great!

presto-docs/src/main/sphinx/connector/clickhouse.rst Outdated Show resolved Hide resolved
@Apidcloud
Copy link
Contributor Author

Apidcloud commented Jan 13, 2025

Can you retrigger the CI? It seems one of the test suites timed out

Edit: Thanks, it seems to be ok now

steveburnett
steveburnett previously approved these changes Jan 13, 2025
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull updated branch, new local doc build, looks good. Thanks!

Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM. two doc comments.

presto-docs/src/main/sphinx/connector/clickhouse.rst Outdated Show resolved Hide resolved
presto-docs/src/main/sphinx/connector/clickhouse.rst Outdated Show resolved Hide resolved
@Apidcloud Apidcloud force-pushed the fix/add-clickhouse-datetime64-support branch from c640cde to a3437bc Compare January 15, 2025 09:25
@Apidcloud
Copy link
Contributor Author

Rebased and squashed commits

Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit of phrasing. Thanks!

presto-docs/src/main/sphinx/connector/clickhouse.rst Outdated Show resolved Hide resolved
@Apidcloud Apidcloud force-pushed the fix/add-clickhouse-datetime64-support branch from a3437bc to 7a3feea Compare January 15, 2025 14:44
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull updated branch, new local doc build, looks good. Thanks!

Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this great work!

@Apidcloud Apidcloud requested a review from jaystarshot January 16, 2025 07:32
@tdcmeehan tdcmeehan merged commit d1710d3 into prestodb:master Jan 16, 2025
53 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants