Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement hive table engine #31104

Merged
merged 199 commits into from
Jan 14, 2022
Merged

Implement hive table engine #31104

merged 199 commits into from
Jan 14, 2022

Conversation

taiyang-li
Copy link
Contributor

@taiyang-li taiyang-li commented Nov 5, 2021

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Implement hive table engine to access apache hive from clickhouse. Related RFC: #29245

Detailed description / Documentation draft:
More detailed description, please refer to: docs/en/engines/table-engines/integrations/hive.md

Thanks for the fundamental work #16349 of @sundy-li

@robot-clickhouse robot-clickhouse added doc-alert pr-feature Pull request with new product feature labels Nov 5, 2021
@taiyang-li taiyang-li changed the title Implement hive table engine Implement hive table engine, which enable clickhouse query hive data Nov 5, 2021
@taiyang-li taiyang-li changed the title Implement hive table engine, which enable clickhouse query hive data Implement hive table engine Nov 5, 2021
@robot-ch-test-poll3 robot-ch-test-poll3 added the submodule changed At least one submodule changed in this PR. label Nov 5, 2021
@kssenii kssenii self-assigned this Nov 5, 2021
src/Interpreters/Context.h Outdated Show resolved Hide resolved
@lgbo-ustc
Copy link
Contributor

lgbo-ustc commented Jan 7, 2022

@kssenii
Copy link
Member

kssenii commented Jan 7, 2022

@kssenii It's strange, we cannot reproduce the failures on local for test_hive_query and 01273_arrow_dictionaries_load. The results are OK on local. Do you have any idea?

About failing functional test -- did you try build with sanitizer? As I see functional tests under release build are ok including the one you mentioned. So I think you just need to check under asan build. You can also download asan build from build check of this pr (in details).
And about failing integration test -- most likely it is just influenced by neighbouring tests, did you try to run it not as single test but all tests together in this test.py?

@kssenii we have used LRUResourceCache instead of LRUCache in ExternalDataSourceCache

Ok, I'll take a look next week, this week is still holidays in Russia and we are not working.

@kssenii I just fix two stateless test in this pr, in commit 63c07ff and e1a713a

👍🏻

M(650, SNAPPY_UNCOMPRESS_FAILED) \
M(651, SNAPPY_COMPRESS_FAILED) \
M(652, NO_HIVEMETASTORE) \
M(635, CANNOT_POLL) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please return correct order of error codes, it must be ascending.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

template <typename T>
struct TrivialLRUResourceCacheReleaseFunction
{
void operator()(std::shared_ptr<T>) { }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be make it noexcept?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@lgbo-ustc
Copy link
Contributor

@kssenii It's strange, we cannot reproduce the failures on local for test_hive_query and 01273_arrow_dictionaries_load. The results are OK on local. Do you have any idea?

About failing functional test -- did you try build with sanitizer? As I see functional tests under release build are ok including the one you mentioned. So I think you just need to check under asan build. You can also download asan build from build check of this pr (in details). And about failing integration test -- most likely it is just influenced by neighbouring tests, did you try to run it not as single test but all tests together in this test.py?

@kssenii we have used LRUResourceCache instead of LRUCache in ExternalDataSourceCache

Ok, I'll take a look next week, this week is still holidays in Russia and we are not working.

@kssenii I just fix two stateless test in this pr, in commit 63c07ff and e1a713a

👍🏻

I think the failure of 01273_arrow_dictionaries_load is related to some other bugs. It seems go wrong from exporting the data using arrow dictionary, at CHColumnToArrowColumn::chChunkToArrowTable

@kssenii
Copy link
Member

kssenii commented Jan 10, 2022

I think the failure of 01273_arrow_dictionaries_load is related to some other bugs. It seems go wrong from exporting the data using arrow dictionary, at CHColumnToArrowColumn::chChunkToArrowTable

But if the test started failing in this PR stably, then the reason in changes from this pr, and even if the changes are correct, the test need to be fixed anyway.
If it was a flacky test and it does not stably test in this pr then need to check it.

@lgbo-ustc
Copy link
Contributor

lgbo-ustc commented Jan 11, 2022

The bug is in arrowSchemaToCHHeader which is never called before this pr in the test case 01273_arrow_dictionaries_load. There is a nullptr accessed in readColumnWithNumericData at line 86,

std::shared_ptr<arrow::Buffer> buffer = chunk->data()->buffers[1];

the buffer is a nullptr.

@taiyang-li
Copy link
Contributor Author

taiyang-li commented Jan 11, 2022

@kssenii We need your help. Although we known the cause of this crash, but we have no idea how to fix it.

@lgbo-ustc
Copy link
Contributor

lgbo-ustc commented Jan 11, 2022

Not sure if this is OK as following, when buffer is nullptr, skip it. Although it works.

template <typename NumericType, typename VectorType = ColumnVector<NumericType>>
static ColumnWithTypeAndName readColumnWithNumericData(std::shared_ptr<arrow::ChunkedArray> & arrow_column, const String & column_name)
{
    auto internal_type = std::make_shared<DataTypeNumber<NumericType>>();
    auto internal_column = internal_type->createColumn();
    auto & column_data = static_cast<VectorType &>(*internal_column).getData();
    column_data.reserve(arrow_column->length());

    for (size_t chunk_i = 0, num_chunks = static_cast<size_t>(arrow_column->num_chunks()); chunk_i < num_chunks; ++chunk_i)
    {
        std::shared_ptr<arrow::Array> chunk = arrow_column->chunk(chunk_i);
        /// buffers[0] is a null bitmap and buffers[1] are actual values
        std::shared_ptr<arrow::Buffer> buffer = chunk->data()->buffers[1];
        if (!buffer)
        {
            if (chunk->length())
                throw Exception(ErrorCodes::BAD_ARGUMENTS, "Invalid arrow chunk. chunk->data()->buffers[1] is nullptr and chunk->length() > 0");
            // just skip this buffer
            continue;
        }

        const auto * raw_data = reinterpret_cast<const NumericType *>(buffer->data());
        column_data.insert_assume_reserved(raw_data, raw_data + chunk->length());
    }
    return {std::move(internal_column), std::move(internal_type), column_name};
}

@kssenii
Copy link
Member

kssenii commented Jan 11, 2022

The issue is fixed in #33529 thanks to @Avogar

@taiyang-li
Copy link
Contributor Author

taiyang-li commented Jan 14, 2022

@kssenii The failed stateless test is not introduced by this PR: 00900_long_parquet_load. It maybe caused by this bug: #32227

When arrow merged this pr: https://github.com/apache/arrow/pull/11855/files, 00900_long_parquet_load will be ok

@kssenii kssenii merged commit 5da673c into ClickHouse:master Jan 14, 2022
@xqliang
Copy link

xqliang commented Jan 24, 2022

Not support kerberos yet?

    <hdfs>
        <hadoop_kerberos_keytab>/path/to/keytab</hadoop_kerberos_keytab>
        <hadoop_kerberos_principal>xxx</hadoop_kerberos_principal>
        <hadoop_security_authentication>kerberos</hadoop_security_authentication>
        <libhdfs3_conf>/path/to/hdfs-site.xml</libhdfs3_conf>
    </hdfs>

@taiyang-li
Copy link
Contributor Author

@xqliang Not supported yet, we will implemented soon, maybe you can also participate in this work.

@yangshike
Copy link

does not supported hive with HA of hdfs?

SELECT *
FROM hive('thrift://xxx.xxx.xxx.xxx:9083', 'ods_qn', 'ods_prod_on_off_line_msg_es_df', 'bizid Nullable(String), corpid Nullable(Int32),time Nullable(Int64),reasontype Nullable(Int32),weworkid Nullable(Int64), type Nullable(Int8),pt String', pt)

Code: 210. DB::Exception: Received from localhost:9000. DB::Exception: Unable to connect to HDFS: InvalidParameter: Cannot create namenode proxy, does not contain host or port. (NETWORK_ERROR)

@taiyang-li
Copy link
Contributor Author

HDFS HA mode is supported. Can you create a new issue with detail information?

@yangshike
Copy link

#35515

@yangshike
Copy link

通过clickhouse 连接 ha hdfs 的没问题,现在有问题的是 clickhouse连接 hive 报错,提示hdfs ha问题

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
can be tested Allows running workflows for external contributors pr-feature Pull request with new product feature submodule changed At least one submodule changed in this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.