(master) reading parquet file from url does not work #34147

Closed
alexey-milovidov opened this issue Jan 29, 2022 · 0 comments · Fixed by #34405
Labels: potential bug (to be reviewed by developers and confirmed/rejected)

Describe what's wrong

ip-172-31-13-60.eu-central-1.compute.internal :) SELECT count() FROM url('https://datasets.clickhouse.com/hits.parquet')

SELECT count()
FROM url('https://datasets.clickhouse.com/hits.parquet')

Query id: 5946ca89-f28e-4eac-a820-daf421de65f0


Thread 1 "clickhouse" received signal SIGSEGV, Segmentation fault.
std::__1::basic_streambuf<char, std::__1::char_traits<char> >::xsputn (this=0x7fffffff5650, __s=0x17d68000 <error: Cannot access memory at address 0x17d68000>, __n=140737338223488) at ../contrib/libcxx/include/streambuf:464
464     ../contrib/libcxx/include/streambuf: No such file or directory.
(gdb) bt
#0  std::__1::basic_streambuf<char, std::__1::char_traits<char> >::xsputn (this=0x7fffffff5650, __s=0x17d68000 <error: Cannot access memory at address 0x17d68000>, __n=140737338223488) at ../contrib/libcxx/include/streambuf:464
#1  0x000000000a6ea291 in std::__1::basic_streambuf<char, std::__1::char_traits<char> >::sputn (this=0x7fffffff5650, __s=0x17d473c8 <arrow::global_state+8> "XW\366\a", __n=140737338223488) at ../contrib/libcxx/include/streambuf:229
#2  std::__1::__pad_and_output<char, std::__1::char_traits<char> > (__s=..., __s@entry=..., __ob=<optimized out>, __ob@entry=0x17d473c8 <arrow::global_state+8> "XW\366\a", __op=0x17d473c8 <arrow::global_state+8> "XW\366\a", 
    __oe=__oe@entry=0x80000ee19f48 <error: Cannot access memory at address 0x80000ee19f48>, __iob=..., __fl=<optimized out>) at ../contrib/libcxx/include/locale:1407
#3  0x000000000a6ea0c1 in std::__1::__put_character_sequence<char, std::__1::char_traits<char> > (__os=..., __str=0x17d473c8 <arrow::global_state+8> "XW\366\a", __len=<optimized out>) at ../contrib/libcxx/include/ostream:729
#4  0x00000000152eb239 in std::__1::operator<< <char, std::__1::char_traits<char>, std::__1::allocator<char> > (__os=..., __str=...) at ../contrib/libcxx/include/ostream:1056
#5  Poco::Net::HTTPBasicCredentials::authenticate (this=0x7fffffff6238, request=...) at ../contrib/poco/Net/src/HTTPBasicCredentials.cpp:96
#6  0x00000000103a506b in DB::detail::ReadWriteBufferFromHTTPBase<std::__1::shared_ptr<DB::UpdatableSession> >::call (this=this@entry=0x7ffff7065100, uri_=..., response=..., method_=...) at ../src/IO/ReadWriteBufferFromHTTP.h:176
#7  0x00000000103a43ed in DB::detail::ReadWriteBufferFromHTTPBase<std::__1::shared_ptr<DB::UpdatableSession> >::initialize (this=this@entry=0x7ffff7065100) at ../src/IO/ReadWriteBufferFromHTTP.h:290
#8  0x00000000103a2945 in DB::detail::ReadWriteBufferFromHTTPBase<std::__1::shared_ptr<DB::UpdatableSession> >::nextImpl (this=0x7ffff7065100) at ../src/IO/ReadWriteBufferFromHTTP.h:397
#9  0x000000000a6fed29 in DB::ReadBuffer::next (this=0x7ffff7065100) at ../src/IO/ReadBuffer.h:62
#10 DB::ReadBuffer::eof (this=0x7ffff7065100) at ../src/IO/ReadBuffer.h:96
#11 DB::ReadBuffer::read (this=0x7ffff7065100, to=0x7ffff084b900 "", n=65536) at ../src/IO/ReadBuffer.h:173
#12 DB::ReadBuffer::readBig (this=0x7ffff7065100, to=0x7ffff084b900 "", n=65536) at ../src/IO/ReadBuffer.h:200
#13 0x00000000134032b1 in DB::RandomAccessFileFromSeekableReadBuffer::Read (this=<optimized out>, nbytes=140737229052340, out=0x0) at ../src/Processors/Formats/Impl/ArrowBufferedStreams.cpp:85
#14 0x000000001340338b in DB::RandomAccessFileFromSeekableReadBuffer::Read (this=0x7ffff701dd08, nbytes=65536) at ../src/Processors/Formats/Impl/ArrowBufferedStreams.cpp:91
#15 0x0000000015d16dfe in arrow::io::RandomAccessFile::ReadAt(long, long) ()
#16 0x0000000015a17292 in parquet::SerializedFile::ParseMetaData() ()
#17 0x0000000015a14820 in parquet::ParquetFileReader::Contents::Open(std::__1::shared_ptr<arrow::io::RandomAccessFile>, parquet::ReaderProperties const&, std::__1::shared_ptr<parquet::FileMetaData>) ()
#18 0x0000000015a14e93 in parquet::ParquetFileReader::Open(std::__1::shared_ptr<arrow::io::RandomAccessFile>, parquet::ReaderProperties const&, std::__1::shared_ptr<parquet::FileMetaData>) ()
#19 0x000000001593c0b7 in parquet::arrow::FileReaderBuilder::Open(std::__1::shared_ptr<arrow::io::RandomAccessFile>, parquet::ReaderProperties const&, std::__1::shared_ptr<parquet::FileMetaData>) ()
#20 0x000000001593c43d in parquet::arrow::OpenFile(std::__1::shared_ptr<arrow::io::RandomAccessFile>, arrow::MemoryPool*, std::__1::unique_ptr<parquet::arrow::FileReader, std::__1::default_delete<parquet::arrow::FileReader> >*) ()
#21 0x00000000134941d6 in DB::getFileReaderAndSchema (in=..., file_reader=..., schema=..., format_settings=..., is_stopped=...) at ../src/Processors/Formats/Impl/ParquetBlockInputFormat.cpp:121
#22 0x0000000013494642 in DB::ParquetSchemaReader::readSchema (this=<optimized out>) at ../src/Processors/Formats/Impl/ParquetBlockInputFormat.cpp:168
#23 0x00000000133c206d in DB::readSchemaFromFormat(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::optional<DB::FormatSettings> const&, std::__1::function<std::__1::unique_ptr<DB::ReadBuffer, std::__1::default_delete<DB::ReadBuffer> > ()>, std::__1::shared_ptr<DB::Context const>, std::__1::unique_ptr<DB::ReadBuffer, std::__1::default_delete<DB::ReadBuffer> >&) (format_name=..., format_settings=..., read_buffer_creator=..., context=..., buf_out=...) at ../src/Formats/ReadSchemaUtils.cpp:49
#24 0x00000000133c2e08 in DB::readSchemaFromFormat(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::optional<DB::FormatSettings> const&, std::__1::function<std::__1::unique_ptr<DB::ReadBuffer, std::__1::default_delete<DB::ReadBuffer> > ()>, std::__1::shared_ptr<DB::Context const>) (format_name=..., format_settings=..., read_buffer_creator=..., context=...) at ../src/Formats/ReadSchemaUtils.cpp:65
#25 0x0000000012f45d3d in DB::IStorageURLBase::getTableStructureFromData (format=..., uri=..., compression_method=..., headers=..., format_settings=..., context=...) at ../src/Storages/StorageURL.cpp:104
#26 0x0000000012f458a0 in DB::IStorageURLBase::IStorageURLBase (this=this@entry=0x7ffff711e800, uri_=..., context_=..., table_id_=..., format_name_=..., format_settings_=..., columns_=..., constraints_=..., comment=..., compression_method_=..., headers_=..., http_method_=..., 
    partition_by_=...) at ../src/Storages/StorageURL.cpp:67
#27 0x0000000012f49524 in DB::StorageURL::StorageURL (this=0x7ffff711e800, uri_=..., table_id_=..., format_name_=..., format_settings_=..., columns_=..., constraints_=..., comment=..., context_=..., compression_method_=..., headers_=..., http_method_=..., partition_by_=...)
    at ../src/Storages/StorageURL.cpp:534
#28 0x0000000011d5366f in shared_ptr_helper<DB::StorageURL>::create<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::StorageID, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::nullopt_t const&, DB::ColumnsDescription const&, DB::ConstraintsDescription, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<DB::Context const>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::vector<std::__1::tuple<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::tuple<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&> (args=..., 
    args=..., args=..., args=..., args=..., args=..., args=..., args=..., args=..., args=..., args=...) at ../base/base/../base/shared_ptr_helper.h:17
#29 0x0000000011d52de3 in DB::TableFunctionURL::getStorage (this=this@entry=0x7ffff0800518, source=..., format_=..., columns=..., global_context=..., table_name=..., compression_method_=...) at ../src/TableFunctions/TableFunctionURL.cpp:69
#30 0x0000000011d52210 in DB::ITableFunctionFileLike::executeImpl (this=0x7ffff0800518, context=..., table_name=...) at ../src/TableFunctions/ITableFunctionFileLike.cpp:98
#31 0x00000000122fa958 in DB::ITableFunction::execute (this=0x7ffff0800518, ast_function=..., context=..., table_name=..., cached_columns=..., use_global_context=<optimized out>) at ../src/TableFunctions/ITableFunction.cpp:26
#32 0x000000001240b1ab in DB::Context::executeTableFunction (this=<optimized out>, table_expression=...) at ../src/Interpreters/Context.cpp:1039
#33 0x00000000129ce8f8 in DB::JoinedTables::getLeftTableStorage (this=<optimized out>) at ../src/Interpreters/JoinedTables.cpp:195
#34 0x00000000127b5190 in DB::InterpreterSelectQuery::InterpreterSelectQuery (this=0x7ffff080b000, query_ptr_=..., context_=..., input_pipe_=..., storage_=..., options_=..., required_result_column_names=..., metadata_snapshot_=..., prepared_sets_=...)
    at ../src/Interpreters/InterpreterSelectQuery.cpp:316
#35 0x00000000127b4714 in DB::InterpreterSelectQuery::InterpreterSelectQuery (this=0x0, query_ptr_=..., context_=..., options_=..., required_result_column_names_=...) at ../src/Interpreters/InterpreterSelectQuery.cpp:158
#36 0x00000000129aca1f in std::__1::make_unique<DB::InterpreterSelectQuery, std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context>&, DB::SelectQueryOptions&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&> (__args=..., __args=..., __args=..., __args=...) at ../contrib/libcxx/include/memory:2068
#37 DB::InterpreterSelectWithUnionQuery::buildCurrentChildInterpreter (this=this@entry=0x7ffff6c3f8e0, ast_ptr_=..., current_required_result_column_names=...) at ../src/Interpreters/InterpreterSelectWithUnionQuery.cpp:218
#38 0x00000000129ab684 in DB::InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery (this=<optimized out>, query_ptr_=..., context_=..., options_=..., required_result_column_names=...) at ../src/Interpreters/InterpreterSelectWithUnionQuery.cpp:140
#39 0x0000000012766514 in std::__1::make_unique<DB::InterpreterSelectWithUnionQuery, std::__1::shared_ptr<DB::IAST>&, std::__1::shared_ptr<DB::Context>&, DB::SelectQueryOptions const&> (__args=..., __args=..., __args=...) at ../contrib/libcxx/include/memory:2068
#40 0x0000000012765838 in DB::InterpreterFactory::get (query=..., context=..., options=...) at ../src/Interpreters/InterpreterFactory.cpp:120
#41 0x0000000012b62640 in DB::executeQueryImpl (begin=begin@entry=0x7ffff6c3b0b0 "SELECT count() FROM url('https://datasets.clickhouse.com/hits.parquet')", end=<optimized out>, context=..., internal=false, stage=DB::QueryProcessingStage::Complete, istr=0x0)
    at ../src/Interpreters/executeQuery.cpp:632
#42 0x0000000012b60f0d in DB::executeQuery (query=..., context=..., internal=false, stage=DB::QueryProcessingStage::Complete) at ../src/Interpreters/executeQuery.cpp:985
#43 0x0000000013316f2c in DB::LocalConnection::sendQuery (this=this@entry=0x7ffff7040a00, query=..., query_id=..., stage=2) at ../src/Client/LocalConnection.cpp:94
#44 0x00000000132d7964 in DB::ClientBase::processOrdinaryQuery (this=this@entry=0x7fffffffc4c0, query_to_execute=..., parsed_query=...) at ../src/Client/ClientBase.cpp:645
#45 0x00000000132d6f61 in DB::ClientBase::processParsedSingleQuery (this=this@entry=0x7fffffffc4c0, full_query=..., query_to_execute=..., parsed_query=..., echo_query_=..., report_error=<optimized out>) at ../src/Client/ClientBase.cpp:1315
#46 0x00000000132d6805 in DB::ClientBase::processTextAsSingleQuery (this=this@entry=0x7fffffffc4c0, full_query=...) at ../src/Client/ClientBase.cpp:610
#47 0x00000000132ddcff in DB::ClientBase::processQueryText (this=this@entry=0x7fffffffc4c0, text=...) at ../src/Client/ClientBase.cpp:1467
#48 0x00000000132df0fc in DB::ClientBase::runInteractive (this=<optimized out>) at ../src/Client/ClientBase.cpp:1602
#49 0x000000000a7a8c50 in DB::LocalServer::main (this=0x7fffffffc4c0) at ../programs/local/LocalServer.cpp:479
#50 0x0000000015327aa6 in Poco::Util::Application::run (this=0x7fffffffc4c0) at ../contrib/poco/Util/src/Application.cpp:334
#51 0x000000000a7aea60 in mainEntryClickHouseLocal (argc=1, argv=0x7ffff7001170) at ../programs/local/LocalServer.cpp:812
#52 0x000000000a6e9ae1 in main (argc_=<optimized out>, argv_=<optimized out>) at ../programs/main.cpp:378
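
For context on frames #15 and #16: the Parquet reader opens a file by reading its tail first, which is why schema inference over `url()` goes through `RandomAccessFile::ReadAt` and then into the HTTP read buffer. Below is a minimal sketch of that tail lookup, based on the published Parquet file layout (a 4-byte little-endian metadata length followed by the `PAR1` magic at the very end of the file). It is not ClickHouse or Arrow code, the name `footer_range` is hypothetical, and it makes no claim about the cause of the segfault in frame #5.

```python
import struct
from typing import Tuple

PARQUET_MAGIC = b"PAR1"  # magic for files with an unencrypted footer


def footer_range(file_size: int, tail: bytes) -> Tuple[int, int]:
    """Return (offset, length) of the serialized file metadata.

    `tail` must be the last 8 bytes of the file:
    a 4-byte little-endian metadata length, then the 4-byte magic.
    """
    if len(tail) != 8 or tail[4:] != PARQUET_MAGIC:
        raise ValueError("tail does not end with the Parquet magic")
    (meta_len,) = struct.unpack("<I", tail[:4])
    offset = file_size - 8 - meta_len
    # The first 4 bytes of the file hold the leading magic,
    # so the metadata cannot start before offset 4.
    if offset < 4:
        raise ValueError("metadata length is larger than the file")
    return offset, meta_len


# Hypothetical 1000-byte file whose metadata block is 100 bytes long.
tail = struct.pack("<I", 100) + PARQUET_MAGIC
print(footer_range(1000, tail))
```

A reader that only has an HTTP URL would need two ranged reads to use this: one for the last 8 bytes and one for the `(offset, length)` range the function returns.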