-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add csv importer #3795
add csv importer #3795
Conversation
5c61bcf
to
179ffdd
Compare
Enum.map(uploads, fn upload -> | ||
%{"filename" => filename, "s3_path" => s3_path} = upload | ||
|
||
".csv" = Path.extname(filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll revisit this after implementing imports. I think we'd do this sort of check there. If not, I'll make it into an ArgumentError to cancel the job early.
I'll need to add some documentation but the implementation itself and the tests are ready for a review. |
# uses https://min.io | ||
# docker run -d --rm -p 9000:9000 -p 9001:9001 --name minio minio/minio server /data --console-address ":9001" | ||
# docker exec minio mc alias set local http://localhost:9000 minioadmin minioadmin | ||
# docker exec minio mc mb local/imports |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments describe how to run MinIO locally for these tests. I'm using Docker for both MinIO and ClickHouse. ClickHouse accesses MinIO through Docker host (172.17.0.1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍 We'll most likely need to refinement because import date range will have to be known up front, but it can be tackled in a follow-up.
|
||
@moduletag :minio | ||
|
||
# uses https://min.io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also add minio setup to Makefile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the tests @cnkk suggested using testcontainers
But for development it definitely would be useful to update Makefile as well. I'll do it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added Makefile changes in eb905f3 but testcontainers seems a bit more difficult 34b7b0b
For some reason the previous approach of accessing MinIO via Docker host doesn't work anymore locally but keeps working in the CI.
- testcontainers reuse default bridge network, containers started by testcontainrers get 172.17.0.1/16 addresses, so the network is shared
- on CI ClickHouse container can access MinIO via the host on 172.17.0.1:mapped-port but cannot access it on its ip address and "real port" like 172.17.0.6:9000
- on my laptop (Docker Desktop) the situation is reversed: 172.17.0.6:9000 is accessible from ClickHouse, where 172.17.0.6 is the IP address for MinIO, but 172.17.0.1:mapped-port is not
- switching back to "manual" container created with
make minio
ClickHouse can access MinIO at 172.17.0.1:6000 in both CI and Docker Desktop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the CI tests pass, I think this can be left for later. I don't think anyone would be running minio
tests other than me and CI anyway :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, at least for me, tests work fine locally with testcontainer
based setup provided I setup CH via make clickhouse
.
%{"table" => table, "site_id" => site_id, "import_id" => import_id} | ||
) | ||
|
||
Date.range(min_date, max_date) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something that can be tackled in a follow-up, but eventually we'll need to always know import date range before executing it. We have to determine the boundaries up front so that imported stats do not overlap with native stats. We might also provide an option to let user adjust the import date boundaries.
7ba2c01
to
0c191cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✨
Left some minor stuff.
|
||
@moduletag :minio | ||
|
||
# uses https://min.io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, at least for me, tests work fine locally with testcontainer
based setup provided I setup CH via make clickhouse
.
Co-authored-by: Adrian Gruntkowski <[email protected]>
Changes
This PR implements the CSV importer from S3.
Tests
Changelog
Documentation
Dark mode