Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document InfluxDB I/O procedures #75

Merged
merged 10 commits into from
May 12, 2024
Merged

Document InfluxDB I/O procedures #75

merged 10 commits into from
May 12, 2024

Conversation

matkuliak
Copy link
Contributor

@matkuliak matkuliak commented May 7, 2024

About

Main documentation for CrateDB Toolkit's InfluxDB I/O subsystem, based on influxio.

Preview

References

Copy link
Member

@amotl amotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this article. Apologies that I have quite a few suggestions to reshape it.

  1. Please design the tutorial to be self-contained for users approaching it without much ado, similar to what I had suggested on the MQTT tutorial.

    An option could be to use Docker across the board, even for invoking ctk load table, so readers/users will effectively not have to install anything but a container subsystem, because all software will be invoked there, both services and client tools.

  2. In the context of the CrateDB Guide, please omit the focus on influxio, but just reference it at the end of the article. If you want to write an article about influxio, it should be a dedicated and separate one, but also not copy its README.

  3. As I suggested to remove many fragments from this documentation page, it would be sad if they would be lost. So, I provided guidance over there where those updates should have been submitted to.

  4. The CI job currently fails, because influxdb.md needs to be added to another page's toctree.

    checking consistency... /path/to/docs/integrate/etl/influxdb.md: WARNING: document isn't included in any toctree
    

Comment on lines 15 to 29
(setup-influxio)=
## Setup

`influxio` can be installed with pip:

:::{code} console
pip install influxio
:::

You will also need the [cratedb-toolkit](https://github.com/crate-workbench/cratedb-toolkit/tree/main/cratedb_toolkit/io#installation)
to load the data into your CrateDB instance. It can be installed with:

:::{code} console
pip install --upgrade 'cratedb-toolkit[all]'
:::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When aiming to use CrateDB Toolkit, installing influxio separately is not needed. The right installation command would be pip install 'cratedb-toolkit[influxdb]'.

In general, in the main section of the article, I would omit all details about influxio, and only present the corresponding CLI command of CrateDB Toolkit.

At the end of the article, I would link to the documentation of influxio, conveying the message that it offers a few more commands which help in different scenarios like the one you've picked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, there is no pip install command inside. When aiming for conveying that tutorial "the Docker way", like proposed at #75 (comment), we will probably not need any of those. JFYI.

docs/integrate/etl/influxdb.md Outdated Show resolved Hide resolved
docs/integrate/etl/influxdb.md Outdated Show resolved Hide resolved
docs/integrate/etl/influxdb.md Show resolved Hide resolved
Comment on lines 36 to 44
:::{code} console
influx setup \
--username user1 \
--password 49jk8FQB$]1 \
--token token123 \
--org example \
--bucket testdrive \
--force
:::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That command would fail when no InfluxDB service is running.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can find corresponding commands to start the services using Docker on https://influxio.readthedocs.io/development.html.

docs/integrate/etl/influxdb.md Outdated Show resolved Hide resolved
docs/integrate/etl/influxdb.md Outdated Show resolved Hide resolved
@amotl amotl changed the title Adding docs on Influxio Document InfluxDB I/O procedures May 7, 2024
docs/integrate/etl/index.md Outdated Show resolved Hide resolved
@amotl

This comment was marked as resolved.

@matkuliak

This comment was marked as resolved.

@matkuliak
Copy link
Contributor Author

matkuliak commented May 7, 2024

It may be sensible to add a "start services" section here, which conveys how to run both services using Docker. We can't present it for each and every platform, so Docker just yields the best platform coverage.

Added the Docker variant, but this is only influxio right? That doesn't come with ctk or crash. Which are needed here:

export CRATEDB_SQLALCHEMY_URL=crate://crate@localhost:4200/testdrive/demo
ctk load table influxdb2://example:token123@localhost:8086/testdrive/demo

crash --command "SELECT * FROM testdrive.demo;"

So I should still instruct to install those, yes?

docs/integrate/etl/influxdb.md Outdated Show resolved Hide resolved
docs/integrate/etl/influxdb.md Outdated Show resolved Hide resolved
docs/integrate/etl/influxdb.md Outdated Show resolved Hide resolved
Comment on lines 57 to 60
:::{code} console
export CRATEDB_SQLALCHEMY_URL=crate://crate@localhost:4200/testdrive/demo
ctk load table influxdb2://example:token123@localhost:8086/testdrive/demo
:::
Copy link
Member

@amotl amotl May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bonus / Subsequent iteration: CrateDB Toolkit is also available per OCI image.
See also #75 (comment).

Comment on lines +64 to +66
:::{code} console
crash --command "SELECT * FROM testdrive.demo;"
:::
Copy link
Member

@amotl amotl May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the OCI image of Toolkit also includes crash out of the box.
See also #75 (comment).

@amotl
Copy link
Member

amotl commented May 7, 2024

influxio does not come with ctk or crash?

ctk pulls in influxio when installed correspondingly.

pip install 'cratedb-toolkit[influxdb]'

It should be included into the Docker image on behalf of the all extra, no?

$ alias influxio="docker run --rm -it ghcr.io/crate-workbench/cratedb-toolkit:latest influxio"
$ influxio --version
influxio, version 0.2.0
$ alias crash="docker run --rm -it ghcr.io/crate-workbench/cratedb-toolkit:latest crash"
$ crash --version
0.31.4

@amotl
Copy link
Member

amotl commented May 7, 2024

Please also use such aliases to provide optimal UX to readers of the documentation when it comes to invoking tools, and why not just everything. ;]

Those are aliases for all required programs in a nutshell, also alphanumerically sorted:

alias crash="docker run --rm -it ghcr.io/crate-workbench/cratedb-toolkit:latest crash"
alias ctk="docker run --rm -it ghcr.io/crate-workbench/cratedb-toolkit:latest ctk"
alias influx="docker run --rm -it --network=host influxdb:2 influx"
alias influxio="docker run --rm -it ghcr.io/crate-workbench/cratedb-toolkit:latest influxio"

Optionally, some (or just all?) commands would probably also need to use --network=host, like the influx one, to provide maximum conveniency to users.

Feel free to do anything of that on a later iteration, based on your energy and spirits. The current version is more than good enough to have it, and can also be improved later. Don't forget to squash your commits, into a single one with a meaningful message before merging. Thanks!

@matkuliak
Copy link
Contributor Author

That's good, thank you very much. Running it all on docker would be ideal for sure.
Will continue on this tomorrow and try to wrap it up.

@matkuliak
Copy link
Contributor Author

matkuliak commented May 8, 2024

Hi again. I think it should be good now. I did it a bit differently than in the docs, although it's still all docker. Looked easier to me this way. What do you think?

@matkuliak
Copy link
Contributor Author

matkuliak commented May 8, 2024

Copy link
Member

@amotl amotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonderful. Thanks!

@amotl amotl merged commit 244ba7d into main May 12, 2024
3 checks passed
@amotl amotl deleted the mm/influxdb-import branch May 12, 2024 00:19
@amotl
Copy link
Member

amotl commented May 12, 2024

Hi. I did not review the HTML preview, apologies. Apparently, there is something wrong on the folding/nesting of elements, see https://cratedb.com/docs/guide/integrate/etl/influxdb.html. Can I ask you to investigate and fix it?

@matkuliak matkuliak mentioned this pull request May 12, 2024
2 tasks
@matkuliak
Copy link
Contributor Author

I missed that too, thanks for checking. #78 will help

@amotl
Copy link
Member

amotl commented May 29, 2024

Sorry for eventually merging this too early. It looks like the commits have not been squashed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants