Add exec onramp/offramp #20
does not make sense.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation
As discussed in the meeting earlier today: let's define how stdout and stderr output from the exec commands is treated by the onramp. It could be useful to forward them as distinct outputs from the onramp, so that users can connect them differently for their pipeline needs.
Also, would be good to set the command executed as part of the event's origin uri. Users can access that metadata from tremor-script and use it as needed: https://docs.tremor.rs/tremor-script/functions/origin/.
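To make the stdout/stderr suggestion concrete, here is a minimal sketch using only `std::process`. The names `ExecOutput` and `run_capturing` are hypothetical and are not part of Tremor or of the RFC. It only shows that the two streams can be collected separately as raw bytes; how they would be routed to distinct onramp outputs is left open.

```rust
use std::io;
use std::process::{Command, Stdio};

/// Captured result of one command run (hypothetical type, for illustration only).
/// Both streams are kept as raw bytes so no encoding is assumed.
pub struct ExecOutput {
    pub stdout: Vec<u8>,
    pub stderr: Vec<u8>,
    pub exit_code: Option<i32>,
}

/// Runs `program` with `args` and collects stdout and stderr separately.
pub fn run_capturing(program: &str, args: &[&str]) -> io::Result<ExecOutput> {
    let output = Command::new(program)
        .args(args)
        // The child gets no input in this sketch.
        .stdin(Stdio::null())
        // `output()` pipes stdout and stderr and waits for the child to exit.
        .output()?;
    Ok(ExecOutput {
        stdout: output.stdout,
        stderr: output.stderr,
        // `None` if the process was terminated by a signal.
        exit_code: output.status.code(),
    })
}
```

A caller could then emit `stdout` and `stderr` as two separate events, or on two separate ports, depending on what the RFC ends up specifying.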
I've come up with a few possible ways to run a given command from the config for the offramp implementation. The current RFC doesn't specify whether things like pipes and redirections are going to be supported, so I'll try to take them into account and see if it's possible. The solutions are actually more complicated because they depend on the operating system they run on, but I'll talk about how it would work on Unix systems:
I've been taking a look at how others do this:
In conclusion, if we want to support stuff like pipes the only viable way to do it is with [...]
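For readers following along, the difference between running a command directly and running it through a shell can be sketched as below. This is only an illustration on Unix with `std::process`. Both function names are hypothetical, and whether `sh -c` is the approach the comment above had in mind is an assumption.

```rust
use std::io;
use std::process::{Command, Output};

/// Hands the whole command line to `sh -c`, so the shell interprets
/// pipes and redirections (Unix only; assumes `sh` is on the PATH).
pub fn run_in_shell(command_line: &str) -> io::Result<Output> {
    Command::new("sh").arg("-c").arg(command_line).output()
}

/// Naive alternative: split on whitespace and execute the program directly.
/// No shell is involved, so `|` and `>` are passed as literal arguments.
pub fn run_direct(command_line: &str) -> io::Result<Output> {
    let mut parts = command_line.split_whitespace();
    let program = parts.next().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "empty command line")
    })?;
    Command::new(program).args(parts).output()
}
```

With `run_direct("echo a | sort")`, `echo` receives `|` and `sort` as plain arguments, whereas `run_in_shell` lets the shell build the pipeline. Whitespace splitting also ignores quoting, which is one reason direct execution of a user-provided string is harder than it looks.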
I'm bringing this issue up because I would like to try implementing it to warm up for the GSoC. After trying to find out how the command could be run, another thing to take into account is whether we should use [...] The more correct way would be using [...] Can we assume all commands input by the user will parse correctly with [...]
Take a look at this PR for an implementation of an exec onramp that utilizes the rust-subprocess crate. The same approach was meant to be used for the offramp. You'll find there some answers to the questions related to redirection and whether data needs to be stringified. As far as escaping input goes, I believe the Twitter example was just sending an empty status for each event rather than embedding an event in the actual command. A useful implementation that carries a similar spirit can be found in Telegraf's continuous exec output and example configuration here. The non-continuous version can be found here, though I believe we didn't mean to differentiate between the two logically but via configuration in Tremor.
Huh, I like how you think. So you just create a shell script file with the command and execute that with whatever shell is configured. Though I don't see that much of a necessity for rust-subprocess here, because the offramp is simpler in the sense that you don't have to poll the command. You just execute it and wait for its termination, which AFAIK can be done with [...] In that PR the UTF-8 part isn't really discussed, so I would still like to get some opinions on that.
Oh, I completely misunderstood it then. Using [...]
Data should really be an octet stream on which codecs and processors can be applied. That PR is unfinished, but the latest iteration we were working on does not force UTF-8 data. rust-subprocess was used because of the different types of execution, like continuous or periodic, for which std::process was not sufficient. There's a case to be made for an offramp that's expensive to start on every event and that one should therefore be able to run continuously, but that's likely not rev0 of this offramp. cc: @tremor-rs/tremor-core
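The "octet stream" point can be illustrated with a small sketch of a one-shot offramp step: spawn the command once per event, write the already-encoded event bytes to its stdin, and wait for it to exit. The function name `send_event` is hypothetical. It assumes the codec has already produced `payload` and uses `std::process` rather than rust-subprocess, so it covers neither continuous nor periodic execution.

```rust
use std::io::{self, Write};
use std::process::{Command, Output, Stdio};

/// Spawns `program`, feeds it `payload` as raw bytes on stdin, and waits
/// for it to terminate. No UTF-8 validation is performed on the payload.
pub fn send_event(program: &str, args: &[&str], payload: &[u8]) -> io::Result<Output> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;
    {
        // Take stdin so it is dropped at the end of this scope; closing
        // the pipe is what signals end-of-input to the child.
        let mut stdin = child.stdin.take().expect("stdin was configured as piped");
        // Caveat: a child that emits a lot of output while still reading
        // could block here; a fuller version would write from another thread.
        stdin.write_all(payload)?;
    }
    // Collects whatever the child wrote, plus its exit status.
    child.wait_with_output()
}
```

For example, `send_event("wc", &["-c"], &[0xff, 0xfe, 0x00, 0x41])` passes four bytes that are not valid UTF-8 through unchanged, which is the property being asked for above.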
And now that I know no replacing/escaping is needed, why create a file instead of just using [...] Also, I like the idea of a continuous script as well. As you said, I wouldn't implement it in the first attempt, but I'll definitely take it into account. And I'll give rust-subprocess a try.
Those are good questions -- needless to say, take my answers as a reflection of the type of thinking behind the work in that PR rather than as conclusive in terms of the final solution for your work. A file allows you to support executing scripts that are multi-line, have functions defined, loops, multiple commands in one stream, etc. This allows for having Tremor execute rather complex scripts (say you're hitting the API, passing a complex script and asking for it to be executed once). Nothing prevents you from passing in one command to be executed in such an onramp but designing solely for [...] For example, a common scenario with Telegraf's [...] Args indeed aren't exposed as a top-level configuration option in that PR because they apply to the execution of the shell command, rather than the user-provided script. Args aren't empty -- they will contain the filename and on Windows (not reflected in that PR) they will have an additional item [...]
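The file-based approach described above can be sketched roughly as follows: write the user-provided script to a temporary file and pass the filename as the single argument to the configured shell. This is a simplified illustration, not the code from the referenced PR. `run_script` and the temp-file naming scheme are hypothetical, and the Windows-specific extra argument mentioned above is not handled.

```rust
use std::env;
use std::fs;
use std::io;
use std::process::{self, Command, Output};
use std::time::{SystemTime, UNIX_EPOCH};

/// Writes `script` to a temporary file and runs it as `<shell> <file>`.
/// The shell's argument list therefore contains only the filename.
pub fn run_script(shell: &str, script: &str) -> io::Result<Output> {
    // Build a reasonably unique name from the process id and a timestamp
    // (a real implementation would likely use a dedicated temp-file crate).
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let path = env::temp_dir().join(format!("exec-sketch-{}-{}.sh", process::id(), nanos));
    fs::write(&path, script)?;

    let result = Command::new(shell).arg(&path).output();

    // Best-effort cleanup; the run result matters more than a failed delete.
    let _ = fs::remove_file(&path);
    result
}
```

Because the shell reads the file itself, multi-line scripts with function definitions and loops work without any quoting or escaping of the script text on the Rust side.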
So does [...]
I see what you mean. That totally makes sense for the example you provided. There is a case, though, where you do want to have the script as an executable file. Take a script like this.