Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Live variables #8

Open
dumblob opened this issue May 18, 2021 · 11 comments
Open

Live variables #8

dumblob opened this issue May 18, 2021 · 11 comments
Labels
enhancement New feature or request

Comments

@dumblob
Copy link

dumblob commented May 18, 2021

When trying to come up with a concise and explicit way of splitting/copying & merging of streams I came to conclusion that either there'll need to be support for "live variables" as e.g. Mech lang does (where every variable is basically by default live - very cool concept allowing to cut down SLOC count by about an order of magnitude in fullstack apps - but the problem currently is performance as this requires quite novel methods of optimization of the generated assembly which is not yet researched enough).

Or there'll need to be some select/poll/kpoll support and subsequently support for recognition from which stream the value came. This recognition of origin is sometimes done with the typing system, but that might be too cumbersome for Til. In Til I'd probably try to abuse the extraction syntax 😉.

Maybe even rudimentary things like this could be the starting point (because they can be presumably wrapped by a procedure):

select (range 0 10) (range 5 100) (io.in) | transform x {
  ....
} | case (>x 10) {
  ...
}

The latter solution sounds more reasonable for Til. It partially overlaps with own pipe primitives.

@cleberzavadniak
Copy link
Contributor

Maybe this could be solved wih processes. I'm starting to feel streams and processes should hang out more together... (Maybe with some syntatic sugar to wrap a command in a new process kind of seamlessly.)

@cleberzavadniak cleberzavadniak added the enhancement New feature or request label May 20, 2021
@dumblob
Copy link
Author

dumblob commented May 20, 2021

If the plumbing of tubes, ehm streams, is going to be explicit and support both compile-time plumbing (i.e. streams known in compile-time) and runtime plumbing (i.e. streams created at wish in any amount in runtime), then I'm fine with any solution - processes sound generally good. But as always - the devil is in the detail. So if you have any specific API in mind, I'm all ears 😉.

@dumblob
Copy link
Author

dumblob commented May 20, 2021

Btw. speaking of "streams and processes shall hang out more together" I'd refer you to my analysis and "vision" I described in (a very long) thread vlang/v#1868 .

TL;DR each Go routine will be a standalone "processor" (i.e. not a process, because a process can stop itself unlike processors which can only be added/removed as whole and themself can't influence whether they're running or not - this has a major advantage that a scheduler can spawn 0-N of such Go routines depending e.g. on load and just connect them using channel multiplexers). I.e. an infinitely running loop with it's content being a select/poll/kpoll (in case of consumer) over a one-way channel as well as being a chan.push() (in case of producer) over a one-way channel. Any such processor can of course spawn any new processors any time (and by closing a channel also indirectly instruct the scheduler to remove a processor if the processor also closed all its channels).

In other words tailored & tamed dataflow programming with user-customizable scheduler. It's actually a bit similar to actors, but both more high level (thus safer and faster & more efficient - compared to e.g. traditional actors) and bit more limited in terms of API (which is good - it allows fully automated infinite scaling with zero lines of additional code).


Actors are defined upon single messages unlike processors above for which a message would be just one sample from the channel. It's thus much more performant and provids much stronger safety guarantees while offering comparable level of expresiveness and overall dynamics.

@cleberzavadniak
Copy link
Contributor

With Pids being able to read from and write to streams, that would be easy to implement this:

proc f (x) {
  receive | foreach msg {
    send $parent_process "$x / $msg"
  }
}
range 5 | [spawn f whatever] | foreach msg { ... }

Someone may think about some kind of syntatic sugar around the spawn call, but I would be okay with this explicit version. And it's nice to simply complement Pids instead of creating new syntax.

@cleberzavadniak
Copy link
Contributor

Of course, how to reference the parent process is one concern and I personally would prefer to simply call send without any Pid instead of injecting a variable into any new scopes.

The last message solution also kind of address the in-Til generators, with:

proc g (x) {
  range $x | send
}
[spawn g 7] | foreach msg { ... }

@cleberzavadniak
Copy link
Contributor

(Except all that would make the Pid be treated as a command. Not sure about this part...)

@cleberzavadniak
Copy link
Contributor

About the select suggestion, using Pids would feel more natural and I would prefer if the command was very explicit about how it's actually handling data from these multiple streams: maybe you want to "concatenate" data, but maybe you want to "zip" values together or simply "fan out".

stream.cat [spawn reader1 "file1.txt"] [spawn reader1 "file2.txt"] | foreach line {...}
stream.zip [spawn reader1 "file1.txt"] [spawn reader1 "file2.txt"] | foreach line {...}
stream.fan_out [spawn reader1 "file1.txt"] [spawn reader1 "file2.txt"] | foreach line {...}

(It's relevant to note that in Til, data is always being pulled, not pushed through pipes, also.)

cat would consume in order, zip would always wait until there is a matching on the right side (probably, IDK) and fan_out would just consume from whoever has available data at the moment (and that sounds a little bit like a "select").

@dumblob
Copy link
Author

dumblob commented May 20, 2021

cat would consume in order, zip would always wait until there is a matching on the right side (probably, IDK) and fan_out would just consume from whoever has available data at the moment (and that sounds a little bit like a "select").

In a sense this sounds closer to what I've described above. It puts more emphasis on the muxers/demuxers whereas actors and my "processors" above put more emphasis on the nodes (producers & consumers) and their life.

Overall though it feels too simplistic to be used for general purpose muxing/demuxing (which is not linear and despite this supports arbitrary graphs it feels still too linear). I like seeing this "linear" principle in smaller areas where all participating procedures conform to "do only one thing and do it really well" (basically everywhere where pipes are being used in bourne shell scripting but nowhere else).

Let's shed some light on why I think this doesn't conceive what will be needed (if not now, then later - so better to discuss it now at the design phase 😉).

  1. This solution you outlined doesn't seem to support reshaping pipe(lines) in runtime without interruption. That's why I spoke about processors and not about muxers/demuxers because the rule of thumb says if we wanted to manually handle muxing/demuxing, it'll make it (much) harder to do it dynamically without interruption. Consider your last example:

    stream.fan_out [spawn reader1 "file1.txt"] [spawn reader1 "file2.txt"] | foreach line {...}
    

    Now while reading file1.txt file2.txt which happen to be infinitely long files - e.g. /dev/random - I want to add to the same muxer (i.e. to the running | pipe) a third reader which reads yet another file. How will I do it?

  2. Note also that this way (using stream.zip ... etc.) doesn't seem to address the trap/interrupt functionality (useful for exceptions etc.) and presumably might make it hard to explicitly syntactically visualize it (i.e. when looking at the code above I'm really unsure how any errors are being handled if they appear e.g. in one of those spawned readers. Could this trap/interrupt aspect be improved upon?

  3. Last but not least one of the main motivations behind live variables is anything interactive (mainly human interactivity is meant by this but can be anything as in a real world everything is in flux - sometimes with higher frequencies and sometimes with lower). Imagine building UI with stream.cat stream.zip stream.fan_out. Would you like it that way?

    What everything would you probably adjust (see Mech lang main web site where in the middle the robotic arm is interactive - there are sliders etc. - and uses live variables for that purpose)? I guess it wouldn't feel right as it is now. That's why flyd didn't become enough popular - because it focuses on muxing/demuxing and not on single samples (or batches of those) in the stream. For reference see a quick overview I summarized in Stateful variables (streams, live variables, reactivity, ... whatever we call it) HigherOrderCO/Kind#169 (comment) .

Maybe the actor and processor models are worth investigating 😉 (as they don't emphasize the connections/muxing/demuxing but rather the nodes - closer to the "data over operations" mantra).

@cleberzavadniak
Copy link
Contributor

  1. "I want to add to the same muxer (i.e. to the running | pipe) a third reader which reads yet another file. How will I do it?" - you just stop reading and reorganize things. If you plan to do it, probably is already saving the Pids to variables. Again: in Til streams, data is pulled: if you stop pulling, the stream simply stop producing, but is still there to keep producing again later.
  2. That's why I created error.handler semantics: in a Unix shell there isn't a nice way either to identify individual errors in a pipeline - all you got is $?, basically. And that's not without a reason: the pipeline itself is like a program per se. At least "throwing" an Error allows each pipeline component to convey more context and meaning.
  3. Well, there are going to exist plenty of other ways of dealing with UI...

@dumblob
Copy link
Author

dumblob commented May 26, 2021

  1. "I want to add to the same muxer (i.e. to the running | pipe) a third reader which reads yet another file. How will I do it?" - you just stop reading and reorganize things. If you plan to do it, probably is already saving the Pids to variables. Again: in Til streams, data is pulled: if you stop pulling, the stream simply stop producing, but is still there to keep producing again later.

Sure and sorry for not being clear with my intentions. My point is two fold.

The pipeline itself seems to not accept any "events" from outside. So I can't temporarily interrupt it and "reorganize things" (this is the first problem). I can only stop the whole pipeline and thus can't continue where I left thereafter (this is the second problem).

The second problem is the trap/interrupt problem which you tried to adress with error.handler but that seems to address only "events" from inside of the pipeline (which I'm not interested in 😉).

Any insights how to approach it without eval magic (shuffling with pids, orchestrating a lot in each of the participant of the pipeline, etc.)?

@dumblob
Copy link
Author

dumblob commented Jun 3, 2021

Btw. I'm playing with the idea of cutting down on "live variables" functionality and going for a simple built-in signal-slot mechanism (simple especially syntactically) working across os-level thread boundaries. I don't know yet...

@dumblob dumblob mentioned this issue Mar 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants