-
-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
schedule the cycle detector with higher priority using the inject queue #3507
Conversation
@sblessing there's a lot of good information about this change in the issue, can you update your commit comment to include a write up of the why for this change? eventually that information will get lost or send someone off having to go look at this issue. I don't want to lose that information and as such want it to be part of the commit comment itself. |
@sblessing additionally, can you write up a user facing description of this change (the what and why) that a pony user might be interested in so that it can be used in the release notes for this? you can add those release notes as a comment on this PR. thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good overall. some minor comments/changes (and a question for the core team). also, as @SeanTAllen mentioned, please update the commit comment with more context about the change so it doesn't get lost (or become hard to find).
once this is merged, i have an idea for how to accomplish the batching i mentioned in the issue.
thanks @sblessing for finding and fixing this unnecessary performance limitation of my changes to the cycle detector stuff (and the detailed issue about it to prompt the discussion and additional thought). and welcome to the (large) group of folks who've found such silliness on my part. 8*P
src/libponyrt/actor/actor.c
Outdated
@@ -86,8 +86,10 @@ static bool well_formed_msg_chain(pony_msg_t* first, pony_msg_t* last) | |||
static void send_unblock(pony_ctx_t* ctx, pony_actor_t* actor) | |||
{ | |||
// Send unblock before continuing. | |||
(void)ctx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, i didn't catch this earlier, but instead of this approach, can you change send_unblock
to only take actor
as an argument and update the call sites accordingly?
|
||
DTRACE2(ACTOR_ALLOC, (uintptr_t)ctx->scheduler, (uintptr_t)actor); | ||
return actor; | ||
} | ||
|
||
PONY_API void ponyint_destroy(pony_ctx_t* ctx, pony_actor_t* actor) | ||
{ | ||
(void)ctx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ponylang/core how do you folks feel about changing the signature of ponyint_destroy
to remove the ctx
argument? (asking because it's marked as PONY_API
meaning it's a breaking change)
src/libponyrt/sched/scheduler.c
Outdated
{ | ||
last_cd_tsc = current_tsc; | ||
|
||
// cycle detector should now be on the queue | ||
if(actor == NULL) | ||
if(actor == NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm assuming this additional trailing space
was accidental. if yes, please undo this change.
@@ -1143,7 +1145,9 @@ pony_ctx_t* ponyint_sched_init(uint32_t threads, bool noyield, bool pin, | |||
ponyint_mpmcq_init(&inject); | |||
ponyint_asio_init(asio_cpu); | |||
|
|||
return pony_ctx(); | |||
inject_context = pony_ctx(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add code to ponyint_sched_shutdown
to set inject_context = NULL
to clean things up.
fea2604
to
9f759f2
Compare
I have done the changes requested by @dipinhora. Will amend the commit message and write the user facing documentation later today @SeanTAllen. @dipinhora please loop me in on the batch processing thoughts. I have some ideas as well and I am also happy to implement those! |
…the inject queue (ponylang#3501) This commit treats the cycle detector as special system actor that is scheduled using the global inject queue. Consequently, the cycle detector is subject to work stealing with a higher priority between all scheduler threads. Prior to this change, the cycle detector was treated like any other application actor. Moreover, scheduler thread 0 decided when the cycle detector should be activated for it to potentially do some work. In some cases, this may have caused pathological behavior, where the time interval --ponycdinterval for 'poking' the cycle detector was theoretically unbounded. With this commit, the cycle detector always uses the global non-runtime context (inject_context), whenever messages are sent to it, independent of the sending actor's context. This guarantees that it will always be pushed to the global inject queue and will then be pop_global'd by the first runtime thread to become available. This has the following effect: a) get the cycle detector to be scheduled closer to --ponycdinterval b) have the cycle detector to be scheduled with higher priority as it can be 'pop_global'd' by any scheduler thread from the inject queue. c) not defer garbage collection of large actor graphs for long running programs right to the end (or close to) a quiescent system. d) locality is not that important for the cycle detector as for application actors, thus scheduling a previously blocked cycle detector using a runtime thread on a first-come-first serve basis should have no negative effect on performance.
user facing documentation and commit message amended @SeanTAllen ! |
thanks @sblessing. welcome back. @dipinhora this can be merged once you sign off. |
This addresses #3501.
PR #2709 introduced the cycle detector being lazily scheduled once a certain amount of cpu ticks (
--ponycdinterval
) has passed. Scheduler thread 0 would then 'wake up' the the cycle detector by sending it aCHECK_BLOCKED
message. The cycle detector would then be appended to back of scheduler-0's queue. The cycle detector would then only be scheduled if eitherThis could create pathological behavior where cycle detection is actually deferred to a point in time where the program would be almost quiescent, especially for programs that create lots of actors right at program start, such that the distance for the cycle detector to reach the head of the scheduling queue might be relatively long.
This change treats the cycle detector as a special system actor that is maintained on the schedulers global inject queue. That is, it is always scheduled using its original start up main context (which is not related to any runtime thread), making sure that a
CHECK_BLOCK
message (and any other message it would receive after being blocked) will cause it to be pushed to the global inject queue. Every scheduler thread become available would first try to consume an item from this queue before attempting to get an actors from its local queue.This should have the following effect on runtime system behavior: