Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multithreading #8112

Merged
merged 51 commits into from
Sep 2, 2019
Merged

Multithreading #8112

merged 51 commits into from
Sep 2, 2019

Conversation

waj
Copy link
Member

@waj waj commented Aug 23, 2019

This is a big PR that adds initial and experimental support for multithreading. It works side by side with the current single thread implementation and must be enabled with the preview_mt flag.

There is an upcoming blog post explaining details about the implementation and decisions made on this work.

Things covered by this PR:

  • One scheduler per thread (workers): each thread has a queue of runnable fibers. Each fiber is assigned to a worker when it starts and currently it lives in the same worker until it finishes (no job stealing)
  • One event loop per thread: whenever a thread becomes idle (no runnable fibers left) it waits on the event loop for events (IO, timers). The same event loop also receives notifications about new fibers created or awaken from other threads.
  • New implementation of a thread safe Channel: re-created from scratch because with the previous implementation it was harder to make a multi thread select operation.
  • Atomic initialization of constants and class variables: to avoid race conditions
  • Thread safe Mutex

Things not covered by this PR:

  • Job stealing: naive approaches to implement this are detrimental to performance
  • Soundness of types: union types might become unsound in MT mode
  • Thread safety of std classes: for example, Array is not thread safe and must be protected by a mutex in case it's shared between fibers

waj and others added 30 commits August 20, 2019 14:25
Move event_loop ivars to Thread
Use libevent pthread support
…ed. It now supports MT for send/receive and select operations.
…lation is fixed). Use simpler but effective RR scheduling instead.
…nqueue_self as Scheduler#enqueue now always sends the fiber to the local queue.
…dcoded value (otherwise the constructor of Time requires the value of UNIX_EPOCH already set to validate the parameters)
waj added 2 commits August 26, 2019 15:04
Only positive integer values are accepted or empty string (that switches back to the default number of workers). Any other value will make the program to crash.
src/crystal/scheduler.cr Outdated Show resolved Hide resolved
src/crystal/scheduler.cr Show resolved Hide resolved
src/thread.cr Outdated Show resolved Hide resolved
src/fiber.cr Outdated Show resolved Hide resolved
src/crystal/thread_local_value.cr Outdated Show resolved Hide resolved
src/compiler/crystal/codegen/once.cr Outdated Show resolved Hide resolved
src/crystal/spin_lock.cr Show resolved Hide resolved
src/crystal/spin_lock.cr Show resolved Hide resolved
src/process.cr Show resolved Hide resolved
private def self.event_base
Thread.current.event_base
end

private def self.loop_fiber
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used anywhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... that's not true. It's used in many places.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't found it in this PR nor with grep -rn in master.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm talking about this very private class method, not the instance method and variable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The instance variable belongs to Thread. This static method is defined in Crystal::EventLoop module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but sorry I didn't undestand the code URL you mentioned, where EventLoop.loop_fiber is used then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same file, line 26

loop_fiber.resume

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok sorry, thanks. I didn't know this language feature, to call a static method like that.

@asterite asterite mentioned this pull request Aug 27, 2019
22 tasks
module Crystal::EventLoop
{% unless flag?(:preview_mt) %}
def self.after_fork
Thread.current.event_base.reinit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Thread.current.event_base.reinit
event_base.reinit

Copy link
Contributor

@jgaskins jgaskins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fantastic! I'm excited about how much thought has been given to making it easy on application developers to get parallel execution while mitigating thread-safety issues!

src/crystal/rw_lock.cr Show resolved Hide resolved
@@ -58,8 +58,11 @@ end
#
# 2.times { ch.receive }
# ```
def spawn(*, name : String? = nil, &block)
def spawn(*, name : String? = nil, same_thread = false, &block)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such a small thing but sooooo crucial. I use spawn all the time to move work out-of-band but I rely on it waiting until the current fiber is yielded.

Something that's been in the back of my mind every time I've done it is "this isn't going to work when Crystal goes multithreaded" but this one argument, along with fibers staying within their threads, solves that problem! 💯 🙌🏼

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just make sure nobody changes the contract from "stays on the same thread" to "is initially queued for the same thread" in case job stealing is implemented :D

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jgaskins I would recommend using Channel to wait for the result. The same_thread is more to ensure the execution context than the scheduling.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just notice that same_thread should be typed as Bool here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@j8r Type restrictions are optional in Crystal. In this way one can pass a nilable thing and it will work. Restricting it to Bool will make it less flexible, forcing you to do things like !!exp.

I think we should embrace the way Crystal lets you omit type annotations. If something is not strictly required by the language then we shouldn't always push for it.

@crystal-lang crystal-lang deleted a comment Aug 28, 2019
@crystal-lang crystal-lang deleted a comment Aug 28, 2019
@crystal-lang crystal-lang deleted a comment Aug 28, 2019
Copy link
Member

@bcardiff bcardiff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: :shipit:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.