Integrate Ballista scheduler with DataFusion #64

Closed
andygrove opened this issue Apr 25, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@andygrove
Member

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

The Ballista scheduler breaks a query down into stages based on changes in partitioning in the plan, where each stage is broken down into tasks that can be executed concurrently.
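To make the stage/task split concrete, here is a minimal sketch. It assumes a simplified, linear plan in which each operator declares its output partitioning; `Partitioning`, `Operator`, `Stage`, and `split_into_stages` are hypothetical names for illustration, not the actual Ballista or DataFusion types. A new stage starts whenever the partitioning changes, and each stage yields one task per partition.

```rust
/// Hypothetical, simplified partitioning descriptor (not DataFusion's own type).
#[derive(Debug, Clone, PartialEq)]
enum Partitioning {
    RoundRobin(usize),
    Hash(usize),
    Single,
}

impl Partitioning {
    /// Number of partitions produced under this scheme.
    fn partition_count(&self) -> usize {
        match self {
            Partitioning::RoundRobin(n) | Partitioning::Hash(n) => *n,
            Partitioning::Single => 1,
        }
    }
}

/// Hypothetical operator in a linear plan, listed from the scan upwards.
#[derive(Debug, Clone)]
struct Operator {
    name: &'static str,
    output: Partitioning,
}

/// A run of consecutive operators that share the same partitioning.
#[derive(Debug)]
struct Stage {
    id: usize,
    operators: Vec<&'static str>,
    partitioning: Partitioning,
}

impl Stage {
    /// One task per partition; tasks within a stage can run concurrently.
    fn task_count(&self) -> usize {
        self.partitioning.partition_count()
    }
}

/// Start a new stage each time the output partitioning changes.
fn split_into_stages(plan: &[Operator]) -> Vec<Stage> {
    let mut stages: Vec<Stage> = Vec::new();
    for op in plan {
        let starts_new_stage = stages
            .last()
            .map_or(true, |stage| stage.partitioning != op.output);
        if starts_new_stage {
            let id = stages.len();
            stages.push(Stage {
                id,
                operators: vec![op.name],
                partitioning: op.output.clone(),
            });
        } else if let Some(stage) = stages.last_mut() {
            stage.operators.push(op.name);
        }
    }
    stages
}
```

Under these assumptions, a plan of scan and filter (4 round-robin partitions), repartition and aggregate (8 hash partitions), and a final single-partition merge would split into three stages with 4, 8, and 1 tasks respectively.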

Rather than trying to run all the partitions at once, Ballista executors process n concurrent tasks at a time and then request new tasks from the scheduler.
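The pull model described above can be sketched with the standard library alone. This is an in-process approximation under stated assumptions: `Task`, `Scheduler`, and `run_executor` are hypothetical names, the shared queue stands in for the scheduler, and worker threads stand in for an executor's task slots. It is not how Ballista itself is implemented.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical task: one partition of one stage.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Task {
    stage_id: usize,
    partition: usize,
}

/// Hypothetical in-process stand-in for the scheduler: a shared queue of pending tasks.
#[derive(Clone)]
struct Scheduler {
    pending: Arc<Mutex<VecDeque<Task>>>,
}

impl Scheduler {
    fn new(tasks: impl IntoIterator<Item = Task>) -> Self {
        Scheduler {
            pending: Arc::new(Mutex::new(tasks.into_iter().collect())),
        }
    }

    /// Hand out the next pending task, or `None` once the queue is drained.
    fn next_task(&self) -> Option<Task> {
        self.pending.lock().unwrap().pop_front()
    }
}

/// Run every pending task with at most `concurrency` tasks in flight.
/// Each worker asks for a new task only after finishing its current one.
fn run_executor(scheduler: &Scheduler, concurrency: usize) -> Vec<Task> {
    let completed = Arc::new(Mutex::new(Vec::new()));
    let workers: Vec<_> = (0..concurrency)
        .map(|_| {
            let scheduler = scheduler.clone();
            let completed = Arc::clone(&completed);
            thread::spawn(move || {
                while let Some(task) = scheduler.next_task() {
                    // A real executor would run the partition's plan here.
                    completed.lock().unwrap().push(task);
                }
            })
        })
        .collect();
    for worker in workers {
        worker.join().unwrap();
    }
    // Completion order is nondeterministic, so sort before returning.
    let mut done = completed.lock().unwrap().clone();
    done.sort();
    done
}
```

Because workers pull work rather than having all partitions launched up front, the number of tasks in flight never exceeds `concurrency`, regardless of how many partitions the query has.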

This approach would help DataFusion scale better, and ideally the same scheduler would be used to scale across cores in DataFusion and across nodes in Ballista.

Describe the solution you'd like

Implement an extensible scheduler in DataFusion and have Ballista extend it to provide distributed execution.
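One possible shape for such an extension point is sketched below. Everything here is an assumption for illustration: `TaskScheduler`, `LocalScheduler`, `DistributedScheduler`, and `drain` are hypothetical names and do not exist in DataFusion or Ballista. The idea is that a core-level scheduler implements a small trait, and a distributed scheduler reuses it while adding its own bookkeeping (here, which worker received which task).

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical task: one partition of one stage.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Task {
    stage_id: usize,
    partition: usize,
}

/// Hypothetical extension point shared by local and distributed scheduling.
trait TaskScheduler {
    fn submit(&mut self, task: Task);
    /// Called by a worker (a core or a node) when it has a free slot.
    fn next_task(&mut self, worker: &str) -> Option<Task>;
}

/// Single-process scheduler: a plain FIFO queue.
#[derive(Default)]
struct LocalScheduler {
    pending: VecDeque<Task>,
}

impl TaskScheduler for LocalScheduler {
    fn submit(&mut self, task: Task) {
        self.pending.push_back(task);
    }

    fn next_task(&mut self, _worker: &str) -> Option<Task> {
        self.pending.pop_front()
    }
}

/// Distributed variant: reuses the local queue and records assignments per node.
#[derive(Default)]
struct DistributedScheduler {
    inner: LocalScheduler,
    assignments: HashMap<String, Vec<Task>>,
}

impl TaskScheduler for DistributedScheduler {
    fn submit(&mut self, task: Task) {
        self.inner.submit(task);
    }

    fn next_task(&mut self, worker: &str) -> Option<Task> {
        let task = self.inner.next_task(worker)?;
        self.assignments
            .entry(worker.to_string())
            .or_default()
            .push(task);
        Some(task)
    }
}

/// Let workers take turns requesting tasks until none remain.
/// Returns how many tasks were handed out.
fn drain(scheduler: &mut dyn TaskScheduler, workers: &[&str]) -> usize {
    if workers.is_empty() {
        return 0;
    }
    let mut handed_out = 0;
    for worker in workers.iter().cycle() {
        if scheduler.next_task(worker).is_none() {
            break;
        }
        handed_out += 1;
    }
    handed_out
}
```

Code written against `dyn TaskScheduler` (such as `drain`) would not need to know whether the workers are cores or nodes, which is the reuse this proposal is after.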

Describe alternatives you've considered
None

Additional context
None

@andygrove andygrove added the enhancement New feature or request label Apr 25, 2021
tustvold added a commit that referenced this issue Jul 17, 2022
* Add optional serde support to datafusion-proto (#2889)

* Add public methods for JSON serde (#64)

* Misc suggestions

* Update datafusion/proto/Cargo.toml

Co-authored-by: Raphael Taylor-Davies <[email protected]>

* Fixes

* Fixup Cargo.toml

* Format Cargo.toml

Co-authored-by: Brent Gardner <[email protected]>
waitingkuo referenced this issue in waitingkuo/arrow-datafusion Aug 1, 2022
@alamb
Contributor

alamb commented Jun 11, 2023

I am not sure this is tracking anything actionable, and it hasn't had any updates in a few years, so I am closing it. Please feel free to reopen if I got that wrong.

@alamb alamb closed this as completed Jun 11, 2023