Road Map
==
Description coming shortly
==
Description coming shortly
==
Description coming shortly
==
Description coming shortly
==
One of the core requirements for any distributed system is the ability to share load, which essentially means delegating work among the available workers. In YARN, such workers are represented by Application Containers. However, while performing work, an Application Container may decide (based on a variety of factors) that the load is too high to handle on its own and choose to delegate part of it to another YARN Application. That application may or may not be running in the same YARN cluster. What further complicates things is that in certain cases it is hard to determine in advance how many Application Containers are needed to process the load adequately. Take a reverse Map/Reduce paradigm (e.g., Monte Carlo simulation), where the input data is rather small but the computation performed on it produces larger amounts of data. That data may need to be analyzed in real time, and the analysis may in turn produce more data to be analyzed. The result is essentially a non-deterministic work tree whose size and growth are controlled by each branch spinning off (or not) another branch based on some condition.
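To make the non-deterministic work tree concrete, here is a minimal, purely illustrative sketch in plain Java. It does not use any YARN API; the class `WorkTreeSketch`, the method `countTasks`, and the parameters `maxDepth` and `fanOut` are hypothetical names chosen for this example. Each task decides at run time whether to spin off child tasks, so the total amount of work is only known once the computation has finished.

```java
import java.util.Random;
import java.util.function.IntPredicate;

public class WorkTreeSketch {

    // Counts the tasks in a work tree in which every task decides at run time
    // (via shouldBranch) whether to spin off fanOut child tasks.
    // maxDepth is an illustrative safety bound so the tree cannot grow forever.
    static long countTasks(int depth, int maxDepth, int fanOut, IntPredicate shouldBranch) {
        long tasks = 1; // the current task itself
        if (depth < maxDepth && shouldBranch.test(depth)) {
            for (int i = 0; i < fanOut; i++) {
                tasks += countTasks(depth + 1, maxDepth, fanOut, shouldBranch);
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        Random random = new Random();
        // The same input produces a different amount of work on every run.
        for (int run = 0; run < 3; run++) {
            long tasks = countTasks(0, 6, 3, depth -> random.nextDouble() < 0.5);
            System.out.println("run " + run + ": " + tasks + " tasks");
        }
    }
}
```

With a random branching condition the task count varies from run to run, which is why the number of Application Containers cannot be fixed up front.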
Given such uncertainty, it would be very difficult, if not impossible, to maintain a consistent system load within a single cluster while expecting timely responses to such computations. In other words, we may need to start borrowing additional computation and IO resources from one or more other clusters (stand-by clusters).
While this is already possible by simply creating and launching a new YARN Application from within an Application Container, the goal of this road-map item is to simplify such distribution through a higher-level strategy so that it can be exposed as a simple method call.
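As a rough illustration of what "a simple method call" could look like, the sketch below models delegation in plain Java. Everything in it is an assumption made for this example: the `Delegate` interface, the `Worker` class, and the fixed `capacity` threshold are hypothetical and are not part of this project or of the YARN API. The in-process lambda stands in for the step that would actually launch or contact another YARN Application.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class DelegationSketch {

    // Hypothetical strategy for handing work to another application.
    // A real implementation might launch a YARN Application in a stand-by cluster.
    @FunctionalInterface
    interface Delegate<T, R> {
        R delegate(T work);
    }

    // Hypothetical worker that processes up to 'capacity' items itself
    // and hands everything beyond that to the delegate.
    static class Worker<T, R> {
        private final int capacity;
        private final Function<T, R> local;
        private final Delegate<T, R> overflow;

        Worker(int capacity, Function<T, R> local, Delegate<T, R> overflow) {
            this.capacity = capacity;
            this.local = local;
            this.overflow = overflow;
        }

        List<R> processAll(List<T> items) {
            List<R> results = new ArrayList<>();
            for (int i = 0; i < items.size(); i++) {
                T item = items.get(i);
                // Delegation is a single method call from the worker's point of view.
                results.add(i < capacity ? local.apply(item) : overflow.delegate(item));
            }
            return results;
        }
    }

    public static void main(String[] args) {
        Worker<String, String> worker =
                new Worker<>(2, s -> "local:" + s, s -> "delegated:" + s);
        // prints [local:a, local:b, delegated:c]
        System.out.println(worker.processAll(Arrays.asList("a", "b", "c")));
    }
}
```

The point of the sketch is only the shape of the call site: the worker does not need to know how or where the delegated work runs.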
==