MapReduce in Java for multi-core machines
With xpresso you can easily define and run MapReduce jobs on a multi-core machine to parallelize time-consuming crunching.
Let's assume that we have a list of elements we want to process:
list<String> elements = x.list("Map","aNd","ReDuce","arE","aWEsome");
Processing each element takes a long time (say, 10 seconds), so we want to parallelize the work on our multi-core machine. The desired processing is as follows: if the element starts with an "a", convert it to uppercase and join it with the other uppercase elements using "~" as the separator; otherwise, convert it to lowercase and join it with the other lowercase elements, again using "~".
Let's define the Mapper and Reducer:
static Mapper<String,String> mapper = new Mapper<String,String>() {
    public void map(String input) {
        x.Time.sleep(10); //the processing of each element takes a long time :-)
        if (x.String(input).startsWith("a")) {
            yield("upper", input.toUpperCase());
        } else {
            yield("lower", input.toLowerCase());
        }
    }
};

static Reducer<String,String> reducer = new Reducer<String,String>() {
    public void reduce(tuple2<String,list<String>> input) {
        yield(input.key, x.String("~").join(input.value));
    }
};
Our mapper converts the case as described above and emits each result under the key "upper" or "lower"; our reducer then joins the values collected for each key with "~".
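To make the division of labor between mapper and reducer concrete, here is a minimal sequential sketch in plain Java that does not use xpresso at all. The class `MapReducePhases` and its methods `map`, `shuffle`, `reduce`, and `run` are hypothetical names chosen for illustration; they are not part of the xpresso API, and the sketch makes no claim about how xpresso is implemented internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: hypothetical names, not the xpresso API.
public class MapReducePhases {

    // Map phase: emit one (key, value) pair per input element.
    static Map.Entry<String, String> map(String input) {
        return input.startsWith("a")
                ? Map.entry("upper", input.toUpperCase())
                : Map.entry("lower", input.toLowerCase());
    }

    // Shuffle phase: collect the emitted values under their key.
    static Map<String, List<String>> shuffle(List<Map.Entry<String, String>> pairs) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: join all values of one key with "~".
    static String reduce(List<String> values) {
        return String.join("~", values);
    }

    // Runs the three phases one after another, without any parallelism.
    static Map<String, String> run(List<String> elements) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (String element : elements) {
            pairs.add(map(element));
        }
        Map<String, String> result = new TreeMap<>();
        shuffle(pairs).forEach((key, values) -> result.put(key, reduce(values)));
        return result;
    }

    public static void main(String[] args) {
        // prints {lower=map~reduce, upper=AND~ARE~AWESOME}
        System.out.println(run(List.of("Map", "aNd", "ReDuce", "arE", "aWEsome")));
    }
}
```

Because this sketch is sequential, the values in each group keep their input order. A parallel run does not have to preserve that order, so the values within a group may come out arranged differently.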
Our MapReduce setup is now ready, so let's start crunching:
x.timer.start();
x.print(x.<String,String,String>MapReduce(elements).map(mapper).reduce(reducer), x.timer.stop());
Console:
{upper:AND~AWESOME~ARE, lower:reduce~map}
10.013s
As you can see, processing all 5 elements took only about 10 seconds in total, even though each element takes 10 seconds on its own. Processed one after another, the same list would need about 50 seconds.
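The timing effect can be reproduced without xpresso using the standard `java.util.concurrent` classes. The following is a rough sketch under stated assumptions: `ParallelTiming`, `slowTransform`, and `timeParallel` are hypothetical names, the delay is shortened to 200 ms so the demo finishes quickly, and the pool is sized to one thread per element. The work here is simulated with `Thread.sleep`, which does not occupy a core; for genuinely CPU-bound work the achievable speedup is limited by the number of available cores.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Illustration only: hypothetical names, not the xpresso API.
public class ParallelTiming {

    // Stand-in for an expensive per-element computation.
    static String slowTransform(String input, long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while simulating work", e);
        }
        return input.startsWith("a") ? input.toUpperCase() : input.toLowerCase();
    }

    // Submits every element to its own worker thread, waits for all of them,
    // and returns the elapsed wall-clock time in milliseconds.
    static long timeParallel(List<String> elements, long delayMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(elements.size());
        try {
            long start = System.nanoTime();
            // Collect the futures first so that all tasks are submitted
            // before we start waiting on any of them.
            List<CompletableFuture<String>> futures = elements.stream()
                    .map(e -> CompletableFuture.supplyAsync(
                            () -> slowTransform(e, delayMillis), pool))
                    .collect(Collectors.toList());
            futures.forEach(CompletableFuture::join);
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> elements = List.of("Map", "aNd", "ReDuce", "arE", "aWEsome");
        long delayMillis = 200;
        long elapsed = timeParallel(elements, delayMillis);
        System.out.println("parallel: " + elapsed + " ms, sequential estimate: "
                + elements.size() * delayMillis + " ms");
    }
}
```

On a typical machine the measured parallel time should land close to the single-element delay rather than five times that value, mirroring the 10-second result above.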