Merge github.com:apache/spark
Conflicts:
	core/src/main/scala/org/apache/spark/CacheManager.scala
	core/src/main/scala/org/apache/spark/scheduler/JobLogger.scala
	core/src/main/scala/org/apache/spark/storage/BlockManager.scala
	core/src/main/scala/org/apache/spark/storage/MemoryStore.scala
	core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
andrewor14 committed Mar 7, 2014
2 parents ca258a4 + 40566e1 commit d6e3b4a
Showing 146 changed files with 3,569 additions and 1,229 deletions.
32 changes: 32 additions & 0 deletions LICENSE
@@ -396,3 +396,35 @@ INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.


========================================================================
For sbt and sbt-launch-lib.bash in sbt/:
========================================================================

// Generated from http://www.opensource.org/licenses/bsd-license.php
Copyright (c) 2011, Paul Phillips.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the author nor the names of its contributors may be
used to endorse or promote products derived from this software without
specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
17 changes: 3 additions & 14 deletions README.md
@@ -1,12 +1,12 @@
# Apache Spark

-Lightning-Fast Cluster Computing - <http://spark.incubator.apache.org/>
+Lightning-Fast Cluster Computing - <http://spark.apache.org/>


## Online Documentation

You can find the latest Spark documentation, including a programming
-guide, on the project webpage at <http://spark.incubator.apache.org/documentation.html>.
+guide, on the project webpage at <http://spark.apache.org/documentation.html>.
This README file only contains basic setup instructions.


@@ -92,21 +92,10 @@ If your project is built with Maven, add this to your POM file's `<dependencies>

## Configuration

-Please refer to the [Configuration guide](http://spark.incubator.apache.org/docs/latest/configuration.html)
+Please refer to the [Configuration guide](http://spark.apache.org/docs/latest/configuration.html)
in the online documentation for an overview on how to configure Spark.


-## Apache Incubator Notice
-
-Apache Spark is an effort undergoing incubation at The Apache Software
-Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of
-all newly accepted projects until a further review indicates that the
-infrastructure, communications, and decision making process have stabilized in
-a manner consistent with other successful ASF projects. While incubation status
-is not necessarily a reflection of the completeness or stability of the code,
-it does indicate that the project has yet to be fully endorsed by the ASF.
-

## Contributing to Spark

Contributions via GitHub pull requests are gladly accepted from their original
9 changes: 6 additions & 3 deletions assembly/pom.xml
@@ -21,17 +21,20 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent</artifactId>
-<version>1.0.0-incubating-SNAPSHOT</version>
+<version>1.0.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

<groupId>org.apache.spark</groupId>
<artifactId>spark-assembly_2.10</artifactId>
<name>Spark Project Assembly</name>
-<url>http://spark.incubator.apache.org/</url>
+<url>http://spark.apache.org/</url>
<packaging>pom</packaging>

<properties>
-<spark.jar>${project.build.directory}/scala-${scala.binary.version}/${project.artifactId}-${project.version}-hadoop${hadoop.version}.jar</spark.jar>
+<spark.jar.dir>scala-${scala.binary.version}</spark.jar.dir>
+<spark.jar.basename>${project.artifactId}-${project.version}-hadoop${hadoop.version}.jar</spark.jar.basename>
+<spark.jar>${project.build.directory}/${spark.jar.dir}/${spark.jar.basename}</spark.jar>
<deb.pkg.name>spark</deb.pkg.name>
<deb.install.path>/usr/share/spark</deb.install.path>
<deb.user>root</deb.user>
11 changes: 11 additions & 0 deletions assembly/src/main/assembly/assembly.xml
@@ -55,6 +55,15 @@
<include>**/*</include>
</includes>
</fileSet>
+<fileSet>
+  <directory>
+    ${project.parent.basedir}/assembly/target/${spark.jar.dir}
+  </directory>
+  <outputDirectory>/</outputDirectory>
+  <includes>
+    <include>${spark.jar.basename}</include>
+  </includes>
+</fileSet>
</fileSets>

<dependencySets>
@@ -75,6 +84,8 @@
<excludes>
<exclude>org.apache.hadoop:*:jar</exclude>
<exclude>org.apache.spark:*:jar</exclude>
+<exclude>org.apache.zookeeper:*:jar</exclude>
+<exclude>org.apache.avro:*:jar</exclude>
</excludes>
</dependencySet>
</dependencySets>
18 changes: 16 additions & 2 deletions bagel/pom.xml
@@ -21,15 +21,29 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent</artifactId>
-<version>1.0.0-incubating-SNAPSHOT</version>
+<version>1.0.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

<groupId>org.apache.spark</groupId>
<artifactId>spark-bagel_2.10</artifactId>
<packaging>jar</packaging>
<name>Spark Project Bagel</name>
-<url>http://spark.incubator.apache.org/</url>
+<url>http://spark.apache.org/</url>

+<profiles>
+  <profile>
+    <!-- SPARK-1121: Adds an explicit dependency on Avro to work around
+      a Hadoop 0.23.X issue -->
+    <id>yarn-alpha</id>
+    <dependencies>
+      <dependency>
+        <groupId>org.apache.avro</groupId>
+        <artifactId>avro</artifactId>
+      </dependency>
+    </dependencies>
+  </profile>
+</profiles>

<dependencies>
<dependency>
14 changes: 7 additions & 7 deletions bagel/src/main/scala/org/apache/spark/bagel/Bagel.scala
@@ -27,7 +27,7 @@ object Bagel extends Logging {

/**
* Runs a Bagel program.
-* @param sc [[org.apache.spark.SparkContext]] to use for the program.
+* @param sc org.apache.spark.SparkContext to use for the program.
* @param vertices vertices of the graph represented as an RDD of (Key, Vertex) pairs. Often the
* Key will be the vertex id.
* @param messages initial set of messages represented as an RDD of (Key, Message) pairs. Often
@@ -38,10 +38,10 @@ object Bagel extends Logging {
* @param aggregator [[org.apache.spark.bagel.Aggregator]] performs a reduce across all vertices
* after each superstep and provides the result to each vertex in the next
* superstep.
-* @param partitioner [[org.apache.spark.Partitioner]] partitions values by key
+* @param partitioner org.apache.spark.Partitioner partitions values by key
* @param numPartitions number of partitions across which to split the graph.
* Default is the default parallelism of the SparkContext
-* @param storageLevel [[org.apache.spark.storage.StorageLevel]] to use for caching of
+* @param storageLevel org.apache.spark.storage.StorageLevel to use for caching of
* intermediate RDDs in each superstep. Defaults to caching in memory.
* @param compute function that takes a Vertex, optional set of (possibly combined) messages to
* the Vertex, optional Aggregator and the current superstep,
@@ -131,7 +131,7 @@ object Bagel extends Logging {

/**
* Runs a Bagel program with no [[org.apache.spark.bagel.Aggregator]], default
-* [[org.apache.spark.HashPartitioner]] and default storage level
+* org.apache.spark.HashPartitioner and default storage level
*/
def run[K: Manifest, V <: Vertex : Manifest, M <: Message[K] : Manifest, C: Manifest](
sc: SparkContext,
@@ -146,7 +146,7 @@

/**
* Runs a Bagel program with no [[org.apache.spark.bagel.Aggregator]] and the
-* default [[org.apache.spark.HashPartitioner]]
+* default org.apache.spark.HashPartitioner
*/
def run[K: Manifest, V <: Vertex : Manifest, M <: Message[K] : Manifest, C: Manifest](
sc: SparkContext,
@@ -166,7 +166,7 @@

/**
* Runs a Bagel program with no [[org.apache.spark.bagel.Aggregator]],
-* default [[org.apache.spark.HashPartitioner]],
+* default org.apache.spark.HashPartitioner,
* [[org.apache.spark.bagel.DefaultCombiner]] and the default storage level
*/
def run[K: Manifest, V <: Vertex : Manifest, M <: Message[K] : Manifest](
@@ -180,7 +180,7 @@

/**
* Runs a Bagel program with no [[org.apache.spark.bagel.Aggregator]],
-* the default [[org.apache.spark.HashPartitioner]]
+* the default org.apache.spark.HashPartitioner
* and [[org.apache.spark.bagel.DefaultCombiner]]
*/
def run[K: Manifest, V <: Vertex : Manifest, M <: Message[K] : Manifest](
39 changes: 27 additions & 12 deletions core/pom.xml
@@ -21,15 +21,29 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent</artifactId>
-<version>1.0.0-incubating-SNAPSHOT</version>
+<version>1.0.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<packaging>jar</packaging>
<name>Spark Project Core</name>
-<url>http://spark.incubator.apache.org/</url>
+<url>http://spark.apache.org/</url>

+<!-- SPARK-1121: Adds an explicit dependency on Avro to work around
+  a Hadoop 0.23.X issue -->
+<profiles>
+  <profile>
+    <id>yarn-alpha</id>
+    <dependencies>
+      <dependency>
+        <groupId>org.apache.avro</groupId>
+        <artifactId>avro</artifactId>
+      </dependency>
+    </dependencies>
+  </profile>
+</profiles>

<dependencies>
<dependency>
@@ -46,14 +60,6 @@
</exclusion>
</exclusions>
</dependency>
-<dependency>
-  <groupId>org.apache.avro</groupId>
-  <artifactId>avro</artifactId>
-</dependency>
-<dependency>
-  <groupId>org.apache.avro</groupId>
-  <artifactId>avro-ipc</artifactId>
-</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
@@ -133,6 +139,15 @@
<groupId>org.json4s</groupId>
<artifactId>json4s-jackson_${scala.binary.version}</artifactId>
<version>3.2.6</version>
+<!-- see also exclusion for lift-json; this is necessary since it depends on
+  scala-library and scalap 2.10.0, but we use 2.10.3, and only override
+  scala-library -->
+<exclusions>
+  <exclusion>
+    <groupId>org.scala-lang</groupId>
+    <artifactId>scalap</artifactId>
+  </exclusion>
+</exclusions>
</dependency>
<dependency>
<groupId>it.unimi.dsi</groupId>
@@ -225,7 +240,7 @@
</goals>
<configuration>
<exportAntProperties>true</exportAntProperties>
-<tasks>
+<target>
<property name="spark.classpath" refid="maven.test.classpath" />
<property environment="env" />
<fail message="Please set the SCALA_HOME (or SCALA_LIBRARY_PATH if scala is on the path) environment variables and retry.">
@@ -238,7 +253,7 @@
</not>
</condition>
</fail>
-</tasks>
+</target>
</configuration>
</execution>
</executions>
core/src/main/scala/org/apache/spark/api/java/function/DoubleFlatMapFunction.scala → core/src/main/java/org/apache/spark/api/java/function/DoubleFlatMapFunction.java
@@ -15,16 +15,13 @@
* limitations under the License.
*/

-package org.apache.spark.api.java.function
+package org.apache.spark.api.java.function;

-import java.lang.{Double => JDouble, Iterable => JIterable}
+import java.io.Serializable;

/**
* A function that returns zero or more records of type Double from each input record.
*/
-// DoubleFlatMapFunction does not extend FlatMapFunction because flatMap is
-// overloaded for both FlatMapFunction and DoubleFlatMapFunction.
-abstract class DoubleFlatMapFunction[T] extends WrappedFunction1[T, JIterable[JDouble]]
-  with Serializable {
-  // Intentionally left blank
+public interface DoubleFlatMapFunction<T> extends Serializable {
+  public Iterable<Double> call(T t) throws Exception;
}
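
With this change DoubleFlatMapFunction becomes a plain Java interface with a single call method, so Java code can implement it directly with an anonymous class (or, in Java 8, a lambda). A minimal sketch of an implementation, assuming only the interface shown above; the parseReadings name and the sample input are illustrative, not part of this commit:

import org.apache.spark.api.java.function.DoubleFlatMapFunction;

import java.util.ArrayList;
import java.util.List;

public class DoubleFlatMapFunctionExample {
    public static void main(String[] args) throws Exception {
        // Illustrative implementation: split a comma-separated line into
        // its numeric fields, so one input record yields many doubles.
        DoubleFlatMapFunction<String> parseReadings = new DoubleFlatMapFunction<String>() {
            @Override
            public Iterable<Double> call(String line) {
                List<Double> values = new ArrayList<Double>();
                for (String field : line.split(",")) {
                    values.add(Double.parseDouble(field.trim()));
                }
                return values;
            }
        };
        System.out.println(parseReadings.call("1.5, 2.0, 3.25")); // [1.5, 2.0, 3.25]
    }
}

In a Spark program this would be passed to a flatMap-style transformation on a JavaRDD rather than invoked by hand; the direct call here just keeps the sketch free of cluster setup.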
27 changes: 27 additions & 0 deletions core/src/main/java/org/apache/spark/api/java/function/DoubleFunction.java
@@ -0,0 +1,27 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.api.java.function;

import java.io.Serializable;

/**
* A function that returns Doubles, and can be used to construct DoubleRDDs.
*/
public interface DoubleFunction<T> extends Serializable {
public double call(T t) throws Exception;
}
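
DoubleFunction follows the same pattern, returning exactly one primitive double per input record; per its Javadoc it is used to construct DoubleRDDs. A small sketch under the same assumptions (the stringLength name is illustrative):

import org.apache.spark.api.java.function.DoubleFunction;

public class DoubleFunctionExample {
    public static void main(String[] args) throws Exception {
        // Illustrative mapper: each record maps to exactly one double.
        DoubleFunction<String> stringLength = new DoubleFunction<String>() {
            @Override
            public double call(String s) {
                return (double) s.length();
            }
        };
        System.out.println(stringLength.call("spark")); // 5.0
    }
}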
core/src/main/scala/org/apache/spark/api/java/function/FlatMapFunction.scala → core/src/main/java/org/apache/spark/api/java/function/FlatMapFunction.java
@@ -15,13 +15,13 @@
* limitations under the License.
*/

-package org.apache.spark.api.java.function
+package org.apache.spark.api.java.function;

-import scala.reflect.ClassTag
+import java.io.Serializable;

/**
* A function that returns zero or more output records from each input record.
*/
-abstract class FlatMapFunction[T, R] extends Function[T, java.lang.Iterable[R]] {
-  def elementType(): ClassTag[R] = ClassTag.Any.asInstanceOf[ClassTag[R]]
-}
+public interface FlatMapFunction<T, R> extends Serializable {
+  public Iterable<R> call(T t) throws Exception;
+}
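
FlatMapFunction is the general zero-or-more-records mapper; note that the Scala version's ClassTag plumbing (elementType) disappears entirely in the Java interface. A hedged sketch of the classic tokenizer use case, again assuming only the interface above:

import org.apache.spark.api.java.function.FlatMapFunction;

import java.util.Arrays;

public class FlatMapFunctionExample {
    public static void main(String[] args) throws Exception {
        // Illustrative tokenizer: one line in, zero or more words out.
        FlatMapFunction<String, String> tokenize = new FlatMapFunction<String, String>() {
            @Override
            public Iterable<String> call(String line) {
                return Arrays.asList(line.split(" "));
            }
        };
        System.out.println(tokenize.call("to be or not to be"));
    }
}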
core/src/main/scala/org/apache/spark/api/java/function/FlatMapFunction2.scala → core/src/main/java/org/apache/spark/api/java/function/FlatMapFunction2.java
@@ -15,13 +15,13 @@
* limitations under the License.
*/

-package org.apache.spark.api.java.function
+package org.apache.spark.api.java.function;

-import scala.reflect.ClassTag
+import java.io.Serializable;

/**
* A function that takes two inputs and returns zero or more output records.
*/
-abstract class FlatMapFunction2[A, B, C] extends Function2[A, B, java.lang.Iterable[C]] {
-  def elementType() : ClassTag[C] = ClassTag.Any.asInstanceOf[ClassTag[C]]
-}
+public interface FlatMapFunction2<T1, T2, R> extends Serializable {
+  public Iterable<R> call(T1 t1, T2 t2) throws Exception;
+}
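
FlatMapFunction2 is the two-argument variant; in later Spark releases it is the shape accepted by partition-wise operations such as JavaRDD.zipPartitions, though that usage is an assumption here, not shown in this diff. A sketch that zips two iterators in lockstep, with all names and data illustrative:

import org.apache.spark.api.java.function.FlatMapFunction2;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FlatMapFunction2Example {
    public static void main(String[] args) throws Exception {
        // Illustrative pairwise zip: walk both iterators in lockstep and
        // emit "n:s" strings until either side is exhausted.
        FlatMapFunction2<Iterator<Integer>, Iterator<String>, String> zip =
                new FlatMapFunction2<Iterator<Integer>, Iterator<String>, String>() {
                    @Override
                    public Iterable<String> call(Iterator<Integer> nums, Iterator<String> names) {
                        List<String> out = new ArrayList<String>();
                        while (nums.hasNext() && names.hasNext()) {
                            out.add(nums.next() + ":" + names.next());
                        }
                        return out;
                    }
                };
        System.out.println(zip.call(Arrays.asList(1, 2).iterator(),
                Arrays.asList("a", "b").iterator())); // [1:a, 2:b]
    }
}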
[Diff truncated: the remaining changed files are not shown.]