
Updating to latest Spark master #2

Merged — 359 commits, merged on Feb 27, 2015
6cc96cf
[Spark-5717] [MLlib] add stop and reorganize import
Feb 10, 2015
c7ad80a
[SPARK-5716] [SQL] Support TOK_CHARSETLITERAL in HiveQl
adrian-wang Feb 10, 2015
69bc3bb
SPARK-4136. Under dynamic allocation, cancel outstanding executor req…
sryza Feb 10, 2015
b640c84
[HOTFIX][SPARK-4136] Fix compilation and tests
Feb 10, 2015
59272da
[SPARK-5592][SQL] java.net.URISyntaxException when insert data to a p…
scwf Feb 10, 2015
c49a404
[SPARK-5668] Display region in spark_ec2.py get_existing_cluster()
MiguelPeralvo Feb 10, 2015
de80b1b
[SQL] Add toString to DataFrame/Column
marmbrus Feb 10, 2015
f98707c
[SPARK-5686][SQL] Add show current roles command in HiveQl
OopsOutOfMemory Feb 10, 2015
fd2c032
[SPARK-5021] [MLlib] Gaussian Mixture now supports Sparse Input
MechCoder Feb 10, 2015
5820961
[SPARK-5343][GraphX]: ShortestPaths traverses backwards
Feb 10, 2015
52983d7
[SPARK-5644] [Core]Delete tmp dir when sc is stop
Sephiroth-Lin Feb 10, 2015
91e3512
[SQL][Minor] correct some comments
OopsOutOfMemory Feb 11, 2015
2d50a01
[SPARK-5725] [SQL] Fixes ParquetRelation2.equals
liancheng Feb 11, 2015
e28b6bd
[SQL] Make Options in the data source API CREATE TABLE statements opt…
yhuai Feb 11, 2015
ed167e7
[SPARK-5493] [core] Add option to impersonate user.
Feb 11, 2015
aaf50d0
[SPARK-5658][SQL] Finalize DDL and write support APIs
yhuai Feb 11, 2015
6195e24
[SQL] Add an exception for analysis errors.
marmbrus Feb 11, 2015
a60aea8
[SPARK-5683] [SQL] Avoid multiple json generator created
chenghao-intel Feb 11, 2015
ea60284
[SPARK-5704] [SQL] [PySpark] createDataFrame from RDD with columns
Feb 11, 2015
45df77b
[SPARK-5709] [SQL] Add EXPLAIN support in DataFrame API for debugging…
chenghao-intel Feb 11, 2015
7e24249
[SQL][DataFrame] Fix column computability bug.
rxin Feb 11, 2015
1cb3770
[SPARK-4879] Use driver to coordinate Hadoop output committing for sp…
mccheah Feb 11, 2015
b969182
[SPARK-5729] Potential NPE in standalone REST API
Feb 11, 2015
b8f88d3
[SPARK-5702][SQL] Allow short names for built-in data sources.
rxin Feb 11, 2015
f86a89a
[SPARK-5714][Mllib] Refactor initial step of LDA to remove redundant …
viirya Feb 11, 2015
7e2f882
HOTFIX: Java 6 compilation error in Spark SQL
pwendell Feb 11, 2015
c2131c0
HOTFIX: Adding Junit to Hive tests for Maven build
pwendell Feb 11, 2015
658687b
[SPARK-4964] [Streaming] refactor createRDD to take leaders via map i…
koeninger Feb 11, 2015
da89720
SPARK-5728 [STREAMING] MQTTStreamSuite leaves behind ActiveMQ databas…
srowen Feb 11, 2015
bd0d6e0
SPARK-5727 [BUILD] Deprecate Debian packaging
srowen Feb 11, 2015
1ac099e
[SPARK-5733] Error Link in Pagination of HistroyPage when showing Inc…
Feb 11, 2015
b694eb9
[SPARK-5677] [SPARK-5734] [SQL] [PySpark] Python DataFrame API remain…
Feb 11, 2015
03bf704
Remove outdated remark about take(n).
darabos Feb 11, 2015
a60d2b7
[SPARK-5454] More robust handling of self joins
marmbrus Feb 11, 2015
44b2311
[SPARK-3688][SQL]LogicalPlan can't resolve column correctlly
tianyi Feb 11, 2015
fa6bdc6
[SPARK-3688][SQL] More inline comments for LogicalPlan.
rxin Feb 11, 2015
d931b01
[SQL] Two DataFrame fixes.
rxin Feb 12, 2015
a38e23c
[SQL] Make dataframe more tolerant of being serialized
marmbrus Feb 12, 2015
9a3ea49
SPARK-5727 [BUILD] Remove Debian packaging
srowen Feb 12, 2015
9a6efbc
ignore cache paths for RAT tests
orenmazor Feb 12, 2015
466b1f6
[SPARK-5655] Don't chmod700 application files if running in YARN
growse Feb 12, 2015
99bd500
[SPARK-5757][MLLIB] replace SQL JSON usage in model import/export by …
mengxr Feb 12, 2015
bc57789
SPARK-5776 JIRA version not of form x.y.z breaks merge_spark_pr.py
srowen Feb 12, 2015
6a1be02
[SQL][DOCS] Update sql documentation
ajnavarro Feb 12, 2015
aa4ca8b
[SQL] Improve error messages
marmbrus Feb 12, 2015
893d6fd
[SPARK-5645] Added local read bytes/time to task metrics
kayousterhout Feb 12, 2015
9c80765
[EC2] Update default Spark version to 1.2.1
potix2 Feb 12, 2015
629d014
[SPARK-5765][Examples]Fixed word split problem in run-example and com…
gvramana Feb 12, 2015
47c73d4
[SPARK-5762] Fix shuffle write time for sort-based shuffle
kayousterhout Feb 12, 2015
1d5663e
[SPARK-5760][SPARK-5761] Fix standalone rest protocol corner cases + …
Feb 12, 2015
947b8bd
[SPARK-5759][Yarn]ExecutorRunnable should catch YarnException while N…
lianhuiwang Feb 12, 2015
26c816e
SPARK-5747: Fix wordsplitting bugs in make-distribution.sh
dyross Feb 12, 2015
0bf0315
[SPARK-5780] [PySpark] Mute the logging during unit tests
Feb 12, 2015
c352ffb
[SPARK-5758][SQL] Use LongType as the default type for integers in JS…
yhuai Feb 12, 2015
ee04a8b
[SPARK-5573][SQL] Add explode to dataframes
marmbrus Feb 12, 2015
d5fc514
[SPARK-5755] [SQL] remove unnecessary Add
adrian-wang Feb 12, 2015
ada993e
[SPARK-5335] Fix deletion of security groups within a VPC
Feb 12, 2015
c025a46
[SQL] Move SaveMode to SQL package.
yhuai Feb 12, 2015
1d0596a
[SPARK-3299][SQL]Public API in SQLContext to list tables
yhuai Feb 13, 2015
2aea892
[SQL] Fix docs of SQLContext.tables
yhuai Feb 13, 2015
1c8633f
[SPARK-3365][SQL]Wrong schema generated for List type
tianyi Feb 13, 2015
1768bd5
[SPARK-4832][Deploy]some other processes might take the daemon pid
WangTaoTheTonic Feb 13, 2015
c0ccd25
[SPARK-5732][CORE]:Add an option to print the spark version in spark …
uncleGen Feb 13, 2015
e1a1ff8
[SPARK-5503][MLLIB] Example code for Power Iteration Clustering
sboeschhuawei Feb 13, 2015
fc6d3e7
[SPARK-5783] Better eventlog-parsing error messages
ryan-williams Feb 13, 2015
077eec2
[SPARK-5735] Replace uses of EasyMock with Mockito
JoshRosen Feb 13, 2015
9f31db0
SPARK-5805 Fixed the type error in documentation.
emres Feb 13, 2015
378c7eb
[HOTFIX] Ignore DirectKafkaStreamSuite.
rxin Feb 13, 2015
5d3cc6b
[HOTFIX] Fix build break in MesosSchedulerBackendSuite
Feb 13, 2015
2cbb3e4
[SPARK-5642] [SQL] Apply column pruning on unused aggregation fields
adrian-wang Feb 13, 2015
2e0c084
[SPARK-5789][SQL]Throw a better error message if JsonRDD.parseJson en…
yhuai Feb 13, 2015
cc56c87
[SPARK-5806] re-organize sections in mllib-clustering.md
mengxr Feb 13, 2015
d50a91d
[SPARK-5803][MLLIB] use ArrayBuilder to build primitive arrays
mengxr Feb 14, 2015
4f4c6d5
[SPARK-5730][ML] add doc groups to spark.ml components
mengxr Feb 14, 2015
d06d5ee
[SPARK-5227] [SPARK-5679] Disable FileSystem cache in WholeTextFileRe…
JoshRosen Feb 14, 2015
0ce4e43
SPARK-3290 [GRAPHX] No unpersist callls in SVDPlusPlus
srowen Feb 14, 2015
e98dfe6
[SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
rxin Feb 14, 2015
f80e262
[SPARK-5800] Streaming Docs. Change linked files according the select…
gasparms Feb 14, 2015
15a2ab5
Revise formatting of previous commit f80e2629bb74bc62960c61ff313f7e78…
srowen Feb 14, 2015
ed5f4bb
SPARK-5822 [BUILD] cannot import src/main/scala & src/test/scala into…
ligangty Feb 14, 2015
c771e47
[SPARK-5827][SQL] Add missing import in the example of SqlContext
maropu Feb 15, 2015
61eb126
[MLLIB][SPARK-5502] User guide for isotonic regression
zapletal-martin Feb 15, 2015
836577b
SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libg…
srowen Feb 15, 2015
cd4a153
[SPARK-5769] Set params in constructors and in setParams in Python ML…
mengxr Feb 16, 2015
acf2558
SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatri…
srowen Feb 16, 2015
c78a12c
[Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline
petro-rudenko Feb 16, 2015
d51d6ba
[Ml] SPARK-5804 Explicitly manage cache in Crossvalidator k-fold loop
petro-rudenko Feb 16, 2015
199a9e8
[Minor] [SQL] Renames stringRddToDataFrame to stringRddToDataFrameHol…
liancheng Feb 16, 2015
3ce58cf
[SPARK-4553] [SPARK-5767] [SQL] Wires Parquet data source with the ne…
liancheng Feb 16, 2015
1115e8e
[SPARK-5831][Streaming]When checkpoint file size is bigger than 10, t…
XuTingjun Feb 16, 2015
a3afa4a
SPARK-5815 [MLLIB] Part 2. Deprecate SVDPlusPlus APIs that expose Dou…
srowen Feb 16, 2015
5c78be7
[SPARK-5799][SQL] Compute aggregation function on specified numeric c…
viirya Feb 16, 2015
9baac56
Minor fixes for commit https://github.com/apache/spark/pull/4592.
rxin Feb 16, 2015
8e25373
SPARK-5795 [STREAMING] api.java.JavaPairDStream.saveAsNewAPIHadoopFil…
srowen Feb 16, 2015
cc552e0
[SQL] [Minor] Update the SpecificMutableRow.copy
chenghao-intel Feb 16, 2015
275a0c0
[SPARK-5824] [SQL] add null format in ctas and set default col commen…
adrian-wang Feb 16, 2015
104b2c4
[SQL] Initial support for reporting location of error in sql string
marmbrus Feb 16, 2015
b4d7c70
[SQL] Add fetched row count in SparkSQLCLIDriver
OopsOutOfMemory Feb 16, 2015
6f54dee
[SPARK-5296] [SQL] Add more filter types for data sources API
liancheng Feb 16, 2015
c51ab37
[SPARK-5833] [SQL] Adds REFRESH TABLE command
liancheng Feb 16, 2015
bb05982
SPARK-5841: remove DiskBlockManager shutdown hook on stop
Feb 16, 2015
c01c4eb
SPARK-5357: Update commons-codec version to 1.10 (current)
Feb 16, 2015
0cfda84
[SPARK-2313] Use socket to communicate GatewayServer port back to Pyt…
JoshRosen Feb 16, 2015
04b401d
HOTFIX: Break in Jekyll build from #4589
pwendell Feb 16, 2015
5b6cd65
[SPARK-5746][SQL] Check invalid cases for the write path of data sour…
yhuai Feb 16, 2015
f3ff1eb
[SPARK-5839][SQL]HiveMetastoreCatalog does not recognize table names …
yhuai Feb 16, 2015
cb6c48c
[SQL] Optimize arithmetic and predicate operators
Feb 16, 2015
e189cbb
[SPARK-4865][SQL]Include temporary tables in SHOW TABLES
yhuai Feb 16, 2015
1294a6e
SPARK-5848: tear down the ConsoleProgressBar timer
Feb 17, 2015
b1bd1dd
[SPARK-5788] [PySpark] capture the exception in python write thread
Feb 17, 2015
1668765
[SPARK-3340] Deprecate ADD_JARS and ADD_FILES
azagrebin Feb 17, 2015
58a82a7
[SPARK-5849] Handle more types of invalid JSON requests in SubmitRest…
JoshRosen Feb 17, 2015
0e180bf
[SQL] Various DataFrame doc changes.
rxin Feb 17, 2015
ac6fe67
[SPARK-5363] [PySpark] check ending mark in non-block way
Feb 17, 2015
a51d51f
SPARK-5850: Remove experimental label for Scala 2.11 and FlumePolling…
pwendell Feb 17, 2015
d380f32
[SPARK-5853][SQL] Schema support in Row.
rxin Feb 17, 2015
fd84229
[SPARK-5802][MLLIB] cache transformed data in glm
mengxr Feb 17, 2015
c06e42f
HOTFIX: Style issue causing build break
pwendell Feb 17, 2015
a65766b
[SPARK-5826][Streaming] Fix Configuration not serializable problem
jerryshao Feb 17, 2015
ee6e3ef
Revert "[SPARK-5363] [PySpark] check ending mark in non-block way"
JoshRosen Feb 17, 2015
3ce46e9
SPARK-5856: In Maven build script, launch Zinc with more memory
pwendell Feb 17, 2015
c76da36
[SPARK-5858][MLLIB] Remove unnecessary first() call in GLM
mengxr Feb 17, 2015
c74b07f
[SPARK-5166][SPARK-5247][SPARK-5258][SQL] API Cleanup / Documentation
marmbrus Feb 17, 2015
d8adefe
[SPARK-5859] [PySpark] [SQL] fix DataFrame Python API
Feb 17, 2015
d8f69cf
[SPARK-5778] throw if nonexistent metrics config file provided
ryan-williams Feb 17, 2015
b271c26
[SPARK-5661]function hasShutdownDeleteTachyonDir should use shutdownD…
viper-kun Feb 17, 2015
9b746f3
[SPARK-3381] [MLlib] Eliminate bins for unordered features in Decisio…
MechCoder Feb 17, 2015
24f358b
MAINTENANCE: Automated closing of pull requests.
pwendell Feb 17, 2015
49c19fd
SPARK-5841 [CORE] [HOTFIX] Memory leak in DiskBlockManager
srowen Feb 17, 2015
fc4eb95
[SPARK-5864] [PySpark] support .jar as python package
Feb 17, 2015
31efb39
[Minor] fix typo in SQL document
CodingCat Feb 17, 2015
4611de1
[SPARK-5862][SQL] Only transformUp the given plan once in HiveMetasto…
viirya Feb 17, 2015
ac506b7
[Minor][SQL] Use same function to check path parameter in JSONRelation
viirya Feb 17, 2015
9d281fa
[SQL] [Minor] Update the HiveContext Unittest
chenghao-intel Feb 17, 2015
de4836f
[SPARK-5868][SQL] Fix python UDFs in HiveContext and checks in SQLCon…
marmbrus Feb 17, 2015
445a755
[SPARK-4172] [PySpark] Progress API in Python
Feb 17, 2015
3df85dc
[SPARK-5871] output explain in Python
Feb 17, 2015
4d4cc76
[SPARK-5872] [SQL] create a sqlCtx in pyspark shell
Feb 17, 2015
117121a
[SPARK-5852][SQL]Fail to convert a newly created empty metastore parq…
yhuai Feb 17, 2015
c3d2b90
[SPARK-5785] [PySpark] narrow dependency for cogroup/join in PySpark
Feb 18, 2015
ae6cfb3
[SPARK-5811] Added documentation for maven coordinates and added Spar…
brkyvz Feb 18, 2015
d46d624
[SPARK-4454] Properly synchronize accesses to DAGScheduler cacheLocs map
JoshRosen Feb 18, 2015
a51fc7e
[SPARK-4454] Revert getOrElse() cleanup in DAGScheduler.getCacheLocs()
JoshRosen Feb 18, 2015
d5f12bf
[SPARK-5875][SQL]logical.Project should not be resolved if it contain…
yhuai Feb 18, 2015
e50934f
[SPARK-5723][SQL]Change the default file format to Parquet for CTAS s…
yhuai Feb 18, 2015
3912d33
[SPARK-5731][Streaming][Test] Fix incorrect test in DirectKafkaStream…
tdas Feb 18, 2015
61ab085
[Minor] [SQL] Cleans up DataFrame variable names and toDF() calls
liancheng Feb 18, 2015
de0dd6d
Avoid deprecation warnings in JDBCSuite.
tmyklebu Feb 18, 2015
c1b6fa9
[SPARK-5878] fix DataFrame.repartition() in Python
Feb 18, 2015
e79a7a6
SPARK-4610 addendum: [Minor] [MLlib] Minor doc fix in GBT classificat…
MechCoder Feb 18, 2015
82197ed
[SPARK-4949]shutdownCallback in SparkDeploySchedulerBackend should be…
sarutak Feb 18, 2015
5aecdcf
SPARK-5669 [BUILD] [HOTFIX] Spark assembly includes incompatibly lice…
srowen Feb 18, 2015
85e9d09
[SPARK-5519][MLLIB] add user guide with example code for fp-growth
mengxr Feb 18, 2015
a8eb92d
[SPARK-5507] Added documentation for BlockMatrix
brkyvz Feb 18, 2015
f0e3b71
[SPARK-5840][SQL] HiveContext cannot be serialized due to tuple extra…
rxin Feb 18, 2015
aa8f10e
[SPARK-5722] [SQL] [PySpark] infer int as LongType
Feb 18, 2015
d12d2ad
[SPARK-5879][MLLIB] update PIC user guide and add a Java example
mengxr Feb 19, 2015
e945aa6
[SPARK-5846] Correctly set job description and pool for SQL jobs
kayousterhout Feb 19, 2015
fb87f44
SPARK-5548: Fix for AkkaUtilsSuite failure - attempt 2
jacek-lewandowski Feb 19, 2015
38e624a
[SPARK-5816] Add huge compatibility warning in DriverWrapper
Feb 19, 2015
90095bf
[SPARK-5423][Core] Cleanup resources in DiskMapIterator.finalize to e…
zsxwing Feb 19, 2015
94cdb05
[SPARK-5825] [Spark Submit] Remove the double checking instance name …
chenghao-intel Feb 19, 2015
8ca3418
[SPARK-5904][SQL] DataFrame API fixes.
rxin Feb 19, 2015
a5fed34
[SPARK-5902] [ml] Made PipelineStage.transformSchema public instead o…
jkbradley Feb 19, 2015
ad6b169
[Spark-5889] Remove pid file after stopping service.
zhzhan Feb 19, 2015
34b7c35
SPARK-4682 [CORE] Consolidate various 'Clock' classes
srowen Feb 19, 2015
6bddc40
SPARK-5570: No docs stating that `new SparkConf().set("spark.driver.m…
Feb 19, 2015
0cfd2ce
[SPARK-5900][MLLIB] make PIC and FPGrowth Java-friendly
mengxr Feb 20, 2015
3be92cd
[SPARK-4808] Removing minimum number of elements read before spill check
mccheah Feb 20, 2015
70bfb5c
[SPARK-5909][SQL] Add a clearCache command to Spark SQL's cache manager
yhuai Feb 20, 2015
d3dfebe
SPARK-5744 [CORE] Take 2. RDD.isEmpty / take fails for (empty) RDD of…
srowen Feb 20, 2015
4a17eed
[SPARK-5867] [SPARK-5892] [doc] [ml] [mllib] Doc cleanups for 1.3 rel…
jkbradley Feb 20, 2015
5b0a42c
[SPARK-5898] [SPARK-5896] [SQL] [PySpark] create DataFrame from pand…
Feb 20, 2015
e155324
[MLlib] fix typo
jackylk Feb 21, 2015
d3cbd38
SPARK-5841 [CORE] [HOTFIX 2] Memory leak in DiskBlockManager
nishkamravi2 Feb 21, 2015
7138816
[SPARK-5937][YARN] Fix ClientSuite to set YARN mode, so that the corr…
harishreedharan Feb 21, 2015
7683982
[SPARK-5860][CORE] JdbcRDD: overflow on large range with high number …
hotou Feb 21, 2015
46462ff
MAINTENANCE: Automated closing of pull requests.
pwendell Feb 22, 2015
a7f9039
[DOCS] Fix typo in API for custom InputFormats based on the “new” Map…
Feb 22, 2015
275b1be
[DataFrame] [Typo] Fix the typo
chenghao-intel Feb 22, 2015
e4f9d03
[SPARK-911] allow efficient queries for a range if RDD is partitioned…
aaronjosephs Feb 23, 2015
95cd643
[SPARK-3885] Provide mechanism to remove accumulators once they are n…
Feb 23, 2015
9348767
[EXAMPLES] fix typo.
fukuo33 Feb 23, 2015
757b14b
[SPARK-5943][Streaming] Update the test to use new API to reduce the …
jerryshao Feb 23, 2015
242d495
[SPARK-5724] fix the misconfiguration in AkkaUtils
CodingCat Feb 23, 2015
651a1c0
[SPARK-5939][MLLib] make FPGrowth example app take parameters
jackylk Feb 23, 2015
28ccf5e
[MLLIB] SPARK-5912 Programming guide for feature selection
avulanov Feb 23, 2015
59536cc
[SPARK-5912] [docs] [mllib] Small fixes to ChiSqSelector docs
jkbradley Feb 24, 2015
48376bf
[SPARK-5935][SQL] Accept MapType in the schema provided to a JSON dat…
yhuai Feb 24, 2015
1ed5708
[SPARK-5873][SQL] Allow viewing of partially analyzed plans in queryE…
marmbrus Feb 24, 2015
cf2e416
[SPARK-5958][MLLIB][DOC] update block matrix user guide
mengxr Feb 24, 2015
8403331
[SPARK-5968] [SQL] Suppresses ParquetOutputCommitter WARN logs
liancheng Feb 24, 2015
0a59e45
[SPARK-5910][SQL] Support for as in selectExpr
marmbrus Feb 24, 2015
2012366
[SPARK-5532][SQL] Repartition should not use external rdd representation
marmbrus Feb 24, 2015
64d2c01
[Spark-5967] [UI] Correctly clean JobProgressListener.stageIdToActive…
tdas Feb 24, 2015
6d2caa5
[SPARK-5965] Standalone Worker UI displays {{USER_JAR}}
Feb 24, 2015
105791e
[MLLIB] Change x_i to y_i in Variance's user guide
mengxr Feb 24, 2015
c5ba975
[Spark-5708] Add Slf4jSink to Spark Metrics
judynash Feb 24, 2015
a2b9137
[SPARK-5952][SQL] Lock when using hive metastore client
marmbrus Feb 24, 2015
da505e5
[SPARK-5973] [PySpark] fix zip with two RDDs with AutoBatchedSerializer
Feb 24, 2015
2a0fe34
[SPARK-5436] [MLlib] Validate GradientBoostedTrees using runWithValid…
MechCoder Feb 24, 2015
f816e73
[SPARK-5751] [SQL] [WIP] Revamped HiveThriftServer2Suite for robustness
liancheng Feb 25, 2015
53a1ebf
[SPARK-5904][SQL] DataFrame Java API test suites.
rxin Feb 25, 2015
fba11c2
[SPARK-5985][SQL] DataFrame sortBy -> orderBy in Python.
rxin Feb 25, 2015
922b43b
[SPARK-5993][Streaming][Build] Fix assembly jar location of kafka-ass…
tdas Feb 25, 2015
769e092
[SPARK-5286][SQL] SPARK-5286 followup
yhuai Feb 25, 2015
d641fbb
[SPARK-5994] [SQL] Python DataFrame documentation fixes
Feb 25, 2015
d51ed26
[SPARK-5666][streaming][MQTT streaming] some trivial fixes
prabeesh Feb 25, 2015
5b8480e
[GraphX] fixing 3 typos in the graphx programming guide
1123 Feb 25, 2015
dd077ab
[SPARK-5771] Number of Cores in Completed Applications of Standalone …
Feb 25, 2015
f84c799
[SPARK-5996][SQL] Fix specialized outbound conversions
marmbrus Feb 25, 2015
7d8e6a2
SPARK-5930 [DOCS] Documented default of spark.shuffle.io.retryWait is…
srowen Feb 25, 2015
a777c65
[SPARK-5970][core] Register directory created in getOrCreateLocalRoot…
foxik Feb 25, 2015
9f603fc
[SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing
Feb 25, 2015
838a480
[SPARK-5982] Remove incorrect Local Read Time Metric
kayousterhout Feb 25, 2015
f3f4c87
[SPARK-5944] [PySpark] fix version in Python API docs
Feb 25, 2015
e0fdd46
[SPARK-6010] [SQL] Merging compatible Parquet schemas before computin…
liancheng Feb 25, 2015
12dbf98
[SPARK-5999][SQL] Remove duplicate Literal matching block
viirya Feb 25, 2015
41e2e5a
[SPARK-5926] [SQL] make DataFrame.explain leverage queryExecution.log…
yanboliang Feb 25, 2015
46a044a
[SPARK-1182][Docs] Sort the configuration parameters in configuration.md
Feb 26, 2015
d20559b
[SPARK-5974] [SPARK-5980] [mllib] [python] [docs] Update ML guide wit…
jkbradley Feb 26, 2015
e43139f
[SPARK-5976][MLLIB] Add partitioner to factors returned by ALS
mengxr Feb 26, 2015
51a6f90
[SPARK-5914] to run spark-submit requiring only user perm on windows
judynash Feb 26, 2015
f02394d
[SPARK-6023][SQL] ParquetConversions fails to replace the destination…
yhuai Feb 26, 2015
192e42a
[SPARK-6016][SQL] Cannot read the parquet table after overwriting the…
yhuai Feb 26, 2015
df3d559
[SPARK-5801] [core] Avoid creating nested directories.
Feb 26, 2015
2358657
[SPARK-6007][SQL] Add numRows param in DataFrame.show()
jackylk Feb 26, 2015
cfff397
[SPARK-6004][MLlib] Pick the best model when training GradientBoosted…
viirya Feb 26, 2015
7fa960e
[SPARK-5363] Fix bug in PythonRDD: remove() inside iterator is not safe
Feb 26, 2015
cd5c8d7
SPARK-4704 [CORE] SparkSubmitDriverBootstrap doesn't flush output
srowen Feb 26, 2015
10094a5
Modify default value description for spark.scheduler.minRegisteredRes…
li-zhihui Feb 26, 2015
8942b52
[SPARK-3562]Periodic cleanup event logs
viper-kun Feb 26, 2015
aa63f63
[SPARK-6027][SPARK-5546] Fixed --jar and --packages not working for K…
tdas Feb 26, 2015
5f3238b
[SPARK-6018] [YARN] NoSuchMethodError in Spark app is swallowed by YA…
Feb 26, 2015
3fb53c0
SPARK-4300 [CORE] Race condition during SparkWorker shutdown
srowen Feb 26, 2015
c871e2d
Add a note for context termination for History server on Yarn
moutai Feb 26, 2015
b38dec2
[SPARK-5951][YARN] Remove unreachable driver memory properties in yar…
mohitgoyal557 Feb 26, 2015
e60ad2f
SPARK-6045 RecordWriter should be checked against null in PairRDDFunc…
tedyu Feb 26, 2015
fbc4694
SPARK-4579 [WEBUI] Scheduling Delay appears negative
srowen Feb 27, 2015
18f2098
[SPARK-5529][CORE]Add expireDeadHosts in HeartbeatReceiver
shenh062326 Feb 27, 2015
4ad5153
[SPARK-6037][SQL] Avoiding duplicate Parquet schema merging
viirya Feb 27, 2015
5e5ad65
[SPARK-6024][SQL] When a data source table has too many columns, it's…
yhuai Feb 27, 2015
12135e9
[SPARK-5771][UI][hotfix] Change Requested Cores into * if default cor…
jerryshao Feb 27, 2015
67595eb
[SPARK-5495][UI] Add app and driver kill function in master web UI
jerryshao Feb 27, 2015
4a8a0a8
SPARK-2168 [Spark core] Use relative URIs for the app links in the Hi…
elyast Feb 27, 2015
7c99a01
[SPARK-6046] Privatize SparkConf.translateConfKey
Feb 27, 2015
0375a41
fix spark-6033, clarify the spark.worker.cleanup behavior in standalo…
Feb 27, 2015
8cd1692
[SPARK-6036][CORE] avoid race condition between eventlogListener and …
liyezhang556520 Feb 27, 2015
e747e98
[SPARK-6058][Yarn] Log the user class exception in ApplicationMaster
zsxwing Feb 27, 2015
57566d0
[SPARK-6059][Yarn] Add volatile to ApplicationMaster's reporterThread…
zsxwing Feb 27, 2015
2 changes: 2 additions & 0 deletions .rat-excludes
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
target
cache
.gitignore
.gitattributes
.project
@@ -18,6 +19,7 @@ fairscheduler.xml.template
spark-defaults.conf.template
log4j.properties
log4j.properties.template
metrics.properties
metrics.properties.template
slaves
slaves.template
121 changes: 10 additions & 111 deletions assembly/pom.xml
@@ -36,10 +36,6 @@
<spark.jar.dir>scala-${scala.binary.version}</spark.jar.dir>
<spark.jar.basename>spark-assembly-${project.version}-hadoop${hadoop.version}.jar</spark.jar.basename>
<spark.jar>${project.build.directory}/${spark.jar.dir}/${spark.jar.basename}</spark.jar>
<deb.pkg.name>spark</deb.pkg.name>
<deb.install.path>/usr/share/spark</deb.install.path>
<deb.user>root</deb.user>
<deb.bin.filemode>744</deb.bin.filemode>
</properties>

<dependencies>
@@ -118,6 +114,16 @@
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
<filter>
<!-- Exclude libgfortran, libgcc for license issues -->
<artifact>org.jblas:jblas</artifact>
<excludes>
<!-- Linux amd64 is OK; not statically linked -->
<exclude>lib/static/Linux/i386/**</exclude>
<exclude>lib/static/Mac OS X/**</exclude>
<exclude>lib/static/Windows/**</exclude>
</excludes>
</filter>
</filters>
</configuration>
<executions>
@@ -217,113 +223,6 @@
</plugins>
</build>
</profile>
<profile>
<id>deb</id>
<build>
<plugins>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>buildnumber-maven-plugin</artifactId>
<version>1.2</version>
<executions>
<execution>
<phase>validate</phase>
<goals>
<goal>create</goal>
</goals>
<configuration>
<shortRevisionLength>8</shortRevisionLength>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.vafer</groupId>
<artifactId>jdeb</artifactId>
<version>0.11</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>jdeb</goal>
</goals>
<configuration>
<deb>${project.build.directory}/${deb.pkg.name}_${project.version}-${buildNumber}_all.deb</deb>
<attach>false</attach>
<compression>gzip</compression>
<dataSet>
<data>
<src>${spark.jar}</src>
<type>file</type>
<mapper>
<type>perm</type>
<user>${deb.user}</user>
<group>${deb.user}</group>
<prefix>${deb.install.path}/jars</prefix>
</mapper>
</data>
<data>
<src>${basedir}/src/deb/RELEASE</src>
<type>file</type>
<mapper>
<type>perm</type>
<user>${deb.user}</user>
<group>${deb.user}</group>
<prefix>${deb.install.path}</prefix>
</mapper>
</data>
<data>
<src>${basedir}/../conf</src>
<type>directory</type>
<mapper>
<type>perm</type>
<user>${deb.user}</user>
<group>${deb.user}</group>
<prefix>${deb.install.path}/conf</prefix>
<filemode>744</filemode>
</mapper>
</data>
<data>
<src>${basedir}/../bin</src>
<type>directory</type>
<mapper>
<type>perm</type>
<user>${deb.user}</user>
<group>${deb.user}</group>
<prefix>${deb.install.path}/bin</prefix>
<filemode>${deb.bin.filemode}</filemode>
</mapper>
</data>
<data>
<src>${basedir}/../sbin</src>
<type>directory</type>
<mapper>
<type>perm</type>
<user>${deb.user}</user>
<group>${deb.user}</group>
<prefix>${deb.install.path}/sbin</prefix>
<filemode>744</filemode>
</mapper>
</data>
<data>
<src>${basedir}/../python</src>
<type>directory</type>
<mapper>
<type>perm</type>
<user>${deb.user}</user>
<group>${deb.user}</group>
<prefix>${deb.install.path}/python</prefix>
<filemode>744</filemode>
</mapper>
</data>
</dataSet>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>
<profile>
<id>kinesis-asl</id>
<dependencies>
2 changes: 0 additions & 2 deletions assembly/src/deb/RELEASE

This file was deleted.

8 changes: 0 additions & 8 deletions assembly/src/deb/control/control

This file was deleted.

4 changes: 2 additions & 2 deletions bin/compute-classpath.sh
@@ -76,7 +76,7 @@ fi

num_jars=0

for f in ${assembly_folder}/spark-assembly*hadoop*.jar; do
for f in "${assembly_folder}"/spark-assembly*hadoop*.jar; do
if [[ ! -e "$f" ]]; then
echo "Failed to find Spark assembly in $assembly_folder" 1>&2
echo "You need to build Spark before running this program." 1>&2
@@ -88,7 +88,7 @@ done

if [ "$num_jars" -gt "1" ]; then
echo "Found multiple Spark assembly jars in $assembly_folder:" 1>&2
ls ${assembly_folder}/spark-assembly*hadoop*.jar 1>&2
ls "${assembly_folder}"/spark-assembly*hadoop*.jar 1>&2
echo "Please remove all but one jar." 1>&2
exit 1
fi
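The compute-classpath.sh hunks above do one thing: they double-quote `${assembly_folder}` before the glob. A minimal sketch of why this matters — an unquoted expansion is word-split on spaces before the glob runs, so an install path containing a space never matches. The directory and jar names below are made up for illustration:

```shell
#!/usr/bin/env bash
# Quoting the variable keeps a space-containing path intact; the glob
# suffix stays outside the quotes so it still expands.
assembly_folder=$(mktemp -d "/tmp/spark assembly.XXXXXX")  # path with a space
touch "$assembly_folder/spark-assembly-1.2.1-hadoop2.4.0.jar"

num_jars=0
for f in "${assembly_folder}"/spark-assembly*hadoop*.jar; do
  [[ -e "$f" ]] && num_jars=$((num_jars + 1))
done
echo "jars found: $num_jars"   # → jars found: 1

rm -rf "$assembly_folder"
```

With the quotes removed from the `for` line, the same loop finds zero jars, which is exactly the failure mode the patch closes.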
4 changes: 2 additions & 2 deletions bin/run-example
@@ -42,7 +42,7 @@ fi

JAR_COUNT=0

for f in ${JAR_PATH}/spark-examples-*hadoop*.jar; do
for f in "${JAR_PATH}"/spark-examples-*hadoop*.jar; do
if [[ ! -e "$f" ]]; then
echo "Failed to find Spark examples assembly in $FWDIR/lib or $FWDIR/examples/target" 1>&2
echo "You need to build Spark before running this program" 1>&2
@@ -54,7 +54,7 @@ done

if [ "$JAR_COUNT" -gt "1" ]; then
echo "Found multiple Spark examples assembly jars in ${JAR_PATH}" 1>&2
ls ${JAR_PATH}/spark-examples-*hadoop*.jar 1>&2
ls "${JAR_PATH}"/spark-examples-*hadoop*.jar 1>&2
echo "Please remove all but one jar." 1>&2
exit 1
fi
Empty file modified bin/spark-shell.cmd
100755 → 100644
Empty file.
2 changes: 1 addition & 1 deletion bin/spark-submit2.cmd
@@ -25,7 +25,7 @@ set ORIG_ARGS=%*
rem Reset the values of all variables used
set SPARK_SUBMIT_DEPLOY_MODE=client

if not defined %SPARK_CONF_DIR% (
if [%SPARK_CONF_DIR%] == [] (
set SPARK_CONF_DIR=%SPARK_HOME%\conf
)
set SPARK_SUBMIT_PROPERTIES_FILE=%SPARK_CONF_DIR%\spark-defaults.conf
3 changes: 2 additions & 1 deletion bin/utils.sh
@@ -35,7 +35,8 @@ function gatherSparkSubmitOpts() {
--master | --deploy-mode | --class | --name | --jars | --packages | --py-files | --files | \
--conf | --repositories | --properties-file | --driver-memory | --driver-java-options | \
--driver-library-path | --driver-class-path | --executor-memory | --driver-cores | \
--total-executor-cores | --executor-cores | --queue | --num-executors | --archives)
--total-executor-cores | --executor-cores | --queue | --num-executors | --archives | \
--proxy-user)
if [[ $# -lt 2 ]]; then
"$SUBMIT_USAGE_FUNCTION"
exit 1;
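The utils.sh hunk adds `--proxy-user` to the case arm for options that consume a following value. A standalone sketch of that pattern, with the option list trimmed to three entries for brevity (the validation message is illustrative, not the script's actual wording):

```shell
#!/usr/bin/env bash
# Options listed in the first case arm take a value, so they consume
# two positional arguments; anything else is skipped here.
gatherSparkSubmitOpts() {
  SUBMISSION_OPTS=()
  while (($#)); do
    case "$1" in
      --proxy-user | --master | --conf)
        (($# >= 2)) || { echo "missing value for $1" >&2; return 1; }
        SUBMISSION_OPTS+=("$1" "$2")
        shift 2 ;;
      *)
        shift ;;
    esac
  done
}

gatherSparkSubmitOpts --proxy-user alice --verbose --master local
echo "${SUBMISSION_OPTS[*]}"   # → --proxy-user alice --master local
```

Forgetting to add a new value-taking flag to this list is what the patch fixes: without it, the flag's value would be mis-parsed as a separate argument.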
1 change: 1 addition & 0 deletions bin/windows-utils.cmd
@@ -33,6 +33,7 @@ SET opts="%opts:~1,-1% \<--conf\> \<--properties-file\> \<--driver-memory\> \<--
SET opts="%opts:~1,-1% \<--driver-library-path\> \<--driver-class-path\> \<--executor-memory\>"
SET opts="%opts:~1,-1% \<--driver-cores\> \<--total-executor-cores\> \<--executor-cores\> \<--queue\>"
SET opts="%opts:~1,-1% \<--num-executors\> \<--archives\> \<--packages\> \<--repositories\>"
SET opts="%opts:~1,-1% \<--proxy-user\>"

echo %1 | findstr %opts% >nul
if %ERRORLEVEL% equ 0 (
15 changes: 9 additions & 6 deletions build/mvn
@@ -21,6 +21,8 @@
_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
# Preserve the calling directory
_CALLING_DIR="$(pwd)"
# Options used during compilation
_COMPILE_JVM_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"

# Installs any application tarball given a URL, the expected tarball name,
# and, optionally, a checkable binary path to determine if the binary has
@@ -34,14 +36,14 @@ install_app() {
local binary="${_DIR}/$3"

# setup `curl` and `wget` silent options if we're running on Jenkins
local curl_opts=""
local curl_opts="-L"
local wget_opts=""
if [ -n "$AMPLAB_JENKINS" ]; then
curl_opts="-s"
wget_opts="--quiet"
curl_opts="-s ${curl_opts}"
wget_opts="--quiet ${wget_opts}"
else
curl_opts="--progress-bar"
wget_opts="--progress=bar:force"
curl_opts="--progress-bar ${curl_opts}"
wget_opts="--progress=bar:force ${wget_opts}"
fi

if [ -z "$3" -o ! -f "$binary" ]; then
@@ -136,14 +138,15 @@ cd "${_CALLING_DIR}"
# Now that zinc is ensured to be installed, check its status and, if its
# not running or just installed, start it
if [ -n "${ZINC_INSTALL_FLAG}" -o -z "`${ZINC_BIN} -status`" ]; then
export ZINC_OPTS=${ZINC_OPTS:-"$_COMPILE_JVM_OPTS"}
${ZINC_BIN} -shutdown
${ZINC_BIN} -start -port ${ZINC_PORT} \
-scala-compiler "${SCALA_COMPILER}" \
-scala-library "${SCALA_LIBRARY}" &>/dev/null
fi

# Set any `mvn` options if not already present
export MAVEN_OPTS=${MAVEN_OPTS:-"-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"}
export MAVEN_OPTS=${MAVEN_OPTS:-"$_COMPILE_JVM_OPTS"}

# Last, call the `mvn` command as usual
${MVN_BIN} "$@"
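The build/mvn change factors the JVM flags into `_COMPILE_JVM_OPTS` and applies them through the `${VAR:-default}` expansion, so a caller-supplied value always wins over the shared default. A small sketch of that expansion (the override value is arbitrary):

```shell
#!/usr/bin/env bash
# ${VAR:-default} substitutes the default only when VAR is unset or empty.
_COMPILE_JVM_OPTS="-Xmx2g -XX:MaxPermSize=512M"

MAVEN_OPTS="-Xmx4g"                                  # caller override wins
export MAVEN_OPTS=${MAVEN_OPTS:-"$_COMPILE_JVM_OPTS"}
echo "$MAVEN_OPTS"                                   # → -Xmx4g

unset ZINC_OPTS                                      # unset → default applies
export ZINC_OPTS=${ZINC_OPTS:-"$_COMPILE_JVM_OPTS"}
echo "$ZINC_OPTS"                                    # → -Xmx2g -XX:MaxPermSize=512M
```

Defining the flags once and reusing them for both zinc and Maven is the point of the refactor: the two JVMs stay in sync without duplicating the option string.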
28 changes: 28 additions & 0 deletions build/sbt
@@ -125,4 +125,32 @@ loadConfigFile() {
[[ -f "$etc_sbt_opts_file" ]] && set -- $(loadConfigFile "$etc_sbt_opts_file") "$@"
[[ -f "$sbt_opts_file" ]] && set -- $(loadConfigFile "$sbt_opts_file") "$@"

exit_status=127
saved_stty=""

restoreSttySettings() {
stty $saved_stty
saved_stty=""
}

onExit() {
if [[ "$saved_stty" != "" ]]; then
restoreSttySettings
fi
exit $exit_status
}

saveSttySettings() {
saved_stty=$(stty -g 2>/dev/null)
if [[ ! $? ]]; then
saved_stty=""
fi
}

saveSttySettings
trap onExit INT

run "$@"

exit_status=$?
onExit
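The build/sbt addition above routes both normal completion and Ctrl-C through a single `onExit` handler that restores the terminal settings and propagates the wrapped command's exit status. A minimal sketch of that status plumbing, run in a subshell so the `exit` is observable (the stty restore is elided; `false` stands in for `run "$@"`):

```shell
#!/usr/bin/env bash
# One handler serves both paths: the INT trap fires it on Ctrl-C, and
# the normal path calls it explicitly after capturing $?.
(
  exit_status=127
  onExit() { exit "$exit_status"; }   # real script restores stty first
  trap onExit INT
  false                               # stands in for `run "$@"`
  exit_status=$?
  onExit
)
rc=$?
echo "propagated status: $rc"         # → propagated status: 1
```

Initializing `exit_status=127` means an interrupt that arrives before the command finishes still reports failure rather than a stale zero.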
2 changes: 1 addition & 1 deletion build/sbt-launch-lib.bash
@@ -81,7 +81,7 @@ execRunner () {
echo ""
}

exec "$@"
"$@"
}

addJava () {
9 changes: 9 additions & 0 deletions conf/metrics.properties.template
@@ -122,6 +122,15 @@

#worker.sink.csv.unit=minutes

# Enable Slf4jSink for all instances by class name
#*.sink.slf4j.class=org.apache.spark.metrics.sink.Slf4jSink

# Polling period for Slf4JSink
#*.sink.sl4j.period=1

#*.sink.sl4j.unit=minutes


# Enable jvm source for instance master, worker, driver and executor
#master.source.jvm.class=org.apache.spark.metrics.source.JvmSource

25 changes: 15 additions & 10 deletions core/pom.xml
@@ -132,6 +132,13 @@
<artifactId>jetty-servlet</artifactId>
<scope>compile</scope>
</dependency>
<!-- Because we mark jetty as provided and shade it, its dependency
orbit is ignored, so we explicitly list it here (see SPARK-5557).-->
<dependency>
<groupId>org.eclipse.jetty.orbit</groupId>
<artifactId>javax.servlet</artifactId>
<version>${orbit.version}</version>
</dependency>

<dependency>
<groupId>org.apache.commons</groupId>
@@ -236,6 +243,14 @@
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-graphite</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-scala_2.10</artifactId>
</dependency>
<dependency>
<groupId>org.apache.derby</groupId>
<artifactId>derby</artifactId>
@@ -314,16 +329,6 @@
<artifactId>scalacheck_${scala.binary.version}</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.easymock</groupId>
<artifactId>easymockclassextension</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>asm</groupId>
<artifactId>asm</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>