Try using a queue to keep track of exec request times. #8

Open

wants to merge 513 commits into base: master

Commits (513)
837248a
[MINOR][DOC] Fix documentation for structured streaming - addListener
yeskarthik Feb 18, 2022
3a7eafd
[SPARK-38195][SQL] Add the `TIMESTAMPADD()` function
MaxGekk Feb 18, 2022
9fd9830
[SPARK-38225][SQL] Adjust input `format` of function `to_binary`
xinrong-meng Feb 18, 2022
0fcb560
[SPARK-38138][SQL] Materialize QueryPlan subqueries
pan3793 Feb 18, 2022
a92f873
[SPARK-38215][SQL] InsertIntoHiveDir should use data source if it's c…
AngersZhuuuu Feb 18, 2022
e263b65
[SPARK-38232][SQL] Explain formatted does not collect subqueries unde…
ulysses-you Feb 18, 2022
15532c7
[SPARK-35937][FOLLOW-UP][SQL] GetDateFieldOperations should skip unre…
Ngone51 Feb 18, 2022
e613f08
[SPARK-37867][SQL][FOLLOWUP] Compile aggregate functions for build-in…
beliefer Feb 18, 2022
b5eae59
[SPARK-38094] Enable matching schema column names by field ids
jackierwzhang Feb 18, 2022
42d8c50
[SPARK-38251][SQL] Change Cast.toString as "cast" instead of "ansi_ca…
gengliangwang Feb 18, 2022
4399755
[SPARK-38243][PYTHON][ML] Fix pyspark.ml.LogisticRegression.getThresh…
zero323 Feb 18, 2022
8a70aec
[SPARK-38246][CORE][SQL][SS][WEBUI] Refactor `KVUtils` and add UTs re…
LuciferYang Feb 19, 2022
bc6eb92
Revert "[SPARK-38244][K8S][BUILD] Upgrade kubernetes-client to 5.12.1"
dongjoon-hyun Feb 19, 2022
6ff760d
[SPARK-37154][PYTHON] Inline hints for pyspark.rdd
zero323 Feb 19, 2022
157dc7f
[SPARK-37428][PYTHON][MLLIB] Inline type hints for pyspark.mllib.util
zero323 Feb 19, 2022
06f4ce4
[SPARK-38175][CORE][FOLLOWUP] Remove `urlPattern` from `HistoryAppSta…
LuciferYang Feb 19, 2022
789a510
[SPARK-38249][CORE][GRAPHX] Cleanup unused private methods/fields
LuciferYang Feb 19, 2022
ae67add
[MINOR][DOCS] fix default value of history server
itayB Feb 20, 2022
4789e1f
[SPARK-37090][BUILD] Upgrade `libthrift` to 0.16.0 to avoid security …
wangyum Feb 20, 2022
8985427
[SPARK-38261][INFRA] Add missing R packages from base image
khalidmammadov Feb 21, 2022
b71b917
[SPARK-38236][SQL] Treat table location as absolute when the first le…
bozhang2820 Feb 21, 2022
e2796d2
[SPARK-38227][SQL][SS] Apply strict nullability of nested column in t…
HeartSaVioR Feb 21, 2022
6242145
[SPARK-37475][SQL] Add scale parameter to floor and ceil functions
sathiyapk Feb 21, 2022
17567f8
[SPARK-38140][SQL] Desc column stats (min, max) for timestamp type is…
wzhfy Feb 21, 2022
23119b0
[SPARK-38268][SQL] Hide the "failOnError" field in the toString metho…
gengliangwang Feb 21, 2022
3a750ca
[SPARK-38256][BUILD] Upgarde `scalatestplus-mockito` to 3.2.11.0
LuciferYang Feb 21, 2022
c538d26
[SPARK-37427][PYTHON][MLLIB] Inline typehints for pyspark.mllib.tree
zero323 Feb 21, 2022
a5caf0c
[SPARK-38276][SQL] Add approved TPCDS plans under ANSI mode
gengliangwang Feb 21, 2022
871bbf9
[SPARK-38274][BUILD] Upgarde `JUnit4` to `4.13.2` and upgrade corresp…
LuciferYang Feb 21, 2022
1112240
[SPARK-38259][BUILD] Upgrade Netty to 4.1.74
LuciferYang Feb 22, 2022
881f562
[SPARK-37290][SQL] - Exponential planning time in case of non-determi…
Stelyus Feb 22, 2022
5ebf793
[SPARK-38206][SS] Ignore nullability on comparing the data type of jo…
HeartSaVioR Feb 22, 2022
e6c5687
[SPARK-38155][SQL] Disallow distinct aggregate in lateral subqueries …
allisonwang-db Feb 22, 2022
a103a49
[SPARK-38279][TESTS][3.2] Pin MarkupSafe to 2.0.1 fix linter failure
itholic Feb 22, 2022
48b56c0
[SPARK-38278][PYTHON] Add SparkContext.addArchive in PySpark
HyukjinKwon Feb 22, 2022
ef818ed
[SPARK-38283][SQL] Test invalid datetime parsing under ANSI mode
gengliangwang Feb 22, 2022
c82e0fe
[SPARK-37422][PYTHON][MLLIB] Inline typehints for pyspark.mllib.feature
zero323 Feb 22, 2022
b683279
[SPARK-38271] PoissonSampler may output more rows than MaxRows
zhengruifeng Feb 22, 2022
43822cd
[SPARK-38060][SQL] Respect allowNonNumericNumbers when parsing quoted…
andygrove Feb 22, 2022
bd44611
[SPARK-38290][SQL] Fix JsonSuite and ParquetIOSuite under ANSI mode
gengliangwang Feb 22, 2022
27dbf6f
[SPARK-38291][BUILD][TESTS] Upgrade `postgresql` from 42.3.0 to 42.3.3
bjornjorgensen Feb 22, 2022
a11f799
[SPARK-38121][PYTHON][SQL][FOLLOW-UP] Make df.sparkSession return the…
HyukjinKwon Feb 23, 2022
4d75d47
[SPARK-38062][CORE] Avoid resolving placeholder hostname for Fallback…
xkrogen Feb 23, 2022
ceb32c9
[SPARK-38272][K8S][TESTS] Use `docker-desktop` instead of `docker-for…
Yikun Feb 23, 2022
2534217
[SPARK-38260][BUILD][CORE] Remove `commons-net` dependency in `hadoop…
LuciferYang Feb 23, 2022
43e93b5
[SPARK-38241][K8S][TESTS] Close KubernetesClient in K8S integrations …
martin-g Feb 23, 2022
b46b74c
[SPARK-38297][PYTHON] Explicitly cast the return value at DataFrame.t…
HyukjinKwon Feb 23, 2022
fab4ceb
[SPARK-38240][SQL] Improve RuntimeReplaceable and add a guideline for…
cloud-fan Feb 23, 2022
b425156
[SPARK-38162][SQL] Optimize one row plan in normal and AQE Optimizer
ulysses-you Feb 23, 2022
bf22078
[SPARK-38235][SQL][TESTS] Add test util for testing grouped aggregate…
itholic Feb 23, 2022
e18a93d
[SPARK-38295][SQL][TESTS] Fix ArithmeticExpressionSuite under ANSI mode
anchovYu Feb 23, 2022
a2448a4
[SPARK-38304][SQL] Elt() should return null if index is null under AN…
gengliangwang Feb 23, 2022
8ad85f8
[SPARK-38299][SQL] Clean up deprecated usage of `StringBuilder.newBui…
LuciferYang Feb 23, 2022
2fe5b04
[SPARK-38301][BUILD] Remove unused scala-actors dependency
leesf Feb 23, 2022
0bc16c6
[SPARK-38287][BUILD][SQL][TESTS] Upgrade `h2` from 2.0.204 to 2.1.210…
bjornjorgensen Feb 23, 2022
9257224
[SPARK-38281][SQL][TESTS] Fix AnalysisSuite under ANSI mode
anchovYu Feb 24, 2022
b28241d
[SPARK-38307][SQL][TESTS] Fix ExpressionTypeCheckingSuite and Collect…
anchovYu Feb 24, 2022
683bc46
[SPARK-38286][SQL] Union's maxRows and maxRowsPerPartition may overflow
zhengruifeng Feb 24, 2022
47c5b4c
[SPARK-38060][SQL][DOCS][FOLLOW-UP] Move migration guide note from CO…
HyukjinKwon Feb 24, 2022
fb543a7
[SPARK-38306][SQL] Fix ExplainSuite,StatisticsCollectionSuite and Str…
gengliangwang Feb 24, 2022
4357643
[SPARK-37923][SQL][FOLLOWUP] Rename MultipleBucketTransformsError in …
LuciferYang Feb 24, 2022
c4b013f
[SPARK-38229][FOLLOWUP][SQL] Clean up unnecessary code for code simpl…
yikf Feb 24, 2022
5190048
[SPARK-38300][SQL] Use `ByteStreams.toByteArray` to simplify `fileToS…
LuciferYang Feb 24, 2022
43c89dc
[SPARK-38273][SQL] `decodeUnsafeRows`'s iterators should close underl…
kevins-29 Feb 24, 2022
e58872d
[SPARK-38191][CORE] The staging directory of write job only needs to …
weixiuli Feb 25, 2022
9758d55
[SPARK-38303][BUILD] Upgrade `ansi-regex` from 5.0.0 to 5.0.1 in /dev
bjornjorgensen Feb 25, 2022
b8b1fbc
[SPARK-38275][SS] Include the writeBatch's memory usage as the total …
Myasuka Feb 25, 2022
860f44f
[SPARK-38311][SQL] Fix DynamicPartitionPruning/BucketedReadSuite/Expr…
gengliangwang Feb 25, 2022
6a79539
[SPARK-38298][SQL][TESTS] Fix DataExpressionSuite, NullExpressionsSui…
anchovYu Feb 25, 2022
95f06f3
[SPARK-37614][SQL] Support ANSI Aggregate Function: regr_avgx & regr_…
beliefer Feb 25, 2022
2dc0527
[SPARK-38322][SQL] Support query stage show runtime statistics in for…
ulysses-you Feb 25, 2022
e56f865
[SPARK-38316][SQL][TESTS] Fix SQLViewSuite/TriggerAvailableNowSuite/U…
gengliangwang Feb 25, 2022
29eca8c
[SPARK-38325][SQL] ANSI mode: avoid potential runtime error in HashJo…
gengliangwang Feb 25, 2022
64e1f28
[SPARK-38305][CORE] Explicitly check if source exists in unpack() bef…
srowen Feb 25, 2022
daa5f9d
[MINOR][DOCS] Fix missing field in query
Feb 25, 2022
b204710
[MINOR] Add git ignores for vscode and metals
Kimahriman Feb 25, 2022
dc153f5
[SPARK-38237][SQL][SS] Allow `ClusteredDistribution` to require full …
c21 Feb 25, 2022
9eab255
[SPARK-38242][CORE] Sort the SparkSubmit debug output
martin-g Feb 26, 2022
3aa0cd4
[SPARK-38302][K8S][TESTS] Use `Java 17` in K8S IT in case of `spark-t…
dcoliversun Feb 26, 2022
89464bf
[SPARK-36488][SQL][FOLLOWUP] Simplify the implementation of ResolveRe…
LuciferYang Feb 27, 2022
cfd66cf
[MINOR][PYTHON] Remove unnecessary quotes in pyspark
dchvn Feb 28, 2022
588064f
[SPARK-38339][BUILD] Upgrade `RoaringBitmap` to 0.9.25
LuciferYang Feb 28, 2022
0c74bff
[SPARK-38338][BUILD][CORE] Remove test dependency on `hamcrest`
LuciferYang Feb 28, 2022
309c65a
[SPARK-38337][CORE][SQL][DSTREAM][MLLIB] Replace `toIterator` with `i…
LuciferYang Feb 28, 2022
244716f
[SPARK-38321][SQL][TESTS] Fix BooleanSimplificationSuite under ANSI
anchovYu Feb 28, 2022
c7e363f
[SPARK-38244][K8S][BUILD] Upgrade kubernetes-client to 5.12.1
Yikun Feb 28, 2022
89799b8
[SPARK-38042][SQL] Ensure that ScalaReflection.dataTypeFor works on a…
jtnystrom Feb 28, 2022
50520fe
[SPARK-38314][SQL] Fix of failing to read parquet files after writing…
Yaohua628 Feb 28, 2022
744a223
[SPARK-38347][SQL] Fix nullability propagation in transformUpWithNewO…
sigmod Feb 28, 2022
2f5cfb0
[SPARK-38180][SQL] Allow safe up-cast expressions in correlated equal…
allisonwang-db Feb 28, 2022
07a6f0b
[SPARK-38343][SQL][TESTS] Fix SQLQuerySuite under ANSI mode
gengliangwang Feb 28, 2022
6df10ce
[SPARK-38332][SQL] Add the `DATEADD()` and `DATE_ADD()` aliases for `…
MaxGekk Feb 28, 2022
6aa83e7
[SPARK-38033][SS] The SS processing cannot be started because the com…
Feb 28, 2022
02aa6a0
[SPARK-38352][SQL] Fix DataFrameAggregateSuite/DataFrameSetOperations…
gengliangwang Mar 1, 2022
969d672
[SPARK-37688][CORE] ExecutorMonitor should ignore SparkListenerBlockU…
sleep1661 Mar 1, 2022
9336db7
Revert "[SPARK-38191][CORE] The staging directory of write job only n…
srowen Mar 1, 2022
1d068ce
[SPARK-38318][SQL] Skip view cyclic reference check if view is stored…
linhongliu-db Mar 1, 2022
ad4e5a6
[SPARK-38323][SQL][STREAMING] Support the hidden file metadata in Str…
Yaohua628 Mar 1, 2022
2da0d07
[SPARK-37582][SPARK-37583][SQL] CONTAINS, STARTSWITH, ENDSWITH should…
AngersZhuuuu Mar 1, 2022
1b95cfe
[SPARK-38348][BUILD] Upgrade `tink` to 1.6.1
LuciferYang Mar 1, 2022
615c5d8
[MINOR] Clean up an unnecessary variable
weixiuli Mar 1, 2022
6cd5803
[SPARK-38284][SQL] Add the `TIMESTAMPDIFF()` function
MaxGekk Mar 1, 2022
a633f77
[SPARK-37932][SQL] Wait to resolve missing attributes before applying…
chenzhx Mar 1, 2022
5c23c76
[SPARK-38358][DOC] Add migration guide for `spark.sql.hive.convertMet…
AngersZhuuuu Mar 1, 2022
ccb8af6
[SPARK-38188][K8S] Support `spark.kubernetes.job.queue`
Yikun Mar 1, 2022
42f118a
[SPARK-33206][CORE] Fix shuffle index cache weight calculation for sm…
attilapiros Mar 1, 2022
e81333c
[SPARK-37593][CORE] Reduce default page size by `LONG_ARRAY_OFFSET` i…
WangGuangxin Mar 1, 2022
c7b0dd2
[SPARK-38362][BUILD] Move eclipse.m2e Maven plugin config in its own …
martin-g Mar 2, 2022
96bcb04
[SPARK-38344][SHUFFLE] Avoid to submit task when there are no request…
weixiuli Mar 2, 2022
80f25ad
[SPARK-38363][SQL] Avoid runtime error in Dataset.summary()/Dataset.d…
gengliangwang Mar 2, 2022
5664403
[SPARK-38094][SQL][FOLLOWUP] Fix exception message and add a test case
jackierwzhang Mar 2, 2022
f14f6d6
[SPARK-38357][SQL][TESTS] Add test coverage for file source with OR(d…
huaxingao Mar 2, 2022
3ab18cc
[SPARK-38383][K8S] Support `APP_ID` and `EXECUTOR_ID` placeholder in …
dongjoon-hyun Mar 2, 2022
42db298
Revert "[SPARK-37090][BUILD] Upgrade `libthrift` to 0.16.0 to avoid s…
dongjoon-hyun Mar 2, 2022
b141c15
[SPARK-38342][CORE] Clean up deprecated api usage of Ivy
LuciferYang Mar 2, 2022
f960328
[SPARK-38389][SQL] Add the `DATEDIFF()` and `DATE_DIFF()` aliases for…
MaxGekk Mar 2, 2022
4d4c044
[SPARK-38392][K8S][TESTS] Add `spark-` prefix to namespaces and `-dri…
martin-g Mar 2, 2022
ad5427e
[SPARK-36553][ML] KMeans avoid compute auxiliary statistics for large K
zhengruifeng Mar 2, 2022
829d7fb
[MINOR][SQL][DOCS] Add more examples to sql-ref-syntax-ddl-create-tab…
wangyum Mar 2, 2022
226bdec
[SPARK-38269][CORE][SQL][SS][ML][MLLIB][MESOS][YARN][K8S][EXAMPLES] C…
LuciferYang Mar 2, 2022
23db9b4
[SPARK-38191][CORE][FOLLOWUP] The staging directory of write job only…
weixiuli Mar 3, 2022
86e0903
[SPARK-38398][K8S][TESTS] Add `priorityClassName` integration test case
dongjoon-hyun Mar 3, 2022
dfff8d8
[SPARK-38353][PYTHON] Instrument __enter__ and __exit__ magic methods…
heyihong Mar 3, 2022
b71d6d0
[SPARK-38378][SQL] Refactoring of the ANTLR grammar definition into s…
zhenlineo Mar 3, 2022
b81d90b
[SPARK-38312][CORE] Use error class in GraphiteSink
bozhang2820 Mar 3, 2022
34618a7
[SPARK-38351][TESTS] Don't use deprecate symbol API in test classes
martin-g Mar 3, 2022
5039c0f
[SPARK-38345][SQL] Introduce SQL function ARRAY_SIZE
xinrong-meng Mar 4, 2022
83d8000
[SPARK-38196][SQL] Refactor framework so as JDBC dialect could compil…
beliefer Mar 4, 2022
ae9b804
[SPARK-38417][CORE] Remove `Experimental` from `RDD.cleanShuffleDepen…
dongjoon-hyun Mar 5, 2022
980d88d
[SPARK-38418][PYSPARK] Add PySpark `cleanShuffleDependencies` develop…
dongjoon-hyun Mar 5, 2022
727f044
[SPARK-38189][K8S][DOC] Add `Priority scheduling` doc for Spark on K8S
Yikun Mar 5, 2022
97716f7
[SPARK-38393][SQL] Clean up deprecated usage of `GenSeq/GenMap`
LuciferYang Mar 5, 2022
18219d4
[SPARK-37400][SPARK-37426][PYTHON][MLLIB] Inline type hints for pyspa…
zero323 Mar 6, 2022
69bc9d1
[SPARK-38239][PYTHON][MLLIB] Fix pyspark.mllib.LogisticRegressionMode…
zero323 Mar 6, 2022
135841f
[SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingIntern…
pan3793 Mar 6, 2022
b651617
[SPARK-38416][PYTHON][TESTS] Change day to month
bjornjorgensen Mar 7, 2022
3175d83
[SPARK-38394][BUILD] Upgrade `scala-maven-plugin` to 4.4.0 for Hadoop…
steveloughran Mar 7, 2022
b99f58a
[SPARK-38267][CORE][SQL][SS] Replace pattern matches on boolean expre…
LuciferYang Mar 7, 2022
d83ab94
[SPARK-38419][BUILD] Replace tabs that exist in the script with spaces
jackylee-ch Mar 7, 2022
fc6b5e5
[SPARK-38188][K8S][TESTS][FOLLOWUP] Cleanup resources in `afterEach`
Yikun Mar 7, 2022
3bbc43d
[SPARK-38430][K8S][DOCS] Add `SBT` commands to K8s IT README
williamhyun Mar 7, 2022
f36d1bf
[SPARK-38423][K8S] Reuse driver pod's `priorityClassName` for `PodGroup`
Yikun Mar 7, 2022
4883a80
[SPARK-38382][DOC] Fix incorrect version infomation of migration guid…
AngersZhuuuu Mar 7, 2022
e21cb62
[SPARK-38335][SQL] Implement parser support for DEFAULT column values
dtenedor Mar 7, 2022
c1e5e8a
[SPARK-38407][SQL] ANSI Cast: loosen the limitation of casting non-nu…
gengliangwang Mar 7, 2022
1b31b7c
[SPARK-38434][SQL] Correct semantic of CheckAnalysis.getDataTypesAreC…
ivoson Mar 7, 2022
ed3a61d
[SPARK-38394][BUILD][FOLLOWUP] Update comments about `scala-maven-plu…
steveloughran Mar 7, 2022
60d3de1
[SPARK-38104][SQL] Migrate parsing errors of window into the new erro…
yutoacts Mar 7, 2022
ddc1803
[SPARK-38414][CORE][DSTREAM][EXAMPLES][ML][MLLIB][SQL] Remove redunda…
LuciferYang Mar 7, 2022
6c486d2
[SPARK-38436][PYTHON][TESTS] Fix `test_ceil` to test `ceil`
bjornjorgensen Mar 7, 2022
71991f7
[SPARK-38285][SQL] Avoid generator pruning for invalid extractor
viirya Mar 7, 2022
a13b478
[SPARK-38183][PYTHON][FOLLOWUP] Check the ANSI conf properly when cre…
itholic Mar 8, 2022
14cda58
[SPARK-38385][SQL] Improve error messages of 'mismatched input' cases…
anchovYu Mar 8, 2022
e80d979
[SPARK-37895][SQL] Filter push down column with quoted columns
planga82 Mar 8, 2022
e5ba617
[SPARK-38361][SQL] Add factory method `getConnection` into `JDBCDialect`
beliefer Mar 8, 2022
4df8512
[SPARK-37283][SQL][FOLLOWUP] Avoid trying to store a table which cont…
sarutak Mar 8, 2022
9e1d00c
[SPARK-38406][SQL] Improve perfermance of ShufflePartitionsUtil creat…
ulysses-you Mar 8, 2022
cd32c22
[SPARK-38240][SQL][FOLLOW-UP] Make RuntimeReplaceableAggregate as an …
HyukjinKwon Mar 8, 2022
9854456
[SPARK-35956][K8S][FOLLOWP] Fix typos in config names
dongjoon-hyun Mar 8, 2022
13021ed
[SPARK-38442][SQL] Fix ConstantFoldingSuite/ColumnExpressionSuite/Dat…
gengliangwang Mar 8, 2022
8a0b101
[SPARK-38112][SQL] Use error classes in the execution errors of date/…
ivoson Mar 8, 2022
8b08f19
[SPARK-37753][SQL] Fine tune logic to demote Broadcast hash join in D…
ekoifman Mar 8, 2022
b5589a9
[SPARK-38423][K8S][FOLLOWUP] PodGroup spec should not be null
dongjoon-hyun Mar 8, 2022
0ad7677
[SPARK-38309][CORE] Fix SHS `shuffleTotalReads` and `shuffleTotalBloc…
robreeves Mar 8, 2022
8fabd5e
[SPARK-38428][SHUFFLE] Check the FetchShuffleBlocks message only once…
weixiuli Mar 8, 2022
049d6d1
[SPARK-38443][SS][DOC] Document config STREAMING_SESSION_WINDOW_MERGE…
viirya Mar 9, 2022
59ce0a7
[SPARK-37865][SQL] Fix union deduplication correctness bug
karenfeng Mar 9, 2022
43c7824
[SPARK-38412][SS] Fix the swapped sequence of from and to in StateSch…
HeartSaVioR Mar 9, 2022
f2058eb
[SPARK-38450][SQL] Fix HiveQuerySuite//PushFoldableIntoBranchesSuite/…
gengliangwang Mar 9, 2022
35c0e5c
[MINOR][PYTHON] Fix `MultilayerPerceptronClassifierTest.test_raw_and_…
harupy Mar 9, 2022
4da04fc
[SPARK-37600][BUILD] Upgrade to Hadoop 3.3.2
sunchao Mar 9, 2022
b8c03ee
[SPARK-38455][SPARK-38187][K8S] Support driver/executor `PodGroup` te…
dongjoon-hyun Mar 9, 2022
587ec34
[SPARK-38449][SQL] Avoid call createTable when ignoreIfExists=true an…
AngersZhuuuu Mar 9, 2022
66ff4b6
[SPARK-38452][K8S][TESTS] Support pyDockerfile and rDockerfile in SBT…
Yikun Mar 9, 2022
52e7602
[SPARK-38458][SQL] Fix always false condition in `LogDivertAppender#i…
LuciferYang Mar 9, 2022
bd6a3b4
[SPARK-38437][SQL] Lenient serialization of datetime from datasource
MaxGekk Mar 9, 2022
62e4c29
[SPARK-37421][PYTHON] Inline type hints for python/pyspark/mllib/eval…
dchvn Mar 9, 2022
93a25a4
[SPARK-37947][SQL] Extract generator from GeneratorOuter expression c…
bersprockets Mar 9, 2022
1584366
[SPARK-38354][SQL] Add hash probes metric for shuffled hash join
c21 Mar 9, 2022
effef84
[SPARK-36681][CORE][TEST] Enable SnappyCodec test in FileSuite
viirya Mar 9, 2022
97df016
[SPARK-38480][K8S] Remove `spark.kubernetes.job.queue` in favor of `s…
dongjoon-hyun Mar 9, 2022
01014aa
[SPARK-38486][K8S][TESTS] Upgrade the minimum Minikube version to 1.18.0
dongjoon-hyun Mar 10, 2022
0f4c26a
[SPARK-38387][PYTHON] Support `na_action` and Series input correspond…
xinrong-meng Mar 10, 2022
bd08e79
[SPARK-38355][PYTHON][TESTS] Use `mkstemp` instead of `mktemp`
bjornjorgensen Mar 10, 2022
ecabfb1
[SPARK-38187][K8S][TESTS] Add K8S IT for `volcano` minResources cpu/m…
Yikun Mar 10, 2022
82b6194
[SPARK-38385][SQL] Improve error messages of empty statement and <EOF…
anchovYu Mar 10, 2022
f286416
[SPARK-38379][K8S] Fix Kubernetes Client mode when mounting persisten…
tgravescs Mar 10, 2022
ec544ad
[SPARK-38148][SQL] Do not add dynamic partition pruning if there exis…
ulysses-you Mar 10, 2022
e5a86a3
[SPARK-38453][K8S][DOCS] Add `volcano` section to K8s IT `README.md`
Yikun Mar 10, 2022
c483e29
[SPARK-38487][PYTHON][DOC] Fix docstrings of nlargest/nsmallest of Da…
xinrong-meng Mar 10, 2022
3ab2455
[SPARK-38499][BUILD] Upgrade Jackson to 2.13.2
dongjoon-hyun Mar 10, 2022
bcf7849
[SPARK-38489][SQL] Aggregate.groupOnly support foldable expressions
wangyum Mar 10, 2022
538c81b
[SPARK-38481][SQL] Substitute Java overflow exception from `TIMESTAMP…
MaxGekk Mar 10, 2022
5cbd9b4
[SPARK-38500][INFRA] Add ASF License header to all Service Provider c…
yaooqinn Mar 10, 2022
216b972
[SPARK-38360][SQL][SS][PYTHON] Introduce a `exists` function for `Tre…
LuciferYang Mar 10, 2022
0a4a12d
[SPARK-38490][SQL][INFRA] Add Github action test job for ANSI SQL mode
gengliangwang Mar 10, 2022
a26c01d
[SPARK-38451][R][TESTS] Fix `make_date` test case to pass with ANSI mode
HyukjinKwon Mar 10, 2022
024d03e
[SPARK-38501][SQL] Fix thriftserver test failures under ANSI mode
gengliangwang Mar 10, 2022
f852100
[SPARK-38513][K8S] Move custom scheduler-specific configs to under `s…
dongjoon-hyun Mar 10, 2022
2239e9d
[MINOR][DOCS] Fix minor typos at nulls_option in Window Functions
bfallik Mar 11, 2022
54abb85
[SPARK-38517][INFRA] Fix PySpark documentation generation (missing ip…
HyukjinKwon Mar 11, 2022
aec70e8
[SPARK-38511][K8S] Remove `priorityClassName` propagation in favor of…
dongjoon-hyun Mar 11, 2022
2e3ac4f
[SPARK-38509][SQL] Unregister the `TIMESTAMPADD/DIFF` functions and r…
MaxGekk Mar 11, 2022
34e3029
[SPARK-38107][SQL] Use error classes in the compilation errors of pyt…
itholic Mar 11, 2022
36023c2
[SPARK-38491][PYTHON] Support `ignore_index` of `Series.sort_values`
xinrong-meng Mar 11, 2022
b1d8f35
[SPARK-38518][PYTHON] Implement `skipna` of `Series.all/Index.all` to…
xinrong-meng Mar 11, 2022
fd5896b
[SPARK-38527][K8S][DOCS] Set the minimum Volcano version
dongjoon-hyun Mar 11, 2022
60334d7
[SPARK-38516][BUILD] Add log4j-core and log4j-api to classpath if act…
wangyum Mar 12, 2022
c91c2e9
[SPARK-38526][SQL] Fix misleading function alias name for RuntimeRepl…
cloud-fan Mar 12, 2022
a511ca1
[SPARK-38534][SQL][TESTS] Disable `to_timestamp('366', 'DD')` test case
dongjoon-hyun Mar 12, 2022
c032928
[SPARK-37430][PYTHON][MLLIB] Inline hints for pyspark.mllib.linalg.di…
hi-zir Mar 12, 2022
6becf4e
[SPARK-38538][K8S][TESTS] Fix driver environment verification in Basi…
dongjoon-hyun Mar 13, 2022
96e5446
[SPARK-36058][K8S][TESTS][FOLLOWUP] Fix error message to include exce…
dongjoon-hyun Mar 13, 2022
6b64e5d
[SPARK-38320][SS] Fix flatMapGroupsWithState timeout in batch with da…
alex-balikov Mar 13, 2022
786a70e
[SPARK-38537][K8S] Unify `Statefulset*` to `StatefulSet*`
dongjoon-hyun Mar 13, 2022
9bede26
[MINOR][K8S][TESTS] Remove `verifyPriority` from `VolcanoFeatureStepS…
williamhyun Mar 13, 2022
0840b23
[SPARK-38540][BUILD] Upgrade `compress-lzf` from 1.0.3 to 1.1
LuciferYang Mar 13, 2022
83673c8
[SPARK-38528][SQL] Eagerly iterate over aggregate sequence when build…
bersprockets Mar 14, 2022
715a06c
[SPARK-38532][SS][TESTS] Add test case for invalid gapDuration of ses…
nyingping Mar 14, 2022
5699095
[SPARK-38519][SQL] AQE throw exception should respect SparkFatalExcep…
ulysses-you Mar 14, 2022
efe4330
[SPARK-38410][SQL] Support specify initial partition number for rebal…
ulysses-you Mar 14, 2022
9596942
[SPARK-38523][SQL] Fix referring to the corrupt record column from CSV
MaxGekk Mar 14, 2022
35536a1
[SPARK-38103][SQL] Migrate parsing errors of transform into the new e…
Mar 14, 2022
8e44791
[SPARK-38504][SQL] Cannot read TimestampNTZ as TimestampLTZ
beliefer Mar 14, 2022
2844a18
[SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst…
LuciferYang Mar 14, 2022
130bcce
[SPARK-38415][SQL] Update the histogram_numeric (x, y) result type to…
dtenedor Mar 14, 2022
a342214
[SPARK-38535][SQL] Add the `datetimeUnit` enum and use it in `TIMESTA…
MaxGekk Mar 14, 2022
5bb001b
[SPARK-36967][FOLLOWUP][CORE] Report accurate shuffle block size if i…
wankunde Mar 14, 2022
0005b41
[SPARK-38400][PYTHON] Enable Series.rename to change index labels
xinrong-meng Mar 14, 2022
c16a66a
[SPARK-36194][SQL] Add a logical plan visitor to propagate the distin…
wangyum Mar 14, 2022
f6c4634
[SPARK-37491][PYTHON] Fix Series.asof for unsorted values
pralabhkumar Mar 14, 2022
a30575e
[SPARK-38544][BUILD] Upgrade log4j2 to 2.17.2
LuciferYang Mar 14, 2022
1d4e917
[SPARK-38521][SQL] Change `partitionOverwriteMode` from string to var…
jackylee-ch Mar 15, 2022
8b5ec77
[SPARK-38549][SS] Add `numRowsDroppedByWatermark` to `SessionWindowSt…
viirya Mar 15, 2022
f17f078
[SPARK-38513][K8S][FOLLWUP] Cleanup executor-podgroup-template.yml
Yikun Mar 15, 2022
58c21e5
[SPARK-38527][K8S][DOCS][FOLLOWUP] Use v1.5.0 tag instead of release-1.5
dongjoon-hyun Mar 15, 2022
2a63fea
Revert "[SPARK-38544][BUILD] Upgrade log4j2 to 2.17.2"
wangyum Mar 15, 2022
c00942d
[SPARK-38524][SPARK-38553][K8S] Bump `Volcano` to v1.5.1 and fix Volc…
Yikun Mar 15, 2022
21db916
[SPARK-38484][PYTHON] Move usage logging instrumentation util functio…
heyihong Mar 15, 2022
4e31000
[SPARK-38204][SS] Use StatefulOpClusteredDistribution for stateful op…
HeartSaVioR Mar 15, 2022
f84018a
[SPARK-38424][PYTHON] Warn unused casts and ignores
zero323 Mar 16, 2022
1acadf3
[SPARK-38558][SQL] Remove unnecessary casts between IntegerType and I…
cashmand Mar 16, 2022
8476c8b
[SPARK-38542][SQL] UnsafeHashedRelation should serialize numKeys out
mcdull-zhang Mar 16, 2022
8193b40
[SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4
HyukjinKwon Mar 16, 2022
1b41416
[SPARK-38106][SQL] Use error classes in the parsing errors of functions
ivoson Mar 16, 2022
71e2110
[SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable
Kimahriman Mar 16, 2022
32b0705
Try using a queue to keep track of exec request times.
holdenk Nov 1, 2021
b139440
ugh workflows
holdenk Jan 11, 2022
b01149c
Revert "ugh workflows"
holdenk Jan 11, 2022
603ba97
Fix the resource request decrease scenario and add a test for it.
holdenk Mar 16, 2022
Original file line number Diff line number Diff line change
@@ -15,26 +15,20 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

from typing import Dict, List

from pyspark.sql.types import Row, StructType

from numpy import ndarray

class _ImageSchema:
    def __init__(self) -> None: ...
    @property
    def imageSchema(self) -> StructType: ...
    @property
    def ocvTypes(self) -> Dict[str, int]: ...
    @property
    def columnSchema(self) -> StructType: ...
    @property
    def imageFields(self) -> List[str]: ...
    @property
    def undefinedImageType(self) -> str: ...
    def toNDArray(self, image: Row) -> ndarray: ...
    def toImage(self, array: ndarray, origin: str = ...) -> Row: ...

ImageSchema: _ImageSchema

name: ANSI SQL mode test

on:
  push:
    branches:
      - master

jobs:
  ansi_sql_test:
    uses: ./.github/workflows/build_and_test.yml
    if: github.repository == 'apache/spark'
    with:
      ansi_enabled: true
32 changes: 22 additions & 10 deletions .github/workflows/build_and_test.yml
@@ -37,6 +37,12 @@ on:
- cron: '0 13 * * *'
# Java 17
- cron: '0 16 * * *'
  workflow_call:
    inputs:
      ansi_enabled:
        required: false
        type: boolean
        default: false

jobs:
configure-jobs:
@@ -92,7 +98,7 @@ jobs:
echo '::set-output name=java::8'
echo '::set-output name=branch::master' # Default branch to run on. CHANGE here when a branch is cut out.
echo '::set-output name=type::regular'
echo '::set-output name=envs::{}'
echo '::set-output name=envs::{"SPARK_ANSI_SQL_MODE": "${{ inputs.ansi_enabled }}"}'
echo '::set-output name=hadoop::hadoop3'
fi

@@ -252,7 +258,7 @@ jobs:
- name: Install Python packages (Python 3.8)
if: (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
run: |
python3.8 -m pip install 'numpy>=1.20.0' 'pyarrow<5.0.0' pandas scipy xmlrunner
python3.8 -m pip install 'numpy>=1.20.0' pyarrow pandas scipy xmlrunner
python3.8 -m pip list
# Run the tests.
- name: Run tests
@@ -287,7 +293,7 @@ jobs:
name: "Build modules (${{ format('{0}, {1} job', needs.configure-jobs.outputs.branch, needs.configure-jobs.outputs.type) }}): ${{ matrix.modules }}"
runs-on: ubuntu-20.04
container:
image: dongjoon/apache-spark-github-action-image:20211228
image: dongjoon/apache-spark-github-action-image:20220207
strategy:
fail-fast: false
matrix:
@@ -311,6 +317,7 @@
SKIP_UNIDOC: true
SKIP_MIMA: true
METASPACE_SIZE: 1g
SPARK_ANSI_SQL_MODE: ${{ inputs.ansi_enabled }}
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
@@ -391,13 +398,14 @@ jobs:
name: "Build modules: sparkr"
runs-on: ubuntu-20.04
container:
image: dongjoon/apache-spark-github-action-image:20211228
image: dongjoon/apache-spark-github-action-image:20220207
env:
HADOOP_PROFILE: ${{ needs.configure-jobs.outputs.hadoop }}
HIVE_PROFILE: hive2.3
GITHUB_PREV_SHA: ${{ github.event.before }}
SPARK_LOCAL_IP: localhost
SKIP_MIMA: true
SPARK_ANSI_SQL_MODE: ${{ inputs.ansi_enabled }}
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
@@ -462,7 +470,7 @@ jobs:
PYSPARK_DRIVER_PYTHON: python3.9
PYSPARK_PYTHON: python3.9
container:
image: dongjoon/apache-spark-github-action-image:20211228
image: dongjoon/apache-spark-github-action-image:20220207
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
@@ -529,11 +537,14 @@ jobs:
# See also https://github.com/sphinx-doc/sphinx/issues/7551.
# Jinja2 3.0.0+ causes error when building with Sphinx.
# See also https://issues.apache.org/jira/browse/SPARK-35375.
python3.9 -m pip install 'sphinx<3.1.0' mkdocs pydata_sphinx_theme ipython nbsphinx numpydoc 'jinja2<3.0.0'
python3.9 -m pip install sphinx_plotly_directive 'numpy>=1.20.0' 'pyarrow<5.0.0' pandas 'plotly>=4.8'
# Pin the MarkupSafe to 2.0.1 to resolve the CI error.
# See also https://issues.apache.org/jira/browse/SPARK-38279.
python3.9 -m pip install 'sphinx<3.1.0' mkdocs pydata_sphinx_theme ipython nbsphinx numpydoc 'jinja2<3.0.0' 'markupsafe==2.0.1'
python3.9 -m pip install ipython_genutils # See SPARK-38517
python3.9 -m pip install sphinx_plotly_directive 'numpy>=1.20.0' pyarrow pandas 'plotly>=4.8'
apt-get update -y
apt-get install -y ruby ruby-dev
Rscript -e "install.packages(c('devtools', 'testthat', 'knitr', 'rmarkdown', 'roxygen2'), repos='https://cloud.r-project.org/')"
Rscript -e "install.packages(c('devtools', 'testthat', 'knitr', 'rmarkdown', 'markdown', 'e1071', 'roxygen2'), repos='https://cloud.r-project.org/')"
Rscript -e "devtools::install_version('pkgdown', version='2.0.1', repos='https://cloud.r-project.org')"
Rscript -e "devtools::install_version('preferably', version='0.4', repos='https://cloud.r-project.org')"
gem install bundler
@@ -614,7 +625,7 @@ jobs:
export MAVEN_CLI_OPTS="--no-transfer-progress"
export JAVA_VERSION=${{ matrix.java }}
# It uses Maven's 'install' intentionally, see https://github.com/apache/spark/pull/26414.
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Djava.version=${JAVA_VERSION/-ea} install
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Djava.version=${JAVA_VERSION/-ea} install
rm -rf ~/.m2/repository/org/apache/spark

scala-213:
@@ -660,7 +671,7 @@ jobs:
- name: Build with SBT
run: |
./dev/change-scala-version.sh 2.13
./build/sbt -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Pdocker-integration-tests -Pkubernetes-integration-tests -Pspark-ganglia-lgpl -Pscala-2.13 compile test:compile
./build/sbt -Pyarn -Pmesos -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Pdocker-integration-tests -Pkubernetes-integration-tests -Pspark-ganglia-lgpl -Pscala-2.13 compile test:compile

tpcds-1g:
needs: [configure-jobs, precondition]
@@ -669,6 +680,7 @@
runs-on: ubuntu-20.04
env:
SPARK_LOCAL_IP: localhost
SPARK_ANSI_SQL_MODE: ${{ inputs.ansi_enabled }}
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
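The hunks above thread the new `ansi_enabled` workflow input through to the jobs in two ways: directly as the `SPARK_ANSI_SQL_MODE` environment variable, and embedded in the JSON emitted by `::set-output name=envs::`. A small Python sketch of that JSON construction (assuming, as GitHub Actions expression interpolation does, that the boolean renders as the string "true" or "false" — the real workflow does this in shell, not Python):

```python
import json


def build_envs_output(ansi_enabled: bool) -> str:
    """Mimic the diff's `::set-output name=envs::{...}` line.

    Hypothetical helper; the workflow emits this string from a shell step.
    """
    # ${{ inputs.ansi_enabled }} interpolates as "true"/"false".
    rendered = "true" if ansi_enabled else "false"
    return "::set-output name=envs::" + json.dumps(
        {"SPARK_ANSI_SQL_MODE": rendered}
    )
```

Note that the value crossing the job boundary is a string, not a boolean, so downstream consumers must compare against `"true"` rather than truthiness.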
38 changes: 29 additions & 9 deletions .github/workflows/notify_test_workflow.yml
@@ -38,12 +38,19 @@ jobs:
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
const endpoint = 'GET /repos/:owner/:repo/commits/:ref/check-runs'
const endpoint = 'GET /repos/:owner/:repo/actions/workflows/:id/runs?&branch=:branch'
const check_run_endpoint = 'GET /repos/:owner/:repo/commits/:ref/check-runs'

// TODO: Should use pull_request.user and pull_request.user.repos_url?
// If a different person creates a commit to another forked repo,
// it wouldn't be able to detect.
const params = {
owner: context.payload.pull_request.head.repo.owner.login,
repo: context.payload.pull_request.head.repo.name,
id: 'build_and_test.yml',
branch: context.payload.pull_request.head.ref,
}
const check_run_params = {
owner: context.payload.pull_request.head.repo.owner.login,
repo: context.payload.pull_request.head.repo.name,
ref: context.payload.pull_request.head.ref,
@@ -67,7 +74,7 @@
const head_sha = context.payload.pull_request.head.sha
let status = 'queued'

if (!runs || runs.data.check_runs.filter(r => r.name === "Configure jobs").length === 0) {
if (!runs || runs.data.workflow_runs.length === 0) {
status = 'completed'
const conclusion = 'action_required'

@@ -99,16 +106,29 @@
}
})
} else {
const runID = runs.data.check_runs.filter(r => r.name === "Configure jobs")[0].id
const run_id = runs.data.workflow_runs[0].id

if (runs.data.check_runs[0].head_sha != context.payload.pull_request.head.sha) {
if (runs.data.workflow_runs[0].head_sha != context.payload.pull_request.head.sha) {
throw new Error('There was a new unsynced commit pushed. Please retrigger the workflow.');
}

const runUrl = 'https://github.com/'
// Here we get check run ID to provide Check run view instead of Actions view, see also SPARK-37879.
const check_runs = await github.request(check_run_endpoint, check_run_params)
const check_run_head = check_runs.data.check_runs.filter(r => r.name === "Configure jobs")[0]

if (check_run_head.head_sha != context.payload.pull_request.head.sha) {
throw new Error('There was a new unsynced commit pushed. Please retrigger the workflow.');
}

const check_run_url = 'https://github.com/'
+ context.payload.pull_request.head.repo.full_name
+ '/runs/'
+ runID
+ check_run_head.id

const actions_url = 'https://github.com/'
+ context.payload.pull_request.head.repo.full_name
+ '/actions/runs/'
+ run_id

github.checks.create({
owner: context.repo.owner,
@@ -118,13 +138,13 @@
status: status,
output: {
title: 'Test results',
summary: '[See test results](' + runUrl + ')',
summary: '[See test results](' + check_run_url + ')',
text: JSON.stringify({
owner: context.payload.pull_request.head.repo.owner.login,
repo: context.payload.pull_request.head.repo.name,
run_id: runID
run_id: run_id
})
},
details_url: runUrl,
details_url: actions_url,
})
}
5 changes: 5 additions & 0 deletions .gitignore
@@ -9,6 +9,8 @@
*~
.java-version
.DS_Store
.ammonite
.bloop
.bsp/
.cache
.classpath
@@ -21,10 +23,12 @@
# SPARK-35223: Add IssueNavigationLink to make IDEA support hyperlink on JIRA Ticket and GitHub PR on Git plugin.
!.idea/vcs.xml
.idea_modules/
.metals
.project
.pydevproject
.scala_dependencies
.settings
.vscode
/lib/
R-unit-tests.log
R/unit-tests.out
@@ -59,6 +63,7 @@ lint-r-report.log
lint-js-report.log
log/
logs/
metals.sbt
out/
project/boot/
project/build/target/
2 changes: 2 additions & 0 deletions LICENSE-binary
@@ -456,6 +456,7 @@ net.sf.py4j:py4j
org.jpmml:pmml-model
org.jpmml:pmml-schema
org.threeten:threeten-extra
org.jdom:jdom2

python/lib/py4j-*-src.zip
python/pyspark/cloudpickle.py
@@ -504,6 +505,7 @@ Common Development and Distribution License (CDDL) 1.0
javax.activation:activation http://www.oracle.com/technetwork/java/javase/tech/index-jsp-138795.html
javax.xml.stream:stax-api https://jcp.org/en/jsr/detail?id=173
javax.transaction:javax.transaction-api
javax.xml.bind:jaxb-api


Common Development and Distribution License (CDDL) 1.1
3 changes: 3 additions & 0 deletions NOTICE-binary
@@ -917,6 +917,9 @@ This product includes code (JaspellTernarySearchTrie) from Java Spelling Checkin
g Package (jaspell): http://jaspell.sourceforge.net/
License: The BSD License (http://www.opensource.org/licenses/bsd-license.php)

This product includes software developed by the JDOM Project (http://www.jdom.org/)
License: https://raw.githubusercontent.com/hunterhacker/jdom/master/LICENSE.txt

The snowball stemmers in
analysis/common/src/java/net/sf/snowball
were developed by Martin Porter and Richard Boulton.
2 changes: 1 addition & 1 deletion R/pkg/DESCRIPTION
@@ -60,7 +60,7 @@ Collate:
'types.R'
'utils.R'
'window.R'
RoxygenNote: 7.1.1
RoxygenNote: 7.1.2
VignetteBuilder: knitr
NeedsCompilation: no
Encoding: UTF-8
20 changes: 13 additions & 7 deletions R/pkg/tests/fulltests/test_sparkSQL.R
@@ -1690,9 +1690,9 @@ test_that("column functions", {

df <- as.DataFrame(list(list("col" = "1")))
c <- collect(select(df, schema_of_csv("Amsterdam,2018")))
expect_equal(c[[1]], "STRUCT<`_c0`: STRING, `_c1`: INT>")
expect_equal(c[[1]], "STRUCT<_c0: STRING, _c1: INT>")
c <- collect(select(df, schema_of_csv(lit("Amsterdam,2018"))))
expect_equal(c[[1]], "STRUCT<`_c0`: STRING, `_c1`: INT>")
expect_equal(c[[1]], "STRUCT<_c0: STRING, _c1: INT>")

# Test to_json(), from_json(), schema_of_json()
df <- sql("SELECT array(named_struct('name', 'Bob'), named_struct('name', 'Alice')) as people")
@@ -1725,9 +1725,9 @@ test_that("column functions", {

df <- as.DataFrame(list(list("col" = "1")))
c <- collect(select(df, schema_of_json('{"name":"Bob"}')))
expect_equal(c[[1]], "STRUCT<`name`: STRING>")
expect_equal(c[[1]], "STRUCT<name: STRING>")
c <- collect(select(df, schema_of_json(lit('{"name":"Bob"}'))))
expect_equal(c[[1]], "STRUCT<`name`: STRING>")
expect_equal(c[[1]], "STRUCT<name: STRING>")

# Test to_json() supports arrays of primitive types and arrays
df <- sql("SELECT array(19, 42, 70) as age")
@@ -2051,13 +2051,19 @@ test_that("date functions on a DataFrame", {
})

test_that("SPARK-37108: expose make_date expression in R", {
ansiEnabled <- sparkR.conf("spark.sql.ansi.enabled")[[1]] == "true"
df <- createDataFrame(
list(list(2021, 10, 22), list(2021, 13, 1),
list(2021, 2, 29), list(2020, 2, 29)),
c(
list(list(2021, 10, 22), list(2020, 2, 29)),
if (ansiEnabled) list() else list(list(2021, 13, 1), list(2021, 2, 29))
),
list("year", "month", "day")
)
expect <- createDataFrame(
list(list(as.Date("2021-10-22")), NA, NA, list(as.Date("2020-02-29"))),
c(
list(list(as.Date("2021-10-22")), list(as.Date("2020-02-29"))),
if (ansiEnabled) list() else list(NA, NA)
),
list("make_date(year, month, day)")
)
actual <- select(df, make_date(df$year, df$month, df$day))
2 changes: 1 addition & 1 deletion bin/pyspark
@@ -50,7 +50,7 @@ export PYSPARK_DRIVER_PYTHON_OPTS

# Add the PySpark classes to the Python path:
export PYTHONPATH="${SPARK_HOME}/python/:$PYTHONPATH"
export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.3-src.zip:$PYTHONPATH"
export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.4-src.zip:$PYTHONPATH"

# Load the PySpark shell.py script when ./pyspark is used interactively:
export OLD_PYTHONSTARTUP="$PYTHONSTARTUP"
2 changes: 1 addition & 1 deletion bin/pyspark2.cmd
@@ -30,7 +30,7 @@ if "x%PYSPARK_DRIVER_PYTHON%"=="x" (
)

set PYTHONPATH=%SPARK_HOME%\python;%PYTHONPATH%
set PYTHONPATH=%SPARK_HOME%\python\lib\py4j-0.10.9.3-src.zip;%PYTHONPATH%
set PYTHONPATH=%SPARK_HOME%\python\lib\py4j-0.10.9.4-src.zip;%PYTHONPATH%

set OLD_PYTHONSTARTUP=%PYTHONSTARTUP%
set PYTHONSTARTUP=%SPARK_HOME%\python\pyspark\shell.py
@@ -200,7 +200,7 @@ public int hashCode() {
public int compareTo(ComparableObjectArray other) {
int len = Math.min(array.length, other.array.length);
for (int i = 0; i < len; i++) {
int diff = ((Comparable<Object>) array[i]).compareTo((Comparable<Object>) other.array[i]);
int diff = ((Comparable<Object>) array[i]).compareTo(other.array[i]);
if (diff != 0) {
return diff;
}
@@ -329,13 +329,14 @@ private int countKeys(Class<?> type) throws Exception {
byte[] prefix = db.getTypeInfo(type).keyPrefix();
int count = 0;

DBIterator it = db.db().iterator();
it.seek(prefix);

while (it.hasNext()) {
byte[] key = it.next().getKey();
if (LevelDBIterator.startsWith(key, prefix)) {
count++;
try (DBIterator it = db.db().iterator()) {
it.seek(prefix);

while (it.hasNext()) {
byte[] key = it.next().getKey();
if (LevelDBIterator.startsWith(key, prefix)) {
count++;
}
}
}
