Describe the bug, including details regarding any error messages, version, and platform.
As seen here: #44981 (comment)
When I tried to run "pyspark.sql.tests.arrow.test_arrow_grouped_map" and "pyspark.sql.tests.arrow.test_arrow_cogrouped_map", they failed because pandas is not installed:
Traceback (most recent call last):
  File "/spark/python/pyspark/sql/tests/arrow/test_arrow_grouped_map.py", line 264, in test_self_join
    df2 = df.groupby("k").applyInArrow(arrow_func, schema="x long, y long")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/spark/python/pyspark/sql/pandas/group_ops.py", line 809, in applyInArrow
    udf = pandas_udf(
          ^^^^^^^^^^^
  File "/spark/python/pyspark/sql/pandas/functions.py", line 372, in pandas_udf
    require_minimum_pandas_version()
  File "/spark/python/pyspark/sql/pandas/utils.py", line 43, in require_minimum_pandas_version
    raise PySparkImportError(
pyspark.errors.exceptions.base.PySparkImportError: [PACKAGE_NOT_INSTALLED] Pandas >= 2.0.0 must be installed; however, it was not found.
These tests were never executed in the past, but it might be worth including them in the job since they are Arrow-related.
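To confirm the missing dependency before running the two modules, one option is to check whether pandas can be found in the environment. This is only a rough sketch: the helper name `missing_packages` and the package list are assumptions made for illustration, and the check covers presence only, not the `>= 2.0.0` version that the error message asks for.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list for illustration: pandas is the package named in
    # the error above, pyarrow is what the Arrow tests exercise.
    missing = missing_packages(["pandas", "pyarrow"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All packages found.")
```

If pandas shows up as not installed, the `PACKAGE_NOT_INSTALLED` error from `require_minimum_pandas_version()` is expected for these tests.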
Component(s)
Continuous Integration, Python