DEPR: fix stacklevel for DataFrame(mgr) deprecation #55591

Merged
2 changes: 0 additions & 2 deletions doc/source/user_guide/10min.rst
@@ -763,14 +763,12 @@ Parquet
Writing to a Parquet file:

.. ipython:: python
- :okwarning:

df.to_parquet("foo.parquet")

Reading from a Parquet file Store using :func:`read_parquet`:

.. ipython:: python
- :okwarning:

pd.read_parquet("foo.parquet")

11 changes: 0 additions & 11 deletions doc/source/user_guide/io.rst
@@ -2247,7 +2247,6 @@ For line-delimited json files, pandas can also return an iterator which reads in
Line-limited json can also be read using the pyarrow reader by specifying ``engine="pyarrow"``.

.. ipython:: python
- :okwarning:

from io import BytesIO
df = pd.read_json(BytesIO(jsonl.encode()), lines=True, engine="pyarrow")
@@ -5372,15 +5371,13 @@ See the documentation for `pyarrow <https://arrow.apache.org/docs/python/>`__ an
Write to a parquet file.

.. ipython:: python
- :okwarning:

df.to_parquet("example_pa.parquet", engine="pyarrow")
df.to_parquet("example_fp.parquet", engine="fastparquet")

Read from a parquet file.

.. ipython:: python
- :okwarning:

result = pd.read_parquet("example_fp.parquet", engine="fastparquet")
result = pd.read_parquet("example_pa.parquet", engine="pyarrow")
@@ -5390,7 +5387,6 @@ Read from a parquet file.
By setting the ``dtype_backend`` argument you can control the default dtypes used for the resulting DataFrame.

.. ipython:: python
- :okwarning:

result = pd.read_parquet("example_pa.parquet", engine="pyarrow", dtype_backend="pyarrow")

@@ -5404,7 +5400,6 @@ By setting the ``dtype_backend`` argument you can control the default dtypes use
Read only certain columns of a parquet file.

.. ipython:: python
- :okwarning:

result = pd.read_parquet(
"example_fp.parquet",
@@ -5433,7 +5428,6 @@ Serializing a ``DataFrame`` to parquet may include the implicit index as one or
more columns in the output file. Thus, this code:

.. ipython:: python
- :okwarning:

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df.to_parquet("test.parquet", engine="pyarrow")
@@ -5450,7 +5444,6 @@ If you want to omit a dataframe's indexes when writing, pass ``index=False`` to
:func:`~pandas.DataFrame.to_parquet`:

.. ipython:: python
- :okwarning:

df.to_parquet("test.parquet", index=False)

@@ -5473,7 +5466,6 @@ Partitioning Parquet files
Parquet supports partitioning of data based on the values of one or more columns.

.. ipython:: python
- :okwarning:

df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]})
df.to_parquet(path="test", engine="pyarrow", partition_cols=["a"], compression=None)
@@ -5539,14 +5531,12 @@ ORC format, :func:`~pandas.read_orc` and :func:`~pandas.DataFrame.to_orc`. This
Write to an orc file.

.. ipython:: python
- :okwarning:

df.to_orc("example_pa.orc", engine="pyarrow")

Read from an orc file.

.. ipython:: python
- :okwarning:

result = pd.read_orc("example_pa.orc")

@@ -5555,7 +5545,6 @@ Read from an orc file.
Read only certain columns of an orc file.

.. ipython:: python
- :okwarning:

result = pd.read_orc(
"example_pa.orc",
3 changes: 0 additions & 3 deletions doc/source/user_guide/pyarrow.rst
@@ -104,7 +104,6 @@ To convert a :external+pyarrow:py:class:`pyarrow.Table` to a :class:`DataFrame`,
:external+pyarrow:py:meth:`pyarrow.Table.to_pandas` method with ``types_mapper=pd.ArrowDtype``.

.. ipython:: python
- :okwarning:

table = pa.table([pa.array([1, 2, 3], type=pa.int64())], names=["a"])

@@ -165,7 +164,6 @@ functions provide an ``engine`` keyword that can dispatch to PyArrow to accelera
* :func:`read_feather`

.. ipython:: python
- :okwarning:

import io
data = io.StringIO("""a,b,c
@@ -180,7 +178,6 @@ PyArrow-backed data by specifying the parameter ``dtype_backend="pyarrow"``. A r
``engine="pyarrow"`` to necessarily return PyArrow-backed data.

.. ipython:: python
- :okwarning:

import io
data = io.StringIO("""a,b,c,d,e,f,g,h,i
3 changes: 0 additions & 3 deletions doc/source/user_guide/scale.rst
@@ -51,7 +51,6 @@ To load the columns we want, we have two options.
Option 1 loads in all the data and then filters to what we need.

.. ipython:: python
- :okwarning:

columns = ["id_0", "name_0", "x_0", "y_0"]

@@ -60,7 +59,6 @@ Option 1 loads in all the data and then filters to what we need.
Option 2 only loads the columns we request.

.. ipython:: python
- :okwarning:

pd.read_parquet("timeseries_wide.parquet", columns=columns)

@@ -202,7 +200,6 @@ counts up to this point. As long as each individual file fits in memory, this wi
work for arbitrary-sized datasets.

.. ipython:: python
- :okwarning:

%%time
files = pathlib.Path("data/timeseries/").glob("ts*.parquet")
1 change: 0 additions & 1 deletion doc/source/whatsnew/v2.0.0.rst
@@ -152,7 +152,6 @@ When this keyword is set to ``"pyarrow"``, then these functions will return pyar
* :meth:`Series.convert_dtypes`

.. ipython:: python
- :okwarning:

import io
data = io.StringIO("""a,b,c,d,e,f,g,h,i
2 changes: 1 addition & 1 deletion pandas/core/frame.py
@@ -697,7 +697,7 @@ def __init__(
"is deprecated and will raise in a future version. "
"Use public APIs instead.",
DeprecationWarning,
- stacklevel=find_stack_level(),
+ stacklevel=1,  # bump to 2 once pyarrow 15.0 is released with fix
)

if using_copy_on_write():
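For context, here is a minimal sketch of the deprecated pattern this warning targets; it is illustrative only and not part of the diff. `df._mgr` is a pandas-internal attribute, used solely to trigger the code path above.

import warnings

import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

# Deprecated: constructing a DataFrame from an internal BlockManager,
# the pattern pyarrow's to_pandas() relied on.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    pd.DataFrame(df._mgr)
assert caught[0].category is DeprecationWarning

# Supported: build the frame through public APIs instead.
pd.DataFrame({"a": [1, 2]})

The stacklevel change only affects which source line the warning is attributed to, not whether it fires.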
8 changes: 4 additions & 4 deletions pandas/core/series.py
@@ -407,7 +407,7 @@ def __init__(
"is deprecated and will raise in a future version. "
"Use public APIs instead.",
DeprecationWarning,
- stacklevel=find_stack_level(),
+ stacklevel=2,
)
if using_copy_on_write():
data = data.copy(deep=False)
@@ -446,7 +446,7 @@ def __init__(
"is deprecated and will raise in a future version. "
"Use public APIs instead.",
DeprecationWarning,
- stacklevel=find_stack_level(),
+ stacklevel=2,
)

if copy:
@@ -465,7 +465,7 @@ def __init__(
"is deprecated and will raise in a future version. "
"Use public APIs instead.",
DeprecationWarning,
- stacklevel=find_stack_level(),
+ stacklevel=2,
)

name = ibase.maybe_extract_name(name, data, type(self))
@@ -539,7 +539,7 @@ def __init__(
"is deprecated and will raise in a future version. "
"Use public APIs instead.",
DeprecationWarning,
- stacklevel=find_stack_level(),
+ stacklevel=2,
)
allow_mgr = True

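The hard-coded levels replace find_stack_level(), which walks up to the first frame outside pandas and can point at an unhelpful location when the constructor is invoked from pyarrow's extension code. As a refresher on the mechanics, a small sketch with hypothetical function names:

import warnings

def library_internal():
    # stacklevel=1 attributes the warning to this line;
    # stacklevel=2 attributes it to the caller, one frame up.
    warnings.warn("deprecated", DeprecationWarning, stacklevel=2)

def user_code():
    library_internal()  # with stacklevel=2, the warning points here

warnings.simplefilter("always")
user_code()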
22 changes: 10 additions & 12 deletions pandas/tests/arrays/interval/test_interval.py
@@ -309,6 +309,9 @@ def test_arrow_array_missing():
assert result.storage.equals(expected)


+ @pytest.mark.filterwarnings(
+     "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
+ )
@pytest.mark.parametrize(
"breaks",
[[0.0, 1.0, 2.0, 3.0], date_range("2017", periods=4, freq="D")],
@@ -325,29 +328,26 @@ def test_arrow_table_roundtrip(breaks):

table = pa.table(df)
assert isinstance(table.field("a").type, ArrowIntervalType)
msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.to_pandas()
result = table.to_pandas()
assert isinstance(result["a"].dtype, pd.IntervalDtype)
tm.assert_frame_equal(result, df)

table2 = pa.concat_tables([table, table])
msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table2.to_pandas()
result = table2.to_pandas()
expected = pd.concat([df, df], ignore_index=True)
tm.assert_frame_equal(result, expected)

# GH-41040
table = pa.table(
[pa.chunked_array([], type=table.column(0).type)], schema=table.schema
)
msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.to_pandas()
result = table.to_pandas()
tm.assert_frame_equal(result, expected[0:0])


+ @pytest.mark.filterwarnings(
+     "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
+ )
@pytest.mark.parametrize(
"breaks",
[[0.0, 1.0, 2.0, 3.0], date_range("2017", periods=4, freq="D")],
@@ -365,9 +365,7 @@ def test_arrow_table_roundtrip_without_metadata(breaks):
table = table.replace_schema_metadata()
assert table.schema.metadata is None

msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.to_pandas()
result = table.to_pandas()
assert isinstance(result["a"].dtype, pd.IntervalDtype)
tm.assert_frame_equal(result, df)

24 changes: 9 additions & 15 deletions pandas/tests/arrays/masked/test_arrow_compat.py
@@ -4,6 +4,10 @@
import pandas as pd
import pandas._testing as tm

+ pytestmark = pytest.mark.filterwarnings(
+     "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
+ )

pa = pytest.importorskip("pyarrow")

from pandas.core.arrays.arrow._arrow_utils import pyarrow_array_to_numpy_and_mask
@@ -36,9 +40,7 @@ def test_arrow_roundtrip(data):
table = pa.table(df)
assert table.field("a").type == str(data.dtype.numpy_dtype)

msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.to_pandas()
result = table.to_pandas()
Member Author, commenting on lines -39 to +43:

In several of those cases I removed the warning assertion and added a general filterwarnings mark (on the test or at the top of the file), because 1) there is no need to "test" that the warning is raised in all those cases, and 2) once the PR that fixes this on the pyarrow side is merged, all those cases would start failing anyway.

assert result["a"].dtype == data.dtype
tm.assert_frame_equal(result, df)

@@ -56,9 +58,7 @@ def types_mapper(arrow_type):
record_batch = pa.RecordBatch.from_arrays(
[bools_array, ints_array, small_ints_array], ["bools", "ints", "small_ints"]
)
msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = record_batch.to_pandas(types_mapper=types_mapper)
result = record_batch.to_pandas(types_mapper=types_mapper)
bools = pd.Series([True, None, False], dtype="boolean")
ints = pd.Series([1, None, 2], dtype="Int64")
small_ints = pd.Series([-1, 0, 7], dtype="Int64")
@@ -75,9 +75,7 @@ def test_arrow_load_from_zero_chunks(data):
table = pa.table(
[pa.chunked_array([], type=table.field("a").type)], schema=table.schema
)
msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.to_pandas()
result = table.to_pandas()
assert result["a"].dtype == data.dtype
tm.assert_frame_equal(result, df)

@@ -98,18 +96,14 @@ def test_arrow_sliced(data):

df = pd.DataFrame({"a": data})
table = pa.table(df)
msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.slice(2, None).to_pandas()
result = table.slice(2, None).to_pandas()
expected = df.iloc[2:].reset_index(drop=True)
tm.assert_frame_equal(result, expected)

# no missing values
df2 = df.fillna(data[0])
table = pa.table(df2)
msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.slice(2, None).to_pandas()
result = table.slice(2, None).to_pandas()
expected = df2.iloc[2:].reset_index(drop=True)
tm.assert_frame_equal(result, expected)

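The pytestmark assignment above and the decorators in test_interval.py are the two usual scopes for pytest warning filters; the string uses the warnings-filter syntax action:message-regex:category. A small hypothetical test module showing both forms:

import warnings

import pytest

# Module-level: the filter applies to every test in this file.
pytestmark = pytest.mark.filterwarnings(
    "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
)

# Per-test form, as used in test_interval.py.
@pytest.mark.filterwarnings(
    "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
)
def test_tolerates_the_warning_without_requiring_it():
    warnings.warn(
        "Passing a BlockManager to DataFrame is deprecated",
        DeprecationWarning,
    )

Because the mark only ignores the warning rather than asserting it, such tests keep passing once the pyarrow-side fix stops the warning from firing.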
21 changes: 9 additions & 12 deletions pandas/tests/arrays/period/test_arrow_compat.py
@@ -11,6 +11,11 @@
period_array,
)

+ pytestmark = pytest.mark.filterwarnings(
+     "ignore:Passing a BlockManager to DataFrame:DeprecationWarning"
+ )


pa = pytest.importorskip("pyarrow")


@@ -81,16 +86,12 @@ def test_arrow_table_roundtrip():

table = pa.table(df)
assert isinstance(table.field("a").type, ArrowPeriodType)
msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.to_pandas()
result = table.to_pandas()
assert isinstance(result["a"].dtype, PeriodDtype)
tm.assert_frame_equal(result, df)

table2 = pa.concat_tables([table, table])
msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table2.to_pandas()
result = table2.to_pandas()
expected = pd.concat([df, df], ignore_index=True)
tm.assert_frame_equal(result, expected)

@@ -109,9 +110,7 @@ def test_arrow_load_from_zero_chunks():
[pa.chunked_array([], type=table.column(0).type)], schema=table.schema
)

msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.to_pandas()
result = table.to_pandas()
assert isinstance(result["a"].dtype, PeriodDtype)
tm.assert_frame_equal(result, df)

@@ -126,8 +125,6 @@ def test_arrow_table_roundtrip_without_metadata():
table = table.replace_schema_metadata()
assert table.schema.metadata is None

msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
result = table.to_pandas()
result = table.to_pandas()
assert isinstance(result["a"].dtype, PeriodDtype)
tm.assert_frame_equal(result, df)
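For contrast, a self-contained sketch of the tm.assert_produces_warning pattern these diffs remove; the emitting function is hypothetical, standing in for table.to_pandas() in the original tests:

import warnings

import pandas._testing as tm

def emit():
    warnings.warn(
        "Passing a BlockManager to DataFrame is deprecated",
        DeprecationWarning,
        stacklevel=2,
    )

msg = "Passing a BlockManager to DataFrame is deprecated"
with tm.assert_produces_warning(DeprecationWarning, match=msg):
    emit()  # fails if no matching warning is raised

As the author notes above, this form requires the warning to fire, so each such block would start failing once pyarrow constructs DataFrames through public APIs; the filterwarnings marks tolerate both states.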