Add static type checking via Mypy #6381

Merged: 127 commits, Jan 27, 2021
Changes from 121 commits
Commits
267e44b
First clean mypy run
shwina Sep 10, 2020
a46ac7e
Add mypy to pre-commit hooks
shwina Sep 10, 2020
68e47e1
Merge branch 'branch-0.16' of https://github.com/rapidsai/cudf into mypy
shwina Sep 28, 2020
5cb9a5e
Add typing for _accessors attr
shwina Sep 29, 2020
e54944c
Testing mypy style check on CI
shwina Sep 30, 2020
d55e33d
Test changelog entry
shwina Sep 30, 2020
1b4e7dc
Add stub for Table
shwina Oct 1, 2020
f675884
Start adding type annotations to dtypes.py
shwina Oct 1, 2020
a896926
Merge remote-tracking branch 'upstream/branch-0.17' into mypy
shwina Oct 12, 2020
0d923a4
Add type annotations to `buffer.py`
shwina Oct 13, 2020
27f5431
Add Column type stubs
shwina Oct 14, 2020
b8cfffb
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into mypy
shwina Oct 14, 2020
cdce768
Remove trivial ctor
shwina Oct 15, 2020
5d85a47
Specialize data_array_view in StringColumn and CategoricalColumn
shwina Oct 15, 2020
5f9347a
Make build_column keyword only args to avoid confusion
shwina Oct 15, 2020
e14267b
Move data_array_view to CategoricalColumn
shwina Oct 15, 2020
d7049d4
Fix import issue
shwina Oct 15, 2020
99a4d91
Move clip specialization to CategoricalColumn
shwina Oct 15, 2020
df9712d
Add scalar to typing
shwina Oct 15, 2020
24a5e81
Type annotations for clip
shwina Oct 15, 2020
686ccff
Replace BufferOrNone with Buffer
shwina Oct 15, 2020
80fd10b
More Column annotations
shwina Oct 15, 2020
b741897
Column annotations
shwina Oct 15, 2020
f9da6ba
More column type annotations
shwina Oct 20, 2020
aa541d6
More column annotations
shwina Oct 20, 2020
faf7919
ColumnBase.setitem annotations + refactor
shwina Oct 20, 2020
7cb9adb
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into mypy
shwina Oct 21, 2020
9b02100
More Column annotations
shwina Oct 21, 2020
d09ecd0
Finish up annotating column.py
shwina Oct 21, 2020
9cb4f4d
Fix array interface handling
shwina Oct 23, 2020
9d75898
set_mask actually returns
shwina Oct 23, 2020
4ee6568
Adding NumericalColumn annotations
shwina Oct 23, 2020
45c1d01
Add type annotations to numerical.py
shwina Nov 2, 2020
5720c3b
datetime annotations
shwina Nov 4, 2020
f4ac673
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into mypy
shwina Nov 5, 2020
48a94d9
Add DecimalDtype.itemsize
shwina Nov 9, 2020
3264366
Add DecimalColumn
shwina Nov 9, 2020
e30f896
CategoricalColumn type hints
shwina Nov 10, 2020
5253560
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into mypy
shwina Nov 10, 2020
9a14676
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into r…
shwina Nov 12, 2020
7038138
Remove string/categorical accessor kwargs
shwina Nov 12, 2020
b76f88a
Changelog
shwina Nov 12, 2020
2f90afa
Remove itemsize
shwina Nov 12, 2020
20fad28
Add type annotations to methods.py
shwina Nov 12, 2020
b9b381f
Add Index.__getitem__
shwina Nov 12, 2020
9e99bd6
Merge branch 'rm-str-kwargs' into mypy
shwina Nov 12, 2020
04df2e3
Type annotate StringMethods
shwina Nov 12, 2020
c5be6fe
Annotate StringMethods arguments
shwina Nov 13, 2020
662762a
Removing some **kwargs from CategoricalColumn
shwina Nov 13, 2020
ff23a79
tmp commit
shwina Nov 13, 2020
d6144a9
More kwargs related fixes
shwina Nov 14, 2020
079e69d
Fix CI failures
shwina Nov 16, 2020
41f5794
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into r…
shwina Nov 16, 2020
265cccc
Merge branch 'rm-str-kwargs' into mypy
shwina Nov 16, 2020
db8f449
Remove DecimalColumn from this branch
shwina Nov 16, 2020
af96a99
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into r…
shwina Nov 16, 2020
b2f2ed9
as_numerical_column() still needs **kwargs
shwina Nov 16, 2020
b713315
Remove DecimalColumn from this branch
shwina Nov 17, 2020
cf3f2cd
Merge branch 'rm-str-kwargs' into mypy
shwina Nov 17, 2020
3c89a36
String column types
shwina Nov 17, 2020
3341b29
Add timedelta annotations
shwina Nov 19, 2020
e1b7652
Update NumericalColumn typing
shwina Nov 20, 2020
c920594
Remove **kwargs from as_numerical_column
shwina Nov 20, 2020
df3c3ac
Remove **kwargs from as_numerical_column()
shwina Nov 20, 2020
d06b247
Don't pass **kwargs to as_numerical_column
shwina Nov 23, 2020
de277eb
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into mypy
shwina Nov 24, 2020
fb8abcb
Merge branch 'rm-str-kwargs' into mypy
shwina Nov 24, 2020
4baf99d
Merge branch 'mypy' of github.com:shwina/cudf into mypy
shwina Nov 24, 2020
2334680
Merge branch 'branch-0.17' of https://github.com/rapidsai/cudf into r…
shwina Dec 3, 2020
32bf4a2
Pass expand through to return_or_inplace
shwina Dec 4, 2020
b71b203
Trailing comma
shwina Dec 4, 2020
1814af0
Merge branch 'rm-str-kwargs' into mypy
shwina Dec 7, 2020
14d2ae9
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into mypy
shwina Dec 7, 2020
37c8227
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into r…
shwina Dec 8, 2020
4e08bea
Changelog
shwina Dec 8, 2020
660eecb
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into mypy
shwina Dec 8, 2020
4778284
Merge branch 'rm-str-kwargs' into mypy
shwina Dec 8, 2020
0bcf1a3
Typing for ColumnAccessor
shwina Dec 9, 2020
6525b25
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into mypy
shwina Dec 10, 2020
07d3d74
Type mimic_inplace
shwina Dec 10, 2020
604113c
More typing stuff
shwina Dec 10, 2020
5494c29
CI fail if mypy fails
shwina Dec 10, 2020
9645945
Don't type check for metadata file
shwina Dec 10, 2020
4a4b4af
Merge branch 'branch-0.17' into branch-0.18
shwina Dec 11, 2020
223f2b5
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina Dec 15, 2020
89588db
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into mypy
shwina Dec 15, 2020
88fcd4b
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into mypy
shwina Dec 16, 2020
0713a74
Fix var name
shwina Dec 16, 2020
abd6ad2
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina Dec 17, 2020
18863b5
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into b…
shwina Jan 4, 2021
95755eb
Merge branch 'branch-0.18' into mypy
shwina Jan 5, 2021
43d8bcf
Fix call to fillna()
shwina Jan 5, 2021
03b394e
Fix again
shwina Jan 5, 2021
a07a447
Add typing_extensions to meta.yaml
shwina Jan 5, 2021
aea5183
Replace Buffer->Optional[Buffer]
shwina Jan 15, 2021
e699f82
Stray code
shwina Jan 15, 2021
be1aba9
_index is Optional
shwina Jan 15, 2021
7d7d0c9
Set ScalarObj->Any
shwina Jan 15, 2021
b687546
Replace ScalarObj -> ScalarLike
shwina Jan 15, 2021
9f0a902
Ignore type errors from _version.py
shwina Jan 15, 2021
105d308
Add type annotation to Buffer.empty
shwina Jan 15, 2021
5ec7cc0
Fix find_and_replace typing
shwina Jan 15, 2021
640cbcb
Type view()
shwina Jan 15, 2021
179fa74
Annotate pandas_categorical_as_column
shwina Jan 15, 2021
c6ebeb7
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into mypy
shwina Jan 22, 2021
0bc4d8a
Copyrights
shwina Jan 22, 2021
f5acafe
Replace assertion
shwina Jan 22, 2021
73ec159
Import needed even when not type checking
shwina Jan 22, 2021
d08418c
Type ColumnMethodsMixin attributes
shwina Jan 22, 2021
dcdd16b
Get rid of type: ignore in orc.py
shwina Jan 22, 2021
e40bf1e
Use ColumnLike as a type annotation, not a default value
shwina Jan 22, 2021
3451b6a
Add back type: ignore
shwina Jan 25, 2021
112d4c8
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into mypy
shwina Jan 25, 2021
83821f3
Fix default value
shwina Jan 25, 2021
a981ef4
Remove some typeignores
shwina Jan 25, 2021
3b13b70
More type ignores
shwina Jan 25, 2021
fa74989
More type ignores
shwina Jan 25, 2021
f7b6bb5
Changelog
shwina Jan 25, 2021
30048cc
parts -> part
shwina Jan 25, 2021
4ab8bb2
Sort by index before comparing in groupby serialize tests
shwina Jan 26, 2021
73385e1
More sort index
shwina Jan 26, 2021
556e0ff
Replace string annotations
shwina Jan 26, 2021
d3e0230
Typo
shwina Jan 26, 2021
032a9e6
Fix casts
shwina Jan 26, 2021
47eb4cf
Copyright
shwina Jan 26, 2021
af44d22
Merge branch 'branch-0.18' of https://github.com/rapidsai/cudf into mypy
shwina Jan 26, 2021
06aef48
Remove typing from tests
shwina Jan 26, 2021
9 changes: 9 additions & 0 deletions .pre-commit-config.yaml
@@ -32,6 +32,15 @@ repos:
language: system
files: \.(cu|cuh|h|hpp|cpp|inl)$
args: ['-fallback-style=none']
- repo: local
hooks:
- id: mypy
name: mypy
description: mypy
pass_filenames: false
entry: mypy --config-file=python/cudf/setup.cfg python/cudf/cudf
language: system
types: [python]

default_language_version:
python: python3
14 changes: 13 additions & 1 deletion ci/checks/style.sh
@@ -29,6 +29,10 @@ FLAKE_RETVAL=$?
FLAKE_CYTHON=`flake8 --config=python/.flake8.cython`
FLAKE_CYTHON_RETVAL=$?

# Run mypy and get results/return code
MYPY_CUDF=`mypy --config=python/cudf/setup.cfg python/cudf/cudf`
MYPY_CUDF_RETVAL=$?

# Run clang-format and check for a consistent code format
CLANG_FORMAT=`python cpp/scripts/run-clang-format.py 2>&1`
CLANG_FORMAT_RETVAL=$?
@@ -66,6 +70,14 @@ else
echo -e "\n\n>>>> PASSED: flake8-cython style check\n\n"
fi

if [ "$MYPY_CUDF_RETVAL" != "0" ]; then
echo -e "\n\n>>>> FAILED: mypy style check; begin output\n\n"
echo -e "$MYPY_CUDF"
echo -e "\n\n>>>> FAILED: mypy style check; end output\n\n"
else
echo -e "\n\n>>>> PASSED: mypy style check\n\n"
fi

if [ "$CLANG_FORMAT_RETVAL" != "0" ]; then
echo -e "\n\n>>>> FAILED: clang format check; begin output\n\n"
echo -e "$CLANG_FORMAT"
@@ -79,7 +91,7 @@ HEADER_META=`ci/checks/headers_test.sh`
HEADER_META_RETVAL=$?
echo -e "$HEADER_META"

-RETVALS=($ISORT_RETVAL $BLACK_RETVAL $FLAKE_RETVAL $FLAKE_CYTHON_RETVAL $CLANG_FORMAT_RETVAL $HEADER_META_RETVAL)
+RETVALS=($ISORT_RETVAL $BLACK_RETVAL $FLAKE_RETVAL $FLAKE_CYTHON_RETVAL $CLANG_FORMAT_RETVAL $HEADER_META_RETVAL $MYPY_CUDF_RETVAL)
IFS=$'\n'
RETVAL=`echo "${RETVALS[*]}" | sort -nr | head -n1`
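
The last lines of this hunk reduce the per-check return codes to a single exit status: the array is printed one value per line, sorted numerically in descending order, and the first entry (the maximum) is kept. A rough Python sketch of the same reduction follows. The helper name `overall_status` and the sample return codes are hypothetical and are not part of the PR.

```python
from typing import Dict


def overall_status(retvals: Dict[str, int]) -> int:
    """Return the largest return code, or 0 when no checks ran."""
    # Same idea as `sort -nr | head -n1` over the RETVALS array.
    return max(retvals.values(), default=0)


# Hypothetical results: every check passes except mypy.
codes = {"isort": 0, "black": 0, "flake8": 0, "mypy": 1}
status = overall_status(codes)
```

With `$MYPY_CUDF_RETVAL` included in the array, a failing type check on its own is enough to make the overall status nonzero.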

2 changes: 2 additions & 0 deletions conda/environments/cudf_dev_cuda10.1.yml
@@ -40,6 +40,8 @@ dependencies:
- flake8=3.8.3
- black=19.10
- isort=5.0.7
- mypy=0.782
- typing_extensions
- pre_commit
- dask>=2.22.0
- distributed>=2.22.0
2 changes: 2 additions & 0 deletions conda/environments/cudf_dev_cuda10.2.yml
@@ -40,6 +40,8 @@ dependencies:
- flake8=3.8.3
- black=19.10
- isort=5.0.7
- mypy=0.782
- typing_extensions
- pre_commit
- dask>=2.22.0
- distributed>=2.22.0
2 changes: 2 additions & 0 deletions conda/environments/cudf_dev_cuda11.0.yml
@@ -40,6 +40,8 @@ dependencies:
- flake8=3.8.3
- black=19.10
- isort=5.0.7
- mypy=0.782
- typing_extensions
- pre_commit
- dask>=2.22.0
- distributed>=2.22.0
1 change: 1 addition & 0 deletions conda/recipes/cudf/meta.yaml
@@ -34,6 +34,7 @@ requirements:
run:
- protobuf
- python
- typing_extensions
- pandas >=1.0,<1.2.0dev0
- cupy >7.1.0,<9.0.0a0
- numba >=0.49.0
4 changes: 4 additions & 0 deletions python/cudf/cudf/_lib/__init__.py
@@ -10,13 +10,16 @@
datetime,
filling,
gpuarrow,
groupby,
hash,
interop,
join,
json,
merge,
null_mask,
nvtext,
orc,
parquet,
partitioning,
quantiles,
reduce,
@@ -27,6 +30,7 @@
search,
sort,
stream_compaction,
string_casting,
strings,
table,
transpose,
123 changes: 123 additions & 0 deletions python/cudf/cudf/_lib/column.pyi
@@ -0,0 +1,123 @@
# Copyright (c) 2021, NVIDIA CORPORATION.

from typing import Tuple, Union, TypeVar, Optional

from cudf._typing import DtypeObj, Dtype, ScalarLike
from cudf.core.buffer import Buffer
from cudf.core.column import ColumnBase


T = TypeVar("T")

class Column:
_data: Optional[Buffer]
_mask: Optional[Buffer]
_base_data: Optional[Buffer]
_base_mask: Optional[Buffer]
_dtype: DtypeObj
_offset: int
_null_count: int
_children: Tuple["ColumnBase", ...]
_base_children: Tuple["ColumnBase", ...]

def __init__(
self,
data: Optional[Buffer],
dtype: Dtype,
size: int = None,
mask: Optional[Buffer] = None,
offset: int = None,
null_count: int = None,
children: Tuple["ColumnBase", ...] = (),
) -> None:
...

@property
def base_size(self) -> int:
...

@property
def dtype(self) -> DtypeObj:
...

@property
def size(self) -> int:
...

@property
def base_data(self) -> Optional[Buffer]:
...

@property
def base_data_ptr(self) -> int:
...

@property
def data(self) -> Optional[Buffer]:
...

@property
def data_ptr(self) -> int:
...

def set_base_data(self, value: Buffer) -> None:
...

@property
def nullable(self) -> bool:
...

@property
def has_nulls(self) -> bool:
...

@property
def base_mask(self) -> Optional[Buffer]:
...

@property
def base_mask_ptr(self) -> int:
...

@property
def mask(self) -> Optional[Buffer]:
...

@property
def mask_ptr(self) -> int:
...

def set_base_mask(self, value: Optional[Buffer]) -> None:
...

def set_mask(self: T, value: Optional[Buffer]) -> T:
...

@property
def null_count(self) -> int:
...

@property
def offset(self) -> int:
...

@property
def base_children(self) -> Tuple["ColumnBase", ...]:
...

@property
def children(self) -> Tuple["ColumnBase", ...]:
...

def set_base_children(self, value: Tuple["ColumnBase", ...]) -> None:
...

def _mimic_inplace(self, other_col: "ColumnBase", inplace=False) -> Optional["ColumnBase"]:
...

@staticmethod
def from_scalar(
val: ScalarLike,
size: int
) -> "ColumnBase": # TODO: This should be Scalar, not ScalarLike
...
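
In the stub above, `set_mask` annotates `self` with the type variable `T` and returns `T`, so a type checker can infer that calling it on a subclass yields that same subclass rather than the base class. The sketch below shows the pattern on toy classes. `ToyColumn` and `ToyNumericalColumn` are hypothetical stand-ins, not cudf classes, and the type variable is bound here for clarity whereas the stub leaves it unbound.

```python
from typing import Optional, TypeVar

# Bound TypeVar: stands for "whichever subclass of ToyColumn self is".
T = TypeVar("T", bound="ToyColumn")


class ToyColumn:
    def __init__(self, mask: Optional[bytes] = None) -> None:
        self.mask = mask

    def set_mask(self: T, value: Optional[bytes]) -> T:
        # Build a new instance of the caller's own class with the new mask.
        return type(self)(mask=value)


class ToyNumericalColumn(ToyColumn):
    pass


masked = ToyNumericalColumn().set_mask(b"\x01")
```

A checker such as mypy would treat `masked` as a `ToyNumericalColumn`, so subclass-specific methods remain available on the result without a cast.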
57 changes: 27 additions & 30 deletions python/cudf/cudf/_lib/column.pyx
@@ -60,14 +60,14 @@ cdef class Column:
The *dtype* indicates the Column's element type.
"""
def __init__(
-self,
-object data,
-int size,
-object dtype,
-object mask=None,
-int offset=0,
-object null_count=None,
-object children=()
+self,
+object data,
+int size,
+object dtype,
+object mask=None,
+int offset=0,
+object null_count=None,
+object children=()
):

self._size = size
@@ -247,10 +247,10 @@ cdef class Column:
)

return cudf.core.column.build_column(
-self.data,
-self.dtype,
-mask,
-self.size,
+data=self.data,
+dtype=self.dtype,
+mask=mask,
+size=self.size,
offset=0,
children=self.children
)
@@ -561,25 +561,22 @@ cdef class Column:
children = tuple(children)

result = cudf.core.column.build_column(
-data,
-dtype,
-mask,
-size,
-offset,
-null_count,
-tuple(children)
+data=data,
+dtype=dtype,
+mask=mask,
+size=size,
+offset=offset,
+null_count=null_count,
+children=tuple(children)
)

return result


-def make_column_from_scalar(object py_val, size_type size):
-
-cdef DeviceScalar val = py_val.device_value
-
-cdef const scalar* c_val = val.get_raw_ptr()
-cdef unique_ptr[column] c_result
-with nogil:
-c_result = move(cpp_make_column_from_scalar(c_val[0], size))
-
-return Column.from_unique_ptr(move(c_result))
+@staticmethod
+def from_scalar(py_val, size_type size):
+cdef DeviceScalar val = py_val.device_value
+cdef const scalar* c_val = val.get_raw_ptr()
+cdef unique_ptr[column] c_result
+with nogil:
+c_result = move(cpp_make_column_from_scalar(c_val[0], size))
+return Column.from_unique_ptr(move(c_result))
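
The `build_column` call sites in this file switch from positional to keyword arguments, in line with the commit "Make build_column keyword only args to avoid confusion". As a minimal illustration of why that helps, the sketch below defines a hypothetical `build_column_sketch` with keyword-only parameters. Its name, parameter list, and return value are assumptions for illustration only and are not cudf's actual signature.

```python
from typing import Any, Dict, Optional, Tuple


def build_column_sketch(
    *,  # every parameter after the bare star must be passed by name
    data: Any,
    dtype: str,
    mask: Any = None,
    size: Optional[int] = None,
    offset: int = 0,
    null_count: Optional[int] = None,
    children: Tuple[Any, ...] = (),
) -> Dict[str, Any]:
    # Return the arguments as a dict so the binding is easy to inspect.
    return {
        "data": data,
        "dtype": dtype,
        "mask": mask,
        "size": size,
        "offset": offset,
        "null_count": null_count,
        "children": children,
    }


result = build_column_sketch(data=b"abcd", dtype="int8", size=4)

# A positional call is rejected outright, so `mask` and `size` cannot be
# swapped silently.
try:
    build_column_sketch(b"abcd", "int8", None, 4)  # type: ignore[misc]
    positional_call_rejected = False
except TypeError:
    positional_call_rejected = True
```

With several adjacent parameters that all accept `None` or an integer, naming each argument makes a mismatch between a caller and the signature fail loudly instead of binding to the wrong parameter.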
29 changes: 29 additions & 0 deletions python/cudf/cudf/_lib/table.pyi
@@ -0,0 +1,29 @@
# Copyright (c) 2021, NVIDIA CORPORATION.

from typing import List, Any, Optional, TYPE_CHECKING

import cudf

class Table(object):
_data: cudf.core.column_accessor.ColumnAccessor
_index: Optional[cudf.core.index.Index]

def __init__(self, data: object = None, index: object = None) -> None: ...

@property
def _num_columns(self) -> int: ...

@property
def _num_indices(self) -> int: ...

@property
def _num_rows(self) -> int: ...

@property
def _column_names(self) -> List[Any]: ...

@property
def _index_names(self) -> List[Any]: ...

@property
def _columns(self) -> List[Any]: ... # TODO: actually, a list of columns
28 changes: 28 additions & 0 deletions python/cudf/cudf/_typing.py
@@ -0,0 +1,28 @@
# Copyright (c) 2021, NVIDIA CORPORATION.

from typing import TYPE_CHECKING, Any, TypeVar, Union

import numpy as np
from pandas import Period, Timedelta, Timestamp
from pandas.api.extensions import ExtensionDtype

if TYPE_CHECKING:
import cudf

# Many of these are from
# https://github.com/pandas-dev/pandas/blob/master/pandas/_typing.py

Dtype = Union["ExtensionDtype", str, np.dtype]
DtypeObj = Union["ExtensionDtype", np.dtype]

# scalars
DatetimeLikeScalar = TypeVar(
"DatetimeLikeScalar", Period, Timestamp, Timedelta
)
ScalarLike = Any

# columns
ColumnLike = Any

# binary operation
BinaryOperand = Union["cudf.Scalar", "cudf.core.column.ColumnBase"]
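
`_typing.py` imports `cudf` only under `TYPE_CHECKING` and refers to it through quoted names such as `"cudf.Scalar"`, which lets the aliases be defined without importing the package at runtime (presumably to avoid a circular import). The sketch below reproduces the pattern with a standard-library module standing in for `cudf`. The alias `NumberLike` and the function `double` are hypothetical names used only for illustration.

```python
from typing import TYPE_CHECKING, Union

if TYPE_CHECKING:
    # Seen only by the type checker; skipped when the module actually runs.
    import fractions

# The quoted name becomes a forward reference and is not evaluated here.
NumberLike = Union[int, float, "fractions.Fraction"]


def double(value: "NumberLike") -> "NumberLike":
    return value * 2


doubled_int = double(3)
doubled_float = double(1.5)
```

The module runs even though `fractions` is never imported by it at runtime, while a type checker still resolves the quoted name against the guarded import.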
2 changes: 1 addition & 1 deletion python/cudf/cudf/core/__init__.py
@@ -1,6 +1,6 @@
# Copyright (c) 2018-2020, NVIDIA CORPORATION.

-from cudf.core import buffer, column, common
+from cudf.core import buffer, column, column_accessor, common
from cudf.core.buffer import Buffer
from cudf.core.dataframe import DataFrame, from_pandas, merge
from cudf.core.index import (
4 changes: 2 additions & 2 deletions python/cudf/cudf/core/abc.py
@@ -12,9 +12,9 @@
try:
import pickle5 as pickle
except ImportError:
-import pickle
+import pickle # type: ignore
else:
-import pickle
+import pickle # type: ignore


class Serializable(abc.ABC):
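
The `# type: ignore` comments in this hunk sit on the fallback imports that bind the name `pickle` a second time; a type checker may flag such rebinding as a redefinition, which is presumably why they were added. The sketch below shows one way the full pattern could look. The enclosing `sys.version_info` check is an assumption, since the hunk starts below that point, and the round trip at the end is illustrative only.

```python
import sys

if sys.version_info < (3, 8):  # assumed guard; not visible in the hunk
    try:
        # Backport of pickle protocol 5 for older interpreters.
        import pickle5 as pickle
    except ImportError:
        import pickle  # type: ignore
else:
    import pickle  # type: ignore

# Whichever module was bound, the round trip behaves the same way.
payload = pickle.loads(
    pickle.dumps({"a": 1}, protocol=pickle.HIGHEST_PROTOCOL)
)
```

The comment silences errors on that one line only, so the rest of the module stays fully type checked.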