Releases: cupy/cupy

v2.1.0

17 Nov 05:37
@hvy hvy
4867ff1

This is the release of v2.1.0. See here for the complete list of solved issues and merged PRs.

New features

  • Add argpartition (#608)
  • Add window functions (#612, thanks @ishihara1989!)
    • blackman, hamming, hanning
  • Support sparse.coo_matrix initialization with other types of sparse matrices (#626)
  • Line memory profiler using memory hook and traceback (#630)
  • Support dtype argument in random.randint (#706)
  • cuDNN grouped convolution (#721, thanks @anaruse!)
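
The window functions added in #612 share their names with numpy.blackman, numpy.hamming and numpy.hanning. As a rough, stdlib-only sketch of what such windows compute, the reference below uses the textbook cosine-sum definitions given in the NumPy documentation. That the CuPy versions follow exactly these formulas is an assumption here, and the helper name _cosine_window is purely illustrative.

```python
import math


def _cosine_window(m, coeffs):
    """Cosine-sum window: sum_k (-1)^k * a_k * cos(2*pi*k*n / (m - 1))."""
    if m < 1:
        return []
    if m == 1:
        return [1.0]
    return [
        sum((-1) ** k * a * math.cos(2.0 * math.pi * k * n / (m - 1))
            for k, a in enumerate(coeffs))
        for n in range(m)
    ]


def hanning(m):
    # 0.5 - 0.5 * cos(2*pi*n / (m - 1))
    return _cosine_window(m, (0.5, 0.5))


def hamming(m):
    # 0.54 - 0.46 * cos(2*pi*n / (m - 1))
    return _cosine_window(m, (0.54, 0.46))


def blackman(m):
    # 0.42 - 0.5 * cos(2*pi*n / (m - 1)) + 0.08 * cos(4*pi*n / (m - 1))
    return _cosine_window(m, (0.42, 0.5, 0.08))


window = hamming(5)  # symmetric, peaks at the centre sample
```

On a machine with CuPy and a GPU, comparing these lists against the output of the corresponding cupy functions would be one way to check the assumption.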

Improvements

  • Performance improvements
    • Optimize sparse.csc_matrix.__mul__ (#625)
    • Cythonize memory hook (#728)
  • Support uint32 sampling up to 0xffffffff in random.RandomState.interval (#633)
  • Fix random.RandomState.seed to only accept integer types (#709)
  • Fix typo in IndexError error message (#683)
  • Fix interface for cuDNN find algorithm APIs (#664)

Bug fixes

  • Fix OverflowError passing large integer to elementwise operation (#615)
  • Fix indexing zero-dimensional array with boolean mask (#645)
  • Set up Python’s built-in random state in testing.fix_random (#648)
  • Use v6 RNN API when using cuDNN7 to avoid incompatibility (#665, thanks @anaruse!)
  • Set arch option for NVRTC, as the option is necessary on some GPUs (#696, thanks @grafi-tt!)
  • Fix memory pool for multi-threaded applications (#697)
  • Fix var and std to correctly handle ddof argument (#711, thanks @stevendbrown!)
  • Fix advanced indexing to not alter the indices (#723, thanks @yuyu2172!)
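
For context on the var and std fix (#711): in the NumPy API that CuPy mirrors, ddof ("delta degrees of freedom") changes the divisor from N to N - ddof. The stdlib-only sketch below illustrates that definition; it is not CuPy's implementation, and the function names simply echo the NumPy ones.

```python
import math


def var(values, ddof=0):
    """Variance with divisor N - ddof (the NumPy definition)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


def std(values, ddof=0):
    """Standard deviation as the square root of var(values, ddof)."""
    return math.sqrt(var(values, ddof))


data = [1.0, 2.0, 3.0, 4.0]
population_var = var(data)          # divisor is 4
sample_var = var(data, ddof=1)      # divisor is 3
```

A result that does not change when ddof changes is the kind of symptom a fix like this would address.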

Documentation

  • Fix a link in README.md to the contribution guide (#629)
  • Remove unrelated “see also” from testing.numpy_cupy_raises (#637, thanks @Hakuyume!)
  • Write note about environment variables for installation (#641)
  • Fix reference page of linalg (#651)
  • Fix doctest for Python 3.5 (#663)
  • Add intersphinx mapping to Chainer (#666)
  • Fix typo and heading in documentation (#667)
  • Update testing section in the contribution guide (#716)
  • Fix a link in README.md to the forum (#754, thanks @muupan!)
  • Fix incorrect heading “CuPy” instead of “NumPy” in license page (#674)

Test

  • Use the latest Cython in Travis CI (#636)
  • Fix typo (#647, thanks @Hakuyume!)
  • Move to PyTest
    • Move to PyTest (#659)
    • Remove nose dependency in tests (#676)
    • Use pytest-warnings to check deprecated warnings (#729)
  • Fix doctest for Python 3.5 (#663)
  • Allow filtering test cases by number of GPUs with CUPY_TEST_GPU_LIMIT environment variable (#677)
  • Ignore ComplexWarning in numpy.pad for NumPy 1.11 or older (#690)
  • Fix NumPy warning for bool and complex operations (#708)
  • Fix test of where to use different seeds for different arrays (#710)
  • Skip some dtypes in test_einsum (#740)
  • Skip some tests for old NumPy (#746)

Others

  • Improve version embedding (#652)

v3.0.0a1

17 Oct 06:30
Pre-release

This is the release of CuPy v3.0.0a1. See here for the complete list of solved issues and merged PRs.

New features

  • Memory pool is now used as the default allocator even if CuPy is used without Chainer (#472).
  • Add line memory profiler using memory hook and traceback (#265)
  • Add cuDNN support for dropout. (#479)
  • Add cudnnGetTensor4dDescriptor for fp16 BatchNormalization support in Chainer (#492, thanks @anaruse!)
  • Add Tensor-Core support (cuDNN and cuBLAS) (#494 and #495, thanks @anaruse!)
  • Add window functions (#555, thanks @ishihara1989!)
  • Add cupy.sparse.random (#557)
  • Add cupy.argpartition (#294)

Bug fixes

  • Fix multithread problem in PooledMemory (#480)
  • Resolve dealloc problem and multithread problem in PinnedMemory (#481)
  • Fix cupy.nonzero for corner cases (#498)
  • Fix simple reduction for corner cases (#499)
  • Fix broadcast for corner cases (#543)
  • Fix broadcast_arrays return type (#545)
  • Avoid using global state in RandomState.choice (#556)
  • Fix csrmm2 to support transa (#565)
  • Fix csrmv (#571)
  • Avoid using dtype option in numpy.random.randint which is introduced in NumPy v1.11 (#574)

Improvements

  • Fix get_array_module to be aware of spmatrix (#568)
  • Use vector to improve free memory searching in malloc (#476)
  • Fix Cython warning on variable declaration (#491)
  • Check kernel name validity (#522)
  • Show NVRTC error code (#531)
  • Optimize RandomState.interval (#559)
  • Fix random.normal double memory consumption (#562)

Installation

  • Import memory_hooks (#502)
  • Avoid Cython 0.27.0 (#550)
  • Change minimum Cython version to 0.26.1 (#365, #530, #548)
  • Support NVCC environment variable (#501)

Documentation

  • Fix documentation of fusion functions (#497)
  • Add documentation of cupy.all and cupy.any function (#511)
  • Correct URLs in documentation (#547)
  • Fix typo (#614, thanks @fukatani!)

Examples

  • Add an example of option pricing using Black-Scholes equation (#473)

v2.0.0

17 Oct 06:57

This is a major release of CuPy v2.0.0. All of the updates since the previous major version (v1.0.0) can be found in the release notes below:

Important Updates

Supports the latest versions of the following libraries

In v2.0.0a1

  • We started using NVRTC instead of NVCC for kernel compilation. This change enables CuPy to run in an environment where CUDA is installed but NVCC is not available. Note that some features depending on Thrust (e.g. sorting functions) cannot be used if NVCC is not available at installation time.
  • Many functions for sorting, linear algebra, and others are added.

In v2.0.0b1

  • Sparse matrix. cupy.sparse is a module that implements scipy.sparse API using CUDA and cuSPARSE. We now have basic features for using sparse matrices on GPU.
  • New memory allocator (#168). The memory pool implementation is greatly updated. It is based on best-fit allocation with coalescing. When there are a large number of allocations with different sizes (e.g. NLP applications), the memory usage is improved and the number of re-allocations is reduced (which also reduces the running time).

In v2.0.0rc1

  • Complex numbers (#232)
  • Many new functions.

Bug fixes

  • Fix cupy.nonzero for corner cases (#504)
  • Fix simple reduction for corner cases (#505)
  • Fix multithread problem in PooledMemory (#507)
  • Resolve dealloc problem and multithread problem in PinnedMemory (#510)
  • Avoid using global state in RandomState.choice (#560)
  • Fix broadcast for corner cases (#577)
  • Fix csrmm2 to support transa (#601)
  • Fix csrmv (#607)

Improvements

  • Fix get_array_module to be aware of spmatrix (#586)
  • Show NVRTC error code (#538)
  • Optimize RandomState.interval (#585)
  • Fix random.normal double memory consumption (#592)
  • Check kernel name validity (#596)

Installation

  • Avoid Cython 0.27.0 (#579)
  • Import memory_hooks (#506)
  • Support NVCC environment variable (#537)

Documentation

  • Fix warnings (#535)
  • Add documentation of cupy.all and cupy.any function (#514)
  • Fix documentation of fusion functions (#517)
  • Treat sphinx warnings as errors (#519)
  • Correct URLs in documentation (#561, thanks @aonotas!)
  • Fix Cython requirement for documentation build (#566)

Tests

  • Fix doctest warnings (#500)
  • Use mock.patch instead of directly replacing function with Mock (#610)
  • Remove print() in tests (#509)
  • Travis fails with Cython 0.27. Use Cython 0.26.1 for a while (#539)
  • Add corner test cases for indexing (#576)
  • Add unit tests for csrgemm (#602)

Others

  • Avoid duplicate loop index (#520)

v2.0.0rc1

12 Sep 06:14
Pre-release

This is the release of CuPy v2.0.0rc1. See here for the complete list of solved issues and merged PRs.

Changes that break compatibility

  • Change the default value of the order argument of copy from 'C' to 'K' (#159)
  • Add order and subok arguments to array (#167). It breaks the compatibility of positional arguments.
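
To illustrate why inserting order and subok breaks positional calls, the sketch below uses two stand-in functions. Their parameter lists are assumptions modeled on the array signature as commonly documented, not copied from the CuPy source, and the names array_before and array_after are hypothetical.

```python
# Stand-ins for the signature before and after #167 (assumed parameter order).
def array_before(obj, dtype=None, copy=True, ndmin=0):
    return {"dtype": dtype, "copy": copy, "ndmin": ndmin}


def array_after(obj, dtype=None, copy=True, order="K", subok=False, ndmin=0):
    return {"dtype": dtype, "copy": copy, "order": order,
            "subok": subok, "ndmin": ndmin}


# A call that passes ndmin positionally silently changes meaning:
old = array_before([1, 2], None, True, 2)   # 2 is bound to ndmin
new = array_after([1, 2], None, True, 2)    # 2 is now bound to order

# Keyword arguments are unaffected by the inserted parameters.
safe = array_after([1, 2], dtype=None, copy=True, ndmin=2)
```

Passing everything after the first argument by keyword is the simplest way to stay compatible across both versions.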

New features

  • Complex numbers (#232)
  • Memory hook (#264). It can be used to observe the memory allocation/deallocation events.
  • New functions
    • Complex routines: angle, conj, imag, real (#232)
    • einsum (#199, thanks @fukatani!)
    • Linear algebra: linalg.solve (#207), linalg.tensorsolve (#215), linalg.inv (#441), linalg.pinv (#459)
    • Random numbers: random.shuffle (#216, thanks @KotaroSetoyama!)
    • Sorting: partition (#270)
  • New features in sparse matrices
    • Support dia_matrix (#313, #321, #320, #450)
    • Sparse matrix creation methods: eye (#399), spdiags (#388) and identity (#358)
    • csr_matrix and csc_matrix are improved: __mul__ (#239), __rmul__ (#300), __getitem__ (#240, #301, #302), dot (#351, #352)
    • Initializers of csr_matrix, csc_matrix, and coo_matrix support shape argument (#316, #375)
    • Sparse matrices can have duplicated elements (#326, #371)
    • order argument in toarray method of csc and csr (#311)
    • __pow__ (#359)
    • Conversion from a dense array to a sparse matrix (#335)
    • Support conversion from scipy.sparse matrix to cupy.sparse (#370)
  • Added support for new libraries
  • argsort for arrays of rank two or more (#288)
  • Fix race-condition on memory pool (#382)
  • Implemented copy option of array conversion methods and wrote tests (#408)
  • Enable saving CUDA source with environment variable (#415)
  • Basic support of CUDA unified memory (#447)
  • Use original function name as fusion kernel name (#448)
  • Support replace=False in random.choice (#453)
  • Add a sync option to time_range (#474, thanks @anaruse!)

Bug fixes

  • Fix bug of empty coo_matrix (#328)
  • Fix default behavior of methods in spmatrix (#356)
  • Made dummy implementation to prevent infinite loop (#364)
  • Avoid calling Python methods in __dealloc__(); use __del__() instead. (#381)
  • Fix race-condition on memory pool (#382)
  • Fix view when the itemsize of the dtype changes (#406, thanks @boeddeker!)
  • Use double backslash in str literal (#418)
  • Improved pow test (#421)
  • Use randint instead of random_integer, which is deprecated (#425)
  • Fix diagonal (#428, thanks @fukatani!)
  • Use six.assertRegex (#432)
  • Fix for NumPy 1.13 (#445)
  • Fix tocsc behavior for an empty dia matrix (#451)

Improvements

  • Tell the memory size when cudaErrorMemoryAllocation occurred (#314)
  • Simplify nogil (#164)
  • Skip cross compile on setup.py develop to build faster (#309)
  • Remove device memory allocation out of memory pool (#337)
  • Avoid importing NumPy docstring (#355)
  • Improve header handling (#367)
  • Remove redundant code in cupy_thrust.cu (#369)
  • Improve _tril() and _triu() with an ElementwiseKernel (#377)
  • Remove unnecessary condition (#383)
  • Add semicolons to the reduction kernel template (#386)
  • Remove redundant transpose (#390)
  • Fix usage of ElementwiseKernel (#391)
  • Remove duplicated preamble definition. (#402)
  • Fix cumsum (#414)
  • Use AxisError to maintain compatibility to multiple versions of NumPy (#437)
  • doc: Sort out navigation menu (#444)
  • Improve tensordot_core (#465)
  • Simplify flip (#468)
  • Use None instead of set() to improve memory allocation performance (#475)

Installation

  • Skip cross compile on setup.py develop to build faster (#309)
  • Fix double declaration of tuple_less (#368)
  • Made a customized version of sdist command to use Cython (#446)

Documentation

  • Fix a grammatical error in tutorial (#267)
  • Add cupy.sparse reference (#299, #303)
  • Cleanup README.md (#334)
  • Hide source link for alien objects (#354)
  • Avoid importing NumPy docstring (#355)
  • Remove unsupported strides argument from docstring (#361)
  • Fix matmul arguments (#384, thanks @hvy!)
  • Add link to our contribution guide (#392)
  • Update docstring of linalg.einsum (#405)
  • Write docstring of A property and its test (#407)
  • Use double backslash in str literal (#418)
  • Fix typo in sparse.spdiags docstring. (#426)
  • Remove "Edit on GitHub" link (#434)
  • Reorganize navigation menu (#444)
  • Clear doctest warnings (#455)
  • Add documents of linalg (#456)
  • Write docstring of sparse.issparse (#470)

Examples

  • Conjugate Gradient (#94, thanks @KotaroSetoyama!)

Tests

  • Example test (#297)
  • Add test for cuda.cusolver_enabled flag (#374)
  • Write tests for operators for sparse matrices (#401)
  • Write docstring of A property and its test (#407)
  • Fix test for random generator (#413)
  • Fix cumsum test (#414)
  • Add test for transpose when axes is not None (#420)
  • Improved pow test (#421)
  • Changed order argument for unknown order test as SciPy causes DeprecationWarning (#422)
  • Add tests for asfptype (#423)
  • Add assert_warns (#424)
  • Use randint instead of random_integer that is deprecated (#425)
  • Use six.assertRegex (#432)
  • Show error message when an error occurs on example test (#433)
  • Fix tests on Windows (#435)
  • Fix tolerance of arithmetic tests (#443)
  • Added test for __iter__ of csr_matrix (#449)
  • Fix tocsc behavior for an empty dia matrix (#451)
  • Fix test for tensorsolve (#454)
  • Skip NumPy clip tests in Windows (#467)
  • Fix typo in test function names (#394)

Others

  • Configure flake8 to ignore the .git directory (#339)

v1.0.3

12 Sep 07:00

This release includes bug fixes and improvements to the documentation and tests. See the list for the complete list of solved issues and merged PRs.

Bug fixes

  • Avoid decoding nvcc output with UTF-8 to remove UnicodeDecodeError. (#378, #379)
  • Bug in view with different itemsize. (#403, thanks @boeddeker!)
  • Avoid calling Python methods in __dealloc__ and use __del__ instead. (#411)
  • Fix ndarray.view when the itemsize of the dtype changes. (#416)
  • Fix inconsistency of ndarray.diagonal between NumPy and CuPy. (#436)

Improvements

  • Make a compilation error readable. (#380)
  • Add semicolons to the reduction kernel template. (#396)

Documentation

  • Remove unsupported strides argument from docstring. (#366)
  • Hide source link for alien objects. (#373)
  • Fix the document of matmul. (#412)
  • Use double backslashes in str literals. (#429)
  • Clear doctest warnings. (#457)
  • Sort out navigation menu. (#460)
  • Fix a grammatical error in tutorial. (#463)

Tests

  • Use randint instead of random_integer that is deprecated. (#430)
  • Add testing.assert_warns and test deprecation warning of Memory.free_all_free. (#431)
  • Skip some tests for RandomState when NumPy < 1.11.0. (#438)
  • Loosen the tolerance of tests for binary operators. (#461)
  • Fix typo in test names. (#395)

v2.0.0b1

01 Aug 06:53
Pre-release

This is a minor release. See https://github.com/cupy/cupy/milestone/8?closed=1 for the complete list of solved issues and merged PRs.

New features

Sparse matrix

cupy.sparse is a module that implements scipy.sparse API using CUDA and cuSPARSE. We now have basic features for using sparse matrices on GPU.

  • CSR and CSC (#226)
  • COO matrix (#234)
  • Conversion method from compressed matrix (csr, csc) to coordinate format (coo) (#235)
  • CSR and CSC copy (#236)
  • __add__, __radd__, __sub__ and __rsub__ for CSR and CSC (#238)
  • Fix toarray in cupy.sparse.spmatrix (#312)
  • Return NotImplemented instead of NotImplementedError (#330)
  • Use csc2dense to convert csr-matrix to dense (#305)

We are planning to add more features to cupy.sparse in upcoming releases.

New memory allocator (#168)

The memory pool implementation is greatly updated. It is based on best-fit allocation with coalescing. When there are a large number of allocations with different sizes (e.g. NLP applications), the memory usage is improved and the number of re-allocations is reduced (which also reduces the running time).

For example, the memory usage of the sequence-to-sequence code using Chainer (chainer/chainer#2070) is reduced from 12GiB (which means the process is using all of the available GPU memory) to 3GiB, and the number of memory reallocations from 20 times to 0 times.

It may increase the memory usage in some cases, although the amount of additional usage is small in practice (see the benchmark results in #168).

You can use this memory allocator by calling cupy.cuda.set_allocator(cupy.cuda.MemoryPool().malloc) (when using Chainer, it is called by default).
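
As a toy illustration of "best-fit allocation with coalescing", the sketch below manages offsets inside a fixed-size arena in pure Python. It is not CuPy's implementation: the class name ToyPool, the arena model and the use of integer offsets in place of device pointers are all assumptions made for the example.

```python
class ToyPool:
    """Toy best-fit allocator over a fixed arena; offsets stand in for pointers."""

    def __init__(self, arena_size):
        self.free_blocks = [(0, arena_size)]   # (offset, size), sorted by offset
        self.used = {}                         # offset -> size

    def malloc(self, size):
        # Best fit: choose the smallest free block that is large enough.
        candidates = [b for b in self.free_blocks if b[1] >= size]
        if not candidates:
            raise MemoryError("no free block of %d bytes" % size)
        offset, block_size = min(candidates, key=lambda b: b[1])
        self.free_blocks.remove((offset, block_size))
        if block_size > size:
            # Split: return the unused tail to the free list.
            self.free_blocks.append((offset + size, block_size - size))
            self.free_blocks.sort()
        self.used[offset] = size
        return offset

    def free(self, offset):
        size = self.used.pop(offset)
        self.free_blocks.append((offset, size))
        self.free_blocks.sort()
        # Coalesce: merge free blocks whose ranges touch.
        merged = [self.free_blocks[0]]
        for off, sz in self.free_blocks[1:]:
            last_off, last_sz = merged[-1]
            if last_off + last_sz == off:
                merged[-1] = (last_off, last_sz + sz)
            else:
                merged.append((off, sz))
        self.free_blocks = merged


pool = ToyPool(1024)
a = pool.malloc(256)
b = pool.malloc(256)
c = pool.malloc(256)
pool.free(a)
pool.free(b)           # a and b merge into a single 512-byte free block
d = pool.malloc(512)   # served from the coalesced block
```

Without the merge step in free, the 512-byte request would not fit in either of the two 256-byte holes, which is the fragmentation that workloads with many differently sized allocations tend to hit.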

Other features

  • Implement cupy.linalg.det (#96)
  • Support cupy.sort to sort arrays along arbitrary axis (#229)
  • Implemented RangeStart and RangeEnd for NVIDIA visual profiler (nvvp) (#246)
  • Introduce cupy.is_available() which takes account of device availability (#247)
  • Implement cupy.msort (#251, #329)

Bug fixes

  • Fix cupy.copyto function to treat multiple GPUs correctly (#220)
  • Restore kernel type check (#253)
  • Fix deepcopy with multiple devices (#254)
  • Fix cupy.argsort for non-contiguous arrays (#284)
  • Fix ldexp on Windows (#278)

Improvements

  • Improve cupy.argsort performance (#285)

Installation

  • Remove old cuDNN support (#219)
  • Add compile options to build on Windows (#244)
  • Remove duplicated build options (#280)
  • Avoid creating garbage file on setup (#287)
  • Fix setup for cusolver (#292)
  • Use cupy.cuda.thrust_enabled to check Thrust enabled (#224)

Documentation

  • Updated difference with NumPy on reduction function behavior (#144)
  • Fix spelling in tutorial (#268)
  • Fix test instruction in README (#310)
  • Fix links to GitHub source pages (#332)

Examples

  • Add Gaussian Mixture Model (GMM) example (#29, thanks @KotaroSetoyama!)
  • Make grid size an integer in the SGEMM example (#289, thanks @yuyu2172!)
  • Use absolute path in SGEMM example (#291)
  • Updated README for SGEMM example (#245, thanks @yuyu2172!)

Tests

  • Use cupy.testing.for_all_dtypes (#269)
  • Enable style check for Python code in Travis (#273)
  • Refactor cupy.argsort tests (#282)

Others

  • Small fixes for cupy.argsort (#223)

v1.0.2

01 Aug 06:13

This release includes bug fixes and improvements to the documentation and tests. See the list for the complete list of solved issues and merged PRs.

Enhancement

  • Change allocation_unit_size from 256 to 512 (#256)
  • Avoid synchronize in array function (#257)
  • Deterministic test (#217)
    • Note that this change includes an additional public function; we prioritized stabilizing tests more than keeping the rule of not introducing new features in stable updates.
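
To make the allocation_unit_size change (#256) concrete, the sketch below assumes that the memory pool rounds each requested size up to a multiple of the allocation unit. That rounding rule and the helper name round_up are assumptions for illustration, not taken from the CuPy source.

```python
def round_up(size, unit=512):
    """Round size up to the next multiple of unit."""
    return ((size + unit - 1) // unit) * unit


sizes = [1, 256, 300, 512, 513]
with_256 = [round_up(s, 256) for s in sizes]   # [256, 256, 512, 512, 768]
with_512 = [round_up(s, 512) for s in sizes]   # [512, 512, 512, 512, 1024]
```

Under this assumption a larger unit maps more distinct request sizes onto the same rounded size, which should make freed blocks easier to reuse at the cost of some padding.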

Bug fixes

  • Fix out argument in fusion ufunc (#242)
  • Fix array method on multi GPU (#258)
  • Fix deepcopy with multiple devices (#263)
  • Fix multi-device copyto (#275)
  • Fix link args for cusolver (#315)

Installation

  • Add compile option to build on Windows (#279)
  • Do not create a.out on running python setup.py develop (#293)
  • Fix link args for cusolver (#315)

Documentation

  • Fix spelling in tutorial (#272)
  • Fix difference of reduction functions (#324)
  • Fix GitHub link (#333)

Tests

  • Make tests deterministic when possible (#217)
  • Add unit tests for cupy.array (#259)
  • Fix NumPy VisibleDeprecationWarning in indexing tests (#261)
  • Add retry to unit tests of decomposition functions (#262)
  • Fix travis test to enable style check for normal Python code (#290)
  • Skip bool unary negative test (#341)

Other

  • Add include option for coverage (#286)
  • Ignore generated reference (#318)
  • Add tags file to .gitignore (#325)

v2.0.0a1

04 Jul 06:37
Pre-release

This is the release of CuPy v2.0.0a1. See here for the complete list of solved issues and merged PRs.

Release Notes

Important updates

  • We started using NVRTC instead of NVCC for kernel compilation. This change enables CuPy to run in an environment where CUDA is installed but NVCC is not available. Note that some features depending on Thrust (e.g. sorting functions) cannot be used if NVCC is not available at installation time.
  • Many functions for sorting, linear algebra, and others are added.

New features

  • Use NVRTC instead of NVCC to compile kernels (#33, #62)
  • Sorting functions
    • cupy.msort (#150)
    • cupy.lexsort (#132)
    • cupy.argsort (#67)
    • cupy.sort sorts arrays of rank two or more along the last axis (#186, #187)
    • Make cupy.sort support arrays with rank two or more. (#152)
  • Linear algebra functions
    • cupy.linalg.slogdet (#95)
    • cupy.linalg.matrix_rank (#97)
    • cupy.linalg.eigh and cupy.linalg.eigvalsh (#46)
  • Preliminary features to support sparse matrices
    • Note: the sparse matrix itself cannot be used in this version, yet; we are planning to make it usable in the next beta.
    • cupy.sparse.spmatrix, a base class of sparse matrices (#40)
    • Add cuSPARSE APIs (#39)
  • cupy.mgrid and cupy.ogrid (#145, thanks @iory!)
  • cupy.random.multinomial (#85)
  • cupy.cumprod (#110, thanks @ronekko!)
  • Support cuDNN v6 dilated convolution (#133, thanks @anaruse!)
  • Add total_bytes(), free_bytes(), and used_bytes() methods to memory pool (#184)
  • Support order option in astype (#111) and copy (#112)
  • cupy.fuse now does not require parentheses (#43)
  • Add ndim to CArray and CIndexer (#160, #161)
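
The note that cupy.fuse no longer requires parentheses (#43) describes a decorator that works both as @fuse and as @fuse(). A generic way to write such a decorator is sketched below. The fuse function here is a hypothetical stand-in that only forwards calls; it performs no kernel fusion and is not CuPy's implementation.

```python
import functools


def fuse(*args, **options):
    """Hypothetical stand-in usable as @fuse or as @fuse(...)."""
    def wrap(func):
        @functools.wraps(func)
        def wrapper(*a, **kw):
            # A real implementation would build and launch a fused kernel here.
            return func(*a, **kw)
        wrapper.options = options
        return wrapper

    # Bare usage: the decorated function arrives as the only positional argument.
    if len(args) == 1 and callable(args[0]) and not options:
        return wrap(args[0])
    # Parenthesized usage: return the actual decorator.
    return wrap


@fuse
def add(x, y):
    return x + y


@fuse()
def mul(x, y):
    return x * y
```

Both decorated functions behave identically, so existing code written with parentheses keeps working.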

Enhancement

  • Improve memory deallocation (#174)
  • Skip installing thrust support in case nvcc not found in PATH. (#91)
  • Improve asynchronous host to device copy (#123)
  • Change the allocation unit size from 256 to 512 (#176)
  • Work around the "No supported gcc/g++ host compiler found" error on Ubuntu 17.04 (#198)
  • Avoid synchronization in cupy.array for 0-dim values (#157)
  • Make cupy.count_nonzero return an array instead of int to avoid device-to-host synchronization (#154)
  • Check type in assert_array_list_equal (#205)
  • Improve performance (#169, #171, #172, #193, #206)
  • Improve testing utility (#218, #231)
  • Refactor cupy.atleast_nd functions (#142)

Bug fixes

  • Fix out argument in fusion (#209, #213)
  • Fix cupy.array on multiple GPU environment (#122, #135)
  • Fix usages of copy argument of ndarray.astype (#118, #121)
  • Make memory pool thread-safe (#105, thanks @kmaehashi!)
  • Fix fusion to reject NumPy arrays (#151)
  • Fix thread safety of cupy.random.get_random_state (#77, #78)

Documents

  • Fix tutorial (#93, thanks @hvy!)
  • Add links to GitHub source pages (#131)
  • Fix typo (#148, thanks @ignisan!)
  • Write about advanced indexing support (#88, thanks @yuyu2172!)
  • Remove description about discrepancy with NumPy regarding exponential of boolean arrays, which was resolved in NumPy 1.13.0 (#140)
  • Add missing documentation of cupy.cumsum (#90, thanks @ronekko!)
  • Add documentation of __getitem__ and __setitem__ for ndarray (#89, thanks @yuyu2172!)
  • Minor improvement for README and the document (#45, #49, #117, #134, #138, #155 thanks @ClimbsRocks!, #165, #177, #166)

Examples

Tests

  • Stabilize cupy.random.choice test (#98, #104)
  • Fix NumPy VisibleDeprecationWarning in indexing tests (#202)
  • Make random tests deterministic (#81, #82)
  • Retry unit tests of decomposition functions (#129)
  • Fix bug of histogram in RandomState.interval test (#175)

Others

  • Add SciPy license (#196)
  • Fix error message in setup script (#139)

v1.0.1

04 Jul 08:11

This release includes bug fixes and improvements to the documentation and tests. See the list for the complete list of solved issues and merged PRs.

Release Notes

Enhancement

  • Work around the "No supported gcc/g++ host compiler found" error on Ubuntu 17.04 (#243)

Bug fixes

  • Make memory pool thread-safe (#109, thanks @kmaehashi!)
  • Fix fusion to reject NumPy arrays (#241)
  • Fix thread safety of cupy.random.get_random_state (#77, #99)

Documents

  • Fix markdown in the tutorial (#106, thanks @hvy!)
  • Write about advanced indexing support (#126, thanks @yuyu2172!)
  • Remove description about discrepancy with NumPy regarding exponential of boolean arrays, which was resolved in NumPy 1.13.0 (#146)
  • Fix typo in the tutorial (#153, thanks @ignisan!)
  • Other documentation improvements (#125, #189, #173, #210)

Examples

  • Fix color argument in the k-means example (#107)

Install

  • Skip installing thrust support in case nvcc not found in PATH. (#116)
  • Other installation improvements (#143)

Others

v1.0.0

01 Jun 07:04

This is the release of CuPy v1.

This release also contains updates of CuPy included in Chainer v1.23.0 and v1.24.0. See the release note of Chainer v1.23.0 and the release note of Chainer v1.24.0 for the details.

Announcements

The set of supported versions of CUDA and cuDNN is changed from Chainer v1.x as follows.

  • CUDA 7.0 and later
  • cuDNN 4.0 and later

Release Notes

Note: We had originally planned to include NVRTC support for the just-in-time compilation of kernels via pynvrtc, but we found that there is no guarantee on pynvrtc being compatible with old versions of CUDA, so we decided to make our own wrapper instead. Unfortunately, it cannot be included in this version. We are planning to add NVRTC support in the next version.

New features

  • Add cupy.sort function (#55, #66, #68)
  • 64bit address support on CUDA (#31)
  • Support CUPY_SEED environment variable (#44)

Enhancement

  • Refactor carray.cuh file (#53, #56, #57)
  • Support lock-free cache of compiled nvcc binary (#37)
  • Allow cupy.copyto from Python scalar (#38)
  • Improve setup process (#65, #69, #70, #73, #76, #80)

Bug fixes

  • Fix cupy.random.choice (#84)

Documents

Examples

  • Add KMeans example (#35)

Tests

  • Improve test stability (#48, #50)