Quickstart#

Acknowledgments: This notebook adapts the content of this NCAR tutorial to Xarray, and uses it to illustrate cupy-xarray and, more generally, working with CuPy arrays wrapped in Xarray objects.

Setup#

import cupy as cp
import cupy_xarray  # Adds .cupy to Xarray objects
import numpy as np
import xarray as xr

Creating Arrays#

First we create arrays on the CPU and GPU

# NumPy data (host / cpu)
x_cpu = np.linspace(0, 2, 5)
print("On the CPU: ", x_cpu)

# CuPy data
x_gpu = cp.linspace(2, 4, 5)
print("On the GPU: ", x_gpu)
On the CPU:  [0.  0.5 1.  1.5 2. ]
On the GPU:  [2.  2.5 3.  3.5 4. ]

And now wrap the GPU array in an Xarray DataArray:

da_gpu = xr.DataArray(x_gpu, dims="x")
da_gpu
<xarray.DataArray (x: 5)>
array([2. , 2.5, 3. , 3.5, 4. ])
Dimensions without coordinates: x

That was easy! Xarray seamlessly wraps array objects that implement the relevant NumPy protocols.

For array-specific functionality, Xarray recommends separate packages that provide “accessors” on Xarray objects.

For example, the pint-xarray package wraps unit-aware pint arrays and provides a .pint accessor for unit-specific functionality.

In this tutorial, we demonstrate cupy-xarray, which provides a .cupy accessor exposing CuPy-specific functionality.
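
As a toy illustration of the accessor mechanism (this example is made up here and is not part of cupy-xarray or pint-xarray), an accessor is simply a class registered under a chosen name on DataArray or Dataset:

# Hypothetical accessor; the "hello" name and n_elements property are invented for this sketch
@xr.register_dataarray_accessor("hello")
class HelloAccessor:
    def __init__(self, da):
        self._da = da

    @property
    def n_elements(self):
        return self._da.size

xr.DataArray([1, 2, 3]).hello.n_elements
3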

Checking for cupy arrays#

Unfortunately, the text representation of CuPy arrays isn’t very informative, so it isn’t obvious that this DataArray wraps a CuPy array on the GPU.

da_gpu
<xarray.DataArray (x: 5)>
array([2. , 2.5, 3. , 3.5, 4. ])
Dimensions without coordinates: x

Instead, we’ll use the is_cupy property provided by the .cupy accessor:

da_gpu.cupy.is_cupy
True
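
Equivalently (a plain-Python check, not part of the accessor API), you can inspect the type of the wrapped array yourself:

isinstance(da_gpu.data, cp.ndarray)
True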

Accessing the underlying array#

Use the DataArray.data property to access the underlying CuPy array:

da_gpu.data
array([2. , 2.5, 3. , 3.5, 4. ])

This means we now have access to CuPy-specific properties, such as the device the array lives on:

da_gpu.data.device
<CUDA Device 0>
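
Other CuPy-specific methods are available the same way. For example, cupy.ndarray.get copies the device data back to the host as a NumPy array (shown here as a quick sketch):

# .get() returns a NumPy copy of the device array
da_gpu.data.get()
array([2. , 2.5, 3. , 3.5, 4. ])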

Moving data between CPU and GPU (or host and device)#

Xarray provides DataArray.as_numpy() to convert all kinds of wrapped arrays to NumPy arrays:

# Move data to host
da_cpu = da_gpu.as_numpy()
da_cpu
<xarray.DataArray (x: 5)>
array([2. , 2.5, 3. , 3.5, 4. ])
Dimensions without coordinates: x

Let’s make sure this isn’t a CuPy array anymore:

da_cpu.cupy.is_cupy
False

To convert a NumPy-backed object to a CuPy-backed one (i.e. move the data to the GPU), use the .cupy.as_cupy() method; the same conversion is also available directly as DataArray.as_cupy(), as shown below:

# Move data to GPU
da_cpu.cupy.as_cupy()
<xarray.DataArray (x: 5)>
array([2. , 2.5, 3. , 3.5, 4. ])
Dimensions without coordinates: x
da_cpu.as_cupy().cupy.is_cupy
True
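
As a quick sanity check (not in the original notebook), a round trip from GPU to CPU and back preserves both the values and the CuPy backing:

# GPU -> CPU -> GPU round trip
roundtrip = da_gpu.as_numpy().as_cupy()
print("is_gpu:", roundtrip.cupy.is_cupy)  # expected: True
np.testing.assert_allclose(roundtrip.as_numpy(), da_gpu.as_numpy())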

Most Xarray operations preserve array type#

expanded = da_gpu.expand_dims(y=3)
expanded.cupy.is_cupy
True
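
A couple more sketches of the same point (not from the original notebook): positional indexing and concatenation should also keep the data on the GPU:

# indexing and concatenation return CuPy-backed objects as well
subset = da_gpu.isel(x=slice(0, 3))
stacked = xr.concat([da_gpu, da_gpu], dim="x")
print(subset.cupy.is_cupy, stacked.cupy.is_cupy)  # expected: True True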

Alignment#

Alignment is a fundamental Xarray operation, and it preserves array types:

aligned = xr.align(da_gpu, expanded)
[a.cupy.is_cupy for a in aligned]
[True, True]

Broadcasting#

da_gpu2 = da_gpu.rename({"x": "y"})
broadcasted = xr.broadcast(da_gpu, da_gpu2)
[a.cupy.is_cupy for a in broadcasted]
[True, True]

Basic Arithmetic#

# works on both CPU and GPU
print("is_gpu: ", (da_gpu + 1).cupy.is_cupy)
da_gpu + 1
is_gpu:  True
<xarray.DataArray (x: 5)>
array([3. , 3.5, 4. , 4.5, 5. ])
Dimensions without coordinates: x
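
Arithmetic between two GPU-backed DataArrays stays on the GPU as well; this sketch reuses da_gpu2 from the broadcasting example above:

# broadcasts to dims ("x", "y") and remains a CuPy-backed DataArray
product = da_gpu * da_gpu2
print("is_gpu:", product.cupy.is_cupy)  # expected: True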

NumPy universal functions#

np.min(da_gpu.mean()).cupy.is_cupy
True

We can also use np.round, which dispatches to the wrapped CuPy array:

np.round(da_gpu.mean(), 2).cupy.is_cupy
True
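
Other ufuncs behave the same way; for example, np.sin operates directly on the wrapped CuPy array:

np.sin(da_gpu).cupy.is_cupy
True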

High-level Xarray functions#

Groupby works#

Note that this currently runs a slow for loop over groups; an explicit parallel algorithm could be added to flox.

da_gpu.groupby("x").mean(...)
<xarray.DataArray (x: 5)>
array([2. , 2.5, 3. , 3.5, 4. ])
Dimensions without coordinates: x

Since groupby works, groupby_bins and resample should also work.

Rolling windows do not work#

CuPy needs to add support for sliding_window_view; a possible workaround is sketched after the traceback below.

da_gpu.rolling(x=3).mean().cupy.is_cupy
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [19], in <cell line: 1>()
----> 1 da_gpu.rolling(x=3).mean().cupy.is_cupy

File ~/miniconda3/envs/gpu/lib/python3.10/site-packages/xarray/core/rolling.py:155, in Rolling._reduce_method.<locals>.method(self, keep_attrs, **kwargs)
    151 def method(self, keep_attrs=None, **kwargs):
    153     keep_attrs = self._get_keep_attrs(keep_attrs)
--> 155     return self._numpy_or_bottleneck_reduce(
    156         array_agg_func,
    157         bottleneck_move_func,
    158         rolling_agg_func,
    159         keep_attrs=keep_attrs,
    160         fillna=fillna,
    161         **kwargs,
    162     )

File ~/miniconda3/envs/gpu/lib/python3.10/site-packages/xarray/core/rolling.py:580, in DataArrayRolling._numpy_or_bottleneck_reduce(self, array_agg_func, bottleneck_move_func, rolling_agg_func, keep_attrs, fillna, **kwargs)
    576     return self._bottleneck_reduce(
    577         bottleneck_move_func, keep_attrs=keep_attrs, **kwargs
    578     )
    579 if rolling_agg_func:
--> 580     return rolling_agg_func(self, keep_attrs=self._get_keep_attrs(keep_attrs))
    581 if fillna is not None:
    582     if fillna is dtypes.INF:

File ~/miniconda3/envs/gpu/lib/python3.10/site-packages/xarray/core/rolling.py:169, in Rolling._mean(self, keep_attrs, **kwargs)
    168 def _mean(self, keep_attrs, **kwargs):
--> 169     result = self.sum(keep_attrs=False, **kwargs) / self.count(keep_attrs=False)
    170     if keep_attrs:
    171         result.attrs = self.obj.attrs

File ~/miniconda3/envs/gpu/lib/python3.10/site-packages/xarray/core/rolling.py:155, in Rolling._reduce_method.<locals>.method(self, keep_attrs, **kwargs)
    151 def method(self, keep_attrs=None, **kwargs):
    153     keep_attrs = self._get_keep_attrs(keep_attrs)
--> 155     return self._numpy_or_bottleneck_reduce(
    156         array_agg_func,
    157         bottleneck_move_func,
    158         rolling_agg_func,
    159         keep_attrs=keep_attrs,
    160         fillna=fillna,
    161         **kwargs,
    162     )

File ~/miniconda3/envs/gpu/lib/python3.10/site-packages/xarray/core/rolling.py:589, in DataArrayRolling._numpy_or_bottleneck_reduce(self, array_agg_func, bottleneck_move_func, rolling_agg_func, keep_attrs, fillna, **kwargs)
    586     kwargs.setdefault("skipna", False)
    587     kwargs.setdefault("fillna", fillna)
--> 589 return self.reduce(array_agg_func, keep_attrs=keep_attrs, **kwargs)

File ~/miniconda3/envs/gpu/lib/python3.10/site-packages/xarray/core/rolling.py:472, in DataArrayRolling.reduce(self, func, keep_attrs, **kwargs)
    470 else:
    471     obj = self.obj
--> 472 windows = self._construct(
    473     obj, rolling_dim, keep_attrs=keep_attrs, fill_value=fillna
    474 )
    476 result = windows.reduce(
    477     func, dim=list(rolling_dim.values()), keep_attrs=keep_attrs, **kwargs
    478 )
    480 # Find valid windows based on count.

File ~/miniconda3/envs/gpu/lib/python3.10/site-packages/xarray/core/rolling.py:389, in DataArrayRolling._construct(self, obj, window_dim, stride, fill_value, keep_attrs, **window_dim_kwargs)
    384 window_dims = self._mapping_to_list(
    385     window_dim, allow_default=False, allow_allsame=False  # type: ignore[arg-type]  # https://github.com/python/mypy/issues/12506
    386 )
    387 strides = self._mapping_to_list(stride, default=1)
--> 389 window = obj.variable.rolling_window(
    390     self.dim, self.window, window_dims, self.center, fill_value=fill_value
    391 )
    393 attrs = obj.attrs if keep_attrs else {}
    395 result = DataArray(
    396     window,
    397     dims=obj.dims + tuple(window_dims),
   (...)
    400     name=obj.name,
    401 )

File ~/miniconda3/envs/gpu/lib/python3.10/site-packages/xarray/core/variable.py:2319, in Variable.rolling_window(self, dim, window, window_dim, center, fill_value)
   2315 axis = [self.get_axis_num(d) for d in dim]
   2316 new_dims = self.dims + tuple(window_dim)
   2317 return Variable(
   2318     new_dims,
-> 2319     duck_array_ops.sliding_window_view(
   2320         padded.data, window_shape=window, axis=axis
   2321     ),
   2322 )

File ~/miniconda3/envs/gpu/lib/python3.10/site-packages/xarray/core/duck_array_ops.py:640, in sliding_window_view(array, window_shape, axis)
    638     return dask_array_compat.sliding_window_view(array, window_shape, axis)
    639 else:
--> 640     return npcompat.sliding_window_view(array, window_shape, axis)

File <__array_function__ internals>:180, in sliding_window_view(*args, **kwargs)

TypeError: no implementation found for 'numpy.lib.stride_tricks.sliding_window_view' on types that implement __array_function__: [<class 'cupy.ndarray'>]
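
Until sliding_window_view is supported, one possible workaround (a sketch, not part of the original notebook) is to compute the rolling operation on the host and move the result back:

# compute the rolling mean on the CPU, then transfer the result back to the GPU
rolled = da_gpu.as_numpy().rolling(x=3).mean().as_cupy()
print("is_gpu:", rolled.cupy.is_cupy)  # expected: True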

Weighted operations work#

weights = xr.DataArray(cp.asarray([0, 0.5, 1, 0.5, 0]), dims="x")
da_gpu.weighted(weights).sum().cupy.is_cupy
True

Plotting works#

Plotting automatically moves the data to the CPU before passing it on to matplotlib:

da_gpu.plot()
[<matplotlib.lines.Line2D at 0x2b7c1f73f190>]

Apply custom kernels with apply_ufunc#

(This kernel was copied from this NCAR tutorial)

x = cp.arange(6, dtype="f").reshape(2, 3)
y = cp.arange(3, dtype="f")

kernel = cp.ElementwiseKernel(
    "float32 x, float32 y",
    "float32 z",
    """
    if (x - 2 > y) {
      z = x * y;
    } else {
      z = x + y;
    }
    """,
    "my_kernel",
)

kernel(x, y)
array([[ 0.,  2.,  4.],
       [ 0.,  4., 10.]], dtype=float32)
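
For reference (an addition here, not in the original), the kernel is equivalent to this element-wise expression written with cp.where:

# same logic as the custom kernel, expressed with cp.where
cp.where(x - 2 > y, x * y, x + y)
array([[ 0.,  2.,  4.],
       [ 0.,  4., 10.]], dtype=float32)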

We can apply this and other custom kernels using xarray.apply_ufunc:

xda = xr.DataArray(x, dims=("a", "b"))
yda = xr.DataArray(y, dims=("b"))
result = xr.apply_ufunc(
    kernel,
    xda,
    yda,
)
print("is_gpu:", result.cupy.is_cupy)
result
is_gpu: True
<xarray.DataArray (a: 2, b: 3)>
array([[ 0.,  2.,  4.],
       [ 0.,  4., 10.]], dtype=float32)
Dimensions without coordinates: a, b
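
apply_ufunc also works with ordinary CuPy operations. As a final sketch (not from the original notebook), here we reduce over a core dimension using the array's own sum method; the data never leaves the GPU:

# reduce over the "b" dimension; apply_ufunc hands the CuPy array to the lambda unchanged
summed = xr.apply_ufunc(
    lambda arr: arr.sum(axis=-1),
    xda,
    input_core_dims=[["b"]],
)
print("is_gpu:", summed.cupy.is_cupy)  # expected: True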