Reading In Grid Data#

UXarray offers support for loading and representing unstructured grids by utilizing existing Xarray functionality paired with new routines that are specifically written for operating on unstructured grids.

Overview#

When working with unstructured grids, the grid definition and data variables are often stored as separate files (most of the time as netCDF). This means that there are multiple separate files that need to be read in at once.

For example, the NOAA Geoflow project consists of 4 files (1 grid file and 3 data files). This project follows the UGRID conventions. Special thanks to John Clyne, Shilpi Gupta, and the VAPOR team for providing these data!

geoflow-small
│   grid.nc
│   v1.nc
│   v2.nc
│   v3.nc

Opening The Grid and Data Files with Xarray#

First, read in the grid definition and data variable netCCDF files by using xarray.open_dataset. In addition, xr.open_mf_dataset can be used for combining multiple data variables into a single xarray.Dataset.

import xarray as xr
# Base data path
base_path = "../../test/meshfiles/geoflow-small/"

# Path to Grid file
grid_path = base_path + "grid.nc"

# Paths to Data Variable files
var_names = ['v1.nc', 'v2.nc', 'v3.nc']
data_paths = [base_path + name for name in var_names]
# Open the Grid file
grid_ds = xr.open_dataset(grid_path)

# Open a single file or multiple files
v1_ds = xr.open_dataset(data_paths[0])
multi_data_ds = xr.open_mfdataset(data_paths)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/file_manager.py:209, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    208 try:
--> 209     file = self._cache[self._key]
    210 except KeyError:

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/lru_cache.py:55, in LRUCache.__getitem__(self, key)
     54 with self._lock:
---> 55     value = self._cache[key]
     56     self._cache.move_to_end(key)

KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('/home/docs/checkouts/readthedocs.org/user_builds/uxarray/checkouts/v2023.02.0/test/meshfiles/geoflow-small/grid.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False)), '7c2e88ef-9739-4b74-83df-b2e6875eb5f6']

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
Cell In[3], line 2
      1 # Open the Grid file
----> 2 grid_ds = xr.open_dataset(grid_path)
      4 # Open a single file or multiple files
      5 v1_ds = xr.open_dataset(data_paths[0])

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/api.py:541, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, backend_kwargs, **kwargs)
    529 decoders = _resolve_decoders_kwargs(
    530     decode_cf,
    531     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)
    537     decode_coords=decode_coords,
    538 )
    540 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 541 backend_ds = backend.open_dataset(
    542     filename_or_obj,
    543     drop_variables=drop_variables,
    544     **decoders,
    545     **kwargs,
    546 )
    547 ds = _dataset_from_backend_dataset(
    548     backend_ds,
    549     filename_or_obj,
   (...)
    557     **kwargs,
    558 )
    559 return ds

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/netCDF4_.py:578, in NetCDF4BackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
    557 def open_dataset(
    558     self,
    559     filename_or_obj,
   (...)
    574     autoclose=False,
    575 ):
    577     filename_or_obj = _normalize_path(filename_or_obj)
--> 578     store = NetCDF4DataStore.open(
    579         filename_or_obj,
    580         mode=mode,
    581         format=format,
    582         group=group,
    583         clobber=clobber,
    584         diskless=diskless,
    585         persist=persist,
    586         lock=lock,
    587         autoclose=autoclose,
    588     )
    590     store_entrypoint = StoreBackendEntrypoint()
    591     with close_on_error(store):

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/netCDF4_.py:382, in NetCDF4DataStore.open(cls, filename, mode, format, group, clobber, diskless, persist, lock, lock_maker, autoclose)
    376 kwargs = dict(
    377     clobber=clobber, diskless=diskless, persist=persist, format=format
    378 )
    379 manager = CachingFileManager(
    380     netCDF4.Dataset, filename, mode=mode, kwargs=kwargs
    381 )
--> 382 return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/netCDF4_.py:329, in NetCDF4DataStore.__init__(self, manager, group, mode, lock, autoclose)
    327 self._group = group
    328 self._mode = mode
--> 329 self.format = self.ds.data_model
    330 self._filename = self.ds.filepath()
    331 self.is_remote = is_remote_uri(self._filename)

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/netCDF4_.py:391, in NetCDF4DataStore.ds(self)
    389 @property
    390 def ds(self):
--> 391     return self._acquire()

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/netCDF4_.py:385, in NetCDF4DataStore._acquire(self, needs_lock)
    384 def _acquire(self, needs_lock=True):
--> 385     with self._manager.acquire_context(needs_lock) as root:
    386         ds = _nc4_require_group(root, self._group, self._mode)
    387     return ds

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/contextlib.py:119, in _GeneratorContextManager.__enter__(self)
    117 del self.args, self.kwds, self.func
    118 try:
--> 119     return next(self.gen)
    120 except StopIteration:
    121     raise RuntimeError("generator didn't yield") from None

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/file_manager.py:197, in CachingFileManager.acquire_context(self, needs_lock)
    194 @contextlib.contextmanager
    195 def acquire_context(self, needs_lock=True):
    196     """Context manager for acquiring a file."""
--> 197     file, cached = self._acquire_with_cache_info(needs_lock)
    198     try:
    199         yield file

File ~/checkouts/readthedocs.org/user_builds/uxarray/conda/v2023.02.0/lib/python3.9/site-packages/xarray/backends/file_manager.py:215, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    213     kwargs = kwargs.copy()
    214     kwargs["mode"] = self._mode
--> 215 file = self._opener(*self._args, **kwargs)
    216 if self._mode == "w":
    217     # ensure file doesn't get overridden when opened again
    218     self._mode = "a"

File src/netCDF4/_netCDF4.pyx:2463, in netCDF4._netCDF4.Dataset.__init__()

File src/netCDF4/_netCDF4.pyx:2026, in netCDF4._netCDF4._ensure_nc_success()

FileNotFoundError: [Errno 2] No such file or directory: b'/home/docs/checkouts/readthedocs.org/user_builds/uxarray/checkouts/v2023.02.0/test/meshfiles/geoflow-small/grid.nc'
grid_ds
<xarray.Dataset>
Dimensions:          (nMeshFaces: 3840, nFaceNodes: 4, nMeshNodes: 6000, meshLayers: 20)
Coordinates:
    mesh_node_x      (nMeshNodes) float64 0.0 5.214 16.5 ... 62.7 68.8 72.0
    mesh_node_y      (nMeshNodes) float64 58.28 59.8 62.06 ... -31.68 -31.72
Dimensions without coordinates: nMeshFaces, nFaceNodes, nMeshNodes, meshLayers
Data variables:
    mesh             int32 -2147483647
    mesh_face_nodes  (nMeshFaces, nFaceNodes) float64 0.0 1.0 ... 5.998e+03
    mesh_depth       (meshLayers, nMeshNodes) float64 ...
multi_data_ds
<xarray.Dataset>
Dimensions:  (time: 1, meshLayers: 20, nMeshNodes: 6000)
Coordinates:
  * time     (time) float64 13.0
Dimensions without coordinates: meshLayers, nMeshNodes
Data variables:
    v1       (time, meshLayers, nMeshNodes) float64 dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>
    v2       (time, meshLayers, nMeshNodes) float64 dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>
    v3       (time, meshLayers, nMeshNodes) float64 dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>

Representing The Unstructured Grid with UXarray#

import uxarray as ux
# Construct a UXarray Grid object from `grid_ds`
grid = ux.Grid(grid_ds)
grid.ds
<xarray.Dataset>
Dimensions:          (nMeshFaces: 3840, nFaceNodes: 4, nMeshNodes: 6000, meshLayers: 20)
Coordinates:
    mesh_node_x      (nMeshNodes) float64 0.0 5.214 16.5 ... 62.7 68.8 72.0
    mesh_node_y      (nMeshNodes) float64 58.28 59.8 62.06 ... -31.68 -31.72
Dimensions without coordinates: nMeshFaces, nFaceNodes, nMeshNodes, meshLayers
Data variables:
    mesh             int32 -2147483647
    mesh_face_nodes  (nMeshFaces, nFaceNodes) float64 0.0 1.0 ... 5.998e+03
    mesh_depth       (meshLayers, nMeshNodes) float64 ...

As can be seen, the Grid object has its own grid.ds of type xarray.Dataset to define the grid topology. However, the Grid object has further attributes, variables, and functions to specify the unstructured grid and be executed on it, which can be explored in the next notebooks.