Working with Unstructured Grid Data#
Authors: Philip Chmielowiec, Orhan Eroglu
UXarray offers support for loading and representing unstructured grids by providing Xarray-like functionality paired with new routines that are specifically written for operating on unstructured grids.
Grid Definition and Data Variables#
When working with Unstructured Grids, the grid definition and data variables are often stored as separate files. This means that there are multiple separate files that need to be read and linked together to represent the entire dataset.
For example, the following sample dataset is taken from the NOAA Geoflow project, which is made up of 4 files: 1 grid definition and 3 data files. (Special thanks to John Clyne, Shilpi Gupta, and the VAPOR team for providing this data!)
geoflow-small
│ grid.nc
│ v1.nc
│ v2.nc
│ v3.nc
Grid Conventions#
Given the complexity of Unstructured Grids, there are many different ways of representing their underlying topology and structure. These representations are referred to as conventions, and they outline the required connectivity variables, naming conventions, data types, and many other specifications. UXarray uses the UGRID conventions as a foundation for internally representing Unstructured Grids, converting any supported input grid format into the UGRID convention at the data loading step. Below is a list of supported formats and conventions that can be read in with UXarray:
UGRID
Model for Prediction Across Scales (MPAS)
Exodus
In addition to loading datasets, we also provide support for constructing a grid from user-defined primitives such as vertices, which is showcased in our other notebooks.
Reading Grid and Data Files#
UXarray provides the UxDataset
data structure, which is an unstructure grid-informed implementation of Xarray’s Dataset
class. The main addition is the introduction of the uxgrid
property, which stores our grid topology dimensions, coordinates, variables and provides grid-specific functions.
Constructing a UxDataset
can be done using our custom open_dataset
and open_mfdataset
methods, depending on whether one or multiple data files or objects are meant to be linked to a single grid.
import uxarray as ux
import numpy as np
# Base data path
base_path = "../../test/meshfiles/ugrid/geoflow-small/"
# Path to Grid file
grid_path = base_path + "grid.nc"
# Paths to Data Variable files
var_names = ['v1.nc', 'v2.nc', 'v3.nc']
data_paths = [base_path + name for name in var_names]
Loading a single data file with a grid is done using the open_dataset
method. The resulting UxDataset
only contains the data variables stored in v1.nc
.
uxds_single = ux.open_dataset(grid_path, data_paths[0])
uxds_single
<xarray.UxDataset> Size: 960kB Dimensions: (time: 1, meshLayers: 20, n_node: 6000) Coordinates: * time (time) float64 8B 13.0 Dimensions without coordinates: meshLayers, n_node Data variables: v1 (time, meshLayers, n_node) float64 960kB -0.009766 ... 0.03283
Similarly, if you wish to open multiple data files with a grid, you would use the open_mfdataset
method. The resulting UxDataset
contains all the data variables stored in v1.nc
, v2.nc
, and v3.nc
uxds_multiple = ux.open_mfdataset(grid_path, data_paths)
uxds_multiple
<xarray.UxDataset> Size: 3MB Dimensions: (time: 1, meshLayers: 20, n_node: 6000) Coordinates: * time (time) float64 8B 13.0 Dimensions without coordinates: meshLayers, n_node Data variables: v1 (time, meshLayers, n_node) float64 960kB dask.array<chunksize=(1, 20, 6000), meta=np.ndarray> v2 (time, meshLayers, n_node) float64 960kB dask.array<chunksize=(1, 20, 6000), meta=np.ndarray> v3 (time, meshLayers, n_node) float64 960kB dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>
Grid Topology#
Each dataset contains the aforementioned uxgrid
property, which is a Grid
object and represents the grid topology that the data variables lie on. The uxgrid
property can be used to execute grid specific functions and access grid topology dimensions, coordinates, and variables. A detailed overview of functionalities can be found in subsequent notebooks.
For both instances of UxDataset
that contain single and multiple data sets (i.e. uxds_single
and uxds_multiple
), the uxgrid
property contains the same grid information, however they are each instantiated separately.
# check if the grids contain the same variables & information
print(uxds_single.uxgrid == uxds_multiple.uxgrid)
# check if the grids point to the same object in memory
print(uxds_single.uxgrid is uxds_multiple.uxgrid)
True
False
Printing out the uxgrid
property provides an overview of the original grid format, dimensions, coordinates, and connectivity variables.
uxds_multiple.uxgrid
<uxarray.Grid>
Original Grid Type: UGRID
Grid Dimensions:
* n_node: 6000
* n_edge: 9600
* n_face: 3840
* n_max_face_nodes: 4
* two: 2
* n_nodes_per_face: (3840,)
Grid Coordinates (Spherical):
* node_lon: (6000,)
* node_lat: (6000,)
Grid Coordinates (Cartesian):
Grid Connectivity Variables:
* face_node_connectivity: (3840, 4)
* edge_node_connectivity: (9600, 2)
Grid Descriptor Variables:
These dimensions, coordinates, and connectivity variables can be accessed with attributes using the same names as shown in the print-out. Below are a few examples.
uxds_multiple.uxgrid.n_node
6000
uxds_multiple.uxgrid.node_lon
<xarray.DataArray 'node_lon' (n_node: 6000)> Size: 48kB array([ 0. , 5.213776, 16.497974, ..., 62.701014, 68.804082, 72. ]) Coordinates: node_lon (n_node) float64 48kB 0.0 5.214 16.5 29.14 ... 62.7 68.8 72.0 node_lat (n_node) float64 48kB 58.28 59.8 62.06 ... -31.38 -31.68 -31.72 Dimensions without coordinates: n_node Attributes: standard_name: longitude long_name: Longitude of 2D mesh nodes. units: degrees_east
uxds_multiple.uxgrid.face_node_connectivity
<xarray.DataArray 'face_node_connectivity' (n_face: 3840, n_max_face_nodes: 4)> Size: 123kB array([[ 0, 1, 6, 5], [ 1, 2, 7, 6], [ 2, 3, 8, 7], ..., [5991, 5992, 5997, 5996], [5992, 5993, 5998, 5997], [5993, 5994, 5999, 5998]]) Dimensions without coordinates: n_face, n_max_face_nodes Attributes: cf_role: face_node_connectivity long_name: Maps every face to its corner nodes. start_index: 0 _FillValue: -9223372036854775808
Data Variables#
While grid-specific variables and functions are stored under the uxgrid
property, data variables that lie on the grid are stored directly in the UxDataset
or UxDataArray
. Most Xarray
functions and operators can be executed on these data structures.
uxds_single.values
<bound method Mapping.values of <xarray.UxDataset> Size: 960kB
Dimensions: (time: 1, meshLayers: 20, n_node: 6000)
Coordinates:
* time (time) float64 8B 13.0
Dimensions without coordinates: meshLayers, n_node
Data variables:
v1 (time, meshLayers, n_node) float64 960kB -0.009766 ... 0.03283>
uxds_single.dims
FrozenMappingWarningOnValuesAccess({'time': 1, 'meshLayers': 20, 'n_node': 6000})
uxds_single.coords
Coordinates:
* time (time) float64 8B 13.0
uxds_single.attrs
{}
uxds_single.min()
<xarray.UxDataset> Size: 8B Dimensions: () Data variables: v1 float64 8B -18.53
uxds_single > 0
<xarray.UxDataset> Size: 120kB Dimensions: (time: 1, meshLayers: 20, n_node: 6000) Coordinates: * time (time) float64 8B 13.0 Dimensions without coordinates: meshLayers, n_node Data variables: v1 (time, meshLayers, n_node) bool 120kB False False ... True True
grid = uxds_single.uxgrid
foo = ux.UxDataArray(
data = np.random.random(grid.n_face),
dims = ["n_face"],
uxgrid = grid
)
foo
<xarray.UxDataArray (n_face: 3840)> Size: 31kB array([0.34134569, 0.89137306, 0.50184783, ..., 0.920733 , 0.20588804, 0.20177541]) Dimensions without coordinates: n_face
uxds_new_var = uxds_single.assign({"foo" : foo})
uxds_new_var
<xarray.UxDataset> Size: 991kB Dimensions: (time: 1, meshLayers: 20, n_node: 6000, n_face: 3840) Coordinates: * time (time) float64 8B 13.0 Dimensions without coordinates: meshLayers, n_node, n_face Data variables: v1 (time, meshLayers, n_node) float64 960kB -0.009766 ... 0.03283 foo (n_face) float64 31kB 0.3413 0.8914 0.5018 ... 0.9207 0.2059 0.2018