Reading Data

When working with unstructured grid datasets, the grid definition and the data variables are often stored in separate netCDF files, so several files need to be read in together.

For example, the NOAA Geoflow project consists of 4 files (1 grid file and 3 data files). This project follows the UGRID conventions. Special thanks to John Clyne, Shilpi Gupta, and the VAPOR team for providing this data!

geoflow-small
│   grid.nc
│   v1.nc
│   v2.nc
│   v3.nc

import uxarray as ux

# Base project path
base_path = "../../test/meshfiles/geoflow-small/"

# Path to grid file 
grid_path = base_path + "grid.nc"

# Data file names (one file per data variable)
var_names = ['v1.nc', 'v2.nc', 'v3.nc']

# List of all data variable paths
data_paths = [base_path + name for name in var_names]
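
The same paths can also be built with `pathlib` instead of string concatenation, which avoids mistakes with trailing slashes. The sketch below is only an illustration: `missing_files` is a hypothetical helper, not part of uxarray, and whether `ux.open_dataset` accepts `Path` objects directly is not confirmed here, so converting with `str(path)` before passing them on is the safer assumption.

```python
from pathlib import Path

# Same project layout as above
base_dir = Path("../../test/meshfiles/geoflow-small")
grid_file = base_dir / "grid.nc"
data_files = [base_dir / name for name in ("v1.nc", "v2.nc", "v3.nc")]


def missing_files(paths):
    """Hypothetical helper: return the paths that do not exist on disk."""
    return [p for p in paths if not p.exists()]


# Report missing files before trying to open anything
missing = missing_files([grid_file, *data_files])
if missing:
    print("Missing files:", ", ".join(str(p) for p in missing))
```

Checking for missing files up front gives a clearer message than a failure partway through reading the grid and data files.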

The open_dataset(grid_file, *args) function handles opening and reading grid and data files. It takes a single grid file and any number of data files and creates a uxarray.Grid object from them.

These Grid objects are used to describe our unstructured grid and have many attributes and methods that can directly operate on our grid. A more in-depth description of this functionality can be found in our documentation and in future usage examples.

# Opening 1 grid file
grid_1 = ux.open_dataset(grid_path)

# Opening 1 grid file and 1 data file
grid_2 = ux.open_dataset(grid_path, data_paths[0])

# Opening 1 grid file and 2 data files
grid_3 = ux.open_dataset(grid_path, data_paths[0], data_paths[1])

# Opening 1 grid with a list of data files
grid_4 = ux.open_dataset(grid_path, *data_paths)
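
The last call relies on Python's argument unpacking: the `*` spreads the list into separate positional arguments. The sketch below illustrates this with `open_dataset_stub`, a hypothetical stand-in that mirrors only the `(grid_file, *args)` signature described above. It opens no files and says nothing about how uxarray handles its arguments internally.

```python
def open_dataset_stub(grid_file, *args):
    """Hypothetical stand-in: records its arguments and opens no files."""
    return {"grid_file": grid_file, "data_files": list(args)}


base_path = "../../test/meshfiles/geoflow-small/"
grid_path = base_path + "grid.nc"
data_paths = [base_path + name for name in ["v1.nc", "v2.nc", "v3.nc"]]

# Unpacking the list is the same as writing every path out by hand
unpacked = open_dataset_stub(grid_path, *data_paths)
explicit = open_dataset_stub(grid_path, data_paths[0], data_paths[1], data_paths[2])
print(unpacked == explicit)

# Without the *, the whole list arrives as a single positional argument
not_unpacked = open_dataset_stub(grid_path, data_paths)
print(len(unpacked["data_files"]), len(not_unpacked["data_files"]))
```

Forgetting the `*` is an easy mistake to make, since the call still looks plausible but passes one list instead of three paths.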

Now let’s look at the xarray.Dataset that our Grid object constructs. This dataset contains the coordinates and data variables that represent our unstructured grid. As shown in the open_dataset() examples above, all data variables are contained within this single dataset, regardless of how many data files are passed in.

# Accessing underlying dataset
grid_4.ds
<xarray.Dataset>
Dimensions:          (nMeshFaces: 3840, nFaceNodes: 4, nMeshNodes: 6000,
                      meshLayers: 20, time: 1)
Coordinates:
    mesh_node_x      (nMeshNodes) float64 0.0 5.214 16.5 ... 62.7 68.8 72.0
    mesh_node_y      (nMeshNodes) float64 58.28 59.8 62.06 ... -31.68 -31.72
  * time             (time) float64 13.0
Dimensions without coordinates: nMeshFaces, nFaceNodes, nMeshNodes, meshLayers
Data variables:
    mesh             int32 -2147483647
    mesh_face_nodes  (nMeshFaces, nFaceNodes) uint32 0 1 6 5 ... 5994 5999 5998
    mesh_depth       (meshLayers, nMeshNodes) float64 ...
    v1               (time, meshLayers, nMeshNodes) float64 dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>
    v2               (time, meshLayers, nMeshNodes) float64 dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>
    v3               (time, meshLayers, nMeshNodes) float64 dask.array<chunksize=(1, 20, 6000), meta=np.ndarray>
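
Because this is a regular xarray dataset, the usual xarray indexing applies to it. The sketch below builds a small synthetic, zero-filled dataset with the same dimension names and sizes as the `v1`, `v2`, and `v3` variables above. It is not the Geoflow data and leaves out the mesh variables. With the real files, the same calls would be made on `grid_4.ds`.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the data variables shown above (all zeros)
dims = ("time", "meshLayers", "nMeshNodes")
shape = (1, 20, 6000)

ds = xr.Dataset(
    data_vars={name: (dims, np.zeros(shape)) for name in ("v1", "v2", "v3")},
    coords={"time": [13.0]},
)

# Select one data variable by name
v1 = ds["v1"]
print(v1.dims)

# Select the first time step and the first mesh layer by integer position
first_layer = v1.isel(time=0, meshLayers=0)
print(first_layer.shape)

# List every data variable held in the dataset
print(list(ds.data_vars))
```

Selecting by dimension name with `isel` keeps the code readable and independent of the order in which the dimensions are stored.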