# Working with Unstructured Grid Data

Authors: [Philip Chmielowiec](https://github.com/philipc2), [Orhan Eroglu](https://github.com/erogluorhan)

UXarray offers support for loading and representing unstructured grids
by providing Xarray-like functionality paired with new routines that
are specifically written for operating on unstructured grids.


## Grid Definition and Data Variables

When working with unstructured grids, the grid definition and the data variables
are often stored in separate files, so several files must be read and linked
together to represent the entire dataset.

For example, the following sample dataset, taken from the NOAA Geoflow project,
consists of four files: one grid definition and three data files. (Special thanks to John Clyne, Shilpi Gupta, and the VAPOR team for providing this data!)

```
geoflow-small
│   grid.nc
│   v1.nc
│   v2.nc
│   v3.nc
```
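
As a rough illustration of this layout, the sketch below splits a flat file listing into one grid file and a sorted list of data files using only the standard library. The helper name `split_grid_and_data` and the rule that the grid file is called `grid.nc` are assumptions made for this example; they are not part of UXarray.

```python
from pathlib import PurePosixPath


def split_grid_and_data(file_names, grid_name="grid.nc"):
    """Separate a flat list of file names into (grid file, sorted data files)."""
    paths = [PurePosixPath(name) for name in file_names]
    grid = [p for p in paths if p.name == grid_name]
    if len(grid) != 1:
        raise ValueError(f"expected exactly one {grid_name!r}, found {len(grid)}")
    # Everything that is not the grid definition is treated as a data file
    data = sorted(p for p in paths if p.name != grid_name)
    return grid[0], data


listing = [
    "geoflow-small/v2.nc",
    "geoflow-small/grid.nc",
    "geoflow-small/v1.nc",
    "geoflow-small/v3.nc",
]
grid_file, data_files = split_grid_and_data(listing)
print(grid_file)
print([str(p) for p in data_files])
```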


## Grid Conventions

Given the complexity of Unstructured Grids, there are many different ways of representing their underlying topology and structure. These representations are referred to as conventions, and they outline
the required connectivity variables, naming conventions, data types, and many other specifications. UXarray uses the [UGRID](http://ugrid-conventions.github.io/ugrid-conventions/)
conventions as a foundation for internally representing Unstructured Grids, converting any supported input grid format into the UGRID convention at the data loading step. Below is a list of supported formats and conventions that can be read in with UXarray:
* UGRID
* Model for Prediction Across Scales (MPAS)
* Exodus

In addition to loading datasets, we also provide support for constructing a grid from user-defined primitives such as vertices, which is showcased in our other notebooks.
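
To make the idea of converting between representations more concrete, here is a minimal sketch in plain Python. It takes faces given as lists of coordinate pairs and rewrites them as a shared node table plus a face-to-node index table, which is the general shape of a UGRID-style description. The function name `to_connectivity` and the padding value `-1` are assumptions for illustration; this is not the conversion code UXarray runs.

```python
FILL = -1  # assumed padding value for faces with fewer nodes than the widest face


def to_connectivity(polygons):
    """Convert faces given as coordinate lists into a node table and index rows."""
    node_ids = {}  # (lon, lat) -> node id, assigned in first-seen order
    faces = []
    for polygon in polygons:
        # Reuse the id of a node seen before, otherwise assign the next free id
        faces.append([node_ids.setdefault(pt, len(node_ids)) for pt in polygon])
    width = max(len(face) for face in faces)
    # Pad shorter faces so every row has the same length
    connectivity = [face + [FILL] * (width - len(face)) for face in faces]
    nodes = list(node_ids)  # dicts preserve insertion order
    return nodes, connectivity


# One quadrilateral and one triangle that share an edge
polygons = [
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    [(1.0, 0.0), (2.0, 0.5), (1.0, 1.0)],
]
nodes, face_nodes = to_connectivity(polygons)
print(nodes)
print(face_nodes)
```

The two shared corner points appear only once in the node table, and both faces refer to them by index.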


## Reading Grid and Data Files
UXarray provides the `UxDataset` data structure, which is an unstructured grid-informed implementation of Xarray's `Dataset` class. The main addition is the `uxgrid` property, which stores the grid topology dimensions, coordinates, and variables and provides grid-specific functions.

A `UxDataset` can be constructed with our custom `open_dataset` and `open_mfdataset` functions, depending on whether one or multiple data files or objects are meant to be linked to a single grid.


In [None]:
import uxarray as ux
import numpy as np

In [None]:
# Base data path
base_path = "../../test/meshfiles/ugrid/geoflow-small/"

# Path to Grid file
grid_path = base_path + "grid.nc"

# Paths to Data Variable files
var_names = ['v1.nc', 'v2.nc', 'v3.nc']

data_paths = [base_path + name for name in var_names]

Loading a single data file with a grid is done using the `open_dataset` function. The resulting `UxDataset` only contains the data variables stored in `v1.nc`.

In [None]:
uxds_single = ux.open_dataset(grid_path, data_paths[0])
uxds_single

Similarly, to open multiple data files with a grid, use the `open_mfdataset` function. The resulting `UxDataset` contains all the data variables stored in `v1.nc`, `v2.nc`, and `v3.nc`.

In [None]:
uxds_multiple = ux.open_mfdataset(grid_path, data_paths)
uxds_multiple

## Grid Topology

Each dataset contains the aforementioned `uxgrid` property, which is a `Grid` object and represents the grid topology that the data variables lie on. The `uxgrid` property can be used to execute grid-specific functions and access grid topology dimensions, coordinates, and variables. A detailed overview of functionalities can be found in subsequent notebooks.

For both instances of `UxDataset`, whether they hold a single data file or multiple data files (i.e. `uxds_single` and `uxds_multiple`), the `uxgrid` property contains the same grid information. However, each `uxgrid` is instantiated separately.


In [None]:
# check if the grids contain the same variables & information
print(uxds_single.uxgrid == uxds_multiple.uxgrid)

# check if the grids point to the same object in memory
print(uxds_single.uxgrid is uxds_multiple.uxgrid)
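
The difference between the two checks is ordinary Python behaviour: `==` compares contents, while `is` compares object identity. The sketch below shows this with a small stand-in class. `MiniGrid` and `load_grid` are hypothetical names made up for this example and are not UXarray's `Grid` implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniGrid:
    """Hypothetical stand-in for a grid object; not UXarray's Grid class."""

    n_node: int
    n_face: int


def load_grid():
    # Each call builds a new object, like reading the same grid file twice
    return MiniGrid(n_node=5, n_face=2)


grid_a = load_grid()
grid_b = load_grid()

print(grid_a == grid_b)  # value comparison: same contents
print(grid_a is grid_b)  # identity comparison: two separate objects
```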

Printing out the `uxgrid` property provides an overview of the original grid format, dimensions, coordinates, and connectivity variables.

In [None]:
uxds_multiple.uxgrid

These dimensions, coordinates, and connectivity variables can be accessed with attributes using the same names as shown in the print-out. Below are a few examples.

In [None]:
uxds_multiple.uxgrid.n_node

In [None]:
uxds_multiple.uxgrid.node_lon

In [None]:
uxds_multiple.uxgrid.face_node_connectivity
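
These variables are related: each row of a face-node connectivity table holds indices into the node coordinate arrays. The NumPy sketch below uses small stand-in arrays (not values from `geoflow-small`) to look up the coordinates of each face's nodes and average them. The padding value `-1` is an assumption for this example, and averaging longitudes and latitudes directly is a simplification that ignores spherical geometry.

```python
import numpy as np

# Stand-in arrays for illustration only
node_lon = np.array([0.0, 1.0, 1.0, 0.0, 2.0])
node_lat = np.array([0.0, 0.0, 1.0, 1.0, 0.5])

fill = -1  # assumed padding value for faces with fewer nodes
face_node_connectivity = np.array([
    [0, 1, 2, 3],     # quadrilateral
    [1, 4, 2, fill],  # triangle, padded to the row width
])

# Mask out padding so it is never used as a real node index
valid = face_node_connectivity != fill
safe_idx = np.where(valid, face_node_connectivity, 0)

# Gather per-face node coordinates; padded slots become NaN
face_lon = np.where(valid, node_lon[safe_idx], np.nan)
face_lat = np.where(valid, node_lat[safe_idx], np.nan)

# Naive per-face mean of the node coordinates
face_lon_mean = np.nanmean(face_lon, axis=1)
face_lat_mean = np.nanmean(face_lat, axis=1)
print(face_lon_mean, face_lat_mean)
```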

## Data Variables

While grid-specific variables and functions are stored under the `uxgrid` property, data variables that lie on the grid are stored directly in the `UxDataset` or `UxDataArray`. Most `Xarray` functions and operators can be executed on these data structures.


In [None]:
# List the data variables held by the dataset
uxds_single.data_vars

In [None]:
uxds_single.dims

In [None]:
uxds_single.coords

In [None]:
uxds_single.attrs

In [None]:
uxds_single.min()

In [None]:
uxds_single > 0

In [None]:
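# Reuse the grid that the existing dataset is defined on
grid = uxds_single.uxgrid

# Create a new face-centered variable filled with random values
foo = ux.UxDataArray(
    data=np.random.random(grid.n_face),
    dims=["n_face"],
    uxgrid=grid,
)
foo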

In [None]:
# Return a new dataset that contains the existing variables plus "foo"
uxds_new_var = uxds_single.assign({"foo": foo})

In [None]:
uxds_new_var