core module#

Defines the core xsnowDataset class for the xsnow package.

This module contains the primary user-facing object, xsnowDataset. It is designed as a high-level, convenient wrapper around a standard xarray.Dataset object, tailored for handling multi-dimensional snow profile data.

Core Design Philosophy#

The xsnowDataset class aims to simplify the user experience without sacrificing the power and flexibility of the underlying xarray data model. It achieves this through several key features:

  1. Composition over Inheritance: This class contains an xarray.Dataset instance (in self.data) rather than inheriting from it. This design choice is more robust and prevents future updates to xarray from breaking this class, giving us full control over the user-facing API.

  2. Simplified Initialization: The constructor is designed to accept a pre-existing xarray.Dataset. The primary, user-facing way to create a dataset from files is the top-level xsnow.read() function. This makes the extension architecture clean and powerful.

  3. Seamless `xarray` Integration: Through dunder forwarding (__getattr__, __getitem__, __setitem__), the class mimics the behavior of an xarray.Dataset, allowing users to interact with it using the familiar and powerful xarray API (e.g., ds.sel(…), ds[‘density’]).

  4. Automatic Re-wrapping: When a forwarded method returns a new xarray.Dataset, the class automatically wraps the result back into an xsnowDataset instance. This ensures that the user can chain operations (e.g., ds.sel(…).mean(…)) and always retain the convenience of the xsnowDataset wrapper.

Vertical Coordinate System#

The package uses a physically intuitive vertical coordinate system. - z: The primary vertical coordinate, representing depth below the snow

surface in meters. The surface is z=0, and values are negative downwards.

  • height: A data variable representing the original height above the ground in meters.

class xsnow.core.xsnowDataset(source)#

Bases: object

A class to represent and interact with a multi-dimensional snow profile dataset.

This class acts as a high-level wrapper around an xarray.Dataset, providing a convenient API for analysis, I/O, and data manipulation. It forwards unrecognized attribute and method calls to the underlying xarray.Dataset, enabling seamless use of the powerful xarray API.

Initializes the xsnowDataset from an existing xarray.Dataset.

This is the low-level constructor for the class. The recommended, user-facing way to create an xsnowDataset from files is the top-level xsnow.read() function. This constructor is primarily used internally and by extension functions that return a new xsnowDataset instance.

Parameters:

source (Dataset) – An existing xarray.Dataset object.

copy(deep=False)#

Returns a copy of the dataset.

This method provides a convenient wrapper around the underlying xarray.Dataset.copy() method and ensures the result is a new xsnowDataset instance.

Parameters:

deep (bool) – If True, the data and metadata are fully copied. If False (default), only the metadata is copied, and the new dataset’s variables will still point to the original data buffers.

Return type:

xsnowDataset

Returns:

A new xsnowDataset instance.

Examples

>>> ds = xsnow.read('path/to/data')
>>> # Create a shallow copy (changes to ds_shallow affect ds)
>>> ds_shallow = ds.copy(deep=False)
>>> # Create a deep copy (a completely independent object)
>>> ds_deep = ds.copy(deep=True)
merge_smet(source, var_prefix=None, **kwargs)#

Reads SMET data and merges it into the dataset.

An optional prefix can be added to the variable names from the SMET file to avoid naming conflicts (e.g., when merging input forcing data with model output data).

Parameters:
  • source (Union[str, List[str], Path, List[Path]]) – Path to a .smet file, a directory, or a list of files.

  • var_prefix (Optional[str]) – A string prefix to add to all data variable names read from the SMET files (e.g., ‘input.’).

  • **kwargs – Additional keyword arguments passed to the underlying xsnow.read() function (e.g., time selectors).

Return type:

xsnowDataset

Returns:

A new xsnowDataset with the combined data.

to_xarray()#

Returns the underlying xarray.Dataset.

This method unwraps the xsnowDataset and provides direct access to the core xarray.Dataset object. This is useful for interoperability with other libraries or functions that are built to work with standard xarray objects.

Return type:

Dataset

Returns:

The stored xarray.Dataset object.

Examples

>>> ds = xsnow.read('path/to/data')
>>> # Get the raw xarray object for use with another library
>>> xr_ds = ds.to_xarray()
compute_hs()#

Computes the total snow height (HS) from layer heights.

This method calculates the maximum value of the ‘height’ variable along the ‘layer’ dimension for each profile. The result is added to the dataset as a new data variable named ‘HS’ (Total Snow Height).

The resulting ‘HS’ variable will not have a ‘layer’ dimension.

Return type:

xsnowDataset

Returns:

A new xsnowDataset instance with the ‘HS’ variable added.

Raises:

ValueError – If the dataset is empty or the ‘height’ variable is missing.

Examples

>>> ds = xsnow.read('path/to/data')
>>> ds_with_hs = ds.compute_hs()
>>> print(ds_with_hs['HS'])
compute_z_depth()#

Calculates the depth below the surface (z) for each snowpack layer and adds it as a non-dimension coordinate.

This method first computes the total snow height (HS) and then calculates z as (height - HS), adding it as a ‘z’ data coordinate to the dataset. Metadata is pulled from the central registry.

Return type:

xsnowDataset

Returns:

A new xsnowDataset instance with the ‘z’ coordinate added.

update_profile_status()#

Attach or update profile_status with very simple semantics: :rtype: Dataset

-1 : error during parsing (preserved as-is) 0 : no snow layer data 1 : existing snow layer data (this may include “no snow”, see below!)

Logic#

1. Start from existing profile_status if present, otherwise from 0 over the core dims (location, time, slope, realization that exist in the dataset).

Negative values (< 0) are *never* changed by this method.

2. For profile_status to be positive after this call, the dataset must contain the vertical (layer) variable z and

  • have negative z values (i.e., snow layers), or

  • NaN z values (i.e., no snow layers) but a pre-existing profile_status==1, or

  • NaN z values (i.e., no snow layers) but an existing HS==0.

Note that the profile_status counts *no snow* as existing layer information if the previous criteria are met.

If z doesn’t exists, all non-negative profile_status entries are set to 0 (no layer data). Similarly, if z exists with all-NaN and HS does not exist or is NaN, the status is set to 0.

profile_status_summary()#

Print a summary table of profile_status occurrences

Return type:

None

to_gpu()#

Moves the dataset’s data to the GPU using CuPy if available.

This method checks for an available GPU and the cupy library. If found, it converts the underlying NumPy arrays to CuPy arrays for accelerated computation. If cupy is not installed or a GPU is not available, it returns the original dataset without modification.

Return type:

xsnowDataset

Returns:

A new xsnowDataset instance with data on the GPU, or the original object if a GPU is not available.

Examples

>>> ds_cpu = xsnow.read('path/to/data')
>>> # Move to GPU for fast computation
>>> ds_gpu = ds_cpu.to_gpu()
to_cpu()#

Moves the dataset’s data back to the CPU (NumPy).

This is necessary before performing operations that are not supported on the GPU, such as saving to NetCDF. If the data is already on the CPU, it returns the original dataset.

Return type:

xsnowDataset

Returns:

A new xsnowDataset instance with data on the CPU.

Examples

>>> ds_gpu = ds_cpu.to_gpu()
>>> # ... perform GPU-accelerated analysis ...
>>> ds_cpu_again = ds_gpu.to_cpu()
>>> ds_cpu_again.to_netcdf('results.nc')
append_latest(source, from_time=None, join='left', logger=None, **kwargs)#

Incrementally extend a dataset by appending newer timestamps from files.

This method is a convenience wrapper for time selection, read, and concat. Semantics:

  1. Determine a start time:
    • If from_time is None, use (max(existing time) + 1s).

    • If from_time is provided, drop all times <= from_time from existing_ds and start reading from from_time (inclusive).

  2. Read source with that time filter.

  3. Concatenate along time using xsnow.concat.
    • join controls non-time dims (default: ‘left’ keeps existing domains).

    • Time is always the union along the concat axis (no extra reindexing).

Parameters:
  • existing_ds (xsnowDataset) – The dataset instance to extend.

  • source (Union[str, List[str], Path]) – Path(s) to files or directories to read new data from (calls read upon it).

  • from_time (Union[str, datetime, Timestamp, datetime64, None]) – Subset existing data to times < from_time and read new data starting with from_time (this will already be new data). (also read Semantics above) Accepts str/pandas/np datetime or datetime.

  • join (str) – Join mode for non-time dimensions forwarded to xsnow.concat.

  • logger (Optional[Logger]) – Optional logger to use for status/warning messages. If None, a logger is created.

  • **kwargs – Additional keyword args forwarded to xsnow.concat like compat or combine_attrs.

Return type:

xsnowDataset

Returns:

xsnowDataset with the appended data (or trimmed-only if nothing new).

classify(func, *, output_name='class_id', output_kind='coord', input_var=None, attrs=None, **func_kwargs)#

General classification: run a user function to compute a classification array, then attach it as a coordinate or data variable.

Parameters:
  • func (callable) – A custom classification function: Either f(xr.Dataset) -> xr.DataArray or f(xr.DataArray) -> xr.DataArray. The returned DataArray must align/broadcast within the dataset.

  • output_name (str, default "class_id") – Name of the resulting coord or data var.

  • output_kind ({"coord","data"}, default "coord") –

    How to store the classification result. * “coord”: attach as a coordinate (good for fast selection, e.g. ds.sel(hs_class=2)

    or boolean masks).

    • ”data”: attach as a data variable (good if you’ll plot it, compute stats on it, or

      don’t need to use it as a selector).

  • input_var (str, optional) – If set, call func(self.data[input_var]); else func(self.data).

  • attrs (dict, optional) – Metadata to attach to the resulting DataArray.

  • **func_kwargs – Extra keyword arguments forwarded to func.

Returns:

A new dataset wrapper with the classification attached.

Return type:

xsnowDataset

See also

classify_citeria

A generic, boolean classification function.

Examples

>>> def by_hs_tertiles(hs: xr.DataArray) -> xr.DataArray:
...     c = xr.zeros_like(hs, dtype=np.int64)
...     c = c.where(hs <= 0.5, 1)     # > 0.5 --> at least 1
...     c = c.where(hs <= 1.5, 2)     # > 1.5 --> 2
...     c.attrs["mapping"] = {0: "shallow", 1: "medium", 2: "deep"}
...     return c
>>> ds2 = ds.classify(by_hs, output_name="hs_class", output_kind="coord", input_var="HS")
>>> ds2.sel(hs_class=2)
>>> def classify_by_bins(da: xr.DataArray, *, bins: np.ndarray, right: bool = True) -> xr.DataArray:
        '''
        Bin `da` into integer classes using np.digitize.
        - `bins` are bin *edges* (length B) → classes are 0..B (B+1 classes).
        - If `right=True`, intervals are (-inf, b0], (b0, b1], ..., (b_{B-1}, +inf).
        Returns an int64 DataArray with same shape/coords as `da`.
        '''
        def _digitize(x, *, bins, right=False):
            return np.digitize(x, bins, right=right)
        classes = xr.apply_ufunc(
            _digitize,
            da,
            dask="allowed",
            kwargs={"bins": np.asarray(bins), "right": right},
            output_dtypes=[np.int64],
        )
        return classes
>>> hs_bins = np.array([0.25, 0.50, 1.00, 1.50, 2.00])
>>> hs_mapping = {
        0: "(-inf, 0.25]",
        1: "(0.25, 0.50]"
        2: "(0.50, 1.00]",
        3: "(1.00, 1.50]",
        4: "(1.50, 2.00]",
        5: "(2.00, +inf)"
    }
>>> class_attrs = {
        "mapping": hs_mapping,
        "long_name": "Snow depth class (6 bins)",
        "method": "np.digitize",
        "bin_edges": hs_bins.tolist(),
        "right_closed": True,
    }
>>> ds2 = ds.classify(
        classify_by_bins,
        output_name="HS_class",
        output_kind="coord",
        input_var="HS",          # classifier gets ds["HS"]
        attrs=class_attrs,
        bins=hs_bins,            # forwarded to classify_by_bins()
        right=True,
    )
>>> ds_extremely_deep = ds2.sel(HS_class=5)
classify_criteria(criteria, *, name='classification_mask', kind='mask', attrs=None)#

Criteria-based boolean classification: build a boolean mask from a string expression. The mask can be returned, or attached as a coordinate/data variable.

Parameters:
  • criteria (str) – Boolean conditions chained with ‘&’ and/or ‘|’, e.g. “density > 300 & grain_size < 0.5”. (Minimal parser: left-to-right; no parentheses/precedence.)

  • name (str, default "classification_mask") – Name for the attached output.

  • kind ({"mask","coord","data"}, default "mask") –

    • “mask”: return the mask (xr.DataArray)

    • ”coord”: attach as a coordinate and return xsnowDataset

    • ”data”: attach as a data var and return xsnowDataset

  • attrs (dict, optional) – Attributes to add to the mask array.

Return type:

xr.DataArray | xsnowDataset

See also

classify

Generic, customizable classification into multiple classes

Examples

>>> mask = ds.classify_criteria("density < 200 & grain_size > 0.8", kind="mask")
>>> ds_sel = ds.sel(layer=mask)
>>> ds2 = ds.classify_criteria("HS > 1.5", name="deep_snow", kind="coord")
>>> ds2.sel(deep_snow=True)
combine_first(*args, **kwargs)#

Placeholder to prevent accidental use of xarray’s Dataset.combine_first on xsnowDataset.

xsnowDataset provides fill_gaps_with(…) for profile-aware gap filling without introducing new coordinates (except layers).

Raises:

NotImplementedError – always; directs users to xsnowDataset.fill_gaps_with().

Return type:

xsnowDataset

compute_critical_crack_length()#

Compute and attach the variable rc_flat (critical crack length on flat terrain, in meters) to this xsnowDataset.

This wraps the physics in compute_rho_slab() and compute_rc_flat_richter2019() and handles: - pulling data out of the xarray-backed dataset - computing layer thickness from cumulative height - dispatching to NumPy or CuPy depending on backend - wrapping the result back into xarray with metadata

Requirements#

The dataset must contain:
  • density [kg m^-3]

  • grain_size [mm]

  • shear_strength [kPa]

  • height [m], cumulative height from ground upward

We assume:
  • Dimension layer exists and indexes from ground (layer=0) upward.

  • height is monotonically increasing along layer.

  • Layers above the true snow surface are padded with NaN thickness.

Slope angle / aspect:
  • Slope angle (inclination) and aspect (azimuth) are NOT used here. This is a flat-terrain formulation (same as in earlier xsnow versions).

returns:

new_ds – A new xsnowDataset with a new data variable rc_flat [m].

rtype:

xsnowDataset

compute_rho_slab(thickness, xp)#

Compute the average slab density above each physical layer.

For each layer i, we define the “slab” as all layers strictly above i (shallower, closer to the snow surface). We return the mean density of that overlying slab. The topmost physical layer has no slab above, so it receives NaN. Padded layers above the real surface are also NaN.

Assumptions#

  • Axis -1 is the ‘layer’ dimension.

  • layer=0 is the deepest layer (near ground).

  • layer increases upward toward the snow surface.

  • After the true surface layer, remaining entries are padding with NaN in both density and thickness.

  • thickness is the physical thickness [m] of each layer. Zero/NaN thickness means “not a real layer” (padding).

type density:

param density:

Shape (…, L). Layer density [kg m^-3], ordered bottom (0) -> surface.

type density:

xp.ndarray

type thickness:

param thickness:

Shape (…, L). Layer thickness [m], same ordering and shape.

type thickness:

xp.ndarray

type xp:

param xp:

Array backend (e.g. numpy or cupy).

type xp:

module

returns:

rho_slab – Shape (…, L). For each layer i, rho_slab[i] is the mean density of all VALID layers above i. The surface layer itself (no overlying slab) is NaN. Padded layers beyond the surface remain NaN.

rtype:

xp.ndarray

Notes

  • We mask out any padded layers (thickness <= 0 or NaN) so they don’t contaminate the cumulative sums.

  • We reverse along the layer axis, do cumsums from the top down, then reverse back. Finally, we shift so that layer i gets only the slab strictly above i.

fill_gaps_with(other, *, compat='override', combine_attrs='override')#

Fill gaps in this dataset using another dataset, without introducing new coordinates (except ‘layers’).

Semantics#

  • Domain: left join on the core dims

    (location, time, slope, realization) → i.e., only the coordinates present in self are considered.

  • Preference: prefer self (leftmost) whenever both datasets have a valid profile at a given (location, time, slope, realization).

  • Gap filling:
    • If self has a valid profile (profile_status >= 1) at an index, its profile is kept entirely (no mixing of layers).

    • If self has no valid profile and other does, the full profile from other is used there.

    • If neither has a valid profile, self is kept (usually NaNs).

Extra variables & dimensions#

  • Variables that exist only in other are ignored where self has no corresponding coordinates (because we restrict the domain to self).

  • Non-core dimensions (anything beyond location/time/slope/realization) follow xarray’s default outer alignment for the variables that are actually being merged.

type other:

Union[xsnowDataset, Dataset]

param other:

Dataset from which gaps in self will be filled.

type other:

xsnowDataset or xarray.Dataset

type compat:

Literal['broadcast_equals', 'no_conflicts', 'override']

param compat:

Compatibility check for overlapping non-null values. (see xsnow.combine?)

type compat:

{“broadcast_equals”, “no_conflicts”, “override”}, optional

type combine_attrs:

Literal['drop', 'no_conflicts', 'drop_conflicts', 'override']

param combine_attrs:

How to reconcile attrs across inputs. (see xsnow.combine?)

type combine_attrs:

{“drop”, “no_conflicts”, “drop_conflicts”, “override”}, optional

returns:

A new dataset of the same wrapper type as self.

rtype:

xsnowDataset

mask_by_criteria(criteria)#

Convenience wrapper around classify_criteria(…, kind=’mask’): masks non-matching layers (i.e., NaN) while keeping shape/dimensions identical. If the criteria are layer-based, scalars are untouched, and vice versa.

Return type:

xsnowDataset

Returns:

  • Masked xsnowDataset with identical shape/dimensions as the input, but NaNs occur where

  • the criteria are not True.

overwrite_with(other, *, compat='override', combine_attrs='override')#

Overwrite this dataset with another dataset on the union of coordinates.

Semantics#

  • Domain: outer join (a.k.a union) on the core dims

    (location, time, slope, realization).

    New coordinates from other that don’t exist in self are added.

  • Preference: prefer `other` where both datasets have valid profiles. This is implemented by calling combine([other, self], join=’outer’), which uses the leftmost dataset (“other”) as the preferred source.

  • Profile-wise overwrite:
    • Wherever other has a valid profile (profile_status >= 1), its profile is used in full (all layers) at that index.

    • Where only self has a valid profile, self is used.

    • Where neither has a valid profile, self is kept.

Extra variables & dimensions#

  • Variables that exist only in other are added to the output dataset (on the union of coordinates).

  • Non-core dimensions (beyond location/time/slope/realization) are aligned using xarray’s standard outer-alignment behavior on a per-variable basis.

type other:

Union[xsnowDataset, Dataset]

param other:

Dataset whose information should overwrite or extend self. It is treated as “more recent / more authoritative” in overlaps.

type other:

xsnowDataset or xarray.Dataset

type compat:

Literal['broadcast_equals', 'no_conflicts', 'override']

param compat:

Compatibility check for overlapping non-null values. (see xsnow.combine?)

type compat:

{“broadcast_equals”, “no_conflicts”, “override”}, optional

type combine_attrs:

Literal['drop', 'no_conflicts', 'drop_conflicts', 'override']

param combine_attrs:

How to reconcile attrs across inputs. (see xsnow.combine?)

type combine_attrs:

{“drop”, “no_conflicts”, “drop_conflicts”, “override”}, optional

returns:

A new dataset of the same wrapper type as self.

rtype:

xsnowDataset

print(*, force_details=False, **kwargs)#

xsnow-aware print that can force detailed __repr__ on xsnowDataset objects.

Parameters:
  • *values (Any) – Values to print.

  • force_details (bool) – If True, call xsnowDataset.__repr__(force_details=True) on any xsnowDataset arguments.

  • **kwargs (Any) – Passed through to built-in print (sep, end, file, flush).

Return type:

None

to_caaml(path, **kwargs)#

Saves snow profile data to a CAAML V6.0 XML file. (Not Implemented)

to_crocus(path, **kwargs)#

Saves snow profile data to a Crocus model input file. (Not Implemented)

to_json(path, **kwargs)#

Saves the dataset to a structured JSON file. (Not Implemented)

to_netcdf(path, logger=None, **kwargs)#

Saves the dataset to a NetCDF file.

This method provides a convenient wrapper around the underlying xarray.Dataset.to_netcdf() method for easy caching and interoperability.

The location mapping dictionary is converted to NetCDF-compatible format by encoding it as JSON in a string attribute.

Parameters:
  • ds (xsnowDataset) – The dataset instance to save.

  • path (Union[str, Path]) – The destination file path for the .nc file.

  • logger (logging.Logger) – An optional preconfigured logger instance.

  • **kwargs – Additional keyword arguments passed to xarray.Dataset.to_netcdf().

to_pro(path, max_files=None, **kwargs)#

Saves a single snow profile to a SNOWPACK .pro file.

This function iterates through each timestamp in the dataset and writes the vertical profile data (variables with a ‘layer’ dimension) into the .pro format. It only supports writing data for a single location.

Parameters:
  • ds (xsnowDataset) – The dataset instance to save.

  • path (Union[str, Path]) – The destination file path for the .pro file.

  • max_files (int, optional) – The maximum number of profiles (timestamps) allowed. If the dataset contains more profiles, a ValueError is raised. Defaults to None (no limit).

  • **kwargs – Reserved for future filtering options.

Raises:

ValueError – If the dataset is empty, contains more than one location, or if the number of profiles exceeds max_files.

to_smet(path, max_files=None, **kwargs)#

Saves time-series data from the dataset to a SMET file.

This function extracts data variables that do not have a ‘layer’ dimension (e.g., meteorological data, total snow height) and writes them into the SMET format. It only supports writing data for a single location.

Parameters:
  • ds (xsnowDataset) – The dataset instance to save.

  • path (Union[str, Path]) – The destination file path for the .smet file.

  • max_files (int, optional) – The maximum number of locations allowed. If the dataset contains more locations, a ValueError is raised. Defaults to None (no limit).

  • **kwargs – Reserved for future filtering options.

update(*args, **kwargs)#

Placeholder to prevent accidental use of xarray’s Dataset.update on xsnowDataset.

Return type:

xsnowDataset

xsnowDataset provides purpose-built helpers for incremental workflows:
  • append_latest(…) for appending newer data or reloading from a cutoff.

  • overwrite_with(…) for preferring another dataset on overlaps.

Raises:

NotImplementedError – always; directs users to the xsnow-specific methods.

class xsnow.core.DatasetDecorator(base, *, data=None)#

Bases: xsnowDataset

A decorator that is also an xsnowDataset (subclassing), so it exposes the same API as the base wrapper, plus any extra methods/properties that will be added.

Initializes the xsnowDataset from an existing xarray.Dataset.

This is the low-level constructor for the class. The recommended, user-facing way to create an xsnowDataset from files is the top-level xsnow.read() function. This constructor is primarily used internally and by extension functions that return a new xsnowDataset instance.

Parameters:

source – An existing xarray.Dataset object.

property base#

Return the underlying xsnowDataset being decorated.