core module#
Defines the core xsnowDataset class for the xsnow package.
This module contains the primary user-facing object, xsnowDataset. It is designed as a high-level, convenient wrapper around a standard xarray.Dataset object, tailored for handling multi-dimensional snow profile data.
Core Design Philosophy#
The xsnowDataset class aims to simplify the user experience without sacrificing the power and flexibility of the underlying xarray data model. It achieves this through several key features:
Composition over Inheritance: This class contains an xarray.Dataset instance (in self.data) rather than inheriting from it. This design choice is more robust and prevents future updates to xarray from breaking this class, giving us full control over the user-facing API.
Simplified Initialization: The constructor is designed to accept a pre-existing xarray.Dataset. The primary, user-facing way to create a dataset from files is the top-level xsnow.read() function. This makes the extension architecture clean and powerful.
Seamless `xarray` Integration: Through dunder forwarding (__getattr__, __getitem__, __setitem__), the class mimics the behavior of an xarray.Dataset, allowing users to interact with it using the familiar and powerful xarray API (e.g., ds.sel(…), ds[‘density’]).
Automatic Re-wrapping: When a forwarded method returns a new xarray.Dataset, the class automatically wraps the result back into an xsnowDataset instance. This ensures that the user can chain operations (e.g., ds.sel(…).mean(…)) and always retain the convenience of the xsnowDataset wrapper.
Vertical Coordinate System#
The package uses a physically intuitive vertical coordinate system.
- z: The primary vertical coordinate, representing depth below the snow surface in meters. The surface is z=0, and values are negative downwards.
- height: A data variable representing the original height above the ground in meters.
- class xsnow.core.xsnowDataset(source)#
Bases:
object
A class to represent and interact with a multi-dimensional snow profile dataset.
This class acts as a high-level wrapper around an xarray.Dataset, providing a convenient API for analysis, I/O, and data manipulation. It forwards unrecognized attribute and method calls to the underlying xarray.Dataset, enabling seamless use of the powerful xarray API.
Initializes the xsnowDataset from an existing xarray.Dataset.
This is the low-level constructor for the class. The recommended, user-facing way to create an xsnowDataset from files is the top-level xsnow.read() function. This constructor is primarily used internally and by extension functions that return a new xsnowDataset instance.
- Parameters:
source (Dataset) – An existing xarray.Dataset object.
- copy(deep=False)#
Returns a copy of the dataset.
This method provides a convenient wrapper around the underlying xarray.Dataset.copy() method and ensures the result is a new xsnowDataset instance.
- Parameters:
deep (bool) – If True, the data and metadata are fully copied. If False (default), only the metadata is copied, and the new dataset’s variables will still point to the original data buffers.
- Return type:
xsnowDataset
- Returns:
A new xsnowDataset instance.
Examples
>>> ds = xsnow.read('path/to/data')
>>> # Create a shallow copy (changes to ds_shallow affect ds)
>>> ds_shallow = ds.copy(deep=False)
>>> # Create a deep copy (a completely independent object)
>>> ds_deep = ds.copy(deep=True)
- merge_smet(source, var_prefix=None, **kwargs)#
Reads SMET data and merges it into the dataset.
An optional prefix can be added to the variable names from the SMET file to avoid naming conflicts (e.g., when merging input forcing data with model output data).
- Parameters:
source (Union[str, List[str], Path, List[Path]]) – Path to a .smet file, a directory, or a list of files.
var_prefix (Optional[str]) – A string prefix to add to all data variable names read from the SMET files (e.g., ‘input.’).
**kwargs – Additional keyword arguments passed to the underlying xsnow.read() function (e.g., time selectors).
- Return type:
xsnowDataset
- Returns:
A new xsnowDataset with the combined data.
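Examples
A minimal usage sketch (the paths and the prefix are illustrative):
>>> ds = xsnow.read('path/to/model_output')
>>> ds_all = ds.merge_smet('path/to/forcing.smet', var_prefix='input.')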
- to_xarray()#
Returns the underlying xarray.Dataset.
This method unwraps the xsnowDataset and provides direct access to the core xarray.Dataset object. This is useful for interoperability with other libraries or functions that are built to work with standard xarray objects.
- Return type:
Dataset
- Returns:
The stored xarray.Dataset object.
Examples
>>> ds = xsnow.read('path/to/data')
>>> # Get the raw xarray object for use with another library
>>> xr_ds = ds.to_xarray()
- compute_hs()#
Computes the total snow height (HS) from layer heights.
This method calculates the maximum value of the ‘height’ variable along the ‘layer’ dimension for each profile. The result is added to the dataset as a new data variable named ‘HS’ (Total Snow Height).
The resulting ‘HS’ variable will not have a ‘layer’ dimension.
- Return type:
xsnowDataset
- Returns:
A new xsnowDataset instance with the ‘HS’ variable added.
- Raises:
ValueError – If the dataset is empty or the ‘height’ variable is missing.
Examples
>>> ds = xsnow.read('path/to/data')
>>> ds_with_hs = ds.compute_hs()
>>> print(ds_with_hs['HS'])
- compute_z_depth()#
Calculates the depth below the surface (z) for each snowpack layer and adds it as a non-dimension coordinate.
This method first computes the total snow height (HS) and then calculates z as (height - HS), adding it as a ‘z’ data coordinate to the dataset. Metadata is pulled from the central registry.
- Return type:
xsnowDataset
- Returns:
A new xsnowDataset instance with the ‘z’ coordinate added.
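Examples
A minimal usage sketch (the path is illustrative):
>>> ds = xsnow.read('path/to/data')
>>> ds_z = ds.compute_z_depth()
>>> ds_z['z']  # 0 at the surface, negative downwards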
- update_profile_status()#
Attach or update profile_status with very simple semantics:
-1 : error during parsing (preserved as-is)
0 : no snow layer data
1 : existing snow layer data (this may include “no snow”, see below!)
- Return type:
Dataset
Logic#
1. Start from existing profile_status if present, otherwise from 0 over the core dims (those of location, time, slope, realization that exist in the dataset).
Negative values (< 0) are *never* changed by this method.
2. For profile_status to be positive after this call, the dataset must contain the vertical (layer) variable z and
- have negative z values (i.e., snow layers), or
- have NaN z values (i.e., no snow layers) but a pre-existing profile_status==1, or
- have NaN z values (i.e., no snow layers) but an existing HS==0.
Note that the profile_status counts *no snow* as existing layer information if the previous criteria are met.
If z doesn’t exist, all non-negative profile_status entries are set to 0 (no layer data). Similarly, if z exists but is all-NaN and HS does not exist or is NaN, the status is set to 0.
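Examples
A minimal usage sketch (illustrative call pattern, consistent with the semantics above):
>>> ds2 = ds.update_profile_status()
>>> ds2['profile_status']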
- profile_status_summary()#
Print a summary table of profile_status occurrences.
- Return type:
None
- to_gpu()#
Moves the dataset’s data to the GPU using CuPy if available.
This method checks for an available GPU and the cupy library. If found, it converts the underlying NumPy arrays to CuPy arrays for accelerated computation. If cupy is not installed or a GPU is not available, it returns the original dataset without modification.
- Return type:
xsnowDataset
- Returns:
A new xsnowDataset instance with data on the GPU, or the original object if a GPU is not available.
Examples
>>> ds_cpu = xsnow.read('path/to/data')
>>> # Move to GPU for fast computation
>>> ds_gpu = ds_cpu.to_gpu()
- to_cpu()#
Moves the dataset’s data back to the CPU (NumPy).
This is necessary before performing operations that are not supported on the GPU, such as saving to NetCDF. If the data is already on the CPU, it returns the original dataset.
- Return type:
xsnowDataset
- Returns:
A new xsnowDataset instance with data on the CPU.
Examples
>>> ds_gpu = ds_cpu.to_gpu()
>>> # ... perform GPU-accelerated analysis ...
>>> ds_cpu_again = ds_gpu.to_cpu()
>>> ds_cpu_again.to_netcdf('results.nc')
- append_latest(source, from_time=None, join='left', logger=None, **kwargs)#
Incrementally extend a dataset by appending newer timestamps from files.
This method is a convenience wrapper for time selection, read, and concat. Semantics:
- Determine a start time:
  - If from_time is None, use (max(existing time) + 1s).
  - If from_time is provided, drop all times >= from_time from existing_ds and start reading from from_time (inclusive).
- Read source with that time filter.
- Concatenate along time using xsnow.concat:
  - join controls non-time dims (default: ‘left’ keeps existing domains).
  - Time is always the union along the concat axis (no extra reindexing).
- Parameters:
existing_ds (xsnowDataset) – The dataset instance to extend.
source (Union[str, List[str], Path]) – Path(s) to files or directories to read new data from (read is called on it).
from_time (Union[str, datetime, Timestamp, datetime64, None]) – Subset existing data to times < from_time and read new data starting with from_time (this will already be new data; see Semantics above). Accepts str, pandas/NumPy datetime, or datetime.datetime.
join (str) – Join mode for non-time dimensions, forwarded to xsnow.concat.
logger (Optional[Logger]) – Optional logger to use for status/warning messages. If None, a logger is created.
**kwargs – Additional keyword args forwarded to xsnow.concat, like compat or combine_attrs.
- Return type:
xsnowDataset
- Returns:
xsnowDataset with the appended data (or trimmed-only if nothing new).
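Examples
A minimal usage sketch (the paths are illustrative):
>>> ds = xsnow.read('archive/')
>>> ds_updated = ds.append_latest('incoming/', join='left')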
- classify(func, *, output_name='class_id', output_kind='coord', input_var=None, attrs=None, **func_kwargs)#
General classification: run a user function to compute a classification array, then attach it as a coordinate or data variable.
- Parameters:
func (callable) – A custom classification function: Either f(xr.Dataset) -> xr.DataArray or f(xr.DataArray) -> xr.DataArray. The returned DataArray must align/broadcast within the dataset.
output_name (str, default "class_id") – Name of the resulting coord or data var.
output_kind ({"coord","data"}, default "coord") –
How to store the classification result.
- “coord”: attach as a coordinate (good for use with where(…), e.g. ds.where(ds.hs_class == 2, drop=True), or for grouping).
- “data”: attach as a data variable (good if you’ll plot it, compute stats on it, or don’t need to use it as a selector).
input_var (str, optional) – If set, call func(self.data[input_var]); else func(self.data).
attrs (dict, optional) – Metadata to attach to the resulting DataArray.
**func_kwargs – Extra keyword arguments forwarded to func.
- Returns:
A new dataset wrapper with the classification attached.
- Return type:
xsnowDataset
See also
classify_criteria – A generic, boolean classification function.
classify_by_bins – Bin a variable into integer classes (template classifier).
Examples
>>> def by_hs_tertiles(hs: xr.DataArray) -> xr.DataArray:
...     c = xr.zeros_like(hs, dtype=np.int64)
...     c = c.where(hs <= 0.5, 1)  # > 0.5 --> at least 1
...     c = c.where(hs <= 1.5, 2)  # > 1.5 --> 2
...     c.attrs["mapping"] = {0: "shallow", 1: "medium", 2: "deep"}
...     return c
>>> ds2 = ds.classify(by_hs_tertiles, output_name="hs_class", output_kind="coord", input_var="HS")
>>> ds2.where(ds2.hs_class == 2, drop=True)
>>> from xsnow.extensions.classification import classify_by_bins
>>> hs_bins = np.array([0.25, 0.50, 1.00, 1.50, 2.00])
>>> hs_mapping = {
...     0: "(-inf, 0.25]",
...     1: "(0.25, 0.50]",
...     2: "(0.50, 1.00]",
...     3: "(1.00, 1.50]",
...     4: "(1.50, 2.00]",
...     5: "(2.00, +inf)",
... }
>>> class_attrs = {
...     "mapping": hs_mapping,
...     "long_name": "Snow depth class (6 bins)",
...     "method": "np.digitize",
...     "bin_edges": hs_bins.tolist(),
...     "right_closed": True,
... }
>>> ds2 = ds.classify(
...     classify_by_bins,
...     output_name="HS_class",
...     output_kind="coord",
...     input_var="HS",  # classifier gets ds["HS"]
...     attrs=class_attrs,
...     bins=hs_bins,  # forwarded to classify_by_bins()
...     right=True,
... )
>>> ds_extremely_deep = ds2.where(ds2.HS_class == 5, drop=True)
- classify_criteria(criteria, *, name='classification_mask', kind='mask', attrs=None)#
Criteria-based boolean classification: build a boolean mask from a string expression. The mask can be returned, or attached as a coordinate/data variable.
- Parameters:
criteria (str) – Boolean conditions chained with ‘&’ and/or ‘|’, e.g. “density > 300 & grain_size < 0.5”. (Minimal parser: left-to-right; no parentheses/precedence.)
name (str, default "classification_mask") – Name for the attached output.
kind ({"mask","coord","data"}, default "mask") –
“mask”: return the mask (xr.DataArray)
”coord”: attach as a coordinate and return xsnowDataset
”data”: attach as a data var and return xsnowDataset
attrs (dict, optional) – Attributes to add to the mask array.
- Return type:
xr.DataArray | xsnowDataset
See also
classifyGeneric, customizable classification into multiple classes
Examples
>>> mask = ds.classify_criteria("density < 200 & grain_size > 0.8", kind="mask")
>>> ds_sel = ds.where(mask, drop=True)
>>> ds2 = ds.classify_criteria("HS > 1.5", name="deep_snow", kind="coord")
>>> ds2.where(ds2.deep_snow, drop=True)
- classify_spatially(polygons, *, name='polygon', kind='coord', attrs=None, outside_label='outside', polygons_crs=None, feature_name_field='id')#
Spatial classification of points into named polygons/regions.
This attaches a region label (or returns a per-polygon mask) based on the dataset’s horizontal coordinates. Typical use-cases:
- select all data from a specific region
- aggregate over regions via groupby
The dataset CRS is taken from self.attrs['crs']. If it is EPSG:4326, this method uses the longitude/latitude coordinates. Otherwise it expects projected coordinates easting/northing.
If the polygon CRS differs from the dataset CRS, polygons are transformed to the dataset CRS using pyproj (optional dependency).
- Parameters:
polygons (Any) – One of:
- path to a GeoJSON file (.geojson/.json)
- GeoJSON mapping (FeatureCollection/Feature/Polygon/MultiPolygon)
- WKT string (POLYGON/MULTIPOLYGON)
- mapping of {name: wkt}
- sequence of WKT strings (names auto: poly_0, poly_1, …)
- sequence of (name, wkt) pairs
Coordinate order is assumed to be (x, y). For EPSG:4326 this means (longitude, latitude); for projected CRSs, (easting, northing).
name (str, default "polygon") – Name for the attached label coordinate/data var (or the returned mask name).
kind ({"mask","coord","data"}, default "coord") –
- “mask”: return a boolean mask with dims ('polygon', <point-dims...>)
- “coord”: attach a string label per point as a coordinate and return xsnowDataset
- “data”: attach a string label per point as a data variable and return xsnowDataset
attrs (dict, optional) – Metadata to attach to the output.
outside_label (str, default "outside") – Label assigned to points not contained in any provided polygon.
polygons_crs (str, optional) – CRS of the input polygons (e.g. "EPSG:4326"). If not provided, the CRS is only auto-detected for GeoJSON that contains an explicit (deprecated) top-level "crs" member. If the polygon CRS cannot be identified unambiguously, this method raises.
feature_name_field (str, default "id") – For GeoJSON inputs, this field is read from each feature’s properties to determine the polygon/region name. If absent, it falls back to common alternatives (name, id, title, label) and finally poly_<i>.
- Return type:
xr.DataArray | xsnowDataset
Examples
>>> import xarray as xr
>>> import xsnow
>>> xs = xsnow.sample_data.snp_gridded_ds().compute()
>>> xs.longitude.values
array([11.191828, 11.286182, 11.2973 , 11.392236, 11.5008 ])
>>> xs.latitude.values
array([47.146392, 47.436152, 47.148326, 47.437999, 47.367788])
>>> xs.altitude.values
array([2372., 1749., 1860., 1801., 2066.])
Create two regions (rectangles) and leave one location outside:
>>> regions = {
...     "valley": "POLYGON ((11.15 47.12, 11.15 47.17, 11.33 47.17, 11.33 47.12, 11.15 47.12))",
...     "ridge": "POLYGON ((11.25 47.42, 11.25 47.46, 11.43 47.46, 11.43 47.42, 11.25 47.42))",
... }
>>> xs2 = xs.classify_spatially(regions, name="region", kind="coord", polygons_crs="EPSG:4326")
>>> xs_valley = xs2.where(xs2.region == "valley", drop=True)
Group by region and slope (slope is already an existing coordinate):
>>> xs_stats = xs2.groupby(["region", "slope"]).mean()
Group by region, slope, and an elevation band (>=2000 m) created via classify_criteria:
>>> xs3 = xs2.classify_criteria(
...     "altitude >= 2000",
...     name="band",
...     kind="coord",
...     attrs={"mapping": {False: "<2000m", True: ">=2000m"}},
... )
>>> xs_stats2 = xs3.groupby(["region", "slope", "band"]).mean()
- combine_first(*args, **kwargs)#
Placeholder to prevent accidental use of xarray’s Dataset.combine_first on xsnowDataset.
xsnowDataset provides fill_gaps_with(…) for profile-aware gap filling without introducing new coordinates (except layers).
- Raises:
NotImplementedError – always; directs users to xsnowDataset.fill_gaps_with().
- compute_critical_crack_length()#
Compute and attach the variable rc_flat (critical crack length on flat terrain, in meters) to this xsnowDataset.
This wraps the physics in compute_rho_slab() and compute_rc_flat_richter2019() and handles:
- pulling data out of the xarray-backed dataset
- computing layer thickness from cumulative height
- dispatching to NumPy or CuPy depending on backend
- wrapping the result back into xarray with metadata
Requirements#
- The dataset must contain:
  - density [kg m^-3]
  - grain_size [mm]
  - shear_strength [kPa]
  - height [m], cumulative height from ground upward
- We assume:
  - Dimension layer exists and indexes from ground (layer=0) upward.
  - height is monotonically increasing along layer.
  - Layers above the true snow surface are padded with NaN thickness.
- Slope angle / aspect:
  - Slope angle (inclination) and aspect (azimuth) are NOT used here. This is a flat-terrain formulation (same as in earlier xsnow versions).
- Returns:
new_ds – A new xsnowDataset with a new data variable rc_flat [m].
- Return type:
xsnowDataset
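Examples
A minimal usage sketch, assuming the dataset already contains the required density, grain_size, shear_strength, and height variables:
>>> ds_rc = ds.compute_critical_crack_length()
>>> ds_rc['rc_flat']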
- compute_rho_slab(thickness, xp)#
Compute the average slab density above each physical layer.
For each layer i, we define the “slab” as all layers strictly above i (shallower, closer to the snow surface). We return the mean density of that overlying slab. The topmost physical layer has no slab above, so it receives NaN. Padded layers above the real surface are also NaN.
Assumptions#
- Axis -1 is the ‘layer’ dimension.
- layer=0 is the deepest layer (near ground).
- layer increases upward toward the snow surface.
- After the true surface layer, remaining entries are padding with NaN in both density and thickness.
- thickness is the physical thickness [m] of each layer. Zero/NaN thickness means “not a real layer” (padding).
- Parameters:
density (xp.ndarray) – Shape (…, L). Layer density [kg m^-3], ordered bottom (0) -> surface.
thickness (xp.ndarray) – Shape (…, L). Layer thickness [m], same ordering and shape.
xp (module) – Array backend (e.g. numpy or cupy).
- Returns:
rho_slab – Shape (…, L). For each layer i, rho_slab[i] is the mean density of all VALID layers above i. The surface layer itself (no overlying slab) is NaN. Padded layers beyond the surface remain NaN.
- Return type:
xp.ndarray
Notes
We mask out any padded layers (thickness <= 0 or NaN) so they don’t contaminate the cumulative sums.
We reverse along the layer axis, do cumsums from the top down, then reverse back. Finally, we shift so that layer i gets only the slab strictly above i.
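To make the masking and reversed-cumsum steps concrete, here is a plain-NumPy sketch (an illustration only, not the package’s exact implementation; it assumes a thickness-weighted mean, and the name rho_slab_sketch is hypothetical):

import numpy as np

def rho_slab_sketch(density, thickness):
    # Mask padded layers (zero/NaN thickness) so they don't pollute the sums.
    valid = np.isfinite(thickness) & (thickness > 0)
    rho = np.where(valid, density, 0.0)
    thk = np.where(valid, thickness, 0.0)
    # Reverse along the layer axis (axis -1) so the cumulative sums run from
    # the surface downward, then reverse back: entry i now holds the totals
    # for layers i..surface (inclusive of i).
    mass_incl = np.cumsum((rho * thk)[..., ::-1], axis=-1)[..., ::-1]
    thk_incl = np.cumsum(thk[..., ::-1], axis=-1)[..., ::-1]
    # Shift by one so layer i only sees the slab strictly above it.
    pad = np.zeros_like(mass_incl[..., :1])
    mass_above = np.concatenate([mass_incl[..., 1:], pad], axis=-1)
    thk_above = np.concatenate([thk_incl[..., 1:], pad], axis=-1)
    with np.errstate(invalid="ignore", divide="ignore"):
        rho_slab = mass_above / thk_above
    # Surface layer (no overlying slab) and padded layers stay NaN.
    return np.where(valid & (thk_above > 0), rho_slab, np.nan)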
- fill_gaps_with(other, *, compat='override', combine_attrs='override')#
Fill gaps in this dataset using another dataset, without introducing new coordinates (except ‘layers’).
Semantics#
- Domain: left join on the core dims (location, time, slope, realization), i.e., only the coordinates present in self are considered.
- Preference: prefer self (leftmost) whenever both datasets have a valid profile at a given (location, time, slope, realization).
- Gap filling:
  - If self has a valid profile (profile_status >= 1) at an index, its profile is kept entirely (no mixing of layers).
  - If self has no valid profile and other does, the full profile from other is used there.
  - If neither has a valid profile, self is kept (usually NaNs).
Extra variables & dimensions#
- Variables that exist only in other are ignored where self has no corresponding coordinates (because we restrict the domain to self).
- Non-core dimensions (anything beyond location/time/slope/realization) follow xarray’s default outer alignment for the variables that are actually being merged.
- Parameters:
other (xsnowDataset or xarray.Dataset) – Dataset from which gaps in self will be filled.
compat ({"broadcast_equals", "no_conflicts", "override"}, optional) – Compatibility check for overlapping non-null values (see xsnow.combine).
combine_attrs ({"drop", "no_conflicts", "drop_conflicts", "override"}, optional) – How to reconcile attrs across inputs (see xsnow.combine).
- Returns:
A new dataset of the same wrapper type as self.
- Return type:
xsnowDataset
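Examples
A minimal usage sketch (ds_fallback is a hypothetical dataset covering the same domain):
>>> ds_filled = ds.fill_gaps_with(ds_fallback)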
- mask_by_criteria(criteria)#
Convenience wrapper around classify_criteria(…, kind=’mask’): masks non-matching layers (i.e., sets them to NaN) while keeping shape/dimensions identical. If the criteria are layer-based, scalar variables are untouched, and vice versa.
- Return type:
xsnowDataset
- Returns:
A masked xsnowDataset with shape/dimensions identical to the input, with NaNs where the criteria are not True.
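Examples
A minimal usage sketch (the variable names are illustrative):
>>> ds_soft = ds.mask_by_criteria('density < 200 & grain_size > 0.8')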
- overwrite_with(other, *, compat='override', combine_attrs='override')#
Overwrite this dataset with another dataset on the union of coordinates.
Semantics#
- Domain: outer join (i.e., union) on the core dims (location, time, slope, realization).
- New coordinates from other that don’t exist in self are added.
- Preference: prefer `other` where both datasets have valid profiles. This is implemented by calling combine([other, self], join=’outer’), which uses the leftmost dataset (“other”) as the preferred source.
- Profile-wise overwrite:
  - Wherever other has a valid profile (profile_status >= 1), its profile is used in full (all layers) at that index.
  - Where only self has a valid profile, self is used.
  - Where neither has a valid profile, self is kept.
Extra variables & dimensions#
- Variables that exist only in other are added to the output dataset (on the union of coordinates).
- Non-core dimensions (beyond location/time/slope/realization) are aligned using xarray’s standard outer-alignment behavior on a per-variable basis.
- Parameters:
other (xsnowDataset or xarray.Dataset) – Dataset whose information should overwrite or extend self. It is treated as “more recent / more authoritative” in overlaps.
compat ({"broadcast_equals", "no_conflicts", "override"}, optional) – Compatibility check for overlapping non-null values (see xsnow.combine).
combine_attrs ({"drop", "no_conflicts", "drop_conflicts", "override"}, optional) – How to reconcile attrs across inputs (see xsnow.combine).
- Returns:
A new dataset of the same wrapper type as self.
- Return type:
xsnowDataset
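Examples
A minimal usage sketch (ds_new is a hypothetical, more recent dataset):
>>> ds_updated = ds.overwrite_with(ds_new)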
- print(*, force_details=False, **kwargs)#
xsnow-aware print that can force detailed __repr__ on xsnowDataset objects.
- Parameters:
*values (Any) – Values to print.
force_details (bool) – If True, call xsnowDataset.__repr__(force_details=True) on any xsnowDataset arguments.
**kwargs (Any) – Passed through to built-in print (sep, end, file, flush).
- Return type:
None
- reorder_dims(preferred_order=('location', 'time', 'slope', 'realization', 'layer'), last_dims=None)#
Reorder dataset dimensions without dropping data, coords, or attrs.
- Parameters:
ds (xsnowDataset or xr.Dataset) – Dataset to reorder.
preferred_order (sequence of str) – Dimensions to place first in the given order.
last_dims (sequence of str, optional) – Dimensions to force to the end in the given order.
- Return type:
Union[xsnowDataset,Dataset]
Notes
Any dimensions not in preferred_order or last_dims keep their original relative order and are placed between the two groups.
All coordinates, data variables, attrs, and encoding are preserved.
Rebuilds the dataset to enforce deterministic discovery order for ds.dims.
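Examples
A minimal usage sketch (the custom ordering is illustrative):
>>> ds2 = ds.reorder_dims()
>>> ds3 = ds.reorder_dims(preferred_order=('time', 'location'), last_dims=('layer',))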
- to_caaml(path, **kwargs)#
Saves snow profile data to a CAAML V6.0 XML file. (Not Implemented)
- to_crocus(path, **kwargs)#
Saves snow profile data to a Crocus model input file. (Not Implemented)
- to_json(path, **kwargs)#
Saves the dataset to a structured JSON file. (Not Implemented)
- to_netcdf(path, logger=None, **kwargs)#
Saves the dataset to a NetCDF file. This method provides a convenient wrapper around the underlying xarray.Dataset.to_netcdf() method for easy caching and interoperability.
Automatically converts slope-varying coordinates (azimuth, aspect, inclination) to data variables to ensure files are mergeable without conflicts.
The location mapping dictionary is converted to NetCDF-compatible format by encoding it as JSON in a string attribute.
- Parameters:
ds (xsnowDataset) – The dataset instance to save.
path (Union[str, Path]) – The destination file path for the .nc file.
logger (logging.Logger) – An optional preconfigured logger instance.
**kwargs – Additional keyword arguments passed to xarray.Dataset.to_netcdf().
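Examples
A minimal usage sketch (the filename is illustrative):
>>> ds.to_netcdf('snowpack_profiles.nc')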
- to_pro(path, max_files=None, **kwargs)#
Saves snow profile data for a single location to a SNOWPACK .pro file.
This function iterates through each timestamp in the dataset and writes the vertical profile data (variables with a ‘layer’ dimension) into the .pro format. It only supports writing data for a single location.
- Parameters:
ds (xsnowDataset) – The dataset instance to save.
path (Union[str, Path]) – The destination file path for the .pro file.
max_files (int, optional) – The maximum number of profiles (timestamps) allowed. If the dataset contains more profiles, a ValueError is raised. Defaults to None (no limit).
**kwargs – Reserved for future filtering options.
- Raises:
ValueError – If the dataset is empty, contains more than one location, or if the number of profiles exceeds max_files.
- to_smet(path, max_files=None, **kwargs)#
Saves time-series data from the dataset to a SMET file.
This function extracts data variables that do not have a ‘layer’ dimension (e.g., meteorological data, total snow height) and writes them into the SMET format. It only supports writing data for a single location.
- Parameters:
ds (xsnowDataset) – The dataset instance to save.
path (Union[str, Path]) – The destination file path for the .smet file.
max_files (int, optional) – The maximum number of locations allowed. If the dataset contains more locations, a ValueError is raised. Defaults to None (no limit).
**kwargs – Reserved for future filtering options.
- to_zarr(path, mode='w', append_dim=None, consolidated=True, **kwargs)#
Save dataset to Zarr format with automatic slope coordinate handling.
Zarr format provides better stability and performance for large datasets compared to NetCDF, with pure Python implementation (no HDF5 crashes).
This function automatically converts 1D slope coordinates (azimuth, inclination) to 2D data variables with (location, slope) dimensions, ensuring compatibility with the rest of xsnow and avoiding merge conflicts.
- Parameters:
ds (xsnowDataset) – The xsnowDataset to save.
path (Union[str, Path]) – Output path (should end in .zarr).
mode (Literal['w', 'w-', 'a', 'a-', 'r+', 'r']) – Write mode: ‘w’ (overwrite), ‘a’ (append), ‘w-’ (fail if exists).
append_dim (Optional[str]) – Dimension to append along (e.g., ‘time’ for time-series updates).
consolidated (bool) – Whether to consolidate metadata (improves read performance).
**kwargs – Additional arguments passed to xarray.Dataset.to_zarr().
- Return type:
None
- Returns:
None
Examples
>>> # Save complete dataset
>>> ds.to_zarr('archive_2024.zarr')
>>> # Append new data along time dimension
>>> new_data.to_zarr('archive_2024.zarr', mode='a', append_dim='time')
>>> # Create with custom compression
>>> import numcodecs
>>> encoding = {var: {'compressor': numcodecs.Blosc(cname='zstd', clevel=5)}
...             for var in ds.data_vars}
>>> ds.to_zarr('archive.zarr', encoding=encoding)
See also
read_zarr – Read Zarr archives
to_netcdf – Save to NetCDF format
- trim_padding(dims=None, keep_buffer=0, inplace=False)#
Remove NaN padding from layer dimension to reduce memory usage.
After reading files with max_layers padding, many trailing layers may be entirely NaN. This method detects the actual maximum layer with data and trims the excess.
IMPORTANT: For lazy-loaded datasets, this will trigger computation of all dask arrays to check for NaN values. For large datasets, consider using eager loading if you plan to trim padding, or trim after subsetting: ds.sel(location=’X’).compute().trim_padding()
- Parameters:
ds (xsnowDataset) – The dataset instance to trim.
dims (Optional[List[str]]) – List of dimensions to trim. Default: [‘layer’].
keep_buffer (int) – Number of extra layers to keep beyond the maximum non-NaN layer. Useful if you expect to add more data later. Default: 0 (trim to exact maximum).
inplace (bool) – If True, modify dataset in place. If False (default), return a new dataset.
- Return type:
xsnowDataset
- Returns:
Trimmed xsnowDataset (new instance if inplace=False)
Examples
>>> # Dataset padded to 50 layers, but only 12 have data
>>> ds.sizes['layer']
50
>>> ds_trimmed = ds.trim_padding()
>>> ds_trimmed.sizes['layer']
12
>>> # Keep 2 extra layers as buffer
>>> ds_trimmed = ds.trim_padding(keep_buffer=2)
>>> ds_trimmed.sizes['layer']
14
>>> # Trim in place
>>> ds.trim_padding(inplace=True)
>>> ds.sizes['layer']
12
Notes
- Only works on dimensions where padding is at the end (e.g., layer)
- Checks all data variables that have the specified dimension
- For lazy datasets, this triggers computation; consider eager loading
- update(*args, **kwargs)#
Placeholder to prevent accidental use of xarray’s Dataset.update on xsnowDataset.
xsnowDataset provides purpose-built helpers for incremental workflows:
append_latest(…) for appending newer data or reloading from a cutoff.
overwrite_with(…) for preferring another dataset on overlaps.
- Raises:
NotImplementedError – always; directs users to the xsnow-specific methods.
- class xsnow.core.DatasetDecorator(base, *, data=None)#
Bases:
xsnowDataset
Base class for extension “decorators” that layer functionality onto an xsnowDataset.
A DatasetDecorator is itself an xsnowDataset (it subclasses it), so it exposes the normal wrapper API (and the underlying xarray.Dataset via self.data), but it also keeps a reference to another dataset wrapper in self.base.
The base reference is what makes the decorator chain work: a decorator can wrap either a plain xsnowDataset or another DatasetDecorator. This allows multiple extensions to be stacked while still delegating higher-level functionality to the wrapped object.
The keyword-only data parameter in __init__ is intentionally not part of the public end-user API. It exists so internal machinery (primarily _rewrap and __getattr__ forwarding) can create a new decorated instance around a new dataset produced by an xarray operation (e.g. sel, drop_vars, assign) without mutating any existing instances.
Initializes a decorator instance.
- Parameters:
base (xsnowDataset) – The wrapped dataset object (either a plain xsnowDataset or another DatasetDecorator). base is required so that:
- attribute access can delegate to the already-configured wrapper chain (including previously applied extensions), and
- _rewrap() can rebuild that chain around a new xarray.Dataset when an xarray operation returns a new dataset.
data (Union[Dataset, xsnowDataset, None]) – The concrete xarray.Dataset to expose as self.data for this wrapper instance. When omitted, the decorator uses base.data. This is primarily an internal hook used by _rewrap(): when an operation produces a new dataset, the decorator chain is rebuilt with base updated to wrap the new dataset, and data is passed explicitly so that the newly returned object consistently points at that same dataset. For convenience, data may also be an xsnowDataset (including another decorator instance); in that case its underlying .data is used.
- property base: xsnowDataset#
Return the underlying xsnowDataset being decorated.