{ "cells": [ { "cell_type": "markdown", "id": "eb2e616b", "metadata": {}, "source": [ "# Grain types and other labeled data\n", "\n", "For efficiency reasons, data variables across a large dataset are best handled as numeric codes. To make these codes easier to interpret, `xsnow` implements strategies for translating them into human-readable labels. \n", "\n", "Grain types are a typical example of this concept. While the International Classification for Seasonal Snow on the Ground (ICSSG) lists grain type names (and acronyms) like *Precipitation Particles* (PP), *Faceted Crystals* (FC), and more, SNOWPACK uses a one-to-three-digit code to describe the *(primary grain shape, secondary grain shape, grain shape cycle)*.\n", "\n", "**Accessors** translate those codes to human-readable labels on demand without mutating the stored data:\n", "\n", "* `.icssg.*`---accessor for translating grain codes to ICSSG acronyms.\n", "* `.labels.get()`---generic accessor for custom codes whose mappings are stored in the metadata." ] }, { "cell_type": "code", "execution_count": 1, "id": "6d9fe671", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "[i] xsnow.xsnow_io: Loading 2 datasets eagerly with 13 workers...\n", "[i] xsnow.utils: Slope coordinate 'inclination' varies by location. Preserving (location, slope) dimensions as allow_per_location=True.\n", "[i] xsnow.utils: Slope coordinate 'azimuth' varies by location. Preserving (location, slope) dimensions as allow_per_location=True.\n" ] } ], "source": [ "import xsnow\n", "xs = xsnow.single_profile_timeseries()" ] }, { "cell_type": "markdown", "id": "885575b9", "metadata": {}, "source": [ "## Grain codes vs labels\n", "The `icssg` accessor can be applied to all variables with grain type units known to `xsnow`. 
For example, in our sample dataset, this would be the data variable *grain_type*:" ] }, { "cell_type": "code", "execution_count": 2, "id": "f4d8e9c5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Swiss Code F1F2F3'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xs[\"grain_type\"].attrs[\"units\"]" ] }, { "cell_type": "markdown", "id": "89b8f17c", "metadata": {}, "source": [ "The `icssg` accessor has two methods, `primary()` and `secondary()`, that extract the two labels from one grain code:" ] }, { "cell_type": "code", "execution_count": 3, "id": "eaf5355b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Grain codes: [[770. 770. 772. 772. 330. 330. 330. 330. 330. 330. 330. 330. 330. 330.\n", " 330. 330. 330. 330. 772. 440. 440. 440. 440. 440. 772. 772. 450. 450.\n", " 450. 450. 450. 440. 440. 440. 440. 440. 440. 440. 440. 660. 450. 431.\n", " 440. 430. 330. 330. 330. 330. 341. 341. 990. 990. 330. 772. 772. 341.\n", " 230. 230. 
110.]]\n", "Primary labels: [['MF' 'MF' 'MFcr' 'MFcr' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG'\n", " 'RG' 'RG' 'RG' 'RG' 'RG' 'MFcr' 'FC' 'FC' 'FC' 'FC' 'FC' 'MFcr' 'MFcr'\n", " 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'SH'\n", " 'FC' 'FC' 'FC' 'FC' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG' 'FCxr' 'FCxr' 'RG'\n", " 'MFcr' 'MFcr' 'RG' 'DF' 'DF' 'PP']]\n", "Secondary labels: [['MF' 'MF' 'MF' 'MF' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG' 'RG'\n", " 'RG' 'RG' 'RG' 'RG' 'MF' 'FC' 'FC' 'FC' 'FC' 'FC' 'MF' 'MF' 'DH' 'DH'\n", " 'DH' 'DH' 'DH' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'FC' 'SH' 'DH' 'RG'\n", " 'FC' 'RG' 'RG' 'RG' 'RG' 'RG' 'FC' 'FC' 'FCxr' 'FCxr' 'RG' 'MF' 'MF'\n", " 'FC' 'RG' 'RG' 'PP']]\n" ] } ], "source": [ "# Subset to one profile\n", "profile = xs.isel(time=-1, slope=0, realization=0)\n", "\n", "grain_codes = profile.grain_type\n", "grain_types_pri = grain_codes.icssg.primary()\n", "grain_types_sec = grain_codes.icssg.secondary()\n", "\n", "print(\"Grain codes:\", grain_codes.to_numpy())\n", "print(\"Primary labels:\", grain_types_pri.to_numpy())\n", "print(\"Secondary labels:\", grain_types_sec.to_numpy())" ] }, { "cell_type": "markdown", "id": "72c52a61", "metadata": {}, "source": [ "You can create new data variables that store the labels:" ] }, { "cell_type": "code", "execution_count": 4, "id": "8320aeda", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Size: 180kB\n", "array([[[[['MF', 'MF', 'MFcr', ..., None, None, None]]],\n", "\n", "\n", " [[['MF', 'MF', 'MFcr', ..., None, None, None]]],\n", "\n", "\n", " [[['MF', 'MF', 'MFcr', ..., None, None, None]]],\n", "\n", "\n", " ...,\n", "\n", "\n", " [[['MF', 'MF', 'MFcr', ..., 'DF', 'DF', 'PP']]],\n", "\n", "\n", " [[['MF', 'MF', 'MFcr', ..., 'DF', 'DF', 'PP']]],\n", "\n", "\n", " [[['MF', 'MF', 'MFcr', ..., 'DF', 'DF', 'PP']]]]],\n", " shape=(1, 381, 1, 1, 59), dtype=object)\n", "Coordinates:\n", " altitude (location) float64 8B 1.681e+03\n", " 
azimuth (location, slope) int64 8B 270\n", " inclination (location, slope) int64 8B 16\n", " latitude (location) float64 8B 47.1\n", " * location (location) \n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'profile_status' (location: 1, time: 381, slope: 1,\n",
       "                                    realization: 1)> Size: 3kB\n",
       "array([[[['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "...\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']],\n",
       "\n",
       "        [['existing layer data']]]], dtype=object)\n",
       "Coordinates:\n",
       "    altitude     (location) float64 8B 1.681e+03\n",
       "    azimuth      (location, slope) int64 8B 270\n",
       "    inclination  (location, slope) int64 8B 16\n",
       "    latitude     (location) float64 8B 47.1\n",
       "  * location     (location) <U29 116B 'Kasererwinkl_C6'\n",
       "    longitude    (location) float64 8B 11.62\n",
       "  * time         (time) datetime64[ns] 3kB 2024-01-17T16:00:00 ... 2024-02-02...\n",
       "  * slope        (slope) int64 8B 0\n",
       "  * realization  (realization) int64 8B 0
" ], "text/plain": [ " Size: 3kB\n", "array([[[['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", "...\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']],\n", "\n", " [['existing layer data']]]], dtype=object)\n", "Coordinates:\n", " altitude (location) float64 8B 1.681e+03\n", " azimuth (location, slope) int64 8B 270\n", " inclination (location, slope) int64 8B 16\n", " latitude (location) float64 8B 47.1\n", " * location (location)