{ "cells": [ { "cell_type": "markdown", "id": "9110dbc0", "metadata": {}, "source": [ "# Adding new functionality\n", "or more verbosely:\n", "# Scenario 2: Adding new dimensions from alternate data streams and providing entirely new functionality\n", "\n", "If you want to add new dimensions to an `xsnowDataset`, or more generally extend the `xsnow` functionality beyond a single method, write your own extension class. `xsnow` provides the scaffolding that makes this straightforward. As with a *scenario-1-extension*, write your class in a Python module and decide whether to keep this module private or host it in a public repository.\n", "\n", "Here is a cheat sheet for the steps you have to take. More detailed explanations and a demonstration follow further below:\n" ] }, { "cell_type": "markdown", "id": "6ffd7671", "metadata": {}, "source": [ "```{admonition} Recipe\n", ":class: note\n", "\n", " 1. Import: `from xsnow import DatasetDecorator`\n", " 2. Define your extension class: e.g., `class EnsembleFX(DatasetDecorator):`\n", " 3. Define your class methods (and possibly generic functions)\n", " 4. Whenever a function returns an object of your new class, `_rewrap()` the newly generated dataset\n", "\n", "```" ] }, { "cell_type": "markdown", "id": "27c932fc", "metadata": {}, "source": [ "Regarding 1. and 2.)\n", "\n", " * It is important that you define your class as a *subclass* of the [`DatasetDecorator`](../../api/_generated/xsnow.core). This allows `xsnow` to *configure* your class to feel and behave like an `xsnowDataset`, while allowing multiple extensions to be chained in a custom order.\n", "\n", "Regarding 3.)\n", "\n", " * Code all the functionality you need. Prefix the names of private methods and helper functions with an underscore (e.g., `_my_private_helper`).\n", "\n", "Regarding 4.)\n", "\n", " * Rewrapping is important to ensure that different extensions can be chained. 
Use the pattern `xs_out = self._rewrap(xs_modified)`.\n", "\n", "*Scenario-2-extensions* can look quite different. Therefore, the next two sections demonstrate two extensions that extend the `xsnow` functionality in their own ways." ] }, { "cell_type": "markdown", "id": "5e653a55", "metadata": {}, "source": [ "## Example: Ensemble-forecast extension---new data streams and dimensions\n", "\n", "The [ensemble forecasts extension](../../api/_generated/xsnow.extensions.ensemble_forecasts) aims to facilitate research on the performance of forecasts with different lead times and from different model realizations, such as deterministic and ensemble members. To that end, it provides a special read routine that parses a defined directory structure into the dimensions `realization` and model `run`. This read routine is the heart of the extension; the `EnsembleFX` class itself adds no behavior and only puts its label onto the resulting dataset for consistent naming.\n", "\n", "### Excerpt from implementation" ] }, { "cell_type": "code", "execution_count": null, "id": "aaed3661", "metadata": { "tags": [ "skip-execution", "remove-output" ] }, "outputs": [], "source": [ "from pathlib import Path\n", "from typing import Optional, Union\n", "\n", "from xsnow import DatasetDecorator\n", "\n", "class EnsembleFX(DatasetDecorator):\n", " \"\"\"\n", " Decorator that enriches an xsnowDataset with run context and leadtimes.\n", "\n", " Dimensions/coordinates guaranteed after ``read_ensemble_fx``:\n", " - ``run`` (string) with attrs including optional ``timezone``.\n", " - ``realization`` (string) describing the ensemble member label.\n", " - ``run_start`` coordinate on ``run`` (datetime64[ns], tz in attrs; NaT if unknown).\n", " - ``leadtime`` coordinate on ``time`` (float hours from run_start to valid time).\n", "\n", " All existing xsnowDataset API remains available via inheritance.\n", " \"\"\"\n", " # note that the class is basically empty (no methods, etc.)\n", "\n", "\n", "\n", "# note that this is not a class method, but a module-level function\n", 
"def read_ensemble_fx(\n", " source: Union[str, Path],\n", " # < more parameters >\n", ") -> Optional[EnsembleFX]:\n", " \"\"\"\n", " Read a forecast collection into an ``EnsembleFX`` dataset.\n", "\n", " Layout: ``source/{run}/{member}/{station}.{smet|pro|nc}``. Runs and members are\n", " derived from folder names; station IDs from filenames. Only requested runs,\n", " members, and filename bases are read to keep I/O minimal.\n", "\n", " < ... >\n", "\n", " Parameters\n", " ----------\n", " < ... >\n", "\n", " Returns\n", " -------\n", " EnsembleFX or None\n", " Decorated dataset when data were found; otherwise ``None``.\n", " \"\"\"\n", " \n", " # < iterate through directory tree and read >\n", "\n", " # < concatenate individual datasets >\n", "\n", " xs_out = EnsembleFX(xs_combined)\n", "\n", " return xs_out\n" ] }, { "cell_type": "markdown", "id": "e547957e", "metadata": {}, "source": [ "### Demo application" ] }, { "cell_type": "code", "execution_count": 7, "id": "8c08efb3", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Downloading file 'smets/ens-fx/2024-01-17T00Z/det/VIR1A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-17T00Z/det/VIR1A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-17T00Z/det/VIR5A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-17T00Z/det/VIR5A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-17T00Z/p01/VIR1A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-17T00Z/p01/VIR1A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-17T00Z/p01/VIR5A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-17T00Z/p01/VIR5A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 
'smets/ens-fx/2024-01-19T10Z/det/VIR1A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-19T10Z/det/VIR1A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-19T10Z/det/VIR5A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-19T10Z/det/VIR5A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-19T10Z/p01/VIR1A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-19T10Z/p01/VIR1A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-19T10Z/p01/VIR5A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-19T10Z/p01/VIR5A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-25T12Z/det/VIR1A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-25T12Z/det/VIR1A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-25T12Z/det/VIR5A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-25T12Z/det/VIR5A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-25T12Z/p01/VIR1A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-25T12Z/p01/VIR1A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/2024-01-25T12Z/p01/VIR5A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/2024-01-25T12Z/p01/VIR5A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", "Downloading file 'smets/ens-fx/analysis/det/VIR1A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/analysis/det/VIR1A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n", 
"Downloading file 'smets/ens-fx/analysis/det/VIR5A.smet' from 'https://gitlab.com/avacollabra/postprocessing/sample-data/-/raw/main/smets/ens-fx/analysis/det/VIR5A.smet' to '/home/flo/.cache/xsnow-snp-ensfx'.\n" ] } ], "source": [ "import xsnow\n", "from xsnow.extensions.ensemble_forecasts import read_ensemble_fx\n", "\n", "datapath = xsnow.sample_data.snp_ensfx_dir()" ] }, { "cell_type": "code", "execution_count": 8, "id": "d3f83b41", "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Data location: /home/flo/.cache/xsnow-snp-ensfx\n", "xsnow-snp-ensfx/\n", " smets/\n", " ens-fx/\n", " analysis/\n", " det/\n", " VIR1A.smet\n", " VIR5A.smet\n", " 2024-01-25T12Z/\n", " det/\n", " VIR1A.smet\n", " VIR5A.smet\n", " p01/\n", " VIR1A.smet\n", " VIR5A.smet\n", " 2024-01-19T10Z/\n", " det/\n", " VIR1A.smet\n", " VIR5A.smet\n", " p01/\n", " VIR1A.smet\n", " VIR5A.smet\n", " 2024-01-17T00Z/\n", " det/\n", " VIR1A.smet\n", " VIR5A.smet\n", " p01/\n", " VIR1A.smet\n", " VIR5A.smet\n" ] } ], "source": [ "# cell hidden through metadata\n", "import os\n", "print(f\"Data location: {datapath}\")\n", "for root, dirs, files in os.walk(datapath):\n", " level = root.replace(datapath, \"\").count(os.sep)\n", " indent = \" \" * 4 * level\n", " print(f\"{indent}{os.path.basename(root)}/\")\n", " subindent = \" \" * 4 * (level + 1)\n", " fcounter = 0\n", " for f in sorted(files):\n", " if fcounter < 3 or fcounter > len(files)-3:\n", " print(f\"{subindent}{f}\")\n", " elif fcounter == 3:\n", " print(f\"{subindent}...\")\n", " fcounter += 1\n" ] }, { "cell_type": "code", "execution_count": 9, "id": "f2148bee", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " Locations: 2\n", " Timestamps: 360 (2024-01-16--2024-01-31)\n", " Profiles: 5760 total | 0 valid | unavailable with HS>0\n", "\n", " employing the Size: 406kB\n", " Dimensions: (location: 2, time: 360, slope: 1, realization: 2, run: 
4)\n", " Coordinates:\n", " altitude (location) float64 16B 2.372e+03 2.066e+03\n", " latitude (location) float64 16B 47.15 47.37\n", " * location (location) object 16B 'VIR1A' 'VIR5A'\n", " longitude (location) float64 16B 11.19 11.5\n", " leadtime (time, run) float64 12kB nan nan nan nan ... nan nan nan nan\n", " * time (time) datetime64[ns] 3kB 2024-01-16T03:00:00 ... 2024-01...\n", " azimuth (slope) float64 8B nan\n", " inclination (slope) float64 8B nan\n", " * slope (slope) int64 8B 0\n", " * realization (realization) object 16B 'det' 'p01'\n", " * run (run) object 32B '2024-01-17T00Z' ... 'analysis'\n", " run_start (run) datetime64[ns] 32B 2024-01-17 ... NaT\n", " Data variables:\n", " DW (location, time, slope, realization, run) float64 46kB na...\n", " ISWR (location, time, slope, realization, run) float64 46kB na...\n", " PSUM (location, time, slope, realization, run) float64 46kB na...\n", " RH (location, time, slope, realization, run) float64 46kB na...\n", " TA (location, time, slope, realization, run) float64 46kB na...\n", " TAU_CLD (location, time, slope, realization, run) float64 46kB na...\n", " VW (location, time, slope, realization, run) float64 46kB na...\n", " VW_MAX (location, time, slope, realization, run) float64 46kB na...\n", " profile_status (location, time, slope, realization, run) float32 23kB na...\n", " Attributes:\n", " Conventions: CF-1.8\n", " crs: EPSG:4326\n" ] } ], "source": [ "xs = read_ensemble_fx(f\"{datapath}/smets/ens-fx/\")\n", "print(xs)" ] }, { "cell_type": "markdown", "id": "ab76cf2d", "metadata": {}, "source": [ "The resulting dataset is of class `EnsembleFX`. It has an additional dimension `run` with 4 entries (three forecast runs plus `analysis`). Two coordinates were added: `run_start` on dimension `(run)` and `leadtime` on dimensions `(time, run)`. You can now work with the dataset just as you would with an `xsnowDataset`. 
\n", "\n", "For example, we can inspect the first 100 values of `'TA'` and `'leadtime'` for the *deterministic* member of the *2024-01-17T00Z* run at the first location:" ] }, { "cell_type": "code", "execution_count": 12, "id": "72ddedb5", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ nan nan nan nan nan nan nan nan nan nan\n", " nan nan nan nan nan nan nan nan nan nan\n", " nan 269.48 269.72 269.99 270.52 270.58 270.1 269.25 270.03 272.68\n", " 273.89 274.46 274.91 275.49 275.3 274.4 273.82 273.22 272.24 271.08\n", " 271.16 271.88 273.1 272.81 271.62 271.51 271.72 271.81 271.49 271.55\n", " 271.54 271.7 272.01 272.42 272.83 272.95 273.3 273.98 274.13 273.91\n", " 273.35 272.19 271.77 271.35 270.84 268.75 267.23 265.45 263.89 262.66\n", " 261.81 261.06 260.68 260.13 259.72 259.38 259.1 258.85 258.6 258.48\n", " 258.46 258.49 258.53 258.58 258.46 258.15 257.73 257.22 257.04 257.04\n", " 256.89 256.97 256.82 256.68 nan nan nan nan nan nan]\n", "[nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan\n", " nan nan nan 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.\n", " 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32.\n", " 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50.\n", " 51. 52. 53. 54. 55. 56. 57. 58. 59. 60. 61. 62. 63. 64. 65. 66. 67. 68.\n", " 69. 70. 71. 72. nan nan nan nan nan nan]\n" ] } ], "source": [ "sub = xs.sel(run=\"2024-01-17T00Z\", realization='det').\\\n", " isel(location=0, time=slice(100)).squeeze()\n", "\n", "print(sub['TA'].values)\n", "print(sub['leadtime'].values)" ] }, { "cell_type": "markdown", "id": "5f522e84", "metadata": {}, "source": [ "## Example: Hazard chart extension---entirely new functionality\n", "\n", "```{warning}\n", "\n", "Coming soon. In the meantime, you can check out the source code of the [Hazard chart extension](../../api/_generated/xsnow.extensions.hazard_chart) directly. 
\n", "```" ] } ], "metadata": { "kernelspec": { "display_name": "xsnow-dev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.5" } }, "nbformat": 4, "nbformat_minor": 5 }