{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# VirES Python Client Data Handling\n", "\n", "> Abstract: The VirES Python Client provides helper functions for handling the retrieved data." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2024-01-11T10:18:13.772344Z", "iopub.status.busy": "2024-01-11T10:18:13.772187Z", "iopub.status.idle": "2024-01-11T10:18:14.612359Z", "shell.execute_reply": "2024-01-11T10:18:14.611964Z" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Python implementation: CPython\n", "Python version : 3.9.7\n", "IPython version : 8.0.1\n", "\n", "viresclient: 0.11.0\n", "pandas : 1.4.1\n", "xarray : 0.21.1\n", "matplotlib : 3.5.1\n", "\n" ] } ], "source": [ "# Display important package versions used\n", "%load_ext watermark\n", "%watermark -i -v -p viresclient,pandas,xarray,matplotlib" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The previous sections described how to use the **`viresclient`** to find and retrieve Aeolus data.\n", "This tutorial covers the options for converting and manipulating the data once it has been retrieved.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What to do when data has been retrieved\n", "Once the data has been retrieved with the `get_between` method, we have a data object (of type `ReturnData`) that provides functions for converting the data into the object type you prefer to work with.\n", "Let's first request some data so that we can manipulate it afterwards:\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-01-11T10:18:14.614289Z", "iopub.status.busy": "2024-01-11T10:18:14.614156Z", "iopub.status.idle": "2024-01-11T10:18:19.109030Z", "shell.execute_reply": "2024-01-11T10:18:19.108572Z" } }, "outputs": [ { "data": { 
"application/vnd.jupyter.widget-view+json": { "model_id": "64e8d0c1376f43e199aad78458780537", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Processing: 0%| | [ Elapsed: 00:00, Remaining: ? ] [1/1] " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c5120e8cb1b14608ad20c6cae9dff789", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading: 0%| | [ Elapsed: 00:00, Remaining: ? ] (0.101MB)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# We import the AeolusRequest class from the viresclient\n", "from viresclient import AeolusRequest\n", "# We create a new AeolusRequest instance\n", "request = AeolusRequest()\n", "DATA_PRODUCT = \"ALD_U_N_2A\"\n", "request.set_collection(DATA_PRODUCT)\n", "\n", "# Fetch some example parameters, for example from two different field_types\n", "request.set_fields(\n", " sca_fields=[\"SCA_extinction\"],\n", " ica_fields=[\"ICA_extinction\"],\n", ")\n", "\n", "# Retrieve the data\n", "return_data = request.get_between(\n", " start_time=\"2020-04-10T06:21:58Z\",\n", " end_time=\"2020-04-10T06:22:33Z\",\n", " filetype=\"nc\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Additional information on response\n", "The response data object has also a `sources` attribute that provides an array of tuples that describe from which products the returned data has been extracted. Each tuple contains 3 elements, which are filename, baseline and processor identifier.\n", "This information is also passed to the xarray Attributes as `Sources`." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2024-01-11T10:18:19.111131Z", "iopub.status.busy": "2024-01-11T10:18:19.110995Z", "iopub.status.idle": "2024-01-11T10:18:19.118683Z", "shell.execute_reply": "2024-01-11T10:18:19.118358Z" } }, "outputs": [ { "data": { "text/plain": [ "[('AE_OPER_ALD_U_N_2A_20200410T062135020_005424001_009457_0004',\n", " '2A11',\n", " 'ADM_L2aP/03.11')]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We can see the source files from which the data was extracted\n", "# by looking at the sources attribute\n", "return_data.sources" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Convert data\n", "Now that we have the `return_data` object, which is a wrapper around the retrieved netCDF file, we can use its conversion functions:\n", "* as_xarray: Returns an xarray object. Groups are not possible in xarray, so all parameters are flattened to the same level; this causes naming conflicts when requesting multiple field_types that contain parameters with the same identifier\n", "* as_xarray_dict: Returns a dictionary with the field_type as key and xarray objects as values\n", "* as_dataframe: Returns a pandas DataFrame object\n", "\n", "The previous tutorials already showed some examples of these methods; they are listed here again as an overview.\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2024-01-11T10:18:19.120266Z", "iopub.status.busy": "2024-01-11T10:18:19.120132Z", "iopub.status.idle": "2024-01-11T10:18:19.171461Z", "shell.execute_reply": "2024-01-11T10:18:19.171068Z" } }, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset>\n", "Dimensions: (ica_dim: 357, array_24: 24, sca_dim: 3)\n", "Dimensions without coordinates: ica_dim, array_24, sca_dim\n", "Data variables:\n", " ICA_extinction (ica_dim, array_24) float64 -1e+06 0.0 ... -1e+06 -1e+06\n", " SCA_extinction (sca_dim, array_24) float64 -1e+06 0.0 ... -1e+06 -1e+06\n", "Attributes:\n", " Sources: [('AE_OPER_ALD_U_N_2A_20200410T062135020_005424001_009457_0004'...

| ica_dim | array_24 | sca_dim | ICA_extinction | SCA_extinction |
|---|---|---|---|---|
| 0 | 0 | 0 | -1000000.0 | -1000000.0 |
| 0 | 0 | 1 | -1000000.0 | -1000000.0 |
| 0 | 0 | 2 | -1000000.0 | -1000000.0 |
| 0 | 1 | 0 | 0.0 | 0.0 |
| 0 | 1 | 1 | 0.0 | 0.0 |
| ... | ... | ... | ... | ... |
| 356 | 22 | 1 | -1000000.0 | -1000000.0 |
| 356 | 22 | 2 | -1000000.0 | -1000000.0 |
| 356 | 23 | 0 | -1000000.0 | -1000000.0 |
| 356 | 23 | 1 | -1000000.0 | -1000000.0 |
| 356 | 23 | 2 | -1000000.0 | -1000000.0 |

25704 rows × 2 columns
\n", "<xarray.Dataset>\n", "Dimensions: (ica_dim: 357, array_24: 24, sca_dim: 3)\n", "Dimensions without coordinates: ica_dim, array_24, sca_dim\n", "Data variables:\n", " ICA_extinction (ica_dim, array_24) float64 -1e+06 0.0 ... -1e+06 -1e+06\n", " SCA_extinction (sca_dim, array_24) float64 -1e+06 0.0 ... -1e+06 -1e+06