{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Event Synchronization (ES) networks \n", "\n", "Event Synchronization (ES) is a statistical method used to quantify the synchrony between event time series by measuring the relative timing of events in two time series. ES has become particularly useful in climate science for analyzing extreme events across different geographical locations.\n", "\n", "This notebook demonstrates how to use the `dominosee` package to perform Event Synchronization for the temporal positions of events and construct event-based climate networks from time series data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Input Data\n", "\n", "We'll start by creating a synthetic dataset to demonstrate the ES workflow. For real-world applications, you would typically load climate data from NetCDF files or other sources. In this example, we'll generate a simple dataset with the Standardized Precipitation Index (SPI) values, which are commonly used to identify drought conditions.\n", "\n", "We'll create a synthetic dataset with the following properties:\n", "- Spatial dimensions: 20x20 grid points (latitude x longitude)\n", "- Temporal dimension: 365 days (daily data for one year)\n", "- Variable: SPI1" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "import xarray as xr\n", "\n", "xr.set_options(display_expand_data=False)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset> Size: 1MB\n",
       "Dimensions:  (lat: 20, lon: 20, time: 365)\n",
       "Coordinates:\n",
       "  * lat      (lat) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lon      (lon) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * time     (time) datetime64[ns] 3kB 1950-01-01 1950-01-02 ... 1950-12-31\n",
       "Data variables:\n",
       "    SPI1     (lat, lon, time) float64 1MB -0.6445 -0.3062 ... 0.3327 -0.5354
" ], "text/plain": [ " Size: 1MB\n", "Dimensions: (lat: 20, lon: 20, time: 365)\n", "Coordinates:\n", " * lat (lat) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lon (lon) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * time (time) datetime64[ns] 3kB 1950-01-01 1950-01-02 ... 1950-12-31\n", "Data variables:\n", " SPI1 (lat, lon, time) float64 1MB -0.6445 -0.3062 ... 0.3327 -0.5354" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Using smaller dimensions for demonstration\n", "nx_demo, ny_demo, nt_demo = 20, 20, 365\n", "\n", "# Create coordinates\n", "lats = np.linspace(-90, 90, nx_demo)\n", "lons = np.linspace(-180, 180, ny_demo)\n", "times = xr.date_range(\"1950-01-01\", periods=nt_demo, freq=\"D\")\n", "\n", "# Create standard normal data for SPI values (mean=0, std=1)\n", "spi_data = np.random.normal(0, 1, size=(nx_demo, ny_demo, nt_demo)) # Standard normal distribution\n", "\n", "# Create xarray Dataset\n", "spi = xr.Dataset(\n", " data_vars={\n", " \"SPI1\": ([\"lat\", \"lon\", \"time\"], spi_data)\n", " },\n", " coords={\n", " \"lat\": lats,\n", " \"lon\": lons,\n", " \"time\": times\n", " }\n", ")\n", "\n", "spi" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Extract the timing of extreme events\n", "\n", "Event Synchronization analysis requires the extraction of discrete event positions from continuous time series data. In this step, we convert our continuous SPI time series into binary event time series by applying a threshold.\n", "\n", "We'll use the `get_event` function from the `dominosee.eventorize` module, which identifies events based on a specified threshold and direction (above or below). For drought events, we're interested in SPI values below a threshold of -1.0. It is recommended to only consider the burst timing for each series of consecutive event occurrences to avoid the effects of temporal clustering on ES calculations, so we set `select=\"start\"` and `select_period=1`." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'drought' (lat: 20, lon: 20, time: 365)> Size: 146kB\n",
       "False False True False False False False ... True False True False False False\n",
       "Coordinates:\n",
       "  * lat      (lat) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lon      (lon) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * time     (time) datetime64[ns] 3kB 1950-01-01 1950-01-02 ... 1950-12-31\n",
       "Attributes:\n",
       "    threshold:      -1.0\n",
       "    extreme:        below\n",
       "    long_name:      drought events\n",
       "    description:    Events with -1.0 below threshold\n",
       "    event_name:     drought\n",
       "    select:         start\n",
       "    select_period:  1
" ], "text/plain": [ " Size: 146kB\n", "False False True False False False False ... True False True False False False\n", "Coordinates:\n", " * lat (lat) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lon (lon) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * time (time) datetime64[ns] 3kB 1950-01-01 1950-01-02 ... 1950-12-31\n", "Attributes:\n", " threshold: -1.0\n", " extreme: below\n", " long_name: drought events\n", " description: Events with -1.0 below threshold\n", " event_name: drought\n", " select: start\n", " select_period: 1" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from dominosee.eventorize import get_event\n", "da_burst = get_event(spi.SPI1, threshold=-1.0, extreme=\"below\", event_name=\"drought\", select=\"start\", select_period=1)\n", "da_burst" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After identifying extreme events with the `get_event` function, we need to extract the timing positions of these events using `get_event_positions`. This is because ES analysis requires the exact timing of events to calculate the time delay between event pairs between two locations." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset> Size: 106kB\n",
       "Dimensions:          (lat: 20, lon: 20, event: 64)\n",
       "Coordinates:\n",
       "  * lat              (lat) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n",
       "  * lon              (lon) float64 160B -180.0 -161.1 -142.1 ... 161.1 180.0\n",
       "  * event            (event) int64 512B 0 1 2 3 4 5 6 7 ... 57 58 59 60 61 62 63\n",
       "Data variables:\n",
       "    event_positions  (lat, lon, event) int32 102kB 2 12 27 34 37 ... -1 -1 -1 -1\n",
       "    event_count      (lat, lon) int64 3kB 41 42 48 44 51 55 ... 43 47 45 46 56
" ], "text/plain": [ " Size: 106kB\n", "Dimensions: (lat: 20, lon: 20, event: 64)\n", "Coordinates:\n", " * lat (lat) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n", " * lon (lon) float64 160B -180.0 -161.1 -142.1 ... 161.1 180.0\n", " * event (event) int64 512B 0 1 2 3 4 5 6 7 ... 57 58 59 60 61 62 63\n", "Data variables:\n", " event_positions (lat, lon, event) int32 102kB 2 12 27 34 37 ... -1 -1 -1 -1\n", " event_count (lat, lon) int64 3kB 41 42 48 44 51 55 ... 43 47 45 46 56" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from dominosee.es import get_event_positions\n", "da_burst_ep = get_event_positions(da_burst)\n", "da_burst_ep" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result is a integer Dataset of event positions, where:\n", "- `event_position` indicates the position of the event in the time dimension\n", "- `event_count` indicates the number of events\n", "\n", "This integer representation allows us to easily identify the positions of events in the time dimension, and the number of events. The event Dataset includes attributes that document the threshold and criteria used to define the events." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calculate Event Synchronization between location pairs\n", "\n", "Event Synchronization quantifies the similarity in the timing of events between two time series. Unlike ECA, which counts coincidences within fixed time windows, ES dynamically adapts the tolerance window based on the local event lags." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'event_positions' (latA: 20, latB: 20, lonA: 20, lonB: 20)> Size: 160kB\n",
       "39 11 9 16 14 18 14 10 14 14 16 11 15 ... 17 16 16 16 14 18 9 15 13 14 18 12 54\n",
       "Coordinates:\n",
       "  * latA     (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonA     (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * latB     (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonB     (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "Attributes:\n",
       "    long_name:     Event Synchronization\n",
       "    units:         count\n",
       "    description:   Number of synchronized events between locations A and B\n",
       "    tau_max:       10\n",
       "    output_dtype:  <class 'numpy.uint8'>
" ], "text/plain": [ " Size: 160kB\n", "39 11 9 16 14 18 14 10 14 14 16 11 15 ... 17 16 16 16 14 18 9 15 13 14 18 12 54\n", "Coordinates:\n", " * latA (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonA (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * latB (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonB (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", "Attributes:\n", " long_name: Event Synchronization\n", " units: count\n", " description: Number of synchronized events between locations A and B\n", " tau_max: 10\n", " output_dtype: " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from dominosee.es import get_event_sync_from_positions\n", "da_es = get_event_sync_from_positions(positionsA=da_burst_ep.event_positions,\n", " positionsB=da_burst_ep.event_positions,\n", " tm=10)\n", "da_es" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `get_event_sync_from_positions` function calculates the event synchronization between all pairs of grid points. The resulting DataArray has four dimensions:\n", "- `latA`, `lonA`: Coordinates of the location A\n", "- `latB`, `lonB`: Coordinates of the location B\n", "\n", "The function parameters control the ES calculation:\n", "- `tm`: The maximum time window (in time steps) within which events must occur to be considered synchronized" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calculate the statistical confidence of ES\n", "\n", "
\n", "EXPERIMENTAL: The Event Synchronization null model in dominosee is currently experimental. The functionality may change in future releases.\n", "
\n", "\n", "Event Synchronization (ES) is a statistical method used to quantify the synchrony between event time series...\n", "\n", "After calculating the raw event coincidence counts, the next critical step is to determine which connections are statistically significant. This involves:\n", "\n", "1. **Get critical values**: Computing ES values for many realizations of the surrogate data. Unlike ECA, which can utilize analytical formulas for significance testing, ES typically relies on computational methods like bootstrapping to generate surrogate data. The significance level (e.g., p < 0.05 or p < 0.01) determines how conservative our network construction will be - stricter thresholds produce sparser but more reliable networks.\n", "\n", "2. **Construct a null model for location pairs**: Map the critical values to the number of total events in the locations pairs." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray (noeA: 65, noeB: 65, significance: 1)> Size: 34kB\n",
       "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 15 14 16 17 16 15 15 18 17 18 17 17 17 18 17\n",
       "Coordinates:\n",
       "  * noeA          (noeA) int64 520B 0 1 2 3 4 5 6 7 ... 57 58 59 60 61 62 63 64\n",
       "  * noeB          (noeB) int64 520B 0 1 2 3 4 5 6 7 ... 57 58 59 60 61 62 63 64\n",
       "  * significance  (significance) float64 8B 0.001\n",
       "Attributes:\n",
       "    description:  Event synchronization null model for pairs of number of events\n",
       "    tau_max:      10\n",
       "    max_events:   64\n",
       "    min_es:       None
" ], "text/plain": [ " Size: 34kB\n", "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 15 14 16 17 16 15 15 18 17 18 17 17 17 18 17\n", "Coordinates:\n", " * noeA (noeA) int64 520B 0 1 2 3 4 5 6 7 ... 57 58 59 60 61 62 63 64\n", " * noeB (noeB) int64 520B 0 1 2 3 4 5 6 7 ... 57 58 59 60 61 62 63 64\n", " * significance (significance) float64 8B 0.001\n", "Attributes:\n", " description: Event synchronization null model for pairs of number of events\n", " tau_max: 10\n", " max_events: 64\n", " min_es: None" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from dominosee.es import create_null_model_from_indices\n", "da_cvalues = create_null_model_from_indices(spi.time, tm=da_es.attrs[\"tau_max\"],\n", " max_events=da_burst_ep.event_count.max().values, significances=0.001)\n", "da_cvalues" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray (latA: 20, lonA: 20, latB: 20, lonB: 20)> Size: 1MB\n",
       "12 10 11 10 12 12 12 11 12 12 11 9 12 ... 13 14 11 14 13 11 14 14 12 14 13 13 15\n",
       "Coordinates:\n",
       "    noeA          (latA, lonA) int64 3kB 41 42 48 44 51 55 ... 51 43 47 45 46 56\n",
       "    noeB          (latB, lonB) int64 3kB 41 42 48 44 51 55 ... 51 43 47 45 46 56\n",
       "    significance  float64 8B 0.001\n",
       "  * latA          (latA) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n",
       "  * lonA          (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * latB          (latB) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n",
       "  * lonB          (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "Attributes:\n",
       "    description:  Event synchronization null model for pairs of locations\n",
       "    tau_max:      10\n",
       "    max_events:   64\n",
       "    min_es:       None
" ], "text/plain": [ " Size: 1MB\n", "12 10 11 10 12 12 12 11 12 12 11 9 12 ... 13 14 11 14 13 11 14 14 12 14 13 13 15\n", "Coordinates:\n", " noeA (latA, lonA) int64 3kB 41 42 48 44 51 55 ... 51 43 47 45 46 56\n", " noeB (latB, lonB) int64 3kB 41 42 48 44 51 55 ... 51 43 47 45 46 56\n", " significance float64 8B 0.001\n", " * latA (latA) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n", " * lonA (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * latB (latB) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n", " * lonB (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", "Attributes:\n", " description: Event synchronization null model for pairs of locations\n", " tau_max: 10\n", " max_events: 64\n", " min_es: None" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from dominosee.es import convert_null_model_for_locations\n", "da_null = convert_null_model_for_locations(da_cvalues, da_burst_ep.event_count, da_burst_ep.event_count, sig=0.001)\n", "da_null" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Construct network represented by adjacency matrix\n", "\n", "The next step involves constructing the climate network based on statistically significant connections. We'll create an adjacency matrix, which is a fundamental representation of the network structure:\n", "\n", "- **Adjacency Matrix**: A square matrix where both rows and columns correspond to grid points (locations).\n", "- **Matrix Elements**: Each element (i,j) contains a binary value:\n", " - 1 if a statistically significant connection exists between locations i and j\n", " - 0 if no significant connection is present\n", "- **Interpretation**: This matrix provides a concise representation of the network topology, enabling further analysis of the climate system's spatial relationships." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray (latA: 20, latB: 20, lonA: 20, lonB: 20)> Size: 160kB\n",
       "True True False True True True True ... False True True False True False True\n",
       "Coordinates:\n",
       "  * latA          (latA) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n",
       "  * lonA          (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * latB          (latB) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n",
       "  * lonB          (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "    noeA          (latA, lonA) int64 3kB 41 42 48 44 51 55 ... 51 43 47 45 46 56\n",
       "    noeB          (latB, lonB) int64 3kB 41 42 48 44 51 55 ... 51 43 47 45 46 56\n",
       "    significance  float64 8B 0.001
" ], "text/plain": [ " Size: 160kB\n", "True True False True True True True ... False True True False True False True\n", "Coordinates:\n", " * latA (latA) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n", " * lonA (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * latB (latB) float64 160B -90.0 -80.53 -71.05 ... 71.05 80.53 90.0\n", " * lonB (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " noeA (latA, lonA) int64 3kB 41 42 48 44 51 55 ... 51 43 47 45 46 56\n", " noeB (latB, lonB) int64 3kB 41 42 48 44 51 55 ... 51 43 47 45 46 56\n", " significance float64 8B 0.001" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from dominosee.network import get_link_from_critical_values\n", "\n", "da_link = get_link_from_critical_values(da_es, da_null, rule=\"greater\")\n", "da_link" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The network density is 82.36%.\n" ] } ], "source": [ "print(f\"The network density is {da_link.sum().values/da_link.size*100:.2f}%.\")" ] } ], "metadata": { "kernelspec": { "display_name": "dominosee-dev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.12" } }, "nbformat": 4, "nbformat_minor": 2 }