{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Event Coincidence Analysis (ECA) networks \n", "\n", "Event Coincidence Analysis (ECA) is a statistical method used to identify synchronous events between event time series (i.e. two locations in DOMINO-SEE), following two user-specified parameters. This approach allows us to construct networks where nodes represent extreme events in locations and edges represent significant coincidences of extreme events between locations.\n", "\n", "This notebook demonstrates how to use the `dominosee` package to perform ECA and construct event-based climate networks from time series data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Input Data\n", "\n", "We'll start by creating a synthetic dataset to demonstrate the ECA workflow. For real-world applications, you would typically load climate data from NetCDF files or other sources. In this example, we'll generate a simple dataset with the Standardized Precipitation Index (SPI) values, which are commonly used to identify drought conditions.\n", "\n", "We'll create a synthetic dataset with the following properties:\n", "- Spatial dimensions: 20x20 grid points (latitude x longitude)\n", "- Temporal dimension: 365 days (daily data for one year)\n", "- Variable: SPI1" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "import xarray as xr\n", "\n", "xr.set_options(display_expand_data=False)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset> Size: 1MB\n",
       "Dimensions:  (lat: 20, lon: 20, time: 365)\n",
       "Coordinates:\n",
       "  * lat      (lat) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lon      (lon) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * time     (time) datetime64[ns] 3kB 1950-01-01 1950-01-02 ... 1950-12-31\n",
       "Data variables:\n",
       "    SPI1     (lat, lon, time) float64 1MB -0.1615 1.393 0.208 ... 0.2356 -0.1454
" ], "text/plain": [ " Size: 1MB\n", "Dimensions: (lat: 20, lon: 20, time: 365)\n", "Coordinates:\n", " * lat (lat) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lon (lon) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * time (time) datetime64[ns] 3kB 1950-01-01 1950-01-02 ... 1950-12-31\n", "Data variables:\n", " SPI1 (lat, lon, time) float64 1MB -0.1615 1.393 0.208 ... 0.2356 -0.1454" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Using smaller dimensions for demonstration\n", "nx_demo, ny_demo, nt_demo = 20, 20, 365\n", "\n", "# Create coordinates\n", "lats = np.linspace(-90, 90, nx_demo)\n", "lons = np.linspace(-180, 180, ny_demo)\n", "times = xr.date_range(\"1950-01-01\", periods=nt_demo, freq=\"D\")\n", "\n", "# Create standard normal data for SPI values (mean=0, std=1)\n", "spi_data = np.random.normal(0, 1, size=(nx_demo, ny_demo, nt_demo)) # Standard normal distribution\n", "\n", "# Create xarray Dataset\n", "spi = xr.Dataset(\n", " data_vars={\n", " \"SPI1\": ([\"lat\", \"lon\", \"time\"], spi_data)\n", " },\n", " coords={\n", " \"lat\": lats,\n", " \"lon\": lons,\n", " \"time\": times\n", " }\n", ")\n", "\n", "spi" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Extract the timing of extreme events\n", "\n", "To perform Event Coincidence Analysis, we first need to identify when and where extreme events occur. In this step, we convert our continuous SPI time series into binary event time series by applying a threshold.\n", "\n", "We'll use the `get_event` function from the `dominosee.eventorize` module, which identifies events based on a specified threshold and direction (above or below). For drought events, we're interested in SPI values below a threshold of -1.0." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'drought' (lat: 20, lon: 20, time: 365)> Size: 146kB\n",
       "False False False False False False ... False False False False False False\n",
       "Coordinates:\n",
       "  * lat      (lat) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lon      (lon) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * time     (time) datetime64[ns] 3kB 1950-01-01 1950-01-02 ... 1950-12-31\n",
       "Attributes:\n",
       "    threshold:      -1.0\n",
       "    extreme:        below\n",
       "    long_name:      drought events\n",
       "    description:    Events with -1.0 below threshold\n",
       "    event_name:     drought\n",
       "    select:         None\n",
       "    select_period:  None
" ], "text/plain": [ " Size: 146kB\n", "False False False False False False ... False False False False False False\n", "Coordinates:\n", " * lat (lat) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lon (lon) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * time (time) datetime64[ns] 3kB 1950-01-01 1950-01-02 ... 1950-12-31\n", "Attributes:\n", " threshold: -1.0\n", " extreme: below\n", " long_name: drought events\n", " description: Events with -1.0 below threshold\n", " event_name: drought\n", " select: None\n", " select_period: None" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from dominosee.eventorize import get_event\n", "da_event = get_event(spi.SPI1, threshold=-1.0, extreme=\"below\", event_name=\"drought\")\n", "da_event" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result is a binary DataArray with the same dimensions as the original SPI data, where:\n", "- `True` indicates that an extreme event (drought) occurred at that location and time\n", "- `False` indicates no extreme event\n", "\n", "This binary representation allows us to easily identify when and where extreme events occur throughout our dataset. The event DataArray includes attributes that document the threshold and criteria used to define the events." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Identify number of precursor events and trigger events of ECA among location pairs\n", "\n", "Event Coincidence Analysis examines whether events in one time series (location A) are typically followed or preceded by events in another time series (location B) within a specified time window (which is specified by the `window`, `tau` and `sym` parameters). This allows us to identify temporal relationships of extreme events between different locations.\n", "\n", "In ECA, we distinguish between:\n", "1. **Precursor events**: Events at location A that precede events at location B\n", "2. **Trigger events**: Events at location A that are followed by events at location B\n", "However, as we use a symmetrical window (by specifying `sym=True`), we do not distinguish which location has earlier events but to quantify the co-occurrence of events at both locations.\n", "\n", "The ECA metrics quantify the strength of these relationships based on event co-occurrences. We'll use the `dominosee.eca` module to calculate these metrics." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "from dominosee.eca import get_eca_precursor_from_events, get_eca_trigger_from_events" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'drought' (latA: 20, latB: 20, lonA: 20, lonB: 20)> Size: 320kB\n",
       "52 34 29 31 27 21 32 29 28 35 33 31 25 ... 28 23 27 26 32 26 25 27 14 33 26 47\n",
       "Coordinates:\n",
       "  * latA     (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonA     (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * latB     (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonB     (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "Attributes:\n",
       "    long_name:    Precursor Events\n",
       "    units:        count\n",
       "    description:  Number of precursor events (from location A to location B) ...\n",
       "    eca_params:   {"delt": 2, "sym": true, "tau": 0}
" ], "text/plain": [ " Size: 320kB\n", "52 34 29 31 27 21 32 29 28 35 33 31 25 ... 28 23 27 26 32 26 25 27 14 33 26 47\n", "Coordinates:\n", " * latA (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonA (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * latB (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonB (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", "Attributes:\n", " long_name: Precursor Events\n", " units: count\n", " description: Number of precursor events (from location A to location B) ...\n", " eca_params: {\"delt\": 2, \"sym\": true, \"tau\": 0}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Calculate precursor events\n", "da_precursor = get_eca_precursor_from_events(eventA=da_event, eventB=da_event, delt=2, sym=True, tau=0)\n", "da_precursor" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'drought' (latA: 20, latB: 20, lonA: 20, lonB: 20)> Size: 320kB\n",
       "52 59 54 60 51 45 59 67 49 64 67 57 50 ... 52 48 53 55 67 51 55 63 49 67 57 47\n",
       "Coordinates:\n",
       "  * latA     (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonA     (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * latB     (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonB     (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "Attributes:\n",
       "    long_name:    Trigger Events\n",
       "    units:        count\n",
       "    description:  Number of trigger events (from location A to location B) in...\n",
       "    eca_params:   {"delt": 10, "sym": true, "tau": 0}
" ], "text/plain": [ " Size: 320kB\n", "52 59 54 60 51 45 59 67 49 64 67 57 50 ... 52 48 53 55 67 51 55 63 49 67 57 47\n", "Coordinates:\n", " * latA (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonA (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * latB (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonB (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", "Attributes:\n", " long_name: Trigger Events\n", " units: count\n", " description: Number of trigger events (from location A to location B) in...\n", " eca_params: {\"delt\": 10, \"sym\": true, \"tau\": 0}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Calculate trigger events\n", "da_trigger = get_eca_trigger_from_events(eventA=da_event, eventB=da_event, delt=10, sym=True, tau=0)\n", "da_trigger" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `get_eca_precursor_from_events` function calculates the number of precursor events between all pairs of grid points. The resulting DataArray has four dimensions:\n", "- `latA`, `lonA`: Coordinates of the location A\n", "- `latB`, `lonB`: Coordinates of the location B\n", "\n", "The function parameters control the ECA calculation:\n", "- `delt`: The time window (in time steps) within which events must occur to be considered coincident\n", "- `sym`: Whether to use a symmetric time window (before and after)\n", "- `tau`: The required delay between events\n", "\n", "Similarly, `get_eca_trigger_from_events` calculates the number of trigger events - events at location A that are followed by events at location B within the specified time window." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calculate the statistical confidence of ECA\n", "\n", "After calculating the raw event coincidence counts, the next critical step is to determine which connections are statistically significant. This involves:\n", "\n", "1. **Calculating confidence levels**: By assuming the random occurrence of events, we can obtain the confidence level of the event coincidence counts for both the precursor and trigger events.\n", "\n", "2. **Setting significance thresholds**: We can determine a chosen significance level (e.g., 95% or 99%) above which coincidence counts are considered statistically significant." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "from dominosee.eca import get_eca_precursor_confidence, get_eca_trigger_confidence" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'prec_conf' (latA: 20, latB: 20, lonA: 20, lonB: 20)> Size: 640kB\n",
       "1.0 0.9997 0.9985 0.9974 0.9944 0.963 ... 0.9821 0.9725 0.3567 0.9988 0.9846 1.0\n",
       "Coordinates:\n",
       "  * latA     (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonA     (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * latB     (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonB     (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "Attributes:\n",
       "    long_name:    Precursor confidence\n",
       "    units:        \n",
       "    description:  Confidence of precursor rates (from location A to location ...\n",
       "    eca_params:   {"delt": 2, "sym": true, "tau": 0}
" ], "text/plain": [ " Size: 640kB\n", "1.0 0.9997 0.9985 0.9974 0.9944 0.963 ... 0.9821 0.9725 0.3567 0.9988 0.9846 1.0\n", "Coordinates:\n", " * latA (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonA (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * latB (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonB (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", "Attributes:\n", " long_name: Precursor confidence\n", " units: \n", " description: Confidence of precursor rates (from location A to location ...\n", " eca_params: {\"delt\": 2, \"sym\": true, \"tau\": 0}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_prec_conf = get_eca_precursor_confidence(precursor=da_precursor, eventA=da_event, eventB=da_event)\n", "da_prec_conf" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'trigger_conf' (latA: 20, latB: 20, lonA: 20, lonB: 20)> Size: 640kB\n",
       "1.0 1.0 1.0 1.0 1.0 0.9875 1.0 1.0 0.9998 ... 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n",
       "Coordinates:\n",
       "  * latA     (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonA     (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * latB     (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonB     (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "Attributes:\n",
       "    long_name:    Trigger confidence\n",
       "    units:        \n",
       "    description:  Confidence of trigger rates (from location A to location B)...\n",
       "    eca_params:   {"delt": 10, "sym": true, "tau": 0}
" ], "text/plain": [ " Size: 640kB\n", "1.0 1.0 1.0 1.0 1.0 0.9875 1.0 1.0 0.9998 ... 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0\n", "Coordinates:\n", " * latA (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonA (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * latB (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonB (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", "Attributes:\n", " long_name: Trigger confidence\n", " units: \n", " description: Confidence of trigger rates (from location A to location B)...\n", " eca_params: {\"delt\": 10, \"sym\": true, \"tau\": 0}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_trig_conf = get_eca_trigger_confidence(trigger=da_trigger, eventA=da_event, eventB=da_event)\n", "da_trig_conf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Construct network represented by adjacency matrix\n", "\n", "The next step involves constructing the climate network based on our statistically significant connections. We'll create an adjacency matrix, which is a fundamental representation of the network structure:\n", "\n", "- **Adjacency Matrix**: A square matrix where both rows and columns correspond to grid points (locations).\n", "- **Matrix Elements**: Each element (i,j) contains a binary value:\n", " - 1 if a statistically significant connection exists between locations i and j\n", " - 0 if no significant connection is present\n", "- **Interpretation**: This matrix provides a concise representation of the network topology, enabling further analysis of the climate system's spatial relationships." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "from dominosee.network import get_link_from_confidence" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray (latA: 20, latB: 20, lonA: 20, lonB: 20)> Size: 160kB\n",
       "True True True True True False True ... True False False False True False True\n",
       "Coordinates:\n",
       "  * latA     (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonA     (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n",
       "  * latB     (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n",
       "  * lonB     (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0
" ], "text/plain": [ " Size: 160kB\n", "True True True True True False True ... True False False False True False True\n", "Coordinates:\n", " * latA (latA) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonA (lonA) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0\n", " * latB (latB) float64 160B -90.0 -80.53 -71.05 -61.58 ... 71.05 80.53 90.0\n", " * lonB (lonB) float64 160B -180.0 -161.1 -142.1 ... 142.1 161.1 180.0" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "da_link = get_link_from_confidence(da_prec_conf, 0.99) & get_link_from_confidence(da_trig_conf, 0.99)\n", "da_link" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The network density is 41.58%.\n" ] } ], "source": [ "print(f\"The network density is {da_link.sum().values/da_link.size*100:.2f}%.\")" ] } ], "metadata": { "kernelspec": { "display_name": "dominosee-dev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.12" } }, "nbformat": 4, "nbformat_minor": 2 }