Thailand (single-zone, CEM)¶
A single-zone (Thailand) capacity-expansion run with 13 cascading Mekong-basin reservoirs – Bhumibol, Sirikit, Srinagarind, Vajiralongkorn, Rajjaprabha, etc. Migrated from the v1.4-era xlsx dataset to the v1.9 long-format CSV schema; all per-station reservoir physics (head, tailrace-discharge curves, forebay-volume curves) preserved.
Run the solve from the command line (the dataset is large – 1M+ variables – so a full solve takes ~30 min):
cd examples/thailand
python run.py
The cells below explore the dataset shape and read the result NetCDF. They run fast (no solve).
[1]:
import pathlib
# Walk up to find the example dir (containing config.json + input/).
this_dir = pathlib.Path.cwd()
while not (this_dir / 'config.json').exists() or not (this_dir / 'input').is_dir():
if this_dir == this_dir.parent:
raise RuntimeError(
'Could not locate the example dir; run from inside '
'examples/thailand/.'
)
this_dir = this_dir.parent
INPUT = this_dir / 'input'
print(f'Example dir: {this_dir}')
print(f'Input: {INPUT}')
Example dir: /Users/energy/01-doing/PREP-SHOT-tutorial/PREP-SHOT/examples/thailand
Input: /Users/energy/01-doing/PREP-SHOT-tutorial/PREP-SHOT/examples/thailand/input
Existing fleet (2023)¶
Thailand’s installed capacity in the start year, with Large Hydropower allocated across the 13 reservoir techs by their rated capacity.
[2]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
fleet = pd.read_csv(INPUT / 'tech_existing.csv')
registry = pd.read_csv(INPUT / 'tech_registry.csv')
merged = fleet.merge(registry[['tech', 'carrier']], on='tech')
by_carrier = merged.groupby('carrier')['capacity'].sum().sort_values(ascending=False)
fig, ax = plt.subplots(figsize=(7, 3))
by_carrier.plot.bar(ax=ax, color='steelblue')
ax.set_ylabel('Capacity (MW)')
ax.set_title('Existing capacity by carrier (Thailand, 2023)')
fig.tight_layout()
plt.show()
by_carrier.round(0)
[2]:
carrier
gas 36993.0
coal 6138.0
biomass 4705.0
solar 3186.0
hydro 2925.0
wind 2092.0
pumped_hydro 1000.0
Name: capacity, dtype: float64
Per-station hydropower characteristics¶
Each reservoir is modeled with its own head, capacity, and operational bounds – the feature that distinguishes PREP-SHOT from PyPSA / GenX.
[3]:
head = pd.read_csv(INPUT / 'reservoir_head.csv')
cap = pd.read_csv(INPUT / 'reservoir_capacity_max.csv')
merged = (
head[['tech', 'head']]
.merge(cap[['tech', 'capacity_max']], on='tech')
.sort_values('capacity_max', ascending=False)
)
merged.round(1)
[3]:
| tech | head | capacity_max | |
|---|---|---|---|
| 0 | Bhumibol | 154.0 | 779.2 |
| 10 | Srinagarind | 140.0 | 720.0 |
| 1 | Sirikit | 113.6 | 500.0 |
| 12 | Vajiralongkorn | 92.0 | 300.0 |
| 9 | Rajjaprabha | 94.0 | 240.0 |
| 4 | Pak_Mun | 17.0 | 136.0 |
| 7 | Bang_Lang | 85.0 | 84.0 |
| 2 | Chulabhorn | 70.0 | 40.0 |
| 11 | Tha_Thung_Na | 30.0 | 39.0 |
| 5 | Siridhorn | 42.0 | 36.0 |
| 6 | Ubol_Ratana | 32.0 | 25.2 |
| 8 | Kaeng_Krachan | 58.0 | 19.0 |
| 3 | Nam_Pung | 41.0 | 6.3 |
Inspect a solve (after running python run.py)¶
Once you’ve run the CLI command above and it has written output/baseline.nc, the cell below opens it. Skip if you haven’t run the solve yet.
[4]:
import xarray as xr
nc_path = this_dir / 'output' / 'baseline.nc'
if not nc_path.exists():
print(f'No NetCDF at {nc_path}. Run `python run.py` from {this_dir} first.')
else:
# Try netCDF4 then h5netcdf -- the .nc file is netCDF4-classic and
# may not be readable by every backend on every platform.
ds = None
for engine in ('netcdf4', 'h5netcdf'):
try:
ds = xr.open_dataset(nc_path, engine=engine)
break
except Exception as exc:
last_err = exc
if ds is None:
print(f'Could not open {nc_path} (last error: {last_err}). '
f'Falling back to baseline.xlsx in the cells below.')
else:
print(f'Total cost (NPV): ${float(ds.cost):,.0f}')
print(f'Variables: {sorted(ds.data_vars)[:6]} ...')
Total cost (NPV): $7,317,497,191
Variables: ['carbon', 'carbon_breakdown', 'charge', 'cost', 'cost_fix', 'cost_fix_breakdown'] ...
Common plots from a solved baseline¶
Once python run.py writes output/baseline.nc (and baseline.xlsx), the cells below produce four standard diagnostics:
Installed capacity by technology
Annual generation mix by technology
Daily dispatch profile (stacked area over the year)
Cost breakdown (NPV decomposition)
The cells read from the .xlsx mirror so you don’t need a working netCDF4 build to view results.
[5]:
OUT = this_dir / 'output' / 'baseline.xlsx'
if not OUT.exists():
raise FileNotFoundError(
f'No solved baseline at {OUT}. Run `python run.py` from {this_dir} first.'
)
xl = pd.ExcelFile(OUT)
print('Sheets:', xl.sheet_names[:8], '...')
Sheets: ['trans_export', 'gen', 'install', 'carbon', 'charge', 'cost_var_breakdown', 'cost_fix_breakdown', 'cost_newtech_breakdown'] ...
Installed capacity¶
Single-year, single-zone, so capacity reduces to one horizontal bar per technology.
[6]:
inst = pd.read_excel(xl, 'install')
by_tech = (inst.groupby('tech')['install']
.sum().sort_values(ascending=True))
fig, ax = plt.subplots(figsize=(8, 5))
by_tech.plot.barh(ax=ax, color='steelblue')
ax.set_xlabel('Capacity (MW)')
ax.set_ylabel('')
ax.set_title('Installed capacity in 2023, by tech')
for i, v in enumerate(by_tech.values):
ax.text(v + 100, i, f'{v:,.0f}', va='center', fontsize=8)
plt.tight_layout()
plt.show()
Annual generation mix¶
Annual TWh per technology. Note how much of the load is met by gas and coal — the dataset has no carbon cap, so the dispatch is driven purely by variable cost.
[7]:
gen = pd.read_excel(xl, 'gen')
by_tech_twh = (gen.groupby('tech')['gen']
.sum().div(1e6).sort_values(ascending=True))
fig, ax = plt.subplots(figsize=(8, 5))
by_tech_twh.plot.barh(ax=ax, color='teal')
ax.set_xlabel('Generation (TWh / year)')
ax.set_ylabel('')
ax.set_title('Annual generation by tech')
for i, v in enumerate(by_tech_twh.values):
if v > 0.01:
ax.text(v + 0.5, i, f'{v:,.1f}', va='center', fontsize=8)
plt.tight_layout()
plt.show()
Daily dispatch profile¶
Aggregate hourly generation into daily means and stack by technology to see seasonal dynamics. Hydro output rises with the monsoon; gas plays the swing role; renewables are roughly flat.
[8]:
# gen[hour=1..8760, tech] -> reshape to (day, tech) and stack.
gen_pivot = gen.pivot_table(index='hour', columns='tech',
values='gen', aggfunc='sum').fillna(0)
gen_pivot.index = pd.to_datetime('2022-12-31') + pd.to_timedelta(gen_pivot.index, unit='h')
daily = gen_pivot.resample('D').mean() # MW (daily-mean output)
tech_order = (gen_pivot.sum(axis=0).sort_values(ascending=False).index.tolist())
daily = daily[tech_order]
fig, ax = plt.subplots(figsize=(11, 5))
ax.stackplot(daily.index, daily.T.values, labels=tech_order,
colors=plt.cm.tab20.colors)
ax.set_ylabel('Daily-mean output (MW)')
ax.set_title('Daily dispatch profile, 2023')
ax.legend(loc='upper left', bbox_to_anchor=(1.0, 1.0), fontsize=8, ncol=1)
ax.margins(x=0)
plt.tight_layout()
plt.show()
Cost breakdown¶
All costs are NPV-discounted to the start year. With no expansion (carbon cap is inf in this scenario), cost_newtech and cost_newline are 0 — the model only pays operating costs on the existing fleet.
[9]:
components = ['cost_var', 'cost_fix', 'cost_newtech', 'cost_newline']
values = [pd.read_excel(xl, c).iloc[0, 1] / 1e9 for c in components]
income = pd.read_excel(xl, 'income').iloc[0, 1] / 1e9
total = pd.read_excel(xl, 'cost').iloc[0, 1] / 1e9
labels = components + ['income (-)', 'total']
vals = values + [-income, total]
colors = ['#4477aa', '#66ccee', '#228833', '#ccbb44',
'#ee6677', '#aa3377']
fig, ax = plt.subplots(figsize=(7, 4))
ax.bar(labels, vals, color=colors)
for i, v in enumerate(vals):
ax.text(i, v + 0.05, f'{v:,.2f}', ha='center', fontsize=9)
ax.set_ylabel('NPV cost (B USD)')
ax.set_title('Cost decomposition')
ax.axhline(0, color='black', linewidth=0.5)
plt.tight_layout()
plt.show()
Where to next¶
Quickstart notebook – the canonical 5-minute walkthrough on a smaller 3-zone synthetic dataset.
Southeast Asia notebook – a 5-country regional case with 57 reservoirs.
Model Inputs/Outputs – every file in
input/documented.