load_data¶
This module contains functions for loading and processing data from JSON and Excel files.
- prepshot.load_data.check_schema(params_info)[source]¶
Validate that params.json declares a compatible _schema_version. Raises a RuntimeError with a clear migration hint if the file is missing the stamp or carries a different version than this release supports.
- Parameters
params_info (dict) -- Parsed contents of params.json.
- Return type
None
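A minimal sketch of the version gate described above. The field name _schema_version matches the docstring; the supported version value and exact error wording are assumptions for illustration.

```python
# Assumed supported version, for illustration only.
SUPPORTED_SCHEMA_VERSION = 2

def check_schema(params_info):
    """Fail fast with a migration hint instead of a confusing error later."""
    version = params_info.get("_schema_version")
    if version is None:
        raise RuntimeError(
            "params.json carries no _schema_version stamp; "
            "re-export it with this release's input tooling."
        )
    if version != SUPPORTED_SCHEMA_VERSION:
        raise RuntimeError(
            f"params.json is schema v{version}, but this release supports "
            f"v{SUPPORTED_SCHEMA_VERSION}; see the migration notes."
        )
```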
- prepshot.load_data.compute_active_lines(data_store)[source]¶
Stamp the data store with the sparse set of directed transmission lines that actually exist or are buildable.
A pair (z1, z2) is in the active set iff it appears in transmission_existing (with any commission year) OR in transmission_candidates (with capacity_max > 0). Both directions of every undirected line are kept -- the LP models them as separate flow variables so transmission losses on each direction price independently.
On the Thai PCM 472-bus topology this is ~615 directed pairs, vs 472**2 = 222 784 dense bus-pair combinations. The ~360x reduction propagates into trans_export, cap_newline, cap_lines_existing, and the 5 transmission constraint families.
- Parameters
data_store (dict) --
- Return type
None
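The rule above can be sketched as follows. The key shapes -- transmission_existing keyed by (z1, z2, commission_year), transmission_candidates keyed by (z1, z2) mapping to capacity_max, and the stamped key name 'active_lines' -- are assumptions, not the verified internals.

```python
def compute_active_lines(data_store):
    """Sketch: derive the sparse directed-line set from the loaded inputs."""
    active = set()
    # Any existing line is active, regardless of its commission year.
    for (z1, z2, _year) in data_store.get("transmission_existing", {}):
        active.add((z1, z2))
        active.add((z2, z1))  # keep both directions: separate flow variables
    # Candidate lines count only if they can actually be built.
    for (z1, z2), cap_max in data_store.get("transmission_candidates", {}).items():
        if cap_max > 0:
            active.add((z1, z2))
            active.add((z2, z1))
    data_store["active_lines"] = sorted(active)  # assumed key name
```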
- prepshot.load_data.compute_active_zone_tech(data_store)[source]¶
Stamp the data store with the sparse (zone, tech) set derived from the loaded inputs.
A (z, te) pair is "active" iff the tech has nonzero capacity at the zone in some commission year (existing_fleet) OR can be built there per expansion_candidates (capacity_max > 0). For everything else the model would carry pure-zero variables and constraints with no impact on the LP.
On the full Thai PCM (472 zones x 212 techs, 100 064 dense pairs) only ~212 pairs are active -- a ~470x reduction. We compute it here at load time so every downstream caller (incl. the per-PCM-window create_model) reuses the same precomputed sparsity instead of re-deriving it from the fleet dict each time.
- Stamps:
data_store['active_zt'] -- sorted list of (z, te) tuples
data_store['tech_zones'] -- dict te -> [z, ...]
data_store['zone_techs'] -- dict z -> [te, ...]
data_store['active_zt_storage'] -- active_zt filtered to techs in tech_registry flagged is_storage.
- Parameters
data_store (dict) --
- Return type
None
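The stamped keys above can be derived as in this sketch. The input key shapes -- existing_fleet keyed by (zone, tech, year) mapping to capacity, expansion_candidates keyed by (zone, tech) mapping to capacity_max, and tech_registry entries carrying an is_storage flag -- are assumptions.

```python
from collections import defaultdict

def compute_active_zone_tech(data_store):
    """Sketch: stamp the sparse (zone, tech) set and its lookup dicts."""
    active = set()
    for (z, te, _year), cap in data_store.get("existing_fleet", {}).items():
        if cap:  # nonzero capacity in some commission year
            active.add((z, te))
    for (z, te), cap_max in data_store.get("expansion_candidates", {}).items():
        if cap_max > 0:  # buildable at that zone
            active.add((z, te))
    data_store["active_zt"] = sorted(active)

    tech_zones, zone_techs = defaultdict(list), defaultdict(list)
    for z, te in data_store["active_zt"]:
        tech_zones[te].append(z)
        zone_techs[z].append(te)
    data_store["tech_zones"] = dict(tech_zones)
    data_store["zone_techs"] = dict(zone_techs)

    storage = {te for te, info in data_store.get("tech_registry", {}).items()
               if info.get("is_storage")}
    data_store["active_zt_storage"] = [
        (z, te) for z, te in data_store["active_zt"] if te in storage
    ]
```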
- prepshot.load_data.compute_cost_factors(data_store)[source]¶
Calculate cost factors for various transmission investment and operational costs.
- Parameters
data_store (dict) -- Dictionary containing loaded parameters.
- Return type
None
- prepshot.load_data.extract_config_data(config_data)[source]¶
Extract necessary data from configuration settings.
- Parameters
config_data (dict) -- Configuration data for the model.
- Returns
Dictionary containing necessary configuration data.
- Return type
dict
- prepshot.load_data.extract_sets(data_store)[source]¶
Extract simple sets from loaded parameters.
- Parameters
data_store (dict) -- Dictionary containing loaded parameters.
- Return type
None
- prepshot.load_data.load_excel_data(input_folder, params_info, data_store)[source]¶
Load input data based on the provided parameters.
The function dispatches on "format":
"format": "long" (default) -- load from a .csv file in tidy form (dimension columns first, value column last). See read_long_csv().
"format": "table" -- load from a .csv file with multiple value columns, returned as a DataFrame so consumers can use groupby, column-by-name access, etc.
Each entry may also declare "required": false and a "default" value. If the file for an optional parameter is missing, the loader silently substitutes the default (or an empty dict if none is given) and logs a debug message. Required parameters with missing files still terminate the process.
The legacy function name is kept for backwards-compatible imports; despite the _excel_ in the name, all on-disk inputs are CSV as of v1.5.0.
- Parameters
input_folder (str) -- Path to the input folder.
params_info (dict) -- Dictionary containing parameter names and their corresponding file information.
data_store (dict) -- Dictionary to store loaded data.
- Return type
None
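The dispatch and optional-parameter handling can be sketched as below. The file-naming convention (<name>.csv), the _read_long stand-in for read_long_csv(), and the use of plain dict rows in place of the real loader's pandas DataFrame for "table" inputs are all assumptions made to keep the sketch self-contained.

```python
import csv
import logging
import os

log = logging.getLogger(__name__)

def _read_long(path):
    # Stand-in for read_long_csv(): dimension columns first, value column last.
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    header, data = rows[0], rows[1:]
    if len(header) == 2:  # one dimension column -> scalar keys
        return {r[0]: float(r[1]) for r in data}
    return {tuple(r[:-1]): float(r[-1]) for r in data}

def load_inputs(input_folder, params_info, data_store):
    """Sketch of the per-parameter dispatch described above."""
    for name, info in params_info.items():
        path = os.path.join(input_folder, name + ".csv")  # assumed naming
        if not os.path.exists(path):
            if info.get("required", True):
                raise SystemExit(f"required input missing: {path}")
            data_store[name] = info.get("default", {})
            log.debug("optional input %s missing; using default", name)
            continue
        if info.get("format", "long") == "table":
            # Real loader returns a pandas DataFrame here; rows-as-dicts
            # stand in for it in this sketch.
            with open(path, newline="") as f:
                data_store[name] = list(csv.DictReader(f))
        else:
            data_store[name] = _read_long(path)
```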
- prepshot.load_data.load_json(file_path)[source]¶
Load data from a JSON file.
- Parameters
file_path (str) -- Path to the JSON file.
- Returns
Dictionary containing data from the JSON file.
- Return type
dict
- prepshot.load_data.process_data(params_info, input_folder)[source]¶
Load and process data from input folder based on parameters settings.
- Parameters
params_info (dict) -- Dictionary containing parameters information.
input_folder (str) -- Path to the input folder.
- Returns
Dictionary containing processed parameters.
- Return type
dict
- prepshot.load_data.read_long_csv(filename, dropna=True)[source]¶
Read a long-format ("tidy") CSV input file.
The convention is: dimension columns first, value column last. For example a 2-dim input carbon_tax looks like:

zone,year,value
BA1,2020,0
BA1,2025,5
BA2,2020,0

The returned dict matches the shape produced by the wide-format reader for the same parameter:
1 dimension column -> {key: value} (scalar keys)
2+ dimension columns -> {(d1, d2, ...): value} (tuple keys)
The ORDER of the dimension columns in the CSV determines the order of elements in the output keys, so model-side lookups work unchanged regardless of which format the file is in on disk.
- Parameters
filename (str) -- Path to the CSV file.
dropna (bool) -- If True, rows with any NaN are dropped before keying.
- Returns
Mapping from dimension key (or tuple of keys) to the value.
- Return type
dict
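The keying convention above can be implemented as in this sketch. The real reader likely uses pandas (the dropna flag suggests it), so details such as dtype coercion of dimension values may differ; here empty cells stand in for NaN and dimension keys stay strings.

```python
import csv

def read_long_csv(filename, dropna=True):
    """Sketch: tidy CSV (dimension columns first, value column last) -> dict."""
    with open(filename, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        out = {}
        for row in reader:
            if dropna and "" in row:  # empty cell ~ NaN: drop the whole row
                continue
            # Column ORDER in the file determines key element order.
            key = row[0] if len(header) == 2 else tuple(row[:-1])
            out[key] = float(row[-1])
    return out
```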