load_data¶
This module contains functions for loading and processing data from JSON and Excel files.
- prepshot.load_data.check_schema(params_info)[source]¶
Validate that params.json declares a compatible _schema_version. Raises a RuntimeError with a clear migration hint if the file is missing the stamp or carries a different version than this release supports.
- Parameters
params_info (dict) -- Parsed contents of params.json.
- Return type
None
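A minimal sketch of the version gate described above. The field name _schema_version matches the docstring; the supported version value and exact error wording are assumptions for illustration.

```python
# Assumed supported version, for illustration only.
SUPPORTED_SCHEMA_VERSION = 2

def check_schema(params_info):
    """Fail fast with a migration hint instead of a confusing error later."""
    version = params_info.get("_schema_version")
    if version is None:
        raise RuntimeError(
            "params.json carries no _schema_version stamp; "
            "re-export it with this release's input tooling."
        )
    if version != SUPPORTED_SCHEMA_VERSION:
        raise RuntimeError(
            f"params.json is schema v{version}, but this release supports "
            f"v{SUPPORTED_SCHEMA_VERSION}; see the migration notes."
        )
```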
- prepshot.load_data.compute_active_lines(data_store)[source]¶
Stamp the data store with the sparse set of directed transmission lines that actually exist or are buildable.
A pair (z1, z2) is in the active set iff it appears in transmission_existing (with any commission year) OR in transmission_candidates (with capacity_max > 0). Both directions of every undirected line are kept -- the LP models them as separate flow variables so transmission losses on each direction price independently.
On the Thai PCM 472-bus topology this is ~615 directed pairs, vs 472**2 = 222 784 dense bus-pair combinations. The ~360x reduction propagates into trans_export, cap_newline, cap_lines_existing, and the 5 transmission constraint families.
- Parameters
data_store (dict) --
- Return type
None
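The rule above can be sketched as follows. The key shapes -- transmission_existing keyed by (z1, z2, commission_year), transmission_candidates keyed by (z1, z2) mapping to capacity_max, and the stamped key name 'active_lines' -- are assumptions, not the verified internals.

```python
def compute_active_lines(data_store):
    """Sketch: derive the sparse directed-line set from the loaded inputs."""
    active = set()
    # Any existing line is active, regardless of its commission year.
    for (z1, z2, _year) in data_store.get("transmission_existing", {}):
        active.add((z1, z2))
        active.add((z2, z1))  # keep both directions: separate flow variables
    # Candidate lines count only if they can actually be built.
    for (z1, z2), cap_max in data_store.get("transmission_candidates", {}).items():
        if cap_max > 0:
            active.add((z1, z2))
            active.add((z2, z1))
    data_store["active_lines"] = sorted(active)  # assumed key name
```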
- prepshot.load_data.compute_active_zone_tech(data_store)[source]¶
Stamp the data store with the sparse (zone, tech) set derived from the loaded inputs.
A (z, te) pair is "active" iff the tech has nonzero capacity at the zone in some commission year (existing_fleet) OR can be built there per expansion_candidates (capacity_max > 0). For everything else the model would carry pure-zero variables and constraints with no impact on the LP.
On the full Thai PCM (472 zones x 212 techs, 100 064 dense pairs) only ~212 pairs are active -- a ~470x reduction. We compute it here at load time so every downstream caller (incl. the per-PCM-window create_model) reuses the same precomputed sparsity instead of re-deriving it from the fleet dict each time.
- Stamps:
data_store['active_zt'] -- sorted list of (z, te) tuples
data_store['tech_zones'] -- dict te -> [z, ...]
data_store['zone_techs'] -- dict z -> [te, ...]
data_store['active_zt_storage'] -- active_zt filtered to techs in tech_registry flagged is_storage.
- Parameters
data_store (dict) --
- Return type
None
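The stamped keys above can be derived as in this sketch. The input key shapes -- existing_fleet keyed by (zone, tech, year) mapping to capacity, expansion_candidates keyed by (zone, tech) mapping to capacity_max, and tech_registry entries carrying an is_storage flag -- are assumptions.

```python
from collections import defaultdict

def compute_active_zone_tech(data_store):
    """Sketch: stamp the sparse (zone, tech) set and its lookup dicts."""
    active = set()
    for (z, te, _year), cap in data_store.get("existing_fleet", {}).items():
        if cap:  # nonzero capacity in some commission year
            active.add((z, te))
    for (z, te), cap_max in data_store.get("expansion_candidates", {}).items():
        if cap_max > 0:  # buildable at that zone
            active.add((z, te))
    data_store["active_zt"] = sorted(active)

    tech_zones, zone_techs = defaultdict(list), defaultdict(list)
    for z, te in data_store["active_zt"]:
        tech_zones[te].append(z)
        zone_techs[z].append(te)
    data_store["tech_zones"] = dict(tech_zones)
    data_store["zone_techs"] = dict(zone_techs)

    storage = {te for te, info in data_store.get("tech_registry", {}).items()
               if info.get("is_storage")}
    data_store["active_zt_storage"] = [
        (z, te) for z, te in data_store["active_zt"] if te in storage
    ]
```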
- prepshot.load_data.compute_cost_factors(data_store)[source]¶
Calculate cost factors for various transmission investment and operational costs.
- Parameters
data_store (dict) -- Dictionary containing loaded parameters.
- Return type
None
- prepshot.load_data.extract_config_data(config_data)[source]¶
Extract necessary data from configuration settings.
- Parameters
config_data (dict) -- Configuration data for the model.
- Returns
Dictionary containing necessary configuration data.
- Return type
dict
- prepshot.load_data.extract_sets(data_store)[source]¶
Extract simple sets from loaded parameters.
- Parameters
data_store (dict) -- Dictionary containing loaded parameters.
- Return type
None
- prepshot.load_data.load_excel_data(input_folder, params_info, data_store)[source]¶
Load input data based on the provided parameters.
The function dispatches on "format":
"format": "long" (default) -- load from a .csv file in tidy form (dimension columns first, value column last). See read_long_csv().
"format": "table" -- load from a .csv file with multiple value columns, returned as a DataFrame so consumers can use groupby, column-by-name access, etc.
Each entry may also declare "required": false and a "default" value. If the file for an optional parameter is missing, the loader silently substitutes the default (or an empty dict if none is given) and logs a debug message. Required parameters with missing files still terminate the process.
The legacy function name is kept for backwards-compatible imports; despite the _excel_ in the name, all on-disk inputs are CSV as of v1.5.0.
- Parameters
input_folder (str) -- Path to the input folder.
params_info (dict) -- Dictionary containing parameter names and their corresponding file information.
data_store (dict) -- Dictionary to store loaded data.
- Return type
None
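The dispatch and optional-parameter handling can be sketched as below. The file-naming convention (<name>.csv), the _read_long stand-in for read_long_csv(), and the use of plain dict rows in place of the real loader's pandas DataFrame for "table" inputs are all assumptions made to keep the sketch self-contained.

```python
import csv
import logging
import os

log = logging.getLogger(__name__)

def _read_long(path):
    # Stand-in for read_long_csv(): dimension columns first, value column last.
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    header, data = rows[0], rows[1:]
    if len(header) == 2:  # one dimension column -> scalar keys
        return {r[0]: float(r[1]) for r in data}
    return {tuple(r[:-1]): float(r[-1]) for r in data}

def load_inputs(input_folder, params_info, data_store):
    """Sketch of the per-parameter dispatch described above."""
    for name, info in params_info.items():
        path = os.path.join(input_folder, name + ".csv")  # assumed naming
        if not os.path.exists(path):
            if info.get("required", True):
                raise SystemExit(f"required input missing: {path}")
            data_store[name] = info.get("default", {})
            log.debug("optional input %s missing; using default", name)
            continue
        if info.get("format", "long") == "table":
            # Real loader returns a pandas DataFrame here; rows-as-dicts
            # stand in for it in this sketch.
            with open(path, newline="") as f:
                data_store[name] = list(csv.DictReader(f))
        else:
            data_store[name] = _read_long(path)
```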
- prepshot.load_data.load_json(file_path)[source]¶
Load data from a JSON file.
- Parameters
file_path (str) -- Path to the JSON file.
- Returns
Dictionary containing data from the JSON file.
- Return type
dict
- prepshot.load_data.process_data(params_info, input_folder)[source]¶
Load and process data from input folder based on parameters settings.
- Parameters
params_info (dict) -- Dictionary containing parameters information.
input_folder (str) -- Path to the input folder.
- Returns
Dictionary containing processed parameters.
- Return type
dict
- prepshot.load_data.read_long_csv(filename, dropna=True)[source]¶
Read a long-format ("tidy") CSV input file.
The convention is: dimension columns first, value column last. For example a 2-dim input carbon_tax looks like:

zone,year,value
BA1,2020,0
BA1,2025,5
BA2,2020,0

The returned dict matches the shape produced by the wide-format reader for the same parameter:
1 dimension column -> {key: value} (scalar keys)
2+ dimension columns -> {(d1, d2, ...): value} (tuple keys)
The ORDER of the dimension columns in the CSV determines the order of elements in the output keys, so model-side lookups work unchanged regardless of which format the file is in on disk.
- Parameters
filename (str) -- Path to the CSV file.
dropna (bool) -- If True, rows with any NaN are dropped before keying.
- Returns
Mapping from dimension key (or tuple of keys) to the value.
- Return type
dict
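The keying convention above can be implemented as in this sketch. The real reader likely uses pandas (the dropna flag suggests it), so details such as dtype coercion of dimension values may differ; here empty cells stand in for NaN and dimension keys stay strings.

```python
import csv

def read_long_csv(filename, dropna=True):
    """Sketch: tidy CSV (dimension columns first, value column last) -> dict."""
    with open(filename, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        out = {}
        for row in reader:
            if dropna and "" in row:  # empty cell ~ NaN: drop the whole row
                continue
            # Column ORDER in the file determines key element order.
            key = row[0] if len(header) == 2 else tuple(row[:-1])
            out[key] = float(row[-1])
    return out
```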