load_data

This module contains functions for loading and processing data from JSON and CSV files.

prepshot.load_data.check_schema(params_info)[source]

Validate that params.json declares a compatible _schema_version.

Raises a RuntimeError with a clear migration hint if the file is missing the stamp or carries a different version than this release supports.
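The check described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name check_schema_sketch, the constant SUPPORTED_SCHEMA_VERSION, and its value are assumptions; only the _schema_version key and the RuntimeError come from the description.

```python
# Hypothetical value; the version a given release supports is not stated here.
SUPPORTED_SCHEMA_VERSION = "2"


def check_schema_sketch(params_info: dict) -> None:
    """Raise RuntimeError unless params_info carries the supported stamp."""
    found = params_info.get("_schema_version")
    if found is None:
        # Missing stamp: the file predates schema versioning.
        raise RuntimeError(
            "params.json has no _schema_version stamp; "
            "migrate the file to the current schema before loading."
        )
    if found != SUPPORTED_SCHEMA_VERSION:
        # Stamp present but not the version this sketch accepts.
        raise RuntimeError(
            f"params.json declares _schema_version={found!r}, but "
            f"{SUPPORTED_SCHEMA_VERSION!r} is required; migrate the file first."
        )
```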

Parameters

params_info (dict) -- Parsed contents of params.json.

Return type

None

prepshot.load_data.compute_active_lines(data_store)[source]

Stamp the data store with the sparse set of directed transmission lines that actually exist or are buildable.

A pair (z1, z2) is in the active set iff it appears in transmission_existing (with any commission year) OR in transmission_candidates (with capacity_max > 0). Both directions of every undirected line are kept: the LP models them as separate flow variables, so transmission losses in each direction are priced independently.

On the Thai PCM 472-bus topology this is ~615 directed pairs, versus 472**2 = 222 784 dense bus-pair combinations. The resulting ~360x reduction propagates into trans_export, cap_newline, cap_lines_existing, and the five transmission constraint families.
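The membership rule above can be sketched as follows. The data shapes are assumptions made for illustration: transmission_existing is taken to be keyed by (z1, z2, commission_year) and transmission_candidates by (z1, z2) with a capacity_max field. The sketch also adds the reverse direction explicitly, which may be unnecessary if the real inputs already list both directions.

```python
def active_lines_sketch(transmission_existing, transmission_candidates):
    """Return the sorted list of directed (z1, z2) pairs that are active."""
    active = set()
    # Existing lines count regardless of their commission year.
    for (z1, z2, _year) in transmission_existing:
        active.add((z1, z2))
        active.add((z2, z1))  # keep the reverse direction as well
    # Candidate lines count only if they can actually be built.
    for (z1, z2), candidate in transmission_candidates.items():
        if candidate.get("capacity_max", 0) > 0:
            active.add((z1, z2))
            active.add((z2, z1))
    return sorted(active)


# Hypothetical three-zone example.
existing = {("A", "B", 2020): 100.0}
candidates = {
    ("B", "C"): {"capacity_max": 50.0},
    ("A", "C"): {"capacity_max": 0.0},  # not buildable, so never active
}
lines = active_lines_sketch(existing, candidates)
```

With three zones the dense set would hold 3**2 = 9 pairs, while the sketch keeps only the four directed pairs backed by an existing or buildable line.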

Parameters

data_store (dict) -- Dictionary containing loaded parameters; stamped in place.

Return type

None

prepshot.load_data.compute_active_zone_tech(data_store)[source]

Stamp the data store with the sparse (zone, tech) set derived from the loaded inputs.

A (z, te) pair is "active" iff the tech has nonzero capacity at the zone in some commission year (existing_fleet) OR can be built there per expansion_candidates (capacity_max > 0). For everything else the model would carry pure-zero variables and constraints with no impact on the LP.

On the full Thai PCM (472 zones x 212 techs, 100 064 dense pairs) only ~212 pairs are active -- a ~470x reduction. We compute it here at load time so every downstream caller (including the per-PCM-window create_model) reuses the same precomputed sparsity instead of re-deriving it from the fleet dict each time.

Stamps:

  • data_store['active_zt'] -- sorted list of (z, te) tuples

  • data_store['tech_zones'] -- dict te -> [z, ...]

  • data_store['zone_techs'] -- dict z -> [te, ...]

  • data_store['active_zt_storage'] -- active_zt filtered to techs in tech_registry flagged is_storage
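A rough sketch of how the four stamps could be derived is given below. The input shapes are assumptions, not the module's documented layout: existing_fleet is taken to be keyed by (zone, tech, commission_year) with a capacity value, expansion_candidates by (zone, tech) with a capacity_max field, and tech_registry by tech name with an is_storage flag.

```python
from collections import defaultdict


def stamp_active_zone_tech_sketch(data_store: dict) -> None:
    """Stamp the sparse (zone, tech) sets onto data_store in place."""
    active = set()
    # Active if the tech has nonzero capacity in some commission year ...
    for (z, te, _year), capacity in data_store["existing_fleet"].items():
        if capacity != 0:
            active.add((z, te))
    # ... or if it can be built at the zone.
    for (z, te), candidate in data_store["expansion_candidates"].items():
        if candidate.get("capacity_max", 0) > 0:
            active.add((z, te))

    active_zt = sorted(active)
    tech_zones = defaultdict(list)
    zone_techs = defaultdict(list)
    for z, te in active_zt:
        tech_zones[te].append(z)
        zone_techs[z].append(te)

    registry = data_store.get("tech_registry", {})
    data_store["active_zt"] = active_zt
    data_store["tech_zones"] = dict(tech_zones)
    data_store["zone_techs"] = dict(zone_techs)
    data_store["active_zt_storage"] = [
        (z, te)
        for z, te in active_zt
        if registry.get(te, {}).get("is_storage", False)
    ]


# Hypothetical two-zone, two-tech example.
store = {
    "existing_fleet": {("Z1", "coal", 2010): 500.0, ("Z2", "coal", 2010): 0.0},
    "expansion_candidates": {
        ("Z2", "battery"): {"capacity_max": 100.0},
        ("Z1", "battery"): {"capacity_max": 0.0},
    },
    "tech_registry": {"coal": {"is_storage": False}, "battery": {"is_storage": True}},
}
stamp_active_zone_tech_sketch(store)
```

Because active_zt is sorted before the two lookup dicts are built, the per-tech and per-zone lists come out in a deterministic order.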

Parameters

data_store (dict) -- Dictionary containing loaded parameters; stamped in place.

Return type

None

prepshot.load_data.compute_cost_factors(data_store)[source]

Calculate cost factors for various transmission investment and operational costs.

Parameters

data_store (dict) -- Dictionary containing loaded parameters.

Return type

None

prepshot.load_data.extract_config_data(config_data)[source]

Extract necessary data from configuration settings.

Parameters

config_data (dict) -- Configuration data for the model.

Returns

Dictionary containing necessary configuration data.

Return type

dict

prepshot.load_data.extract_sets(data_store)[source]

Extract simple sets from loaded parameters.

Parameters

data_store (dict) -- Dictionary containing loaded parameters.

Return type

None

prepshot.load_data.load_excel_data(input_folder, params_info, data_store)[source]

Load input data based on the provided parameters.

The function dispatches on "format":

  • "format": "long" (default) -- load from a .csv file in tidy form (dimension columns first, value column last). See read_long_csv().

  • "format": "table" -- load from a .csv file with multiple value columns, returned as a DataFrame so consumers can use groupby, column-by-name access, etc.

Each entry may also declare "required": false and a "default" value. If the file for an optional parameter is missing, the loader substitutes the default (or an empty dict if none is given) without raising and logs a debug message. Required parameters with missing files still terminate the process.

The legacy function name is kept for backwards-compatible imports; despite the _excel_ in the name, all on-disk inputs are CSV as of v1.5.0.
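The dispatch and fallback rules above can be sketched for a single entry as follows. This is an illustration under stated assumptions: load_entry_sketch and the readers mapping are hypothetical names, the actual file readers are passed in rather than implemented, and the sketch raises FileNotFoundError where the real loader terminates the process.

```python
import logging
import os

logger = logging.getLogger(__name__)


def load_entry_sketch(name, path, entry, readers):
    """Load one parameter, honouring "format", "required" and "default".

    readers is a hypothetical mapping such as
    {"long": read_long_csv, "table": some_table_reader}.
    """
    if not os.path.exists(path):
        if entry.get("required", True):
            # Stand-in for the real loader's process termination.
            raise FileNotFoundError(f"Required input {name!r} not found: {path}")
        logger.debug("Optional input %r missing at %s; using default.", name, path)
        return entry.get("default", {})  # empty dict when no default is given

    fmt = entry.get("format", "long")  # "long" is the documented default
    if fmt not in readers:
        raise ValueError(f"Unknown format {fmt!r} for input {name!r}")
    return readers[fmt](path)
```

Passing the readers in as a mapping keeps the sketch independent of any CSV library; an unknown "format" value fails loudly instead of falling back to a default reader.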

Parameters
  • input_folder (str) -- Path to the input folder.

  • params_info (dict) -- Dictionary containing parameter names and their corresponding file information.

  • data_store (dict) -- Dictionary to store loaded data.

Return type

None

prepshot.load_data.load_json(file_path)[source]

Load data from a JSON file.

Parameters

file_path (str) -- Path to the JSON file.

Returns

Dictionary containing data from the JSON file.

Return type

dict

prepshot.load_data.process_data(params_info, input_folder)[source]

Load and process data from the input folder according to the parameter settings.

Parameters
  • params_info (dict) -- Dictionary containing parameters information.

  • input_folder (str) -- Path to the input folder.

Returns

Dictionary containing processed parameters.

Return type

dict

prepshot.load_data.read_long_csv(filename, dropna=True)[source]

Read a long-format ("tidy") CSV input file.

The convention is: dimension columns first, value column last. For example, a two-dimensional input such as carbon_tax looks like:

zone,year,value
BA1,2020,0
BA1,2025,5
BA2,2020,0

The returned dict matches the shape produced by the wide-format reader for the same parameter:

  • 1 dimension column -> {key: value} (scalar keys)

  • 2+ dimension columns -> {(d1, d2, ...): value} (tuple keys)

The ORDER of the dimension columns in the CSV determines the order of elements in the output keys, so model-side lookups work unchanged regardless of which format the file is in on disk.
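The keying convention can be sketched with the standard library alone. This is an approximation, not the module's reader: the name read_long_csv_sketch is hypothetical, it takes an open text handle rather than a filename, it treats empty cells as the missing values that dropna removes, and the _coerce helper only roughly imitates the type inference a DataFrame-based reader would perform.

```python
import csv
import io


def _coerce(text):
    """Turn a CSV cell into int, float, str, or None for an empty cell."""
    if text.strip() == "":
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def read_long_csv_sketch(handle, dropna=True):
    """Map dimension columns (all but the last) to the value column (last)."""
    reader = csv.reader(handle)
    header = next(reader)
    n_dims = len(header) - 1
    result = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        cells = [_coerce(cell) for cell in row]
        if dropna and any(cell is None for cell in cells):
            continue  # drop rows with any missing cell
        *dims, value = cells
        # One dimension column gives scalar keys; two or more give tuple keys.
        key = dims[0] if n_dims == 1 else tuple(dims)
        result[key] = value
    return result


carbon_tax = read_long_csv_sketch(
    io.StringIO("zone,year,value\nBA1,2020,0\nBA1,2025,5\nBA2,2020,0\n")
)
```

For a file on disk, the same sketch can be called as read_long_csv_sketch(fh) inside a with open(filename, newline="") as fh: block.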

Parameters
  • filename (str) -- Path to the CSV file.

  • dropna (bool) -- If True, rows with any NaN are dropped before keying.

Returns

Mapping from dimension key (or tuple of keys) to the value.

Return type

dict