Preanalysis Folder Overview
The pre-analysis/ workspace separates exploratory data ingestion from model-ready processing.
- Use open-data/ to download, QA, and harmonize external datasets.
- Use prepare-data/ to turn those datasets into Electricity Planning Model (EPM) inputs such as pAvailability, pVREgenProfile, and demand profiles.
Objective
Produce clean, versioned inputs for EPM by:
- curating third-party climate, hydro, renewable, and generation data (open-data stage)
- reshaping and validating those datasets against the current perimeter (prepare-data stage)
- exporting consistent CSVs that match the structure in epm/input/data_capp
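One way to check that an exported CSV matches the structure of its counterpart in the model input folder is to compare column headers. The following is a minimal sketch, not the repository's own validation: the helper names `read_header` and `header_mismatch` are hypothetical, and it only compares the first row of each file.

```python
import csv
from pathlib import Path


def read_header(csv_path: Path) -> list[str]:
    """Return the first row of a CSV file as a list of column names."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


def header_mismatch(candidate: Path, reference: Path) -> dict[str, list[str]]:
    """Compare the columns of a candidate export against a reference file.

    Returns the columns missing from, and unexpected in, the candidate.
    Both lists are empty when the two headers contain the same columns.
    (Hypothetical helper, for illustration only.)
    """
    cand, ref = read_header(candidate), read_header(reference)
    return {
        "missing": [col for col in ref if col not in cand],
        "unexpected": [col for col in cand if col not in ref],
    }
```

Calling `header_mismatch(new_export, reference_file)` before copying a file into the model inputs gives a quick signal that a column was renamed or dropped; it does not check values, units, or row counts.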
Workspace Layout
| Path | Role | Highlights |
|---|---|---|
| open-data/ | Exploratory notebooks that ingest APIs, shapefiles, and atlas workbooks, plus QA tools (plots, Folium maps). | Renewable Ninja & IRENA harvesters, GRDC inflow prep, hydro basin QA, hydro atlas comparisons. |
| prepare-data/ | Deterministic workflows that reshape curated datasets into EPM-ready CSVs and diagnostics. | Climatic overview, load profile builders, representative days, hydro availability, supply-demand balance checks. |
prepare-data workflows
| Notebook / Module | Purpose | Key outputs |
|---|---|---|
| | Profiles ERA5-Land temperature/precipitation to define seasons, wet/dry periods, and candidate representative years for each zone. | Climate diagnostics in |
| | Builds hourly demand profiles by fusing monthly means with hourly shapes and sanity checks. | Hourly |
| | Cleans historical load measurements (outlier removal, missing-data infill) before feeding the builder. | Treated historical series in |
| | Generates forecast and QA plots for stakeholder review (peak vs average, growth trends). | PNG/HTML dashboards in |
| | Clusters climate and load time series to produce reduced time slices. | |
| | Checks that the supply fleet plus renewables meet the treated demand under each scenario; flags deficits before GAMS runs. | Balance tables/plots in |
| | Converts monthly hydro shapes into reservoir | Final hydro CSVs under |
| | Experimental picker for representative hydropower years; use to sample dry/baseline/wet seasons before exporting availability tables. | Candidate |
| | Shared helpers for ERA5 extraction, aggregation, and plotting. | Imported across notebooks; no standalone output. |
| | Migration scripts that convert historic SPLAT/EPM spreadsheets into the current column naming. | Intermediate CSVs stored locally before copying to |
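The idea of fusing monthly means with hourly shapes can be illustrated with a small standalone sketch. This is not the notebook's actual code: the function names are hypothetical, and it assumes a single 24-value daily shape applied to every month, which is simpler than what a real builder would likely use.

```python
def build_hourly_profile(monthly_mean_mw, hourly_shape):
    """Scale a 24-value daily shape so each month's average equals its mean.

    monthly_mean_mw: dict mapping month number (1-12) to mean demand in MW.
    hourly_shape: 24 non-negative weights describing the intra-day pattern.
    Returns a dict mapping (month, hour) to demand in MW.
    (Hypothetical helper; one shape for all months is an assumption.)
    """
    if len(hourly_shape) != 24:
        raise ValueError("hourly_shape must contain 24 values")
    shape_mean = sum(hourly_shape) / 24
    if shape_mean <= 0:
        raise ValueError("hourly_shape must have a positive mean")
    normalized = [w / shape_mean for w in hourly_shape]  # averages to 1.0
    return {
        (month, hour): mean_mw * normalized[hour]
        for month, mean_mw in monthly_mean_mw.items()
        for hour in range(24)
    }


def check_monthly_means(profile, monthly_mean_mw, tol=1e-6):
    """Sanity check: each month's hourly values average back to its mean."""
    for month, mean_mw in monthly_mean_mw.items():
        avg = sum(profile[(month, h)] for h in range(24)) / 24
        if abs(avg - mean_mw) > tol * max(1.0, abs(mean_mw)):
            return False
    return True
```

Normalizing the shape to an average of 1.0 before scaling keeps the monthly energy unchanged, so the sanity check reduces to comparing averages.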
Inputs & outputs: Every subfolder follows the same rule: drop raw/intermediate assets into input/, and keep notebook-produced artifacts inside output/ until you promote them into epm/input.
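The promotion step from output/ into the model inputs could be scripted along the following lines. This is only a sketch under stated assumptions: `promote_outputs` is a hypothetical helper, the `*.csv` pattern is illustrative, and refusing to overwrite existing files is a design choice made here, not a documented project rule.

```python
import shutil
from pathlib import Path


def promote_outputs(output_dir: Path, target_dir: Path, pattern: str = "*.csv") -> list[Path]:
    """Copy matching files from an output/ folder into a target input folder.

    Files already present in target_dir are never overwritten; the function
    raises FileExistsError before copying anything if a name clashes.
    (Hypothetical helper, for illustration only.)
    """
    matches = sorted(output_dir.glob(pattern))
    clashes = [src.name for src in matches if (target_dir / src.name).exists()]
    if clashes:
        raise FileExistsError(f"refusing to overwrite: {clashes}")
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves timestamps; the originals stay in output/
    return [Path(shutil.copy2(src, target_dir / src.name)) for src in matches]
```

Checking for clashes before any copy happens means a failed promotion leaves the target folder untouched, which makes it easier to review a conflict by hand.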
open-data notebooks
| Notebook | Focus & what you get | Typical outputs |
|---|---|---|
| | Downloads IRENA wind/solar profiles using SPLAT naming, producing hourly capacity-factor tables per zone-season. | CSV grids plus QA plots under |
| | Calls the Renewable Ninja API using coordinates from the generation catalog; writes harmonized solar/wind profiles. | Hourly CF CSVs ( |
| | Builds the coordinate list (lat/lon) from generation assets so Renewable Ninja pulls the right plants. | Coordinate CSV consumed by the Ninja notebook. |
| | Visualizes generation databases on interactive maps to verify coverage and technology tagging. | HTML/PNG maps in |
| | Compares utility capacity factors with the African Hydropower Atlas before adopting Atlas curves. | QA plots plus comparison tables (save manually as needed). |
| | Inspects GRDC catchments and HydroRIVERS shapefiles to link plants with upstream basins. | GeoDataFrames/maps stored under |
| | (WIP) Merges African Hydropower Atlas profiles with Global Hydropower Tracker metadata for a consolidated catalog. | Draft merged tables in |
| | Processes GRDC NetCDF station data, intersects HydroRIVERS, and exports inflow/runoff diagnostics. | Cleaned CSVs/GeoPackages plus Folium maps. |
Use these notebooks when you need to refresh the underlying open datasets. Once the exploratory outputs look correct, feed them into the deterministic routines inside prepare-data/.
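As an illustration of the coordinate-list step described in the table above, the sketch below filters a generation catalog down to unique solar and wind locations. The column names (`plant`, `tech`, `lat`, `lon`) and both function names are assumptions made for this example; the actual catalog schema may differ.

```python
import csv
import io


def build_coordinate_list(catalog_rows, techs=("solar", "wind")):
    """Return unique coordinate rows for the requested technologies.

    catalog_rows: iterable of dicts with the hypothetical keys
    'plant', 'tech', 'lat', 'lon'. Rows with missing, non-numeric, or
    out-of-range coordinates are skipped.
    """
    seen, coords = set(), []
    for row in catalog_rows:
        tech = (row.get("tech") or "").strip().lower()
        if tech not in techs:
            continue
        try:
            lat, lon = float(row["lat"]), float(row["lon"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinates
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue  # outside the valid geographic range (also drops NaN)
        key = (tech, round(lat, 4), round(lon, 4))
        if key not in seen:  # keep the first plant seen at each location
            seen.add(key)
            coords.append({"plant": row.get("plant", ""), "tech": tech,
                           "lat": lat, "lon": lon})
    return coords


def to_csv_text(coords):
    """Serialize the coordinate rows to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["plant", "tech", "lat", "lon"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(coords)
    return buffer.getvalue()
```

Deduplicating on rounded coordinates avoids requesting the same profile twice when several catalog entries share a site; the four-decimal rounding is an arbitrary choice for this sketch.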