Get Renewables Coordinates#
Objective Identify the most representative solar and wind project coordinates per Central African country so other notebooks can reuse consistent lat/long inputs.
Data Requirements & Methods
Provide
Global-Integrated-Power-April-2025.xlsxunderpre-analysis/open-data/input/.The workflow filters the Global Integrated Power dataset for focus countries and renewable technologies, prioritizes projects by status and capacity, and saves both tabular and dictionary outputs.
Overview of Steps
Step 1 - Import core libraries.
Step 2 - List the target countries.
Step 3 - Set up input and output folders.
Step 4 - Load and filter facility data.
Step 5 - Select the most relevant solar and wind projects.
Step 6 - Display and export the summary table.
Step 7 - Format coordinates for downstream scripts.
Step 1 - Import required libraries#
import os
import pandas as pd
Step 2 - Define target countries#
countries = ['Angola', 'Burundi', 'Cameroon', 'Central African Republic', 'Chad', 'Republic of the Congo', 'DR Congo', 'Equatorial Guinea', 'Gabon']
Step 3 - Prepare input and output folders#
folder_input = 'input'
if not os.path.exists(folder_input): os.mkdir(folder_input)
folder_output = 'output'
if not os.path.exists(folder_output): os.mkdir(folder_output)
Step 4 - Load and filter the power facilities data#
path = os.path.join(folder_input, 'Global-Integrated-Power-April-2025.xlsx')
if not os.path.exists(path):
raise FileNotFoundError(f"File not found: {path}. Please ensure the file exists in the specified directory.")
data_power = pd.read_excel(path, sheet_name='Power facilities', index_col=None, header=[0])
# Print if missing countries
for c in countries:
if c not in data_power['Country/area'].unique():
print(f"Warning: {c} not found in the dataset.")
data_power = data_power[data_power['Country/area'].isin(countries)]
print(data_power['Country/area'].unique())
data_power = data_power[data_power['Type'].isin(['solar', 'wind'])]
/opt/anaconda3/envs/HydroData/lib/python3.10/site-packages/openpyxl/worksheet/_read_only.py:85: UserWarning: Unknown extension is not supported and will be removed
for idx, row in parser.parse():
Step 5 - Extract the most relevant projects#
def get_most_relevant_projects(df, countries, tech_types):
"""
Extract the most relevant energy project for each country and technology.
Parameters:
- df: pandas DataFrame with columns ['Country/area', 'Type', 'Status', 'Capacity (MW)',
'Plant / Project name', 'Latitude', 'Longitude', 'City']
- countries: list of country names to filter
- tech_types: list of technology types to check (values from 'Type' column)
Returns:
- A DataFrame with the most relevant project per country and technology.
"""
results = []
status_priority = ['operating', 'construction', 'pre-construction', 'announced']
for country in countries:
for tech in tech_types:
filtered = df[(df['Country/area'] == country) & (df['Type'] == tech)]
found = False
for status in status_priority:
sub = filtered[filtered['Status'] == status]
if not sub.empty:
top_project = sub.loc[sub['Capacity (MW)'].idxmax()]
results.append({
'Country': country,
'Type': tech,
'Plant / Project name': top_project['Plant / Project name'],
'Capacity (MW)': top_project['Capacity (MW)'],
'Latitude': top_project['Latitude'],
'Longitude': top_project['Longitude'],
'City': top_project['City']
})
found = True
break # stop at the first status with a result
# Fallback: search all remaining statuses
if not found:
remaining_statuses = set(df['Status'].unique()) - set(status_priority)
sub = filtered[filtered['Status'].isin(remaining_statuses)]
if not sub.empty:
top_project = sub.loc[sub['Capacity (MW)'].idxmax()]
results.append({
'Country': country,
'Type': tech,
'Plant / Project name': top_project['Plant / Project name'],
'Capacity (MW)': top_project['Capacity (MW)'],
'Latitude': top_project['Latitude'],
'Longitude': top_project['Longitude'],
'City': top_project['City']
})
return pd.DataFrame(results)
projects = get_most_relevant_projects(data_power, countries, ['solar', 'wind'])
Step 6 - Display and save results#
display(projects)
projects.to_csv(os.path.join(folder_output, 'most_relevant_projects_solar_wind.csv'), index=False, encoding='utf-8')
# Print missing countries and technologies
for tech in ['solar', 'wind']:
temp = projects[projects['Type'] == tech]
for c in countries:
if c not in temp['Country'].unique():
print(f"Warning: No projects found for {c} for {tech}.")
| Country | Type | Plant / Project name | Capacity (MW) | Latitude | Longitude | City | |
|---|---|---|---|---|---|---|---|
| 0 | Angola | solar | Biópio solar farm | 189.0 | -12.4667 | 15.7442 | Biópio |
| 1 | Angola | wind | Morro do Ouro wind farm | 50.0 | -15.2198 | 12.7258 | NaN |
| 2 | Burundi | solar | Bubanza SB solar project | 8.0 | -3.1479 | 29.3625 | Bubanza |
| 3 | Cameroon | solar | Garoua solar farm | 30.0 | 9.3271 | 13.3621 | Garoua |
| 4 | Cameroon | wind | Cameroon wind farm | 100.0 | 4.8918 | 12.3598 | NaN |
| 5 | Central African Republic | solar | Enerca Danzi solar farm | 25.0 | 4.5167 | 18.4667 | Danzi |
| 6 | Chad | solar | N'Djamena (Qair) solar farm | 21.5 | 12.0905 | 15.1331 | Gassi District |
| 7 | Chad | wind | Centrales d’Energie Renouvelable de N’Djamena ... | 100.0 | 12.1232 | 15.0712 | N'Djamena |
| 8 | Republic of the Congo | solar | District De Kabo solar project | 2.0 | 1.3863 | 16.3364 | NaN |
| 9 | DR Congo | solar | Goma Nyiragongo solar project | 1.5 | -1.6414 | 29.1938 | Goma |
| 10 | Gabon | solar | Ayémé Plaine solar farm | 60.0 | 0.2293 | 9.5315 | Ayémé Plaine |
Warning: No projects found for Equatorial Guinea for solar.
Warning: No projects found for Burundi for wind.
Warning: No projects found for Central African Republic for wind.
Warning: No projects found for Republic of the Congo for wind.
Warning: No projects found for DR Congo for wind.
Warning: No projects found for Equatorial Guinea for wind.
Warning: No projects found for Gabon for wind.
Step 7 - Format project coordinates for downstream scripts#
Transform the summary table into the dictionary structure expected by other notebooks (e.g., get_renewable_ninja_data.ipynb).
def format_project_locations(df_relevant):
"""
Convert the most relevant projects DataFrame into a dictionary formatted like the original 'locations' dict.
Returns:
dict with structure: {tech: {country_key: (lat, lon)}}
"""
result = {}
for _, row in df_relevant.iterrows():
tech = row['Type'].lower() # e.g., 'solar', 'wind'
#country_key = row['Country'].lower().replace(' ', '').replace('-', '') # match the original format
if tech not in result:
result[tech] = {}
result[tech][row['Country']] = (row['Latitude'], row['Longitude'])
return result
# Format the projects into the desired structure
formatted_projects = format_project_locations(projects)
display(formatted_projects)
{'solar': {'Angola': (-12.4667, 15.7442),
'Burundi': (-3.1479, 29.3625),
'Cameroon': (9.3271, 13.3621),
'Central African Republic': (4.5167, 18.4667),
'Chad': (12.0905, 15.1331),
'Republic of the Congo': (1.3863, 16.3364),
'DR Congo': (-1.6414, 29.1938),
'Gabon': (0.2293, 9.5315)},
'wind': {'Angola': (-15.2198, 12.7258),
'Cameroon': (4.8918, 12.3598),
'Chad': (12.1232, 15.0712)}}