Get Renewables Coordinates#

Objective Identify the most representative solar and wind project coordinates per Central African country so other notebooks can reuse consistent lat/long inputs.

Data Requirements & Methods

  • Provide Global-Integrated-Power-April-2025.xlsx under pre-analysis/open-data/input/.

  • The workflow filters the Global Integrated Power dataset for focus countries and renewable technologies, prioritizes projects by status and capacity, and saves both tabular and dictionary outputs.

Overview of Steps

  1. Step 1 - Import core libraries.

  2. Step 2 - List the target countries.

  3. Step 3 - Set up input and output folders.

  4. Step 4 - Load and filter facility data.

  5. Step 5 - Select the most relevant solar and wind projects.

  6. Step 6 - Display and export the summary table.

  7. Step 7 - Format coordinates for downstream scripts.

Step 1 - Import required libraries#

import os
import pandas as pd

Step 2 - Define target countries#

countries = ['Angola', 'Burundi', 'Cameroon', 'Central African Republic', 'Chad', 'Republic of the Congo', 'DR Congo', 'Equatorial Guinea', 'Gabon']

Step 3 - Prepare input and output folders#

folder_input = 'input'
if not os.path.exists(folder_input): os.mkdir(folder_input)
folder_output = 'output'
if not os.path.exists(folder_output): os.mkdir(folder_output)

Step 4 - Load and filter the power facilities data#

path = os.path.join(folder_input, 'Global-Integrated-Power-April-2025.xlsx')
if not os.path.exists(path):
    raise FileNotFoundError(f"File not found: {path}. Please ensure the file exists in the specified directory.")

data_power = pd.read_excel(path, sheet_name='Power facilities', index_col=None, header=[0])

# Print if missing countries
for c in countries:
    if c not in data_power['Country/area'].unique():
        print(f"Warning: {c} not found in the dataset.")

data_power = data_power[data_power['Country/area'].isin(countries)]
print(data_power['Country/area'].unique())
data_power = data_power[data_power['Type'].isin(['solar', 'wind'])]
/opt/anaconda3/envs/HydroData/lib/python3.10/site-packages/openpyxl/worksheet/_read_only.py:85: UserWarning: Unknown extension is not supported and will be removed
  for idx, row in parser.parse():

Step 5 - Extract the most relevant projects#

def get_most_relevant_projects(df, countries, tech_types):
    """
    Extract the most relevant energy project for each country and technology.

    Parameters:
    - df: pandas DataFrame with columns ['Country/area', 'Type', 'Status', 'Capacity (MW)',
                                         'Plant / Project name', 'Latitude', 'Longitude', 'City']
    - countries: list of country names to filter
    - tech_types: list of technology types to check (values from 'Type' column)

    Returns:
    - A DataFrame with the most relevant project per country and technology.
    """

    results = []

    status_priority = ['operating', 'construction', 'pre-construction', 'announced']

    for country in countries:
        for tech in tech_types:
            filtered = df[(df['Country/area'] == country) & (df['Type'] == tech)]

            found = False
            for status in status_priority:
                sub = filtered[filtered['Status'] == status]
                if not sub.empty:
                    top_project = sub.loc[sub['Capacity (MW)'].idxmax()]
                    results.append({
                        'Country': country,
                        'Type': tech,
                        'Plant / Project name': top_project['Plant / Project name'],
                        'Capacity (MW)': top_project['Capacity (MW)'],
                        'Latitude': top_project['Latitude'],
                        'Longitude': top_project['Longitude'],
                        'City': top_project['City']
                    })
                    found = True
                    break  # stop at the first status with a result
                    # Fallback: search all remaining statuses

            if not found:
                remaining_statuses = set(df['Status'].unique()) - set(status_priority)
                sub = filtered[filtered['Status'].isin(remaining_statuses)]
                if not sub.empty:
                    top_project = sub.loc[sub['Capacity (MW)'].idxmax()]
                    results.append({
                        'Country': country,
                        'Type': tech,
                        'Plant / Project name': top_project['Plant / Project name'],
                        'Capacity (MW)': top_project['Capacity (MW)'],
                        'Latitude': top_project['Latitude'],
                        'Longitude': top_project['Longitude'],
                        'City': top_project['City']
                    })

    return pd.DataFrame(results)

projects = get_most_relevant_projects(data_power, countries, ['solar', 'wind'])

Step 6 - Display and save results#

display(projects)
projects.to_csv(os.path.join(folder_output, 'most_relevant_projects_solar_wind.csv'), index=False, encoding='utf-8')

# Print missing countries and technologies
for tech in ['solar', 'wind']:
    temp = projects[projects['Type'] == tech]
    for c in countries:
        if c not in temp['Country'].unique():
            print(f"Warning: No projects found for {c} for {tech}.")
Country Type Plant / Project name Capacity (MW) Latitude Longitude City
0 Angola solar Biópio solar farm 189.0 -12.4667 15.7442 Biópio
1 Angola wind Morro do Ouro wind farm 50.0 -15.2198 12.7258 NaN
2 Burundi solar Bubanza SB solar project 8.0 -3.1479 29.3625 Bubanza
3 Cameroon solar Garoua solar farm 30.0 9.3271 13.3621 Garoua
4 Cameroon wind Cameroon wind farm 100.0 4.8918 12.3598 NaN
5 Central African Republic solar Enerca Danzi solar farm 25.0 4.5167 18.4667 Danzi
6 Chad solar N'Djamena (Qair) solar farm 21.5 12.0905 15.1331 Gassi District
7 Chad wind Centrales d’Energie Renouvelable de N’Djamena ... 100.0 12.1232 15.0712 N'Djamena
8 Republic of the Congo solar District De Kabo solar project 2.0 1.3863 16.3364 NaN
9 DR Congo solar Goma Nyiragongo solar project 1.5 -1.6414 29.1938 Goma
10 Gabon solar Ayémé Plaine solar farm 60.0 0.2293 9.5315 Ayémé Plaine
Warning: No projects found for Equatorial Guinea for solar.
Warning: No projects found for Burundi for wind.
Warning: No projects found for Central African Republic for wind.
Warning: No projects found for Republic of the Congo for wind.
Warning: No projects found for DR Congo for wind.
Warning: No projects found for Equatorial Guinea for wind.
Warning: No projects found for Gabon for wind.

Step 7 - Format project coordinates for downstream scripts#

Transform the summary table into the dictionary structure expected by other notebooks (e.g., get_renewable_ninja_data.ipynb).

def format_project_locations(df_relevant):
    """
    Convert the most relevant projects DataFrame into a dictionary formatted like the original 'locations' dict.

    Returns:
        dict with structure: {tech: {country_key: (lat, lon)}}
    """
    result = {}

    for _, row in df_relevant.iterrows():
        tech = row['Type'].lower()  # e.g., 'solar', 'wind'
        #country_key = row['Country'].lower().replace(' ', '').replace('-', '')  # match the original format
        if tech not in result:
            result[tech] = {}

        result[tech][row['Country']] = (row['Latitude'], row['Longitude'])

    return result

# Format the projects into the desired structure
formatted_projects = format_project_locations(projects)
display(formatted_projects)
{'solar': {'Angola': (-12.4667, 15.7442),
  'Burundi': (-3.1479, 29.3625),
  'Cameroon': (9.3271, 13.3621),
  'Central African Republic': (4.5167, 18.4667),
  'Chad': (12.0905, 15.1331),
  'Republic of the Congo': (1.3863, 16.3364),
  'DR Congo': (-1.6414, 29.1938),
  'Gabon': (0.2293, 9.5315)},
 'wind': {'Angola': (-15.2198, 12.7258),
  'Cameroon': (4.8918, 12.3598),
  'Chad': (12.1232, 15.0712)}}