Addressing a Data Blind Spot for Clean Energy Modelers

A guest blog from Joshua Rhodes, PhD, Webber Energy Group, The University of Texas at Austin

Having an internet-connected computer in our pockets has given everyone the sense that every data source in the world is already connected. But researchers like me spend quite a bit of our time hunting down or creating data sets that fill the gaps between what we know and what we need to know.

Over the next several months, I’ll be working with Pecan Street to fill one key gap in the energy data world. Specifically, we’ll identify and collect hyperlocal sociodemographic and air quality data and build a modeling overlay that enables grid system modelers to assess the local benefits of shutting down individual power plants.


We’ve known for decades that fossil fuel power plants produce a range of local pollutants that are associated with increased health risks, such as pregnancy complications, respiratory and cardiovascular diseases, and central nervous system diseases. Fortunately, dirty coal power plants are being decommissioned at an increasing rate. As these plants are retired, grid operators and researchers use system models to decide which plants to shut down and when. These models look at the entire electricity grid and take into account things like the economics of individual power plants and transmission system operations. But they typically don’t take into account local environmental or health impacts.

We know that communities of color are disproportionately impacted by these facilities and their pollution. A recent study by Food and Water Watch found that, in Pennsylvania, people of color (about one-fifth of Pennsylvania’s total population) make up close to half of the population living within three miles of existing and proposed power plants. This finding held across every income level. Upper-income communities of color were twice as likely to be near an existing power plant as the whitest, lower-income areas; overwhelmingly white and lower-income communities were about half as likely to live within three miles of a plant.

But these sociodemographic trends and health risks generally aren’t included in the power systems models that utilities and regulators use to design the electric grid. The lack of such information prevents modelers and grid operators from fully taking into account the public health, equity, and associated economic burdens on communities that surround power plants when deciding which plants to retire and which to keep operating.

The reason is that grid modeling research lives in the engineering world, while impact research lives in the health and social sciences. We need to bridge that gap.

Specifically, we need a data layer for grid models that standardizes data on health and racial disparities. Characterizing the impacts of existing individual (and potential future types of) power plants on surrounding communities and assigning those costs to the plant itself will allow the models to better see the hyperlocal social costs of each decision to run, retire, or build a power plant. Once the overlay is developed, modelers can use the information in a variety of ways, such as sorting by direct operational costs and then by community health costs, or vice versa, to seek the lowest-cost future while addressing local inequities.
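To make the sorting idea concrete, here is a minimal sketch in Python of how a modeler might rank plants once each one carries both a direct operating cost and a community health cost. Everything in it is an assumption for illustration: the `Plant` record, the `rank_plants` helper, the plant names, and the dollar figures are hypothetical placeholders, not project data or an actual grid-model interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plant:
    """Hypothetical record for one power plant (illustrative fields only)."""
    name: str
    operating_cost: float  # direct operating cost in $/MWh (made-up value)
    health_cost: float     # community health cost in $/MWh (made-up value)

    @property
    def social_cost(self) -> float:
        # Combined cost the overlay would expose to the grid model.
        return self.operating_cost + self.health_cost


def rank_plants(plants, primary, secondary):
    """Return plants ordered from most to least costly on `primary`,
    breaking ties with `secondary` (both are attribute names)."""
    return sorted(
        plants,
        key=lambda p: (getattr(p, primary), getattr(p, secondary)),
        reverse=True,
    )


# Placeholder plants; the numbers are invented for this example.
PLANTS = [
    Plant("Plant A", operating_cost=30.0, health_cost=5.0),
    Plant("Plant B", operating_cost=25.0, health_cost=40.0),
    Plant("Plant C", operating_cost=30.0, health_cost=20.0),
]

# Sort by direct operating cost, then by community health cost.
by_operating = [p.name for p in rank_plants(PLANTS, "operating_cost", "health_cost")]

# Or the other way around: health cost first, operating cost second.
by_health = [p.name for p in rank_plants(PLANTS, "health_cost", "operating_cost")]

# Or rank on the combined social cost.
by_social = [p.name for p in sorted(PLANTS, key=lambda p: p.social_cost, reverse=True)]

if __name__ == "__main__":
    print("By operating cost:", by_operating)
    print("By health cost:   ", by_health)
    print("By social cost:   ", by_social)
```

In this toy data, the plant that looks cheapest on operating cost alone (Plant B) rises to the top of the list once community health costs are counted, which is the kind of shift the overlay is meant to make visible.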

Accomplishing this will require a significant amount of novel and interdisciplinary research to map individual power plants and their associated operational impacts (risk per MWh of energy produced at a certain time, for example). Creation of a valid, interdisciplinary, data-driven methodology is critical for acceptance and adoption of the methodology and tool.

Despite their good intentions, researchers who model the growth of the electricity grid generally lack the skills necessary to develop these datasets on their own. The list of necessary stakeholders is large and diverse, and translating the lived experience of those impacted by energy infrastructure decisions into a set of data that a computer can use to optimize how the grid will develop, while also decarbonizing, is no small task. But it’s a critical one.

The decisions made by computer models (and the people who rely on them) have real-world impacts. As we move toward new generation resources, like utility-scale renewable energy and distributed energy resources, we have an opportunity and obligation to make sure that everyone impacted by these decisions is heard and acknowledged.