By Cavan Merski, data analyst, Pecan Street

If you’ve read any Pecan Street blogs or posts, you know we frequently discuss the value our residential energy data brings to energy planners, innovators, and researchers, and that Dataport is the primary portal where our clients and partners access it. 

For researchers, analysts, and product developers seeking to understand residential energy consumption, EV charging, rooftop solar production, and load flexibility, Pecan Street offers several free entry points for exploration.

Over the last few months, we’ve been expanding the places where people can access some of our curated public datasets – free of charge – to help them better understand how to use these datasets for research, analysis, or experimentation. 

In this post, I outline the sample datasets available, how you can work with them, and where you can find tools and resources to accelerate your work.

Kaggle Sample Dataset: 10 Homes, 1-Minute Frequency

Our curated dataset on Kaggle provides minute-by-minute, circuit-level electricity data from three consecutive August days at 10 homes in Austin, Texas. This period includes a Four Coincident Peak (4CP) event, which makes the data well suited to analyzing stress on the ERCOT grid during peak summer demand.

The Kaggle dataset includes:

  • 1-minute resolution, circuit-level data for the whole home, the HVAC air handler and condenser, and rooftop solar
  • Electrical parameters, including Real Power (kW), Apparent Power (kVA), Total Harmonic Distortion (THD), Current (Amps) and Phase Angle (degrees)
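
To show how these parameters relate to one another, here is a minimal Python sketch that derives power factor from real and apparent power and compares it with the cosine of the phase angle. The field names (`real_power_kw`, `apparent_power_kva`, `phase_angle_deg`) and the reading itself are hypothetical placeholders, not the dataset's actual column names or values. For a clean sinusoidal load the two results agree; with significant harmonic distortion they can diverge.

```python
import math


def power_factor(real_kw: float, apparent_kva: float) -> float:
    """True power factor: |real power| / apparent power (0.0 if no load)."""
    if apparent_kva == 0:
        return 0.0
    return abs(real_kw) / apparent_kva


def displacement_power_factor(phase_angle_deg: float) -> float:
    """Power factor implied by the phase angle alone (ignores harmonics)."""
    return math.cos(math.radians(phase_angle_deg))


# Hypothetical single reading; names and values are illustrative only.
reading = {
    "real_power_kw": 2.4,
    "apparent_power_kva": 3.0,
    "phase_angle_deg": 36.87,
}

pf = power_factor(reading["real_power_kw"], reading["apparent_power_kva"])
dpf = displacement_power_factor(reading["phase_angle_deg"])
print(f"power factor: {pf:.3f}, displacement power factor: {dpf:.3f}")
```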

This data can be used to:

  • Develop and test forecasting or disaggregation algorithms
  • Analyze load shape during grid stress periods
  • Model solar generation offset during peak demand
  • Explore correlations between HVAC usage and outdoor temperature
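
As one illustration of the solar offset use case, the sketch below estimates the share of a home's consumption covered by rooftop solar during a peak window. It runs on synthetic data and assumes hypothetical field names (`use_kw`, `solar_kw`), with consumption and generation both recorded as positive values; check the sign conventions in the actual dataset before adapting it.

```python
from datetime import datetime, timedelta


def solar_offset_fraction(rows, start, end):
    """Share of consumption in [start, end) covered by solar generation.

    With evenly spaced 1-minute readings, the ratio of summed kW values
    equals the ratio of energy (kWh) over the window.
    """
    window = [r for r in rows if start <= r["ts"] < end]
    used = sum(r["use_kw"] for r in window)
    generated = sum(r["solar_kw"] for r in window)
    return generated / used if used else 0.0


# Synthetic 1-minute data from 15:00 to 18:00 on an arbitrary day.
day = datetime(2024, 8, 1)
peak_start = day.replace(hour=16)
peak_end = day.replace(hour=17)

rows = []
for minute in range(180):
    ts = day.replace(hour=15) + timedelta(minutes=minute)
    in_peak = peak_start <= ts < peak_end
    rows.append({
        "ts": ts,
        "use_kw": 4.0 if in_peak else 2.0,  # higher load during the peak hour
        "solar_kw": 1.0,                    # flat generation for simplicity
    })

offset = solar_offset_fraction(rows, peak_start, peak_end)
print(f"Solar covered {offset:.0%} of consumption during the peak window")
```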

Extended Dataset Access for University Researchers

Academic researchers affiliated with accredited universities can request free access to a larger sample of Pecan Street’s residential electricity data via the Dataport platform. To qualify, applicants must be students, faculty members, or researchers at a four-year academic institution, and may use the data only for non-commercial educational and research purposes.

This extended dataset includes:

  • Real Power for up to 75 homes across multiple geographic regions (Austin, New York, California)
  • 1-minute and 1-second resolution
  • Expanded circuit coverage, including EV chargers, water heaters, dishwashers, freezers, lighting, solar inverters, HVAC, and more
  • Contextual metadata, including home characteristics and system configurations

This dataset supports in-depth analysis for projects in:

  • Solar and EV integration modeling
  • Load forecasting
  • Demand response
  • Appliance classification and disaggregation
  • Energy efficiency evaluation

Pecan Street GitHub Repository

To support onboarding and analysis, Pecan Street provides a set of open-source example notebooks and code resources via GitHub.

Resources include:

  • Jupyter notebooks for visualizing circuit-level electricity usage
  • Scripts to aggregate, clean, and analyze time-series data
  • Tools for disaggregation and flexible demand modeling
  • SQL schema and database setup guides for working with larger datasets

These resources are ideal for researchers who want to:

  • Build reproducible pipelines for high-frequency time-series analysis
  • Understand how to structure and clean residential energy data
  • Quickly generate visualizations and statistics
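
To give a feel for the aggregation and cleaning step, here is a minimal sketch that downsamples 1-minute readings to 15-minute averages while skipping missing values. It is not taken from the repository; the function name `downsample_mean` and the `(timestamp, kw)` tuple layout are assumptions made for illustration, and it uses only the Python standard library.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def downsample_mean(readings, minutes=15):
    """Average (timestamp, kw) readings into fixed-width buckets.

    Readings whose value is None are treated as missing and skipped.
    Assumes `minutes` divides 60 evenly.
    """
    buckets = defaultdict(list)
    for ts, kw in readings:
        if kw is None:
            continue
        # Floor the timestamp to the start of its bucket.
        floored = ts.replace(
            minute=ts.minute - ts.minute % minutes, second=0, microsecond=0
        )
        buckets[floored].append(kw)
    return {ts: sum(vals) / len(vals) for ts, vals in sorted(buckets.items())}


# Synthetic 30 minutes of data with one missing reading.
start = datetime(2024, 8, 1, 0, 0)
readings = [(start + timedelta(minutes=m), float(m)) for m in range(30)]
readings[3] = (readings[3][0], None)  # simulate a dropped sample

averaged = downsample_mean(readings, minutes=15)
for ts, kw in averaged.items():
    print(ts.isoformat(), round(kw, 3))
```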

Getting Started: A Recommended Workflow

Researchers new to Pecan Street’s data can follow a structured path to explore and analyze the data efficiently:

  1. Download the Kaggle sample – Access the 10-home dataset to explore circuit-level behavior at 1-minute resolution.
  2. Clone the GitHub repository – Run example Jupyter notebooks to understand data formatting, analysis workflows, and common visualizations.
  3. Perform exploratory analysis – Use Python or R to compute daily load shapes, identify peak events, and correlate HVAC or solar output with grid conditions.
  4. Apply for university access – If affiliated with a university, request access to the extended Dataport sample to expand your analysis to more homes, longer time ranges, and more complex applications.
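
The exploratory analysis in step 3 could start with something like the sketch below, which computes an average load shape by hour of day and finds the single highest reading. It uses synthetic data and the Python standard library only; the `(timestamp, kw)` layout and the function names are assumptions for illustration, not the format of the Kaggle files.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_load_shape(readings):
    """Return {hour_of_day: mean kW} from (timestamp, kw) readings."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ts, kw in readings:
        totals[ts.hour] += kw
        counts[ts.hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in sorted(totals)}


def peak_reading(readings):
    """Return the (timestamp, kw) pair with the highest demand."""
    return max(readings, key=lambda r: r[1])


# Synthetic two days of 1-minute data: 1 kW base load, plus 3 kW of
# afternoon load (for example, air conditioning) from 15:00 to 19:00.
start = datetime(2024, 8, 1)
readings = []
for minute in range(2 * 24 * 60):
    ts = start + timedelta(minutes=minute)
    kw = 1.0 + (3.0 if 15 <= ts.hour < 19 else 0.0)
    readings.append((ts, kw))

# Inject one spike at 17:05 on the first day to stand in for a peak event.
spike_index = 17 * 60 + 5
readings[spike_index] = (readings[spike_index][0], 6.5)

shape = hourly_load_shape(readings)
peak_ts, peak_kw = peak_reading(readings)
print(f"Mean load at 03:00: {shape[3]:.2f} kW, at 16:00: {shape[16]:.2f} kW")
print(f"Peak of {peak_kw} kW at {peak_ts.isoformat()}")
```

The same two functions can be pointed at real readings once the files are parsed into timestamp and power pairs.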

Whether you’re an academic researcher, energy analyst, or product developer, these datasets offer a rare opportunity to explore high-resolution, real-world energy data from actual homes.

By starting with the Kaggle sample and exploring the pre-built GitHub notebooks, users can quickly become familiar with the structure and opportunities within the data. For those seeking to scale up their work, the Dataport sample offers a broader dataset that supports advanced research and product development.

For additional assistance or collaboration inquiries, contact info@pecanstreet.org.