Yes Energy News and Insights

Calculated Data Types Now Available in DataSignals® Cloud

As part of Yes Energy’s efforts to enrich customer data collections for grid analytics, forecasting, and modeling, users can now explore the addition of calculated data types to DataSignals® Cloud.

Why Add Calculated Data to the Cloud?

Calculated data types, also known as functional, derived, or feature-engineered data sets, are generated by combining multiple base data types to produce new, meaningful data signals. 

Estimated congestion is one example of a calculated data type. Some Independent System Operators (ISOs), such as the Electric Reliability Council of Texas (ERCOT), don’t report the congestion component of their locational marginal pricing (LMP) data – they only provide the price and the energy components from multiple reports. 

To understand a location’s congestion at a point in time, data scientists, data engineers, or financial traders must manually calculate congestion at the node using the LMP and energy data ERCOT publishes. Or they can leverage the Yes Energy DataSignals API.
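As a rough illustration of the manual route, the sketch below estimates congestion by subtracting the energy component from the LMP. It assumes the published price has no separate loss component to back out, and the record layout (NodePrice and its field names) is hypothetical rather than an ERCOT or Yes Energy schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePrice:
    """One pricing interval for a node (hypothetical field names)."""
    node: str
    interval: str   # interval start, ISO 8601
    lmp: float      # locational marginal price, $/MWh
    energy: float   # system-wide energy component, $/MWh


def estimated_congestion(row: NodePrice) -> float:
    """Estimate congestion as LMP minus the energy component.

    Assumption: the LMP carries no separate loss component, so whatever
    remains after removing energy is attributed to congestion.
    """
    return round(row.lmp - row.energy, 2)


# Illustrative values only, not real market data.
rows = [
    NodePrice("NODE_A", "2024-01-01T00:00", 32.50, 25.00),
    NodePrice("NODE_B", "2024-01-01T00:00", 18.75, 25.00),
]
congestion = {r.node: estimated_congestion(r) for r in rows}
```

A positive value means the node priced above the system energy price in that interval, and a negative value means it priced below.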

The DataSignals API calculates multiple derived datasets, including congestion for each ERCOT node. It enables you to search one location and data type at a time for a specific date range.

While ideal for smaller, targeted use cases, the API has a limited historical bandwidth and throttle limits that dictate how much and how frequently users can pull data. The requested data points are calculated in real time and displayed to the user but aren’t materialized as tables anywhere in the system. Users looking to analyze multiple data sets must pull each separately and concatenate them, creating their own materializations.
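The pull-and-concatenate pattern described above might look like the following sketch. The function fetch_series is a hypothetical offline stand-in for a single request (one location, one data type, one date range), not the DataSignals API itself, and the rows it returns are placeholders.

```python
from itertools import product


def fetch_series(location, data_type, start, end):
    """Hypothetical stand-in for one API request.

    Returns placeholder rows so the sketch runs offline; a real client
    would issue a network call here and be subject to throttle limits.
    """
    return [
        {"location": location, "data_type": data_type, "date": start, "value": 0.0},
        {"location": location, "data_type": data_type, "date": end, "value": 0.0},
    ]


def pull_and_concatenate(locations, data_types, start, end):
    """Issue one request per (location, data type) pair and stack the results."""
    combined = []
    for location, data_type in product(locations, data_types):
        combined.extend(fetch_series(location, data_type, start, end))
    return combined


locations = ["NODE_A", "NODE_B"]
data_types = ["DA_CONGESTION", "RT_CONGESTION"]
table = pull_and_concatenate(locations, data_types, "2024-01-01", "2024-01-02")

# The number of requests grows with locations x data types.
request_count = len(locations) * len(data_types)
```

Because the request count scales with the number of locations times the number of data types, a study spanning every node quickly runs into the throttle limits mentioned above.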

This API’s inherent lack of materialization and limited scalability can bottleneck broader historical studies. DataSignals Cloud, on the other hand, is well suited to facilitate such analyses. 

Introducing Calculated Data Types in DataSignals Cloud

DataSignals Cloud, Yes Energy’s data warehouse, is built on the Snowflake platform. Snowflake excels at storing large amounts of historical data at scale, enabling us to provide customers with a faster, more convenient way to access derived data sets than is possible with just the API. 

Our new calculated data type feature leverages the hundreds of millions of rows of historical data stored in DataSignals Cloud to populate well-organized tables. Because these tables live in the cloud, users can query them to quickly gather all the data necessary to complete their analysis, rather than piecing the information together from multiple data sets extracted with the API.
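To show how a materialized table changes the workflow, the sketch below answers a multi-node question with a single filtered query. It uses Python's built-in sqlite3 module as an in-memory stand-in for the warehouse. The table name est_congestion, its columns, and the values are all hypothetical and do not reflect the actual DataSignals Cloud schema.

```python
import sqlite3

# In-memory database standing in for a warehouse table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE est_congestion ("
    "node TEXT, market TEXT, interval_start TEXT, congestion REAL)"
)
conn.executemany(
    "INSERT INTO est_congestion VALUES (?, ?, ?, ?)",
    [
        # Illustrative values only, not real market data.
        ("NODE_A", "DA", "2024-01-01T00:00", 7.5),
        ("NODE_A", "DA", "2024-01-01T01:00", 2.5),
        ("NODE_B", "DA", "2024-01-01T00:00", -6.25),
        ("NODE_B", "RT", "2024-01-01T00:00", 1.0),
        ("NODE_A", "DA", "2024-01-02T00:00", 100.0),
    ],
)

# One query covers every node for the chosen market and date window.
rows = conn.execute(
    """
    SELECT node, AVG(congestion)
    FROM est_congestion
    WHERE market = ?
      AND interval_start >= ?
      AND interval_start < ?
    GROUP BY node
    ORDER BY node
    """,
    ("DA", "2024-01-01", "2024-01-02"),
).fetchall()
conn.close()
```

The filtering and aggregation happen where the data lives, so there is no per-node request loop and no client-side concatenation step.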

Calculated data types help you simplify your data pipelines.

In the first phase of our calculated data type rollout, DataSignals Cloud has been updated to include 15 new searchable data types. 

First, we’ve added the complete history of calculated congestion for ERCOT’s day-ahead and real-time markets, covering every price node and every historical date-time record.

Sample data of our new calculated day ahead estimated congestion for ERCOT, showing the deep history and data structure.

We’ve also integrated the 13 data types that comprise our calculated weather normals data series across North America, including temperature and dew point norms, cloud cover, wind speed, and other data points critical to understanding the weather patterns and historical trends that inform energy demand predictions.

Sample data of our new calculated weather normals data, showing the various data series and data structure.

Let’s note one significant difference between the data in the API and the more robust historical data in DataSignals Cloud (DS Cloud). While the API pulls data in real time, the calculated data in the cloud is near real time, updated at least once a day. Materializing the vast historical data as tables or views that users can filter poses some challenges in orchestrating data updates, so we opted for daily updates. We are looking to increase the data refresh frequency as we continue to develop this feature, assisted by Right Triangle Consulting.

Streamline Your Operations with Calculated Data Types in DS Cloud

Whether you're a financial trader working in the day-ahead market, an asset developer making decisions about where you want to site a plant, or a wind farm operator evaluating how to best optimize an asset, this new approach to gathering calculated data will streamline your operations. By materializing historical values in the cloud, this new feature saves you time and money and better supports your trading and business decisions.  

With DS Cloud’s derived data, you no longer have to establish multiple data-scraping mechanisms – the complete historical data set is already calculated and available to input into your backcast models, streamlining your energy price and grid condition predictions.

Additionally, the feature reduces your extract-load-transform processes because each derived data type is consolidated into an easy-to-use, familiar table structure. 

Finally, the feature accelerates workflows for those wanting to write the calculated data to their storage systems before conducting grid analytics and backcast modeling.

Further Expansion of Calculated Data on the Horizon

Stay tuned. Phase one of the calculated data rollout includes 15 new data types – the day-ahead and real-time congestion data sets for all ERCOT price nodes and our 13 weather normals data types for all weather stations. These tables are live and available for use, but they’re just the beginning.

Our team will continue to release additional calculated data types that enrich your data collections for grid analytics, forecasting, and modeling purposes. Ultimately, we plan to have calculated datasets available in the DataSignals Cloud for each ISO and Regional Transmission Organization (RTO). 

We also plan to add metadata in the Snowflake metadata views for each derived data point. 

Finally, while the calculated data type tables are currently only available in DataSignals Cloud, we foresee these datasets being available in DataSignals Lake in the near future.

The addition of calculated data types to the DataSignals Cloud is just one way Yes Energy demonstrates its commitment to better data and evolving and expanding our state-of-the-art DataSignals product. We believe this new feature will save you time and money and streamline your forecasting and backcasting operations to help you make better market decisions.

If you’re a current customer, you can learn more about this feature in our documentation and find more in-depth examples, such as code snippets and workflows, in our code repository.

If you want to see more, contact Yes Energy to schedule a demo and learn how our new cloud-based calculated data types can help you leverage the latest power market data to improve your energy price predictions, help you make better business and trading decisions, and drive more revenue. 

About the author: Sam Lockshin is the product manager of the data products at Yes Energy. He has a passion for programmatically delivering Yes Energy’s high-quality power market data catalog to customers so they can achieve their business goals. You can catch him at karaoke, playing piano, or checking out the latest horror flick. You can also find him on LinkedIn.
