So far in this string of posts, I have pointed out some of the advantages of using a scatter plot to analyze building performance data and, via a video clip, illustrated how to create a basic scatter chart of kW vs. outdoor air temperature.

Clearly, there are a few shapes in my data cloud. The question is

Why are there shapes in my data cloud?

In this post and the associated video clip I will illustrate how I go about trying to figure that out and how you can use that information to your benefit once you understand the drivers behind a particular pattern.

## Assessing Real Clouds

For meteorologists, clouds are really important tools for understanding the atmosphere. There is a fairly well defined progression of clouds that will happen as a front approaches and passes. The first time I can say that I really noticed this was during one of our trips to the Oregon coast, which caused me to search for and discover one of the first papers about polar front theory, On the Structure of Moving Cyclones, which was published by J. Bjerknes in 1919.

In that paper, the cloud progression associated with a frontal passage shows up in a number of places including this illustration.

So, by observing cloud progressions, meteorologists can understand what might happen in the immediate future; i.e. they can begin to forecast the weather.

Technology, in particular computer software applying analysis techniques that have been around for a while, allows meteorologists to understand, among other things, why a particular cloud type might exist at a given place at a given time. For example, RAOB imports data from radiosondes and uses that data to forecast what might happen next and where clouds might exist. Here is what the sounding for Salem, Oregon said the atmosphere looked like earlier today (grey lines), forecasted to about now (red lines).

The software suggests that there might be clouds between about 300 and 7,500 feet above ground level (the gray bars on the left side of the graph). If I go “look out the window”, I observe some high clouds, maybe around 8,000 – 10,000 feet to the southwest …

… and some lower clouds to the north.

So, reality matches the projections to some extent.

The projections were based on applying mathematical algorithms to the data from the radiosonde (a.k.a. a weather balloon) to project what type of cloud might be there and where it might lie in the vertical structure of the atmosphere. Currently, I have the software set to use things like the dry bulb temperature’s approach to the wet bulb temperature at a given altitude and the rate of change in temperature with altitude to make these electronic judgments, but there are other options.

My point in bringing this up is twofold. One is that I am probably trying to get you interested in meteorology. I will stop doing that now (for the time being).

But the other, which is more relevant to our discussion, is to illustrate how, for something that is generally pretty important to most of us (the weather forecast), atmospheric scientists (like Meteorologists) apply filters (mathematical algorithms) to field data (atmospheric soundings) to gain insight into what is going on.

As building scientists, we can do the same sort of thing with our data. And it is a lot less complicated than what the Meteorologists have to deal with (sometimes).

## Assessing Building Data Clouds

By applying some basic filters to the data set we were working with in the video in the previous post, you can isolate one of the shapes in the cloud …

from the others.

The data in the upper picture represents the base load; i.e. a consumption pattern that is there even during hours of the day and days of the week when there is not much activity going on in the meeting spaces and ballrooms at the hotel. I isolated it by telling Excel to filter out data points for certain hours of the day and days of the week.
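As a sketch of that kind of hour-of-day and day-of-week filtering, done here in Python rather than Excel, with made-up interval data and illustrative cutoff hours (not the exact ones used in the videos):

```python
from datetime import datetime

# Hypothetical hourly interval records: (timestamp, kW). In practice
# these would come from the utility meter data used in the post.
records = [
    (datetime(2011, 1, 3, 2), 210.0),   # Monday, 2 AM
    (datetime(2011, 1, 3, 14), 480.0),  # Monday, 2 PM
    (datetime(2011, 1, 8, 3), 205.0),   # Saturday, 3 AM
    (datetime(2011, 1, 8, 15), 450.0),  # Saturday, 3 PM
]

def is_base_load_hour(ts):
    """Keep only low-activity hours: late night and early morning.
    The cutoffs here are illustrative assumptions."""
    return ts.hour >= 22 or ts.hour <= 5

# Isolate the base-load shape by dropping the active hours.
base_load = [(ts, kw) for ts, kw in records if is_base_load_hour(ts)]
print(len(base_load))  # 2
```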

Because the isolated data forms a fairly compact cloud, we can do a reasonably good curve fit to it using the Excel trend line feature. And once we have an equation for the line, we can use it to predict the facility base load consumption for a year.
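To illustrate what the trend line feature is doing behind the scenes, here is a minimal least-squares line fit in Python, using made-up (temperature, kW) pairs shaped like an isolated base-load cloud:

```python
# Made-up (outdoor air temperature degF, kW) pairs; these happen to
# fall exactly on a line so the fit is easy to check by eye.
points = [(30.0, 200.0), (40.0, 215.0), (50.0, 230.0),
          (60.0, 245.0), (70.0, 260.0)]

# Standard least-squares formulas for slope and intercept of y = m*x + b,
# the same math behind Excel's linear trend line.
n = len(points)
sum_x = sum(x for x, _ in points)
sum_y = sum(y for _, y in points)
sum_xy = sum(x * y for x, y in points)
sum_xx = sum(x * x for x, _ in points)

slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

print(slope, intercept)  # 1.5 155.0
```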

Specifically, by using hourly weather data from one of a number of sources for the “x” in the equation, we can calculate the kW for that hour; the “y” in the equation. And since we are doing the math on an hour by hour basis, the kW for the hour becomes kWh and we can add it up to come up with the annual energy consumption.
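A sketch of that hour-by-hour projection in Python, assuming a fitted line of the form kW = slope × T + intercept; the coefficients and the flat temperature series below are placeholders, not values from the actual data set:

```python
# Illustrative trend-line coefficients (assumed, not from the real fit).
slope, intercept = 1.5, 155.0

# Stand-in for a real weather file: 8,760 hourly outdoor air
# temperatures for a year. A real series varies hour to hour.
hourly_temps = [45.0] * 8760

# Each predicted kW value applies for one hour, so kW for the hour
# becomes kWh, and summing the hours gives annual consumption.
annual_kwh = sum(slope * t + intercept for t in hourly_temps)
print(annual_kwh)  # 1949100.0
```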

## A Model of Facility Energy Use Based on the Reality of Historical Data

The bottom line is that we were able to develop a relationship between a dependent variable (the kW for the facility) and an independent variable that seems to drive it (the outdoor temperature) based on filtered data from the actual facility for a period of 6 or 7 months. We then were able to use that relationship to project what might happen over the course of a year.

Is our model 100% accurate? Certainly not. For one thing, there are many, many more factors that influence how the facility uses energy aside from outdoor air temperature. Things like occupancy, control process stability and accuracy, outdoor humidity, cloud cover, operator preferences, and changes in equipment performance over time can all come into play and influence the energy pattern. That is why the data is scattered.

And the techniques used to isolate one shape in the cloud from the others are somewhat arbitrary. As you will see in the videos, to some extent, I go on a fishing trip, trying one filter and then another until I have the shape isolated.

But the result is a relationship that is based in the physics of the facility, which we can then use to predict how it might react to outdoor air temperature variations over a longer period of time than our data set covered. That can be a very useful and powerful addition to our retrocommissioning tool set.

## Filtering a Data Set

The trick to isolating the upper data set from the lower data set involves being able to eliminate certain hours of the day and certain days of the week from the data set. Depending on the source of the data you are working with, that may or may not be as easy as it sounds.

When you try to apply filters to a basic data set like the one we were working with in the previous post, a number of issues can come up. Some of them are related to getting to the basic information you need to apply the filters, things like day of the week and hour of the day. Others are related to understanding how the spreadsheet tool you are using works with the data and how the features of the tool can be used to reveal additional information.
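As a sketch of pulling those filter parameters out of raw timestamps, done here in Python (the timestamp format is an assumption about how the data might arrive; in Excel, the WEEKDAY and HOUR functions play the same role):

```python
from datetime import datetime

# Hypothetical raw timestamp strings as they might appear in an
# imported interval data file.
raw = ["1/3/2011 14:00", "1/8/2011 3:00"]

# Parse the text into real date/time values, then derive the
# day-of-week and hour-of-day columns needed for filtering.
parsed = [datetime.strptime(s, "%m/%d/%Y %H:%M") for s in raw]
for ts in parsed:
    print(ts.strftime("%A"), ts.hour)
# Monday 14
# Saturday 3
```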

For example, Excel needs to work with real numbers, which are a kind of abstract concept; we represent numbers using numerals. The human analogy for this is that, for instance, my bride is a very distinct individual; from my perspective, there is nobody like her in the universe. Yet, I reference her as Kathy, Kath, my bride, Love, and Arty (a family nickname she picked up in her late teens due to her love of artichokes).

Returning to the data set, the instantaneous demand measured by an electric meter might be one kilowatt, a mathematical quantity. But it could be represented as 1, or I, or Uno to document it for our use. We are fine with that as long as we are familiar with the notation being used; if I don’t know about the Roman numeral system, then I might not be able to understand I, II, III, etc.

In terms of doing math, Excel only knows about numbers. So unless we do something to tell it that Uno or I or even the numeral 1 represents a mathematical quantity of one kilowatt, it will toss its little computer hands up in the air and refuse to work with it.

Another thing that can mess up the math is a character that is not a number mixed into a data set full of numbers. For example, weather data can have question marks (?) or asterisks (*) embedded in it if a problem like a power outage caused data to be unavailable at that instant. Our eyes might have trouble noticing one question mark in the 43,800 cells we imported when we retrieved a year’s worth of weather data. But Excel will pick right up on it and give us odd, meaningless results.
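A sketch of screening imported values for those non-numeric placeholders before doing any math on them, in Python with hypothetical raw data:

```python
# Hypothetical imported weather values; "?" and "*" mark hours where
# the data was unavailable.
raw_values = ["41.0", "39.2", "?", "38.5", "*", "40.1"]

clean = []
for v in raw_values:
    try:
        clean.append(float(v))
    except ValueError:
        # Flag the bad cell rather than letting it corrupt the math.
        clean.append(None)

# Do the arithmetic only on the values that parsed as numbers.
good = [v for v in clean if v is not None]
print(len(good), sum(good) / len(good))
```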

This first video illustrates how to extract the information you need to determine day of week, time of day, and other useful filtration parameters from a data set. In doing that, I also illustrate some of the issues that can come up with a data set and how to resolve them. Finally, I apply filters to the parameters I have extracted to isolate one shape from another in my data cloud and then come up with an equation that represents it using the Excel trend line feature.

The second video illustrates how I go about using the equation and some other features of Excel to project what *might* happen based on what *has* happened. In doing that, I illustrate how you can go about obtaining a year’s worth of weather data from a number of resources, including how you might target exactly which data set you want to use. Finally, I compare the consumption calculated by the model to the actual consumption in the data set that the model was based on.

So far, the posts and videos have focused on how to use utility data in a scatter plot to project consumption patterns. In the next post, I will show you how you can use similar techniques on a system level data set to extract useful information that can help you identify energy consumption patterns and the savings potential associated with making improvements to the system.

David Sellers

Senior Engineer – Facility Dynamics Engineering