Aliasing and Other Factors Affecting the Accuracy of Field Data

One of the challenges we face when we work with building systems is capturing enough sufficiently accurate data to paint a true picture of what is going on.  This is harder than it sounds for a number of reasons, including aliasing, which is what I want to focus on in this post to support a subsequent post.  But before I do that, I thought I should highlight and revisit some of  the other factors that can come into play, many of which are discussed in more detail in previous posts I have done.

Absolute Sensor Calibration

The accuracy and calibration of our sensors relative to some sort of standard is an obvious potential source of error.  For example, here are the results of a test I did with a couple of my thermocouples using a dry-well calibrator.


Both of the sensors have identical accuracy specifications, but they vary significantly from each other when compared to the NIST-traceable accuracy of the dry-well calibrator, even though they are within the limits of their accuracy specification.

Relative Accuracy

For many of our assessments, the accuracy of one sensor relative to another sensor is more important than the absolute accuracy of the sensors because we will base our calculations on the difference between two different parameters in an operating system.   For example, we calculate the load on a coil based on the temperature difference:


Or, we enter a pump curve based on the pressure difference we measure across the pump (the difference between the red and black pointers on the gauge in the picture).


By taking the pump reading above with one gauge instead of two, the gauge error is canceled out.

In other words, if the gauge’s rated accuracy is 2% of full scale, and  it turns out that this particular gauge, when tested against a standard, reads 1.2% high across its entire span, then the pump suction and discharge pressure readings will be off by the same amount relative to the standard.  As a result, the difference between them will be an accurate measurement of the pressure difference across the pump.
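To make the cancellation concrete, here is a minimal sketch in Python. The gauge range, the true pressures, and the function name `measured_difference` are assumptions chosen for illustration, and the gauge error is modeled as a constant offset across the span, as described above.

```python
def measured_difference(true_suction, true_discharge,
                        suction_offset, discharge_offset):
    """Return the pressure difference that would be read from the gauge(s).

    Each offset is the gauge error in the same units as the pressures
    (psi here), positive when the gauge reads high.
    """
    suction_reading = true_suction + suction_offset
    discharge_reading = true_discharge + discharge_offset
    return discharge_reading - suction_reading


# Hypothetical numbers: a 0-100 psi gauge that reads 1.2% of full scale high.
FULL_SCALE_PSI = 100.0
offset = 0.012 * FULL_SCALE_PSI  # +1.2 psi everywhere on the dial

# Assumed true pressures, psi (true difference is 45 psi).
true_suction, true_discharge = 20.0, 65.0

# Same gauge used for both readings: the offset shows up twice and cancels.
same_gauge = measured_difference(true_suction, true_discharge, offset, offset)
print(round(same_gauge, 3))
```

Both individual readings are wrong by 1.2 psi in this sketch, but the difference between them still matches the true difference.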

Contrast this with what would happen if two different gauges were used, both of which were rated for 2% full scale accuracy or better and one of which was reading 1.2% high relative to a standard while the other was reading 1.3% low relative to the same standard.  Because the two errors are in opposite directions, they add rather than cancel, and the pressure difference developed from these two readings would be off by about 2.5% of full scale (1.2% + 1.3%).  Which gauge was used to take the high pressure and which gauge was used to take the low pressure only determines whether the difference reads high or low.


HVAC systems are prone to stratification of flow and temperature, especially large air handling systems.  Mixed air plenums are notorious for temperature and velocity stratification as you can see from the test data below, which is the subject of a previous post about stratification in economizer mixing plenums.


And large coils will often show a temperature gradient across them that can vary with the load and the coil circuiting.  This picture, taken by Tim Scruby, another one of the Senior Engineers at FDE, on a site in Virginia, illustrates how poor water distribution in a coil at low flow rates can lead to a very significant temperature gradient across the coil.


For flowing fluids, the shape of the velocity profile can be profoundly impacted by obstructions upstream and downstream of the measuring location.  You can get a sense of this by watching the water flow in this video clip of a stream. Notice how the rocks in the stream create waves and eddies both upstream and downstream of their location.

That means that it may require multiple sensing points distributed across the plenum or face of the coil to paint a representative picture of the actual temperature or flow profile that exists.

Thermal Lags

Thermal lags are another thing that can cause us to be misled by our data.  In this case, the issue is not so much about accuracy as it is about what is going on inside the system at a given time relative to a measurement taken by an instrument that has mass between it and the system.


The mass can be in the form of a well that the sensor is in, which was the case in the example above. Or it may be the mass of the sensor itself as illustrated below.


Either way, mass can impact what we think is going on in a system vs. what is actually going on.
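One common way to picture this effect is to treat the sensor and its well as a simple first-order lag. The sketch below does that; the 60 second time constant, the temperatures, and the function name `lagged_reading` are assumptions for illustration, not measurements from the examples above.

```python
import math


def lagged_reading(initial_reading, fluid_temp, elapsed_s, time_constant_s):
    """First-order response of a sensor to a step change in fluid temperature.

    The reading closes about 63% of the remaining gap for every time
    constant that passes.
    """
    decay = math.exp(-elapsed_s / time_constant_s)
    return fluid_temp + (initial_reading - fluid_temp) * decay


# Hypothetical case: the fluid steps from 55 F to 45 F at time zero and the
# sensor sits in a well with an assumed 60 second time constant.
for t in (0, 30, 60, 180, 300):
    reading = lagged_reading(55.0, 45.0, t, 60.0)
    print(f"{t:>3} s: sensor reads {reading:.2f} F, fluid is 45.00 F")
```

In this sketch the fluid changed instantly, but the logged data would show a gradual change spread over several minutes.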

Sensor Installation Issues

To provide accurate data, many sensors need to be installed in a manner that meets the requirements specified by their manufacturer.  For example, the pressure sensor in this video is position sensitive and needs to either be installed vertically or recalibrated in the installed orientation.

On a recent project, when a differential pressure sensor that was used to control the building’s relief fans was replaced with a sensor similar to the one in the picture, the new sensor was mounted horizontally instead of vertically to facilitate maintenance.  But it was not recalibrated in that position and as a result, it “thought” the building was always positive, which caused the control system to operate the relief fans almost continuously.

Since the envelope leakage in the facility usually handled the relief requirements, the change in the relief fan operating profile was picked up by the facility occupants, mostly because the facility is an energy efficiency center so the technical team is much more engaged with the details of the HVAC system operation than your typical office building occupants.  But, had they not noticed the problem, in addition to creating drafts, it would have increased the operating cost of the facility by about $800 – $1,000 annually.

The Data Sensing Food Chain

Even if you properly address all of the items we have been discussing, there are still a lot of things between the sensors and actuators in the ducts and pipes and the person sitting at the operator work station.  This is illustrated in the slide below and discussed in my post titled 4-20 ma Current Loop Experiments – Thermal Mass Effects under the  Real World Implications topic.


Aliasing, which is the topic I really wanted to focus on in this post, is also something that can have a major impact on the conclusions  you reach from your data set, even if you have addressed all of the items on the preceding list.

Aliasing is what happens when the process you are measuring changes faster than the rate at which you are taking your samples.

Assuming you have good sensors, all you really know about a process when you are logging data is the information that is reported at the time when the sample was taken. For instance, in the example below, if we are sampling the chilled water valve command signal and discharge temperature signal once every 5 minutes, we know what is happening at the points in time marked by the yellow squares.

Excel (which is what I used to create the chart) has made assumptions about what happened between those points in time using curve fitting techniques to create the purple and blue lines that connect the points.


If I were to increase my sampling rate to once a minute, it could turn out that all of the data points would lie on the purple and blue lines, as illustrated below with the red markers.


But it also could turn out that the actual wave form is totally different from what I was deriving from the 5 minute data simply because I was only capturing a few of the data points in the actual wave form.


So for me, aliasing means that the words “sampling time” and “too fast” are mutually exclusive terms until I know what is going on in a process.
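A small numerical experiment shows how convincing the wrong picture can be. The sketch below samples two made-up sine waves every 5 minutes: a fast one with an assumed 4 minute period (think of a hunting valve) and a slow one with a 20 minute period. The periods and names are hypothetical and chosen only to illustrate the effect.

```python
import math


def signal(t_min, period_min):
    """A unit-amplitude sine wave with the given period, evaluated at t_min."""
    return math.sin(2 * math.pi * t_min / period_min)


SAMPLE_INTERVAL_MIN = 5    # logging interval, as in the 5 minute example
TRUE_PERIOD_MIN = 4        # hypothetical fast oscillation actually occurring
APPARENT_PERIOD_MIN = 20   # slow wave the 5 minute samples appear to trace

sample_times = range(0, 61, SAMPLE_INTERVAL_MIN)
fast = [signal(t, TRUE_PERIOD_MIN) for t in sample_times]
slow = [signal(t, APPARENT_PERIOD_MIN) for t in sample_times]

for t, f, s in zip(sample_times, fast, slow):
    print(f"{t:3d} min  fast={f:+.3f}  slow={s:+.3f}")

# The two sets of samples agree to within floating-point rounding.
indistinguishable = all(abs(f - s) < 1e-9 for f, s in zip(fast, slow))
print(indistinguishable)
```

At every 5 minute mark the two waves have the same value, so a trend built from those samples cannot tell a process that cycles every 4 minutes from one that cycles every 20 minutes.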

Putting a Number to the Sampling Time

If you buy into my premise in the preceding paragraph, then you would be inclined to simply set up any trending or data logging operation to gather data at the fastest available sampling rate.  But unfortunately, it’s not that simple.

Fast Sampling and Network Architecture can be Mutually Exclusive

There are commercial control systems out there that can sample a significant number of their data points once a second if they need to.  Typically, all of the controllers on these systems have access to a very fast, peer-to-peer network, meaning they can send or receive data packages quickly and do it on demand, whenever it is necessary to share something or know something.

But, there are also systems out there, especially legacy systems, where the controllers reside on a low speed, polled network.  That means that the rate of transmission for data can be orders of magnitude less than what I alluded to in the preceding paragraph and that some sort of network level device needs to initiate and broker the transaction.  For systems of this type, sampling all of the points in one air handling unit, let alone multiple units or the terminal units associated with the main air handling equipment could be a practical impossibility, potentially crashing the system if you were to try to set it up.

Data Loggers Bridge the Gap

One of the advantages of having some data loggers around even if you have a DDC system is that they typically are capable of sampling rates of once per second or even faster.  So, they can bridge the gap, allowing you to log data for a process at a rate that might not be possible if your only option was to use the DDC system.

But there is a limitation there too:  loggers have limited memory and if you want a continuous data stream from them you have to visit them and pull data from them before their memory is totally full.  The exception to this is some of the new, wireless loggers that are emerging in the marketplace.  But, these are not as common as the more traditional types, require a wireless network and related set-up activities and hardware, and are a bit more expensive.

So if you are using a logger to pick up something that you cannot trend fast enough with your DDC system, then you will be faced with the trade-off between memory and sampling rate.   I discuss this in more detail in a post I did a while back titled Data Logger Duration Times, which includes a tool that you can use to help you understand the trade-offs if you don’t happen to have a logger sitting around that you can plug in and program to get your answer.
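The arithmetic behind the trade-off is simple enough to sketch. The example below is not the tool from that post; the memory size, channel count, and function name `logging_duration_days` are assumptions, and it assumes the memory is shared evenly across channels with one stored value per channel per interval. Check your logger's documentation for its real capacity.

```python
def logging_duration_days(memory_samples, channels, interval_s):
    """Days until memory fills if every channel stores one sample per interval."""
    samples_per_channel = memory_samples // channels
    return samples_per_channel * interval_s / 86_400  # 86,400 seconds per day


# Hypothetical logger: room for 43,000 samples shared across 4 channels.
for interval_s in (1, 2, 60, 300):
    days = logging_duration_days(43_000, 4, interval_s)
    print(f"{interval_s:>4} s interval -> {days:.2f} days before memory is full")
```

Under these assumptions, going from a 1 minute interval to a 1 second interval cuts the time between site visits by a factor of 60.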

Determining an Appropriate Sampling Rate

My general rule of thumb for sampling times on an HVAC process is that I need to sample things at least once a minute until I know what is going on.  I may slow that down after I am sure that there is not something going on that I need to look at requiring even faster sampling rates.

But there are also situations where I know from the start I will need to sample faster than once a minute.  For example, if I am watching a refrigeration system and trying to detect things like the pump down cycle or the operation of hot gas bypass, I may want to sample as fast as once a second.

Another example of a situation where a rapid sampling time is important is an application where I am trying to paint an accurate picture of a chain of events, like the start-up sequence for a chiller or a boiler.  There are events in these sequences that can happen in a matter of seconds and if I sample too slowly, I might miss them.

A recent example of just such a situation occurred when I was trying to identify the cycle rate for a boiler that seemed to be short cycling.  I became aware of the problem while scoping out the hot water system and noticing that the boiler cycled on and off quite frequently.  That caused me to pull out my iPhone and use its stopwatch and lap counter to get a sense of the timeline for the  operating sequence.  Here are the results of that effort.


Notice that the off cycle (Lap 1) is the only event that is much longer than a minute and that the pilot cycle (Lap 3) is 13 seconds long.  So if I really want to capture all of that, I need to log faster than the fastest event.

The Nyquist Sampling Theorem

Thanks to Mr. Nyquist, we can actually determine a sampling rate that will avoid aliasing for our data set if we know a bit about the disturbance we are trying to detect.


(And thanks to Steve Briggs of FDE for reminding me of the name, which I can never seem to remember).

The theorem suggests that we need to sample at a rate that is at least twice as fast as the rate of the disturbance we are trying to understand if we want to avoid aliasing.  While the basis of the theorem is in mathematics, the result is actually fairly intuitive if you think about it a bit.  The document I linked to in the previous sentence does a good job of illustrating that so I will let you take a look at it if you want a bit more clarity.

My point here is that if you know a bit about what you are looking for, you can get a pretty good sense of how fast you need to sample to capture things by applying Mr. Nyquist’s theory.
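As a rough sketch of how that works out in numbers, the two helper functions below (hypothetical names, and assuming the disturbance is roughly periodic) give the longest sampling interval the theorem allows and the period that would show up in the data if you sampled more slowly than that. The 90 second hunting period is a made-up example.

```python
def max_sample_interval(disturbance_period_s):
    """Longest sampling interval that still gives two samples per period."""
    return disturbance_period_s / 2.0


def apparent_period(true_period_s, sample_interval_s):
    """Period the sampled data will appear to have.

    Returns None when every sample lands on the same point of the wave,
    in which case the data would look like a flat line.
    """
    f = 1.0 / true_period_s          # true frequency of the disturbance
    fs = 1.0 / sample_interval_s     # sampling frequency
    alias = abs(f - round(f / fs) * fs)
    if alias < 1e-12:
        return None
    return 1.0 / alias


# Hypothetical control loop hunting with a 90 second period.
print(max_sample_interval(90))              # sample at least this often
print(round(apparent_period(90, 60), 1))    # what 1 minute samples suggest
print(round(apparent_period(90, 30), 1))    # what 30 second samples suggest
```

With these assumed numbers, logging once a minute would make a 90 second oscillation look like a 180 second one, while logging every 30 seconds would show the true period.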

Applying the Concept

If we use the boiler I mentioned above as an example, the fastest event we are trying to capture is the pilot cycle, which lasts about 13 seconds.  So, if we log data every thirteen seconds, we will certainly capture the event with at least one data point being taken at some point during the cycle.

But, what we would not know (if all we had was our data) would be if the cycle lasted a fraction of a second or 12.9999 seconds.  Even if our logger were perfectly synchronized with the start time of the pilot cycle and we took data just as it started and ended, we would not know for sure what happened between those two events, nor would we know what happened next.  Were there multiple events between our two data points and we just missed them?  Did the cycle actually end at 14 seconds?
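One way to put numbers on that uncertainty is to ask what range of durations is consistent with the number of consecutive samples that caught the event. The sketch below assumes evenly spaced samples and a single uninterrupted event; the function name `duration_bounds` and the 2 second comparison case are illustrative assumptions.

```python
def duration_bounds(consecutive_hits, sample_interval_s):
    """Bounds on an event's duration from the samples that caught it.

    An event seen in n consecutive samples lasted at least (n - 1) intervals
    and less than (n + 1) intervals, because it could have started just after
    the previous sample and ended just before the next one.
    """
    if consecutive_hits < 1:
        raise ValueError("the event was never sampled, so nothing can be bounded")
    shortest = (consecutive_hits - 1) * sample_interval_s
    longest = (consecutive_hits + 1) * sample_interval_s
    return shortest, longest


print(duration_bounds(1, 13))  # one hit at a 13 second interval -> (0, 26)
print(duration_bounds(6, 2))   # six hits at a 2 second interval -> (10, 14)
```

A single data point at a 13 second interval only tells us the event lasted somewhere between an instant and just under 26 seconds, while six consecutive points at a 2 second interval narrow that to between 10 and 14 seconds.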

Part of the complexity in this particular case is that the pilot cycle is one event in a string of events.  So, while it lasts 13 seconds and we need to log faster than that to capture it, there are also other events that will happen before it repeats, some of which are very predictable due to the programming of the firing controller (like the pilot cycle), and some of which will vary with other metrics.  For example, the duration of the firing cycle will be driven by the load on the system.

That’s why my little field test using the iPhone was so valuable.  It gave me insight into the nature of the cycle that helped me determine the minimum logger sampling rate based on the shortest element in the sequence.

The Results of the Boiler Logging Effort

Given my field observations, our sampling interval needed to be no longer than about 6.5 seconds (half of the 13 second pilot cycle) based on the Nyquist sampling theorem.  But to get a sharper picture of exactly when the pilot cycle started and ended, we decided to use once every 2 seconds as our sampling rate, at least for the first round of logging.  That would give us 6 or 7 data points during the pilot cycle and define its start and end time within 2 seconds of reality.

To capture the cycle, we decided to log the power draw of the boiler since each event in the cycle would change that.  Flue gas temperature might have been another option and I explore that in the next post.

For this example, when the pilot cycle was triggered, a solenoid valve would be energized, which would create a current spike due to the inrush and also shift the power consumption up slightly.  To make sure we could distinguish minor changes like the solenoid valve, we selected a CT for the logger whose rating was as close as possible to the maximum current draw we anticipated from the boiler.  Here is what our results looked like.


One thing we could see from the data set was that the boiler was obviously short cycling.  But we also could see the signature of every event in the cycle in the current waveform by virtue of our sampling rate and CT selection.
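If you want to count cycles from a logged current trend rather than by eye, a simple threshold test is one place to start. The sketch below is only an illustration: the readings are made up (they are not the logged boiler data), and the 1 amp threshold and the function name `find_cycles` are assumptions you would adjust for your own equipment.

```python
def find_cycles(samples, interval_s, on_threshold_amps):
    """Return (start_s, duration_s) for each run of samples above the threshold.

    Times are relative to the first sample and are only as precise as the
    sampling interval.
    """
    cycles = []
    start = None  # index of the first sample in the current run, if any
    for i, amps in enumerate(samples):
        running = amps > on_threshold_amps
        if running and start is None:
            start = i
        elif not running and start is not None:
            cycles.append((start * interval_s, (i - start) * interval_s))
            start = None
    if start is not None:  # still running when the log ended
        cycles.append((start * interval_s, (len(samples) - start) * interval_s))
    return cycles


# Made-up current readings (amps) taken at a 2 second interval.
log = [0.1, 0.1, 2.4, 2.5, 2.6, 2.5, 0.1, 0.1, 0.1, 2.4, 2.5, 0.1]
print(find_cycles(log, 2, 1.0))  # [(4, 8), (18, 4)]
```

Picking out the individual events inside a cycle, like the small step from a solenoid valve, would need additional thresholds or a look at the change between consecutive samples.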


Bottom Lines

Being able to identify the number of cycles and the length of the purge cycle was valuable information for us because the system is throwing energy up the flue during the purge cycle.  If we could come up with a way to reduce the number of cycles, we would also reduce the energy thrown away during the purge cycle.  The purge cycle is a necessary evil if you don’t like boiler explosions, so you can’t eliminate it, you only can minimize how often it needs to happen.

I will go into that more in my next post.  Meanwhile, I hope this has given you some insight into some of the things you need to consider when you are looking at a dataset for the first time and deciding if you trust it.  And in particular, I hope it has given you some insight into how to determine the sample time for trends and data logging.


David Sellers
Senior Engineer – Facility Dynamics Engineering
