Assessing Steam Consumption with an Alarm Clock: Step 1 – Working with the raw data and Excel time values

Previously, I shared a technique I learned years ago that allows me to create a picture of the load profile on a steam system by monitoring condensate pump and feedwater pump operation. I first used this technique by connecting alarm clocks across the starter coils of the various pumps and manually logging the accumulated run time.

My current approach involves using a data logger (or a DDC control system for that matter) to monitor condensate pump cycles. I then take the logger data and import it into a spreadsheet to determine the steam consumption and load profile. If I’m using a control system, I typically just do the math inside the control system and present the results in a graphic.

Since others may find this approach to be useful, I thought I would use the next few posts to walk you through the steps in the process of taking the raw data from a logger and converting it to a steam load that can be totalized and to paint a picture of the system load profile. As with most things in engineering, there are many paths to a solution. You may have a different approach that will yield the same results, so don’t be constrained by what I’m showing you; its just an example of an approach that has worked for me.

Here is a picture of the raw data I pulled from my logger around the time I plugged in the CT and installed the logger in the field.


Many times, when I go out in the field to deploy loggers, I actually set them up sitting in the office and let them start logging there. I do that for a number of reasons.

  • For one thing, it lets me verify that they are actually working. Most loggers let you launch them and then stay connected and look at what they are doing. Nothing is more frustrating (or embarrassing) than launching a bunch of loggers and returning a few weeks later to retrieve the data, only to discover that you forgot a step and they have been sitting there idle the whole time.
  • Launching the loggers ahead of time means I don’t necessarily need to have a computer with me in the field. In my specific case, I use a laptop and I tend to have it with me everywhere I go. But if you are working with a desktop system, its not very practical to lug it out to the field with you. And, the reality is that opening up your laptop in a mechanical space exposes it to hazards that you may just as soon avoid, like leaking pipes and tumbles to a concrete floor from an unstable work surface.
  • Another reason for launching loggers ahead of time is that if I have several similar inputs, I can perform a relative calibration. For instance, if I am using a logger to measure the temperature rise across a coil, if I plug both sensors into the logger, put the sensors in a location where they see the same temperature, and log the results, in a perfect world, I would get two data lines that were right on top of each other. In the real world, I usually get two lines that are slightly different, typically a couple tenths of a degree apart for the loggers I use.

Since I know the sensors were seeing the exact same temperature, I can use the average difference between the sensors during this time as a calibration offset when I analyze the data. This is important since I typically am interested in the temperature difference across something in the field and a difference that was the result of a calibration issue rather than a real difference could give me misleading results. For this particular application, I was only using one current transformer (CT) as an input, so relative calibration was not an issue.

  • Its also worth noting that many loggers allow you to launch them and set them up so that they don’t start to actually log data until some time in the future. I’ve actually used that feature to set up loggers in my office and mail them out to someone on a job site, who then simply needs to insert them in the system at the appropriate location. Note that if you do this, you may need to do a separate relative calibration run and save the data for future reference before you send the loggers out. You also want to make sure that the person installing them does not reconfigure the sensors, otherwise your relative calibration and maybe the data will be meaningless.

If you look at the data in the picture above, you will notice that the current appears to be running at about 10 to 11 amps and then drops to only a few hundredths of an amp. At first, you may think that the pump was running at full load and then something unloaded it. While that is a valid conclusion given what you currently know, what you are actually seeing is me plugging the CT into the logger out in the field when I installed it. In other words, the 10-11 amp numbers are the numbers the logger generates if its input is configured for a 20 amp CT and nothing is hooked up.  The hundredths of an amp readings are the value the CT generates with no load; in other words it’s not perfect and even though there is no current flowing through it, you get a little bit of an indication.

If I was going to use current to calculate energy consumption, I may want to log this no load condition for a while, average the values, and use the result as a calibration offset for my data analysis. In this particular case, I’m only using the data to indicate when the pump ran. The picture below illustrates what happened for the first pump cycle.


As you can see, when the pump cycles there is a clear jump from a couple of hundredths of an amp to a bit over 2 amps, so I have a clear indication of what I am trying to detect.

You may also note from looking at my data that I am sampling once every 8 seconds. This is pretty fast, and may be faster than some legacy control systems can sample, hence the need for a data logger even if you have a DDC system to work with.

The reason I selected 8 seconds is that a general rule of thumb for data collection is that you need to sample a disturbance 4 or 5 times faster than it is occurring to create an accurate picture of it. In the case of my condensate pump, I had watched it for a while and discovered that when it ran, it typically ran 30-45 seconds. So, I needed to sample faster than that to capture the data. If for instance, I sampled once every 5 minutes, I could miss an entire cycle unless it happened to be occurring when the logger took its sample.  The bottom line is that if you are using this technique, you need to observe the pumps for a while to make sure that you set up the logger in a way that allows you to capture the data you want.

The down side to sampling fast is that you use up the logger memory very quickly. In this case, sampling every 8 seconds filled the memory up in about 4 days. Thankfully, Erich (my partner in crime on this project) is on site every couple of days and could stop by and retrieve the data with out much fuss. But, were it not for Erich, this could have been a challenge with me being in Portland, Oregon and the project being in St. Louis, Missouri.

Some loggers would let you work around this by having the sample occur only if there was a change of value (termed COV in the jargon of the industry) at the input. For instance, I could set such a logger up to record data if the data changed 1 amp.   Since the pump draws about 1.75 – 2 amps when it runs , the only data points I would get would be when the pump cycled on and when the pump cycled off.

For the cycle in the picture above, that would mean I would have two data points about 8 seconds apart and then nothing until the next cycle, which happened about 8 minutes later. The logger I was using could not be configured to do COV based samples, so I had to sample quickly to make sure I saw the pump operate.

A subtle but potentially important difference between my approach and the COV based approach is that the COV based approach would have told me pretty much exactly how long the pump ran. In other words, it would log exactly (with in the limitations of the logger’s response time) when the pump amperage increased by one or more amps and exactly when it dropped by one or more amps.

My “sample every 8 seconds” approach documents that the pump was running at the time the logger took the data sample. So, all I really know is that the pump was running at Thu 10/8/2009 11:54:32 AM and at Thu 10/8/2009 11:54:40 AM. It may have cycled off and back on between those samples (unlikely, but possible), and it may have been running 7.9 seconds before 11:54:32 AM and for 7.9 seconds after 11:54:40 AM. So the actual run time could have been any thing from a couple of fractions of a second to 31.8 seconds (7.9 plus 8 plus 8 plus 7.9). Since I only needed to know the pump ran, this is not an issue. But if you were making calculations based on how long the pump ran, the sampling time for the non-COV based approach becomes a limitation on the accuracy of your results.

One final but subtle thing to notice about the raw data pertains to the time stamp. Here is a picture of the beginning of the file.


Notice that the time stamp column is labeled GMT-07:00 (Pacific). That’s because I launched the logger using my laptop, which has its clock running based on the Pacific Time zone since that’s were I live. But, as you may recall from earlier dicussion, the project from which the data is taken is in St. Louis, Missouri, which is in the Central Time Zone. So, if I want to correlate this data with what is going on in the building and other data, I need to shift the time stamp by 2 hours (the difference between the Pacific Time Zone and the Central Time Zone).

That’s actually not to hard to do once you understand how Excel deals with time. In Excel, dates are actually formatted serial numbers where each day increments the number by “1” and the first day was Sun 1/1/1900 or January 2, 1904; one is the IBM type machine standard and the other is the Mac type machine standard. The serial number for today (on my IBM type machine) the day I am writing the blog post (Saturday January 23, 2010) is 40,201. Tomorrow, the number would be 40,202 (40,201; the value for today’s date plus “1” the value associated with one day).

The value for one hour is 1/24 or 0.04166667. The value for 1 minute is 1/(24*60) or 0.00069444. Once you understand this concept, you can see that its easy to shift the time by two hours using a formula. Here is is a screen shot of my data with the time shifted. I’ve “clicked” into one of the cells so you can see the formula, which also shows up in the formula window.


So to shift the time, all you need to do is

  1. Insert a new column.
  2. Type the formula in one cell.
  3. Verify that it gives you the result you want.
  4. Cut and paste the formula into the remaining cells in the new column.

The entire process probably takes less time than it took you to read this once you get the hang of it.

Time stamps are more than a data logger issue.  Two other common issues related to this are as follows:

  1. If you are logging data, either with a control system or a logger, and the place you are logging data switches from standard time to daylight savings time, you will need to adjust the data accordingly for all events after the time change.  Some control systems and loggers will do this for you, but many won’t and as a result, you could find yourself wondering why things started happening an hour earlier or later when in fact, nothing changed.
  2. Many control systems, especially older control systems, work with data in their data base in terms of Greenwich Mean Time and make the conversion to display the data in the correct time zone when its written on the local operator workstation.  But, if you export the data as a comma separated value (CSV) file so you can work with it in Excel or another spreadsheet or data base application, the conversion does not take place.  If you don’t realize this, you may think you have discovered a major problem because the air handling systems are starting 7 or 8 hours earlier than they need to when in fact, things are purring along right on schedule; you are just looking at the schedule in the wrong time zone.

Understanding how Excel works with dates and times comes in handy for formatting the time axis in time series graphs, something that you will be doing a lot of if you really get into data logging. To make that easier, I made myself a little spreadsheet that has the values for the common increments of time that I use. It also has a cell where I can type a date or time of interest and see the serial number associated with it in the next cell.


The spreadsheet is not too hard to make, but if you are interested and want to experiment with a copy of mine, I posted it in a Google Documents account associated with the blog and you should be able to download it there by following this link. There won’t be a preview available  when you go there, but the download seems to work, at least when I’ve tested it.  If you page up to the top of the sheet, you’ll find a text box that discusses how Excel deals with time.  I’ve never had a problem with the spreadsheet although I can’t guarantee it’s perfect and am not putting it out there as something I have time to support should you have a problem. But it may be a handy resource for you that I’m happy to share. Feel free to use it as a starting point for your own spreadsheet or just as it is.

In the next post, we’ll look at how you process the raw data to detect a pump cycle. Then we will look at how you count the cycles and convert them into a steam load and load profile.  Until then, happy data logging.


David Sellers
Senior Engineer – Facility Dynamics Engineering

Click here for a recent index to previous posts

This entry was posted in Data Logging. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s