Posts Tagged ‘PUE’

PUE lives on with Revised Metric

Monday, June 21st, 2010

When folks from the data center industry got together about 3-5 years ago to create a data center efficiency metric, we knew that we should tie it to the actual work being done within the data center (i.e. transactions per watt, IOPS/watt, FLOPS/watt…). However, every data center, and even more so every computer, does different work and would thus need a different metric. For example, a science research computer might complete one transaction per month, with a lot of network and storage traffic for that one big “transaction”, while an eBay data center might have thousands of transactions per second for one computer system.

So, we came up with a compromise, knowing that all data centers and their workloads were different, yet needing something to push us as an industry to higher efficiency. Well, the holy grail of data center metrics got released: PUE. Yes, Power Usage Effectiveness. It was only a start and a best compromise, and we knew we needed to improve upon it or come up with something better, yet it has had perhaps more influence on the energy efficiency of our data centers than any other metric or industry movement.

Improving PUE only affects the infrastructure side of the data center, not the hardware or software. The IT load always counts as 1.0 and the infrastructure load is whatever sits above that, so the more power the infrastructure uses, the higher the PUE. Our data centers have been averaging above 2.0 (at 2.0 the infrastructure power load is equal to the server load; higher than 2.0 means the infrastructure uses more power than the servers). A recent EPA report covering about 200 data centers across the US last year shows that we are averaging north of 2.0. Other studies show that we had been averaging around 3.0 worldwide, so we have improved greatly but can still improve so much more. While 2.0 is much better than 3.0, using 50% less power for the infrastructure, we know we should be able to achieve PUEs of at most 1.5 anywhere in the world, at any Tier level. At 1.5 the infrastructure uses half as much power as the servers, so at 2.0 we are using double the power we need to support the non-hardware loads, and above 2.0 more than double.
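To make the arithmetic above concrete, here is a minimal sketch in Python. It assumes the usual definition of PUE as total facility power divided by IT power; the function names are my own for illustration and do not come from any standard tool.

```python
def pue_from_loads(it_kw: float, infrastructure_kw: float) -> float:
    """PUE = total facility load / IT load, where total = IT + infrastructure."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return (it_kw + infrastructure_kw) / it_kw


def infrastructure_overhead(pue: float) -> float:
    """Infrastructure power per unit of IT power (the part of PUE above 1.0)."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return pue - 1.0


if __name__ == "__main__":
    # Compare the PUE levels discussed in the post.
    for pue in (3.0, 2.0, 1.5):
        overhead = infrastructure_overhead(pue)
        print(f"PUE {pue:.1f}: {overhead:.1f} kW of infrastructure per kW of IT load")
```

Read this way, going from 3.0 to 2.0 halves the infrastructure overhead (2.0 down to 1.0 per unit of IT load), and going from 2.0 to 1.5 halves it again.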

One problem with the PUE metric is that it is instantaneous: it measures power, not energy. All data center power usage fluctuates with weather and usage, and what we really care about in order to reduce costs is total energy use over a period of time. Energy is power use over time, while power is instantaneous. Measured instantaneously, PUE could be taken on the coldest day of the year, when all systems are running more efficiently, but that is not a good gauge of annual energy use and thus costs. In all of the projects I get involved with, and all of the PUEs I quote, I use total annual energy instead of a one-time power measurement, with the hardware load measured at the rack so that UPS, PDU and other electrical distribution losses are counted within the PUE. So, PUE should be an annual average, and that is exactly what member representatives of Green Grid, SVLG, 7×24, EPA, DOE, USGBC, ASHRAE and UpTime recommended in December of this year at a meeting in DC. I provided recommendations on behalf of the SVLG at this meeting, along with Chris Page, Scott Noteboom and Tim Crawford, with input from Olivie Sanche of Apple and many others.
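The difference between a one-time power reading and an energy-based PUE can be sketched in a few lines of Python. The sample readings below are made-up numbers purely for illustration (four readings standing in for a full year of hourly data), and the function names are hypothetical.

```python
def instantaneous_pue(total_kw: float, it_kw: float) -> float:
    """PUE from a single power reading."""
    return total_kw / it_kw


def energy_based_pue(total_kw_samples, it_kw_samples, hours_per_sample=1.0):
    """PUE from energy: sum power readings over time, then divide totals."""
    if len(total_kw_samples) != len(it_kw_samples):
        raise ValueError("Both series need the same number of samples")
    total_kwh = sum(total_kw_samples) * hours_per_sample
    it_kwh = sum(it_kw_samples) * hours_per_sample
    return total_kwh / it_kwh


# Hypothetical readings: constant IT load, facility load rising with outdoor heat.
it_kw = [1000.0, 1000.0, 1000.0, 1000.0]
total_kw = [1200.0, 1500.0, 1800.0, 1500.0]  # coldest reading first

best_case = min(instantaneous_pue(t, i) for t, i in zip(total_kw, it_kw))
over_period = energy_based_pue(total_kw, it_kw)

if __name__ == "__main__":
    print(f"Best single reading: {best_case:.2f}")
    print(f"Energy-based PUE:    {over_period:.2f}")
```

With these illustrative numbers, quoting the coldest reading would flatter the facility noticeably compared with the energy-based figure over the whole period.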

Essentially the outcome was a revised PUE metric that now measures annual energy usage of infrastructure and IT load, which is fantastic! It also brings a little more clarity on how PUE should be measured and what should and should not be included. (For example, on-site power generation should never reduce one’s PUE, as energy in is energy in, regardless of source.) We’ll soon see PUE and PUE with subscripts 1, 2 & 3, which clarify where the server load was measured (UPS output, PDU or rack). Ideally, we’d all be measuring at the rack input, but many folks do not have this metering & monitoring capability, so the compromise was to accept any of these points of measurement.

Even though the location of measurement will affect the measured PUE (meaning different measuring locations will result in different PUEs for the same data center), at least it’s an improvement, and it will hopefully drive folks to measure at the rack, the most accurate location of measurement. It will also drive us to think about annual usage and costs, not one-time or instantaneous ones, another big improvement in how we think about all buying and operational decisions. These are the keys to improving efficiency and reducing costs: long-term measurement, long-term constant improvements, and buying decisions based on long-term economic analysis.
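Here is a rough Python sketch of why the measurement point matters. The annual energy figures are invented for illustration, and the class and function names are hypothetical; the only thing taken from the discussion above is the mapping of subscripts 1, 2 and 3 to UPS output, PDU and rack.

```python
from dataclasses import dataclass


@dataclass
class AnnualReadings:
    """Annual energy (MWh) metered at each point; all values are illustrative."""
    total_facility_mwh: float
    ups_output_mwh: float   # still includes PDU and cabling losses
    pdu_output_mwh: float   # still includes losses between PDU and rack
    rack_input_mwh: float   # closest to what the IT hardware actually uses


def pue_by_measurement_point(r: AnnualReadings) -> dict:
    """Same facility, same total energy, three different 'IT load' denominators."""
    return {
        "PUE1 (UPS output)": r.total_facility_mwh / r.ups_output_mwh,
        "PUE2 (PDU)": r.total_facility_mwh / r.pdu_output_mwh,
        "PUE3 (rack)": r.total_facility_mwh / r.rack_input_mwh,
    }


readings = AnnualReadings(
    total_facility_mwh=15000.0,
    ups_output_mwh=10500.0,
    pdu_output_mwh=10200.0,
    rack_input_mwh=10000.0,
)
results = pue_by_measurement_point(readings)

if __name__ == "__main__":
    for label, value in results.items():
        print(f"{label}: {value:.3f}")
```

Measuring further upstream counts distribution losses as if they were IT load, so the same data center reports a lower PUE at the UPS output than at the rack.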

Our new PUE metric should re-invigorate the PUE discussions, comparisons, and improvements, perhaps driving us all to lower PUEs, regardless of actual resulting PUE and type of data center. After all, we all gain when we each improve.

Is it possible, a data center PUE of 1.04, today?

Saturday, August 22nd, 2009

I’ve been involved in the design and development of over $6 billion of data centers, maybe about $10 billion now (I lost count after $5 billion a few years ago), so I’ve seen a few things. One thing I do see in the data center industry is, more or less, the same design over and over again. Yes, we push the envelope as an industry, and yes, we do design some pretty cool stuff, but rarely do we sit down with our client, the end-user, and ask them what they really need. They often tell us a certain Tier level or availability they want, and the MWs of IT load to support, but what do they really need? Often everyone in the design charrette assumes what a data center should look like without really diving deep into what is important.

When we do that, we can get some very interesting results. For example, I’ve been fortunate to have been involved with the design of three data centers this year, and in all three we were able to push the envelope of design and ask some of these difficult questions. Rarely did I get the answers from the end-users I wanted to hear, where they really questioned the traditional thinking about what a data center should be and why, but we did get to some unconventional conclusions about what they needed instead of automatically assuming what they needed or wanted. As a consequence, we designed three data centers with low PUEs, or even what I like to call “ultra-low PUEs”, those below 1.10. The first was at 1.08, the next at 1.06 and now we have a 1.046. OK, let’s call it 1.05, since the other two are rounded up as well. (We know we can get that one down to about 1.04 with a few more tweaks to that “what is really needed” question.)

Now, I figured that a PUE of 1.05 was going to take a few years to get to because the hardware needed to improve, i.e. chillers, UPS, transformers, etc. But what I didn’t take into account was that when we really look at what the client needs, not wants, and what we can do to design for efficiency without jumping to the same old way of designing a data center, we can reach some great results. I assume that this principle can apply to almost anything in life.

Now, you ask, how did we get to a PUE of 1.05? Let me hopefully answer a few of your questions: 1) yes, based on annual hourly site weather data; 2) all three have densities of 400-500 watts/sf; 3) all three are roughly Tier III to Tier III+, so all have roughly N+1 (I explain a little more below); 4) all three are in climates that exceed 90F in summer; 5) none use a body of water to transfer heat (i.e. lake, river, etc); 6) all are roughly 10 MW of IT load, so pretty normal size; 7) all operate within TC9.9 recommended ranges except for a few hours a year within the allowable range; and most importantly, 8) all have construction budgets equal to or LESS than standard data center construction. Oh, and one more thing: even though each of these sites has some renewable energy generation, this is not counted in the PUE to reduce it; I don’t believe that is in the spirit of the metric.

Now, for some of the juicy details (email or call me for more or read future blog posts). We questioned what they thought a data center should be. How much redundancy did they really need? Could we exceed ASHRAE TC9.9 recommended or even allowable ranges? Did all the IT load really NEED to be on UPS? Was N+1 really needed during the few peak hours a year, or could we get by with just N during those few peak hours each year and N+1 the rest of the year? And so on. The main point of this blog post is to say that low PUEs, like that of 1.05, can be achieved (yes, been there and done that now) for the same cost or LESS than a standard design, and done TODAY, saving millions of dollars per year in energy, millions of tons of CO2, millions of dollars of capital cost up front, less maintenance, etc. We just need to really dive deep as to what we need, not what we want or think we need, and we’ll be better at achieving great things. Now, I need to apply this concept to other parts of my life; how about you?
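For a feel of the scale involved, here is a back-of-the-envelope Python sketch. It takes the roughly 10 MW IT load and the 1.05 PUE from this post, and then assumes (these are my illustrative assumptions, not project data) a baseline PUE of 2.0, a constant IT load all year, and an electricity price of $0.07 per kWh. The function names are hypothetical.

```python
HOURS_PER_YEAR = 8760


def annual_facility_energy_mwh(it_load_mw: float, pue: float) -> float:
    """Total facility energy per year, assuming a constant IT load."""
    return it_load_mw * pue * HOURS_PER_YEAR


def annual_savings(it_load_mw, baseline_pue, improved_pue, usd_per_kwh):
    """Energy (MWh) and cost (USD) saved per year by lowering the PUE."""
    saved_mwh = (
        annual_facility_energy_mwh(it_load_mw, baseline_pue)
        - annual_facility_energy_mwh(it_load_mw, improved_pue)
    )
    saved_usd = saved_mwh * 1000.0 * usd_per_kwh  # 1 MWh = 1000 kWh
    return saved_mwh, saved_usd


# Assumed inputs: 10 MW IT load, baseline PUE 2.0, improved PUE 1.05, $0.07/kWh.
saved_mwh, saved_usd = annual_savings(10.0, 2.0, 1.05, 0.07)

if __name__ == "__main__":
    print(f"Energy saved per year: {saved_mwh:,.0f} MWh")
    print(f"Cost saved per year:   ${saved_usd:,.0f}")
```

Change the assumed baseline PUE or electricity price to match your own site; the structure of the calculation stays the same.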