Archive for August, 2010

Smooth-Stone & ARM-based servers

Sunday, August 15th, 2010

In the latest data centers we have helped design, we have achieved design PUEs of 1.04-1.09, which means that the electrical and cooling systems will use, on average over a year, only 4-9% as much energy as the hardware loads (aka IT: servers, storage, network). This is a huge accomplishment, and it doesn’t come without a lot of experience, knowledge and constant effort to make those electrical and cooling systems ultra-efficient. Compare that with average industry PUEs of over 2.0, where the cooling and electrical systems use more energy than the IT load. However, it also means that these data centers are now so efficient that we have to focus on the IT load to have a material effect on energy use, as there is very little left to save on the infrastructure side of the energy demand.
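The PUE arithmetic above can be sketched in a few lines. This is only an illustration, assuming the usual definition of PUE as total facility energy divided by IT energy; the function names are made up for this example and are not from any tool we use.

```python
def overhead_fraction(pue: float) -> float:
    """Infrastructure (electrical + cooling) energy as a fraction of IT energy."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return pue - 1.0


def max_infrastructure_savings(pue: float) -> float:
    """Upper bound on the share of total facility energy that could be saved
    by eliminating the infrastructure overhead entirely."""
    return overhead_fraction(pue) / pue


if __name__ == "__main__":
    # 1.04 and 1.09 are the design PUEs from the post; 2.0 is the industry figure.
    for pue in (1.04, 1.09, 2.0):
        print(f"PUE {pue:.2f}: overhead is {overhead_fraction(pue):.0%} of IT load, "
              f"at most {max_infrastructure_savings(pue):.1%} of total energy left to save")
```

At a PUE of 2.0, half of all the energy goes to overhead. At 1.04, less than 4% does, which is why the IT load is the only place left to find large savings.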

We work with clients to choose the most efficient servers and software solutions, but an entirely game-changing technology is now coming along: chips that use dramatically less power, about 1/20th that of existing technologies, and, more importantly, use even less energy because they can turn off when not in use and immediately turn back on when processing demand increases. Enter ARM-based processors for servers, the same technology used in our mobile devices today, which use much less power and turn on and off much more quickly than today’s server processors.

Microsoft made a statement that it has been working with ARM-based chips since 1997 and is now working with ARM under a new licensing agreement “to enhance our research and development activities for ARM-based products”. Quite a change from the strong partnership Microsoft has had with x86 chips from Intel and Advanced Micro Devices. Even Apple has been acquiring companies and hiring ARM chip engineers and specialists, likely for use in its growing line of smaller devices: the iPad, iPhone, iPod, etc.

Startup SeaMicro unveiled a server running on Intel Atom chips with a fabric that puts CPUs to sleep, allowing for a lower-energy rack. They claim they can fit as many as 2,048 CPUs into a 40U rack and use 8 kW of power. These calculations seem to assume that half of the processors in each rack are off. That is a good improvement over standard racks, but can we do even better?
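To put those rack figures in per-processor terms, here is a rough sketch. It makes a big simplifying assumption: all rack power is attributed to the active CPUs, so sleeping CPUs, fans, storage and networking are treated as drawing nothing. The function name is made up for this illustration.

```python
def watts_per_active_cpu(rack_watts: float, cpus: int, active_fraction: float = 1.0) -> float:
    """Average rack power per active CPU.

    Simplification: sleeping CPUs and all non-CPU components are assumed
    to draw no power, so this overstates what each CPU itself consumes.
    """
    if not 0.0 < active_fraction <= 1.0:
        raise ValueError("active_fraction must be in (0, 1]")
    return rack_watts / (cpus * active_fraction)


RACK_WATTS = 8_000  # 8 kW per rack, as claimed
CPUS = 2_048        # CPUs per 40U rack, as claimed

all_on = watts_per_active_cpu(RACK_WATTS, CPUS)
half_on = watts_per_active_cpu(RACK_WATTS, CPUS, active_fraction=0.5)
print(f"All CPUs on:  {all_on:.2f} W per CPU")
print(f"Half CPUs on: {half_on:.2f} W per active CPU")
```

Under these assumptions the claim works out to about 3.9 W per CPU if every processor is on, or about 7.8 W per active CPU if half are asleep.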

Enter Smooth-Stone with an even more advanced approach: ARM-based processors designed for servers. The company is working to build a rack with a similar number of processors, but because they are ARM-based, these processors can be shut off and turned back on much more rapidly. They should also be able to withstand much higher temperatures, vibration and other stresses; after all, they come from mobile applications where these are requirements, unlike the soft-glove approach in most data centers. Smooth-Stone’s ARM-based servers should provide significantly more onboard memory, compute performance and network bandwidth, lower costs and many other advantages in a chip that uses only single-digit watts, giving a significant performance-per-watt advantage. Plus, being able to quickly turn off every processor in a rack except one when the rack is not needed, with that one running at 1/10 to 1/20th or even less of the power of a traditional processor, and then bringing the rack’s processors back up individually within microseconds, is a distinct advantage in energy savings. Energy is the largest operating expense and driver of capital expense in data centers.

With Smooth-Stone’s SoC, software and experienced team designing the lowest-power servers, this may well be the game-changing technology in IT and data centers.

Smaller, modular data centers & data center news and job post

Tuesday, August 10th, 2010

As I’ve worked in the data center industry for over 12 years, I’ve seen data centers get larger and larger. When I was with Sun Microsystems, we had over 1,000 data center closets, labs and rooms, but no large data centers. There were many challenges to providing and maintaining all of these “mini” data centers, each wanting its own UPS and generator support while needing to run house air conditioning units 24/7 in office buildings that should have been shut off on weekends and evenings. I ran the numbers and realized we could supply all of these needs in a larger, shared data center for a much lower total cost. I proposed to Michael Lehman, then Sun’s CFO, the plan of an internal co-location data center complete with separate cages for each group to securely house their servers. This was around 1998.

Next, when I was with Exodus Communications, the company that started the co-location industry, the math again played in favor of bigger is better, or at least lower cost. I was a member of the “build team” running around the world finding and negotiating the next spot to locate a data center and then designing it, and the larger we made the data centers, the lower the total cost per unit to build, own and operate them.

Later, I was with Google operating and acquiring large data centers, and I had the privilege of running the largest data center in square feet that I’ve ever known. Now fast forward through years of data center design, construction, operations and efficiency programs, and I’ve come to see that while larger may be lower cost to build and operate at the time of construction, in most cases larger data centers cost more over time than ‘medium-sized’ (size relative to the time it takes to load up) data centers. Why? Large data centers are rarely future proof. Our server technology leaps ahead a generation every 18 months; software generations are often 6-12 months. New infrastructure solutions are coming along every year, and capacity planning is rarely good more than 6-12 months out and sorely inaccurate at that. Case in point: for a data center I built for Yahoo that would take 3-5 years to fill, the company wanted to modify it to accommodate new cooling technologies and to move it to a different state to take advantage of changing tax laws. Yet modifying a concrete shell and relocating an entire, new data center are impractical and often costly solutions.

If we build data centers that are scalable, in that they are smaller and thus at the scale of 1-3 years of capacity, then we can always implement new technologies and solutions and adapt to changing business capacity and needs quickly and at lower cost over time. This may mean we plan for the next ten years when we select the site for our data center “campus”, but build out smaller shells in more frequent build cycles rather than one large building that can last for 5+ years. I spoke about this in a recent ComputerWorld article.

The article also mentions the “butterfly” data center concept from HP, discussed further in this article. The key tenet of the plan is to build smaller buildings on a campus with shared network, personnel and other benefits. Why not also build a small computer factory on the campus to serve the data centers and surrounding area? Another take on this approach is containers; eBay just joined the fold of those considering containers for future modular scalability.

I’ve built many of the lowest cost data centers ($4-8 million per MW of IT load) and the most energy efficient data centers (PUE 1.04-1.10), and I believe these smaller data centers can be built for very comparable cost and efficiency figures, likely the same, and perhaps even lower, especially over time as retrofits are less likely and less costly. I’ve seen examples of this in many of the previous data center projects I’ve been working on, which are 10+ MWs in size, ‘medium’ in relative terms.

Various data center news:
Fast-growing company looking to fill a data center manager position in Santa Clara, IT focused, Linux & storage. Write me if you are interested.

Terremark Worldwide released solid Q1 results, raising annual revenue projections. Guidance for 2011 assumes no federal IT project revenue, although that has been accounting for 10% of revenue. Cloud revenue is growing and now accounts for 8% of total revenue.

Internap Network Services released its Q2 results, which were on target with revenue and margins, significantly increasing gross margin by moving out of partner data centers into company-owned data centers. I see this trend over time, as folks move from outsourced into company-owned data centers. I can help you with this transition and the best strategy for it, having completed it many times now with excellent results.

Investment in “greener data centers” is set to increase over 5x in 5 years per Pike Research, or put another way, to capture 28% of the global market by 2015, according to this story, which I highlighted in a previous blog post.

SVLG Data Center Efficiency Summit

Monday, August 9th, 2010

For two years now I have been co-chairing the Silicon Valley Leadership Group Data Center Efficiency Program and Summit. This program showcases data center efficiency innovations via end-user case studies and end-user presentations at the summit. This will be the third year of the summit and we have an excellent agenda planned with many new innovations, new ideas, and comparisons about how to reduce energy use in your data center. The Summit will be an all-day event on October 14th at Brocade Systems in San Jose, CA (we move the summit to a different end-user each year; past summit hosts were Sun Microsystems and NetApp). You can learn more about what Brocade is doing to make their campus and data centers efficient here.

The main difference of this program and summit is that all presenters talk about their data centers, actual end-users of data centers. No presenter pays to speak via a sponsorship; this event has actual end-users presenting actual data, often validated by a third-party.

This program is put on mostly by a committee of volunteers, including my co-chairs: Ray Pfeifer, Tim Crawford and Brian Brogen. Many thanks to them for helping to pull this program and case studies together. Also many thanks to Kelly Aaron, Ralph Renne and Mukesh Khattar for helping with marketing, agenda and event planning. Zen Kishimoto, Joyce Dickerson and John Noh also helped with their valuable input. Dale Sartor and Bill Tschudi of LBNL helped immensely with leadership and ideas, and Paul Roggensack of the CEC with funding of the program. And most importantly, thanks to Bob Hines and Anne Smart with the SVLG for helping to plan the Summit.

You can view the program of the summit here. If you would like to submit a case study for this year’s or next year’s program, contact me. Case studies must be from data center end-users.

If you would like to sponsor the upcoming Summit, contact Bob Hines at the SVLG via the info on the event website. You can register for the Summit here or via the Summit website. The event has sold out the last two years and I expect a sell-out crowd again, so register early. I look forward to seeing you at the SVLG Data Center Efficiency Summit on October 14th.

The cute little button that makes you money

Wednesday, August 4th, 2010

Greening Greater Toronto study finds that data center servers operate at only 4% average utilization: “The statement is the result of a recent “Green Exchange” meeting on greening IT practices hosted by Greening Greater Toronto in partnership with the Ontario Institute of the Purchasing Management Association of Canada.”

“One of the other lessons learned from the meeting is that central-control systems are more effective at reducing energy consumption than relying on employee practices. Purchasers who implemented employee training programs to have people turn off their machines at the end of the day reported maximum penetration rates of 65 percent, declining rapidly over time.”

“In contrast, most organizations have focused on control solutions, where IT staff program computers to turn off on a timed cycle. This is often matched with settings to turn off monitors or put computers into sleep-modes after a certain period of inactivity. Purchasers report almost no user resistance to these solutions and consider it part of a larger trend of centralizing control of individual computers over a network. Most purchasers have solved common concerns about timed off-cycles with a software-based solution like the NightWatchman or Surveyor Windows server monitoring software.”

When I was at Sun Microsystems (late 90’s-2000), I found similar results. When I asked the 40,000+ Sun community to turn off desktop monitors and computers at night, they rarely did, even though a study I commissioned showed the savings to be well into millions of dollars per year (as most were left on during nights and weekends and, on average, employees were only at their desks about 4 hours per work day). But when I had a third-party switching device (MonitorMiser) added to all desktops that automatically turned off monitors when desktops were inactive (no mouse or keyboard input for 15 minutes), only three people out of over 40,000 complained. Savings = over $3,000,000 a year in the US. (Most monitors were 17-24” CRTs, and each employee averaged over two, as many had several.)

I then took this a step further and asked that the Operating System turn desktops off when inactive. This was a bit more challenging, as what was inactive to user input might be actively running code all night. So I had the software engineers put in some more code to look at processor state and network activity, and to keep the feature user selectable. This was a very crude “sleep” mode for the OS and a beginning of those for the industry. The industry followed what we did, I think, not for energy savings but because Sun sales engineers started selling computers on TCO including energy use and winning deals. The sales team was realizing by the late 90’s that lower total cost, and consequently lower energy use of the equipment, helped to make sales. These and other changes that I implemented led to over $10,000,000 in annual energy savings and to my earning my second EPA EnergyStar Partner of the Year Award.
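The kind of check described above can be sketched roughly as follows. This is not the code Sun shipped; it is a hypothetical illustration of combining user input, processor state and network activity behind a user-selectable switch. The 15-minute input timeout is borrowed from the monitor example, and the CPU and network thresholds are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class IdlePolicy:
    enabled: bool = True                        # user selectable
    input_idle_minutes: float = 15.0            # borrowed from the monitor timeout
    max_cpu_utilization: float = 0.05           # hypothetical threshold (5%)
    max_network_bytes_per_sec: float = 1_000.0  # hypothetical threshold


def should_sleep(policy: IdlePolicy,
                 minutes_since_input: float,
                 cpu_utilization: float,
                 network_bytes_per_sec: float) -> bool:
    """Return True only if the desktop looks idle by every measure."""
    if not policy.enabled:
        return False  # the user has opted out
    if minutes_since_input < policy.input_idle_minutes:
        return False  # someone used the mouse or keyboard recently
    if cpu_utilization > policy.max_cpu_utilization:
        return False  # code may still be running overnight
    if network_bytes_per_sec > policy.max_network_bytes_per_sec:
        return False  # the machine may be serving or transferring data
    return True


policy = IdlePolicy()
print(should_sleep(policy, minutes_since_input=30, cpu_utilization=0.01, network_bytes_per_sec=10))
print(should_sleep(policy, minutes_since_input=30, cpu_utilization=0.80, network_bytes_per_sec=10))
```

The first call reports an idle desktop that can sleep; the second stays awake because the processor is busy even though nobody has touched the keyboard.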

It’s been great to see this early and rather crude OS function automatically put monitors and desktops into sleep and/or off states. Wow! Now look at what our desktops, servers, even networking and storage equipment can do to reduce energy use when underutilized.

Take this one more evolutionary step forward, and you have what I call server power management software (1E, PowerAssure, Surveyor, and many others) that automatically determines hardware utilization and state, and either puts the hardware to sleep or turns it off, and then automatically turns it back on when needed. How far this has come from our early and rather crude versions of this with desktops at Sun.

Think about it: as a company you want to utilize all of the assets you have to perform work that maximizes revenue (or profit). But you also need assets for peak periods that are underutilized during low demand periods. Think New York City taxis. They are busy as heck on a Friday night or during rush hour yet rather idle at 4 AM on a Sunday. You wouldn’t want every one of those cabs with a paid driver idling their engine, burning dollars out the tailpipe, now would you? So the cabs are parked, the drivers are home asleep, and they work when demand warrants it. So why do we leave our servers (and storage and network ‘taxis’) idling 8,760 hours a year when average peak times are well less than 1,000 hours per year (often less than 100)? We do this and wonder why we have average processor utilizations of less than 10%. (Processor capacity is rarely the limiting factor in most applications these days, but that is a topic for another blog.) And yet those servers consume about 2/3 of peak power when at 0% processor utilization, so why leave them running, burning precious company dollars out through the power meter? Is it charity to our utility companies? I doubt it. So power down those servers when not needed and save precious dollars for more important tasks than burning power unproductively. After all, those servers do have an on/off button. And call the experts at MegaWatt Consulting for these and more solutions to increase your dollars. Power on…productively.
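Here is a rough sketch of the idle-server arithmetic, using the figures above: 8,760 hours in a year and about 2/3 of peak power drawn at 0% utilization. The 300 W peak draw and $0.10 per kWh price are hypothetical inputs chosen only for illustration, and the sketch assumes a powered-down server draws nothing and that every non-peak hour is fully idle.

```python
HOURS_PER_YEAR = 8_760       # 365 days x 24 hours
IDLE_POWER_FRACTION = 2 / 3  # share of peak power drawn at 0% utilization


def idle_energy_kwh(peak_watts: float, busy_hours_per_year: float) -> float:
    """Energy burned per year while the server sits idle instead of powered off."""
    if not 0 <= busy_hours_per_year <= HOURS_PER_YEAR:
        raise ValueError("busy hours must be between 0 and 8,760")
    idle_hours = HOURS_PER_YEAR - busy_hours_per_year
    return peak_watts * IDLE_POWER_FRACTION * idle_hours / 1_000.0


def idle_cost(peak_watts: float, busy_hours_per_year: float, price_per_kwh: float) -> float:
    """Yearly cost of that idle energy at a flat electricity price."""
    return idle_energy_kwh(peak_watts, busy_hours_per_year) * price_per_kwh


# Hypothetical server: 300 W at peak, busy 1,000 hours a year, $0.10 per kWh.
kwh = idle_energy_kwh(peak_watts=300, busy_hours_per_year=1_000)
cost = idle_cost(peak_watts=300, busy_hours_per_year=1_000, price_per_kwh=0.10)
print(f"Idle energy: {kwh:,.0f} kWh per year, costing about ${cost:,.0f}")
```

With those made-up inputs, a single server wastes about 1,552 kWh a year while idle. Multiply that by the number of servers in the room, and by the PUE for the cooling and electrical overhead, and the on/off button starts to look very attractive.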