When searching for flights on Google, you may have noticed that each flight’s carbon emissions estimate is now presented alongside its cost. It is a way to inform customers about their environmental impact and allow them to take this information into account in their decision making.
A similar kind of transparency does not yet exist for the computing industry, despite carbon emissions that exceed those of the entire airline industry. Artificial intelligence models are adding to this energy demand. Huge, popular models like ChatGPT signal a trend toward large-scale AI, boosting forecasts that predict data centers will draw up to 21 percent of the world’s electricity supply by 2030.
The MIT Lincoln Laboratory Supercomputing Center (LLSC) is developing techniques to help data centers rein in energy use. Their techniques range from simple but effective changes, such as power-capping hardware, to adopting novel tools that can halt AI training early on. Crucially, they have found that these techniques have minimal impact on model performance.
In the bigger picture, their work is mobilizing green-computing research and promoting a culture of transparency. “Energy-aware computing is not really a research area, because everyone has been holding on to their data,” says Vijay Gadepally, a senior staff member at the LLSC who leads energy-aware research efforts. “Someone has to start, and we hope others will follow suit.”
Curbing power and cooling down
Like many data centers, the LLSC has seen a significant increase in the number of AI jobs running on its hardware. Noticing an increase in energy use, the LLSC’s computer scientists became curious about ways to run jobs more efficiently. Green computing is a principle of the center, which runs entirely on carbon-free energy.
Training an AI model (the process by which it learns patterns from huge datasets) requires the use of graphics processing units (GPUs), which are power-hungry hardware. As an example, the GPUs that trained GPT-3 (the precursor to ChatGPT) are estimated to have consumed 1,300 megawatt-hours of electricity, roughly the amount used by 1,450 average U.S. households in a month.
While most people seek out GPUs because of their computing power, manufacturers offer ways to limit the amount of power a GPU is allowed to draw. “We studied the effects of capping power and found that we could reduce power consumption by about 12 percent to 15 percent, depending on the model,” says Siddharth Samsi, a researcher at the LLSC.
The trade-off for capping power is longer task times: GPUs take about 3 percent longer to complete a task, an increase Gadepally says is “barely noticeable” considering models are often trained over days or even months. In one of their experiments, training the popular BERT language model with GPU power limited to 150 watts added two hours to training time (from 80 to 82 hours) but saved the equivalent of a U.S. household’s weekly energy use.
The team then built software that plugs this power-capping capability into Slurm, the widely used job-scheduling system. The software lets data center owners set limits across their entire system or on a job-by-job basis.
“We can implement this intervention today and have done so across all of our systems,” Gadepally says.
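For readers curious what power-capping looks like at the hardware level, the sketch below uses NVIDIA’s management library (via the pynvml Python bindings) to cap each GPU on a machine at 150 watts, the same setting used in the BERT experiment above. It is a generic illustration of the capability that scheduler-level tools like the LLSC’s Slurm integration build on, not the lab’s actual software, and changing a power limit typically requires administrator privileges.

```python
import pynvml  # NVIDIA Management Library bindings (pip install nvidia-ml-py)

TARGET_WATTS = 150  # the cap used in the BERT experiment described above

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        # NVML reports and accepts power limits in milliwatts; clamp the
        # requested cap to the range the card actually supports.
        min_mw, max_mw = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)
        target_mw = max(min_mw, min(TARGET_WATTS * 1000, max_mw))
        before_mw = pynvml.nvmlDeviceGetPowerManagementLimit(handle)
        # Setting a new limit usually requires root/administrator privileges.
        pynvml.nvmlDeviceSetPowerManagementLimit(handle, target_mw)
        print(f"GPU {i}: power limit {before_mw / 1000:.0f} W -> {target_mw / 1000:.0f} W")
finally:
    pynvml.nvmlShutdown()
```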
Secondary benefits have also emerged. Since the power limits were put in place, the GPUs in the LLSC’s supercomputers have been running about 30 degrees Fahrenheit cooler and at a more constant temperature, reducing stress on the cooling system. Running the hardware cooler can also potentially increase its reliability and lifespan. The center can now consider delaying the purchase of new hardware (reducing the facility’s “embodied carbon,” or the emissions created by manufacturing equipment) until the efficiencies gained by using new hardware offset that aspect of the carbon footprint. The team is also finding ways to cut cooling needs by strategically scheduling jobs to run at night and during the winter months.
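As a small illustration of that scheduling idea, the sketch below defers a non-urgent training job to overnight hours using Slurm’s standard --begin option. It is a generic example, not the LLSC’s actual scheduling policy; the job script name and the 10 p.m. cutoff are made up.

```python
# Defer a non-urgent Slurm job to cooler overnight hours via sbatch --begin.
import subprocess
from datetime import datetime, timedelta

def submit_overnight(script_path: str, night_hour: int = 22) -> None:
    now = datetime.now()
    start = now.replace(hour=night_hour, minute=0, second=0, microsecond=0)
    if start <= now:
        start += timedelta(days=1)  # tonight's window has passed; use tomorrow night
    subprocess.run(
        ["sbatch", f"--begin={start.strftime('%Y-%m-%dT%H:%M:%S')}", script_path],
        check=True,
    )

submit_overnight("train_model.sbatch")  # hypothetical job script
```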
“Data centers can use these easy-to-implement approaches today to increase efficiency, without requiring code or infrastructure modifications,” says Gadepally.
Taking this holistic look at data center operations to find reduction opportunities can be time-consuming. To make this process easier for others, the team, in collaboration with Professor Devesh Tiwari and Baolin Li of Northeastern University, recently developed and published a complete framework for analyzing the carbon footprint of high-performance computing systems. Systems professionals can use this analysis framework to better understand how sustainable their current system is and consider changes for next-generation systems.
Adjusting how models are trained and used
In addition to adjusting data center operations, the team is devising ways to make AI model development more efficient.
When training models, AI developers often focus on improving accuracy, building on previous models as a starting point. To achieve the desired output, they have to figure out which parameters to use, and getting it right can take testing thousands of configurations. This process, called hyperparameter optimization, is an area LLSC researchers have found ripe for cutting energy waste.
“We’ve developed a model that basically looks at the rate at which a given configuration is learning,” Gadepally says. Given that rate, their model predicts the likely performance, and poorly performing configurations are stopped early. “We can give you a very accurate estimate, early on, that the best model will be in the top 10 of the models running,” he says.
In their studies, this early stopping led to dramatic savings: an 80 percent reduction in the energy used for model training. They have applied this technique to models developed for computer vision, natural language processing, and materials design applications.
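The LLSC’s performance predictor itself is not described in detail here, but the toy sketch below conveys the general idea under simple assumptions: briefly train every candidate configuration, score each one by its early accuracy and rate of improvement, and spend the full training budget only on the predicted top set. The simulated learning curves stand in for a real training loop.

```python
import random

# Toy early-stopping hyperparameter search. Each "configuration" is just a
# learning rate, and simulated_accuracy() fakes a saturating learning curve so
# the script runs end to end; in practice these calls would wrap real training.
random.seed(0)

def simulated_accuracy(lr: float, epoch: int) -> float:
    ceiling = 0.95 - 4 * abs(lr - 0.1)          # configs near lr=0.1 converge higher
    return ceiling * (1 - 0.8 ** (epoch + 1)) + random.uniform(-0.01, 0.01)

def early_stopping_search(configs, probe_epochs=3, full_epochs=30, keep_top=3):
    # Probe phase: train every configuration briefly and score its trajectory.
    probes = []
    for lr in configs:
        history = [simulated_accuracy(lr, e) for e in range(probe_epochs)]
        score = history[-1] + (history[-1] - history[0])  # level + improvement rate
        probes.append((score, lr))

    # Keep only the configurations predicted to land in the top set; the rest
    # are stopped early, which is where the energy savings come from.
    survivors = [lr for _, lr in sorted(probes, reverse=True)[:keep_top]]

    # Spend the full training budget only on the survivors.
    finished = [(simulated_accuracy(lr, full_epochs), lr) for lr in survivors]
    return max(finished)

best_acc, best_lr = early_stopping_search([lr / 100 for lr in range(1, 31)])
print(f"best learning rate {best_lr:.2f} with simulated accuracy {best_acc:.3f}")
```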
“In my opinion, this technique has the greatest potential to improve the way AI models are trained,” says Gadepally.
Training is only one part of an AI model’s emissions. The largest contributor over time is model inference, the process of running the model live, such as when a user chats with ChatGPT. To respond quickly, these models use redundant hardware, running all the time, waiting for a user to ask a question.
One way to improve inference efficiency is to use the most suitable hardware. Also with Northeastern University, the team created an optimizer that matches a model with the most carbon-efficient mix of hardware, such as high-power GPUs for the computationally intensive parts of inference and low-power central processing units (CPUs) for the less demanding aspects. This work recently won the best paper award at the International ACM Symposium on High-Performance Parallel and Distributed Computing.
Using this optimizer can decrease energy usage by 10 to 20 percent while meeting the same “quality of service goal” (how quickly the model can respond).
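The paper has the details, but the basic shape of such an optimizer can be sketched simply: given measured latency and energy per request for each hardware option, choose the lowest-energy option that still meets the quality-of-service target. The numbers below are invented for illustration, and the real optimizer is considerably more sophisticated, for example splitting a model’s inference across device types.

```python
from dataclasses import dataclass

@dataclass
class HardwareOption:
    name: str
    latency_ms: float       # time to serve one request
    energy_joules: float    # energy consumed per request

# Made-up measurements for illustration only.
OPTIONS = [
    HardwareOption("high-power GPU", latency_ms=12,  energy_joules=45.0),
    HardwareOption("low-power GPU",  latency_ms=35,  energy_joules=18.0),
    HardwareOption("CPU only",       latency_ms=180, energy_joules=9.0),
]

def pick_hardware(qos_latency_ms: float) -> HardwareOption:
    # Keep only options that meet the quality-of-service (latency) target,
    # then choose the one that uses the least energy per request.
    feasible = [o for o in OPTIONS if o.latency_ms <= qos_latency_ms]
    if not feasible:
        raise ValueError("no hardware option meets the latency target")
    return min(feasible, key=lambda o: o.energy_joules)

# A 50 ms response-time target is met most efficiently by the low-power GPU.
print(pick_hardware(qos_latency_ms=50))
```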
This tool is especially useful for cloud customers, who rent data center systems and must select hardware from thousands of options. “Most customers overestimate what they need; they choose over-capable hardware simply because they don’t know any better,” says Gadepally.
Growing awareness of green computing
The energy saved by implementing these interventions also reduces the costs associated with ai development, often by a one-to-one ratio. In fact, cost is often used as an indicator of energy consumption. Given these savings, why aren’t more data centers investing in green techniques?
“I think it’s a misalignment of incentives problem,” Samsi says. “There has been such a rush to build bigger and better models that almost all secondary considerations have been pushed aside.”
They note that while some data centers purchase renewable-energy credits, these renewables are not enough to meet growing energy demands. Most of the electricity powering data centers comes from fossil fuels, and the water used for cooling is straining watersheds.
There may also be doubts because no systematic studies have been conducted on energy saving techniques. That’s why the team has been pushing its research into peer-reviewed venues in addition to open source repositories. Some big industry players, such as Google DeepMind, have applied machine learning to increase data center efficiency, but they have not made their work available to others to implement or replicate.
Major AI conferences are now pushing for ethics statements that consider how AI could be misused. The team sees the climate aspect as an AI ethics issue that has not received much attention yet, though this also appears to be slowly changing. Some researchers are now disclosing the carbon footprint of training the latest models, and industry is showing a shift in energy transparency as well, as in a recent report from Meta AI (ai.meta.com/research/publications/scaling-autoregressive-multi-modal-models-pretraining-and-instruction-tuning/).
They also recognize that transparency is difficult without tools that can show ai developers their consumption. Reporting is on LLSC’s roadmap for this year. They want to be able to show each LLSC user, for each job, how much energy they consume and how this amount compares to others, similar to home energy reports.
Part of this effort requires working more closely with hardware manufacturers to make getting this data out of hardware easier and more accurate. If manufacturers can standardize how data is read, then reporting and power saving tools can be applied across different hardware platforms. A collaboration between LLSC and Intel researchers is underway to work on this very problem.
Even AI developers who are aware of AI’s intense energy needs cannot do much on their own to curb that use. The LLSC team wants to help other data centers apply these interventions and give users energy-aware options. Its first partnership is with the US Air Force, a sponsor of this research, which operates thousands of data centers. Applying these techniques could make a significant dent in its energy consumption and cost.
“We are putting control in the hands of AI developers who want to reduce their footprint,” says Gadepally. “Do I really need to gratuitously train unpromising models? Am I willing to run my GPUs slower to save power? As far as we know, no other supercomputing center lets you consider these options. Using our tools, you can decide, today.”
Visit this web page (ai-energy-reduction) to see the group’s publications related to energy-aware computing and the findings described in this article.