The French association Data for Good released a white paper exploring the social and environmental issues surrounding generative AI. I was particularly interested in the environmental impact of language models, which is less discussed than the ethical aspects. Here are my key learnings:
- Context: world leaders committed to keeping global warming well below 2°C by 2050. This means reducing our emissions by 43% between 2020 and 2030 (to limit warming to 1.5°C, see section C.1.1 in the IPCC report). However, in the digital space, emissions are not falling but rising, by 2 to 7% per year.
- GPT-3 training emitted a whopping 2,200 tons of CO2 equivalent — comparable to 1,600 round-trip flights from Paris to New York.
- With 13 million users, ChatGPT’s monthly usage amounts to 10,000 tons of CO2. If everyone used it today, it would add 0.1% to the annual carbon footprint of people in France and the UK, or 0.5% of our target footprint in 2050.
- The impact of ChatGPT+, based on GPT-4, could be 10 to 100 times greater, adding up to 10% of our current annual carbon footprint… or 50% of our target footprint.
- There are many ways to reduce the impact of using such models: use them sparingly and opt for cloud services with proven environmental performance.
To evaluate the environmental impact of anything, we can estimate its carbon footprint: it measures the total greenhouse gas emissions caused directly and indirectly by an individual, organization or product, expressed in tons of carbon dioxide equivalent (CO2e).
To put it in perspective, the average annual carbon footprint is approximately 8 to 13 tons per person in the United Kingdom or France, 21 tons in the US, and 6 tons worldwide. I will consider 10 tons as our current footprint.
Some examples (with sources):
To keep global temperature rise below 2 degrees, we should aim to reduce our global carbon footprint to 2 tons per person by 2050.
There is still a lot of work to do to reduce our emissions by 80 or 90%, and the growing demand for digital services, which outpaces efficiency improvements, does not help. How does generative AI fit into this equation, and what can we do to align our digital advancements with our environmental goals?
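The 80 to 90% range can be checked against the per-person figures quoted above. Here is a minimal Python sketch, assuming the rounded footprints used in this article (10 tons as the baseline, 21 tons for the US, 6 tons worldwide) and the 2-ton target; the function name is only illustrative.

```python
# Per-person footprints quoted in this article, in tons of CO2e per year.
CURRENT_FOOTPRINTS = {"article baseline": 10.0, "US": 21.0, "world": 6.0}
TARGET_2050 = 2.0  # target footprint per person, in tons of CO2e per year


def required_reduction(current_tons, target_tons=TARGET_2050):
    """Return the fraction by which a footprint must shrink to reach the target."""
    return 1 - target_tons / current_tons


for label, tons in CURRENT_FOOTPRINTS.items():
    print(f"{label}: {required_reduction(tons):.0%} reduction needed")
```

With the 10-ton baseline this gives an 80% cut, and starting from the US figure it is about 90%.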
In the training phase, we feed the language models some selected data so that they can learn from it and be able to respond to our requests.
The study analyzed two major language models:
1. The open-source Bloom
2. OpenAI’s proprietary GPT-3
Key results:
– Bloom’s Carbon Footprint: Initially estimated at 30 tons, it was revised to 120 tons after exhaustive analysis.
– GPT-3 carbon footprint: Extrapolated to 2,200 tons, equivalent to 1,600 return flights from Paris to New York.
A common view is that it is okay for these models to have high training costs because they are widely used by many users.
Inference in machine learning is when we use a trained model to make predictions on live data. Let us now look at the impact of running ChatGPT.
Assuming that ChatGPT has 13 million active users making 15 requests per day on average, the monthly carbon footprint is 10,000 tons of CO2.
And the key learning for me is that this is much greater than the impact of the training.
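To make that comparison concrete, here is a rough Python sketch that puts the two figures side by side. It only reuses the numbers quoted in this article (2,200 tons for GPT-3 training, 10,000 tons per month for ChatGPT usage); the variable and function names are my own.

```python
TRAINING_TONS = 2_200              # GPT-3 training footprint, tons of CO2e
INFERENCE_TONS_PER_MONTH = 10_000  # ChatGPT usage footprint, tons of CO2 per month


def months_of_usage_matching_training(training_tons, monthly_usage_tons):
    """Return how many months of usage emit as much as the training did."""
    return training_tons / monthly_usage_tons


def yearly_usage_over_training(training_tons, monthly_usage_tons):
    """Return how many times one year of usage exceeds the training footprint."""
    return 12 * monthly_usage_tons / training_tons


months = months_of_usage_matching_training(TRAINING_TONS, INFERENCE_TONS_PER_MONTH)
ratio = yearly_usage_over_training(TRAINING_TONS, INFERENCE_TONS_PER_MONTH)
print(f"Training equals about {months:.2f} months of usage")
print(f"One year of usage is about {ratio:.0f} times the training footprint")
```

Under these assumptions, less than a quarter of a month of usage already matches the whole training run.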
For a user, the addition to the annual carbon footprint is 12 months * 10,000 tons / 13 million users = 9 kilos of CO2e per year per user, equivalent to 0.1% of the current average annual carbon footprint, or 0.5% of our target footprint.
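The same per-user arithmetic, written as a small Python sketch. It assumes the figures used in this article (10,000 tons per month, 13 million users, a 10-ton current footprint and a 2-ton target); the names are illustrative.

```python
MONTHLY_TONS = 10_000     # ChatGPT monthly footprint, tons of CO2
USERS = 13_000_000        # assumed number of active users
CURRENT_FOOTPRINT_KG = 10_000  # 10 tons per person per year
TARGET_FOOTPRINT_KG = 2_000    # 2 tons per person per year


def annual_kg_per_user(monthly_tons, users):
    """Return the yearly footprint per user in kilos of CO2."""
    return 12 * monthly_tons * 1_000 / users


per_user_kg = annual_kg_per_user(MONTHLY_TONS, USERS)
share_of_current = per_user_kg / CURRENT_FOOTPRINT_KG
share_of_target = per_user_kg / TARGET_FOOTPRINT_KG

print(f"Per user: {per_user_kg:.1f} kg per year")
print(f"Share of current footprint: {share_of_current:.2%}")
print(f"Share of target footprint: {share_of_target:.2%}")
```

The exact values are slightly above 9 kilos and slightly below 0.1% and 0.5%; the article rounds them.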
But what if that person uses ChatGPT Plus with GPT-4? The footprint of GPT-4 could be 10 to 100 times larger than that of GPT-3. That adds between 100 kilos and 1 ton of CO2e per year, which is up to 10% of the carbon footprint of a French citizen, and twice that share if they are already doing everything they can to reduce it. If we consider our target footprint in 2050, that’s 50%!
That sucks.
What if, one day, every interaction you have with any application in your life makes requests to language models? Scary thought.
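To see how quickly the GPT-4 scenario adds up, here is a hedged Python sketch of the range discussed above. It assumes a per-user baseline rounded to 10 kilos of CO2e per year and the 10x and 100x multipliers quoted in this article; the function name and the dictionary keys are hypothetical.

```python
BASELINE_KG_PER_USER = 10.0   # rounded GPT-3-era estimate, kilos of CO2e per year
CURRENT_FOOTPRINT_TONS = 10.0 # current footprint per person
TARGET_FOOTPRINT_TONS = 2.0   # 2050 target footprint per person


def gpt4_scenario(multiplier):
    """Return the extra yearly footprint and its share of both footprints."""
    extra_tons = BASELINE_KG_PER_USER * multiplier / 1_000
    return {
        "extra_tons": extra_tons,
        "share_of_current": extra_tons / CURRENT_FOOTPRINT_TONS,
        "share_of_target": extra_tons / TARGET_FOOTPRINT_TONS,
    }


for multiplier in (10, 100):
    result = gpt4_scenario(multiplier)
    print(
        f"x{multiplier}: +{result['extra_tons']:.1f} t per year, "
        f"{result['share_of_current']:.0%} of current, "
        f"{result['share_of_target']:.0%} of target"
    )
```

The upper end of the range is the one that reaches 10% of the current footprint and 50% of the target.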
The good news is that extensive use of the GPT-4 API is so expensive that we cannot let our users make 15 requests per day unless they are willing to pay a monthly subscription of over $100, which my target market for the product I am building (a personal meditation assistant) is not willing to pay. And it’s not just small businesses that can’t afford it: Google and Microsoft also can’t afford to replace their search engines with a model the size of GPT-4, which would multiply the cost of their queries by 100.
The recommendations are the following:
- Stay sober: It may be tempting to replace an entire IT project with ChatGPT-4, but instead, we can question the usefulness of the project, the real need to use a language model, and limit its use to specific cases that really require it. Use a much smaller model than GPT-4 whenever you can. Think twice before using ChatGPT+.
- Optimize training and usage: The techniques are numerous and constantly evolving, and data scientists should already be using them… to reduce costs. They mainly consist of reducing the use of infrastructure, which in turn reduces electricity consumption and, therefore, carbon emissions. In essence, we only train a model if necessary; if we train, we plan it to avoid wasting resources. And we use the smallest model that meets the needs.
- Select the best country to host your server based on the carbon footprint of its energy. And here comes French pride: the carbon footprint of our mainly nuclear energy is 7 times smaller than that of the United States. But if we all start hosting our language models here, we will probably end up importing coal power from our dear neighbors.
- Select the best cloud service based on its environmental performance (this data is sometimes public; otherwise, there are tools to measure or estimate it, such as https://mlco2.github.io/impact/). Favor cloud services that keep their servers in use for longer (however, hyperscalers tend to keep their hardware for no more than 4 years) and data centers with a high level of sharing.
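The hosting recommendation comes down to multiplying the energy a workload consumes by the carbon intensity of the electricity that powers it. Below is a minimal Python sketch of that idea. It uses relative intensities only: the US is set to 1.0 and France to one seventh of that, which is the ratio quoted above. Real absolute values would have to come from your provider or from a measurement tool, and the workload size here is a made-up placeholder.

```python
# Relative carbon intensity of electricity (US = 1.0).
# The 1/7 ratio for France is the one quoted in this article;
# these are not absolute measurements.
RELATIVE_INTENSITY = {"US": 1.0, "France": 1.0 / 7.0}


def relative_emissions(energy_kwh, region, intensities=RELATIVE_INTENSITY):
    """Return emissions in relative units: energy times relative intensity."""
    return energy_kwh * intensities[region]


def lowest_carbon_region(intensities=RELATIVE_INTENSITY):
    """Return the region with the lowest carbon intensity."""
    return min(intensities, key=intensities.get)


workload_kwh = 700  # hypothetical workload, for illustration only
for region in RELATIVE_INTENSITY:
    print(f"{region}: {relative_emissions(workload_kwh, region):.0f} relative units")
print(f"Lowest-carbon region: {lowest_carbon_region()}")
```

This ignores the grid effects mentioned above: if demand grows enough to trigger imports of coal power, the effective intensity of a region goes up.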
Whether you are an individual or a corporation, there are resources and experts available to guide you on a sustainable path.
At the individual level:
– If you would like to evaluate your carbon footprint, there are many tools online. On a personal level, measuring my carbon footprint opened my eyes and prompted me to explore ways to make a positive impact. If you live in the UK, check https://footprint.wwf.org.uk/
– To get a 3-hour crash course on the fundamental science behind climate change: https://climatefresk.org/
– To explore the actions you can take and estimate how much they would reduce your footprint, there is another 3-hour workshop: https://en.2tonnes.org/
At the corporate level:
Many companies are exploring these topics, and here is what they can do:
- educate their employees (with the workshops suggested above),
- perform audits and measure their carbon footprint,
- establish strategies to improve their ESG (Environmental, Social, and Governance) scores.
I found out about this brilliant study thanks to some excellent people I recently met at Toovalu and Wavestone. Check out what they do!
Please comment if you found any errors in my estimates or want to add your thoughts, and share if you found it interesting.
Thank you for taking the time to read this article, I hope it was insightful! Many thanks to Thibaut, Léo, Benoit and Diane for their valuable comments and additions to this article.
And if you want to stay up to date on generative AI and responsible ML, follow me on LinkedIn.